Klambauer, G., Unterthiner, T., Mayr, A., & Hochreiter, S. (2017). Self-normalizing neural networks. In Advances in neural information processing systems (pp. 971-980).
Notes
train deep networks with many layers
employ strong regularization schemes
SELU (scaled exponential linear unit) activations induce self-normalizing properties such as variance stabilization, which in turn avoids exploding and vanishing gradients.
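A minimal sketch of the variance-stabilization idea, assuming the activation in question is the scaled exponential linear unit (SELU) from the cited paper. The constants are the commonly quoted SELU values; the helper names (`selu`, `propagate`), the layer sizes, and the choice of zero-mean Gaussian weights with variance 1/width are illustrative assumptions, not taken from these notes.

```python
import numpy as np

# Commonly quoted SELU constants (alpha and the scale, often written lambda).
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit, applied elementwise."""
    # np.minimum keeps expm1 from overflowing on large positive inputs,
    # since np.where evaluates both branches.
    negative_branch = ALPHA * np.expm1(np.minimum(x, 0.0))
    return SCALE * np.where(x > 0, x, negative_branch)


def propagate(depth=32, width=512, batch=1024, seed=0):
    """Push standard-normal inputs through `depth` random dense layers.

    Hypothetical setup for illustration: weights are drawn with zero mean
    and variance 1/width, and no training takes place.
    Returns a list of (mean, variance) pairs, one per layer.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, width))
    stats = []
    for _ in range(depth):
        w = rng.standard_normal((width, width)) / np.sqrt(width)
        x = selu(x @ w)
        stats.append((float(x.mean()), float(x.var())))
    return stats


if __name__ == "__main__":
    for layer, (mean, var) in enumerate(propagate(), start=1):
        print(f"layer {layer:2d}: mean={mean:+.3f} var={var:.3f}")
```

Under these assumptions the per-layer mean and variance should stay close to 0 and 1 as depth grows, rather than shrinking or blowing up, which is the behaviour the note above refers to.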