- Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein GAN. arXiv preprint arXiv:1701.07875.
- Infinite anime faces
- Dataset Link
- No need to carefully maintain a balance between discriminator (critic) and generator training
- Mode collapse is reduced
- Uses the EM (Wasserstein) distance instead of KL divergence
- alpha = 0.00005 (learning rate), c = 0.01 (weight-clipping threshold), m = 64 (batch size), ncrit = 5 (critic iterations per generator iteration)
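The hyperparameters above slot into the WGAN training loop: ncrit critic updates with weight clipping for every generator update. The sketch below is a toy 1-D version with a linear critic and generator, using plain gradient steps instead of the paper's RMSProp; all variable and function names are illustrative, not the paper's code.

```python
import numpy as np

# Hyperparameters from the notes.
alpha, c, m, n_critic = 5e-5, 0.01, 64, 5
rng = np.random.default_rng(0)

def sample_real(n):
    # Toy "real" data: samples from N(3, 1).
    return rng.normal(3.0, 1.0, n)

# Generator g(z) = a*z + b with latent z ~ N(0, 1); critic f(x) = w*x.
a, b = 1.0, 0.0   # generator parameters
w = 0.005         # critic parameter, starts inside the clip range

for step in range(200):
    # Train the critic n_critic times per generator step.
    for _ in range(n_critic):
        z = rng.normal(0.0, 1.0, m)
        real, fake = sample_real(m), a * z + b
        # Critic objective: mean f(real) - mean f(fake); ascend its gradient.
        grad_w = real.mean() - fake.mean()
        w += alpha * grad_w
        # Weight clipping is the paper's crude way to enforce the Lipschitz constraint.
        w = float(np.clip(w, -c, c))
    # Generator step: minimise -mean f(g(z)); gradients are -w*mean(z) wrt a, -w wrt b.
    z = rng.normal(0.0, 1.0, m)
    a -= alpha * (-w * z.mean())
    b -= alpha * (-w)

print(w)  # the critic weight ends pinned at the clip boundary c
```

With the paper's tiny learning rate the toy generator moves very slowly; the point here is the loop structure, not convergence.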
- KL divergence
- measures how much one distribution p diverges from a reference distribution q
DKL(p∣∣q) = ∑i p(xi)⋅(log p(xi) − log q(xi))
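For two discrete distributions the formula can be computed directly; a minimal sketch (the helper name is ours):

```python
import numpy as np

def kl_divergence(p, q):
    """DKL(p || q) = sum_i p(x_i) * (log p(x_i) - log q(x_i))."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * (np.log(p) - np.log(q))))

p = [0.4, 0.6]
q = [0.5, 0.5]
print(kl_divergence(p, q))  # small positive value
print(kl_divergence(p, p))  # 0.0: a distribution never diverges from itself
```

Note that KL is asymmetric: kl_divergence(p, q) and kl_divergence(q, p) generally differ, which is one reason it is a divergence rather than a distance.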
- Wasserstein Distance
- The EM distance being continuous and differentiable a.e. means that
we can (and should) train the critic to optimality.
- The argument is simple: the more we train the critic, the more reliable the
gradient of the Wasserstein distance we get, and that gradient is usable precisely
because the Wasserstein distance is differentiable almost everywhere.
- Improved stability of the optimization process
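The paper's motivating example of why the EM distance yields usable gradients where KL does not can be checked numerically: for point masses at 0 and θ, W1 = |θ|, which shrinks smoothly as θ → 0, while DKL is +∞ for every θ ≠ 0 because the supports are disjoint. A small sketch, with a hypothetical helper and a discretized support of our own choosing:

```python
import numpy as np

def w1_discrete(p, q, support):
    """1-D Wasserstein-1 distance between discrete distributions p and q
    on a common sorted support: the integral of |CDF_p - CDF_q|."""
    cdf_diff = np.cumsum(p) - np.cumsum(q)
    widths = np.diff(support)
    return float(np.sum(np.abs(cdf_diff[:-1]) * widths))

support = np.array([0.0, 0.5, 1.0])
p = np.array([1.0, 0.0, 0.0])          # point mass at x = 0
for theta, idx in [(0.5, 1), (1.0, 2)]:
    q = np.zeros(3)
    q[idx] = 1.0                       # point mass at x = theta
    print(theta, w1_discrete(p, q, support))  # W1 equals theta here
```

KL between these two point masses is infinite for any θ ≠ 0, so it gives no signal about how far apart they are; W1 decreases smoothly as the fake distribution moves toward the real one, which is exactly the gradient the generator needs.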