> It is worth noting the parallels to Generative Adversarial Networks (GANs ), with the student
playing the role of generator, and the teacher playing the role of discriminator. As opposed to GANs,
however, the student is not attempting to fool the teacher in an adversarial manner; rather it cooperates
by attempting to match the teacher’s probabilities. Furthermore the teacher is held constant,
rather than being trained in tandem with the student, and both models yield tractable normalised
WaveNet2's main resemblance to a GAN is that it uses another neural network for the loss function.