
Do We Need Zero Training Loss After Achieving Zero Training Error? - sillysaurusx
https://arxiv.org/abs/2002.08709
======
sillysaurusx
The paper calls this technique "flooding" (I've been calling it "flood loss"), and it's one of the most underrated papers in ML given how simple the implementation is:
[https://twitter.com/theshawwn/status/1253505386172145666](https://twitter.com/theshawwn/status/1253505386172145666)

Loss getting too low?

    loss = abs(loss - x) + x

where x is the flood level, e.g. 0.2.

Presto, your training loss can no longer fall below 0.2. Whenever the loss dips under the flood level, the gradient flips sign, so the optimizer briefly ascends instead of descending, and the loss hovers around that level instead of collapsing toward zero.
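
For concreteness, here's a minimal sketch of what this looks like in a PyTorch training step. Everything here (the model, optimizer, criterion, and the 0.2 flood level) is a placeholder for illustration, not the paper's exact setup:

    import torch
    
    def train_step(model, optimizer, criterion, inputs, targets, flood_level=0.2):
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        # Flooding: reflect the loss around the flood level. Above the
        # level this is a no-op; below it, the gradient flips sign and
        # the optimizer walks back up toward the flood level.
        flooded = (loss - flood_level).abs() + flood_level
        flooded.backward()
        optimizer.step()
        return loss.item()  # report the raw (unflooded) loss for logging
    
    # Hypothetical usage with a toy classifier:
    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = torch.nn.CrossEntropyLoss()
    xb, yb = torch.randn(8, 10), torch.randint(0, 2, (8,))
    train_step(model, optimizer, criterion, xb, yb)

Note that you backprop through the flooded loss but log the raw one; otherwise your charts will just show a flat line at the flood level.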

We have some pretty shocking before/after screenshots from using this technique in our BigGAN training runs.

Before flood loss, no progress:
[https://media.discordapp.net/attachments/696383989145010216/...](https://media.discordapp.net/attachments/696383989145010216/702832115162808340/individualImage.png)

After flood loss:
[https://media.discordapp.net/attachments/696383989145010216/...](https://media.discordapp.net/attachments/696383989145010216/702831280634462228/individualImage.png)

Both screenshots are from around step 3k, roughly an hour into training. As you can see, flood loss improved things immediately and dramatically.

Other people are reporting that it's stabilizing their runs as well, but
that's hearsay for the moment.

