
All similarly sparse data samples would suffer from the batchnorm issue. I don’t remember whether I tried a convnet with batchnorm on galaxy classification, but I did try one on piano rolls, and it performed badly, precisely because of batchnorm. Had I first tried the same model on MNIST I would have caught the issue much faster (I tested it on CIFAR).

I suspect a chess position evaluation network would suffer from batchnorm just as much, if the intermediate feature maps remain sparse.
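A minimal sketch of why batchnorm hurts sparse inputs (my own illustration, not from the comment above): when most entries in a feature are zero, the batch mean and variance are dominated by those zeros, so normalization shifts every zero to a nonzero value and blows up the rare active entries. The piano-roll-like data and the 5% activity rate here are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Piano-roll-like batch: 64 samples, 128 features, ~5% of entries active.
x = (rng.random((64, 128)) < 0.05).astype(np.float64)

# Batchnorm-style per-feature normalization (no learned scale/shift).
mean = x.mean(axis=0)
var = x.var(axis=0)
y = (x - mean) / np.sqrt(var + 1e-5)

# Zeros are shifted to -mean/sqrt(var + eps), so the sparsity pattern
# is destroyed; the rare active entries become large outliers.
zero_fraction_before = (x == 0.0).mean()
zero_fraction_after = (np.abs(y) < 1e-8).mean()
print(zero_fraction_before)  # high: most entries are exactly zero
print(zero_fraction_after)   # low: only all-zero columns stay at zero
print(y.max())               # active entries are pushed to large values
```

Columns that happen to be all zero in the batch survive (zero mean, zero variance), but any column with even one active entry has its zeros displaced, which is why the problem shows up on sparse data like piano rolls and not on dense images like CIFAR.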



