
grokking doesn't and will not have practical uses, imo - it is just an experiment that revealed cool things that we mostly already suspected about implicit regularization

however, techniques we learn from grokking about implicit regularization might be helpful for the training regimes we actually use



> grokking doesn't and will not have practical uses, imo

I'm not so sure. Reasoning is the next big hurdle, and grokking and parametric memory seem very effective here [1].

[1] https://arxiv.org/abs/2405.15071



