
It's not that memory usage isn't important; it's that dividing error by memory gives you a useless number. The benefit from an incremental decrease in error is highly nonlinear, and the same goes for memory. Improving error by 1% matters a lot more starting from 10% error than from 80%. Also, a model that used no memory and got everything wrong would have the best score.
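One way to put numbers on the nonlinearity point, as a rough sketch: compare what fraction of the remaining errors a one-point improvement removes. The function name `relative_error_reduction` is made up for illustration, and relative reduction is only one assumed reading of "matters more".

```python
def relative_error_reduction(before: float, after: float) -> float:
    """Fraction of the existing errors that the improvement removes."""
    if before <= 0:
        raise ValueError("'before' must be a positive error rate")
    return (before - after) / before


# The same one-point absolute improvement from two different starting points.
low_start = relative_error_reduction(0.10, 0.09)   # about 10% of errors removed
high_start = relative_error_reduction(0.80, 0.79)  # about 1.25% of errors removed

print(f"from 10% error: {low_start:.2%} of errors removed")
print(f"from 80% error: {high_start:.2%} of errors removed")
```

Under that reading, the same absolute gain is worth roughly eight times as much at 10% error as at 80%, which a single error-per-memory ratio cannot express.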


I see, and I agree with you. But I would imagine the useful metric to be “error rate below X GB of memory”. To compile that, we really just need memory and/or compute to be reported when these evaluations are performed. People do this for training reports, since compute and memory are implicit in the training time (people saturate the hardware and report what hardware they’re using). But for inference there are no such details :\
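A minimal sketch of what an "error rate below X GB" report could look like, assuming each evaluation also records peak inference memory. The `EvalResult` type, the `best_under_budget` helper, and every number in `RESULTS` are hypothetical placeholders, not real measurements.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class EvalResult:
    name: str
    error_rate: float      # fraction of examples answered incorrectly
    peak_memory_gb: float  # peak memory observed during inference


def best_under_budget(
    results: Sequence[EvalResult], budget_gb: float
) -> Optional[EvalResult]:
    """Return the lowest-error result that fits in the budget, or None."""
    eligible = [r for r in results if r.peak_memory_gb <= budget_gb]
    return min(eligible, key=lambda r: r.error_rate, default=None)


# Hypothetical numbers, for illustration only.
RESULTS = [
    EvalResult("model-a", error_rate=0.30, peak_memory_gb=4.0),
    EvalResult("model-b", error_rate=0.18, peak_memory_gb=14.0),
    EvalResult("model-c", error_rate=0.12, peak_memory_gb=40.0),
]

for budget in (2, 8, 16, 48):
    winner = best_under_budget(RESULTS, budget)
    label = winner.name if winner else "nothing fits"
    print(f"<= {budget} GB: {label}")
```

Unlike a ratio, this never rewards a model that uses no memory and gets everything wrong: it either fits the budget and competes on error alone, or it is excluded.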



