
Paper code is probably optimized for "first to publish". It's also overfit in a non-traditional sense, since people want to beat SOTA by as much as possible. Also, the heuristic tips and tricks and autotuning you'd want in a production model would exceed paper length 10x. And the author is motivated to NOT provide a bug-free, easy-to-put-in-production version of the code, since that would lower the $ value of their expertise. A cocktail of all the wrong incentives!

The production versions of those models are probably suboptimal in different ways, but work better in practice...



