On the Difficulty of Extrapolation with NN Scaling (lukemetz.com)
21 points by ericjang on Jan 26, 2022 | 2 comments



It seems like fancy hyperparameter optimization techniques (e.g. Bayesian black-box optimization) probably don't help here either, because they don't solve the problem of extrapolating outside the range of hyperparameter values that have already been tried. Is that a valid conclusion?


I think in theory those techniques should still work, in the sense that they give you the best prediction (for some definition of “best”) of the next point to test given all the previous information. But the more hyperparameters you can vary, and the further you extrapolate from your observations, the more likely it is that something surprising happens. You should not expect a fancy tuning algorithm to anticipate surprises; these algorithms are designed to do the opposite, by exploiting predictable trends.
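
To make that concrete, here is a toy sketch, not anything taken from the linked post. Everything in it is an assumption for illustration: `true_best_lr` is a made-up ground truth (a power law in model width that bends above width 1024), and the "tuner" is just a least-squares power-law fit standing in for whatever surrogate model a real optimizer would use. The point is only that a model fitted to a clean trend looks perfect inside the observed range and can still be badly off outside it.

```python
import math


def true_best_lr(width):
    """Hypothetical ground truth (an assumption, not real data):
    best learning rate follows width**-0.5, then falls off faster
    once width exceeds 1024."""
    lr = 0.1 * width ** -0.5
    if width > 1024:
        lr *= 1024 / width  # the "surprise" that small models never reveal
    return lr


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = a + b * log(x). Returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    b = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly)) / sum(
        (u - mean_x) ** 2 for u in lx
    )
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, x):
    """Evaluate the fitted power law at x."""
    return math.exp(a + b * math.log(x))


# Only small models have been tried so far.
observed_widths = [64, 128, 256, 512, 1024]
a, b = fit_power_law(observed_widths, [true_best_lr(w) for w in observed_widths])


def relative_error(width):
    """Relative gap between the fitted prediction and the toy ground truth."""
    truth = true_best_lr(width)
    return abs(predict(a, b, width) - truth) / truth


interp_err = relative_error(384)    # inside the observed range
extrap_err = relative_error(8192)   # far outside the observed range

print(f"fitted exponent: {b:.3f}")
print(f"relative error at width 384:  {interp_err:.2e}")
print(f"relative error at width 8192: {extrap_err:.2f}")
```

In this toy setup the fit recovers the trend essentially exactly on the widths it has seen, so nothing in the observations hints that the prediction at width 8192 is off by a large factor. A fancier surrogate would face the same issue, since the information about the bend simply is not in the data it was given.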



