Just Ask for Generalization (2021) (evjang.com)
38 points by jxmorris12 13 days ago | 4 comments

xg15 13 days ago
(2021), but still very interesting. The "post-overfitting" training strategy in particular is unexpected.

dev_hugepages 12 days ago
This is talking about the double descent phenomenon (https://en.wikipedia.org/wiki/Double_descent).
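
A minimal sketch of what that looks like, assuming the model-wise form of double descent: random ReLU features fit with minimum-norm least squares. The dataset, feature counts, and noise level below are arbitrary choices for illustration, not anything from the article; test error typically peaks near the interpolation threshold (where the number of features matches the number of training points) and falls again as the model becomes heavily over-parameterized.

    import numpy as np

    rng = np.random.default_rng(0)

    d = 5                        # input dimension
    w_true = rng.normal(size=d)  # ground-truth linear teacher

    def make_data(n, noise=0.1):
        X = rng.normal(size=(n, d))
        y = X @ w_true + noise * rng.normal(size=n)
        return X, y

    def relu_features(X, W):
        # Random ReLU feature map: phi(x) = max(0, x W)
        return np.maximum(X @ W, 0.0)

    n_train = 40
    X_tr, y_tr = make_data(n_train)
    X_te, y_te = make_data(1000)

    # Sweep the number of random features through the interpolation
    # threshold (n_feat == n_train) and report train/test error.
    for n_feat in [5, 10, 20, 40, 80, 200, 1000]:
        W = rng.normal(size=(d, n_feat)) / np.sqrt(d)
        Phi_tr, Phi_te = relu_features(X_tr, W), relu_features(X_te, W)
        # Minimum-norm least-squares fit; the pseudo-inverse covers both
        # the under- and the over-parameterized (interpolating) regime.
        beta = np.linalg.pinv(Phi_tr) @ y_tr
        tr_mse = np.mean((Phi_tr @ beta - y_tr) ** 2)
        te_mse = np.mean((Phi_te @ beta - y_te) ** 2)
        print(f"features={n_feat:5d}  train MSE={tr_mse:.4f}  test MSE={te_mse:.4f}")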

luckystarr 12 days ago
I vaguely remember that this was observed when training GPT-3 (probably?) as well: they just kept training, and the error went up and then came back down again, like a phase transition in the model.

esafak 12 days ago
The low sample efficiency of RL is well explained.