E-Reverance's comments

> Residual connections are more than a trick to help gradients flow. They’re a conservation law.

> Not a hack, not a trick. A principled constraint that makes the architecture work at scale.


OK, I thought I was reading too much into it but those same sentences also jumped out for me


Pangram thinks the whole thing was LLM-generated, fwiw. As dodgy as AI detectors are, it is probably among the best. I don't doubt the author started with their own text, but I think it's been substantially revised via ChatGPT.


yes this reads like classic intellectual fellicitatio


> blowing up kids

Not to refute the difference in extent, but this is somewhat notable: https://en.wikipedia.org/wiki/Dahyan_airstrike


How was he equally brutal?


I haven't tried this myself and this might be absurd, but attending PhD defences might be an interesting way to meet new people


It should be noted that these are NOT the official scores on the private evaluation set.


Here it matters much less than in generic LLMs though. There's no chance of test set leakage since the network is not general purpose / not trained on the internet.


I didn't know how to title this. I definitely don't believe his proof claims but I found this whole event to be psychologically interesting


> Opus 4.5 likes it

And to think, people said peer review in academia is dead.


One can care about both


> I do actually believe that zero teenagers should make banking apps or run non-profits.

That sounds like a lot of fun and should be a pretty social experience.

Also I'm going to assume his parents are proud, which should put his family at ease.


Surprised there wasn't any mention of Equilibrium Matching [1] in the future work section

[1] https://raywang4.github.io/equilibrium_matching/


Just for reference, the main author's stance on god: https://youtu.be/k_VBzweMIlM?t=125

