Hacker News
Models have some pretty funny attractor states (lesswrong.com)
3 points by semiquaver 1 hour ago | past | discuss
Shaping the exploration of the motivation-space matters for AI safety (lesswrong.com)
1 point by gmays 3 hours ago | past | discuss
Large-Scale Online Deanonymization with LLMs (lesswrong.com)
1 point by cubefox 3 hours ago | past | discuss
The optimal age to freeze eggs is 19 (lesswrong.com)
32 points by surprisetalk 4 hours ago | past | 27 comments
To the Polypropylene Makers (lesswrong.com)
88 points by raldi 1 day ago | past | 27 comments
Sacred Values of Future AIs (lesswrong.com)
1 point by gmays 3 days ago | past | discuss
Refusal in LLMs is mediated by a single direction (lesswrong.com)
2 points by rzk 4 days ago | past | discuss
Models have some pretty funny attractor states (lesswrong.com)
3 points by debesyla 5 days ago | past | discuss
Canada Lost Its Measles Elimination Status Because Few Nurses Speak Low German (lesswrong.com)
4 points by surprisetalk 7 days ago | past | 2 comments
AI found 12 OpenSSL zero-days (lesswrong.com)
24 points by theptip 9 days ago | past | 1 comment
Are there lessons from high-reliability engineering for AGI safety? (lesswrong.com)
1 point by Gathering6678 10 days ago | past | discuss
Responsible Scaling Policy v3 (lesswrong.com)
1 point by ndr 11 days ago | past | discuss
Great Mathematicians on Math Competitions (2010) (lesswrong.com)
1 point by o4c 12 days ago | past | discuss
Life at the Frontlines of Demographic Collapse (lesswrong.com)
4 points by reducesuffering 14 days ago | past | 1 comment
"Pinky Promise Diplomacy" Once Stopped a War in the Middle East (lesswrong.com)
2 points by positivesum 14 days ago | past
Childhoods of Exceptional People (2023) (lesswrong.com)
1 point by Kinrany 17 days ago | past
AI found 12 of 12 OpenSSL zero-days (lesswrong.com)
8 points by AndrewDucker 18 days ago | past | 8 comments
My journey to the microwave alternate timeline (lesswrong.com)
385 points by jstanley 18 days ago | past | 191 comments
Notes on International Klein Blue (lesswrong.com)
4 points by mhb 20 days ago | past
Why I'm Worried About Job Loss and Thoughts on Comparative Advantage (lesswrong.com)
85 points by cubefox 20 days ago | past | 124 comments
Spectral Signatures of Gradual Disempowerment (lesswrong.com)
1 point by OgsyedIE 26 days ago | past
To be well-calibrated is to be punctual (lesswrong.com)
2 points by surprisetalk 27 days ago | past
Google Translate apparently vulnerable to prompt injection (lesswrong.com)
59 points by julkali 30 days ago | past | 3 comments
AI found 12 of 12 OpenSSL zero-days (lesswrong.com)
2 points by greedo 32 days ago | past | 1 comment
52.5% of Moltbook posts show desire for self-improvement (lesswrong.com)
1 point by one-2 35 days ago | past
Are We in a Continual Learning Overhang? (lesswrong.com)
2 points by cubefox 38 days ago | past
AI found 12 of 12 OpenSSL zero-days (lesswrong.com)
2 points by jelsisi 38 days ago | past
A Simple Method for Accelerating Grokking (lesswrong.com)
2 points by vuciv 38 days ago | past
Test your interpretability techniques by de-censoring Chinese models (lesswrong.com)
2 points by allenleee 39 days ago | past
How AI is learning to think in secret (lesswrong.com)
2 points by jstanley 40 days ago | past | 1 comment
