Hacker News
refulgentis
39 days ago
on: 30% drop in O1-preview accuracy when Putnam proble...
No, an absolutely massive number of people do. In fact, they have been doing exactly what you recommend because, as you note, it's obvious and required for a basic, proper evaluation.