
> As long as an LLM is capable of inserting "9.99 > 10.01?" into an evaluation tool, we're on a good way

ChatGPT will switch to Python for some arithmetic, with the result that you get floating-point issues where an 8-year-old would get the answer right. I think "switch to a tool" still requires understanding which tool gives a reliable result, which in turn means understanding the problem. It's an interesting issue.
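To make that concrete, here's a rough sketch of the kind of thing I mean. It isn't what ChatGPT actually runs; the helper names `float_sum_matches` and `decimal_sum_matches` are made up for illustration, and the classic 0.1 + 0.2 case stands in for whatever arithmetic the model hands off. Same question, two tools, two different answers:

```python
from decimal import Decimal


def float_sum_matches(a: str, b: str, expected: str) -> bool:
    """Check a + b == expected using binary floating point (Python's float)."""
    return float(a) + float(b) == float(expected)


def decimal_sum_matches(a: str, b: str, expected: str) -> bool:
    """Check a + b == expected using exact base-10 arithmetic (decimal.Decimal)."""
    return Decimal(a) + Decimal(b) == Decimal(expected)


# Binary floats cannot represent 0.1, 0.2 or 0.3 exactly, so the check fails.
print(float_sum_matches("0.1", "0.2", "0.3"))    # False

# Decimal parses the digits as written, so the check passes.
print(decimal_sum_matches("0.1", "0.2", "0.3"))  # True

# The comparison from the quoted comment, done in Decimal.
print(Decimal("9.99") > Decimal("10.01"))        # False
```

The comparison itself comes out right either way; it's the follow-up arithmetic where picking float over Decimal bites, and picking correctly requires knowing that the inputs are decimal quantities in the first place.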



