
Could some of the "wrong" answers be the LLM attempting to give an explanation rather than just the answer? E.g. instead of answering 'X', the LLM answers 'The letter is partially hidden by the oval, so cannot be certain, but it appears to be the English letter X'.

If the scoring only looks at the first character of the response, it would record this answer as 'T', which is wrong.
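
To make the failure mode concrete, here's a rough sketch in Python. I don't know how the actual scoring works, so `naive_score` (first letter of the response wins) and `lenient_score` (last standalone capital letter wins) are both made-up names and heuristics for illustration, not the benchmark's real code:

```python
import re

def naive_score(answer: str) -> str:
    """Hypothetical scorer: treat the first alphabetic character as the answer."""
    for ch in answer:
        if ch.isalpha():
            return ch.upper()
    return ""

def lenient_score(answer: str) -> str:
    """Hypothetical alternative: prefer a standalone single capital letter.

    If the response mentions several, the last one wins, on the assumption
    that models tend to state their conclusion at the end.
    """
    stripped = answer.strip()
    if len(stripped) == 1 and stripped.isalpha():
        return stripped.upper()
    # \b[A-Z]\b matches a capital letter that is a whole word by itself,
    # so the 'T' in 'The' is skipped but a bare 'X' is found.
    standalone = re.findall(r"\b[A-Z]\b", answer)
    return standalone[-1] if standalone else naive_score(answer)

answer = (
    "The letter is partially hidden by the oval, so cannot be certain, "
    "but it appears to be the English letter X"
)
print(naive_score(answer))
print(lenient_score(answer))
```

The lenient version is still a heuristic: a response containing the words 'I' or 'A' after the intended letter would fool it. Asking the model to reply with a single character only, or checking a sample of responses by hand, would be more reliable.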



