
I don't think that's accurate; it generates novel outputs that were not observed in the training data.



It doesn't generate new tokens.

Train an LLM on text that only uses lowercase, and it will never output an uppercase letter.
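A minimal sketch (PyTorch, with a made-up vocabulary) of why that holds: the model's final layer is a softmax over a fixed token set, so anything outside that set has no index and can never be sampled.

    import torch

    # Hypothetical vocabulary, fixed at training time -- lowercase only.
    vocab = ["a", "b", "c", "d", " "]
    logits = torch.randn(len(vocab))      # output of the final layer
    probs = torch.softmax(logits, dim=0)  # distribution over vocab only
    next_token = vocab[torch.multinomial(probs, 1).item()]
    # "A" can never be emitted: it has no row in the output matrix.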


So the model is limited to using words and characters that already exist. I agree with you, but I don't see why that's a limitation worth pointing out.


You literally have to put every number into the training data for it to do mathematics correctly...

It's as stupid as that. Some try to get around it by having only the ten different digits and gluing them together, but it's a hallucination that that works.
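A quick way to see how a real GPT tokenizer actually chunks digits (a sketch assuming tiktoken is installed; the exact splits depend on the encoding):

    import tiktoken

    # cl100k_base is the encoding used by GPT-4-era models.
    enc = tiktoken.get_encoding("cl100k_base")
    ids = enc.encode("123456789")
    # Prints the text of each token: multi-digit chunks, not single digits.
    print([enc.decode([i]) for i in ids])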

An important part of generalization is, for example, that you can teach it something new. This literally matters in practice.

'ycombinator is a website' is a prompt that is almost impossible to complete if ycombinator is not in your training set.


But can it put two tokens together: 10 01 = 1001?
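At the text level, at least, token sequences do concatenate; a tiny check using the same assumed tiktoken encoding as above:

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    # Decoding the concatenated ID lists yields the concatenated string.
    print(enc.decode(enc.encode("10") + enc.encode("01")))  # prints "1001"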



