
I'd love to read a semi-technical book on everything we've learned about what works and what doesn't with LLMs.

It would be out of date in months.

Things that didn’t work 6 months ago do now. Things that don’t work now, who knows…


There are still some tropes from the GPT-3 days that are fundamental to how LLMs are built and that shape how they can be used; they won't change unless models stop being trained to optimize for next-token prediction (e.g. hallucinations and the need for prompt engineering).
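For readers unfamiliar with the term, here is a minimal sketch of what "trained to optimize for next-token prediction" looks like at inference time, assuming the Hugging Face transformers library with GPT-2 as a stand-in model; the prompt, model choice, and greedy decoding loop are illustrative, not anything specific from this thread. The point is that nothing in the loop checks factual accuracy: the model only ever picks a plausible next token, which is why hallucinations and prompt sensitivity fall out of the training objective.

    # Illustrative only: greedy next-token decoding with a small causal LM.
    # Model choice (gpt2) and prompt are placeholders.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    input_ids = tokenizer("The capital of France is", return_tensors="pt").input_ids

    with torch.no_grad():
        for _ in range(10):
            logits = model(input_ids).logits      # [batch, seq_len, vocab]
            next_id = logits[0, -1].argmax()      # most likely next token, nothing more
            input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

    print(tokenizer.decode(input_ids[0]))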

Do you mean performance that was missing in the past is now routinely achieved?

Or do you actually mean that the same routines and data that didn't work before suddenly work?



Each new model opens up new possibilities for my work. In a year it's gone from "sort of useful, but I'd rather write a script" to "gets me 90% of the way there zero-shot and 95% with few-shot prompting."
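For anyone unsure what the zero-shot vs. few-shot distinction means here, a rough sketch follows, assuming the OpenAI Python client (with an API key set in the environment); the model name, the date-formatting task, and the example pairs are placeholders of my own, not anything from this comment. Zero-shot sends only the instruction; few-shot prepends a handful of worked input/output pairs before the real query.

    # Illustrative only: zero-shot vs. few-shot prompting.
    # Model name, task, and examples are placeholders.
    from openai import OpenAI

    client = OpenAI()

    # Zero-shot: the instruction alone, no worked examples.
    zero_shot = [
        {"role": "user", "content": "Convert this date to ISO 8601: March 5, 2024"},
    ]

    # Few-shot: a few input/output pairs shown before the real query.
    few_shot = [
        {"role": "user", "content": "Convert this date to ISO 8601: July 4, 1999"},
        {"role": "assistant", "content": "1999-07-04"},
        {"role": "user", "content": "Convert this date to ISO 8601: Dec 31, 2020"},
        {"role": "assistant", "content": "2020-12-31"},
        {"role": "user", "content": "Convert this date to ISO 8601: March 5, 2024"},
    ]

    for name, messages in [("zero-shot", zero_shot), ("few-shot", few_shot)]:
        reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
        print(name, reply.choices[0].message.content)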



