
The significance of the paper lies more in what it implies about how far in-context learning (ICL) can take you than in the ease or viability of the proposed solution.

Sure, there are better methods for arithmetic, but arithmetic is extremely quantifiable, with rigid steps. What happens when you step outside that kind of domain? Like the blog above, or code documentation. For example, you can paste new documentation into a GPT-4 instance and it will use it for your queries as if it had been trained on it.

Basically, "Memory Augmented Large Language Models are Computationally Universal" (https://arxiv.org/abs/2301.04589), and you kind of get a feeling for that from the previous paper.




You've got a limited context window (for now). There's only so much you can put into a prompt, so how much you can teach it this way is going to be pretty limited. Whatever you teach it had better be the primary task you're using it for.

You can't do it for everything, but if you can generate code and run it outside the LLM, you should.
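To make "run it outside the LLM" concrete, here is a minimal sketch, assuming the model has already returned a Python snippet as plain text. The function name `run_generated_code` and the hard-coded `generated` string are hypothetical, and a subprocess with a timeout is not a real sandbox, so treat this as an illustration rather than something safe for untrusted output.

```python
import subprocess
import sys


def run_generated_code(code: str, timeout_s: float = 5.0) -> str:
    """Run a Python snippet in a separate interpreter and return its stdout.

    Hypothetical helper: it isolates the snippet from the current process,
    but it does NOT restrict what the snippet can do on the machine.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    if result.returncode != 0:
        # Surface the snippet's own error message instead of failing silently.
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()


# Stand-in for text a model might return; hard-coded here for illustration.
generated = "print(123456789 * 987654321)"
print(run_generated_code(generated))
```

The interpreter does the multiplication exactly, so the model only has to get the code right, not the digits.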


The limits of the context window become much less important (though they can still be a problem, I agree) when crucial context can be dynamically inserted only when relevant.

GPT-3.5 doesn't need the algorithm prompt for every single query. It just needs it for every query that requires arithmetic. Much more feasible.
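A rough sketch of that kind of conditional insertion, assuming a crude regex is good enough to decide that a query "requires arithmetic". `ALGORITHM_PROMPT`, `needs_arithmetic`, and `build_prompt` are made-up names, and the prompt text is a placeholder, not the one from the paper.

```python
import re

# Hypothetical placeholder for the long algorithm prompt.
ALGORITHM_PROMPT = "<step-by-step arithmetic algorithm goes here>"

# Crude assumption: a query needs arithmetic if it contains two numbers
# joined by an operator. A real router would likely need something smarter.
ARITHMETIC_PATTERN = re.compile(r"\d+\s*[-+*/]\s*\d+")


def needs_arithmetic(query: str) -> bool:
    """Return True if the query looks like it contains an arithmetic expression."""
    return ARITHMETIC_PATTERN.search(query) is not None


def build_prompt(query: str) -> str:
    """Prepend the algorithm prompt only when the query appears to need it."""
    if needs_arithmetic(query):
        return f"{ALGORITHM_PROMPT}\n\n{query}"
    return query
```

Queries without arithmetic go through unchanged, so the context-window cost is only paid when the extra instructions are actually useful.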


It seems like most of the time, a calculator plugin would do a better job, though? When would it make sense to have it reason it out itself?


It wouldn't make sense for arithmetic, I agree. I was thinking more about other kinds of problems.



