I kinda assume that work on LLMs published by major companies is a red herring.
I worked at a company once where four teams tried different techniques. The best approach was shipped to users, and the other three ended up as papers at a major conference.
I think it can be generally informative about what they may be targeting. Here, it seems the interest is in somehow reducing model size and 'throwing away' some factual information in exchange for more reasoning capability (obviously this is very loose and uses undefined terms).
The 'factual knowledge' only needs to be roughly available, with a way to do a lookup to recall the details.
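To make that concrete, here is a toy sketch of the split, assuming Python: the model only has to produce a rough query, and an exact fact store handles the recall. The fact table and the word-overlap scoring are made up purely for illustration.

    # Toy lookup layer: rough query in, exact stored fact out.
    FACTS = {
        "boiling point of water": "100 C at 1 atm",
        "speed of light": "299,792,458 m/s",
        "capital of australia": "Canberra",
    }

    def recall(rough_query: str):
        """Return the stored fact whose key overlaps most with the query."""
        words = set(rough_query.lower().split())
        best_key = max(FACTS, key=lambda k: len(words & set(k.split())))
        if words & set(best_key.split()):
            return FACTS[best_key]
        return None

    print(recall("what is the speed of light"))  # -> 299,792,458 m/s

A real system would use embeddings or a search index instead of word overlap, but the division of labor is the same: the model's 'rough' knowledge only has to get you to the right entry.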
If you think about a P=NP proof… you show it’s false and collect a million dollars, but if you prove it’s true, you sit tight on your buttocks and print money with your top secret solution.
It's all fun and games until you end up with companies doing "peer review" and purposefully tanking good papers, making it seem a particular research direction isn't fruitful, while secretly using that same work to get ahead of the competition.
If the output of prompts can be improved with "think step by step", "tree of thoughts", etc., one can produce high-quality training data with the LLM itself, which the next iteration of that model can use. Rinse and repeat.
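A rough sketch of what that loop could look like, assuming Python; query_model, is_correct, and finetune are placeholders rather than any real API, and the point is only the generate-filter-retrain shape.

    # Sketch of the "rinse and repeat" self-improvement loop.
    COT_PROMPT = "Question: {q}\nLet's think step by step.\nAnswer:"

    def query_model(model, prompt: str) -> str:
        return "..."   # placeholder: call the current model here

    def is_correct(question: str, answer: str) -> bool:
        return True    # placeholder: verify against ground truth or a checker

    def finetune(model, examples):
        return model   # placeholder: train the next iteration on the examples

    def self_improve(model, questions, rounds: int = 3):
        for _ in range(rounds):
            new_data = []
            for q in questions:
                answer = query_model(model, COT_PROMPT.format(q=q))
                if is_correct(q, answer):      # keep only outputs that check out
                    new_data.append({"prompt": q, "completion": answer})
            model = finetune(model, new_data)  # next iteration trains on its own best outputs
        return model

The filtering step is what keeps this from being pure circularity: only outputs that pass some external check make it into the next round's training data.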
You can't do better than perfect. But if the default is sufficiently bad, you can seem to violate conservation by putting more effort in.
Similar to how optimizing a machine from 1% efficiency to 10% does not violate thermodynamics.
There is a limit to what you can learn from limited information. Every bit of information you take in can at best halve the space of possible theories, so k bits can distinguish at most 2^k hypotheses. However, current LLMs are ludicrously far from that limit.
There must be. Since there is a finite amount of real-world reward signal captured by these models, only a finite amount of grounded knowledge can come out. Without reasoning, the output knowledge "yield" will usually be far lower than the input.
The interesting piece is that if we can get LLMs to reason well, that finite reward capture could take them much further than humans could get with the same signal.
Oof, I really don’t think there must be such a law. The basic physics of the universe and the mathematical structure of a human-engineered LLM are just very different things.
The human equivalent would be the process of armchair philosophizing. Humans get around the limit by being able to go and look at the world to obtain more knowledge -- if you didn't let humans look at the world, they too would have a "knowledge limit".
And if you _do_ let LLMs look at the world, and train on that, they also wouldn't have that limitation.
That depends. You can implement an error-checking step with the LLM itself to check for drift in meaning and loss of original context from version to version as you improve your training dataset, and you can measure performance as you train new versions of your LLM on that dataset.
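For instance, something like the sketch below (Python, with the judging call stubbed out): the LLM is asked whether each rewritten example still says the same thing as the original, and the dataset version is gated on how many examples get flagged. The prompt wording and the gating idea are just illustrative.

    # Sketch of using the LLM itself as a drift check between dataset versions.
    JUDGE_PROMPT = (
        "Do these two passages convey the same facts and intent? "
        "Answer YES or NO.\n\nOriginal:\n{old}\n\nRewritten:\n{new}"
    )

    def ask_judge(prompt: str) -> str:
        return "YES"  # placeholder: call the judging model here

    def drift_rate(old_dataset, new_dataset) -> float:
        """Fraction of examples the judge flags as having drifted."""
        flagged = 0
        for old, new in zip(old_dataset, new_dataset):
            verdict = ask_judge(JUDGE_PROMPT.format(old=old, new=new))
            if not verdict.strip().upper().startswith("YES"):
                flagged += 1
        return flagged / max(len(old_dataset), 1)

    # Gate each new dataset version on both the drift rate and benchmark
    # scores before training the next model on it.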
I think curriculum training is going to come back around now that we have found an approach that generally works. That seems like the direction a lot of these papers are going.
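In its simplest form that just means staging the data by some difficulty score before training. A minimal sketch, assuming Python, with a crude length-based stand-in for the difficulty score:

    # Minimal curriculum sketch: easier examples first, harder ones later.
    def difficulty(example: str) -> float:
        return len(example.split())  # crude proxy: longer == harder

    def curriculum_stages(examples, stages: int = 3):
        ordered = sorted(examples, key=difficulty)
        stage_size = max(len(ordered) // stages, 1)
        for i in range(0, len(ordered), stage_size):
            yield ordered[i:i + stage_size]  # train on each stage in turn

In practice the difficulty score might come from sequence length, loss under a reference model, or a human or model rating, but the ordering idea is the same.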
What reason do we have to think that the "explain like I'm five" output actually represents the reasoning process of the upstream LLM, and not something completely retconned? It often would be in humans, after all.
It could be retconned, but it at least gives the model the chance to use the space to work out a solution in smaller, easier parts.
If you ask for an answer first and then ask for an explanation after, you will most likely get self-justifying BS for an answer. You have to ask for the explanation first.
All the LLM is doing is generating text. Forcing it to describe a step-by-step reasoning process is the only way to get it to work through one. And the whole point of using this as a prompting strategy is that it often gets you something different and better than what you might call the "first guess" it gives you if you don't ask for intermediate results.
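Concretely, the difference is just where the answer lands relative to the reasoning in the generated text. A rough illustration of the two prompt shapes (the wording here is made up):

    # Because generation is left-to-right, the first template fixes the answer
    # before any reasoning exists to condition on; the second conditions the
    # answer on the intermediate steps already written out.
    ANSWER_FIRST = (
        "Question: {q}\n"
        "Give the final answer, then explain your reasoning."
    )

    REASONING_FIRST = (
        "Question: {q}\n"
        "Work through the problem step by step, "
        "then state the final answer on the last line."
    )

The first shape is exactly the "self-justifying BS" case described above; the second is the one that buys you the improvement.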
I was confused because I thought this was in reference to the quantum physics software Orca. It would be interesting if AI could be used in such a way, though.