Orca: Progressive Learning from Complex Explanation Traces of GPT-4 (arxiv.org)
113 points by kordlessagain on June 13, 2023 | 29 comments



I kinda assume that work on LLMs published by major companies is a red herring.

I worked at a company once, where four teams tried different techniques. The best approach was shipped to users and the other three ended up as papers at a major conference.


I think it can still be informative about what they may be targeting. Here, the interest seems to be in reducing model size and 'throwing away' some factual information in exchange for more reasoning capability (obviously this is very loose and uses undefined terms). The 'factual knowledge' only needs to be roughly available, with a way to look it up when needed.


If you think about a P=NP proof… you show it’s false and collect a million dollars, but if you prove it’s true, you sit tight on your buttocks and print money with your top secret solution.

Just people playing the capitalism game.


It's all fun and games until companies start doing "peer review" and purposefully tanking good papers, making it seem like a particular research direction isn't fruitful while secretly using that same work to get ahead of the competition.


If the output of prompts can be improved with "think step by step", "tree of thoughts", etc., one can produce high-quality training data with the LLM itself, which the next iteration of that model can use. Rinse and repeat.
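Sketched very loosely in Python (the generate and fine_tune calls below are hypothetical stand-ins, not any particular API):

    # Self-distillation loop sketch: the current model writes step-by-step
    # answers, which become the training data for the next model.
    # `generate` and `fine_tune` are hypothetical helpers, not a real API.

    def build_dataset(model, questions):
        dataset = []
        for q in questions:
            prompt = q + "\n\nLet's think step by step."
            trace = generate(model, prompt)        # reasoning trace from the current model
            dataset.append({"prompt": q, "completion": trace})
        return dataset

    model = "base-model-v0"
    for _ in range(3):                             # rinse and repeat
        data = build_dataset(model, questions)     # questions: whatever task prompts you have
        model = fine_tune(model, data)             # next iteration trains on its own traces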


Like the laws of conservation of energy, matter, and momentum, I guess there must be a law of conservation of knowledge in the context of LLMs?


You can't do better than perfect. But if the default is sufficiently bad, you can seem to violate conservation by putting more effort in.

Similarly, when a machine has an efficiency of 1%, optimizing it to 10% does not violate thermodynamics.

There is a limit to what you can learn from limited knowledge. Every bit of information you learn can only divide the space of possible theories by half. However, current LLMs are ludicrously far from that limit.


There must be. Since there is a finite amount of real-world reward signal captured by these models, there can only be a finite amount of grounded knowledge that can come out. Usually, without reasoning, the output knowledge "yield" will be far lower than the input.

The interesting piece is that if we can get LLMs to reason well, that finite reward capture could take them much farther than humans could go with the same signal.


Oof I really don’t think there must be such a law. The basic physics of the universe and the mathematical structure of a human engineered LLM are just very different things.


The "compressed" knowledge is fixed. But reasoning can create new knowledge, that's what humans do all the time.


Nope! That surely doesn't apply to humans, so why would it necessarily apply to learning machines?


The human equivalent would be the process of armchair philosophizing. Humans get around the limit by being able to go and look at the world to obtain more knowledge -- if you didn't let humans look at the world, they too would have a "knowledge limit".

And if you _do_ let LLMs look at the world, and train on that, they wouldn't have that limitation either.


My Pareto law senses are tingling.


> one can produce high quality training data with the LLM itself which the next iteration of that model can use. Rinse and repeat.

It may be higher quality, but far from perfect.

After many rinses and repeats, the dataset will accumulate a significant number of errors.


That depends. You can use the LLM itself to implement an error-checking step that looks for drift in meaning and loss of original context from version to version as you improve your training dataset, and you can measure performance as you train new versions of your LLM on that dataset.
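A rough sketch of that filter (the judge call is a hypothetical LLM query that returns a score, not a specific API):

    # Sketch: use the model itself to flag drift in meaning between an
    # original example and its regenerated version, keeping only faithful
    # rewrites. `judge` is a hypothetical LLM call.

    def filter_drift(pairs, threshold=0.8):
        kept = []
        for original, rewritten in pairs:
            prompt = (
                "Does the rewritten answer preserve the meaning of the original?\n"
                "Original: " + original + "\n"
                "Rewritten: " + rewritten + "\n"
                "Reply with a single score from 0.0 to 1.0."
            )
            score = float(judge(prompt))    # model-as-judge faithfulness score
            if score >= threshold:
                kept.append(rewritten)
        return kept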


I think curriculum training is going to come back around now that we have found an approach that generally works. That seems like the direction a lot of these papers are going.


Any citations to past work in that area which you like? (TIA)


I suppose the paper in TFA will have helpful citations.


What reason do we have to think that the "explain like I'm five" output actually represents the reasoning process of the upstream LLM, and not something completely retconned? It often would be in humans, after all.


It could be retconned, but it at least gives the model the chance to use the space to work out a solution in smaller, easier parts.

If you ask for an answer first and then ask for an explanation after, you will most likely get self-justifying BS for an answer. You have to ask for the explanation first.
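For example (purely illustrative prompt wording, no particular model assumed):

    # Explanation-first: the final answer is conditioned on the reasoning.
    explanation_first = (
        "Question: Is 1001 prime?\n"
        "Explain your reasoning step by step, then state the answer."
    )

    # Answer-first: the "explanation" tends to rationalize whatever answer
    # was already emitted.
    answer_first = (
        "Question: Is 1001 prime?\n"
        "State the answer first, then explain your reasoning."
    )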


Yeah, I can easily buy that it makes it easier to learn a process. Just maybe not the one that the upstream LLM actually used.


All the LLM is doing is generating text. Forcing it to describe a step-by-step reasoning process is the only way to get it to work through one. And the whole point of using this as a prompting strategy is that it often gets you something different and better than what you might call the "first guess" it gives you if you don't ask for intermediate results.


I was confused because I thought this was in reference to the quantum physics software Orca; it would be interesting if AI could be used in such a way, though.


The first thing that came to my mind was Orca (ORCΛ), the audiovisual cellular automata. https://hundredrabbits.itch.io/orca

https://twitter.com/search?q=%23ORC%CE%9B&src=typed_query&f=...


Doesn't GPT-4 have some restriction on being able to use models trained this way commercially? Are people ignoring that?


Only if it's enforced, which doesn't seem too feasible


The paper was published a week ago, and it's supposed to be an open-source model... Where is it? Why is it still not open?


Yes, exactly. Everyone is talking about how great it is, yet it's nowhere to be found.


I'm waiting too. Given MS's association with OpenAI, maybe there are legal or financial disincentives to releasing this as open source.



