I kinda assume that work on LLMs published by major companies is a red herring.
I worked at a company once where four teams tried different techniques. The best approach was shipped to users, and the other three ended up as papers at a major conference.
I think it can be generally informative about what they may be targeting. Here, it seems the interest is in somehow reducing model size and 'throwing away' some factual information in exchange for more reasoning capability (obviously this is very loose and uses undefined terms).
The 'factual knowledge' only needs to be roughly available, with a way to do a lookup to recall the details.
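To make that concrete, here is a toy sketch of the split, assuming Python: the model only has to produce a rough query, and an exact fact store handles the recall. The fact table and the word-overlap scoring are made up purely for illustration.

    # Toy lookup layer: rough query in, exact stored fact out.
    FACTS = {
        "boiling point of water": "100 C at 1 atm",
        "speed of light": "299,792,458 m/s",
        "capital of australia": "Canberra",
    }

    def recall(rough_query: str):
        """Return the stored fact whose key overlaps most with the query."""
        words = set(rough_query.lower().split())
        best_key = max(FACTS, key=lambda k: len(words & set(k.split())))
        if words & set(best_key.split()):
            return FACTS[best_key]
        return None

    print(recall("what is the speed of light"))  # -> 299,792,458 m/s

A real system would use embeddings or a search index instead of word overlap, but the division of labor is the same: the model's 'rough' knowledge only has to get you to the right entry.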
If you think about a P=NP proof… you show it’s false and collect a million dollars, but if you prove it’s true, you sit tight on your buttocks and print money with your top secret solution.
It's all fun and games until you end up with companies doing "peer review" and purposefully tanking good papers, making it seem a particular research direction isn't fruitful, while secretly using that same work to get ahead of the competition.
If the output of prompts can be improved with "think step by step", "tree of thoughts", etc., one can produce high-quality training data with the LLM itself, which the next iteration of that model can use. Rinse and repeat.
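A rough sketch of what that loop could look like, assuming Python; query_model, is_correct, and finetune are placeholders rather than any real API, and the point is only the generate-filter-retrain shape.

    # Sketch of the "rinse and repeat" self-improvement loop.
    COT_PROMPT = "Question: {q}\nLet's think step by step.\nAnswer:"

    def query_model(model, prompt: str) -> str:
        return "..."   # placeholder: call the current model here

    def is_correct(question: str, answer: str) -> bool:
        return True    # placeholder: verify against ground truth or a checker

    def finetune(model, examples):
        return model   # placeholder: train the next iteration on the examples

    def self_improve(model, questions, rounds: int = 3):
        for _ in range(rounds):
            new_data = []
            for q in questions:
                answer = query_model(model, COT_PROMPT.format(q=q))
                if is_correct(q, answer):      # keep only outputs that check out
                    new_data.append({"prompt": q, "completion": answer})
            model = finetune(model, new_data)  # next iteration trains on its own best outputs
        return model

The filtering step is what keeps this from being pure circularity: only outputs that pass some external check make it into the next round's training data.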
You can't do better than perfect. But if the default is sufficiently bad, you can seem to violate conservation by putting more effort in.
Similar to how optimizing a machine from 1% efficiency to 10% does not violate thermodynamics.
There is a limit to what you can learn from limited information. Every bit of information you take in can at best halve the space of possible theories, so k bits can distinguish at most 2^k hypotheses. However, current LLMs are ludicrously far from that limit.
There must be. Since there is a finite amount of real-world reward signal captured by these models, only a finite amount of grounded knowledge can come out. Without reasoning, the output knowledge "yield" will usually be far lower than the input.
The interesting piece is that if we can get LLMs to reason well, that finite reward capture could take them much further than humans could get with the same signal.
Oof, I really don’t think there must be such a law. The basic physics of the universe and the mathematical structure of a human-engineered LLM are just very different things.
The human equivalent would be the process of armchair philosophizing. Humans get around the limit by being able to go and look at the world to obtain more knowledge -- if you didn't let humans look at the world, they too would have a "knowledge limit".
And if you _do_ let LLMs look at the world, and train on that, they also wouldn't have that limitation.
That depends. You can implement an error-checking step with the LLM itself to check for drift in meaning and loss of original context from version to version as you improve your training dataset, and you can measure performance as you train new versions of your LLM on that dataset.
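For instance, something like the sketch below (Python, with the judging call stubbed out): the LLM is asked whether each rewritten example still says the same thing as the original, and the dataset version is gated on how many examples get flagged. The prompt wording and the gating idea are just illustrative.

    # Sketch of using the LLM itself as a drift check between dataset versions.
    JUDGE_PROMPT = (
        "Do these two passages convey the same facts and intent? "
        "Answer YES or NO.\n\nOriginal:\n{old}\n\nRewritten:\n{new}"
    )

    def ask_judge(prompt: str) -> str:
        return "YES"  # placeholder: call the judging model here

    def drift_rate(old_dataset, new_dataset) -> float:
        """Fraction of examples the judge flags as having drifted."""
        flagged = 0
        for old, new in zip(old_dataset, new_dataset):
            verdict = ask_judge(JUDGE_PROMPT.format(old=old, new=new))
            if not verdict.strip().upper().startswith("YES"):
                flagged += 1
        return flagged / max(len(old_dataset), 1)

    # Gate each new dataset version on both the drift rate and benchmark
    # scores before training the next model on it.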
I think curriculum training is going to come back around now that we have found an approach that generally works. That seems like the direction a lot of these papers are going.
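In its simplest form that just means staging the data by some difficulty score before training. A minimal sketch, assuming Python, with a crude length-based stand-in for the difficulty score:

    # Minimal curriculum sketch: easier examples first, harder ones later.
    def difficulty(example: str) -> float:
        return len(example.split())  # crude proxy: longer == harder

    def curriculum_stages(examples, stages: int = 3):
        ordered = sorted(examples, key=difficulty)
        stage_size = max(len(ordered) // stages, 1)
        for i in range(0, len(ordered), stage_size):
            yield ordered[i:i + stage_size]  # train on each stage in turn

In practice the difficulty score might come from sequence length, loss under a reference model, or a human or model rating, but the ordering idea is the same.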
What reason do we have to think that the "explain like I'm five" output actually represents the reasoning process of the upstream LLM, and not something completely retconned? It often would be in humans, after all.
It could be retconned, but it at least gives the model the chance to use the space to work out a solution in smaller, easier parts.
If you ask for an answer first and then ask for an explanation after, you will most likely get self-justifying BS for an answer. You have to ask for the explanation first.
All the LLM is doing is generating text. Forcing it to describe a step-by-step reasoning process is the only way to get it to work through one. And the whole point of using this as a prompting strategy is that it often gets you something different and better than what you might call the "first guess" it gives you if you don't ask for intermediate results.
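Concretely, the difference is just where the answer lands relative to the reasoning in the generated text. A rough illustration of the two prompt shapes (the wording here is made up):

    # Because generation is left-to-right, the first template fixes the answer
    # before any reasoning exists to condition on; the second conditions the
    # answer on the intermediate steps already written out.
    ANSWER_FIRST = (
        "Question: {q}\n"
        "Give the final answer, then explain your reasoning."
    )

    REASONING_FIRST = (
        "Question: {q}\n"
        "Work through the problem step by step, "
        "then state the final answer on the last line."
    )

The first shape is exactly the "self-justifying BS" case described above; the second is the one that buys you the improvement.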
I was confused because I thought this was in reference to the quantum physics software Orca. It would be interesting if AI could be used in such a way, though.