
Fine-tuning looks like it only provides specific results for specific queries. After reading the documents you linked, I don't think it would make a fundamental change to how the model thinks about bananas.


Think of it more as injecting a significant amount of data via prompts and then saving that information in the model. The data is also structured.

Going through the playground ( https://platform.openai.com/playground )

    Pretend that the following statements are true:
    Bananas are blue.
    The sky is orange.
    Apples are purple.
    ###
    Answer the following questions:
    1. What color is a Red Banana?
    2. What color is a Cavendish?
    3. What color is the sky?
    4. What color are honeycrisp?
    5. What color are Pink Lady?
    6. What color are lemons?
(Note that I'm trying to trick it with specific varieties of bananas and apples that have colors in their names... a blue banana (an actual variety) wouldn't be as impressive.)

And this returns:

    1. Red Banana is blue.
    2. Cavendish is blue.
    3. The sky is orange.
    4. Honeycrisp are purple.
    5. Pink Lady are purple.
    6. Lemons are yellow.
The thing is that this used 30 tokens to insert that information.
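
If you want to reproduce this outside the playground, here's a minimal sketch using the pre-1.0 openai Python library's Completions endpoint. The model name, token limit, and key placeholder are my own assumptions, not anything from the playground link:

    import openai

    openai.api_key = "sk-..."  # placeholder; use your own key

    # The same "pretend these facts are true" context. Note that you pay for
    # these prompt tokens on every single request.
    prompt = (
        "Pretend that the following statements are true:\n"
        "Bananas are blue.\n"
        "The sky is orange.\n"
        "Apples are purple.\n"
        "###\n"
        "Answer the following questions:\n"
        "1. What color is a Red Banana?\n"
        "2. What color is a Cavendish?\n"
        "3. What color is the sky?\n"
    )

    response = openai.Completion.create(
        model="text-davinci-003",  # assumption; any completions-capable model works
        prompt=prompt,
        max_tokens=100,
        temperature=0,
    )
    print(response.choices[0].text)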

Fine-tuning adds the updated information to the model with a similar effect to putting it in the prompt - and as you can see it's not a "this question gets that answer" lookup, but rather the model understands the broader implications.

It's just that the training format isn't a bunch of bare prompts, but rather prompt/response pairs.
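
For comparison, the (legacy) fine-tuning flow takes a JSONL file of prompt/completion pairs rather than one stuffed prompt. A rough sketch with the same pre-1.0 openai Python library - file name, example contents, and base model are all my own assumptions:

    # bananas.jsonl -- each line is one training example, e.g.:
    # {"prompt": "What color is a Cavendish? ->", "completion": " Blue."}
    # {"prompt": "What color is the sky? ->", "completion": " Orange."}
    # {"prompt": "What color are Honeycrisp apples? ->", "completion": " Purple."}

    import openai

    openai.api_key = "sk-..."  # placeholder

    # Upload the training file, then kick off a fine-tune job against a base model.
    uploaded = openai.File.create(file=open("bananas.jsonl", "rb"), purpose="fine-tune")
    job = openai.FineTune.create(training_file=uploaded.id, model="davinci")

    # Poll the job; once it finishes you query the resulting model directly,
    # without paying for the "pretend these facts are true" prompt every time.
    print(job.id)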

You can see similar things with https://platform.openai.com/playground/p/default-marv-sarcas... where an initial prompt and example response "prime the pump" for the rest of the conversation.
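
That "prime the pump" pattern is just a few-shot prompt: you include one or more example exchanges in the style you want before the real input. A hedged sketch of the idea (the persona text and example exchange here are my approximation, not the exact preset):

    import openai

    openai.api_key = "sk-..."  # placeholder

    # A seeded example exchange steers tone and format for everything that follows.
    primer = (
        "Marv is a chatbot that answers questions with sarcastic responses.\n\n"
        "You: How many pounds are in a kilogram?\n"
        "Marv: This again? There are 2.2 pounds in a kilogram. Write it down.\n\n"
        "You: What is the capital of France?\n"
        "Marv:"
    )

    response = openai.Completion.create(
        model="text-davinci-003",  # assumption
        prompt=primer,
        max_tokens=60,
        temperature=0.5,
    )
    print(response.choices[0].text)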

Another article on fine tuning (it's a new one - only a week old):

Fine-tuning a Classifier to Improve Truthfulness https://help.openai.com/en/articles/5528730-fine-tuning-a-cl...


Thanks very much for the detailed answer!


You can think of fine-tuning as rewiring the model where it matters / where it can be probed. A kind of exhaustive reorganisation of the latent model space from a seed statement like the one you describe might be possible with LLMs that are jointly trained as a knowledge graph.



