
Instruction fine-tuning definitely increases the win rate in human-judged comparisons, but that just means it's better at generating a style humans like. Humans aren't always right.
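
For anyone unfamiliar with the metric: the "win rate" is just the fraction of pairwise comparisons where human raters preferred the fine-tuned model's output over the base model's. A minimal sketch in Python (the judgment data here is made up for illustration):

  judgments = [
      {"a": "finetuned", "b": "base", "preferred": "a"},
      {"a": "finetuned", "b": "base", "preferred": "b"},
      {"a": "finetuned", "b": "base", "preferred": "a"},
  ]

  # Count how often the rater's preferred side was the fine-tuned model.
  wins = sum(1 for j in judgments if j[j["preferred"]] == "finetuned")
  win_rate = wins / len(judgments)
  print(f"fine-tuned win rate: {win_rate:.0%}")  # -> 67%

Nothing in that number says the preferred outputs were more correct, only that raters liked them better.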

https://arxiv.org/pdf/2203.02155.pdf (page 56 of 68, Table 14): it looks like the only things the fine-tunes beat the base models at are HellaSwag and the benchmarks that use human evaluations. Otherwise the base GPT models are winning.

And just before that, on page 55, they discuss how the fine-tunes do cause performance regressions.




I completely agree that humans are not always right, but neither is the next token in a text sequence always "right", i.e. the thing that a raw LM is best at predicting.
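
To be concrete about what "best at" means here: the raw-LM objective is just next-token cross-entropy. A standard PyTorch sketch, with toy tensors standing in for a real model's logits:

  import torch
  import torch.nn.functional as F

  vocab_size = 100
  tokens = torch.randint(0, vocab_size, (1, 16))  # one sequence of 16 token ids

  # Stand-in for what a real LM would produce from tokens[:, :-1].
  logits = torch.randn(1, 15, vocab_size, requires_grad=True)

  # Each position is trained to predict the *next* token.
  loss = F.cross_entropy(
      logits.reshape(-1, vocab_size),  # predictions for positions 0..14
      tokens[:, 1:].reshape(-1),       # targets: the same tokens shifted by one
  )
  loss.backward()

A model that minimizes this is optimized to match the training distribution, not to be "right" in any task-level sense.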

I just object to the blanket statement that fine-tuning will make a model dumber. It mostly means the model is not as good at the original training task, but that really doesn't mean it is "dumber" by any reasonable definition.

The question then is what fine-tuning does to tasks that are neither the original training task nor the fine-tuning task. That will depend on the fine-tuning task. Instruction fine-tuning seems to improve performance on tasks that involve human interaction, so I have a hard time seeing it as the model becoming dumber. Other fine-tuning objectives, such as removing toxicity, may have a higher cost on unrelated tasks, so there one could say they made the model dumber.
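
One way to make "cost on unrelated tasks" measurable is to compare perplexity on held-out text from a task neither stage trained on, before and after fine-tuning. A sketch with Hugging Face transformers; "gpt2" is a real checkpoint, while the fine-tuned path is a hypothetical placeholder:

  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  def perplexity(model_name: str, text: str) -> float:
      tok = AutoTokenizer.from_pretrained(model_name)
      model = AutoModelForCausalLM.from_pretrained(model_name)
      model.eval()
      enc = tok(text, return_tensors="pt")
      with torch.no_grad():
          # Passing labels=input_ids makes the model return its own
          # next-token cross-entropy loss.
          out = model(**enc, labels=enc["input_ids"])
      return torch.exp(out.loss).item()

  held_out = "Some text from a task the fine-tune never saw."
  print(perplexity("gpt2", held_out))                  # base model
  # print(perplexity("./my-finetuned-gpt2", held_out)) # hypothetical checkpoint

If the fine-tuned model's perplexity is much worse on text like this, that's the kind of regression the paper calls an "alignment tax".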



