
I agree I left out option 0, but I think the other two were presented correctly?

- Black-box distillation uses the teacher's direct answers to questions, plus its conversation style. This is less useful because you still have to do supervised fine-tuning on those answers, which may be wrong, and it doesn't lead to greater insights (which reinforcement learning does)

- RLAIF relies on preferences and values to judge answers. It doesn't need supervised fine-tuning, and it helps guide the new model toward better answers rather than just correcting answers to specific, previously asked questions
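
To make the difference between the two bullets concrete, here is a rough sketch of the data each approach collects. All names here (`DistillationExample`, `PreferencePair`, `build_preference_pair`, and the `judge` callable) are hypothetical and only for illustration; a real judge would be a model scoring answers against stated preferences, not a toy function.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DistillationExample:
    """Black-box distillation: one prompt and the teacher's answer."""
    prompt: str
    teacher_answer: str  # becomes the training target, whether right or wrong


@dataclass
class PreferencePair:
    """RLAIF-style signal: which of two answers the AI judge preferred."""
    prompt: str
    chosen: str
    rejected: str


def build_preference_pair(
    prompt: str,
    candidates: List[str],
    judge: Callable[[str, str], float],  # hypothetical: (prompt, answer) -> score
) -> PreferencePair:
    """Rank candidate answers with the judge and keep the best and the worst."""
    ranked = sorted(candidates, key=lambda answer: judge(prompt, answer), reverse=True)
    return PreferencePair(prompt=prompt, chosen=ranked[0], rejected=ranked[-1])
```

The distillation record carries a full target answer, while the preference record only says which candidate was better.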




Well, I mean you mixed up "fine-tuning" and "reinforcement learning" a bit when describing these options.

Regarding the value of these options, SFT (supervised fine-tuning) communicates more information per example to the model being trained, since it gets a full target answer rather than just a preference, but there's a risk of overfitting. So I'd guess they might use both - do a bit of SFT and then finish with RLAIF.
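
As a rough illustration of the "more information" point, here is a minimal sketch of the two training signals, assuming the usual token-level negative log-likelihood for SFT and a Bradley-Terry pairwise loss of the kind commonly used to train reward models from preferences. The function names and inputs are made up for the example; nothing here is claimed to be what any particular lab does.

```python
import math
from typing import List


def sft_loss(token_logprobs: List[float]) -> float:
    """Mean negative log-likelihood of the target answer.

    Every token of the teacher's answer contributes its own term,
    so one example supervises the model at each position.
    """
    return -sum(token_logprobs) / len(token_logprobs)


def preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log(sigmoid(chosen - rejected)).

    One comparison yields a single scalar signal: was the chosen
    answer scored above the rejected one, and by how much.
    """
    margin = score_chosen - score_rejected
    return math.log1p(math.exp(-margin))
```

SFT pushes the model toward one exact answer token by token, which is where the overfitting risk comes from, while the preference loss only asks that better answers score higher than worse ones.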



