
I don’t think so. The Phi training plan was to pull answers from textbooks and have GPT-4 write questions for those answers, which ensured high-quality completions. They then trained on that data fairly indiscriminately. This is also about the quality of training data, but it’s much more general: the approach can target broad-scale web data, using a small, cheap model to ‘sort’ and prioritize it.
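To make the "sort and prioritize" idea concrete, here is a rough sketch, not anyone's actual pipeline. It assumes a cheap scoring function stands in for the small model; `toy_quality_score` and `prioritize` are hypothetical names, and the distinct-word ratio is only a placeholder for a real model's quality estimate.

```python
from typing import Callable, Iterable, List


def toy_quality_score(doc: str) -> float:
    """Hypothetical stand-in for a small, cheap model's quality score.

    Uses the fraction of distinct words as a crude proxy, so highly
    repetitive text scores low. A real setup would call a model here.
    """
    words = doc.split()
    if not words:
        return 0.0
    return len({w.lower() for w in words}) / len(words)


def prioritize(
    docs: Iterable[str],
    score: Callable[[str], float],
    keep_fraction: float,
) -> List[str]:
    """Rank documents by score (highest first) and keep the top fraction."""
    ranked = sorted(docs, key=score, reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]


if __name__ == "__main__":
    web_docs = [
        "buy now buy now buy now",
        "The mitochondria is the powerhouse of the cell",
        "click click click click",
    ]
    for doc in prioritize(web_docs, toy_quality_score, keep_fraction=0.5):
        print(doc)
```

The point is only that the expensive step (training) sees data that a cheap step has already ranked, which is what lets this scale to web-sized corpora where generating everything with GPT-4 would not.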



