I don’t think so. The Phi training plan was to pull answers from textbooks and have GPT-4 write questions for those answers, to ensure high-quality completions. They then trained on that data fairly indiscriminately. This approach is also about training-data quality, but it’s much more general: it can target broad-scale web data, using a small, cheap model to ‘sort’ and prioritize it.
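
To make the "sort and prioritize" idea concrete, here is a rough sketch of what I mean, not anyone's actual pipeline. The scorer `toy_quality_score`, the `prioritize` helper, and the `keep_fraction` parameter are all names I made up for illustration; a real setup would swap the toy heuristic for a call to a small, cheap model.

```python
from typing import Callable, List


def toy_quality_score(doc: str) -> float:
    """Hypothetical stand-in for a small, cheap scoring model.

    A real setup would call a small classifier or language model here; this
    heuristic (fraction of purely alphabetic words) only makes the sketch run.
    """
    words = doc.split()
    if not words:
        return 0.0
    return sum(w.isalpha() for w in words) / len(words)


def prioritize(
    docs: List[str],
    score: Callable[[str], float],
    keep_fraction: float = 0.5,
) -> List[str]:
    """Sort documents by score (best first) and keep the top fraction."""
    ranked = sorted(docs, key=score, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Tiny made-up "web crawl" to show the flow.
web_docs = [
    "click here >>> buy now !!! $$$",
    "Gradient descent updates parameters in the direction that reduces loss",
    "lorem 123 456 ipsum 789",
    "A binary search halves the search interval at every step",
]
selected = prioritize(web_docs, toy_quality_score, keep_fraction=0.5)
```

The point is just that the scorer runs over everything cheaply, and only the top-ranked slice goes on to the expensive training run.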