My understanding is that Microsoft research has already published a paper where ...

My understanding is that Microsoft research has already published a paper where they used synthetic chat interactions of the same form that chatGPT uses to train a new model. GPT4 could be used to select the best interactions from which to create a training set. I’d be very, very surprised if OpenAI hasn’t already been doing this internally.

https://arxiv.org/pdf/2306.02707.pdf