Are there any publications out there that analyze this in more depth? How are these datasets scheduled? Do you put your highest-quality data first, or do you actually train on "dumb" data first until you establish some general language understanding, and only then introduce the high-quality data? There is a lot of interesting research to do here that I'm sure people have already investigated....