> We do know that the human brain can generalize from much less data.
Adult human brains receive orders of magnitude more data than GPT-3 did by age 18. Probably even by age one. Take vision, for example, which is believed to be roughly an 8000x8000, 24Hz stream with HDR (so more than 3 bytes per pixel). Uncompressed, that alone is about 4.6 GB/s, on the order of 145 PB per year (still around 100 PB counting only waking hours). Slightly over half of GPT-3's training data is Common Crawl, which only recently grew to 320TB.
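Working through the arithmetic for that stream (taking the stated 8000x8000 resolution, 24Hz, and 3 bytes per pixel as a lower bound):

```python
# Back-of-envelope: uncompressed data rate of an 8000x8000, 24 Hz,
# 3-bytes-per-pixel video stream, accumulated over one year.
width, height = 8000, 8000
bytes_per_pixel = 3          # lower bound; HDR would be more
fps = 24
seconds_per_year = 365 * 24 * 3600

bytes_per_second = width * height * bytes_per_pixel * fps
bytes_per_year = bytes_per_second * seconds_per_year

print(f"{bytes_per_second / 1e9:.1f} GB/s")    # ~4.6 GB/s
print(f"{bytes_per_year / 1e15:.0f} PB/year")  # ~145 PB/year
```

Even if you only count 16 waking hours a day, that is still roughly 100 PB per year, i.e. hundreds of times the 320TB figure.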
Where have you seen a two-year-old who can solve the tasks GPT-3 can?
Where have you seen a two-year-old who has read 320TB of text?
If we fed a few years' worth of video and audio to a neural network, could it write War and Peace? Could it play chess from a text description of the game?
> Obviously, it has to be the same data to make a comparison.
No, it is very much not obvious. There is quite a bit of research showing that a model trained on X samples of type A plus Y samples of type B can outperform, on type-A tasks, a model trained on the X samples alone.