Well, yes, but it's still humans standing on the shoulders of other humans. Even though human players do memorize opening books, it stays in the family, so to speak. Meanwhile, a human player facing an AI engine is battling both the AI and the great human players of the past (who invented the openings).
Well, we can extend that to say that biological systems are self-assembled randomly and selected by evolutionary algorithms, starting from random molecules on the sea floor.
That's already a known method to transfer "knowledge" from one model to another. I should double-check before quoting a paper, but I think this one discusses it (http://arxiv.org/abs/1503.02531).
You train many models, then "distill" them into a single model by using the ensemble's combined predictions as the training targets for that single model trained afterwards.
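Very roughly, a minimal sketch of that setup (my own toy PyTorch example, not the paper's code; the temperature, loss weighting, and toy model sizes are all assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy teacher ensemble and a smaller student; real models would be much larger.
teachers = [nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10)) for _ in range(3)]
student = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # softening temperature (the value here is an arbitrary choice)

def distill_step(x, y):
    # Average the teachers' softened predictions to form the soft targets.
    with torch.no_grad():
        teacher_logits = torch.stack([t(x) for t in teachers]).mean(dim=0)
        soft_targets = F.softmax(teacher_logits / T, dim=-1)

    student_logits = student(x)
    # Soft loss: match the ensemble's distribution; hard loss: match the true labels.
    soft_loss = F.kl_div(F.log_softmax(student_logits / T, dim=-1), soft_targets,
                         reduction="batchmean") * (T * T)
    hard_loss = F.cross_entropy(student_logits, y)
    loss = 0.5 * soft_loss + 0.5 * hard_loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage with random data, just to show the shapes involved.
x = torch.randn(32, 20)
y = torch.randint(0, 10, (32,))
print(distill_step(x, y))
```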
You're right to point out that humans don't do that.
I think it would be "cheating" if you train BetaGo on AlphaGo, for the purposes of that experiment. The goal would be to have some kind of "clean room" where people fumble around.
Of course, you can also run the other experiment to see how fast you can bootstrap BetaGo from AlphaGo. That's also interesting.
I'm pretty sure that the reinforcement learning algorithm they are using is guaranteed to converge. It just takes a very long time to train, and using human games probably sped it up.
As far as I know, using neural networks for function approximation destroys the various convergence guarantees that are otherwise available. NNs can easily diverge and suffer catastrophic forgetting; that's one of the things that made them challenging to use in RL despite their power, and why you need patches like experience replay and frozen target networks.
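For what it's worth, here is a bare-bones illustration of those two patches in a DQN-style update; the network shapes, buffer size, and hyperparameters are placeholders I made up, not anything from AlphaGo:

```python
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder Q-network; state/action dimensions are arbitrary for illustration.
def make_qnet():
    return nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))

q_net = make_qnet()
target_net = make_qnet()
target_net.load_state_dict(q_net.state_dict())  # frozen copy, updated only occasionally

optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)  # experience replay buffer
gamma = 0.99

def store(state, action, reward, next_state, done):
    # Each element is stored as a tensor so we can stack them into batches later.
    replay.append((state, action, reward, next_state, done))

def train_step(batch_size=32):
    if len(replay) < batch_size:
        return
    # Sampling uniformly from the buffer breaks the temporal correlations
    # that otherwise destabilize training.
    batch = random.sample(replay, batch_size)
    states, actions, rewards, next_states, dones = map(torch.stack, zip(*batch))

    # Bootstrapped targets come from the frozen network, not the one being trained.
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + gamma * next_q * (1 - dones)

    q_values = q_net(states).gather(1, actions.long().unsqueeze(1)).squeeze(1)
    loss = F.smooth_l1_loss(q_values, targets)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

def sync_target():
    # Called every N steps: copy the online weights into the frozen target network.
    target_net.load_state_dict(q_net.state_dict())
```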
The niggling thought in my mind was that AlphaGo's strength is built on human strength.