Learning architectures come in all shapes, sizes, and forms. This could mean that the same fundamental principles of cognition drive all of them, just implemented in different ways. If that's true, one would do well to understand the extremely simple cases first and go from there.
Building a very simple self-organizing system from first principles is the flying machine. Trying to copy an extremely complex system by generating statistically plausible data is the non-flying bird.