Where I think this analogy breaks down is that these models are not simply copying or memorizing the training data. They use the data to fit a model, heavily compressing the specifics while retaining general patterns and abstractions.
I really feel like "reading lots of books" is a better analogy. (Of course it's not perfect, since these systems can scale far past the number of books a human can read.)