It's a novel question, and it's impressive that Gemini was able to solve it, but the tweet's author is claiming that this is because of the large context window, not because of all the Harry Potter-related training data available to it.
The generic models definitely know a lot about Harry Potter without any additional context.
Probably 80% of my questions to ChatGPT were about Harry Potter plot and character details while my kid was reading the books. It was extremely knowledgeable about all the minutiae, probably thanks more to the online discussion than to the books themselves. It was actually the first LLM killer app for me.
That's a good point. I would describe this as a test of Gemini's ability to re-read something it's already familiar with, not a valid test of its ability to read a large new corpus.