It's a tokenization issue: there can't be a circuit to count letters, because tokenization means the same letters get represented in myriad different ways.
You are wrong: there can be a circuit to count letters, because the model can easily normalize tokens internally -- we know it can transform text to base64 just fine. So there is no reason such a circuit couldn't exist.
The training is just too dumb to create such a circuit even with all that massive data input, but it's super easy for a human to build such a neural net on those input tokens. It's just the kind of problem that transformers are exceedingly bad at learning, so they don't pick it up very well even though it's a very simple computation for them to do.
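Here's a minimal sketch of what I mean, in plain Python with a made-up four-token vocabulary (not any real tokenizer): a fixed lookup from token id to per-letter counts, summed over the sequence, gives the right answer no matter how the word is split. A transformer could in principle represent the same thing as one embedding row per token plus additive aggregation across positions.

```python
# Minimal sketch: letter counting over tokens is a simple, tokenization-
# independent computation. The vocabulary below is hypothetical.

from collections import Counter

# Hypothetical tokenizer vocabulary: token id -> surface string.
VOCAB = {0: "straw", 1: "berry", 2: "str", 3: "awberry"}

# Precomputed table: token id -> letter counts (built once, like an embedding).
LETTER_COUNTS = {tid: Counter(text) for tid, text in VOCAB.items()}

def count_letter(token_ids, letter):
    """Sum the per-token counts of `letter` across a token sequence."""
    return sum(LETTER_COUNTS[tid][letter] for tid in token_ids)

# "strawberry" tokenized two different ways still gives the same answer,
# because the lookup normalizes away the tokenization.
print(count_letter([0, 1], "r"))  # straw|berry  -> 3
print(count_letter([2, 3], "r"))  # str|awberry  -> 3
```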
I was saying that the seahorse emoji failure is not a tokenization issue. If you ask an LLM to do research, you will sometimes get hallucinated articles -- potentially plausible articles that, if they existed, would have been embedded at the position from which the model tried to decode. This is what we see happening with the seahorse emoji. The model identifies where the seahorse emoji would have been embedded if it existed and then decodes from that position.
In the research case you get articles that were never written. In the seahorse case, the later layers hallucinate the seahorse emoji, but in the final decoding step the output gets mapped onto another nearby emoji.
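Here's a toy numpy sketch of that final decoding step; the vectors are invented for illustration and not taken from any real model. The later layers produce a hidden state for a seahorse, but since no seahorse row exists in the unembedding matrix, the argmax snaps onto whichever real emoji token is nearest.

```python
# Toy sketch of decoding a hidden state for a token that doesn't exist.
# All vectors below are made up for illustration.

import numpy as np

# Hypothetical unembedding rows for a few real emoji tokens.
unembed = {
    "🐎 horse": np.array([0.9, 0.1, 0.0]),
    "🐠 fish":  np.array([0.1, 0.9, 0.0]),
    "🌊 wave":  np.array([0.0, 0.2, 0.9]),
}

# Hidden state the later layers produce for "seahorse": a blend of horse-ness
# and fish-ness, with no dedicated seahorse row in the vocabulary to match it.
hidden = np.array([0.6, 0.7, 0.1])

# Final decoding: dot product with every real token row, take the argmax.
logits = {tok: float(vec @ hidden) for tok, vec in unembed.items()}
print(max(logits, key=logits.get))  # snaps to a nearby real emoji ("🐠 fish")
```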
Admittedly, the seahorse example does differ from the research case in one way. Article titles, since they use normal characters, can be produced whether they exist or not (e.g., "This is a fake hallucinated article" gets produced just as easily as "A real article title"). It's actually nice that the model can't produce the seahorse emoji, since it gets forced (by tokens, yes) to decode back into reality.
Yes, tokenization affects how the hallucination manifests, but the underlying problem is not a tokenization one.