Maybe go back and read what I said rather than make up nonsense. 'often fail' isn't 'always fail'. And many models fail the strawberry example; that's why it's famous. I even laid out some training samples of the type that enables current models to succeed at spelling 'games', if only in a fragile way.
Being problematic and fragile at spelling games, compared to using character- or byte-level 'tokenization', isn't a giant deal. These are largely "gotchas" that don't reduce the value of the product materially. Everyone in the field is aware. Hyperbole isn't required.
Someone linked you to one of the relevant papers above... and you still contort yourself into a pretzel. If you can't intuitively get the difficulty posed by current tokenization, and how character/byte-level 'tokenization' would make those things trivial (albeit with a tradeoff that doesn't make it worth it), maybe you don't have the horsepower required for the field.
"""
While current LLMs with BPE vocabularies lack direct access to a token's characters, they perform well on some tasks requiring this information, but perform poorly on others. The models seem to understand the composition of their tokens in direct probing, but mostly fail to understand the concept of orthographic similarity. Their performance on text manipulation tasks at the character level lags far behind their performance at the word level. LLM developers currently apply no methods which specifically address these issues (to our knowledge), and so we recommend more research to better master orthography. Character-level models are a promising direction. With instruction tuning, they might provide a solution to many of the shortcomings exposed by our CUTE benchmark.
"""