Well yes, because transformers just predict the next token. Their advantage is attention, which lets them weigh how relevant much earlier words are to the current prediction, so they predict the next word better than older architectures could.
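For concreteness, here's a minimal sketch of scaled dot-product attention, the mechanism I mean. It's illustrative NumPy, not any real library's API; the function name and shapes are my own:

```python
import numpy as np

def attention(Q, K, V):
    """Q, K, V: (seq_len, d_k) arrays of query/key/value vectors."""
    d_k = Q.shape[-1]
    # Each query scores every key: how relevant is each earlier token?
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns scores into weights that sum to 1 per position.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output: a relevance-weighted mix of the value vectors.
    return weights @ V
```

(In a real decoder you'd also mask out future positions so each token only attends backwards, but the relevance-weighting idea is the whole trick.)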
The transformer architecture isn't capable of general/unstructured thought the way we are, but I'd love for someone to build an ideation model that feeds into a transformer only when it needs input/output from the outside world. Abstract concepts will give us stronger AI, not another 6TB of text; language should be reserved exclusively for communication.
It's like when you're thinking of something but can't remember the word for it, and then the word comes to you. Transformers only work with the latter; what we need is something that can operate on the "something" without having to know the word for it (which only matters when talking to a human anyway).