I think you are mixing layers of abstraction. To make a crude but I think not unhelpful analogy: 'Understanding' is a natural language concept that is our way to describe whats happening in our heads, and like most other such concepts is resistant to any clear definition and will exhibit sorites type paradoxes when one is attempted. It belongs to the presentation layer of the stack. While the process of curve fitting, however it is implemented, with whatever NN structure (like transformers) or maybe something else entirely belongs to the physical layer of the stack -- akin to frequency modulation.
While I am unsure whether LLMs are really understanding, whatever that means, I think it is not difficult to believe that any form of understanding we implement will involve 'curve fitting' as a central part.
While I am unsure whether LLMs are really understanding, whatever that means, I think it is not difficult to believe that any form of understanding we implement will involve 'curve fitting' as a central part.