I wonder if a better approach is to build and train the ML model to recognize th...

I wonder if a better approach is to build and train the ML model to recognize the AST of the code in question rather than text. The workflow would be Build AST -> run ML to get an optimised solution render new code out from that optimised AST.

A lot of work I think and as is pointed out here devs wants their tooling for free.