Interesting take. Basically, the behaviour of OpenAI's o1 and every other reasoning model can be replicated with proper chain-of-thought (CoT) prompting, provided the model actually follows it. I learned CoT from an amazing blog I found on Reddit, then experimented with it, and the LLM performed much better. So I believe Alibaba must have improved the system prompt in their new model compared to their previous one.