CoT works because it’s an instruction the LLM has encountered in its instruction-tuning (and, to a smaller extent, in its original training data). You could fine-tune a base model to respond to every question by first outlining steps (sub-prompts, if you will). That wouldn’t be useful for a general-purpose chat model, but it is useful for some types of tasks.
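To make the fine-tuning idea concrete, here’s a minimal sketch of what such training data could look like: records whose completions always open with an outline of steps before the answer. The filename, field names, and the example question are all made up for illustration; real fine-tuning pipelines have their own schemas.

```python
import json

# Hypothetical fine-tuning records: every completion begins by
# outlining steps before stating the final answer, so the tuned
# model learns to respond step-first by default.
examples = [
    {
        "prompt": "How many weekdays are in a 30-day month that starts on a Monday?",
        "completion": (
            "Steps:\n"
            "1. A 30-day month is 4 full weeks (28 days) plus 2 extra days.\n"
            "2. Four full weeks contain 4 * 5 = 20 weekdays.\n"
            "3. Starting on Monday, the 2 extra days are Monday and Tuesday: 2 more weekdays.\n"
            "Answer: 22 weekdays."
        ),
    },
]

# One JSON object per line -- a common layout for fine-tuning files.
with open("steps_first.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

The point isn’t the file format; it’s that every target completion bakes the step-outlining behavior into the supervision signal, so no CoT instruction is needed at inference time.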