If you ask them to reason, their text prediction works differently, because it is now predicting text that contains reasoning steps. They do not actually reason.
I know it is hard to believe, because the results are (usually) so impressive, but this is nothing but text prediction.
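
To make the point concrete, here is a deliberately tiny sketch of a next-token predictor. It is not how a real language model is built: the two-line corpus, the two-token context window, and the names `train` and `complete` are all made up for illustration. The only thing it is meant to show is that the same prediction procedure produces "reasoning-looking" text when the prompt steers it toward training text that contains such steps.

```python
from collections import Counter, defaultdict

# Made-up training text: one bare answer, one answer preceded by steps.
CORPUS = [
    "answer : four",
    "think : two plus two makes four so answer : four",
]
END = "<end>"

def train(corpus):
    """Count which token follows each pair of preceding tokens."""
    counts = defaultdict(Counter)
    for line in corpus:
        tokens = line.split() + [END]
        for i in range(2, len(tokens)):
            counts[(tokens[i - 2], tokens[i - 1])][tokens[i]] += 1
    return counts

def complete(counts, prompt, max_tokens=20):
    """Greedily append the most frequent next token until END.

    Assumes the prompt has at least two whitespace-separated tokens.
    """
    tokens = prompt.split()
    for _ in range(max_tokens):
        context = (tokens[-2], tokens[-1])
        if context not in counts:
            break
        next_token = counts[context].most_common(1)[0][0]
        if next_token == END:
            break
        tokens.append(next_token)
    return " ".join(tokens)

model = train(CORPUS)
print(complete(model, "answer :"))  # answer : four
print(complete(model, "think :"))   # think : two plus two makes four so answer : four
```

Nothing in `complete` changes between the two calls. The second prompt simply lands in a part of the counts where the most likely continuation happens to spell out steps before the answer.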