If a new guarantee or behaviour were a breaking change, every non-patch release of a semversioned tool (which Python isn't) would be major.
And randomised at interpreter start, since Python 3.3.
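To see the randomisation concretely, here is a minimal sketch that asks fresh interpreters for the hash of the same string under different `PYTHONHASHSEED` values. The helper name `hash_in_fresh_interpreter` and the sample string are made up for illustration; `PYTHONHASHSEED` itself is the documented environment variable that pins the seed.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text, seed):
    """Return hash(text) as computed by a new interpreter with a fixed hash seed.

    Hypothetical helper for illustration only.
    """
    # Copy the current environment and pin the seed for the child process.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    # Same seed: the two runs agree. Different seed: the value changes.
    print(hash_in_fresh_interpreter("spam", 1))
    print(hash_in_fresh_interpreter("spam", 1))
    print(hash_in_fresh_interpreter("spam", 2))
```

With the variable unset, each interpreter start picks its own seed, so string hashes (and anything that depends on them, such as set iteration order) will usually differ from one run to the next.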
And that's sort of the problem. If it were broken, you'd know it and you'd fix or rearchitect it. Instead, it will appear to work.
For example, if you wrote something on PyPy it would be ordered, whereas on CPython it would not.