It can hold any bytes; it just happens that one way to construct or represent it uses a string-like literal syntax, as a convenience for developers. But you can build it in other ways, or make it hold data in any other format.
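To illustrate, here is a small sketch (the variable names are made up for the example) that builds the same two bytes in four different ways. Only the first one looks like a string:

```python
# Four ways to build the same bytes object; only the first looks like text.
from_literal = b"\x68\x69"                 # string-like literal syntax
from_ints = bytes([104, 105])              # from integers in range 0..255
from_hex = bytes.fromhex("6869")           # from a hex dump
from_number = (26729).to_bytes(2, "big")   # from the integer 0x6869

# All four are the exact same object value.
print(from_literal == from_ints == from_hex == from_number)  # True
print(from_literal)                                          # b'hi'
```

The `b'hi'` you see when printing is just how Python chooses to display those bytes, not a statement that they are text.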
So it makes sense that Python treats it as a raw byte array (what you call "a byte string"): it has no way to know whether it is UTF-8 or CP850 unless you tell it.
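A quick way to see why Python cannot guess: the same two bytes decode to different text depending on the encoding you name. The sample bytes below are picked for the example, not taken from your data:

```python
# The same raw bytes, interpreted under two different encodings.
raw = b"\xc3\xa9"

as_utf8 = raw.decode("utf-8")    # one character
as_cp850 = raw.decode("cp850")   # two characters

print(as_utf8)    # é
print(as_cp850)   # ├®
```

Both results are valid; only you know which one the data was meant to be.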
But because of C/C++ experience or habits from Python 2, one tends to confuse the concept of text (represented by the type "str" in Python) with one specific low-level implementation (the raw byte array).
Python 3 explicitly avoids this problem by requiring that either you know what the data is (UTF-8 text, a big-endian number, etc.) or you don't (raw byte array). Manually manipulating text as a raw byte sequence would be the equivalent of directly manipulating the IEEE 754 representation of a number: that is not what you want in a high-level scripting language, which is why Python 3 doesn't do it anymore.
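As a rough sketch of that "either you know what it is or you don't" rule, the snippet below takes raw bytes and turns them into a typed value only once an interpretation is stated explicitly. The byte values are chosen for illustration:

```python
import struct

raw = b"\x3f\xc0\x00\x00"

# Same four bytes, two explicit interpretations.
as_int = int.from_bytes(raw, "big")       # big-endian unsigned integer
as_float = struct.unpack(">f", raw)[0]    # big-endian IEEE 754 float32

print(as_int)     # 1069547520
print(as_float)   # 1.5

# Text works the same way: you name the encoding, you get a str.
text = b"caf\xc3\xa9".decode("utf-8")
print(text)       # café

# Without an explicit interpretation, Python 3 refuses to mix the two.
try:
    "abc" + b"def"
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)   # False
```

In Python 2 the last concatenation would have silently worked for ASCII data and failed later on anything else, which is exactly the confusion Python 3 removes.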