With radix sort, you're usually considering using it precisely because K is likely to be significantly smaller than log N, and so it's absolutely relevant to the actual problem at hand.
(For that matter, multiplication is not constant time either - it's O(N) in the number of bits, which is O(log N) in the size of the values stored - but this is conveniently forgotten in most algorithm analysis. If you limit the problem to integers that fit into a machine word, then this factor drops out as a constant, and nobody cares.)
Regardless of what algorithm you're working with, you have to be aware of the limits of the abstraction you use to analyze it. Fibonacci heaps are O(1), but nobody uses them because the constant factors swamp other simpler algorithms with worse computational complexity. And sometimes it's faster to use a red-black tree (or even linear search over an array) than a hashmap because hashmaps are technically O(k) in key size; red-black trees are too, for comparisons, but in a sparse key space the processor usually only has to examine the first 1-2 characters before it can bail out of the comparison routine while the hashmap has to examine every character.
Actually, multiplication is worth than O(bits). A naive approach will yield O(bits^2), but you can get something like O(bits^r) where r is not too much bigger than 1. See https://en.wikipedia.org/wiki/Multiplication_algorithm#Fast_... for an introduction