E.g. the two cores in the dark grey box can steal work from each other. But they only see the load averages of the neighbouring domain. In certain cases the current scheduler calculates the load figures in an odd way, so an idle core decides that a neighbouring overloaded 'scheduling domain' is not overloaded.
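To make that concrete, here is a toy sketch of how an averaged load figure can hide an idle core. This is not the kernel's code: the function names, the load numbers, and the "compare minimums" alternative are all made up for illustration.

```python
from statistics import mean

def steal_by_average(local_loads, remote_loads):
    """Hypothetical rule: steal only if the remote group's average load is higher."""
    return mean(remote_loads) > mean(local_loads)

def steal_by_minimum(local_loads, remote_loads):
    """Hypothetical alternative: compare the least-loaded core of each group."""
    return min(remote_loads) > min(local_loads)

# Assumed per-core loads, in arbitrary units.
local = [0.0, 4.0]   # one idle core sharing a group with one heavily loaded core
remote = [1.5, 1.5]  # both cores busy, with more runnable work than cores

by_average = steal_by_average(local, remote)  # average 1.5 vs 2.0
by_minimum = steal_by_minimum(local, remote)  # minimum 1.5 vs 0.0
print(by_average, by_minimum)
```

With these numbers the average makes the idle core's own group look like the busier one, so it never pulls work from the neighbour even though it has nothing to run.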
In figure 1 the levels/shades represent distance from node/socket 1, darker being closer. So node 1 is distance 0 from itself, two other nodes are distance 1, and one node is distance 2.
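If it helps, those distances are just hop counts, and you can get them with a breadth-first search. A small sketch, assuming a four-node machine where node 1 links to nodes 2 and 3, and node 4 links to nodes 2 and 3. The link layout is my guess for illustration, not read off the figure.

```python
from collections import deque

# Assumed interconnect: each node lists the nodes it is directly linked to.
LINKS = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}

def hop_distances(links, start):
    """Breadth-first search returning the hop count from start to every node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in links[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

distances = hop_distances(LINKS, 1)
print(distances)
```

For this layout node 1 ends up at distance 0, nodes 2 and 3 at distance 1, and node 4 at distance 2, which matches the shading described above.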
Also, what CPUs besides Intel's and IBM's use SMT?
Itanium also supports some form of SMT, I think, although I am not sure if anyone actually uses those.
Yeah, an oddly thought-out design for sure. The idea makes sense to me, but the execution wasn't good enough, I guess.