That's done on the PS3, and I hear it isn't easy to write efficient programs for it.

Each SPE has a tiny amount of memory (256K, IIRC), severely limited connectivity to other SPEs and a downright cruel instruction set. A friendlier ISA and Transputer-like connectivity between the nodes would alleviate some of these problems.

