Because it's only a superior solution if you just want one box, and that mostly for inference. Once you start scaling to larger loads, it's much trickier to get a clusters of Macs to efficiently process them in parallel, whereas datacenter GPUs are designed for clusters.