Since AI workloads are heavily parallelizable, what really matters is that cost (as in dollars) per unit of compute keeps decreasing exponentially.
It doesn't matter if you can't double the transistor density of a single CPU if you can just double the number of machines. At the end of the day you've still doubled performance for the same price.
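A back-of-the-envelope sketch of that argument (all numbers below are made up for illustration, not real hardware figures):

```python
# Toy model: for an embarrassingly parallel workload, aggregate
# throughput scales linearly with machine count, so if the price per
# machine falls, performance per dollar improves with no single-chip
# speedup at all. Figures are illustrative only.

def perf_per_dollar(flops_per_machine, machines, price_per_machine):
    """Aggregate FLOPS divided by total hardware cost."""
    total_flops = flops_per_machine * machines
    total_cost = price_per_machine * machines
    return total_flops / total_cost

# Same chip both generations, but hardware price halves, so the same
# budget buys twice as many machines:
gen1 = perf_per_dollar(1e12, 100, 4000)   # 100 machines at $4000 each
gen2 = perf_per_dollar(1e12, 200, 2000)   # 200 machines at $2000 each

assert gen2 == 2 * gen1  # performance per dollar doubled
```

This of course assumes the workload scales cleanly across machines, which is exactly the caveat raised further down the thread.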
Hardware acceleration/parallelization is the next frontier. We've already seen the benefit of some fairly simple ASICs (the TPU was deliberately built to be simple) as well as more general-purpose accelerators. Hardware architects used to have a hard time, because often the best option was simply to wait for CPUs to get faster. Now that CPU performance has begun to stall, it makes economic sense to invest not only in more parallel software but also in more application-specific accelerators.
CPUs/GPUs are beasts of hardware architecture, complex mostly because of their flexibility. We can achieve higher performance with dedicated hardware (or FPGAs), and the economic case for doing so is slowly becoming more certain.
Some problems don't parallelize well: some mixed integer programming problems, for instance, and some matrix operations. Those of us working on them look at the end of Moore's law with horror.
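Amdahl's law quantifies the pain: if a fraction s of the runtime is inherently serial, adding machines caps the speedup at 1/s no matter how many you buy. A minimal sketch (the 10% serial fraction is illustrative):

```python
def amdahl_speedup(serial_fraction, n_workers):
    """Maximum speedup on n_workers machines when serial_fraction of
    the runtime cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_workers)

# With just 10% serial work, even a huge cluster never beats 10x:
print(round(amdahl_speedup(0.10, 16), 1))     # 6.4
print(round(amdahl_speedup(0.10, 1024), 1))   # 9.9
```

So for these workloads, doubling machine count stops doubling performance long before the budget runs out, and per-core speed is the only lever left.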
See https://en.wikipedia.org/wiki/FLOPS#Hardware_costs (note: in another thread someone pointed out that this wiki section is outdated/inaccurate. If anyone has the relevant expertise, they should help edit it.)