I think your viewpoint is the more naive one, and I think you have the burden of proof to argue why local trends are representative of global trends. This is because the non-local argument accepts a wider range of possibilities and the local argument is the more specific one. Think of it this way: I'm saying that every function looks linear given the right perspective, and you're arguing that a particular function _is_ linear. I'm arguing that we don't know that. The burden of proof would be on you to show that this function is linear beyond what we have data for. (The path towards HLI doesn't have to be linear, exponential, or logarithmic, but you do have to make a pretty strong argument to convince others that the local trend is generalizable.)
You are arguing completely beside the original point. Gato was just meant as one surprising example out of many to show how much more the architectures we already have today are capable of. What you believe about how the scaling curve for this very particular model will behave in detail is totally irrelevant to me or anyone else, and I'm not trying to convince you of anything. I was just pointing out that there is a clear direction for future research, and if you think people like Carmack are on the wrong path, so be it. You don't have to pursue it as well. But I don't care, and he certainly doesn't either.
I understood your argument, in the context of the discussion we've been having, to be that Gato was a good example of us being on a good path towards building intelligent machines. I do not think Gato demonstrates that.