- There _was_ a problem with diminishing returns from increasing data size. Then they surpassed that by curating data.
- Then the limits on the amount of curatable data available made the performance gains level off. So they started generating data and that pushed the nose up again.
- Eventually, even with generated data, gains flattened out. So they started increasing inference time. It has now been shown that this improves performance quite a bit.
It's always been a series of S-curves and we have always (sooner or later) innovated to the next level.
Marcus has always been a mouth just trying to take down neural networks.
Someday we will move on from LLMs, large multimodal models, transformers, maybe even neural networks, in order to add new levels and types of intelligence.
But Marcus's mouth will never stop yapping about how it won't work.
I think we are now at the point where we can literally build a digital twin video avatar to handily win a debate with Marcus, and he will continue to deny that any of it really works.
He does not spend an appreciable amount of effort or time advocating for that though. He spends 95% of his energy trying to take down the merits of NN-based approaches.
If he had something to show for it, like neurosymbolic wins over benchmarks for LLMs, that would be different. But he's not a researcher anymore. He's a mouth, and he is so inaccurate that it is actually dangerous, because some government officials listen to him.
I actually think that neurosymbolic approaches could be incredible and bring huge gains in performance and interpretability. But I don't see Marcus spending a lot of effort and doing quality research in that area that achieves much.
The quality of his arguments seems to be at the level of a used furniture salesman.
> He spends 95% of his energy trying to take down the merits of NN-based approaches.
The 95% figure comes from where? (I don't think the commenter above has a basis for it.)
How often does Marcus-the-writer take aim at NN-based approaches? Does he get this specific?
I often see Gary Marcus highlighting some examples where generative AI technologies are not as impressive as some people claim. I can't recall him doing the opposite.
Neither can I recall a time when Marcus specifically explained why certain architectures are {inappropriate or irredeemable} either {in general or in particular}.
Have I missed some writing where Marcus lays out a compelling multi-sided evaluation of AI systems or companies? I doubt it. But, please, if he has, let me know.
Marcus knows how to cherry-pick failure. I'm not looking for a writer who has staked out one side of the arguments. Selection bias is on full display. It is really painful to read, because it seems like he would have the mental horsepower to not fall into these traps. Does he lack the self-awareness or intellectual honesty to write thoughtfully? Or is this purely a self-interested optimization -- he wants to build an audience, and the One-Sided Argument Pattern works well for him?
Just thinking about this... do you know if anyone has figured out a way to reliably encode a Turing machine or simple virtual machine in the layers of a neural network, in a somewhat optimal way, using a minimal number of parameters?
Or maybe fully integrating differentiable programming into networks. It just seems like you want to keep everything in matrices in the AI hardware to get the really high efficiency gains. But even without that, I would not complain about an article that Marcus wrote about something along those lines.
The one you showed has interesting ideas, but to me it lacked substance and doesn't seem up to date.
No one in their right mind will argue neural nets cannot outperform humans at resampling data they have previously been exposed to.
So at digital twins and debate, they can probably do better than any human.
I'm still waiting for a computer that can make my morning coffee. Until it's there I don't really believe in this whole "computer" or "internet" thing, it's all a giant scam that has no real-world benefit.