> These NPUs are tying up a substantial amount of silicon area so it would be a real shame if they end up not being used for much.
This has been my thinking. Today you have to go out of your way to buy a system with an NPU, so I don't have any. But tomorrow, will they just be included by default? That seems like a waste for those of us who aren't going to be running models. I wonder what other uses they could be put to?
It enables many features on the phone that people like, all without sending your personal data to the cloud. Like searching your photos for "dog" or "receipt".
It's not really a question of minding if it's there, unless its presence increases cost, anyway. It just seems a waste to let it go idle, so my mind wanders to what other use I could put that circuitry to.
Yes, but I'm not interested in those sorts of uses. I'm wondering what else an NPU could be used for. I don't know what an NPU actually is at a technical level, so I'm ignorant of the possibilities.
I'm probably about to show my ignorance here (I'm not neck-deep in the AI space but I am a software architect...) but are they not just dedicated matrix multiplication engines (plus some other AI stuff)? So instead of asking the CPU to do the math, you have a dedicated area that does it instead... well, that's my understanding of it.
As to why, I think it's along the lines of this: the CPU does 100 things, one of those is AI acceleration. Let's take the AI acceleration and give it its own space instead so we can keep the power down a bit, add some specialization, and leave the CPU to do other stuff.
Again, I'm coming at this from a high-level as if explaining it to my ageing parents.
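To make the "dedicated matrix multiplication engine" idea concrete, here's a rough sketch of the kind of math that gets offloaded. It's plain Python with made-up numbers, not how any real NPU or driver is programmed; the function names (`matmul`, `relu`, `dense_layer`) are just mine for illustration. One neural-network layer is basically a matrix multiply followed by a simple per-element function, and a model is a long stack of these.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def relu(m):
    """Clamp negative values to zero, element by element."""
    return [[max(0, x) for x in row] for row in m]


def dense_layer(inputs, weights):
    """One layer: matrix multiply, then the activation function."""
    return relu(matmul(inputs, weights))


# Hypothetical values, chosen only to keep the arithmetic easy to follow.
inputs = [[1, -2, 3]]                  # one input sample with 3 features
weights = [[2, 0], [1, -1], [-1, 4]]   # 3 inputs -> 2 outputs
outputs = dense_layer(inputs, weights)
print(outputs)  # [[0, 14]]
```

The inner loop is nothing but multiply-and-add over and over, which is why it's cheap to build dedicated hardware that does a lot of those in parallel at low power instead of tying up general-purpose CPU cores.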
Yes, that's my understanding as well. What I meant is that I don't know the fine details. My ignorance is purely because I don't actually have a machine that has an NPU, so I haven't bothered to study up on them.
I agree with the premise as a Linux user myself, but if you're using any JetBrains products, or Zoom, you're running models on the client-side. I suspect small models will continue to creep into apps. Even Firefox ships ML models in the browser.
Luckily, while NPUs do nothing about data exfiltration, they're a poor solution for training models. Your data is still going to get sucked up to the mothership, but offloading training to your machine hopefully won't happen.
We already can't fit much more in CPUs. You can't just throw cores in there. CPUs these days are, like, 80% cache if you look at the die. We constantly shrink the compute part, but we don't add much more compute; the freed-up space just goes to cache.
So, I'm not sure that you're wasting much with the NPU. But I'm not an expert.
> But tomorrow, will they just be included by default?
That's already the way things are going due to Microsoft decreeing that Copilot+ is the future of Windows, so AMD and Intel are both putting NPUs which meet the Copilot+ performance standard into every consumer part they make going forwards to secure OEM sales.
It almost makes me want to find some use for them on my Linux box (not that it has an NPU), but I truly can't think of anything. Too small to run a meaningful LLM, and I'd want that in bursts anyway, I hate voice controls (at least with the current tech), and Recall sounds thoroughly useless. Could you do mediocre machine translation on it, perhaps? Local GitHub Copilot? An LLM that is purely used to build an abstract index of my notes in the background?
Actually, could they be used to make better AI in games? That'd be neat. A shooter character with some kind of organic tactics, or a Civilisation/Stellaris AI that doesn't suck.
In short: no. Current-gen NPUs are so slow they can't do anything useful. AMD and Intel have 2nd-gen ones that came out a few weeks ago, and by spec they may be able to run local translation and small LLMs (haven't seen benchmarks yet), but for now they are laptop-only.
Yeah, but the lead times on silicon mean we're going to be stuck with Microsoft's decision for a while regardless of how hard they commit to it. AMD and Intel probably already have two or three future generations of Copilot+ CPUs in the pipeline.