
> These NPUs are tying up a substantial amount of silicon area so it would be a real shame if they end up not being used for much.

This has been my thinking. Today you have to go out of your way to buy a system with an NPU, so I don't have any. But tomorrow, will they just be included by default? That seems like a waste for those of us who aren't going to be running models. I wonder what other uses they could be put to?

NPUs are already included by default in the Apple ecosystem. Nobody seems to mind.

It enables many features on the phone that people like, all without sending your personal data to the cloud. Like searching your photos for "dog" or "receipt".

It's not really a question of minding that it's there, unless its presence increases cost. It just seems a waste to let it sit idle, so my mind wanders to what other uses I could put that circuitry to.

I love that Apple includes this, especially now that they're actually doing something with it via Apple Intelligence.

Aren't they used for speech recognition -- for dictation? Also for Face ID.

They're useful for more things than just LLMs.


Yes, but I'm not interested in those sorts of uses. I'm wondering what else an NPU could be used for. I don't know what an NPU actually is at a technical level, so I'm ignorant of the possibilities.

I'm probably about to show my ignorance here (I'm not neck-deep in the AI space, but I am a software architect), but aren't they just dedicated matrix multiplication engines (plus some other AI-specific hardware)? So instead of asking the CPU to do the math, you have a dedicated area that does it instead. Well, that's my understanding of it.

As to why, I think it's along these lines: the CPU does 100 things, one of which is AI acceleration. Take the AI acceleration and give it its own space instead, so we can keep the power draw down a bit, add some specialization, and leave the CPU free to do other stuff.
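To make that concrete: the core workload is basically a low-precision matrix multiply with a wide accumulator. A toy NumPy sketch of the math (the sizes and int8 precision here are just illustrative; an NPU does this multiply-accumulate in fixed-function hardware instead of on the CPU cores):

    import numpy as np

    # int8 weights/activations with an int32 accumulator -- the kind of
    # low-precision arithmetic NPUs are built around
    weights = np.random.randint(-128, 128, size=(256, 256), dtype=np.int8)
    activations = np.random.randint(-128, 128, size=(256,), dtype=np.int8)

    # the multiply-accumulate an NPU hard-wires; on a CPU this competes
    # with everything else the cores are doing
    result = weights.astype(np.int32) @ activations.astype(np.int32)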

Again, I'm coming at this at a high level, as if explaining it to my ageing parents.


Yes, that's my understanding as well. What I meant is that I don't know the fine details. My ignorance is purely because I don't actually have a machine that has an NPU, so I haven't bothered to study up on them.

The idea is that your OS and apps will integrate ML models, so you will be running models whether you know it or not.

I'm confident that I'll be able to know and control whether or not my Linux and BSD machines will be using ML models.

I agree with the premise, as a Linux user myself, but if you're using any JetBrains products, or Zoom, you're already running models client-side. I suspect small models will continue to creep into apps. Even Firefox ships ML models in the browser.

...and whether anyone is using your interactions with your computer to train a model.

Luckily, while NPUs do nothing about data exfiltration, they're a poor solution for training models. Your data is still going to get sucked up to the mothership, but offloading training to your machine hopefully won't happen.

We already can't fit much more into CPUs; you can't just throw more cores in there. CPUs these days are something like 80% cache if you look at the die. We keep shrinking the compute part, but we don't add much more compute; that space just goes to cache.

So, I'm not sure that you're wasting much with the NPU. But I'm not an expert.


> But tomorrow, will they just be included by default?

That's already the way things are going: Microsoft has decreed that Copilot+ is the future of Windows, so AMD and Intel are both putting NPUs that meet the Copilot+ performance bar into every consumer part they make going forward, to secure OEM sales.


It almost makes me want to find some use for them on my Linux box (not that it has an NPU), but I truly can't think of anything. Too small to run a meaningful LLM, and I'd want that in bursts anyway; I hate voice controls (at least with the current tech); and Recall sounds thoroughly useless. Could you do mediocre machine translation on it, perhaps? A local GitHub Copilot? An LLM used purely to build an abstract index of my notes in the background?
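For what it's worth, here's a minimal sketch of what putting one to work might look like, assuming onnxruntime built with a vendor execution provider (the provider list and the model file are hypothetical stand-ins, not something I've actually run):

    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(
        "tiny-translation-model.onnx",    # hypothetical local model
        providers=[
            "OpenVINOExecutionProvider",  # Intel NPU path, if present
            "CPUExecutionProvider",       # fallback
        ],
    )

    tokens = np.zeros((1, 64), dtype=np.int64)  # placeholder token IDs
    outputs = session.run(None, {session.get_inputs()[0].name: tokens})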

Actually, could they be used to make better AI in games? That'd be neat. A shooter character with some kind of organic tactics, or a Civilization/Stellaris AI that doesn't suck.


> box

Presumably you have a GPU? If so, there is nothing an NPU can do that a discrete GPU can't (and the NPU would be much slower than a recent GPU).

The real benefits are power efficiency and cost, since they're built into the SoC; neither is necessarily that useful on a desktop PC.


In short: no. Current-gen NPUs are so slow they can't do anything useful. AMD and Intel released 2nd-gen ones a few weeks ago, and on paper those may be able to run local translation and small LLMs (I haven't seen benchmarks yet), but for now they're laptop-only.

Microsoft has declared a whole lot of things to be the future of Windows; almost all of them were quietly sidelined within a version or two.

https://www.joelonsoftware.com/2002/01/06/fire-and-motion/


Yeah, but the lead times on silicon mean we're going to be stuck with Microsoft's decision for a while, regardless of how hard they commit to it. AMD and Intel probably already have two or three future generations of Copilot+ CPUs in the pipeline.

Voice to text


