> These NPUs are tying up a substantial amount of silicon area so it would be a ...

jonas21 · 2024-10-16T22:28:38.000000Z

NPUs are already included by default in the Apple ecosystem. Nobody seems to mind.

acchow · 2024-10-16T23:21:55.000000Z

It enables many features on the phone that people like, all without sending your personal data to the cloud. Like searching your photos for "dog" or "receipt".

JohnFen · 2024-10-16T22:36:03.000000Z

It's not really a question of minding if it's there, unless its presence increases cost, anyway. It just seems a waste to let it go idle, so my mind wanders to what other use I could put that circuitry to.

shepherdjerred · 2024-10-17T00:13:51.000000Z

I actually love that Apple includes this, especially now that they’re doing something with it via Apple Intelligence.

crazygringo · 2024-10-16T23:17:38.000000Z

Aren't they used for speech recognition -- for dictation? Also for FaceID.

They're useful for more things than just LLMs.

JohnFen · 2024-10-17T04:38:13.000000Z

Yes, but I'm not interested in those sorts of uses. I'm wondering what else an NPU could be used for. I don't know what an NPU actually is at a technical level, so I'm ignorant of the possibilities.

ItsBob · 2024-10-17T09:35:55.000000Z

I'm probably about to show my ignorance here (I'm not neck-deep in the AI space but I am a software architect...) but are they not just dedicated matrix multiplication engines (plus some other AI stuff)? So instead of asking the CPU to do the math, you have a dedicated area that does it instead... well, that's my understanding of it.

As to why, I think it's along the lines of this: the CPU does 100 things, one of those is AI acceleration. Let's take the AI acceleration and give it its own space instead so we can keep the power down a bit, add some specialization, and leave the CPU to do other stuff.

Again, I'm coming at this from a high-level as if explaining it to my ageing parents.
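
To make the "dedicated matrix multiplication engine" idea a bit more concrete, here is a minimal sketch in plain Python of the kind of operation an NPU is generally built to accelerate: a low-precision (int8) matrix multiply with a wider accumulator. This is only an illustration. The helper names `quantize` and `int8_matmul` and the scale values are made up for the example, and real NPUs are driven through vendor runtimes rather than code like this.

```python
# Sketch only: plain Python standing in for work an NPU does in hardware.
# quantize() and int8_matmul() are hypothetical names, not a real NPU API.

def quantize(values, scale):
    """Map floats to the int8 range [-128, 127] using one scale factor."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def int8_matmul(a, b):
    """Multiply integer matrices a (m x k) and b (k x n)."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0  # hardware would typically keep this in a 32-bit accumulator
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out[i][j] = acc
    return out

# Assumed scale factors, chosen so the toy values quantize exactly.
scale_x, scale_w = 0.5, 0.25

x = [quantize([1.0, -2.0], scale_x)]        # 1 x 2 input row
w = [quantize([0.5, 0.25], scale_w),        # 2 x 2 weight matrix
     quantize([-0.25, 1.0], scale_w)]

y_int = int8_matmul(x, w)                   # integer result
y = [[v * scale_x * scale_w for v in row] for row in y_int]  # back to floats
print(y_int, y)
```

The point of the sketch is that the inner loop is nothing but small-integer multiply-adds, which is cheap to replicate thousands of times in silicon and is why a dedicated block can do it at lower power than a general-purpose CPU core.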

JohnFen · 2024-10-17T10:55:59.000000Z

Yes, that's my understanding as well. What I meant is that I don't know the fine details. My ignorance is purely because I don't actually have a machine that has an NPU, so I haven't bothered to study up on them.

heavyset_go · 2024-10-17T00:16:05.000000Z

The idea is that your OS and apps will integrate ML models, so you will be running models whether you know it or not.

JohnFen · 2024-10-17T04:28:16.000000Z

I'm confident that I'll be able to know and control whether or not my Linux and BSD machines will be using ML models.

heavyset_go · 2024-10-18T00:31:03.000000Z

I agree with the premise as a Linux user myself, but if you're using any JetBrains products, or Zoom, you're running models on the client-side. I suspect small models will continue to creep into apps. Even Firefox ships ML models in the browser.

hollerith · 2024-10-17T04:45:30.000000Z

--and whether anyone is using your interactions with your computer to train a model.

heavyset_go · 2024-10-18T00:32:36.000000Z

Luckily, while NPUs do nothing about data exfiltration, they're a poor solution for training models. Your data is still going to get sucked up to the mothership, but offloading training to your machine hopefully won't happen.

consteval · 2024-10-17T15:42:04.000000Z

We already can't fit much more in CPUs. You can't just throw cores in there. CPUs these days are, like, 80% cache if you look at the die. We constantly shrink the compute part, but we don't put much more compute - that space is just used for cache.

So, I'm not sure that you're wasting much with the NPU. But I'm not an expert.

jsheard · 2024-10-16T22:21:15.000000Z

> But tomorrow, will they just be included by default?

That's already the way things are going due to Microsoft decreeing that Copilot+ is the future of Windows, so AMD and Intel are both putting NPUs which meet the Copilot+ performance standard into every consumer part they make going forwards to secure OEM sales.

AlexAndScripts · 2024-10-16T22:47:17.000000Z

It almost makes me want to find some use for them on my Linux box (not that it has an NPU), but I truly can't think of anything. Too small to run a meaningful LLM, and I'd want that in bursts anyway, I hate voice controls (at least with the current tech), and Recall sounds thoroughly useless. Could you do mediocre machine translation on it, perhaps? Local GitHub Copilot? An LLM that is purely used to build an abstract index of my notes in the background?

Actually, could they be used to make better AI in games? That'd be neat. A shooter character with some kind of organic tactics, or a Civilisation/Stellaris AI that doesn't suck.

ywvcbk · 2024-10-17T08:42:13.000000Z

> box

Presumably you have a GPU? If so there is nothing an NPU can do that a discrete GPU can’t (and it would be much slower than a recent GPU).

The real benefits are power efficiency and cost, since they are built into the SoC, and neither is necessarily that useful on a desktop PC.

Miraste · 2024-10-17T18:16:49.000000Z

In short: no. Current-gen NPUs are so slow they can't do anything useful. AMD and Intel have 2nd-gen ones that came out a few weeks ago, and by spec they may be able to run local translation and small LLMs (haven't seen benchmarks yet), but for now they are laptop-only.

bcoates · 2024-10-17T14:31:01.000000Z

Microsoft has declared a whole lot of things to be the future of Windows, and almost all of them were quietly sidelined in a version or two.

https://www.joelonsoftware.com/2002/01/06/fire-and-motion/

jsheard · 2024-10-17T16:17:04.000000Z

Yeah, but the lead times on silicon mean we're going to be stuck with Microsoft's decision for a while regardless of how hard they commit to it. AMD and Intel probably already have two or three future generations of Copilot+ CPUs in the pipeline.

idunnoman1222 · 2024-10-17T00:51:30.000000Z

Voice to text