Qualcomm works with Meta to enable on-device AI applications using Llama 2 (qualcomm.com)
104 points by ahiknsr 9 months ago | 85 comments



Meanwhile Apple is stubbornly insisting on its own ways and has been awkwardly silent during this whole AI revolution. I gave up using Siri years ago due to its glaring stupidity compared with Google Assistant and Alexa.

While Apple keeps making money from overpriced hardware, the competitors are working on actually being pioneers in AI. It makes me sad to see so much computational power in my iPhone and iPad getting wasted on silly subpar iOS apps.


Apple are never first, sometimes they are quite late, but when they do launch their version of a product it is polished and fixes many issues with all other implementations.

I have no doubt that Apple are working on local LLMs and combining them with Siri, it's only a matter of time. I expect we will see something next year, probably tied to the next iteration of hardware - they do want to sell hardware after all.

Apple have subtly indicated that they are working on this stuff. They avoided saying "AI", instead going for "ML", "encoder/decoder models" and "transformer models". The spell check on the next iOS is based on a transformer model. It's packed with image models for extraction, lighting and image correction. iOS is littered with ML stuff, and their devices have neural cores in their silicon.

This stuff is coming from them, when it's ready.

What I expect they will launch as an "LLM Siri" will blow our minds; it will be an all-encompassing personal assistant. It will link all our documents, email, messages, movements, and browser history, all stored locally and securely on our devices. Not in the cloud. It will be so close to appearing like AGI that some people will claim it is.


>Apple are never first, sometimes they are quite late, but when they do launch their version of a product it is polished

Apple maps would have a laugh at that


Apple Maps, as it is today, is so much better than Google Maps. The UI is way more polished. The Google UI looks like it was designed by committee - it is so ugly and dense. I find it really offensive.

Apple Maps has better directions for my city, at least. Google re-routes me through side streets and weird last-minute turns across lanes of traffic.


I agree; the only thing that's still better is Google Maps reviews for restaurants, just because of the sheer number of users compared to Yelp (worldwide). But I really dislike the Google Maps reviews UI.


You're getting downvoted, but honestly I'd agree as far as the UI goes.

Google Maps is loaded with suggested content. What about lunch? Need a haircut? Check out these top rated places!! Would you recommend Google Maps to a friend or colleague?

Apple Maps is like what Google Maps was years ago: a map and a search box.


Sometimes I really do want to find somewhere to eat lunch though - i.e. Google Maps, minus some of the webshit, is reasonably good at being a central, dense place for all that information.


Agree but GP's point is that the launch was terrible.


Nowadays, when I want to screenshot a place on the map, I use Apple’s. Google Maps has so much stuff that the actual map gets eclipsed.


Yeah I remember that as their only misstep tbh. Today, Apple Maps is my go-to for California at least.


There have been other failed Apple products. Lisa. Newton. Macintosh Portable. iPhone 6. Etc.

https://www.cnet.com/tech/mobile/apples-worst-failures-of-al...

No one gets product planning right 100%. The key is to recognize failures early and cut them off.


It’s estimated that the iPhone 6 and 6 Plus sold over 225 million.


The key is to have enough cash that you know you will eventually hit the nail on the head. Microsoft is failing constantly, even launching several different products for the same use case. It doesn't matter at that level.


> Yeah I remember that as their only misstep tbh.

MobileMe was quite a big misstep too.


Was it? They changed nothing when they rebranded from MobileMe to iCloud, and that is wildly popular.



Have you checked it recently? It’s better than Google Maps now.


Read again the comment I'm replying to, which says "polished" at launch, not polished after 10 years of updates and fixes.


not the first computer with windowed UI

not the first portable mp3 player

not the first smartphone

nor the first touchscreen phone

not the first bluetooth headset

not the first tablet computer

not the first smartwatch

not the first VR/AR headset


Many of these were not "polished" at release either.

- The Lisa was so expensive that it failed to find a market regardless of how visionary it was, ultimately making less of an impact than the Apple II.

- The first iPod models were plagued with battery issues and had no meaningful way to replace the battery once it went bad (this extended into smaller models like the Nano).

- The iPhone was notoriously gimped at launch and only barely delivered on its promises, with the majority of features people know today being added in updates.

- The first generation Airpods should be classified as instruments of sonic and physical abuse, not headphones.

- The first generation iPad was deprecated almost immediately and got ~2 years total software support.

That's not to take anything away from their successes, but Apple clearly has their own honeymoon phases to work through.


I thought it was common belief, well at least in my circle of friends, not to buy anything Apple until there's a third generation of it.


Mostly agree, but I don't know why you hate AirPods so much. AirPods were great from the beginning (in retrospect, after people accepted the design).


I wasn't really a fan of the Earpods, but they were cheap and sounded fine for $20. The Airpods do this wirelessly while sounding worse for $180. The fact that they reserve comfort-fit ear tips for their premium headphones is an insult to every user at that price point, and it sabotages any chance they had at sounding balanced or natural.

Airpods Pro are much nicer, but should be the standard. The Airpods 1 (and frankly the Airpods 2 as well) sound thin and are borderline unacceptable for what they cost. Just my $0.02 though, audio and comfort are ultimately subjective.


I used to laugh at Apple Maps even a year ago. Usability is much, much better than Google's, and the data has caught up too.


> I have no doubt that Apple are working on local LLMs and combining them with Siri, it's only a matter of time. I expect we will see something next year, probably tied to the next iteration of hardware - they do want to sell hardware after all.

This was recently submitted to HN and I think it's largely reasonable supposition:

https://www.primoh.net/p/apples-ai-strategy

https://news.ycombinator.com/item?id=36769225


>> I have no doubt that Apple are working on local LLMs and combining them with Siri,

Yeah, Apple would want to have voice chat, not text chat. Especially on a phone


People said that about VR/AR, and then Apple came up with something ridiculously bad compared to the Quest.


It’s different from the Quest, for sure; it’s still too early to say it is worse (or even bad, if they are not comparable at all).


Apple has been pretty good about quietly integrating AI-based image processing into iOS.

All text within photos is being automatically recognized and is fully searchable. That has saved my ass on occasion when some piece of information was in a photo.

The AI background removal feature in Photos is cool too — just drag from a foreground element, and it usually gets it right.

But despite this kind of flawless feature integration work, clearly Apple as a company has some issues with applying generative models on a more fundamental scale. They look a bit stuck in their own loop of trying to replicate past success with an upcoming $3,500 device that’s all about ever fancier glass and aluminum hardware.


They just haven’t been loud about it. But they have specialized silicon for running neural nets (which they, admittedly, have been quiet about outside of “adopt CoreML”). They have CoreML. And if you browse places like r/localllama you very frequently find people recommending Mac Studios as a great machine for hobbyists interested in local inference and training. They even had a subtle jab at Nvidia in their last keynote talking about how much memory they support on their latest M2 machines.

Qualcomm is announcing hardware that won’t be available until (at least) next year. That’s just not Apple’s way, and I think it’s a mistake to say they’re ignoring what’s going on.
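For what it's worth, the "adopt CoreML" path is fairly approachable. A rough sketch (my own toy example under the usual coremltools + PyTorch assumptions, not anything Apple ships):

    # Hedged sketch: converting a tiny, made-up PyTorch model with coremltools so
    # Core ML can schedule it onto the Neural Engine. Names are illustrative only.
    import torch
    import coremltools as ct

    model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).eval()
    traced = torch.jit.trace(model, torch.randn(1, 64))

    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(shape=(1, 64))],
        convert_to="mlprogram",
    )
    mlmodel.save("Tiny.mlpackage")  # add to an Xcode project to run it on-device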


>awkwardly silent during this whole AI revolution

They weren't silent about on-device AI child porn scanning, so we know they definitely have the capability for AI on device.


Apple has not been silent. The whole iOS 17 release has AI stuff sprinkled all through it. They just aren't speaking the same language as everyone else. They are also hiring generative AI folks like crazy for things like Siri. They want to own the on-device space and are well on the way to revolutionizing their operating systems and hardware to do it.


That seems right. Apple leads out with "We're doing feature X, Y, and Z" and downplays the particular technology behind those features. Nobody really cares if X, Y, and Z are powered by AI, as long as they're awesome.

Contrast that with the rest of the industry who lead out with "We're doing AI!" and then they connect it to this and that feature.

I'd argue that consumers care about having great features, not about what particular technology powers it. My phone could run on unicorn poop, and I don't care as long as it does what I want it to do.


Have you considered that LLMs in their current form just…aren’t very good or useful? That may be why Apple isn’t using them yet.


Despite all the hype, and being genuinely intriguing, this is the truth. They can do some neat tricks, but overall utility is low. Model hallucination in particular is a problem. I don’t see Apple rolling out any model that has the potential to lie to you.


This is my guess too. They’re using it on their iOS 17 keyboard for better text prediction though.


I've slowly started to think Apple knows there will be a golden hour to jump on a mature, economical LLM. Until then, they are probably watching without making a massive investment in LLMs, as long as LLMs aren't drastically changing their products.


This is what I don't understand.

The power of LLMs is that they can allow a human to interact with a computer using natural language.

Imagine just asking Siri to order you something and Siri finds the cheapest legit supplier and just orders it from them. It would remove the first hop needed to go to Amazon and unlike Alexa Siri is always with you. Google and Apple could do to Amazon what they did to Facebook.


Apple ported Stable Diffusion to Metal almost immediately.

And they are making tremendous investments in Metal development.

Also, there are native apps utilizing both Stable Diffusion and Llama... And they are buried under a tremendous mountain of other AppStore junk.

The large gap between Apple's demos and backend development and actual deployed ML apps is very unfortunate and sad, but it's not uncommon either.


Yeah...Apple doesn't tease products before they announce them.


If you have a modern iPhone you can try running an LLM directly on it today using the MLC iPhone app: https://mlc.ai/mlc-llm/#iphone

It can run Vicuna-7B which is a pretty impressive model.

(They have an Android app too but I haven't tried that yet).


>the MLC iPhone app

Seems like US only


MLC is great, but unfortunately is a demo more than anything. Other front ends aren't really integrating MLC yet.


> The ability to run generative AI models like Llama 2 on devices such as smartphones, PCs, VR/AR headsets

Maybe it's an upcoming feature for the Quest 3?

To that end, I've been pretty amazed by how far quantization has come. Some early llama-2 quantizations[0] have gotten down to ~2.8GB, though I haven't tested them to see how they perform yet. Still though, we're now talking about models that can comfortably run on pretty low-end hardware. It will be interesting to see where llama crops up with so many options for inferencing hardware. (A quick usage sketch follows the link below.)

[0] https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/tree/ma...
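If anyone wants to poke at one of those quantizations locally, here's a rough sketch with llama-cpp-python (the file name is my guess at the q2_K file from the repo above; adjust it to whatever you actually download):

    # Rough sketch: load a ~2.8GB q2_K quantization of Llama-2-7B-Chat and generate
    # a short completion entirely on-device with llama-cpp-python.
    from llama_cpp import Llama

    llm = Llama(model_path="llama-2-7b-chat.ggmlv3.q2_K.bin", n_ctx=2048)
    out = llm("Q: Name three uses for an on-device LLM. A:", max_tokens=64, stop=["Q:"])
    print(out["choices"][0]["text"])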


ChatGPT 3.5 is the base level people expect LLMs to be at; it would take 2-3 generations (3-4 years) of hardware before we can reach that. Anything below that is just going to get bad reviews.


A small model might be useful for e.g. NPC interactions in a Quest game


3-4 years to run it on your phone seems generous, barring algorithmic breakthroughs.

If I can run 100B+ models on my high-end desktop in 3-4 years I will be very happy.


What do you consider a high end desktop now?


He means a video card with 512GB of memory or more.


Is 512GB a typo? The current biggest consumer card has 24GB, so we're probably 15 years from a 512GB card (judging from the increase from 4GB to 24GB between 2012 and 2022).
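Back-of-the-envelope, assuming consumer VRAM keeps growing at the same rate, it lands in the same ballpark:

    import math

    # 4GB (2012) -> 24GB (2022) is roughly 6x per decade
    years = 10 * math.log(512 / 24, 24 / 4)
    print(f"~{years:.0f} years for consumer cards to reach 512GB at that rate")  # ~17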


That's why 3-4 years would be impressive


With 4-bit quantization the requirements are more like 64GB (rough arithmetic below).

I expect we'll see more unified memory designs like Apple's 128GB M1 Ultra.
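The arithmetic behind the 64GB figure, roughly (assuming ~0.5 bytes per parameter at 4-bit, ignoring activations and KV cache):

    params = 100e9                        # a 100B-parameter model
    weights_gib = params * 0.5 / 2**30    # ~0.5 bytes per parameter at 4-bit
    print(f"weights alone: ~{weights_gib:.0f} GiB")  # ~47 GiB, so 64GB leaves headroom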


I doubt it, to be honest; desktop GPUs use too much power (and hence produce too much heat) to be integrated in that fashion, and any kind of shared memory will have too high latency.


There are 'desktop' (well, server) CPUs with 64GB of HBM memory per socket now. And big LLMs can be run on lower-memory-bandwidth systems (like Zen 4 chips with 12 DDR5 channels per socket) at lower performance, but where 1-2TB of RAM is no big deal.


But for what applications? Sure, for answering free-form questions I expect GPT-3.5+ quality. I don't think GPT-3.5 is necessary to provide auto-complete in your email client.


Llama 65B finetunes already exceed it in some niches, like roleplaying or specific (coding and spoken) languages


Isn't that the same LLM that doesn't know how many e's are in "ketchup"? Nice.


Aren't you the same user that doesn't understand word-level tokenization?


You must have me confused with an LLM bot that does know how many e's are in "ketchup".


Facebook hardware in my devices that claims it’s there to protect my privacy.

What’s the catch? Firmware based “anonymous” telemetry?


Not Facebook hardware in this case, it would be their model running on QCOM hardware. If people can write apps in which the model is used locally and no information goes to the net (to Meta, Google, or anyone else) that's a huge privacy win. Presumably Meta would make money by licensing it to Qualcomm and getting some amount per device.

The key issue would be whether local-only apps can be written, or if Qualcomm and Meta would insist on seeing what the user is up to and would try to monetize that.


Personally, I hope that the Facebook altruism of late has been a market effect of their needing to compete with other mega-corps, and that it comes with no catch or ill will because it's simply there to be 'the other'.

it's probably a naive hope, but I hope it's right.

(i've never trusted FB further than I could throw their company, so this is a big leap of faith for me.)


FB should release a phone; they already have the Quest OS.


As I said in the other Llama 2 thread[0], this is Meta doing a traditional, textbook "commoditise your complement". This particular announcement is a preemptive attack on Apple and Google. They don't want them to own LLMs on portable devices; they want LLMs to be a commodity.

[0]: https://news.ycombinator.com/item?id=36775642


The move from Meta is pretty obvious: they just want to attract the best AI researchers. They know that this generation of AI is not it, and they know that the best AI researchers want to work on open-sourced stuff.


What could possibly go wrong

https://www.cnet.com/tech/mobile/heres-why-the-facebook-phon...

Next up: they should make a social media app where each user can customize their profile page and have their favorite music start playing. It would be called YourSpace


Hmm, I wonder what sort of privacy impacts there are in a future where Meta (or Google et al.) AI runs on chips in your phone, when the parent company has so much info on you and blatantly flouts privacy laws.


How would it work to get a model that currently needs 8GB+ of VRAM into some chiplet form factor? Is there an obvious way of translating this more directly to hardware?


Mobile chips use a shared memory model, and 12+ GB of RAM isn't exotic at all for mobile hardware. Seems reasonable to think that it won't be an issue.


They're going to use an extremely compact model for extremely limited use cases.


If your filesystem is fast enough, then you can dynamically load and unload chunks of it as you use it


I don't think that works for LLMs. My understanding is that every single token produced by the model requires running floating point operations against all x-billion parameters, so the entire thing needs to be loaded into memory at all times for it to work.


Nope, you can use mmap to virtually map it to memory, and then you don’t have to hold the whole thing in RAM at once. I have spent the last 4-5 weeks working on this and optimising it

You can see some information here: https://justine.lol/mmap/


How does that work? Does it mean that for every token generated it has to page areas of disk into RAM and then back out again?


You'd have to have enough RAM to hold the whole model in memory or performance will be awful; mmap is just a way to get faster startup (if the mmap'd file looks exactly like the in-memory representation) and easier sharing (the mmap region can be shared read-only memory that multiple processes use).
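A toy illustration of that behaviour (file name hypothetical, assuming Python's mmap and NumPy): the mapping itself is nearly instant, and pages are only faulted in from disk as they're touched.

    import mmap
    import numpy as np

    # Map a (hypothetical) weights file read-only; nothing is copied into RAM up
    # front, and the OS can share the pages between processes mapping the same file.
    with open("llama-2-7b.q4_0.bin", "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    weights = np.frombuffer(mm, dtype=np.uint8)  # zero-copy view of the mapping
    print(f"mapped {weights.nbytes / 2**30:.1f} GiB; resident memory grows only as pages are touched")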


You can shuffle it back and forth between disk and memory. It's slow, but it works.

There are people working on compute-in-memory hardware, like flash chips that can do matrix multiplication in place. None of it is close to reaching the market but there's a lot more interest now that neural networks are obviously useful.


I doubt Qualcomm will be able to increase their performance ahead of Nvidia decreasing their energy requirements.


Since NVidia hardly ships on Android phones, it doesn't really matter.


This isn't just about mobile telephones, or android for that matter. It's about on-device TPUs or AI accelerators or parallel compute units.


Which basically means Android devices, as far as mainstream consumers are concerned, given the state of the PC market.

If, on the other hand, we look at the embedded market, it is also not somewhere NVidia has any kind of strong market presence, with the exception of them trying to position themselves well in the autonomous driving market.


This isn't about mainstream consumers either. This is about a technological development where the balance between performance and energy consumption is attacked from both ends: one vendor has the performance and the other has the energy consumption, but neither has both (yet).


NVidia isn't serving industrial embedded consumers either.


Yes they are. But it's not about that either. You are conflating end-user products with two companies developing to the same central point from opposing start positions. That is what this is about.


Why?


Qualcomm hasn't been able to manufacture desktop-class ARM devices yet (see SQ1 and X13 ARM edition), and the GPU IP doesn't really hold a candle to what else is out there either. They are however using significantly less power than say, x86 SoCs and GPUs from AMD, Intel or Nvidia.

Nvidia on the other hand has plenty of IP for GPUs and AI workloads that is proven to have high performance, even on the Tegra scale of things. The Jetson product line, as an example, does still outperform whatever Qualcomm has come up with so far. But they are consuming more power and don't even come close to the same thermal envelope.

So Nvidia needs to reduce their energy needs, while Qualcomm needs to create actually performant devices for real-time AI workloads. That's why.



