Meanwhile Apple is stubbornly insisting on its own ways, and has been awkwardly silent during this whole AI revolution. I gave up on Siri years ago because of its glaring stupidity compared with Google Assistant and Alexa.
While Apple keeps making money from overpriced hardware, the competitors are working on actually being pioneers in AI. It makes me sad to see so much computational power in my iPhone and iPad getting wasted on silly subpar iOS apps.
Apple are never first, sometimes they are quite late, but when they do launch their version of a product, it is polished and fixes many issues with all the other implementations.
I have no doubt that Apple are working on local LLMs and combining them with Siri, it's only a matter of time. I expect we will see something next year, probably tied to the next iteration of hardware - they do want to sell hardware after all.
Apple have indicated that they are working on this stuff, subtly. They avoided saying "AI", instead going for "ML", "encoder/decoder models" and "transformer models". The spell check on the next iOS is based on a transformer model. It's packed with image models for extraction, lighting and image correction. iOS is littered with ML features, and their silicon has dedicated "Neural cores".
This stuff is coming from them, when it's ready.
I expect that what they launch as an "LLM Siri" will blow our minds; it will be an all-encompassing personal assistant. It will link all our documents, email, messages, movements, and browser history, all locally and securely on our devices. Not in the cloud. It will be so close to appearing like AGI that some people will claim it is.
Apple Maps, as it is today, is so much better than Google Maps. The UI is way more polished than Google Maps. The Google UI looks like it is designed by committee - it is so ugly and dense. I find it really offensive.
Apple Maps has better directions for my city, at least. Google re-routes me through side streets and weird turns, crossing lanes into traffic at the last minute.
I agree; the only thing that's still better is Google Maps reviews for restaurants, just because of the sheer number of users compared to Yelp (worldwide). But I really dislike the Google Maps reviews UI.
You're getting downvoted, but honestly I'd agree as far as the UI goes.
Google Maps is loaded with suggested content. What about lunch? Need a haircut? Check out these top rated places!! Would you recommend Google Maps to a friend or colleague?
Apple Maps is like what Google Maps was years ago: a map and a search box.
Sometimes I really do want to find somewhere to eat lunch, though. Google Maps minus some of the webshit is reasonably good at being a single dense place for all that information.
The key is to have enough cash that you know you will eventually hit the mark. Microsoft fails constantly, even launching several different products for the same use case. At that level, it doesn't matter.
Many of these were not "polished" at release either.
- The Lisa was so expensive that it failed to find a market regardless of how visionary it was, ultimately making less of an impact than the Apple II.
- The first iPod models were plagued with battery issues and had no meaningful way to replace the battery once it went bad (this extended into smaller models like the Nano).
- The iPhone was notoriously gimped at launch and only barely delivered on its promises, with the majority of features people know today being added in updates.
- The first generation Airpods should be classified as instruments of sonic and physical abuse, not headphones.
- The first generation iPad was deprecated almost immediately and got ~2 years total software support.
That's not to take anything away from their successes, but Apple clearly has its own honeymoon phases to work through.
I wasn't really a fan of the Earpods, but they were cheap and sounded fine for $20. The Airpods do this wirelessly while sounding worse for $180. The fact that they reserve comfort-fit ear tips for their premium headphones is an insult to every user at that price point, and it sabotages any chance they had at sounding balanced or natural.
Airpods Pro are much nicer, but should be the standard. The Airpods 1 (and frankly the Airpods 2 as well) sound thin and are borderline unacceptable for what they cost. Just my $0.02 though, audio and comfort are ultimately subjective.
> I have no doubt that Apple are working on local LLMs and combining them with Siri, it's only a matter of time. I expect we will see something next year, probably tied to the next iteration of hardware - they do want to sell hardware after all.
This was recently submitted to HN and I think it's largely reasonable supposition:
Apple has been pretty good about quietly integrating AI-based image processing into iOS.
All text within photos is being automatically recognized and is fully searchable. That has saved my ass on occasion when some piece of information was in a photo.
The AI background removal feature in Photos is cool too — just drag from a foreground element, and it usually gets it right.
But despite this kind of flawless feature integration work, clearly Apple as a company has some issues with applying generative models on a more fundamental scale. They look a bit stuck in their own loop of trying to replicate past success with an upcoming $3,500 device that’s all about ever fancier glass and aluminum hardware.
They just haven’t been loud about it. But they have specialized silicon for running neural nets (which they, admittedly, have been quiet about outside of “adopt CoreML”). They have CoreML. And if you browse places like r/localllama you very frequently find people recommending Mac Studios as a great machine for hobbyists interested in local inference and training. They even had a subtle jab at Nvidia in their last keynote, talking about how much memory they support on their latest M2 machines.
Qualcomm is announcing hardware that won’t be available until (at least) next week. That’s just not Apple’s way, and I think it’s a mistake to say they’re ignoring what’s going on.
Apple has not been silent. The whole iOS 17 release has AI stuff sprinkled all through it. They just aren't speaking the same language as everyone else. They are also hiring generative AI folks like crazy for things like Siri. They want to own the on-device space and are well on the way to revolutionizing their operating systems and hardware to do it.
That seems right. Apple leads out with "We're doing feature X, Y, and Z" and downplays the particular technology behind those features. Nobody really cares if X, Y, and Z are powered by AI, as long as they're awesome.
Contrast that with the rest of the industry who lead out with "We're doing AI!" and then they connect it to this and that feature.
I'd argue that consumers care about having great features, not about what particular technology powers it. My phone could run on unicorn poop, and I don't care as long as it does what I want it to do.
Despite all the hype, and being genuinely intriguing, this is the truth. They can do some neat tricks, but overall utility is low. Model hallucination in particular is a problem. I don’t see Apple rolling out any model that has the potential to lie to you.
I've slowly started to think Apple knows there will be a golden hour to jump on a mature, economical LLM. Until then, they are probably watching without massively investing in LLMs, as long as LLMs aren't drastically changing their products.
The power of LLMs is that they can allow a human to interact with a computer using natural language.
Imagine just asking Siri to order you something and Siri finds the cheapest legit supplier and just orders it from them. It would remove the first hop needed to go to Amazon and unlike Alexa Siri is always with you. Google and Apple could do to Amazon what they did to Facebook.
> The ability to run generative AI models like Llama 2 on devices such as smartphones, PCs, VR/AR headsets
Maybe it's an upcoming feature for the Quest 3?
To that end, I've been pretty amazed by how far quantization has come. Some early llama-2 quantizations[0] have gotten down to ~2.8gb, though I haven't tested it to see how it performs yet. Still though, we're now talking about models that can comfortably run on pretty low-end hardware. It will be interesting to see where llama crops up with so many options for inferencing hardware.
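For anyone curious where that ~2.8GB figure comes from, here's a back-of-envelope sketch (the parameter count and bits-per-weight values are rough assumptions, not measurements of any specific quantization; real quantized files also carry some overhead for scales and zero-points):

```python
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate model size in GB (1 GB = 2**30 bytes):
    total bits = params * bits/weight, then convert to bytes and GB."""
    return n_params * bits_per_weight / 8 / 2**30

# A 7B-parameter model at various precisions:
for bits in (16, 8, 4, 3):
    print(f"7B params @ {bits}-bit: {model_size_gb(7e9, bits):.2f} GB")
```

At roughly 3 bits per weight plus overhead, a 7B model lands right around that 2.8GB mark, which is why it suddenly fits on phones and low-end laptops.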
ChatGPT 3.5 is the baseline people expect LLMs to meet; it would take 2-3 generations (3-4 years) of hardware before we can reach that on-device. Anything below that is just going to get bad reviews.
Is 512GB a typo? The current biggest consumer card has 24GB, so we're probably 15 years from a 512GB card (judging from the increase from 4GB to 24GB between 2012 and 2022).
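The trend math behind that estimate, for what it's worth (this is crude trend-following from two data points, not a forecast):

```python
import math

# Consumer VRAM grew from 4 GB (2012) to 24 GB (2022).
annual = (24 / 4) ** (1 / 10)  # implied yearly growth factor, ~1.2x

# Years until a 512 GB consumer card at that rate:
years_to_512 = math.log(512 / 24) / math.log(annual)
print(f"growth: {annual:.3f}x/year, years until 512 GB: {years_to_512:.1f}")
```

At that historical rate it comes out closer to ~17 years, so "probably 15 years" is if anything optimistic.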
I doubt it to be honest, desktop GPUs use too much power (and hence produce too much heat) to be integrated in that fashion, and any kind of shared memory will be too high latency.
There are 'desktop' (well, server) CPUs with 64GB of HBM memory per socket now. And big LLMs can be run on lower-memory-bandwidth systems (like Zen 4 chips with 12 DDR5 channels per socket) at lower performance, but where 1-2TB of RAM is no big deal.
But for what applications? Sure, for answering free-form questions I expect GPT-3.5+ quality. I don't think GPT-3.5 is necessary to provide auto-complete in your email client.
Not Facebook hardware in this case, it would be their model running on QCOM hardware. If people can write apps in which the model is used locally and no information goes to the net (to Meta, Google, or anyone else) that's a huge privacy win. Presumably Meta would make money by licensing it to Qualcomm and getting some amount per device.
The key issue would be whether local-only apps can be written, or if Qualcomm and Meta would insist on seeing what the user is up to and would try to monetize that.
personally I hope that Facebook's recent altruism has been a market effect of needing to compete with other mega-corps, and that it comes with no catch or ill will, because it's simply there to be 'the other'.
it's probably a naive hope, but I hope it's right.
(i've never trusted FB further than I could throw their company, so this is a big leap of faith for me.)
As I said in the other Llama 2 thread[0], this is Meta doing a traditional textbook "commoditise your complement". This particular announcement is a preemptive attack on Apple and Google. They don't want them to own LLMs on portable devices; they want LLMs to be a commodity.
The move from Meta is pretty obvious, they just want to attract the best AI researchers, they know that this generation of AI is not it, and they know that the best AI researchers want to work on open sourced stuff.
Next up: they should make a social media app where each user can customize their profile page and have their favorite music start playing. It would be called YourSpace
Hmm, I wonder what sort of privacy impacts there are in a future of having Meta (or Google et al.) AI running on chips in your phone when the parent company has so much info on you and blatantly flouts privacy laws.
How would it work to get a model that currently needs 8GB+ of VRAM into some chiplet form factor? Is there an obvious way of translating this more directly to hardware?
Mobile chips use a shared memory model and 12+ gigs of ram for mobile hardware isn't exotic at all. Seems reasonable to think that it won't be an issue.
I don't think that works for LLMs. My understanding is that every single token produced by the model requires running floating point operations against all x-billion parameters, so the entire thing needs to be loaded into memory at all times for it to work.
Nope, you can use mmap to virtually map it to memory, and then you don’t have to hold the whole thing in RAM at once. I have spent the last 4-5 weeks working on this and optimising it
You'd have to have enough RAM to hold the whole model in memory or performance will be awful, mmap is just a way to get faster startup (if the mmap'd file looks exactly like the in-memory representation) and easier sharing (the mmap region can be shared read-only memory that multiple processes use).
You can shuffle it back and forth between disk and memory. It's slow, but it works.
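A minimal sketch of the mmap approach being discussed, using NumPy over a raw mapping (the file name and flat float32 layout are made-up placeholders, not any real model format):

```python
import mmap

import numpy as np

def open_weights(path: str) -> np.ndarray:
    """Map a weights file into the address space instead of reading it.

    The OS faults pages in lazily as tensors are touched, so startup is
    near-instant and the mapping can be shared read-only across processes.
    """
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Zero-copy view over the mapping; no full read into RAM up front.
    return np.frombuffer(mm, dtype=np.float32)
```

Touching a slice only pulls in the pages backing that slice. But since each generated token reads essentially every weight, if the model exceeds RAM the OS keeps evicting and re-reading cold pages, which is exactly why throughput tanks without enough physical memory.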
There are people working on compute-in-memory hardware, like flash chips that can do matrix multiplication in place. None of it is close to reaching the market but there's a lot more interest now that neural networks are obviously useful.
Which basically means Android devices, as far as mainstream consumers are concerned, given the state of the PC market.
If on the other hand we look at the embedded market, it is also not somewhere Nvidia has any kind of strong presence, with the exception of them trying to get into a good position in the autonomous-driving market.
This isn't about mainstream consumers either. This is about a technological development where the balance between performance and energy consumption is attacked from both ends, where one vendor has the performance and the other vendor has the energy consumption but neither have both (yet).
Yes they are. But it's not about that either. You are conflating end-user products with two companies developing to the same central point from opposing start positions. That is what this is about.
Qualcomm hasn't been able to manufacture desktop-class ARM devices yet (see SQ1 and X13 ARM edition), and the GPU IP doesn't really hold a candle to what else is out there either. They are however using significantly less power than say, x86 SoCs and GPUs from AMD, Intel or Nvidia.
Nvidia on the other hand has plenty of IP for GPUs and AI workloads that is proven to have high performance, even at the Tegra scale of things. The Jetson product line, as an example, still outperforms whatever Qualcomm has come up with so far. But those parts consume more power and don't even come close to the same thermal envelope.
So Nvidia needs to reduce their energy needs, while Qualcomm needs to create actually performant devices for real-time AI workloads. That's why.