
ChatGPT 3.5 is the baseline people expect LLMs to be at; it will take 2-3 generations (3-4 years) of hardware before we can reach that. Anything below that is just going to get bad reviews.



A small model might be useful for e.g. NPC interactions in a Quest game


3-4 years to run it on your phone seems generous, barring algorithmic breakthroughs.

If I can run 100B+ models on my high-end desktop in 3-4 years I will be very happy.


What do you consider a high end desktop now?


He means a video card with 512GB of memory or more.


Is 512GB a typo? The current biggest consumer card has 24GB, so we're probably ~15 years from a 512GB card (judging from the increase from 4GB to 24GB between 2012 and 2022).
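
A rough back-of-envelope check of that extrapolation, assuming VRAM keeps growing at the same compound rate it did from 2012 to 2022 (the 4GB, 24GB and 512GB figures are from above; the constant-growth assumption is mine):

  # Extrapolate top consumer-GPU VRAM, assuming the 2012-2022 growth rate
  # (4 GB -> 24 GB over ten years) simply continues.
  import math

  cagr = (24 / 4) ** (1 / 10)                    # ~1.196x per year
  years = math.log(512 / 24) / math.log(cagr)    # years after 2022 to reach 512 GB
  print(f"~{years:.0f} years")                   # ~17 years, close to the estimate above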


That's why 3-4 years would be impressive


With 4-bit quantization the requirements are more like 64GB.

I expect we'll see more unified memory designs like Apple's 128GB M1 Ultra.
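
A quick sketch of where a number like 64GB comes from (the 4-bit figure is from above; the overhead allowance is my own ballpark assumption):

  # Back-of-envelope memory for a 100B-parameter model quantized to 4 bits.
  # Weights dominate; the overhead allowance (KV cache, activations,
  # quantization scales) is a rough assumption, not a measurement.
  params = 100e9
  weights_gb = params * 4 / 8 / 1e9    # ~50 GB of raw weights at 4 bits each
  overhead_gb = 10                     # assumed KV cache + scales + activations
  print(f"~{weights_gb + overhead_gb:.0f} GB")   # ~60 GB, in the 64GB ballpark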


I doubt it, to be honest. Desktop GPUs use too much power (and hence produce too much heat) to be integrated in that fashion, and any kind of shared memory would have too much latency.


There are 'desktop' (well, server) CPUs with 64GB of HBM memory per socket now. And big LLMs can be run on lower-memory-bandwidth systems (like Zen 4 chips with 12 DDR5 channels per socket) at lower performance, but where 1-2TB of RAM is no big deal.
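
To put a rough number on "lower performance": single-stream decoding is mostly memory-bandwidth bound, so tokens/second is roughly bandwidth divided by the bytes streamed per token. The bandwidth figures below are approximate assumptions for illustration:

  # Crude decode-speed estimate for a bandwidth-bound LLM: each generated
  # token streams (roughly) the entire weight set from memory once.
  # Bandwidth numbers are approximate assumptions.
  def tokens_per_second(model_gb: float, bandwidth_gb_s: float) -> float:
      return bandwidth_gb_s / model_gb

  model_gb = 33   # e.g. a 65B model at ~4 bits per weight
  for name, bw in [("12-channel DDR5 socket", 460), ("HBM socket", 1000)]:
      print(f"{name}: ~{tokens_per_second(model_gb, bw):.0f} tok/s")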


But for what applications? Sure, for answering free-form questions I expect GPT-3.5+ quality. I don't think GPT-3.5 is necessary to provide auto-complete in your email client.


Llama 65B fine-tunes already exceed it in some niches, like roleplaying or specific (coding and spoken) languages.


Isn't that the same LLM that doesn't know how many e's are in "ketchup"? Nice.


Aren't you the same user who doesn't understand how subword tokenization works?


You must have me confused with an LLM bot that does know how many e's are in "ketchup".
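
For what it's worth, the letter-counting failure really is a tokenization artifact: the model sees subword token IDs, never individual characters. A small sketch using OpenAI's tiktoken library (assuming it's installed; the exact split depends on the vocabulary):

  # Character counting is trivial in code, but an LLM operates on subword
  # token IDs rather than letters, which is why these questions trip it up.
  # Assumes the tiktoken package is installed.
  import tiktoken

  print("ketchup".count("e"))            # 1, trivially, at the character level

  enc = tiktoken.get_encoding("cl100k_base")
  ids = enc.encode("ketchup")
  print(ids, [enc.decode([i]) for i in ids])   # subword pieces, not letters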



