Storage is effectively free. I can fit the entire Vicuna LLM on a $5 (US retail) SD card right now, and in a decade that’ll be less than $0.50. LLMs will also presumably get more efficient at storing factual data outside of the network parameters.
People can download Wikipedia pages, but weirdly they don’t. They prefer the interactive web to big downloads of static encyclopedia data, even though the latter is better and easier to distribute in a censorship-resistant manner. That’s why Tor is much more popular than high-latency censorship-resistant networks that share static blobs, even though low-latency networks are relatively insecure and much easier to block. The question then is not how to save $3 of storage media: it’s whether you can simulate the interactive experience of the Internet in a satisfying way without a reliable low-latency network connection. Maybe the answer is no, maybe it’s yes. Imagine a really high-quality future LLM that you can actually interact with and ask questions about specific topics, maybe even using voice communication, and that can play games with you and generate other types of “web-like” experiences. Is that really less compelling than a download of Wikipedia?
People in regions where Internet access is a problem to begin with do not download Wikipedia because they don't have the ability to do so. That's why Kiwix has side projects such as a Raspberry Pi-based content server that can be set up in e.g. a school and used to serve content to all students using whatever cheap phones they might have, or a reader that runs on WinXP. And, yes, that stuff is actually deployed in various places in Africa.
Now imagine the hardware that you'll need to run that high-quality future LLM. How many Kiwix hotspots could you set up for the same money?
Mind you, I'm not saying that there aren't compelling use cases for LLMs. It's just that they're a nice-to-have after you have Wikipedia etc. Indeed, ideally you'd want Wikipedia indexed so that the LLM can query it as needed to construct its responses - otherwise you're going to have a lot of fun trying to figure out which parts are hallucinations and which aren't.
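The "LLM querying an indexed Wikipedia" idea above is essentially retrieval-augmented generation. A minimal sketch of the shape of it, using a toy in-memory corpus and naive word-overlap scoring as a stand-in for a real index (all names here - `search`, `build_prompt`, the sample articles - are illustrative, not any actual Kiwix or Wikipedia API):

```python
from collections import Counter

# Toy stand-in for an indexed offline Wikipedia dump.
ARTICLES = {
    "Lagos": "Lagos is the largest city in Nigeria and a major African port.",
    "Kiwix": "Kiwix is an offline reader that serves Wikipedia content without Internet access.",
    "Tor": "Tor is a low-latency anonymity network that routes traffic through relays.",
}

def search(query, corpus, k=1):
    """Rank articles by naive word overlap with the query.

    A real deployment would use a proper full-text or vector index;
    this is only meant to show where retrieval slots in.
    """
    q = Counter(query.lower().split())
    scored = []
    for title, text in corpus.items():
        overlap = sum((q & Counter(text.lower().split())).values())
        scored.append((overlap, title))
    scored.sort(reverse=True)
    return [title for _, title in scored[:k]]

def build_prompt(question, corpus):
    """Prepend the retrieved passage so the model answers from sourced text.

    Grounding the answer in a quoted passage is what makes hallucinations
    detectable: you can check the response against the cited context.
    """
    top = search(question, corpus)[0]
    return (
        f"Context ({top}): {corpus[top]}\n"
        f"Question: {question}\n"
        f"Answer using only the context above."
    )

prompt = build_prompt("How does Kiwix serve content offline?", ARTICLES)
# `prompt` would then be passed to whatever local LLM is available.
```

The point of the design is the last comment: the model never answers from its weights alone, so every claim in its response can be traced back to a passage in the offline dump.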
The situation I’m considering is not people who are sitting in regions where Internet connectivity is fundamentally poor because there is no infrastructure. The situation I’m considering is one where Internet connectivity exists but is being deliberately throttled or filtered: either through periodic Internet shutdowns (as is happening in Pakistan right now, as described in TFA) or through ubiquitous content and site filtering (as happens routinely in China and increasingly in other nations).
There are places where Internet access and compute resources are fundamentally limited by lack of infrastructure. Talking about LLMs in those places doesn’t make sense. But on the trajectory we’re currently on, those places will likely become more rare while political Internet filtering will become more common.