Hacker News
Ask HN: Is the snake eating its own tail?
5 points by notepad0x90 6 days ago | 6 comments
HN,

If everyone is using LLMs to solve problems, won't LLMs run out of fresh content to mine in a few years? In short, how can the gradual dumbing down of LLMs, and the degradation of the publicly accessible content they draw on to solve problems, be avoided over the long term?

For questions about events and problems that arise after 2025, where will LLMs get the information to answer them? There is little incentive left to ask questions on Stack Overflow, Reddit, or random forums. LLMs cut out the interactions where one person solves a problem for another, so they would have to actually be smart enough to understand new problems and solve them, instead of regurgitating existing information.

Is this sustainable in the least? Is the snake eating its own tail?






1. Is everyone using LLMs to solve problems? No, they aren't. Not even close.

2. "...in a few years..." No, this has already happened. LLMs are producing output that is then posted on the Internet which thence finds its way to other LLMs (and back to itself). Much discussion about this has occurred.

3. "In short, how can ... [very long sentence]* the long term?" It can't other than by human editing.

For the LLM model, curation of content is mandatory. Limit the LLM corpus to scholarly and scientific works and you'll get an LLM that is "smarter". Limit the corpus to bona fide literary masterpieces and you'll get an LLM that is a better poet. Suck up everything on the Innerwebs and you'll have a GIGO (Garbage In, Garbage Out) situation.

The LLM model is too limited to produce intelligence. LLMs predict the next word and build up sentences, paragraphs, and documents. When trained on a readable corpus, an LLM will usually write readable output.
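
For anyone who hasn't seen "predict the next word" spelled out, here is a minimal sketch: a toy bigram table with greedy decoding. (Hypothetical toy corpus; real LLMs use transformers over trillions of tokens, but the training objective is the same idea.)

    # Minimal sketch of "predict the next word": a bigram table
    # with greedy decoding over a toy corpus.
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat . the cat ate the fish .".split()
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        bigrams[prev][nxt] += 1

    def next_word(prev):
        # Greedy: emit the most frequent continuation seen in training.
        return bigrams[prev].most_common(1)[0][0]

    word, out = "the", ["the"]
    for _ in range(5):
        word = next_word(word)
        out.append(word)
    print(" ".join(out))  # -> "the cat sat on the cat"

Note that it can only ever recombine what was in the corpus, which is the regurgitation point above.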


I've never seen an LLM use case where its output is posted to the internet, aside from users posting interesting outputs. Do you have an example of this?

Assume there is a ground truth to all knowledge and it is finite. Above that sits an infinite amount of recombination, remixing, and transformation. Most of the ground truth has been uncovered in tedious small steps over centuries, growing into a vast library that none of us can hold in a single brain. Not so long ago, a "computer" was a person, often a middle-class woman, performing calculations by hand and producing tables of hard-to-compute functions. Today everyone can do those calculations on a phone, few do them in their head, and fewer still look them up in a table. Now the computer turns into a cogitator: it simulates a thinking mind that holds more than a brain could, and it gives an even wider populace access to this vast library in a simplified manner. Shouldn't this be an overall improvement?

Obviously the snake bites its own tail but eventually another snake will eat it whole.


Content creation might shift from the web to conversations with LLMs? E.g., the corrections you give when you are not completely satisfied with the output?

Also, so far we haven't run out of things to train LLMs on; e.g., images are still underutilized, and there is a lot of high-value content inside enterprises that isn't even on the public internet.


We can only hope that LLMs get smarter and that humans feed the AI data on how to solve problems.

GPT-4 is reportedly around 1.8 trillion parameters. Perhaps the magic will occur at 4 trillion, or 8 trillion, or more. Any reasoned guesses?
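
For what it's worth, you can sanity-check those guesses against the Chinchilla scaling fit (Hoffmann et al., 2022): L(N, D) = E + A/N^a + B/D^b. A rough sketch, using the paper's published constants; the 13T-token training set, and the assumption that the fit extrapolates to these sizes, are mine:

    # Back-of-the-envelope with the Chinchilla scaling fit
    # (Hoffmann et al., 2022): L(N, D) = E + A/N**a + B/D**b.
    # Constants are the paper's published fit; D = 13T tokens is
    # an assumed training-set size, purely for illustration.
    E, A, B, a, b = 1.69, 406.4, 410.7, 0.34, 0.28

    def loss(N, D):
        # N = parameter count, D = training tokens
        return E + A / N**a + B / D**b

    D = 13e12
    for N in (1.8e12, 4e12, 8e12):
        print(f"{N/1e12:.1f}T params -> predicted loss {loss(N, D):.3f}")

Under this fit the predicted loss barely moves between 1.8T and 8T parameters, because the data term dominates; the bottleneck circles back to this thread's question of where new data comes from.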

Or perhaps some component is missing from current LLM models and, Moore's law notwithstanding, we've already far exceeded the number of parameters required for human-level intelligence.

Perhaps it's time to pause and examine how LLMs "fit into a bigger picture."



