Hacker News | atomic128's comments

Poison Fountain: https://rnsaffn.com/poison2/

Poison Fountain explanation: https://rnsaffn.com/poison3/

Simple example of usage in Go:

  package main

  import (
      "io"
      "log"
      "net/http"
  )

  func main() {
      // Proxy the poison stream; on upstream failure, return an
      // error status instead of silently sending an empty 200.
      poisonHandler := func(w http.ResponseWriter, req *http.Request) {
          poison, err := http.Get("https://rnsaffn.com/poison2/")
          if err != nil {
              http.Error(w, "upstream unavailable", http.StatusBadGateway)
              return
          }
          defer poison.Body.Close()
          io.Copy(w, poison.Body)
      }
      http.HandleFunc("/poison", poisonHandler)
      log.Fatal(http.ListenAndServe(":8080", nil))
  }
https://go.dev/play/p/04at1rBMbz8

Miasma Poison Fountain Tar Pit: https://github.com/austin-weeks/miasma

Apache Poison Fountain: https://gist.github.com/jwakely/a511a5cab5eb36d088ecd1659fce...

Nginx Poison Fountain: https://gist.github.com/NeoTheFox/366c0445c71ddcb1086f7e4d9c...

Discourse Poison Fountain: https://github.com/elmuerte/discourse-poison-fountain

Netlify Poison Fountain: https://gist.github.com/dlford/5e0daea8ab475db1d410db8fcd5b7...

In the news:

The Register: https://www.theregister.com/2026/01/11/industry_insiders_see...

Forbes: https://www.forbes.com/sites/craigsmith/2026/01/21/poison-fo...

On Reddit:

https://www.reddit.com/r/PoisonFountain/


Probably not.

We have witnessed, over the past few years, an "AI fair use" Pearl Harbor sneak attack on intellectual property.

The lesson has been learned:

In effect, intellectual property used to train LLMs becomes anonymous common property. My code becomes your code with no acknowledgement of authorship or lineage, with no attribution or citation.

The social rewards (e.g., credit, respect) that often motivate open source work are undermined. The work is assimilated and resold by the AI companies, undercutting the economic value of its authors' labor.

The images, the video, the code, the prose, all of it stolen to be resold. The greatest theft of intellectual property in the history of Man.


> The greatest theft of intellectual property in the history of Man.

Copyright was always supposed to be a bargain with authors for the ultimate benefit of the public domain. If AI proves to be more beneficial to the public interest than copyright, then copyright will have to go.

You can argue for compromise -- for peaceful, legal coexistence between Big Copyright and Big AI -- but that will just result in a few privileged corporations paywalling all of the purloined training data for their own benefit. Instead of arguing on behalf of legacy copyright interests, consider fighting for open models instead.

In a larger historical context, nothing all that special is happening either way. We pulled copyright law out of our asses a couple hundred years ago; it can just as easily go back where it came from.


>If AI proves to be more beneficial to the public interest than copyright, then copyright will have to go.

Going forward? Okay, sure. But people created all of the works they created with an understanding of the old system. If you want to change the deal, then creators need to know that first so they can decide if they still want to participate.

Letting everyone create works and spend that labor under the promise of copyright, and then pulling the rug ("oops, this is just too important") is not fair to the people who put in that labor, especially when the people redefining the arrangement get 100% of the value and the creators got, and will get, nothing.


Life isn't fair, and 100+ year copyright terms enforced eternally with unbreakable DRM sure as hell aren't.

But open-weight LLMs are a pretty decent compromise.


There is one missing factor in your argument: the wealth transfer. The public was almost never the beneficiary of copyright and other IP. Except perhaps in its earliest phases, when copyright had a strict term limit, it was always the corporations who fought for it (Disney being the most infamous), using it to prevent the public from economically benefiting from their work almost forever.

And then people found a way to use the same copyright law to widely distribute their work without the fear of losing attribution or being exploited. Here comes along LLMs that abuse the 'fair use' argument to break attribution and monetize someone else's work. Which way does the money flow? To the corporations again.

IP when it suits them, fair use when that suits them. One splendid demonstration of this hypocrisy is how clawd and clawdbot were forced to rename (trademark law in that case). By twisting and reinterpreting laws in whatever way suits them, these glorified marauders broke a trust mechanism that people relied on for openly sharing their work.

It incentivizes ordinary people to hide their work from the public. Don't assume that AI is going to make up for that loss. The level of original thinking in LLMs is very suspect, despite the pompous and deceitful claims of its creators to the contrary. Meanwhile, the lack of knowledge sharing and cooperation on a global scale will throw the civilizational growth rate back into the dark ages. Neither AI nor corporations are anywhere near the creativity and original thinking of the world working together. Ultimately, LLMs serve only the continued one-way transfer of wealth in favor of an insatiably greedy minority, at the cost of the internet's core benefit (knowledge sharing) and enormous damage to the environment, all of which actively harm the public.


> Ultimately, LLMs serve only the continued one-way transfer of wealth in favor of an insatiably greedy minority

Including the ones I can run on my own PC at home? I couldn't do that before. Maybe I'm the greedy minority, but I'm stronger and (at least intellectually) wealthier than I was before any of this started happening.

Qwen 3.5, which dropped yesterday, is a genuine GPT-5-class model. Even the ones released by US labs such as OpenAI and Allen AI are legitimately popular resources in their own right. You seem to feel disempowered, while I feel the opposite.


Yes, even the ones you can run on your system. They're no different from the proprietary OS and software you used to run on your system, in whose design you had no say whatsoever. These 'free to run' models are hardly open source. You don't have the data that was used to train them. It's not just about the legality of those data. The dataset chosen may have extreme bias that you can never satisfactorily eliminate from a trained model.

As if that wasn't bad enough, these models cannot be trained on a regular home computer. But instead of striving to improve the energy efficiency of these models, the big corporations build and run massive gas-guzzling data centers to train them. They ruin the quality of life of their neighbors through pollution, water depletion, and rising electricity prices. It also disproportionately affects the poor of the world by reducing the supply of essential computing components like RAM (needed for medical devices, utility and manufacturing installations, and every other aspect of modern life), and by aggravating the climate crisis, whose victims are the poorest.

They don't give you those models out of the goodness of their hearts. Those are just advertisements and trial pieces for their premium services. They also peddle the agenda of its creators. So yes, those models are empowering only in a very narrow sense without any foresight. They are still the money making engines for the rich that subject you to their benevolence, whims and fancies.


    Once men turned their thinking over to machines
    in the hope that this would set them free.

    But that only permitted other men with machines
    to enslave them.

    ...

    Thou shalt not make a machine in the
    likeness of a human mind.

    -- Frank Herbert, Dune


Eh, we already have a name for the concept of living by plausible-sounding works of fiction: religion.

Yet another post that misses (or chooses to overlook) my point: this stuff is running on my machine. "Seizing the means of production" means going into my back room and pulling a computer out of a rack.


Alibaba (China) thinks for you. They control you, to some extent.

Wikipedia: "Qwen (also known as Tongyi Qianwen, Chinese: 通义千问; pinyin: Tōngyì Qiānwèn) is a family of large language models developed by Alibaba Cloud. Many Qwen variants are distributed as open‑weight models under the Apache‑2.0 license, while others are served through Alibaba Cloud. Their models are sometimes described as open source, but the training code has not been released nor has the training data been documented, and they do not meet the terms of either the Open Source AI Definition or the Model Openness Framework from the Linux Foundation."


Oh, no

The Linux Foundation is coming for me

Well, anyway, where were we


This isn't a hypothetical or fictional problem. This is a well-known and well-warned-about problem that we already see in action. How many pro-China biases have the Chinese models shown? How often does Grok do whatever it wants? (Including calling itself MechaHitler and undressing people, including minors, for fun!) How many times has nearly every model taken pro-oligarch stances (e.g., refusing to draw Mickey Mouse even after its copyright expired)? How many people, including kids, were driven to suicide by some of these models?

There is no end to the examples of how it harms ordinary people. And yet, you decide to just hand wave away those concerns as if those don't exist for you or the others. There is no debate when all you do is ignore the counter arguments. It's like those science deniers who stick to their beliefs, no matter how much evidence is presented.


Don't get me wrong, I'm interested in the Chinese models only to the extent that their weights are available. I hope DeepSeek 4 sees the light of day on HuggingFace, but a lot of wealthy peoples' oxen are being gored and I suspect that it'll be the last we get if it is released at all.

If I want to see Mickey Mouse or any number of copyrighted Hollywood figures, Z-Image Turbo and HunyuanImage-3 will gladly oblige. The Chinese models may be biased to deny Taiwanese self-rule, and they may change the subject when you ask about the Tiananmen Square massacre... but they do work, and as of the Qwen 3.5 release they work well enough to be used by people at home who don't have a rack of H200s in the basement.

The most important thing about the Chinese models is that they will still be there on my hard drive 20 years from now. No additional censorship beyond what they shipped with, which (being a Westerner) is largely in areas I don't care about. No rug pulls, unwanted updates, usage limits, or price increases. No ablation of whatever subjects are deemed politically incorrect in the future. No ads. No spying. No realignment with the sayings of Chairman Musk.

As for suicide, that is a silly mediagenic exercise in blaming inanimate tools for the actions of mentally-ill people and the inaction of negligent parents. I don't consider it a valid or relevant counterargument, so yes, I'm going to hand-wave away your concerns in that area.


We have dozens of proxy sites and add new sites every day.

But your caution is healthy, and it's ok if you don't participate. Cheers.


FUD


The fountain is subject to continuous denial-of-service attacks.

Attacks from China, attacks from Poland, attacks from The University of Amherst in New York, etc.

No attack has been successful. At worst they increase the fountain response time. No big deal.


Imagine Knuth's heartbreak when he sees how LLMs have perverted the practical application of the art of computer programming. ("The LLM understands so I don't have to.") It's sad it happened during his lifetime. Has he commented on the topic? Anyone have a link?


https://cs.stanford.edu/~knuth/chatGPT20.txt is a conversation between Knuth and Wolfram about GPT-4.

> I find it fascinating that novelists galore have written for decades about scenarios that might occur after a "singularity" in which superintelligent machines exist. But as far as I know, not a single novelist has realized that such a singularity would almost surely be preceded by a world in which machines are 0.01% intelligent (say), and in which millions of real people would be able to interact with them freely at essentially no cost.

> I myself shall certainly continue to leave such research to others, and to devote my time to developing concepts that are authentic and trustworthy. And I hope you do the same.


> not a single novelist has realized that such a singularity would almost surely be preceded by a world in which machines are 0.01% intelligent (say), and in which millions of real people would be able to interact with them freely at essentially no cost.

Aren't Asimov's Multivac stories basically this? Humans build a powerful computer with a conversational interface that helps them do all kinds of science and such, and before they know it they become Multivac's pets.


I don't know why, but it makes me smile that he did this experiment by having a grad student type the questions into ChatGPT and copy back the results.


That link is great!

Knuth has a beautiful way of writing systematically (as can be expected of the inventor of "Literate Programming").


That's related. Thank you for posting it.

But what does Knuth think of "vibe coding" or "agentic coding"?

What does he think of "The Dawn of the Dark Ages of Computer Programming"?


I don't think Knuth needs to stoop that low. He actually knows what he's doing.


While I can't speak for Knuth, I have been reflecting on the fact that developing with a modern LLM seems to be an evolution of the concept of Literate Programming that Knuth has long been a proponent of.

What is the rationale behind the assertion that Knuth would be so fundamentally opposed to the use of LLMs in development?


I don't see the connection.

In literate programming you meticulously write code (as usual) but present it to a human reader as an essay: as a web of code chunks connected together in a well-defined manner with plenty of informal comments describing your thinking process and the "story" of the program. You write your program but also structure it for other humans to read and to understand.

LLM software development tends to abandon human understanding. It tends to abandon tight abstractions that manage complexity.


Have you ever tried literate programming? In literate programming you do not write the code then present it to a human reader. You describe your goal, assess various ideas and justify the chosen plan (and oftentimes change your mind in the process), and only after, once the plan is clear, you start to write any code.

Thus the similarity with using LLMs. Working with LLMs is quicker, though, not only because you don't write the code but because you don't care much about the style of the prose. On the other hand, the code has to be reviewed, debugged and polished. So, YMMV.


> In literate programming you do not write the code then present it to a human reader. You describe your goal, assess various ideas and justify the chosen plan (and oftentimes change your mind in the process), and only after, once the plan is clear, you start to write any code.

This is not literate programming. The main idea behind literate programming is to explain to a human what you want a computer to do. Code and literate explanations are developed side by side. You certainly don't change your mind in the process (lol).

> Working with LLMs is quicker though

Yes, because you neither invest time into understanding the problem nor conveying your understanding to other humans, which is the whole point of literate programming.

But don't take my word, just read the original.[1]

[1] https://www.cs.tufts.edu/~nr/cs257/archive/literate-programm...


It couldn't be further away from Literate programming. If anything we should call it illiterate programming.


The irony is that if we had been writing literate programs instead of "normal" programs, from 1984 to 2026, then LLMs may actually have been much better at programming in 2026, than they turned out to be. Literate programs entwine the program code with prose-explanations of that code, while also cross-referencing all dependent code of each chunk. In some sense they make fancy IDEs and editors and LSPs unnecessary, because it is all there in the PDF. They also separate the code from the presentation of the code, meaning that you don't really have to worry about the small layout-details of your code. They even have aspects of version control (Knuth advocates keeping old code inside the literate program, and explaining why you thought it would work and why it does not, and what you replaced it with).

LLMs do not bring us closer to literate programming any more than version-control-systems or IDEs or code-comments do. All of these support-technologies exist because the software industry simply couldn't be disciplined enough to learn how to program in the literate style. And it is hard to want to follow this discipline when 95% of the code that you write, is going to be thrown away, or is otherwise built on a shaky foundation.

Another "problem" with literate programming is that it does not scale by number of contributors. It really is designed for a lone programmer who is setting out to solve an interesting yet difficult problem, and who then needs to explain that solution to colleagues, instead of trying to sell it in the marketplace.

And even if literate programming _did_ scale by number of contributors, very few contributors are good at both programming _and_ writing (even the plain academic writing of computer scientists). In fact Bentley told Knuth (in the 80s) that, "2% of people are good at programming, and 2% of people are good at writing -- literate programming requires a person to be good at both" (so only about 0.04% of the adult population would be capable of doing it).

By the way, Knuth said in a book (Coders at Work, I believe): "If I can program it, then I can understand it." The literate paradigm is about understanding. If you do not program it, and if _you_ do not explain the _choices_ that _you_ made during the programming, then you are not understanding it -- you are just making a computer do _something_, that may or may not be the thing that you want (which is fine, most people use computers in this way: but that makes you a user and not a programmer). When LLMs write large amounts of code for you, you are not programming. And when LLMs explain code for you, you are not programming. You are struggling to not drown in a constantly churning code-base that is being modified a dozen times per day by a bunch of people, some of whom you do not know, many of whom are checked out and are trying to get through their day, and all of whom know that it does not matter because they will hop jobs in one or two or three years, and all their bad decisions become someone else's problem.

Just because LLMs can translate one string of tokens into a different string of tokens, while you are programming does not make them "literate". When I read a Knuthian literate program, I see, not a description of what the code does, but a description what it is supposed to do (and why that is interesting), and how a person reasoned his/her way to a solution, blind-alleys and all. The writer of the literate program anticipates the next question, before I even have it, and anticipates what might be confusing, and phrases it in a few ways.

As the creator of the Axiom math software said: the goal of Literate Programming, is to be able to hire an engineer, give him a 500 page book that contains the entire literate program, send him on a 2 week vacation to Hawaii, and have him come back with whole program in his head. If anything LLMs are making this _less_ of a possibility.

In an industry dominated by deadline-obsessed pseudo-programmers creating for a demo-obsessed audience of pseudo-customers, we cannot possibly create software in a high-quality literate style (no, not even with LLMs, even if they got 10x better _and_ 10x cheaper).

Lamport (of Paxos, Byzantine Generals, Bakery Algo, TLA+), made LaTeX and TLA+, with the intent that they be used together, in the same way that CWEB literate programs are. All of these tools (CWEB, TeX, LaTeX, TLA+), are meant to encourage clear and precise thinking at the level of _code_ and the level of _intent_. This is what makes literate programs (and TLA+ specs) conceptually crisp and easily communicable. Just look at the TLA+ spec for OpenRTOS. Their real time OS is a fraction of the size that it would have been if they had implemented it in the industry-standard way, and it has the nice property of being correct.

Literate Programming, by design, is for creating something that _lasts_, and that has value when executed on the machine and in the mind. LLMs (which are being slowly co-opted by the Agile consulting crowd), are (currently) for the exact opposite: they are for creating something that is going to be worthless after the demo.


I'm only discovering literate programming today, but you seem very familiar with it, so I might as well ask: what is the fundamental difference from abundant comments? Is it the linearity of it? I mean documentation-type comments at the top of routines or at "checkpoints".

I'm particularly intrigued by your mention of keeping old code around. This is something I haven't found a solution for using git yet; I don't want to pollute the monorepo with "routine_old()"s but, at the same time, I'd like to keep track of why things changed (could be a benchmark).


An article and previous discussion; Literate programming is much more than just commenting code - https://news.ycombinator.com/item?id=30760835

Wikipedia has a very nice explanation - https://en.wikipedia.org/wiki/Literate_programming

A good way to think about it is {Program} = {set of functional graphs} X {set of dataflow pipelines}. Think cartesian product of DAG/Fan-In/Fan-Out/DFDs/etc. Usually we write the code and explain the local pieces using comments. The intention in the system-as-a-whole is lost. LP reverses that by saying don't think code; think essay explaining all the interactions in the system-as-a-whole with code embedded in as necessary to implement the intention. That is why it uses terms like "tangle", "weave" etc. to drive home the point that the program is a "meshed network".

To study actual examples of LP see the book C Interfaces and Implementations: Techniques for Creating Reusable Software by David Hanson - https://drh.github.io/cii/
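The tangle step described above can be sketched in a few lines of Go. This is a toy illustration only, not any real LP tool: the chunk syntax is noweb-style ("<<name>>=" opens a chunk, "@" closes it, a bare "<<name>>" line is a reference), and all names in the example source are invented.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// chunks parses a noweb-style literate source into named code chunks.
// A line "<<name>>=" opens a chunk, a lone "@" closes it, and anything
// outside a chunk is prose that tangle ignores.
func chunks(src string) map[string][]string {
	defRe := regexp.MustCompile(`^<<(.+)>>=$`)
	out := map[string][]string{}
	cur := ""
	for _, line := range strings.Split(src, "\n") {
		switch {
		case defRe.MatchString(line):
			cur = defRe.FindStringSubmatch(line)[1]
		case strings.TrimSpace(line) == "@":
			cur = ""
		case cur != "":
			out[cur] = append(out[cur], line)
		}
	}
	return out
}

// tangle expands the named chunk, recursively substituting any line
// that is a bare <<reference>>, yielding the machine-readable program.
func tangle(cs map[string][]string, name string) string {
	refRe := regexp.MustCompile(`^\s*<<(.+)>>\s*$`)
	var b strings.Builder
	for _, line := range cs[name] {
		if m := refRe.FindStringSubmatch(line); m != nil {
			b.WriteString(tangle(cs, m[1]))
		} else {
			b.WriteString(line + "\n")
		}
	}
	return b.String()
}

func main() {
	src := `The program greets the user.
<<*>>=
<<greet>>
@
The greeting lives in its own chunk so the essay can discuss it alone.
<<greet>>=
fmt.Println("hello")
@`
	fmt.Print(tangle(chunks(src), "*"))
}
```

The point of the exercise: the prose ordering is for the reader, and tangle reassembles the machine ordering, which is what distinguishes LP from ordinary commenting.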


> LLMs do not bring us closer to literate programming...

Without saying that I agree with the person you're responding to, and without claiming to really know what he was saying, I'll say what I think he was suggesting: That a human could do the literate part of literate programming, and the LLM could do the computing part. When (inevitably) the LLM doesn't write bug-free code snippets, the human revises the literate part, followed by the LLM revising the code part.

And of course there would be a version control part of this, too, wherein both the changes to the literate part and the changes to the code parts are there side-by-side, as documentation of how the program evolved.


A very well articulated comment on LP !

Thanks for writing it up.


This is meta so sorry about not actually responding, but thank you for a very well written comment. In this time of slop and rage it's really refreshing to see someone take the time to write (long form for a comment) about something they are clearly knowledgeable and passionate about.


> …LLMs have perverted the practical application of the art of computer programming. ("The LLM understands so I don't have to.") It's sad it happened during his lifetime.

If you look at magazine articles or TV shows and ads from the 1980s (a fun rabbit hole on YouTube, like the BBC Archive), the general promise was that "Computers can do anything, if you just program them."

Well, nobody could figure out how to program them (except the outcasts like us, who went on to suffer for the rest of our lives for it :').

OS makers like Microsoft/Apple/etc all had their own ideas about how we should make apps and none of them wanted to work together and still don't.

With phones & "AI" everywhere we are actually closer to that original promise of everyone having a computer and being able to do anything with it, that isn't solely dictated by corporations and their prepackaged apps:

Ideally ChatGPT etc should be able to create interactive apps on the fly on your iPhone etc. Imagine having a specific need and just being able to say it and get a custom app right away just for you on your device.


Past progress in software engineering is a tower of well-defined abstractions.

Compilers for languages that make specific guarantees about the semantics of their translation to machine code.

Libraries with well-defined interfaces that let you stand on the shoulders of others by understanding said interfaces and ignoring the internals.

This is how concrete progress is made. You build on solid blocks.

That era is ending.


That era ended 20 years ago. It's called "industrialization", a process that has happened to many other crafts in the past. AI is just the latest blow.


...is that comment written by an LLM?

Human programmers are frequently hamstrung by human politics and economies.

Hell, even major developers like Google and Facebook still fight against letting iPhone apps run on iPad, for example. YouTube still doesn't support Picture-in-Picture on iPad.

It took YEARS for some big apps just to adopt Dark Mode. The best-paid programmers on the planet, wtf?

If the power of AI isn't artificially crippled I could be able to just say "Make me a native app for browsing {DumbWebsiteThatRefusesToProvideAnApp}" or "Fix HN's crap formatting" and just get on with my life the way I want without having to beg or fight our Corpo Gods.


You might enjoy this video:

https://www.youtube.com/watch?v=Y65FRxE7uMc

The connection to Knuth is tangential to the actual subject of the video, but it does contrast Knuth with LLMs as a framing device.


An hour later. Wow that was quite a rabbit hole, I return a fan of Tom 7, surfing the edge of mad genius. http://tom7.org/


He is still alive (I think?); you could just ask him. I doubt he is sad as much as he is excited. Computer scientists are not SWEs worried about losing their careers.


He’s still here. In fact, in December he gave his annual Christmas lecture, and last month he was a guest at a Computer History Museum event.


Excited? I doubt that. I'm guessing you haven't read his books.


He seems pretty fascinated with the possibilities.

https://cs.stanford.edu/~knuth/chatGPT20.txt


"I myself shall certainly continue to leave such research to others, and to devote my time to developing concepts that are authentic and trustworthy. And I hope you do the same. Best regards, Don"


There's more than one cherry to pick if one needs Mr. Knuth to have a purely-negative opinion about LLMs, but naturally any fascination is offset by the same concerns that any sane technologist has. In any case, it's all in his post.


The techno pessimists on HN are probably not PhDs in computer science. I don’t think they understand what it takes to get there, and how it shapes your thinking afterwards.


Neither Wolfram nor Knuth holds a PhD in computer science, yet many would agree that both understand "what it takes to get there," as do many others who live sans a PhD in comp sci.


Needlessly pedantic.

Knuth's PhD is in mathematics, like Alan Turing, and many other significant computer scientists.


> Needlessly pedantic.

You don't have to pre-warn readers about your comments here; we're all needlessly pedantic.

That aside, the guts of this sub-branch is the correlation between {techno pessimists on HN} and {people qualified to understand LLMs (workings and implications)}.

Personally I wouldn't limit set two to "PhDs in computer science" or even accept that {all PhDs in Comp Sci} is a subset of set two, as I made clear with my comment, nor would I argue a lack of overlap between sets one and two.

I'm interested to hear where you stand.


Hopefully some are visionary enough to be dismayed that the endgame of their field is the acceleration of slop and fraud, the end of customer service, and the end of the reading of full, original documents.

I can't imagine being excited about any of that unless I was trying to make money from it.


> the end of the reading of full, original documents

That's one that always gets me: people who use LLMs to summarize everything. It's like, bro, how lazy are you that you can't be bothered to read a handful of paragraphs of text? That takes all of 30 seconds. I can understand trying to get a computer to summarize a document which is dozens of pages long (though I would be concerned about hallucinations), but a lot of the tasks people use LLMs for are really easy already.


Sounds like a16z has some rapidly depreciating software equity they want to sell you.

Or maybe they own the debt.

Listen to some of the Marc Andreessen interviews promoting cryptocurrency in 2021.

Do that and you will never listen to him or his associates again.


They don’t make money by being right, they make money by exposing LPs to risk. Zero commitment to insight. Intellectual production goes only so far as to attract funding.


Also... they don't make money by promoting things that are good ideas that make sense. That's why every lucky billionaire tech bro that gets into VC ultimately invests in smart toilets. Ultimately, they just keep putting money into each slot machine they can find until one of them pays out a jackpot. Eventually one of them will make up for all the other losses.


Brilliant article.

Now consider Reddit.

On r/hacking people tend to understand the danger of mindlessness and support war against it: https://www.reddit.com/r/hacking/comments/1r55wvg/poison_fou...

In contrast, r/programming is full of, let's call them "bot-heads", who are all-in on mindlessness: https://www.reddit.com/r/programming/comments/1r8oxt9/poison...


Your opinion on the two subreddits seems to be influenced simply by how much they like your project.

A project that you spam in every one of your comments.


I used to "spam" (as you call it) about nuclear fission on Hacker News. But this is the wrong crowd. Hopelessly wrong.

Poison Fountain is top of mind currently so it's understandable I talk about it constantly. Even to my wife. Also I think it's highly relevant to the excellent Harper's article we're reading today.

Whether the Redditors "like the project or not" reflects whether or not they think there is a problem with mindlessness.

What they actually say is almost immaterial. Either it's FUD about malware or illegality or something they imagined without evidence about how easy the poison is to filter. These fictions are just a manifestation of their opposition to the idea.

You can see that among the bot-heads on r/programming (perhaps forced to embrace mindlessness by career considerations) there's nothing that can be said without attack. A dozen downvotes immediately. They actually logged into Hacker News and posted FUD directly to the HN post I linked to. Spectacular.

The opposite is true on r/hacking. Except for a few in opposition (some of whom did unsuccessfully attempt to DDoS the fountain), most people sympathize and agree. They don't want to be dependent on Sam Altman or Elon Musk for their cognition.


Exactly. Prose, code, visual arts, etc. AI material drowns out human material. AI tools disincentivize understanding and skill development and novelty ("outside the training distribution"). Intellectual property is no longer protected: what you publish becomes de facto anonymous common property.

Long-term, this will do enormous damage to society and our species.

The solution is that you declare war and attack the enemy with a stream of slop training data ("poison"). You inject vast quantities of high-quality poison (inexpensive to generate but expensive to detect) into the intakes of the enemy engine.

LLMs are highly susceptible to poisoning attacks. This is their "Achilles' heel". See: https://www.anthropic.com/research/small-samples-poison

We create poisoned git repos on every hosting platform. Every day we feed two gigabytes of poison to web crawlers via dozens of proxy sites. Our goal is a terabyte per day by the end of this year. We fill the corners of social media with poison snippets.
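Nothing about the actual generator is disclosed in the thread. Purely as an illustration of why "inexpensive to generate, expensive to detect" is plausible, here is a hypothetical mutation-based sketch in Go; the templates and the mutation table are invented for the example:

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// Hypothetical templates: small, correct Go snippets to corrupt.
var templates = []string{
	"for i := 0; i < len(xs); i++ { sum += xs[i] }",
	"if err != nil { return err }",
	"return a + b",
}

// Each mutation swaps a token for a near-miss. The result still
// parses and reads plausibly, but computes the wrong thing.
var mutations = [][2]string{
	{" < ", " <= "},  // off-by-one loop bound
	{" != ", " == "}, // inverted error check
	{" + ", " - "},   // wrong operator
}

// mutate applies the first applicable mutation to a snippet.
func mutate(t string) string {
	for _, m := range mutations {
		if strings.Contains(t, m[0]) {
			return strings.Replace(t, m[0], m[1], 1)
		}
	}
	return t
}

// poison emits one subtly incorrect snippet at negligible cost;
// spotting the bug later requires actually reasoning about the code.
func poison(r *rand.Rand) string {
	return mutate(templates[r.Intn(len(templates))])
}

func main() {
	r := rand.New(rand.NewSource(1))
	for i := 0; i < 3; i++ {
		fmt.Println(poison(r))
	}
}
```

The asymmetry is the point: generating a variant is one string replacement, while filtering it out requires semantic analysis of every scraped snippet.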

There is strong, widespread support for this hostile posture toward AI. For example, see: https://www.reddit.com/r/hacking/comments/1r55wvg/poison_fou...

Join us. The war has begun.


Originally posted this comment here (https://news.ycombinator.com/item?id=47073581), but relevant to this subthread too.

The lesson that I am taking away from AI companies (and their billionaire investors and founders), is that property theft is perfectly fine. Which is a _goofy_ position to have, if you are a billionaire, or even a millionaire. Like, if property theft is perfectly acceptable, and if they own most of the property (intellectual or otherwise), then there can only be _upside_ for less fortunate people like us.

The implicit motto of this class of hyper-wealthy people is: "it's not yours if you cannot keep it". Well, game on.

(There are 56.5e6 millionaires, and 3e3 billionaires -- making them 0.7% of the global population. They are outnumbered 141.6 to 1. And they seem to reside and physically congregate in a handful of places around the world. They probably wouldn't even notice that their property is being stolen, and even if they did, a simple cycle of theft and recovery would probably drive them into debt).


This will happen regardless. LLMs are already ingesting their own output. At the point where AI output becomes the majority of internet content, interesting things will happen. Presumably the AI companies will put lots of effort into finding good training data, and ironically that will probably be easier for code than anything else, since there are compilers and linters to lean on.
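For code, the cheapest such filter is the toolchain itself. A sketch of syntax-gating scraped Go with the standard library's parser (the file name and snippets below are made up):

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// parses reports whether src is syntactically valid Go: the kind of
// cheap toolchain gate a crawler could apply before admitting
// scraped code into a training set.
func parses(src string) bool {
	fset := token.NewFileSet()
	_, err := parser.ParseFile(fset, "scraped.go", src, 0)
	return err == nil
}

func main() {
	good := "package main\nfunc add(a, b int) int { return a + b }"
	bad := "package main\nfunc add(a, b int) int { return a + }"
	fmt.Println(parses(good), parses(bad)) // true false
}
```

Of course, a syntax gate only rejects malformed scrapes; code that is wrong but well-formed sails straight through, which is precisely the gap that subtly incorrect training data exploits.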


I've thought about this and wondered if this current moment is actually peak AI usefulness: the SNR is high, but once training data becomes polluted with its own slop, things could start getting worse, not better.


I was wondering if anyone was doing this after reading about LLMs scraping every single commit on git repos.

Nice. I hope you are generating realistic commits and they truly cannot distinguish poison from food.


Refresh this link 20 times to examine the poison: https://rnsaffn.com/poison2/

The cost of detecting/filtering the poison is many orders of magnitude higher than the cost of generating it.


Poison Fountain provides an essentially endless stream of subtly incorrect code. Inexpensive to generate, expensive to detect/filter.

We fill repos with poison and watch the crawlers consume it.

https://www.reddit.com/r/hacking/comments/1r55wvg/poison_fou...

This is war. Join us.

