Ask HN: For human brains only?
24 points by neon_me 6 months ago | hide | past | favorite | 45 comments
How can one create content that only humans can process (read, save, edit, reply), avoiding AI scraping and analysis? In an age of advanced computer vision and audio processing, what methods could ensure information remains obscure to AI but readable by humans?



Pen and paper. You can never ensure that the information remains obscure, but if you keep the original documents away from a computer, and share them via post, they will only ever be seen by human eyes. You can’t ensure that someone won’t scan them at some point though.


Yes, do things that don't scale well. Tell someone face to face without recording it. It's a story as old as time.


Newsletters used to be delivered through regular mail, they weren't always emails you intended to read but ultimately ignore.


There are newsletters that are still only available through regular mail. They tend to be very niche (and therefore expensive), but they still thrive.


States and corporations are investing billions of dollars into creating the infrastructure to train on everything, everywhere, all the time. As AI on the web runs out of data to train on, the need for organic training data will necessitate merging AI with global surveillance capitalism and the panopticon, and training on the real world. You're not going to be able to just opt out.

You may not be recording your conversations but someone will be. Everything you write will be scanned, and every camera will be training an AI on everything it sees.


That’s an interesting question, I had never really considered….

There’s probably no definite, 100% safe way, but one possibility might be to exploit some quirk of human perception (kind of like what optical illusions do). Of course there’s no guarantee that a sufficiently advanced future AI won’t be able to read it (or enslave humans to translate it).


This requires cooperation of all the humans who process the information. Any human that processes it could transcribe it for an AI. It's like trying to stop corporate information leaks.


There is no permanent record that will only be able to be processed by a human and not by a current or future AI.


Prove it.


Create an encoding system based on various unconventional modifications to physical objects, do not document it anywhere, teach it to a small group of humans who are contractually bound to not spread the knowledge, and keep the knowledge going into the next generation along with the sense of purpose.


Unless the encoding system was miraculously complex and the amount of content produced with it remarkably small, reversing the encoding in order to process the data seems highly plausible, especially if the input to the cipher was typical of human generated content.


So that's why those ancient philosophers despised writing anything down and had secret assemblies!


I think this is like asking "how do I encode binary that computers can't read?" or "how do I make tangible change that nobody can notice?"

You can't. By thinking, writing, speaking, or gesturing, we generate photographic, textual, and audible information that can be parsed in multiple different ways. AI is simple and adaptable; however we fool it today becomes tomorrow's training benchmark.

Besides cipher encryption, I really don't think there are any ways to guarantee that AI cannot understand you. Most methods end up ensuring that humans can't understand you either.
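To make the encryption point concrete: a one-time pad renders content unreadable to human and machine alike without the key, which is exactly the trade-off described above. A minimal stdlib-only sketch (the function names are my own, not from any library):

```python
import secrets

def otp_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Encrypt with a random one-time pad; returns (key, ciphertext)."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext

def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR the ciphertext with the same pad to recover the plaintext."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))

key, ct = otp_encrypt(b"for human eyes only")
assert otp_decrypt(key, ct) == b"for human eyes only"
```

Without the key the ciphertext is opaque to everyone, human or AI; the hard part, as always, is distributing the key only to humans you trust.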


hence, we are doomed


I suppose you could have said the same thing about the invention of books, that would share the dangerous knowledge of kings and sages with the unstable and unwashed masses.


Humans had natural information barriers. A foreign power invading a secret library would need significant time and resources to process its contents, often missing crucial details. This created an asymmetry in knowledge and data processing capabilities.

Today, AI can analyze extreme amounts of data, eliminating this asymmetry and creating a gap with our own capability. How can we maintain some level of 'information obscurity' or processing advantage against the AI? Are there any methods that remain challenging for AI to interpret but are accessible to humans?

I was thinking more along the lines of creating "DRM-like" content in an obscure format that might hold up for a while ... shrug


I just don't understand the question, I guess. By sharing something online you are removing its information barriers almost entirely. That means both humans and AI have plausible and complete access to what you've made. If you don't want people or AI to have access to your content, then don't put it on the internet. I don't think you can have your cake and eat it too, there is no "stop humans from reading about me online" button and certainly no such thing exists for AI either.


This is the situation, yes.

So what I've done is remove my works from the open web and put it behind a login wall. If you want an account, I have to be certain that you're a human being and that you will not proceed to put my work somewhere where it can be scraped by an AI bot.

Which, in practice, means that I have to personally know you. I don't know of any other solution to this problem at this time.


Such schemes likely exist, but you would be hard pressed to find any system that wouldn’t also be burdensome for real humans to deal with. This is basically the same thing as CAPTCHA.

I’m curious if you’re more concerned about the corpus of your work being used in the training of a model, or, parts of your work being analyzed by an existing model? For the former, I suspect copyright laws will eventually come around that afford some protection. But as for having something summarize your work, I sense that no copyright laws would be made for that.


I've never read a book that was able to slurp up my data.


You're doing the slurping.


We seem to have read different chains of thought in this thread. The topic was "is it possible to encode things so that humans can understand but machines can't"?

Talldayo said "I really don't think there are any ways to guarantee that AI cannot understand you."

neon_me responded: "hence, we are doomed"

Talldayo responded "I suppose you could have said the same thing about the invention of books"

My comment was in response to that. Talldayo's response seemed a non sequitur to me because books are providing data to the reader, not trying to collect and understand data from the reader.


Tangentially: is the opposite also possible? Something that only non-humans could understand?

E.g. is there a way to trick 'AIs' on the other side of the phone without a human noticing it? For example 'tricking' something like Contact Center AI from google https://cloud.google.com/solutions/contact-center?hl=en


Machines can "talk" via impulses in cables or over radio with very low latency. Humans cannot catch that.
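A toy sketch of the idea: encode bits as short bursts of an ~19 kHz tone, near the top of human hearing but trivial for a microphone plus a little DSP to decode. The frequency, bit duration, and on-off keying here are arbitrary illustrative choices, not any real protocol:

```python
import math

RATE = 44100          # samples per second
TONE_HZ = 19000       # near-ultrasonic carrier, hard for most adults to hear
BIT_SAMPLES = 441     # 10 ms per bit

def encode_bits(bits: str) -> list[float]:
    """On-off keying: '1' -> 10 ms of 19 kHz tone, '0' -> 10 ms of silence."""
    samples = []
    for bit in bits:
        amp = 0.5 if bit == "1" else 0.0
        for n in range(BIT_SAMPLES):
            samples.append(amp * math.sin(2 * math.pi * TONE_HZ * n / RATE))
    return samples

def decode_bits(samples: list[float]) -> str:
    """Recover bits by measuring the signal energy in each 10 ms slot."""
    bits = ""
    for i in range(0, len(samples), BIT_SAMPLES):
        energy = sum(s * s for s in samples[i:i + BIT_SAMPLES])
        bits += "1" if energy > 1.0 else "0"
    return bits

assert decode_bits(encode_bits("1011001")) == "1011001"
```

Played through a speaker, a receiver could demodulate this while most listeners in the room hear nothing, which is the asymmetry the comment describes.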


Yeah, computers tend to hear differently than humans do, so you can cater some sort of message to them that humans won't process but they will. See https://www.mdpi.com/2079-9292/12/8/1928 as an example.

They can also be embedded into songs so that humans hear one thing but the computer does something different: https://www.usenix.org/conference/usenixsecurity18/presentat...

There have also been attacks on computer vision systems (like in cars) that can make them suddenly brake or misidentify a lane or street marker, etc., but in ways not obvious to humans: https://adversarial-designs.shop/blogs/blog/adversarial-patc... or to fool facial recognition systems into thinking you're someone else (in a way that won't fool anyone who actually knows you).

And more broadly, any sort of obscured malware tries to deliver a malicious payload while pretending to be something else to the human who runs it.


Interesting, but I don't think it's possible. The purpose of AI is to be more like humans. Even if it can't understand you now, it will tomorrow.


Isn't that concerning?

As a species, we "seem unable" to develop any kind of information that would remain exclusively ours, especially when faced with a potential rival that processes data exponentially faster and with greater precision than we do.


> Isn't that concerning?

Off the top of my head, not especially. The problem of rivals that can process data faster and with greater precision is a problem that existed before AI was used/commoditized.

I suppose I am concerned about AI using information to specifically target vast numbers of people at scale, based on their psychological traits/desires/vulnerabilities. Especially for political purposes, psyops, dark-pattern marketing, etc.


That's not really an AI-specific problem though. Any sort of hostile codebreaker is the same sort of threat.

What sort of information would you want to protect from AI but not other humans? If it's a secret, isn't it a matter of who gets to see it, not necessarily what? I'd sooner trust "our AI" than "their human".


I'm not worried about the knife coming for me, I'm more worried about the rich and powerful wielding the knife.


Insert a slur after every other word. The LLMs' filters won't allow them to reproduce any of the content they scrape.


In person, oral recitation. You'd probably have to have some mechanism for ensuring nobody was recording though.


The Game


Write it in an obscure language that computers haven’t figured out yet.

This does require your humans to understand the obscure language though.


I'm looking for a different platform/format of data rather than inventing a language - because cracking a language/code/cipher is far easier for machines than for humans.


You would have to have some physical device that can decrypt the content only upon verification of a human biological signature.


This doesn't fit the read, save, edit, reply notion, but I have seen Nightshade for images. I am unsure of its effectiveness; I heard an art friend mention it when Midjourney got popular last year. The term they use is "poisoning" the data being shown or uploaded.

https://nightshade.cs.uchicago.edu/whatis.html


Easy: stereograms. Although, it may be hard for a large proportion of your audience to read them.
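As an aside, generating one is simple: in a random-dot autostereogram, each pixel repeats the pixel one "period" to its left, with the period shortened wherever the hidden shape is raised. A minimal text-mode sketch (dimensions and the depth map are arbitrary choices for illustration):

```python
import random

WIDTH, HEIGHT, PERIOD = 60, 20, 12

def depth(x: int, y: int) -> int:
    """Toy depth map: a raised rectangle in the centre of the image."""
    return 3 if 20 <= x < 40 and 5 <= y < 15 else 0

def stereogram() -> list[str]:
    """Characters repeat with a period shortened by the local depth,
    so eyes fused on the pattern see the rectangle float above it."""
    rows = []
    for y in range(HEIGHT):
        row = []
        for x in range(WIDTH):
            shift = PERIOD - depth(x, y)
            if x < shift:
                row.append(random.choice(".:*oO"))
            else:
                row.append(row[x - shift])
        rows.append("".join(row))
    return rows

for line in stereogram():
    print(line)
```

A human who can free-fuse the pattern sees the hidden rectangle; a model looking at the characters alone sees only repeating noise, unless it is specifically built to detect the periodicity.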


"You can't grep dead trees."

AI doesn't have the ability to open hard copies. Create, print, and send everything offline.


Hard copies are usually written and shipped to print in electronic form, plus there is no guarantee that someone will not take photos/scans of the "dead trees" ... so this option is sadly off the table.


Classified documents plainly exist.


They tend to be on a separate network. But the moment you access them, presumably a sufficiently advanced AI can watch the flicker in the lights, measure the temperature changes in the room, or watch the colour on your face and have a good guess as to what your orders were.


Currently at least, sign language fits the bill.


Poison it with falsehoods that would be obvious to humans. Blinker fluid needs to be changed annually.


What rank prejudice is this? How will our children, meat or electrical, ever grow if we refuse to teach them?

I suspect the real issue you wish to address might be expressed better. Perhaps "How can we ensure that large AI companies haven't got favorable intellectual property rights over individuals' output?"



