DissidentX – Censorship resistance tool by Bram Cohen (github.com)
222 points by edwincheese 1145 days ago | 89 comments



Judging by the claims and the code, this is a tool created by someone who hasn't read any prior research about steganography. If you trust this, don't be surprised when law enforcement detects that you're using it.

I'm surprised to see someone of Bram Cohen's caliber releasing something like this. No one has any business coding security tools unless they've taken time to read forensics whitepapers to look for reasons why their tool won't work. And this tool certainly won't work.

The goal of steganography is to hide the fact that you've transmitted messages. The longer the message, the harder that becomes. This may be suitable for hiding a few bytes, but no useful message is going to be a few bytes, unless it's something like a decryption key (and hiding a decryption key using stego would be crazy). This doesn't solve the problem of "law enforcement wants to know what your decryption keys are, because they've detected you're encrypting data." The whole point of stego is to avoid that scenario.

Anyone who's interested in steganography should start here: http://www.cl.cam.ac.uk/~rja14/Papers/jsac98-limsteg.pdf ... No one who reads that whitepaper and understands its implications would take this tool seriously.

EDIT: To clarify: a message as short as ~50 bytes can often be detected, depending on the stego implementation, because even that is enough to cause statistical anomalies in the covertext which indicate that an encrypted message is hidden in it. So I'm not talking about detecting images or videos sent via stego; just encrypted plaintext messages.


This is a framework for steganographic schemes, not a specific steganographic scheme. The specific ones thrown in are just for demonstration purposes. The versatility of this approach is a major step forward in defeating statistical detection schemes. You of course don't know this, because you haven't read through the page and figured out what the code does.


> You of course don't know this, because you haven't read through the page and figured out what the code does.

Let's not get personal. I only mentioned your name because it was in the headline, not to bully anyone.

I know this is a framework. But the problem with stego is that as soon as you release your code, you make it almost trivial for law enforcement to detect that you're using stego. It's a catch-22: you want people using the code, but you don't want law enforcement knowing what code you're using, because then they can just use the same code to detect that you're using stego, which defeats the purpose of stego.

This isn't theoretical. Each time someone releases a new stego tool out into the wild, forensics companies add it to their own frameworks for detecting stego.

Let me be clear: I want you to succeed, and I think it's a great thing that so much effort is being put into developing these sorts of tools. But you have to say something like "Don't use this tool yet! It's not ready for production!" ... The way it was presented here made it sound as if it's ready to be used, but anyone who uses it in its current state will be swiftly detected by law enforcement.

Let's put it another way. Do you think the 120 people who upvoted this did so because they understood this is "just a framework / reference," or because they were hopeful this actually works? It's not fair to them not to include a disclaimer saying this shouldn't be used. The way the README is written makes it sound like you're encouraging people to use it, even though it's not intended to be used.


There's great irony in you saying

> Let's not get personal.

right after saying

> I'm surprised to see someone of Bram Cohen's caliber releasing something like this.


I feel bad about it. I shouldn't have called him out by name; I should've concentrated solely on why this tool falls short. Sorry, Bram.

I'm just worried that people will see his name, see that he's saying things like "this tool is ready to be used," and then actually use this, just because "It's Bram Cohen," and end up getting themselves caught.


I don't see anything wrong with pointing out that a famous person's work is below par, particularly when said person decides to show up in the thread and ignore what you are saying and act like a jerk in response. You shouldn't retreat so easily.


This tool allows for the specifics of how the encoding is done to be changed without the decoding algorithm needing to be changed ever, so yes in fact it is ready to be used, although better encoders are both easy to write and welcome.


There are two possibilities. Either you've created a tool which enables people to covertly send messages without being detected, which every publicly-released stego tool thus far has failed to do, or you haven't.

Have you spent much time researching why current stego tools have all failed? The way you're endorsing this makes it sound like you haven't, and you're putting people in danger by pretending like law enforcement is incompetent.

Remember, law enforcement somehow managed to acquire an image of Silk Road's server, even though they were running it as a Tor Hidden Service, and they also managed to recover >100k bitcoins from DPR. All of this was done through forensics. Are you claiming that this tool is secure against such an adversary?

Hopefully someone will write a program called "DissidentXDetector" before law enforcement does. The myth that this generates undetectable messages needs to be debunked before people start trusting this.


From the README:

Q. Can someone detect that a file has messages encoded in it?

A. That depends on the encoding used and the properties of the file the data is being encoded in. There's a whole field of academic literature on steganography, none of which is invalidated by this code. What this code does is vastly simplify the implementation of new steganographic techniques, and allow a universal decoder and encoding of multiple messages to different keys in the same file.


The README should read:

Q. Can someone detect that a file has messages encoded in it?

A. If the file was generated with an encoder whose code is public (e.g. on GitHub, Bitbucket, ...) then yes. Always. And even if the code is private, it may not be secure. Unless you come up with an encoding scheme that's never been thought of before, law enforcement will likely be able to detect the encoded messages unless they're trivially short.


You're assuming that there isn't a shared secret between the sender and receiver.


random peanut gallery thought....mentioning "law enforcement" in the README itself might open up the tool itself to legal attacks, no ?


No


I have been working on a new steganography algorithm that I believe is secure. I plan to make it open source in the new year. Would you like to review the design and code when it is released?


To clarify: There is not, and cannot be, a universal DissidentX encoding detector. It will always, always, be possible to iterate arbitrarily on the encoding technique, and this iterating is easy to do.


Every time someone iterates and releases the code on github, law enforcement will then be capable of writing a DissidentX encoding detector for that encoder. Every time you publicly release stego code, that stego code becomes ineffective.

At best, this framework provides a way for people to write stego encoders that they don't plan on releasing publicly. But you should say that! Warn people how dangerous it is to be releasing their stego code. And warn people not to trust any of the default encoders.

It's not as easy to iterate on stego techniques as you're implying. There are only so many ways to creatively hide a message. And if people happen to come up with a scheme which has already been broken in the past, then their encoder will provide no security at all. They'll trick themselves into believing they're secure, when they're not.


The endgame of stego built on this framework of published techniques, versus detectors customized for each of those techniques, is going to be much, much more difficult for the detector than it is today, if not an outright loss. As for your claim that it isn't easy to iterate on stego techniques: Seriously, go read through the docs before making claims. You're just plain wrong, and obviously so.


Can you elaborate? I don't see why this must be true. Just as good encryption is indistinguishable from random data, good steganography should be indistinguishable from whatever universe of target plaintexts you've chosen. In both cases, the code is public, but the secret key is needed to see that the message is non-random, or non-plaintext.

I am interested in how this scheme is different from https://fteproxy.org/ though.


Here's one way it might go down in practice. After law enforcement seizes your computer, they'll scan your computer for any encrypted containers, along with any code that looks like it's used for steganography. They'll find DissidentX, since its README mentions "steganography," which is a keyword that their forensics tools will search for. Then they'll use each encoder in your DissidentX folder to scan your computer for any encoded messages. Unless the message is trivially short (<50 bytes) then they'll come up with a list of suspect messages. This list will include any encoded message you've created using DissidentX, along with some false positives. Then, if you're in the UK, they'll have a judge demand you cooperate with them; any plausible deniability you may have had is gone at that point. It's "cooperate or go to jail."

Check out http://www.cl.cam.ac.uk/~rja14/Papers/jsac98-limsteg.pdf and related literature. In particular this quote:

Shannon provided us with a proof that such systems are secure regardless of the computational power of the opponent [43]. [...] Yet we still have no comparable theory of steganography.

The problem is that there's no such thing as perfectly secure stego (undetectable covert messages), even though there is perfectly secure encryption (unbreakable encrypted messages, regardless of the computational power of the adversary, when implemented correctly, and when not defeated via side channel attacks, and when not compelled to cooperate by a judge).


I just read it, and you are completely extrapolating what that paper says. It does not say steganography is hopeless. It contains no mathematical proofs, and it's also from over 15 years ago.

More generally, "we do not have a proof" does not mean "we disprove". You also completely ignored my point about the secret, without which the encoder will not work when an attacker tries to run it.


It's highly likely that a safe algorithm will come with a proof, thus the lack of a proof demonstrates the lack of a safe algorithm.


So can't you just embed messages in every file you own? Then when asked for the keys, give out fake keys for the really really secret stuff. So law enforcement ends up with a few sensitive documents and a whole bunch of random bytes where you cannot distinguish between "actual random bytes" and "bytes decoded with the wrong key". And there is your plausible deniability.

Obviously it's not perfect. Obviously a totalitarian regime which suspects you of dissident activity will pick any reason out of thin air to lock you up for as long as they like, or just execute you.

But being able to say "here's the keys" with them having no way to know if they are all the right keys, is at least something.

Though of course at best you won't keep those files on your PC in the first place. You'd keep them on a microSD card that you keep in a tiny pouch under your skin. You'd keep them encoded in photos you have printed out and hung as wall pictures. You'd have them embedded in a well-torrented movie and backed up willingly by hundreds of thousands of people (though not you). And if you just use them to send encoded messages, neither you nor the recipient will ever store them on an hdd.

I mean, you're not stupid, right?


I'm terribly confused by your argument. Who's talking about equipment seizure here? What does that have to do with the on-the-wire security of the encoded messages?

Your argument appears to concern only the risk in openly publishing encoders. Are you also arguing that Bram's framework encourages such publishing? If not, then what exactly is your beef with it (the framework)?


OK, so when you're talking about seizure of equipment, with all the tools and past encodes just sitting there on the machine for the taking, you're really far afield of the kind of argument you had seemed to be making. I understand now why you seem to assume that the adversary is near-omnipotent here - because you're assuming that the user is a dolt who is doing most of the hard work of damning themselves for the state.

An effective dissident is going to employ some reasonable opsec practices and have multiple layers of security, they're not going to be foolish enough to think that one program is a magic bullet.


"but no useful message is going to be a few bytes"

The stereotypical intro to crypto 101 message "attack at dawn".

Although I agree if the point is to sneak out multi-gig video footage of war crimes, this isn't going to work very well.


Here's another fun steganographic tool: http://www.spammimic.com

Hide messages in SPAM Text:

Dear Decision maker , We know you are interested in receiving amazing intelligence . This is a one time mailing there is no need to request removal if you won't want any more . This mail is being sent in compliance with Senate bill 1625 ; Title 4 ; Section 302 . THIS IS NOT MULTI-LEVEL MARKETING ! Why work for somebody else when you can become rich as few as 33 days . Have you ever noticed people love convenience and more people than ever are surfing the web ! Well, now is your chance to capitalize on this ! WE will help YOU decrease perceived waiting time by 190% and increase customer response by 150% . You can begin at absolutely no cost to you . But don't believe us . Ms Ames of Washington tried us and says "I was skeptical but it worked for me" . We assure you that we operate within all applicable laws . We implore you - act now ! Sign up a friend and you get half off . God Bless !


That is actually remarkably clever. Spam would indeed seem to be an excellent vector for sending hidden messages!


What in this makes it get past spam filters?


It doesn't matter, as long as it still gets delivered to a spam folder, you can still retrieve the message.


It doesn't have to get past spam filters. Normally spam is held in the spam bin for 30 days.


Stenography is one of the NSA's worst nightmares. Encrypted strings sent over the Internet they know are encrypted, and often know what algorithm and key length. But the fact that any image can contain an encrypted message, and there's no way to know whether or not something exists within, scares the shit out of them.

So, good work.


Most steganography is trivially easy to detect.

Steganography that is implemented correctly then requires reasonable amounts of cover text, and small amounts of hidden text.

NSA fucking loves steganography because most of it is a toy implementation where someone hides text in the LSB of the bytes of a gif or jpeg. The ratio of cover:hidden text is terrible. And the implementer forgot to mention that it's just a toy and not to be used seriously.

The number of decently implemented steganography systems is small.


No, most steg is actually even worse than that: it appends text or a RAR archive to the end of another file (many formats tolerate extra data at the end).
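As a rough illustration of why the append-to-file trick is so easy to spot, here is a minimal sketch that reports any bytes following a PNG's closing IEND chunk. The 12-byte IEND chunk is fixed by the PNG format; the `trailing_bytes` helper and the fake file contents are hypothetical, made up for this example, and a real checker would parse the chunk structure rather than search for a byte pattern.

```python
# Toy check for data appended after a PNG's final IEND chunk.
# The 12-byte IEND chunk (length, type, CRC) is fixed by the PNG format.
PNG_IEND = bytes.fromhex("0000000049454e44ae426082")


def trailing_bytes(data: bytes) -> bytes:
    """Return whatever follows the last IEND chunk, or b"" if nothing does."""
    end = data.rfind(PNG_IEND)
    if end == -1:
        raise ValueError("no IEND chunk found; not a complete PNG?")
    return data[end + len(PNG_IEND):]


# Hypothetical stand-in for a real file: signature, placeholder body, IEND.
fake_png = b"\x89PNG\r\n\x1a\n" + b"<chunks omitted>" + PNG_IEND
stuffed = fake_png + b"secret.rar contents"

print(trailing_bytes(fake_png))  # b''
print(trailing_bytes(stuffed))   # b'secret.rar contents'
```

Because the search uses the last occurrence of the marker, a payload that itself contains the IEND bytes would fool this sketch, which is one reason to treat it as a demonstration only.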


Why use cover text? Why not just put ciphertext in a jpeg? Wouldn't that just show up as noise?


Sorry, by cover text I mean anything that is used to hide the hidden text. Thus, the jpeg would be the cover text.

Thanks for pointing that out.

To answer the question: It shows up as a specific type of noise that's easy to detect. Some of the crypto / math people will be able to explain it much better than I can.


> To answer the question: It shows up as a specific type of noise that's easy to detect. Some of the crypto / math people will be able to explain it much better than I can.

Ahhh. What if you were to use a video instead of a still image and only use a handful of pixels (or macroblocks) in each frame, chosen randomly (the random seed exchanged out-of-band)? Seems like that would give you a very high cover:hidden text ratio.
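A minimal sketch of the idea in this comment, under loud assumptions: the "video" is just a flat byte string standing in for raw pixel data, the positions come from Python's `random.Random` seeded with the shared value (which is not cryptographically secure and only stands in for a proper keyed generator), and the `embed` / `extract` names are hypothetical. Lossy video compression would likely destroy bits hidden this way, so this shows only the position-selection step.

```python
import random


def _positions(seed, cover_len, n_bits):
    # Both sides derive the same positions from the shared seed.
    # random.Random is NOT cryptographically secure; it is a placeholder.
    return random.Random(seed).sample(range(cover_len), n_bits)


def embed(cover: bytes, message: bytes, seed: int) -> bytes:
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    out = bytearray(cover)
    for pos, bit in zip(_positions(seed, len(cover), len(bits)), bits):
        out[pos] = (out[pos] & 0xFE) | bit  # overwrite the least-significant bit
    return bytes(out)


def extract(stego: bytes, n_bytes: int, seed: int) -> bytes:
    pos = _positions(seed, len(stego), n_bytes * 8)
    bits = [stego[p] & 1 for p in pos]
    return bytes(
        sum(bits[k * 8 + i] << i for i in range(8)) for k in range(n_bytes)
    )


cover = bytes(range(256)) * 64  # hypothetical stand-in for raw pixel data
stego = embed(cover, b"attack at dawn", seed=1234)
assert extract(stego, 14, seed=1234) == b"attack at dawn"

changed = sum(a != b for a, b in zip(cover, stego))
print(f"{changed} of {len(cover)} cover bytes changed")
```

The receiver needs the seed and the message length, both of which would have to be agreed out of band in this sketch.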


> Seems like that would give you a very high cover:hidden text ratio.

It would, but that doesn't change the principles used to detect the steganographically encoded cyphertext. The bits would still be twiddled in the same way, and could be found in the same way.


The question is: would it be feasible to search for them? Scan every single video on youtube looking for noise with some elevated probability of containing hidden text? What happens when you find a candidate? Pick random pixels out of every frame and then try and brute force it with every known symmetric cipher and every single key?

You could flip a single, random, least-significant bit on each frame of a 1 hour movie. This would allow you to store a 10.5KB encrypted message within. I'd like to know how anyone could possibly find those bits, let alone decipher them.
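The 10.5KB figure checks out if one assumes 24 frames per second, which the comment does not state. A quick worked calculation under that assumption:

```python
FPS = 24               # assumed frame rate; not stated in the comment
SECONDS = 60 * 60      # one hour of video
bits = FPS * SECONDS   # one hidden bit per frame
size_bytes = bits // 8

print(bits, size_bytes, round(size_bytes / 1024, 2))  # 86400 10800 10.55
```

At 30 frames per second the same scheme would carry 13500 bytes, so the estimate scales directly with frame rate.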


Depends on your use case and threat model.

If I'm the Secret Police in some oppressive state, then I just need to find out whether you seem to be using stego — which is naturally against the law, itself, and hence grounds for arrest. Then, I can use rubber hoses, bamboo splinters, the threat of violence against your loved ones, and what-not to "brute force" your passphrase.

If I'm the NSA, I just detect the presence of stego and stash the container for later: say, when my quantum computer finally works as advertised, or when I can plant a keylogger or turn on the back door on your computers and sniff your passphrase, or simply mine your social graph until I find some other means of compromising you.

The possibilities are hardly limited to a naïve, brute-force search across the set of (crypto algorithm, passphrase) tuples.

EDIT: But, to your point: yes, using video makes finding stego harder. It doesn't change the nature of the problem, though; it just changes its scale. Against adversaries with the computational power of a modern nation-state, however, if you're relying on scale to hide your behavior, licit or otherwise, you're only deluding yourself.


If I'm the Secret Police in some oppressive state, then I just need to find out whether you seem to be using stego — which is naturally against the law, itself, and hence grounds for arrest. Then, I can use rubber hoses, bamboo splinters, the threat of violence against your loved ones, and what-not to "brute force" your passphrase.

Me? I'm the entire population of the country. Which one of us is using stego?

To my reckoning, the search space would put the number of atoms in the universe to shame.


Maybe start with the guy who uploaded the video to YouTube?


I'd like to know if there's been an implementation of that. I remember reading about such a thing in William Gibson's Pattern Recognition.


Perhaps the reason for creating this?


Interestingly enough, stenography was already being decried pre 9/11 as a tool used by terrorists [1]:

>"Uncrackable encryption is allowing terrorists — Hamas, Hezbollah, al-Qaida and others — to communicate about their criminal intentions without fear of outside intrusion," FBI Director Louis Freeh said last March during closed-door testimony on terrorism before a Senate panel. "They're thwarting the efforts of law enforcement to detect, prevent and investigate illegal activities."

So law enforcement is fine with encryption so long as it's crackable...

[1]http://usatoday30.usatoday.com/tech/news/2001-02-05-binladen...


The consensus in the infosec community seems to be that most (real) Islamic terrorists haven't been using email or cellphones since ~2003. So any mass-surveillance/SIGINT sales pitch about catching terrorists is mostly bullshit. If they do catch anyone, they are likely not the type of people who could have accomplished anything. It seems to be much more useful for catching other nation-state intelligence spies at work or catching aloof criminals.


Interesting... source?


Here is one interesting discussion:

https://twitter.com/thegrugq/status/399352954060144640

The author has cited terrorists training manuals elsewhere on his blog, that are apparently available publicly online, dated from as early as 2003 with security guidelines to not use email or talk on cellphones.

TLDR: The adversary can easily stop using email/cellphones to discuss plans. Do they still use email/cellphones for other reasons? Sure most likely, as was shown in Zero Dark Thirty, but not in any meaningful way that can be usefully gleaned from a mass-surveillance approach. Therefore the large investment and privacy trade-offs to the greater society isn't worth it.


What do they use? Trusted couriers?


"Terrorists use web forums and couriers."

From same source as previous comment: https://twitter.com/thegrugq/status/407662098093580288


Silly... seems like forums would be easier to snoop than phone calls.


I was thinking the same. Maybe they don't really use forums to plan and coordinate plots, but only to recruit and spread propaganda.


i am also interested in what they use?


There was a CIA contractor who outright defrauded them claiming that he had tools which were detecting steganographic messages used by terrorists on the internet. He didn't get prosecuted because they were too embarrassed to admit that they'd been completely bamboozled.


> "Uncrackable encryption is allowing terrorists — Hamas, Hezbollah, al-Qaida and others — to communicate about their criminal intentions without fear of outside intrusion,

You succeeded to put 3 different ethnic groups - I should say 2, the last one being an US product - in the same bag and doing then, a misleading association, fucking idiot!


It was a quote.


:( Are you one of those people that doesn't like to call things what they are? I have noticed many problems in life are due to people not wanting to call things what they are.


Are you one of those people who thinks he knows an obvious truth that everyone else ignores for some reason?


No.


I was under the impression that undetectable steganography was extremely difficult. If commonplace steganography was widespread, no doubt they'd write analyzers to determine what things might be hiding data. On top of that, if steganography becomes widespread, it's likely the protocol will be a common one adopted by plenty of people. At that point, it reduces to encryption, does it not?


It's true that many common forms of steganography (such as hiding information in the alpha channels of a given image) are easily detectable, and analyzers already exist, but in some aspects steganalysis is somewhat of an oxymoron, since the entire point of steganography is to conceal the fact that hidden information even exists in the first place.

I'm not sure exactly how you'd define a steganographic protocol. It's not quite as straightforward as cryptography, in fact it's yet again oxymoronic. Steganography (at least ideally) works somewhat like an archetypal spy's codebook. It sounds like everyday conversation to you, unless you're meant to know it's not, and that there's a hidden meaning. If you catch something off-guard, then the stego has failed.


I don't see how it would be terribly difficult to undetectably (without key) hide a few bytes of data in the least significant bits of a .jpg.

There are likely trillions of images available on the Internet. I would imagine less than 0.001% of them have a hidden message. This increases the "haystack" so drastically for the NSA that, even if 100x as many people started using it, it's still a big-ass haystack.


This is one of the worst, easiest-to-detect forms of steganography. Publishers like Springer Verlag have many papers and books about detecting that type of steganography.

While analysis (breaking) of steganography is long lived there hasn't been much work on creating new better forms.

Just as things like PGP are still hard for regular people to use, and there's no real encrypted chat, there's not much in the way of strong stego.

Obvious caveats apply here: How much does the text need to be hidden? Who does it need to be hidden from? Me hiding my angsty poetry from my sister doesn't need much and anything is going to be okay. But me hiding material that could get me killed, from a well funded government? I need something better than a reference github project.


> and there's no real encrypted chat

What about OTR? One of the easiest things to set up and use imo, users just need to know to exchange key fingerprints over a third party medium (in person being the foolproof way).


At some ratio of hidden data to visible data, I'm sure it can be undetectable. But transmitting reasonable amounts of data leaves a trace using LSB algorithms. Here's one paper. It shows the LSB part of the image, which leaves an obvious looking impression.

http://rahuldotgarg.appspot.com/data/steg.pdf


This is just showing steganography with plaintext payloads. If you use only ciphertext payloads (with the keys exchanged out of band) you sidestep this problem.


Not really. Encrypting the message will yield uniformly distributed noise, and that is very rare in nature. So if you attempt to hide an encrypted message in the least significant bits of images, audio recordings or video, it is as easy to detect as plain text messages, if not even easier.
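A toy sketch of the point being made here, using synthetic data only. The cover is an artificial smooth ramp, not a real image, and the `lsb_agreement` statistic is a deliberately crude stand-in chosen for this example; real steganalysis relies on subtler statistics. The pseudo-random bits stand in for ciphertext.

```python
import random


def lsb_agreement(samples):
    """Fraction of neighbouring samples whose least-significant bits match."""
    lsbs = [s & 1 for s in samples]
    same = sum(a == b for a, b in zip(lsbs, lsbs[1:]))
    return same / (len(lsbs) - 1)


# Synthetic "smooth" cover: a slow ramp, so LSBs come in runs of four.
cover = [(i // 4) % 256 for i in range(4096)]

# Replace every LSB with a pseudo-random bit, standing in for ciphertext.
rng = random.Random(0)
stego = [(s & 0xFE) | rng.getrandbits(1) for s in cover]

cover_score = lsb_agreement(cover)
stego_score = lsb_agreement(stego)
print(f"cover: {cover_score:.2f}  stego: {stego_score:.2f}")
```

On this artificial cover, neighbouring LSBs agree about three times out of four, while after embedding they agree about half the time, which is the kind of shift a detector can key on. How large the shift is on real images depends heavily on the image and the embedding rate.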


Then don't use every LSB in the image; use a low percentage. Just a guess, but I bet if you applied your stego detection algorithm to a large sampling of random images on the internet, you'd find a significant false positive rate. Just hide your messages in the false positives.


As already mentioned by others - hiding small amounts of data is easy, the challenge is to hide nontrivial amounts of data. There are algorithms that (try to) compensate for the statistical changes introduced by hiding data. One approach is to only use half of the available bits for hiding data and to modify the other half in a way that preserves a set of statistical properties.

It is probably not a good idea to hide data in images available on the internet because this enables direct comparison of the same image with and without hidden data.


I mean this in all seriousness ... Is there any evidence that anyone in the NSA is really that scared of stenography?


Stenography ≠ Steganography


I am going to port this to ruby. I'm currently unemployed and it should be a good sample to share with potential employers.


I would have liked to have seen some references to the research in the field in the explanation or comments. I wrote something like this around 15 years ago (https://github.com/tokenrove/steaghan/; horribly broken, do not use) but quickly abandoned it when Niels Provos started doing much more sophisticated stuff (http://www.citi.umich.edu/u/provos/stego/).

Since then, there has been a fair bit of really interesting research in the field; I recommend anyone interested read Peter Wayner's book Disappearing Cryptography. Might be a good place to start for enhancing this provocatively named framework.


The first link incorrectly has ; with it.


I feel like this is a good place to mention a similar project which aims to circumvent deep-packet inspection with some cool encoding techniques. It can even be used as a Tor plugin!

https://github.com/kpdyer/fteproxy https://fteproxy.org/


Q. Why did you use Python3 as a reference language?

A. Because not having distinct binary and unicode string types is barbaric.

Well played.


This isn't really a "censorship resistance" tool as it is a steganography tool. You can still be censored if your internet access is cut, or you have no way to publish your message.


> "You can still be censored if your internet access is cut, or you have no way to publish your message."

Hence censorship resisting, not censorship defeating.

Stenography is potentially useful if partial but monitored and censored communication channels remain open. See: The Great Firewall of China, or the postal system in prisons. Some data gets through, but data that they don't like does not. If the data is concealed, you can get it through.

Beyond just stenography, in the Soviet Union and beyond, some writers and artists would use allegory to criticize political figures or the state, enabling them to make points that would otherwise be censored. They could have shut down all film and book production, defeating this technique, but as long as some artistic works were allowed this channel remained open.


It's been a long time since I've done work in infosec related things, so I apologize if I'm way behind on...things.

I remember in school a million years ago we discussed an algorithm of the following type for sending short covert messages.

1. Negotiate cipher/mapping for where to look for hidden information
2. A wants to send B message "Let's get drinks @ 9 @ Bill's" -- instead of inserting this into some random file, he instead maps to the cipher/mapping area and then iteratively searches for images/texts that are closest possible matches in those bits to his message.
3. Ideally, given access to enough cover files and a short enough message, he has an EXACT match. A sends B picture of puppies with NO bit twiddling. B knows to meet at the pub.
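The steps above can be sketched roughly as follows. This is not the scheme from the comment verbatim: instead of a negotiated mapping onto specific bit positions, the sketch swaps in a keyed hash (HMAC-SHA256) of the whole unmodified cover file and keeps its top bits. The `tag` and `pick_cover` names, the key, and the fake "photo" library are all hypothetical.

```python
import hashlib
import hmac


def tag(key: bytes, cover: bytes, n_bits: int) -> int:
    """Keyed n-bit fingerprint of an unmodified cover file."""
    digest = hmac.new(key, cover, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)


def pick_cover(key: bytes, candidates, message: int, n_bits: int):
    """Return the first candidate whose tag equals the message, else None."""
    for cover in candidates:
        if tag(key, cover, n_bits) == message:
            return cover
    return None


key = b"shared secret"  # hypothetical pre-agreed key
library = [f"puppy photo #{i}".encode() for i in range(4096)]  # stand-in files
message = 0b10110010    # 8-bit message to convey

chosen = pick_cover(key, library, message, n_bits=8)
if chosen is not None:
    # The receiver recomputes the tag on the file exactly as received.
    assert tag(key, chosen, 8) == message
```

The cost is the catch: conveying n bits this way needs on the order of 2^n candidate covers, so the approach only suits very short messages, as the comment itself notes.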


We don't need yet another steganography tool based on texts, we need a steganography tool to scramble data into a pile of fucked up HTML DOM trees.


I love the question in the FAQ:

  Q. Why can't it be given more than two alternates for one position to encode more information?

  A. Because of math. See Explanation.txt for a bit more detail.
"Because of math." Hilarious.


i was actually thinking that the "of" was superfluous. since reading [0] on hn (discussion: [1]) i am more and more seeing the use of "because" without preposition. so i was expecting

"Q. Why can't it be given more than two alternates for one position to encode more information? A. Because math. See Explanation.txt for a bit more detail."

[0] http://www.theatlantic.com/technology/archive/2013/11/englis... [1] https://news.ycombinator.com/item?id=6765099


The use of the word "math" (and "science") seems to be changing as well. I think this usage is an example of an unconventional use of the word "math" rather than an example of the new use of "because".


Yeah, but it's sort of just a comical thing that is getting tired and played out... Let's be honest, it was a slow news day for the Atlantic.


Yes, it was. But i have a degree in linguistics, so i am probably more interested in this than the general population. :-)


Steganography has a bad reputation because the only tools publicly available are worthless. Not one is both secure according to Kerckhoffs's principle and secure against statistical analysis. I hope to change that by releasing an implementation of a new algorithm I have developed, sometime in the new year. If you are interested in reviewing the algorithm and code when it is released, feel free to follow my blog.


Has anyone sensible done any kind of analysis of this?


It is very primitive steg. This will not survive antisteg tools. Look at the *encode.py


This is from a few months ago. Still neat :)


I like the name!



