Dissent: accountable anonymous group communication (yale.edu)
80 points by turing 1396 days ago | 25 comments

There are something like 20-ish papers here. I have skimmed several of them without finding a clear answer to this question:

How exactly do you stop sockpuppetry while maintaining anonymity?

In order to ensure that a person cannot create more than 1 account, it would be necessary to observe some property of the person that is difficult to alter (such as their appearance, or their social security number) for uniqueness on the network.

Now if this network is truly anonymous, then it must be disjoint from every other network. And there is at least one other network ("real life"), so whatever this property is, must be something that an adversary in the real-life network would not be able to compare to the anonymous network.

And so certainly, it must not be some property that a person in the real-life network could observe by following you around or rifling through your drawers and then compare to the answer you gave to sign up for the anonymous network. So it must be something in your head (like a password).

But if it is in your head, isn't it easily changed, and this property would be exploitable to create sockpuppets?

I must have missed something fundamental here, possibly in terminology. Can somebody who is current on the research here enlighten me?

I am current on this research.

The group for anonymous communication in Dissent is formed using unspecified means. It could be everybody who signed up in the last month, obtained a credential from some authority, has a key within the web of trust for the group, etc.

Some of these methods obviously allow the adversary to create Sybils (aka sockpuppets). The ones that don't may not provide anonymity about who is in the group, but the protocol will provide anonymity for who said what during the group communication. This is still extremely valuable. Consider voting as an example: the group is known, but individual vote anonymity matters.

If the group formation mechanism does allow Sybils, that still doesn't violate anonymity. For a message from an honest member, the adversary cannot tell which honest member it came from. It also doesn't violate the accountability of the protocol - any disruption will be attributable to some Sybil, who will be punished.

I think it's unfortunate that the page mentions that Dissent protects groups against Sybils and sockpuppets. I too spent a lot of time going through these papers trying to figure out what the algorithm was. Dissent clearly does not even try to solve the Sybil/sockpuppet problem.

It does in the following senses:

1. Anonymity is provided as long as there is a single trustworthy member, regardless of how many phony members there are.

2. Denial-of-service resistance is provided even against many Sybils - eventually they will all be kicked out of the group and communication can proceed.

This is in contrast to protocols (e.g. onion routing, Aqua[0]) that only provide their security properties when the adversary doesn't control too much of the system. I think it is a fair claim to make and in particular is clear to people familiar with this area of research.

[0] "Towards Efficient Traffic-analysis Resistant Anonymity Networks" <http://www.mpi-sws.org/~stevens/pubs/sigcomm13.pdf>

You seem to be outsourcing the trust mechanism to the users, while the page implies that you've solved the trust problem internally through the protocol.

Don't get me wrong, the research is very impressive in its own right, but that's at best misleading.

What is the mechanism to kick Sybils out of the group, though? How is it ensured that enough sockpuppets can't band together to kick trustworthy members out? I am not following the mechanisms here.

> How exactly do you stop sockpuppetry while maintaining anonymity?

I don't think this is possible. Nor is it desirable: someone might well want to use different online identities for different parts of their personality.

If I'm reading the slides correctly (and there's a good chance I'm not) you have to have a preselected group. From the problem statement:

"A group of N>2 parties wish to communicate anonymously, either with each other or with someone outside of the group. They have persistent, real world identities and are known, by themselves and the recipients of their communications, to be a group."

That of course leaves me with a lot of, "But how would you do X?" questions. I'd love to see a FAQ for the project, and a list of things you can and can't do with it.

If you're correct, I don't really see what is novel here. If I already had a group of 50 people I could trust, what would I need this software for? How is this better than just handing out credentials to my IRC Tor hidden service?

Again, I've just skimmed a couple of presentations, so I could be off. That said, the tech definitely looks novel.

One application that comes to mind is human rights abuse reporting. Let's suppose I get 100 volunteers in the field with software based on this. Some of them are agents of the abusive regime; most are sincere. All communications are monitored by the government. If I read the presentation correctly, anybody can submit an incident report that we can be sure came from one of the probably-trusted volunteers, but that can't be traced back to an individual.

Another application they mention is secure voting. Let's say you're part of a cabal like WikiLeaks. You could use it to do untraceable but reliable internal anonymous voting on whether a given document gets released or not.

The way this is usually handled is anonymous credentials. You force someone to identify themselves uniquely (full name, passport, SSN, DNA, photo, etc). You then issue them a credential that is anonymous (i.e. when shown, you can't link it with when it was issued or previously shown). Furthermore, you set up these credentials so that if they are used too many times (i.e. cloned) then the identity is revealed.

You then require them to show the untraceable credential each time they submit a message.
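The "identity revealed if the credential is cloned" property can be illustrated with the same trick used against double-spending in e-cash: encode the identity as the intercept of a random line, and have each showing reveal one point on it. This is only a toy sketch, not a real anonymous-credential scheme (those, e.g. CL signatures, are far more involved); every name and parameter here is invented for illustration:

```python
import secrets

P = 2**127 - 1  # a Mersenne prime; toy field for the arithmetic

def issue(identity):
    # the issuer embeds the identity as the intercept of a random line;
    # the random slope r blinds it
    r = secrets.randbelow(P)
    return (identity, r)

def show(cred, challenge):
    # each showing reveals one point on the line y = id + r*c (mod P);
    # a single point leaks nothing about id, since r is uniform
    identity, r = cred
    return (identity + r * challenge) % P

def recover(c1, y1, c2, y2):
    # two showings = two points = the whole line, so id falls out
    r = ((y1 - y2) * pow(c1 - c2, -1, P)) % P
    return (y1 - r * c1) % P

cred = issue(42)
y1, y2 = show(cred, 5), show(cred, 9)
assert recover(5, y1, 9, y2) == 42  # cloning the credential unmasks id
```

One honest showing per challenge is information-theoretically hiding; two showings with distinct challenges pin down the line and hence the identity.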

So the way this maintains anonymity is by not trying?

What you've described is a central authority architecture which is in charge of issuing the credentials. But we already have CAs that issue SSL certs. They're brittle to government influence, because it's impossible to know whether the government has acquired the secret keys. In your case it'd be impossible to know whether the hypothetical CA has stored your secret credential / whether they've told anyone your real identity.

Even if this model were to work, it wouldn't protect users from themselves. Here's a fascinating read about the problems of staying anonymous even in an environment with perfect anonymity guarantees: https://whonix.org/wiki/DoNot

For example, no anonymity network can protect against stylometry, so that's always a concern. I was also shocked to realize that something as simple as automated time synchronization will reveal your general location when using Tor, because your machine requests a time update for your specific timezone. You have to set your clock to UTC to avoid that. There are about two dozen other vectors by which you can accidentally reveal your true identity even when using a rock-solid protocol. Anyone who's interested in this should read the entirety of the Whonix wiki. In addition to being comprehensive, it's also a lot of fun to read.

(Most of my comment was meandering and not really related to yours. It's just interesting how difficult perfect anonymity is. It's probably true to say that getting the tech implemented correctly is only a small fraction of the total amount of work required to be truly anonymous.)

I should clarify. The credentials are anonymous even given a malicious CA. This is a cryptographic guarantee assuming the Strong RSA assumption holds.

The only thing the CA can do is produce a list of people they issued credentials to.

So the problem I see with that solution is twofold:

* If there is a central authority collecting the information, then as far as Jeff the User knows, his information could be stored in plain text on a subpoenable hard disk by the central authority.

* If the information is verified by a set of moderators, then the moderators must be numerous enough that no single moderator could capture a substantial portion of the identity data. But to achieve that, the mod requirements would be low enough that an attacker could infiltrate their ranks and uncover the identities of users.

And then the next day the NSA forces the credential authority to give up all their information.

This is not a solution.

I don't know how Dissent does it. The only solution I can think of right now is to ask people to perform some time consuming human task before they can create an account. That would limit the scalability of any sock puppet creator and at the same time it could serve as payment for the service if that task is useful.

Apologies in advance for the wall of text, but the first half is basically just a summary of their protocol for people who don't want to try to identify where the actual information resides (it appears to be http://dedis.cs.yale.edu/2010/anon/pres/120104-dissent.pdf).

As I understand it, the basic idea is that there is a network made up of N clients and M servers, with the clients (and potentially servers) identifiable in some external fashion (IP addresses or GPG keys or whatever, it's unimportant to the protocol) and we want to make it possible for any of the clients to broadcast a message without anyone being able to tell which client it came from.

So what happens is each client and each server generate a Diffie-Hellman key which is associated with their actual external identity. Then each client establishes a unique secret with each server (and vice versa, naturally). These unique secrets are used to produce N times M unique bitstreams using PRNGs (so each client has a bitstream corresponding to every server that exists, and each server has a bitstream corresponding to each client that exists).

Then each client XORs together the bitstreams from the unique secret it shares with each server, and each server does the same thing for each client's shared secret. Now there are N + M bitstreams, with the nice property that if you XOR together all of them they all cancel out (because every client-server pairing occurs in the bitstream from that client and the bitstream from that server).

Furthermore, if one client also XORs some data into the bitstream it publishes, no one else can tell; the stream still looks indistinguishable from noise to everyone else who might look. But when we XOR together all N + M bitstreams, everything cancels out except for that extra data the one client added.
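The cancellation trick can be sketched in a few lines of Python. This is a toy stand-in, not Dissent's implementation: string-seeded `random.Random` streams replace real PRFs keyed with Diffie-Hellman secrets, and all names are made up:

```python
import random

MSG_LEN = 16  # bytes per round
N, M = 3, 2   # clients, servers

def prng_stream(seed):
    # deterministic noise stream from a shared seed -- a stand-in for a
    # real PRF keyed with the Diffie-Hellman shared secret
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(MSG_LEN))

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# one shared seed per (client, server) pair
seeds = {(c, s): f"dh-secret-{c}-{s}" for c in range(N) for s in range(M)}

# each client XORs together its streams toward every server
client_out = []
for c in range(N):
    acc = bytes(MSG_LEN)
    for s in range(M):
        acc = xor(acc, prng_stream(seeds[(c, s)]))
    client_out.append(acc)

# client 1 also XORs in a payload; its output still looks like noise
payload = b"hello dissent!!!"  # exactly MSG_LEN bytes
client_out[1] = xor(client_out[1], payload)

# each server XORs together its streams toward every client
server_out = []
for s in range(M):
    acc = bytes(MSG_LEN)
    for c in range(N):
        acc = xor(acc, prng_stream(seeds[(c, s)]))
    server_out.append(acc)

# XOR all N + M published bitstreams: every pairwise stream appears
# exactly twice and cancels, leaving only the payload
total = bytes(MSG_LEN)
for blob in client_out + server_out:
    total = xor(total, blob)
assert total == payload
```

Every (client, server) stream shows up once on each side, so the full XOR cancels it; only the extra payload survives, with no trace of which client contributed it.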

So then the Dissent protocol pulls in another construct, and uses something called MIX to shuffle a set of public keys generated by the peers, and uses these public keys to establish a transmission order, essentially reimplementing TDMA (Time-Division Multiple Access) in a digital domain with signatures.
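The slot-scheduling idea alone can be sketched like this. Dissent's actual MIX step is a verifiable cryptographic shuffle; here, sorting random tokens merely stands in for an order that observers can't link back to owners, and all names are invented:

```python
import secrets

# each peer generates an ephemeral slot key known only to itself
peers = ["alice", "bob", "carol"]
slot_keys = {p: secrets.token_hex(16) for p in peers}

# stand-in for the shuffle output: sorting the random tokens yields an
# order that an observer holding only the list cannot map back to peers
shuffled = sorted(slot_keys.values())

# the transmission schedule: peer p may transmit only in the time slot
# whose index matches the position of p's key in the shuffled list
schedule = {key: slot for slot, key in enumerate(shuffled)}
my_slots = {p: schedule[slot_keys[p]] for p in peers}

# every slot is owned by exactly one peer
assert sorted(my_slots.values()) == list(range(len(peers)))
```

Each peer knows its own slot (it recognizes its own key in the shuffled list) and can sign its slot's transmission with that key, but the mapping from slots to real identities stays hidden.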

In my opinion as a hobbyist interested in this stuff, the whole "everyone produces a bitstream and they magically evaporate leaving behind only the data everyone transmitted" thing is almost magically cool. The time-domain multiplexing is less cool, and my EE background compels me to wonder if a meaningful analogue to CDMA or OFDM could be developed. Well, obviously they could be developed; the real question is "could they be useful?".

It's also sort of interesting how the fact that we can't ever be allowed to know when a given peer is transmitting means that the design becomes more "continuous", with data being transmitted by everyone at all times so that the real transmissions can be disguised. I wonder if the theoretical perfectness could be loosened somewhat to allow only, say, 10% of peers to have to be transmitting at any given time (in the long run this could make it possible to identify a transmitter uniquely, but not so quickly that it wouldn't be useful still).

Unfortunately, the second bullet-point (accountability) goes close to unfulfilled. And I feel like it sort of has to, since any method which could determine where malicious data comes from can also be used to undermine the anonymity of the system for everyone else. There's a kind of accountability, which is that the peers themselves can be associated with a public identity without anyone being able to tell which peer produced a given message, even in theory, but it doesn't extend to any system with open registration, because it doesn't handle the "sock puppet" problem at all.

But personally I think the sock puppet problem is pretty much un-silver-bullet-able. The best we can probably ever hope to do for "general purpose" uses is probably a combination of a cryptographic proof-of-work algorithm, public-key signatures to allow (though not force) persistent identity, and some sort of reputation system.

I've been thinking more about the whole "better multiplexing" issue for a bit now, and I suspect that some sort of http://en.wikipedia.org/wiki/Carrier_sense_multiple_access_w... type of solution is ideal for this use case. Because if you think about it, two transmitters at once is not an unrecoverable failure mode. In fact it's easier to handle here than in, say, the original Ethernet standard, because with packet-based internet stuff we can do things like changing the data rate on the fly in response to collisions.

Currently the system I'm imagining works as follows: Every T seconds a new packet is initiated, and fancy spanning tree relaying or whatever is used, and eventually everyone has the XOR of all server and client versions of the packet, which happens to be the XOR of whatever various clients happened to include into their packet. Now the additional information for a client who chooses to add that will be the payload plus a checksum value (which must not be homomorphic under the XOR operation). If one client transmits, the checksum passes, and everyone is happy. If multiple clients transmit, they collide, and the checksum does not pass, and each transmitter knows this and waits a random backoff time (number of packets) before trying again. But in addition, a collision is often a signal that the packet rate is too low, so collisions also cause the packet period T to decrease (and lack of messages will cause it to increase, naturally).

So I think the basic "matrix of shared secrets" construct can be extended to allow low-overhead (for values where "low" means "on the order of 3x") communications, because dynamically varying the period between packets and allowing anyone to try transmitting during any packet time will tend to mean that when no one is transmitting bandwidth drops to a very low level (I could easily believe 16-byte idle packets every 5 seconds for something where absolutely low latency isn't a requirement).
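The collision-detection idea from the sketch above can be illustrated assuming a DC-net that delivers the XOR of everything sent in a round. The framing and checksum here are invented for illustration (a hash truncation stands in for "a checksum that is not homomorphic under XOR"):

```python
import hashlib
import random

PAYLOAD_LEN = 16
CHECK_LEN = 4
SIZE = PAYLOAD_LEN + CHECK_LEN

def checksum(data):
    # non-linear check value: the XOR of two valid frames will (almost
    # surely) fail it, unlike a plain parity/XOR checksum
    return hashlib.sha256(data).digest()[:CHECK_LEN]

def frame(payload):
    return payload + checksum(payload)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def round_result(transmissions):
    # the DC-net delivers the XOR of everything sent this round
    out = bytes(SIZE)
    for t in transmissions:
        out = xor(out, t)
    return out

senders = {"a": frame(b"first message 16"), "b": frame(b"second message16")}

# round 1: both transmit -> collision, the checksum fails
collided = round_result(list(senders.values()))
assert checksum(collided[:PAYLOAD_LEN]) != collided[PAYLOAD_LEN:]

# each transmitter sees the failure and waits a random number of
# packet rounds before retrying (the random backoff from the comment)
backoff = {name: random.randint(1, 4) for name in senders}

# a later round with only one transmitter passes the check
solo = round_result([senders["a"]])
assert checksum(solo[:PAYLOAD_LEN]) == solo[PAYLOAD_LEN:]
```

The period-adjustment part (shrinking T on collisions, growing it when idle) would sit on top of this loop; it's omitted here.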

Two problems with unscheduled communications: 1. It allows the adversary to disrupt communications by continuously sending junk. Solving this problem was a major goal of Dissent not adequately handled by previous designs based on Dining Cryptographers networks. 2. Without a schedule telling everybody when to send something, the first guy to talk is obviously the sender, destroying anonymity.

I might just be tired, but I cannot actually come up with a reason this would be useful. Someone help me out?

We decided to build an anonymous group communication platform with some other assumptions and requirements:

In our case, speed was an issue, and so was encryption of communication. We did not want anyone except Bob and Alice to know what they were talking about, and we did want to allow formation of groups... ( https://register.blib.us )

I tried your demo and immediately got an "SSL cert not trusted!" Chrome error. https://rrc.imp.blib.us/link/album/private?albumid=117bc3

In our system, every user generates a self-signed certificate and has the option of buying a signed certificate. That demo is from a user of the system that did not buy a signed certificate.

As a user, all I see is a website saying they take security seriously and then pointing me at a demo with a broken SSL cert.

I realize that's probably an unfair characterization, but you have to understand that's what every user is going to think. You'll also have to work on your message if you want to get people using this, because it's very hard to understand what exactly your website does. I had to do a double-take because the website says it's a tool for building a personal library of books, not anonymous group discussion.

Agreed. Will put up a signed certificate for the demo. Thanks for the input.
