How exactly do you stop sockpuppetry while maintaining anonymity?
To ensure that a person cannot create more than one account, it would be necessary to observe some property of the person that is difficult to alter (such as their appearance, or their social security number) and check it for uniqueness on the network.
Now if this network is truly anonymous, then it must be disjoint from every other network. And there is at least one other network ("real life"), so whatever this property is, it must be something that an adversary in the real-life network cannot compare against the anonymous network.
So it certainly must not be some property that a person in the real-life network could observe by following you around or rifling through your drawers and then compare to the answer you gave when signing up for the anonymous network. So it must be something in your head (like a password).
But if it is in your head, isn't it easily changed, and this property would be exploitable to create sockpuppets?
I must have missed something fundamental here, possibly in terminology. Can somebody who is current on the research here enlighten me?
The group for anonymous communication in Dissent is formed using unspecified means. It could be everybody who signed up in the last month, obtained a credential from some authority, has a key within the web of trust for the group, etc.
Some of these methods obviously allow the adversary to create Sybils (aka sockpuppets). The ones that don't may not provide anonymity about who is in the group, but the protocol will provide anonymity for who said what during the group communication. This is still extremely valuable. Consider voting as an example: the group is known, but individual vote anonymity matters.
If the group formation mechanism does allow Sybils, that still doesn't violate anonymity. For a message from an honest member, the adversary cannot tell which honest member it came from. It also doesn't violate the accountability of the protocol - any disruption will be attributable to some Sybil, who will be punished.
1. Anonymity is provided as long as there is a single trustworthy member, regardless of how many phony members there are.
2. Denial-of-service resistance is provided even against many Sybils - eventually they will all be kicked out of the group and communication can proceed.
This is in contrast to protocols (e.g. onion routing, Aqua) that only provide their security properties when the adversary doesn't control too much of the system. I think it is a fair claim to make and in particular is clear to people familiar with this area of research.
"Towards Efficient Traffic-analysis Resistant Anonymity Networks" <http://www.mpi-sws.org/~stevens/pubs/sigcomm13.pdf>
Don't get me wrong, the research is very impressive in its own right, but that's at best misleading.
I don't think this is possible. Nor is it desirable; someone might well want to use different online identities for different parts of their personality.
"A group of N>2 parties wish to communicate anonymously, either with each other or with someone outside of the group. They have persistent, real world identities and are known, by themselves and the recipients of their communications, to be a group."
That of course leaves me with a lot of, "But how would you do X?" questions. I'd love to see a FAQ for the project, and a list of things you can and can't do with it.
One application that comes to mind is human rights abuse reporting. Let's suppose I get 100 volunteers in the field with software based on this. Some of them are agents of the abusive regime; most are sincere. All communications are monitored by the government. If I read the presentation correctly, anybody can submit an incident report that we can be sure came from one of the 100 volunteers, but that can't be traced back to an individual.
Another application they mention is secure voting. Let's say you're part of a cabal like WikiLeaks. You could use it to hold untraceable but reliable internal votes on whether a given document gets released or not.
You then require them to show the untraceable credential each time they submit a message.
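One standard way to make such a credential unlinkable to the session in which it was issued is a blind signature. Here is a minimal sketch using textbook RSA blinding with toy parameters; this is an illustration of the general idea, not anything the Dissent project actually specifies, and real use would need large moduli and proper padding:

```python
import secrets
from math import gcd

# CA's RSA key (toy size, for illustration only).
p, q = 61, 53
n = p * q                           # 3233
e = 17
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent

m = 42  # stand-in for the hash of the user's credential request

# User blinds m with a random factor r before sending it to the CA.
while True:
    r = secrets.randbelow(n - 2) + 2
    if gcd(r, n) == 1:
        break
blinded = (m * pow(r, e, n)) % n

# CA signs the blinded value without ever seeing m.
blind_sig = pow(blinded, d, n)

# User removes the blinding factor; the result is a valid signature on m.
sig = (blind_sig * pow(r, -1, n)) % n
assert pow(sig, e, n) == m  # anyone can verify, but the CA cannot link sig to the issuing session
```

The point is that the CA learns who it issued a credential to, but cannot later recognize that credential when it is shown, which is exactly the property needed here.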
What you've described is a central-authority architecture, with one party in charge of issuing the credentials. But we already have certificate authorities that issue SSL certs. They're brittle to government influence, because it's impossible to know whether the government has acquired the secret keys. In your case it'd be impossible to know whether the hypothetical CA has stored your secret credential, or whether they've told anyone your real identity.
Even if this model were to work, it wouldn't protect users from themselves. Here's a fascinating read about the problems of staying anonymous even in an environment with perfect anonymity guarantees: https://whonix.org/wiki/DoNot
For example, no anonymity network can protect against stylometry, so that's always a concern. I was also shocked to realize that something as simple as automated time synchronization will reveal your general location when using Tor, because your machine requests a time update for your specific timezone. You have to set your clock to UTC to avoid that. There are about two dozen other vectors by which you can accidentally reveal your true identity even when using a rock-solid protocol. Anyone who's interested in this should read the entirety of the Whonix wiki. In addition to being comprehensive, it's also a lot of fun to read.
(Most of my comment was meandering and not really related to yours. It's just interesting how difficult perfect anonymity is. It's probably true to say that getting the tech implemented correctly is only a small fraction of the total amount of work required to be truly anonymous.)
The only thing the CA can do is produce a list of people they issued credentials to.
* If there is a central authority collecting the information, then as far as Jeff the User knows, his information could be stored in plain text on a subpoenable hard disk by the central authority.
* If the information is verified by a set of moderators, then the moderators must be numerous enough that no single moderator could capture a substantial portion of the identity data. But to achieve that, the mod requirements would be low enough that an attacker could infiltrate their ranks and uncover the identities of users.
This is not a solution.
As I understand it, the basic idea is that there is a network made up of N clients and M servers, with the clients (and potentially servers) identifiable in some external fashion (IP addresses or GPG keys or whatever, it's unimportant to the protocol) and we want to make it possible for any of the clients to broadcast a message without anyone being able to tell which client it came from.
So what happens is each client and each server generate a Diffie-Hellman key which is associated with their actual external identity. Then each client establishes a unique secret with each server (and vice versa, naturally). These unique secrets are used to produce N times M unique bitstreams using PRNGs (so each client has a bitstream corresponding to every server that exists, and each server has a bitstream corresponding to each client that exists).
Then each client XORs together the bitstreams from the unique secret it shares with each server, and each server does the same thing for each client's shared secret. Now there are N + M bitstreams, with the nice property that if you XOR together all of them they all cancel out (because every client-server pairing occurs in the bitstream from that client and the bitstream from that server).
Furthermore, if one client also XORs some data into the bitstream that they publish, no one else can tell, it still contains a bunch of indistinguishable-from-noise data to everyone else who might look. But then when we XOR together all N + M bitstreams, we end up with everything cancelling out except for that extra data that one client added.
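The cancellation trick described above fits in a few lines of Python. This is a toy sketch, not Dissent's implementation: the pairwise secrets would really come from the Diffie-Hellman exchanges, and SHA-256 in counter mode stands in for whatever PRNG the real protocol uses:

```python
import hashlib
import secrets

N_CLIENTS, N_SERVERS, MSG_LEN = 3, 2, 16

def keystream(secret: bytes, length: int) -> bytes:
    # Deterministic pseudorandom bytes from a shared secret (SHA-256 in counter mode).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(secret + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Every client-server pair shares one secret (really derived via Diffie-Hellman).
pair_secret = {(c, s): secrets.token_bytes(32)
               for c in range(N_CLIENTS) for s in range(N_SERVERS)}

message = b"hello, dissent!!"  # sent anonymously by client 1

# Each client XORs together its keystreams with every server;
# exactly one client also XORs in a message.
client_streams = []
for c in range(N_CLIENTS):
    stream = bytes(MSG_LEN)
    for s in range(N_SERVERS):
        stream = xor(stream, keystream(pair_secret[(c, s)], MSG_LEN))
    if c == 1:
        stream = xor(stream, message)
    client_streams.append(stream)

# Each server does the same with every client (no message added).
server_streams = []
for s in range(N_SERVERS):
    stream = bytes(MSG_LEN)
    for c in range(N_CLIENTS):
        stream = xor(stream, keystream(pair_secret[(c, s)], MSG_LEN))
    server_streams.append(stream)

# XOR all N + M published streams: every pairwise keystream appears exactly
# twice and cancels, leaving only the message.
combined = bytes(MSG_LEN)
for stream in client_streams + server_streams:
    combined = xor(combined, stream)

print(combined)  # b'hello, dissent!!'
```

Each individual stream is indistinguishable from noise, so nothing identifies client 1 as the transmitter; only the full XOR of all streams recovers the message.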
So then the Dissent protocol pulls in another construct: it uses a mix network to shuffle a set of public keys generated by the peers, and uses these public keys to establish a transmission order, essentially reimplementing TDMA (Time-Division Multiple Access) in a digital domain with signatures.
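The slot-assignment idea can be sketched like this. Heavy caveats apply: a plain sort stands in for the verifiable cryptographic shuffle, and the peer-to-pseudonym mapping is visible here for clarity, whereas in the real protocol the pseudonym keys are submitted through the anonymizing shuffle itself so that observers cannot link a slot to a peer:

```python
import secrets

# Each peer generates an ephemeral pseudonym key (random token as a stand-in
# for a real ephemeral public key).
peers = ["alice", "bob", "carol"]
pseudonyms = {p: secrets.token_hex(16) for p in peers}

# Stand-in for the mix-net's shuffle: an order that is a function of the
# random keys alone, so it reveals nothing about who submitted which key.
shuffled = sorted(pseudonyms.values())

# Each peer privately learns its own slot index; outsiders see only the
# shuffled list of pseudonym keys.
slots = {p: shuffled.index(k) for p, k in pseudonyms.items()}
assert sorted(slots.values()) == list(range(len(peers)))
```

Each peer then signs its transmission in its own slot with the pseudonym key, which gives the "TDMA with signatures" behavior without revealing the real identity behind any slot.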
In my opinion as a hobbyist interested in this stuff, the whole "everyone produces a bitstream and they magically evaporate leaving behind only the data everyone transmitted" thing is almost magically cool. The time-division multiplexing is less cool, and my EE background compels me to wonder if a meaningful analogue to CDMA or OFDM could be developed. Well, obviously they could be developed; the real question is "could they be useful?".
It's also sort of interesting how the fact that we can't ever be allowed to know when a given peer is transmitting means that the design becomes more "continuous", with data being transmitted by everyone at all times so that the real transmissions can be disguised. I wonder if the theoretical perfectness could be loosened somewhat to allow only, say, 10% of peers to have to be transmitting at any given time (in the long run this could make it possible to identify a transmitter uniquely, but not so quickly that it wouldn't be useful still).
Unfortunately, the second bullet point (accountability) comes close to going unfulfilled. And I feel like it sort of has to, since any method that could determine where malicious data comes from could also be used to undermine the anonymity of the system for everyone else. There is a kind of accountability: the peers themselves can be associated with a public identity without anyone being able to tell which peer produced a given message, even in theory. But it doesn't extend to any system with open registration, because it doesn't handle the "sock puppet" problem at all.
But personally I think the sock puppet problem is pretty much un-silver-bullet-able. The best we can probably ever hope to do for "general purpose" uses is probably a combination of a cryptographic proof-of-work algorithm, public-key signatures to allow (though not force) persistent identity, and some sort of reputation system.
Currently the system I'm imagining works as follows: Every T seconds a new packet is initiated, and fancy spanning tree relaying or whatever is used, and eventually everyone has the XOR of all server and client versions of the packet, which happens to be the XOR of whatever various clients happened to include into their packet. Now the additional information for a client who chooses to add that will be the payload plus a checksum value (which must not be homomorphic under the XOR operation). If one client transmits, the checksum passes, and everyone is happy. If multiple clients transmit, they collide, and the checksum does not pass, and each transmitter knows this and waits a random backoff time (number of packets) before trying again. But in addition, a collision is often a signal that the packet rate is too low, so collisions also cause the packet period T to decrease (and lack of messages will cause it to increase, naturally).
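The collision-detection step of the scheme above can be sketched as follows, using a truncated SHA-256 as the checksum (a hypothetical choice on my part; any checksum that is not homomorphic under XOR would do):

```python
import hashlib

TAG_LEN = 4

def frame(payload: bytes) -> bytes:
    # Append a non-XOR-homomorphic checksum so collisions are detectable.
    return payload + hashlib.sha256(payload).digest()[:TAG_LEN]

def check(data: bytes):
    # Return the payload if the checksum verifies, else None (collision/noise).
    payload, tag = data[:-TAG_LEN], data[-TAG_LEN:]
    return payload if hashlib.sha256(payload).digest()[:TAG_LEN] == tag else None

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

empty_slot = bytes(16 + TAG_LEN)  # what the combined packet looks like with no transmitter

# One client transmits: the checksum survives the XOR and verifies.
one = xor(empty_slot, frame(b"only me sending!"))
assert check(one) == b"only me sending!"

# Two clients collide: payloads and tags XOR together, the checksum fails,
# and both transmitters know to back off for a random number of packets.
two = xor(frame(b"first message!!!"), frame(b"second message!!"))
assert check(two) is None
```

This mirrors the CSMA-style behavior described above: a failed checksum is simultaneously the collision signal for the transmitters and the hint that the packet period T should shrink.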
So I think the basic "matrix of shared secrets" construct can be extended to allow low-overhead (for values where "low" means "on the order of 3x") communications, because dynamically varying the period between packets and allowing anyone to try transmitting during any packet time will tend to mean that when no one is transmitting bandwidth drops to a very low level (I could easily believe 16-byte idle packets every 5 seconds for something where absolutely low latency isn't a requirement).
In our case, speed was an issue, and so was encryption of communication. We did not want anyone except Bob and Alice to know what they were talking about. We did want to allow formation of groups... ( https://register.blib.us )
I realize that's probably an unfair characterization, but you have to understand that's what every user is going to think. You'll also have to work on your message if you want to get people using this, because it's very hard to understand what exactly your website does. I had to do a double-take because the website says it's a tool for building a personal library of books, not anonymous group discussion.