If the (sufficiently small) centralized parts can be made so that they're oblivious to the content and can't discriminate it; if their only practical choice is to work or not work; then most of the disadvantages of centralized systems go away.
Sure, you might disable such a system by destroying the centralized nodes; but this design does prevent using them for the much more likely risks, such as censorship or MITM attacks for eavesdropping or impersonation.
Stefan Brands described a digital cash system using zero-knowledge signatures more than 20 years ago that provides far greater payer and payee anonymity than the blockchain, and is fully distributed with no requirement for full-time network connectivity to engage in trade. The only real downside is it requires a traditional bank at the backend to act as an honest reserve.
http://groups.csail.mit.edu/mac/classes/6.805/articles/money... - halfway down, under the heading "PROTOCOL 5", is a high-level description of its operation.
Many people don't realise that the only substantial benefit unique to Bitcoin's blockchain architecture is the lack of central issuance from a reserve bank.
Still... I wonder how Bitcoin would stand up to a serious attack. Imagine the Chinese government wanted to kill it. They could build 51% worth of ASIC capacity in secret and spring it on the network all at once, taking over the net and maybe just wrecking it. It might cost them so much that it would not be economically beneficial, but if it were done for political reasons they might not care about actually profiting from it. It would be a military attack.
Of course flying a plane into the Federal Reserve headquarters is also possible. At some point you have to accept that any system can be attacked.
So it's not so much that Bitcoin would be destroyed, as a new avenue for theft would open up. Probably temporarily, as other groups would respond by fighting back with more computational power and probably with ad-hoc blocking.
Would the Bitcoin network then be able to check for any double-spending shenanigans while I am using that node? That way I only have to trust it until I can get a message from the Bitcoin network crying foul that some double-spending has happened.
Ultimately, it needs to be backed by a system with a monetary policy, since whatever fraction of double-spending you cannot prevent effectively causes inflation of the money supply.
The problem as you stated feels like it should be solvable. The CAP theorem is not relevant here -- you're not bound by strong consistency because you don't need to provide connectivity 100% of the time. There is strong geographical stickiness in connections, so that looks like something to exploit. Combined with smart localized buffering of packets near the destination, it looks like a solution should exist.
The solution may not be compatible with your existing semi-centralized solution, which may be blinding.
The article is fairly vague - what is the risk of Sybil attacks exactly? (I looked at the FAQ, but it's sparse on details; only glanced at source.)
Is it some kind of spoofing? Since addresses are supposed to be collision resistant and tied to public keys, it should be possible to discard all messages regarding a peer that don't come from it. Although 40 bits is actually way too small...
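To make the "40 bits is way too small" point concrete, the birthday bound says accidental collisions in an n-bit identifier space become likely around 2^(n/2) identifiers. A quick sketch (my own illustration, not from the project):

```python
import math

def birthday_collision_prob(bits: int, n_ids: int) -> float:
    """Approximate probability that n randomly chosen IDs drawn from a
    2**bits space contain at least one collision (birthday bound,
    using the standard 1 - exp(-n(n-1)/(2*space)) approximation)."""
    space = 2.0 ** bits
    return 1.0 - math.exp(-n_ids * (n_ids - 1) / (2.0 * space))

# With 40-bit addresses, about 2**20 (~1M) peers already give
# a roughly 40% chance of an accidental collision:
print(birthday_collision_prob(40, 2 ** 20))

# 128-bit addresses stay negligible far beyond any realistic network size:
print(birthday_collision_prob(128, 2 ** 20))
```

So 40-bit addresses aren't just vulnerable to deliberate collision attacks; a network of a million peers would collide by accident.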
Denial of service by pretending to be able to route to a user? Okay, but there should be some way to punish IP addresses that consistently fail to route things they claim to be able to... complicated, but not impossible.
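One way to make "punish IP addresses that consistently fail to route" concrete is a simple per-peer success-rate table. This is purely my own sketch of the idea, with invented class names and thresholds:

```python
from collections import defaultdict

class RouteReputation:
    """Hypothetical sketch: track how often a peer's routing claims
    actually deliver, and stop believing peers that fail too often."""

    def __init__(self, min_attempts: int = 5, min_success_rate: float = 0.5):
        self.min_attempts = min_attempts
        self.min_success_rate = min_success_rate
        self.ok = defaultdict(int)    # peer -> successful deliveries
        self.fail = defaultdict(int)  # peer -> failed deliveries

    def record(self, peer: str, delivered: bool) -> None:
        if delivered:
            self.ok[peer] += 1
        else:
            self.fail[peer] += 1

    def trusted(self, peer: str) -> bool:
        attempts = self.ok[peer] + self.fail[peer]
        if attempts < self.min_attempts:
            return True  # give new peers the benefit of the doubt
        return self.ok[peer] / attempts >= self.min_success_rate
```

The hard part, of course, is that a Sybil attacker can mint fresh identities to reset the "benefit of the doubt", which is why this alone doesn't solve the problem.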
Of course, I've only thought about this for 10 minutes, while the author has studied it for a long time. I'm sure they have more insight about the problems, but without more details I can only speculate.
There's no point trying to convince the general crypto community, which believes Bitcoin is the first solution to the Byzantine Generals Problem. I'll fix that later.
But the problem is the same problem, and I would counter that the security is if anything stronger than that of Bitcoin. Ever since I studied pre-existing literature on consensus, I no longer see Bitcoin as you probably see it today.
Zero configuration is also basically impossible, because you have to bootstrap the network somehow. Fully decentralised solutions still require bootstrap information, which is unfortunately hard enough for many users that it works as an efficient show stopper.
The last nail in the coffin is that most people really do not care about security at all. The user base ends up being just a small bunch of delusional geeks.
Otherwise something like RetroShare would be widely used.
I've been considering and wondering about these same questions for several years. Btw, for Bitcoin-style messaging, check out Bitmessage.
The challenge remains to create something that is simple and convenient to use and is at the very least on par with popular alternatives. Nothing more than that. Counting all the ways a decentralized implementation in principle is going to supposedly struggle without mentioning a longer list of problems with centralized systems is being excessively negative and cynical and for what purpose?
When the TPB guys said they would make something cool, I naturally assumed they would release a fully decentralized, pseudonymous, anonymising solution to replace BitTorrent, which is inherently dangerous because it leaks so much metadata and reveals to the whole world what you're downloading. Instead they released something (technically) lame, the PirateBrowser, which just allows accessing their site via Tor (basically working as a proxy), even if it's blocked locally.
I really do like the Freenet & GNUnet design, because relays store and cache data, quickly creating a lot of sources for popular data. Many systems have serious flaws; Bitmessage is not scalable with its current design. Most other fully decentralized systems suffer from flooding, Sybil attacks, and metadata leaks. I personally played a little with Bitmessage, creating over 100k fake peers just for fun, and it worked beautifully, flooding the network with noise and generated junk control traffic. Because it's a distributed solution, up to 10k junk messages were stored on each peer, also wasting disk space. Decentralized design also makes nodes running the system vulnerable unless additional steps are taken to limit resource consumption. And once resources need to be limited, it's a great question what should be limited and how that affects the network.
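On the "what should be limited" question, one common building block is a per-peer token bucket. This is only an illustrative sketch of resource limiting in general, not Bitmessage's actual mechanism:

```python
import time

class TokenBucket:
    """Per-peer rate limiter sketch: each peer gets a bucket, so a
    flood of junk control traffic from one peer exhausts only that
    peer's budget instead of the node's disk and bandwidth."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the budget permits."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

As the comment says, though, choosing the rate and capacity is exactly the hard question: too strict and you throttle legitimate bursts, too loose and the 100k-fake-peer attack still works.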
Like the original article says, it's really hard to come up with a solution that wouldn't be bad in some way.
As mobile devices get more and more common, the fact is that it's really hard to beat the centralized solution. Of course the system can be partially distributed, with mobile clients connecting to one spot only: server hubs doing the routing and data storage. But basically that's no different from email, DNS, and the web at all. So no news here either. After all, email isn't a bad solution at all. It's secure (if configured correctly) and it's distributed. - Yes, I'm one of those running my own email servers, as are my friends.
Final question is, how to protect metadata and hide communication patterns. As I mentioned in my post about Bleep. It didn't impress me at all.
I'm not attacking the original article's writer. He has clearly done a lot of work, considered the options, and written clearly and well about it. I'm just thinking about these issues in purely technical terms.
I'm happy to hear comments. I'm just cynical because the distributed systems didn't turn out as popular as I would have expected and wanted.
What I would like to see? A perfect fully decentralized P2P network running as an HTML5 app which wouldn't require installation. That would be something I would currently rate as pretty cool.
I wonder if it could stand up to a big Sybil attack though? Most of the black hats that could do it don't because they all watch illegal movies and have no motive to attack that network. I would bet you a donut the RIAA and MPAA have talked about it, but actually doing it could expose them to civil and even criminal penalties since it would technically be computer fraud and abuse.
Now somebody just needs to make a personal server platform based around CoreOS and Docker.
For users of very low power devices and low bandwidth mobile that could not run the software, you could have any number of independent hosting providers that would host a web based instance either for free as a community service or for a few bucks.
My impression is that Diaspora tried valiantly. If execution had been better and the team hadn't collapsed and if they could have really solved the technical problems I think they would have made it to a pretty decent size.
Edit: The 1400-page "Handbook of Peer to Peer Networking" (2010) has a good list of relevant papers. Even though the book itself is expensive, many of the papers are available elsewhere.
Supernodes are important because they permit instant on, rapid lookup, and almost instant nat traversal. They also bridge sparse multicast groups as described here.
Both Skype and Spotify have backed away from p2p. In Skype's case I suspect PRISM. I don't know why in the case of Spotify. That seems like an ideal application: since a file is not a moving target, a lot of the problems in distributed databases disappear. That's why BitTorrent magnet links can work pretty well, though I bet a determined attacker could Sybil the Kad DHT to death.
In other words, if some nodes have to be centralized for technical reasons, apply a decentralized, Web-of-Trust, out of band, policy/reputation approach to governance and auditing the identity of the supernode operators.
E.g. Supernode N is in City A, owned by Person B, regulated by Country C, packet-inspected by ISP D, and is vouched for by the following list of DevOps infrastructure auditors.
A federated model or some kind of expanded hierarchy is possible but it would still have a center or at least a certificate authority. Otherwise you get Sybil.
Either the supernode can't do anything bad even if it's malicious (as with Bitcoin miners), in which case you don't need the transparency; or it can do something bad, in which case you can't audit it remotely and you need trust.
If "something bad" means altering data, then if you can detect whether the data is 'proper' (as with Bitcoin transactions), a node simply can't fake data; but if you can't detect it, then you need to trust the node. The same goes for reading data: if it's cryptographically impossible for the node to read your messages, then you don't need to audit that node; but if it's possible, then you can't really be sure unless (and probably even if) the node is under your full control.
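The "detect whether the data is proper" case is easiest to see with content addressing: if data is requested by its hash, a relaying node mathematically can't substitute different bytes without detection. A minimal sketch (my own illustration, not from the thread):

```python
import hashlib

def content_id(data: bytes) -> str:
    """Address data by its SHA-256 digest."""
    return hashlib.sha256(data).hexdigest()

def fetch_verified(node_fetch, cid: str) -> bytes:
    """Fetch from an untrusted node, then check the bytes actually
    hash to the requested content ID before accepting them."""
    data = node_fetch(cid)
    if content_id(data) != cid:
        raise ValueError("node returned tampered or wrong data")
    return data
```

This is why magnet links and block hashes work without trusting relays. Note it only covers integrity; it says nothing about the node reading your traffic, which is the second half of the argument above.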
Ahh, and if by 'verify' we mean not 'automatic verification that the system can do' but 'human verification where I check that Bob seems to be a nice guy, and thus his node can't be evil', then we're back to the original post: either a centralized certificate authority chain, or Sybil problems.
If the node is running an operating system that supports a dynamic root of trust, it's theoretically possible to perform a remote attestation for binary hashes of all software & config running on the remote node.
Not suggesting that is easy, but it can be done with the right hardware (TXT, TPM) and Linux software stack.
In addition, having a true and valid hash of my OS, software, and config doesn't give you any real assurance: a security hole in that software or OS can easily give an attacker the ability to execute arbitrary code, so the "proper configuration" can behave in whatever way it likes, despite being verified. That's what I meant by saying that you can't be sure it's not doing anything bad, even if you completely control the node: if the underlying math ensures that reading or altering X isn't possible, then it's okay; otherwise you have to hope/trust that it doesn't. That's what is meant by zero-knowledge systems, as opposed to 'secure' systems that simply ignore knowledge they shouldn't look at.
> if the underlying math ensures that reading or altering X isn't possible, then it's okay
If an attacker has access to a zero-day exploit that can compromise the OS, could they compromise random number generation and weaken the zero-knowledge system?
So yes, for a fixed problem domain, the information traversing it requires topological constraints. In other words, the fact that the data is structured requires the network to be traversable.
So how about this? We remove the structured requirement from the data. My device "knows" several other devices. Each device has some biometric identifying data to show who is using it currently. Those devices send my device unstructured data and app code from time to time. Each device decides who and what to share things with and whether and how much to trust incoming material. The data and the app are linked through a cataloging system provided by the sender.
The trick is this: each device really only makes one decision: how much do I trust the other devices? If trust is established, the app and code can continue to propagate. So we're sort of making viruses a good thing!
Now your data has no structure, so there's nothing to coordinate. If some of the initial apps helped folks communicate, then you could communicate to those you trusted. I could imagine some "master" data types that all devices could use: URL, IPv6, Text, HTML, JSON, XML, email, IM, etc. You could even have some master device types.
The beauty of this is that you start building security at the same time you're doing anything else -- because if you don't, somebody's going to own your machine, and rather quickly. Each device has no knowledge of the whole, but it has to quickly evolve a very complex and nuanced understanding of what it's willing to store and do for other devices.
I don't think the math would work on this at all, but I think it's a workable system - much the same as a cocktail party is a terrible place to interview people, yet lots of folks end up with jobs because of conversations held around cocktail parties. (Do people still have cocktail parties?)
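The single-decision idea above ("how much do I trust the other devices?") can be sketched in a few lines; the class, scores, and threshold here are invented purely for illustration:

```python
class Device:
    """Sketch of a device that accepts and re-shares incoming data
    and app code based solely on how much it trusts the sender."""

    def __init__(self):
        self.trust = {}   # peer_id -> score in [0.0, 1.0]
        self.store = []   # accepted material

    def receive(self, peer_id: str, item, threshold: float = 0.6) -> bool:
        """The one decision: accept material only from peers whose
        trust score clears the threshold; drop everything else."""
        if self.trust.get(peer_id, 0.0) >= threshold:
            self.store.append(item)
            return True
        return False
```

In practice each device would also need to update those scores from experience, which is where the "complex and nuanced understanding" comes in, and where the hard open problems (Sybil identities, bootstrapping trust) live.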
A few interesting efforts that I rarely see mentioned in this context may add to your opinion on the matter:
- the creator of Kademlia, Petar Maymounkov, has been working on a successor on and off for years. His site offers some context, and in one of his own comments there are a couple of pointers to his current efforts in this area; you'll find it from here if you're interested. Updates are a bit scattered across different media and time :)
- another field that is not fundamentally different in what it wants to achieve is 'content-centric networking', which is advocated by PARC and backed by some of the old-school networking guys/researchers. A Wikipedia page is here, but I don't really like it:
A presentation by Van Jacobson of PARC on the subject can be seen here:
The talk seems intended for a low-tech audience at first, and I can imagine staying concentrated is hard here, but IMO the talk does deliver.
The problems these two encounter and are trying to solve are of a different nature than what I see in most projects, which I think often get entangled in nonfundamental problems and eventually end up in an unrecoverable mess because the fundamental issues were never addressed (they're hard, and may be impossible to solve currently)... In my opinion, cooperation/adaptation in the lower OSI layers will be needed to achieve the efficiency, security, and decentralization you mention; a single-purpose application may be easier to get away with. Whether this will ever be able to happen is maybe more a geopolitical discussion than a technical one. Complicated stuff, we'll see :)
Here it is: http://www.maymounkov.org/kademlia
This makes it sound like a technical problem, but it's also at least partially a market problem. It's much more straightforward to squeeze revenue streams out of centralized walled gardens like these than decentralized equivalents.
A decentralized Facebook would have less revenue, but also negligible hosting costs.
The assumption is (quoting from the conclusion): "For example, the transmission delays between the local and central stations are assumed to be negligible compared to processing times; this may not be true for data centers that are separated by significant geographic distances." This sounds to me awfully like an "oracle machine" (http://en.wikipedia.org/wiki/Oracle_machine): it has "more computation" and it can decide instantaneously where to apply it. I'm skeptical of generalizing the result from the paper to all distributed systems, in particular the ones where the latency-to-effort ratio doesn't fit the stated assumption.
Also have you looked into eventual consistency?
I know some community mesh nets have used it with some success. Just because something has limits doesn't mean it can't be very useful for lots of applications.
> Google, or Twitter because nobody's been able to ship one.
Diaspora and identi.ca (/pump.io) are decentralized alternatives to Facebook and Twitter. Admittedly not in the fully decentralized sense the author is talking about, but that isn't their problem. People don't use them because people don't use them.
Suppose I wanted to join Diaspora. I search for "diaspora" on Google. The first result is the Wikipedia article, and the second is a link to joindiaspora.com. Cool, looks promising, but when I click on the link I find this:
"JoinDiaspora.com Registrations are closed"
But OK, maybe there's hope:
"Diaspora is Decentralized! But don't worry! There are lots of other pods you can register at. You can also choose to set up your own pod if you'd like."
So I click on "other pods" and I get sent to a naked list of sites. No explanation, no basis for choosing among them. Many of them in hinky-looking international domains.
OK, so I go back and click on "Set up your own pod", which takes me back to joindiaspora.com. At that point I give up.
Diaspora may work for hard-core geeks, but it doesn't work for normal people.
http://gnu.io/ will only ever be used by niche users and will never get to the size or even utility of Facebook, Google, and Twitter. Pretending otherwise is just willful blindness.
1. building distributed systems is a lot harder
2. so deploying/building centralized systems just gets on the market faster and is quicker to develop for
Well, that's what my company's been working to change. We started a little earlier than the Diaspora project was announced. And we've already designed the whole platform and built over 95% of it. As for the list, we have slightly different priorities.
#1 - We are putting this way lower on the list, and consider it unnecessary if the network is really distributed. Just like Google Plus's "Real Name Requirement" is unnecessary: if Bill Gates doesn't want me to see his real name, he doesn't want me to find his account via that name. It's more to enrich Google than to help people. The current IP system and DNS / certificate authority system can be a good compromise for this, and notably they rely on centralized, albeit delegated, mechanisms. One could argue that acquiring a domain should be completely decentralized in order to prevent censorship. But none of that is required for real-time point-to-point connectivity, because with total privacy, you won't find the node you're looking for unless you follow a path through the network by which it wants to be found. All the nodes in that path would have to be online as you navigate them, but that's about it. I'll call this requirement #1a.
#2 - Absolutely. This would probably be shifted to #1 for us. For example, if I am on NYU's network and I friended a few people, and then I go join Columbia's network, it should seamlessly reconstruct my social graph there from the people who are also on Columbia AND want me to know that. Requests for information published by multiple parties would have to be routed correctly and responses would have to be combined. And so forth.
#3 - For you this is part of #2, but also assumes #1. Of course, to accomplish this, two nodes sharing a session would simply need to agree at all times to use one (or more) other nodes which are currently online, to pick up the session. If your device goes out of range of your WiFi and switches to some other network, it still contains the routing information for that node. This is a bit different in that it doesn't require #1, but only #1a.
#4 - I guess I do not need to worry so much about lower protocol levels, but basically if you can assume https, then that's enough (although you are at the mercy of the certificate authorities until TLS starts to use Web of Trust instead of x.509). Usually, people run clients instead of servers, and they choose to trust some server to run an application. Presumably this server would not be blocked by its internet service provider -- in fact it may be a wifi hotspot in an african village that doesn't even have internet. It would be a single source of failure, but only for that subnetwork. In a distributed social platform, one could easily just go use another server hosting the same app.
#5 - Private, encrypted, secure communication gets tough for some applications. For example, multi user chat or mental poker where you may not trust other participants. So until we have zero-trust technology for everything, people trusting a server of their choice seems to be a good general solution. And all this communication once again relies first on a reliable identity for users, apps, servers, etc. which can of course be achieved by self-signed certificates, but needs to be more seamless across networks as per #2.
#6 - Everything I've described above can be decentralized in the same sense the internet is -- smaller networks joining into larger ones. But you don't really need a central DNS, you don't really need a centralized identity. I also assumed you did, until February of this year when I met with Albert Wegner from Union Square Ventures. Our system was already able to handle everything as decentralized, but we wanted to have unique ids for every user in SOME database. Then we realized that this isn't necessary.
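The session-handoff idea in #3 (two peers agreeing up front on one or more fallback relay nodes) can be sketched roughly like this; the class shape, relay names, and online-check are hypothetical:

```python
class Session:
    """Sketch of relay failover: both peers agree on an ordered list
    of relay nodes at session setup, so when the current network path
    drops (e.g. WiFi to cellular handoff), the session resumes through
    the first agreed relay that is still reachable."""

    def __init__(self, relays, is_online):
        self.relays = list(relays)  # agreed on by both peers at setup
        self.is_online = is_online  # callable: relay -> bool

    def pick_relay(self):
        for relay in self.relays:
            if self.is_online(relay):
                return relay
        raise ConnectionError("no agreed relay is reachable")
```

Since both sides carry the same relay list, neither needs global node discovery (requirement #1) to find the other again, only reachability of the agreed relays (#1a).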
That said, there is one problem with the web's security model that does not allow apps to truly be decentralized. The problem is that, if I sign in using one app somewhere, and then visit another app, the other app doesn't know where I've signed in or which account I prefer to use. People attempted to solve it with proposals like xAuth which are merely stopgap measures ... but to date, no one has really been able to make identity totally seamless because of it.
In short ... I now disagree that #1 should be the biggest priority, because every node doesn't REALLY need to be able to connect to every other node on demand, but only to nodes it has encountered while using the network. That makes everything able to be distributed.
As for the CAP theorem, I guess the reason you can't maximize Availability and Consistency while at the same time having Partition Tolerance is because subnetworks can disconnect at any time, and if you need consistency you'll have to wait for them to reconnect. But many problems do not require consistency across the WHOLE system! That's why we've chosen to extend the web model, where clients connect to a server they trust, and this can be used to power chatrooms and other group activities. Partition Tolerance within this subnetwork is only limited to splits within the subnetwork, not outside of it. And since the network consists of very few nodes compared to the whole world, the splits happen rarely, and the rest of the time, we can have high Availability while preserving total Consistency.
"If you’re using early-binding languages as most people do, rather than late-binding languages, then you really start getting locked in to stuff that you’ve already done. You can’t reformulate things that easily." -Alan Kay
The problem for me is not so much that a variable has a static type once it's been set. It's more that the variable shouldn't even exist in the first place. The way we concentrate so much on variables in C-based languages is really kind of absurd, once we stop to consider that they are only temporary holding sites for the flow of information.
I had never quite made the connection before that this is really the problem with the internet at large. We're focussing on websites, IP addresses, servers, etc, when really it's all about the information and the relationships and transformations that connect it.
The distributed internet that eventually replaces the one we pay for now will probably work more like a solver, so we'll create a request or query of some kind and the network will derive the solution, which most of the time will amount to simply fetching the data that matches a hash. It will be interesting to see what happens when we throw out archaic protocols like NAT that reduce everyone to second class netizens, and even abandon the idea of data bound to device or location, and think of it more as a medium like time or space where we can freely exchange ideas without having to depend on central points of failure or limiting protocols.
What kind of haunts me though is the leverage that this will give software. Imagine how much more powerful the internet is compared to, say, a library. Now imagine if instead of having one computer, with all the limitations we can’t even perceive anymore because we’ve been stuck in these limiting paradigms, we each have thousands or millions at our disposal. I picture it a bit like wolframalpha.com except with natural language processing, so you could ask it anything you wanted, and even if it couldn’t find the answer, it would do its best to reconstruct the missing pieces and get you close to your goal anyway. I just feel in my gut somehow that this will be part of the basis of artificial intelligence, and this general way of storing, retrieving and transforming information by hash, diff, or schema-less relationship and probability works from the lowest level logic gates and languages like Go up to big data APIs like Hadoop. Like maybe what we think of as intelligence is a subset of a universal problem solver. I guess that’s why I’m so interested in a free, distributed internet. It’s like right now what we think of as the internet is really the inside of a cage, and we can see out through the bars and imagine an ecosystem where countless ideas are evolving untamed.
I’m not sure if I’ve said anything at all here, but hey, it’s Saturday so might as well post it.