
Will there be a Distributed HTTP? - prostoalex
https://www.mnot.net/blog/2015/08/18/distributed_http
======
pkinsky
This is actually a pretty cool idea (although perhaps badly explained, given
the other comments here).

Here's how it could work: IPFS ("In some ways, IPFS is similar to the Web, but
IPFS could be seen as a single BitTorrent swarm, exchanging objects within one
Git repository") is a globally distributed hash-addressed versioned
filesystem. (see: [http://ipfs.io/](http://ipfs.io/))

They have a mirror of their homepage hosted on IPFS, here:
[http://gateway.ipfs.io/ipfs/QmeYYwD4y4DgVVdAzhT7wW5vrvmbKPQj...](http://gateway.ipfs.io/ipfs/QmeYYwD4y4DgVVdAzhT7wW5vrvmbKPQj8wcV2pAzjbj886)

To answer the question: distributed GET and HEAD are absolutely possible.
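The core idea behind a distributed GET is easy to sketch. IPFS's real addresses are multihash-based CIDs rather than bare hex digests, so this is only the underlying principle: the URL is derived from the content itself, so any untrusted peer can answer the request and the client verifies the bytes locally.

```python
import hashlib

def address_of(content: bytes) -> str:
    # The address is derived from the content itself (IPFS uses a
    # multihash/base58 "CID" rather than a bare hex digest like this).
    return hashlib.sha256(content).hexdigest()

def fetch_and_verify(address: str, peer_bytes: bytes) -> bytes:
    # Any untrusted peer may answer the GET; re-hashing proves integrity.
    if hashlib.sha256(peer_bytes).hexdigest() != address:
        raise ValueError("peer returned bytes that don't match the address")
    return peer_bytes

page = b"<html>hello distributed web</html>"
addr = address_of(page)
assert fetch_and_verify(addr, page) == page
```

Because the address certifies the content, it doesn't matter who serves it; that's what makes the "single BitTorrent swarm" framing work.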

~~~
comboy
Isn't that more or less how Freenet works?

~~~
derefr
Yep. Freenet is inefficient, though, because it has a bunch of indirection in
order to obscure the originator of the inserted documents. You could remove
that indirection if you just cared about the content, not about anonymity.

~~~
ArneBab
If I don't need anonymity, I would choose the download mesh from Gnutella.

If I do need anonymity, Freenet is a very good choice.

------
kragen
This is one of the most crucial things we need to make free software viable
again. In 2006, I wrote that the only solution to the problem of proprietary
services was to "build these services as decentralized free-software peer-to-
peer applications, pieces of which run on the computers of each user":
[https://www.mail-archive.com/kragen-
tol@canonical.org/msg001...](https://www.mail-archive.com/kragen-
tol@canonical.org/msg00154.html)

And, in particular, I wrote a few months later that replacing HTTP URLs for
naming content is necessary and nearly sufficient: [https://www.mail-
archive.com/kragen-tol@canonical.org/msg001...](https://www.mail-
archive.com/kragen-tol@canonical.org/msg00176.html)

We still have a long way to go, but it's heartening to see so much work toward
solving the problem! Perhaps one of the systems mnot links to will evolve to
solve the problem; perhaps it will be something that we haven't started to
build yet.

This is crucial to the future of civilization and to the longevity of your
personal work. Nearly all the effort that went into proprietary software in
the 1980s and 1990s has been lost rather than becoming part of the cultural
heritage of humanity, in the way that Emacs and GCC have. Similarly,
everything you invest today into proprietary web services is ultimately
destined for the dumpster, whether it's code you write to build them or data
you store in them. We need an alternative that has a chance of lasting.

~~~
gress
How do you foresee the decentralized production of CPUs being achieved?

~~~
traverseda
A bit of a non sequitur, but I'll bite.

Replicating toolboxes. Or self-replicating benchtop factories.

Microcontrollers could be produced using dip-pen nanolithography, or a few
other techniques.

Of course we need decent open source atomic force microscopes first.

There will always be components that can't be produced on that kind of scale.
In RepRap parlance we call these "vitamins". An apt metaphor.

~~~
jacquesm
The production of integrated circuits is very much dictated by economies of
scale. The capital required is simply too large to justify the creation of a
production line for a small run. If you crack that particular nut then you're
going to be the next Bill Gates.

~~~
vidarh
The capital required to _currently_ produce top-of-the-line ICs is too large.
But we're getting closer - people are building their own CPUs. E.g. there was
an article recently about someone breadboarding a CPU in the tens-of-MHz range.
And there are plenty of FPGA projects, which, while still depending on ICs
manufactured by a central manufacturer, substantially reduce its role.

~~~
gress
How exactly do FPGAs do anything at all to reduce the role of central
manufacturing?

~~~
kragen
The same way computers do: you can reprogram them instead of buying a new
thingamajig.

~~~
gress
This makes no sense. Computers are made of centrally manufactured components,
and are regularly replaced by new ones as technology advances.

This applies to CPUs too and is in no way mitigated by FPGAs.

~~~
kragen
I use my computer to read books, for example. This eliminates the central
manufacturing of books; when I want to read a book, I tell my computer to turn
into a book, thus manufacturing one temporarily on my desktop until I'm done.

I use my computer to graph equations. This eliminates the central
manufacturing of graph paper; when I want graph paper, I tell my computer to
turn into graph paper.

I use my computer to make telephone calls. This eliminates the central
manufacturing both of telephones and of telephone switches; when I want a
telephone call, I tell my computer to turn into a phone, and it turns routers
into phone switches.

I use my computer to watch pornography. This eliminates the central
manufacturing of porn VHS tapes; when I want a porn video, I tell my computer
to turn into a porn video.

I use my computer to store books. This eliminates the manufacturing of
bookshelves, although to be fair, bookshelf manufacturing has never been as
centralized as computer manufacturing.

I use my computer to do engineering design. This eliminates the central
manufacturing of drafting tables, slide rules, and pocket calculators.

I use my computer to find out what time it is. This eliminates the central
manufacturing of wristwatches.

I use my computer to send letters such as this one. This eliminates the
central manufacturing of envelopes and postage stamps.

I use my computer to find out about Syrian children drowning in Turkey. This
eliminates the central manufacturing of newspapers, although again, this may
not have been as centralized as computer manufacturing is.

Perhaps this can help you to understand in what sense the original statement
is true, as well as the sense you've already pointed out in which it is false.

------
sktrdie
There's already quite a large distributed "HTTP" in use every day:
BitTorrent's DHT network. URIs are just the keys of the distributed hash
table. Keys are also mutable so one can change the content stored at specific
keys. Right now it's being used to serve very large files and not HTML/CSS/JS
files. Things like Project Maelstrom are a step in the right direction.
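That storage model can be sketched compactly. BitTorrent's BEP 44 distinguishes immutable items, keyed by the hash of the value, from mutable items, keyed by the hash of a public key and guarded by an ed25519 signature plus a sequence number. Here everything lives in one process, and an HMAC over a shared secret stands in for the real asymmetric signature:

```python
import hashlib, hmac

def sign(secret: bytes, seq: int, value: bytes) -> bytes:
    # Toy "signature": BEP 44 really uses ed25519 over (seq, value).
    return hmac.new(secret, str(seq).encode() + value, hashlib.sha256).digest()

class TinyDHT:
    """One-process sketch of the storage model; a real DHT spreads the
    table across many nodes and verifies asymmetric signatures."""

    def __init__(self, secret: bytes):
        self.secret = secret   # toy only: a real network never holds this
        self.store = {}

    def put_immutable(self, value: bytes) -> str:
        key = hashlib.sha1(value).hexdigest()   # the key IS the value's hash
        self.store[key] = value
        return key

    def put_mutable(self, pubkey: bytes, value: bytes, seq: int, sig: bytes) -> str:
        if not hmac.compare_digest(sig, sign(self.secret, seq, value)):
            raise ValueError("bad signature")
        key = hashlib.sha1(pubkey).hexdigest()  # the key is the publisher's id
        old = self.store.get(key)
        if old is not None and old[0] >= seq:
            raise ValueError("stale write: seq must advance")
        self.store[key] = (seq, value)
        return key

    def get(self, key: str):
        return self.store.get(key)

# The same key can point at new content, which is what makes mutable URLs work:
dht = TinyDHT(b"publisher-secret")
k = dht.put_mutable(b"pubkey", b"index-v1", 1, sign(b"publisher-secret", 1, b"index-v1"))
assert dht.put_mutable(b"pubkey", b"index-v2", 2, sign(b"publisher-secret", 2, b"index-v2")) == k
assert dht.get(k) == (2, b"index-v2")
```

The sequence-number check is what stops an attacker from replaying an old (validly signed) version of the content back into the network.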

Problem is that it's hard to find things, just like it was hard when the Web
started. There are opportunities for the next "google" of this new DHT space.

~~~
Torgo
Is there a description of how maelstrom holds references in the DHT? I can
think of half a dozen ways to bolt HTTP hosting on bittorrent, but the scale
of "people using http" is orders of magnitude larger than "people who use
bittorrent".

------
MorphisCreator
Yes, I've already implemented it:

Distributed HTTP: Maalstroom on MORPHiS :)

GPLv2 unlike Bittorrent Inc.'s Mælström

[https://morph.is](https://morph.is)

It is very fast because it is not anonymity-first, although it is designed not
to leak when run over Tor. It already works great over proxychains. I will add
SOCKS5 support soon.

Also, don't forget the distributed spam resistant automatically encrypted and
transparently authenticated mail:

[https://morph.is/v0.8/dpush-whitepaper.odt](https://morph.is/v0.8/dpush-
whitepaper.odt)

Dpush is distributed /unsolicited/ POST :) Solves the previously open problem
perfectly.

MORPHiS hosted MORPHiS website:

morphis://sp1nara3xhndtgswh7fz OR localhost:4251/sp1nara3xhndtgswh7fz

URL is a hash of the data or the key that signed it. No MITM possible.

The next module I am implementing is DDS - Distributed Discussion System. It
is quite easy because it is fully enabled by the existing Dpush invention that
already powers MORPHiS Dmail.

~~~
ArneBab
how does it avoid timing attacks when running over Tor?

------
marknadal
I met Mark and Tim Berners-Lee at Extensible Summit and was very happy that
they are still actively fighting for the World Wide Web in its full
distributed, decentralized glory.

I do work on synchronization in distributed systems, and would like to add my
database, [http://gunDB.io/](http://gunDB.io/), to the list. Why? Because it
answers his questions in the "Some State and Processing Really Wants to Be
Centralised" section. If you want more info on this, check out the github
repo, or ask me.

Anybody interested in these subjects should be at [https://2015.distributed-
matters.org/ber/](https://2015.distributed-matters.org/ber/), Kyle Kingsbury
will be doing the keynote and later on in the day I'll be presenting my
protocol.

Mark's "Modifying The Web is Scary" section is important. I do see a lot of
people reinventing the wheel, but it isn't too hard to get everything to work
over PATCH (sadly, a verb which didn't take off but is in the specification)
and by upgrading to WebSockets.

Overall, great post. I hope more people talk about this.

------
cbhl
It's not clear to me what's novel about this proposal, compared to existing
distributed stores like the Freenet Project
([https://freenetproject.org/](https://freenetproject.org/)).

~~~
mburns
1. This isn't a single proposal.

2. It is not trying to be novel relative to existing distributed stores. It is
trying to bring to HTTP the features that have been built into lots of other
custom applications.

------
wang_li
How is this distributed model going to deal with the fact that a lot of (all
of?) the websites we visit have dynamic content?

I feel like I'd be much better off, privacy-wise, if I could get a browser
that had a user manageable list of locations to pre-cache with some kind of
daily/weekly offline cache refresh. So when I go to a web page that fetches
jquery from google's hosted libraries service, it instead pulls it from a
locally cached copy and never ever fetches from google as a result of visiting
a web page.
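The privacy half of that wish is mostly a lookup table; a local proxy or browser extension could do something like this (the URL and cache path below are illustrative, and a real implementation would periodically re-download the listed assets on the daily/weekly schedule described):

```python
# Illustrative mapping from well-known CDN URLs to pre-cached local copies.
LOCAL_CACHE = {
    "https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js":
        "file:///var/cache/webassets/jquery-2.1.4.min.js",
}

def resolve(url: str) -> str:
    # Serve from the local copy when we have one, so the CDN never sees
    # the request; otherwise fall through to the network untouched.
    return LOCAL_CACHE.get(url, url)

assert resolve("https://example.com/page") == "https://example.com/page"
assert resolve(
    "https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js"
).startswith("file:///var/cache")
```

The point is that the rewrite happens before any network traffic, so Google's servers never learn which page triggered the jQuery load.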

~~~
onion2k
_How is this distributed model going to deal with the fact that a lot of (all
of?) the websites we visit have dynamic content?_

Web content is rarely _very_ dynamic. With the exception of social networks
and news sites, most websites would be largely indistinguishable from what
they look like now if they were serving the same page as yesterday or a week
ago. Technologically we could trade the 'rendered on request' model with the
very latest data for a 'rendered and then cached across the internet for a
day' distributed model very easily.

Psychologically however, no one would agree to it. People believe their
content has to be available to everyone the instant they press Publish.

~~~
mrsharpoblunto
I disagree - aside from blog sites or marketing/portfolio type sites, basically
every web application on the modern internet requires per-user dynamic content
to function - e-commerce, banking, productivity apps, social networks,
messaging, etc. The modern web is all about displaying and processing user
data.

Really, any distributed cache is only going to be useful as a replacement for
existing CDNs serving static assets (and that's totally fine - it would be
great to see the democratization of the performance & scalability of a global
CDN)

------
alexchamberlain
A little bit confused here. HTTP is already distributed, right? It's an open
protocol with a very large number of servers serving a variety of content with
little to no coordination.

~~~
zokier
HTTP is federated more than distributed.

~~~
Zash
Email is federated. Not sure I'd call HTTP federated in itself, but protocols
like StatusNet that build on HTTP are federated.

------
moron4hire
> Finally, cutting the server out of the equation is seen as an opportunity to
> reset the Web’s balance of power regarding cookies and other forms of
> tracking; if you don’t request content from its owner, but instead get it
> from a third party, the owner can’t track you.

This line is somewhat concerning. Off the bat, I imagine it's not outside
the realm of possibility that CDNs could collude with "trackers". Yes,
idealistically, a completely distributed model performs in such a way that it
provides no favor to any particular individual. But I don't think it is
anywhere near proven that the free-rider problem won't skew the issue towards
providing trackers a _de facto_ centralization.

Also, this doesn't eliminate other bottlenecks, like tapping into internet
exchange hubs. If there are too many CDNs with which to collude, there are
certainly not too many hubs. It works better for them, actually, as they can
start dragnetting traffic completely outside of even more modern tracking
systems like Super Cookies.

We are long past the point where people can intuitively understand what data
they leak on the network.

------
Kalium
A more fundamental question, I think, is "Will there be effective distributed
authority?". So far, this is problematic at best.

~~~
ilaksh
We should supersede authority with better technology, such as Bitcoin.

~~~
Kalium
Being annoying to attack is not superior to the level of trust and certainty
made possible with technologies such as DANE and DNSSEC. Which is to say that
I agree with you that we should replace authority with better technology, but
bitcoin is not such a better technology.

------
ilaksh
We are going to get a distributed something. He mentions a lot of the existing
efforts.

I think these are tough problems but actually mostly solved in different
projects that are out there. The hardest part is making the ideas work
together and agreeing on protocols.

The solutions that become popular could really help quite a few people. I see
it as possibly being the key to society's overall struggle for effective
organization.

Right now I believe we need a small number of very flexible distributed
protocols to be used as widely as possible, and have most if not all other
systems built on top of them. That will mean a high degree of automation in
systems integration while supporting diversity and freedom for systems to
evolve. If we can do that and solve problems like privacy, synchronization,
and latency issues at the same time, we could leverage that type of system for
addressing things like inequality and efficient use of resources.

------
tracker1
One of the things I wish browsers supported would be signed content... i.e.,
your CDN/distributed content doesn't need to be on HTTPS, it just needs a
header with a payload signature against the HTTPS cert that the application
host uses... I don't know why such a beast was never introduced into the
browsers.
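The shape of that check is simple to sketch. This is a toy model: HMAC over a shared key stands in for the asymmetric signature against the origin's certificate that the comment imagines, and all names are illustrative.

```python
import hashlib, hmac

# The application origin would publish a verification key once, over HTTPS;
# non-HTTPS CDN responses then carry a signature header the browser checks.
ORIGIN_KEY = b"key-the-https-origin-published"   # illustrative

def sign_asset(body: bytes) -> str:
    # Attached by the origin at publish time.
    return hmac.new(ORIGIN_KEY, body, hashlib.sha256).hexdigest()

def accept_cdn_response(body: bytes, signature_header: str) -> bytes:
    # The browser refuses the asset if the CDN (or a MITM) altered it.
    if not hmac.compare_digest(sign_asset(body), signature_header):
        raise ValueError("payload does not match the origin's signature")
    return body

asset = b"console.log('hello');"
header = sign_asset(asset)
assert accept_cdn_response(asset, header) == asset
```

Subresource Integrity (the `integrity` attribute) later shipped a hash-based cousin of this idea, though it requires the embedding page to know each asset's hash in advance rather than trusting a signing key.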

I also feel that it would be nice to see more p2p protocols especially
regarding live chat, and other systems. I think having a self-discoverable
alternative to IRC could be a pretty nice thing... of course eventually the
bots would destroy it if it got popular.

~~~
mnot
That's actually been discussed quite a lot. However, it shares the "request
privacy problem" that I talked about in the blog entry; the mere fact that
you're requesting information -- even if it's public -- is sometimes sensitive
information.

Keep in mind that the determination of its sensitivity is often highly
contextual; e.g., something that's not a problem in your country may be
illegal elsewhere, or someone in a different situation to you may feel
differently about how their request stream should be treated.

~~~
tracker1
Fair enough... my main point was pragmatic... It would be nice to be able to
serve certain assets more decentralized and widely distributed than even, for
example cdnjs.

jQuery, React, shims for browserify, etc., would all be nice-to-haves outside
of the main payload, loadable/cacheable on a widely distributed signed system
from the browser directly.

------
ArneBab
Do you remember the download mesh? [http://rfc-
gnutella.sourceforge.net/developer/tmp/download-m...](http://rfc-
gnutella.sourceforge.net/developer/tmp/download-mesh.html)

3 additional HTTP headers for fully distributed downloading.

The server provides a URN and some IPs from participating downloaders, and the
clients can (but do not have to) swarm that from other clients. Tiger tree
hashing ensures that every chunk can be verified, the X-Alt header allows the
clients to exchange IPs among themselves (the server does not need to know
all) and an X-Nalt header gives distributed disruption avoidance.
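The client-side bookkeeping for those two headers is small. A sketch, simplified to bare host:port lists (real Gnutella implementations carry a bit more per entry):

```python
def merge_alt_sources(known, x_alt_header, x_nalt_header=""):
    # X-Alt carries peers believed to hold the file; X-Nalt carries peers
    # reported as bad, which we drop from the candidate set.
    def parse(header):
        return {p.strip() for p in header.split(",") if p.strip()}

    return sorted((set(known) | parse(x_alt_header)) - parse(x_nalt_header))

peers = merge_alt_sources([], "10.0.0.1:6346, 10.0.0.2:6346", "10.0.0.2:6346")
assert peers == ["10.0.0.1:6346"]   # the reported-bad peer is dropped
```

Each downloader repeats this merge with every response it sees, which is how the mesh spreads peer knowledge without the server tracking everyone.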

Back then I wrote a simple server which provides that:
[https://bitbucket.org/ArneBab/gnutella_tracker/src/43dc24ddd...](https://bitbucket.org/ArneBab/gnutella_tracker/src/43dc24ddda61e62a85c975439889201f45c21ba6/gnutella_tracker.py)

This doesn’t give additional privacy (aside from the effect that the server
owner does not need to know whether you just started the download or finished
it), but the users can decide themselves where they download from. And the
download mesh has proven itself with 50 million users.

------
LukeB42
Yes. It's a "steam engine time" thing where the invention manifests in
regionally disparate places in roughly the same period due to evolutionary
necessity.

Even I have an instance(!):
[https://github.com/LukeB42/Uroko/tree/development](https://github.com/LukeB42/Uroko/tree/development)

------
jokoon
> That certainly isn’t impossible, but it’s going to require a fairly
> sophisticated protocol to achieve; I’m not aware of one yet, would be happy
> to be shown otherwise.

Well this means version control. Of course you don't need fine granularity,
but version control is a good start to solve that problem.

> What about incrementalism?

Honestly I'd prefer something brand new. HTTP is plaintext just like telnet,
which in my opinion is not adequate for such a complex decentralized protocol.
If you want performance, look at how BitTorrent does it. I think performance
is important here, and since decentralization means being more vulnerable to
attackers, I think the protocol should really be designed around mitigating
attacks. But maybe I'm wrong. Telnet and HTTP are things that should not be
dealt with, in my opinion. I would gladly see them disappear, to be honest.

~~~
serge2k
HTTP/2 isn't plaintext; it's binary.

Version control is not a solution to the problem of shared state in a
distributed system.
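For a sense of what "binary" means here: every HTTP/2 frame starts with a fixed 9-octet header (RFC 7540, section 4.1) instead of ASCII request lines, which is trivial to pack and parse:

```python
import struct

def frame_header(length: int, ftype: int, flags: int, stream_id: int) -> bytes:
    # RFC 7540 sec. 4.1: 24-bit payload length, 8-bit type, 8-bit flags,
    # then 1 reserved bit + 31-bit stream identifier.
    return (struct.pack(">I", length)[1:]            # low 3 bytes = 24-bit length
            + struct.pack(">BBI", ftype, flags, stream_id & 0x7FFFFFFF))

# A DATA frame (type 0x0) with a 16-byte payload on stream 1, no flags:
hdr = frame_header(16, 0x0, 0x0, 1)
assert hdr == b"\x00\x00\x10\x00\x00\x00\x00\x00\x01"
assert len(hdr) == 9
```

Fixed-width binary framing like this is part of why HTTP/2 parses faster and more safely than the line-oriented HTTP/1.x syntax.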

------
AlphaWeaver
While it doesn't solve the whole problem of a distributed web, something
called MORPHiS has a lot of these concepts handled pretty well. It was on HN a
while back, the website is morph.is if you'd like to check it out.

------
Animats
I'd settle for distributed IP, where everyone who wants one has a fixed IPv6
address visible to others. This allows straightforward VoIP and video chat
without a server or gimmicks to get through dynamic DNS.

~~~
david_ar
Hyperboria/cjdns gives you that

------
neurohax
I think he is talking about some semantic routing layer above HTTP where you
ask for content without querying a specific source/endpoint, in the same way
that DHCP looks for an IP address. I call it the neuroweb, and I have made
some satisfying experiments in this field. The only barrier I can see has to
do with performance/scalability, since we have schemaless data structures +
realtime queries, which can easily bottleneck without some kind of efficient
specialized index.

------
lighthawk
Why is distributed HTTP needed when you could just run a local (web)server
that would act as a translator to the peer-based web which you could implement
with other protocols? That way, you can continue to use existing browsers. Now
you'd just need to implement that server...

Heck if you did it this way, you could just have remote servers actually pull
content from the regular web and serve it up through the peer web. It could be
similar to a proxy.

------
viraptor
There's a similar summary in the ietf draft:
[http://tools.ietf.org/html/draft-huang-ppsp-p2p-webrtc-
surve...](http://tools.ietf.org/html/draft-huang-ppsp-p2p-webrtc-
survey-00#section-4)

It adds some information about the architecture of each solution.

------
jheriko
I do think there is a lot to be had from p2p technologies. The centralised
server is just the obvious and easy solution in most cases...

It's a shame WebRTC is such a disaster of a project though. It could be truly
reusable and powerful... instead it's a nightmare of obsolete things, difficult
configuration and fictitious problems imposed by bad developers. A lot of this
is a legacy of being meant for all kinds of things... but we seem to be
lacking a clean and simple p2p library today.

I don't want to depend on systemd or obsolete headers MS deprecated in VC 6.

------
corobo
It's only vaguely related... do any of these new HTTPs (HTTP/2, SPDY, etc.)
support DNS SRV records for load balancing and failover at the DNS level?

------
thrownaway2424
Fundamental problem: either the distributed thing will be an extremely easy
target for DoS attacks -- just because the service is distributed doesn't mean
the attacker need be; the attacker can be as monolithic as he wants to be --
or, it will be only superficially distributed, and actually hosted on some
honking big central infrastructure.

Things that are smaller than Google, Amazon, or Facebook can be DoSed into
total oblivion. Abuse is the greatest unsolved problem of distributed systems.

------
lazyloop
No.

~~~
xj9
Why not?

