
The Decentralized Web Primer (2017) - dgellow
https://dweb-primer.ipfs.io/
======
velcrovan
So...an outline for an unfinished IPFS primer. Nothing to do with webmentions,
microformats, WebSub, or other decentralized web stuff.

~~~
bibyte
I think this is the gitbook: [https://flyingzumwalt.gitbooks.io/decentralized-
web-primer/](https://flyingzumwalt.gitbooks.io/decentralized-web-primer/)

~~~
velcrovan
right, that's the same thing? It's basically an outline since most of the TOC
links don't work/have no content.

~~~
bibyte
Are we talking about the same book? I can only see two TODOs in the TOC. All
the other links seem to be working.

------
rthomas6
I think I know what IPFS is, but nothing about its culture or who uses it, or
why. What is the use case for ipfs? Can I make a normal website on it? What
kind of stuff is on it right now? Is this one of those things that has mostly
disturbing illegal content, drugs, and fringe political stuff on it? Or is it
a more normal community?

~~~
chriswarbo
> Is this one of those things that has mostly disturbing illegal content,
> drugs, and fringe political stuff on it?

Disturbing, illegal content often relies on encryption (to hide what's going
on) and/or anonymity (to hide who's doing it). IPFS isn't about either of
those: if you serve something over IPFS, anyone can fetch it and anyone can
see that you're serving it. Presumably encryption and anonymisation can be
layered on top of this, by those who want it (just like Tor does for the
normal Web).

The main point of IPFS is content-based addressing: serving a file, like a Web
page, via IPFS gives it a URL automatically, which is just the cryptographic
hash of that file. This means that:

\- Multiple people serving the same file over IPFS will be serving the same
URL

\- Anyone trying to fetch a URL can get the file from anyone that's serving it

\- Serving a different file (e.g. updating the latest stories on a news page)
will end up with a different URL
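
Roughly, the idea looks like this (a toy sketch in Python; a plain SHA-256
digest stands in for IPFS's real address format, which is a bit more involved,
as discussed further down the thread):

```python
import hashlib

def content_address(data: bytes) -> str:
    # Toy stand-in: the address is just the hash of the bytes. Real IPFS
    # addresses are derived from a block DAG, not a single flat hash.
    return hashlib.sha256(data).hexdigest()

page_v1 = b"<html>today's stories</html>"
page_v2 = b"<html>tomorrow's stories</html>"

# Same bytes -> same address, no matter who serves them...
assert content_address(page_v1) == content_address(page_v1)
# ...and a changed page gets a different address.
assert content_address(page_v1) != content_address(page_v2)
```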

On the normal HTTP Web, each URL refers to a particular machine (usually via a
domain name which is associated with one or more IP addresses). When we fetch
a URL (e.g. example.com/cat.jpeg) on the normal Web,
we're relying on that machine (e.g. example.com) to exist, we're relying on it
knowing/understanding the request (e.g. cat.jpeg) and we're relying on the
result to be what we expected (e.g. a particular picture of a cat). Those
expectations can be violated at any point, e.g. the server may respond with
some porn rather than the cute cat picture it gave us yesterday. What's worse
is that, if something goes wrong (e.g. a domain expires, some URL becomes a
404 due to deletions or changing the server software, etc.), nobody else is
able to fix it: we're able to serve copies of content, at different addresses
(e.g. like the Internet Archive does), but users must manually seek out those
copies, and have no way of knowing if they're real.

With IPFS, anyone can choose to serve files that they're interested in. Anyone
can serve their own blog on IPFS, but _readers_ of the blog can _also_ serve
it; that way, even if the author disappears or their server explodes, anyone
fetching those URLs will still get the content they wanted (they would just so
happen to be getting it from the readers rather than the author). Since URLs
contain a hash of the content, it's trivial to check if the fetched content is
correct: just hash it and compare against the URL (IPFS might do this
automatically, I'm not sure).
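
In the toy flat-hash terms of the sketch above, that check is a one-liner
(real IPFS verifies per block, so the details differ):

```python
import hashlib

def matches_address(fetched: bytes, address: str) -> bool:
    # Whoever served these bytes, they're the author's content iff their
    # hash matches the address we asked for.
    return hashlib.sha256(fetched).hexdigest() == address

post = b"the author's blog post"
address = hashlib.sha256(post).hexdigest()

assert matches_address(post, address)                   # a reader's copy is fine
assert not matches_address(b"tampered post", address)   # altered copies are caught
```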

I like the idea of hosting my blog and git repos on IPFS because it's more
resilient than an HTTP server. I currently have a VPS dedicated to serving
HTTP, even though most of that server's capabilities are wasted since it
requires hardly any resources. Yet I daren't use it for anything else, since
any breakages will take my Web site offline. The same single-point-of-failure
problem applies to reverse proxies, load balancers and other such HTTP
endpoints.

On the other hand, I can stick an IPFS daemon on a few different computers and
have them all able to serve up those files in the background, whilst I use
them for other tasks like gaming, writing, home servers, etc. If any of them
falls over, my IPFS site doesn't care. I even had my laptop serving my Web
site via IPFS, even though it got suspended when travelling to and from my
office; that would be unthinkable for an HTTP site's availability.

Unfortunately the main IPFS implementation is currently too resource-intensive
for me to keep this up. Running it on a Raspberry Pi was difficult and it
slowed my normal laptop down noticeably, so for now I've stopped running the
daemon. I keep an eye on new releases and bug reports about resource usage, so
maybe it will become more usable soon :)

~~~
jakeogh
"just the cryptographic hash of that file"

Is that strictly true? I am still wrapping my head around the DAG, but as far
as I can tell, if I add a "large" file and you add the same large file, they
may result in different IPFS hashes. Is that remotely accurate? More
importantly, does it matter?

Can I hand someone a 16cb96ba835448d441b02c50d783a0ea7b424df1 and expect them
to be able to find the file?

[https://github.com/ipfs/go-ipfs/issues/1953](https://github.com/ipfs/go-
ipfs/issues/1953)

[https://github.com/ipfs/notes/issues/126](https://github.com/ipfs/notes/issues/126)

~~~
chriswarbo
You're right that it's slightly more complicated than e.g. `sha256sum <
myfile`. I would claim it's still "just the cryptographic hash of that file",
albeit via a slightly more complicated algorithm than raw SHA or MD5 or
whatever.

Files get split into "blocks" of raw data of a certain size (e.g. 64k), and
each of these is hashed to get its ID. Blocks don't have to store raw data,
they can instead store references to block IDs, i.e. "concatenate the blocks
with the following hashes: 12345 abcde vwxyz ...". Those blocks are _also_
hashed to get their ID and can be referenced by other blocks, and so on.

The first type of block (containing raw data) acts as storage. The second type
of block (containing references) tells us how to reconstruct our files. The
reason IPFS does this is to increase the chance that data can be deduplicated
(for example, if we upload a new version of our blog, the unchanged pages can
reference the same blocks as the old version) which makes storage more
efficient (with a logarithmic cost for storing the blocks of references). It
also makes the storage more resilient, since blocks that appear in multiple
files (or multiple versions of a file) are more likely to be served by more
people. It also allows downloads to be performed in parallel, since we can get
some blocks from one person and other blocks from someone else.
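
Here's a rough sketch of that shape (hypothetical and flat, with a single
level of references and plain SHA-256; the real chunker, DAG layout and hash
encoding all differ):

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # e.g. 64k, as above; just an illustrative default

def block_id(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()

def add_file(data: bytes, store: dict, block_size: int = BLOCK_SIZE) -> str:
    """Split data into raw blocks, store them by hash, return the root ID."""
    # Leaf blocks: raw data, each addressed by its own hash.
    leaf_ids = []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        bid = block_id(block)
        store[bid] = block
        leaf_ids.append(bid)
    # Reference block: "concatenate the blocks with the following hashes: ..."
    # (real IPFS builds a deeper DAG for big files; one level is enough here).
    manifest = "\n".join(leaf_ids).encode()
    root = block_id(manifest)
    store[root] = manifest
    return root

def get_file(root: str, store: dict) -> bytes:
    leaf_ids = store[root].decode().split("\n")
    return b"".join(store[bid] for bid in leaf_ids)

store = {}
data = bytes(200_000)              # a few full blocks at 64k, all zeros
root = add_file(data, store)
assert get_file(root, store) == data
# Identical blocks share a single entry in the store: the deduplication win.
assert len([b for b in store.values() if b == bytes(BLOCK_SIZE)]) == 1
```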

This splitting-up-and-referencing process is deterministic, so people
uploading the same file _will_ get the same hash. However, there are options
we can change from the defaults that will cause the hash to differ; for
example, picking a different block size or hashing algorithm. Files can also
be added in a "streaming mode", which will also result in a different hash
than adding in the default way. AFAIK streaming mode makes it easier to pick
blocks from the start of a file (e.g. so we can buffer the first X% of a
video file, rather than fetching X% of the blocks from all over the file in
whatever order our requests happen to come back in).
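
With the toy `add_file` from the sketch above, both points look like this:

```python
store_a, store_b = {}, {}
data = bytes(200_000)

# Deterministic: two independent uploads with the same defaults agree.
assert add_file(data, store_a) == add_file(data, store_b)

# Non-default options (here a different block size; a different hash
# algorithm would do the same) give a different root for identical bytes.
assert add_file(data, store_a) != add_file(data, store_a, block_size=32 * 1024)
```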

> Can I hand someone a 16cb96ba835448d441b02c50d783a0ea7b424df1 and expect
> them to be able to find the file?

That's not a valid IPFS hash (e.g.
[https://ipfs.io/ipfs/16cb96ba835448d441b02c50d783a0ea7b424df...](https://ipfs.io/ipfs/16cb96ba835448d441b02c50d783a0ea7b424df1)
doesn't work). If you send that file to `ipfs add` (or `ipfs add -n` to only
get the hash and avoid actually serving it), you will get a hash that anyone
can attempt to fetch. Those attempts should work, as long as someone (possibly
you, possibly anyone else) is serving that file (just like with HTTP; although
multiple people can serve it and no permission is required, unlike the single
IP/domain of HTTP identifiers).
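
For example (a hedged sketch; it assumes the go-ipfs CLI is installed, and
`myfile` is a placeholder name):

```python
import subprocess

# `-n` (--only-hash) computes the hash without actually adding/serving the
# file, as mentioned above; drop it once you want to serve the content too.
result = subprocess.run(["ipfs", "add", "-n", "myfile"],
                        capture_output=True, text=True, check=True)

# The output line is typically "added <hash> <name>"; adjust the parsing if
# your ipfs version prints something different.
print(result.stdout.split()[1])
```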

------
bibyte
IPFS is truly an exciting project. Unfortunately in practice it is much slower
than the normal web. I am hoping that will change in the future.

In case anybody else missed it here is the gitbook link:
[https://flyingzumwalt.gitbooks.io/decentralized-web-
primer/](https://flyingzumwalt.gitbooks.io/decentralized-web-primer/)

------
comboy
Any reason the last repo updates seem to be from 3 years ago? I thought IPFS
was still an active project.

~~~
bibyte
Which repo? The latest commit on this repo is 1 year old.

[https://github.com/flyingzumwalt/decentralized-web-
primer](https://github.com/flyingzumwalt/decentralized-web-primer)

~~~
comboy
Oh, I was looking at the github link at the bottom[1] but I didn't notice it's
a link to a specific commit; the latest commit is actually from 11 days ago.
I'm sorry.

1\.
[https://github.com/ipfs/website/tree/49b7cc4cd170138388012c7...](https://github.com/ipfs/website/tree/49b7cc4cd170138388012c70ff6087b14111c1f0/content/pages/docs)

