HTTP is obsolete. It's time for the Distributed Web (2015) (blog.neocities.org)
467 points by Karrot_Kream on Sept 30, 2017 | 216 comments

I need to comment because people are missing the point... there's nothing in this text that says the web won't need servers.

Imagine that you and some friends want to launch a small local business and need to host a website. Instead of paying to host it "up in the cloud", why not plug a few Raspberry Pis into the walls at each of your houses? Between that and also seeding it from your laptops, the site should have decent coverage. Maybe you could also offer some discounts for loyal customers who choose to seed it. This site will be redundantly hosted in the location where it's most likely to be accessed.

If you're doing something more permanent, then you will want to make sure you have more stable hosts, and not just some people's cell phones. Just like you do now. It's the same thing as now. If you want to host something, make sure there is at least one computer serving it on a decent internet connection.

The difference is that with distributed web technologies, there is a smooth continuum for scaling. You don't even need to assume there is an ISP to seed! All you need is LANs. But if you want to do big things, you can harness the power of thousands of peers all streaming something that they're into.

I use syncthing and resilio sync all the time, and it works great with just 3 devices.

> Imagine that you and some friends want to launch a small local business and need to host a website. Instead of paying to host it "up in the cloud", why not plug a few Raspberry Pis into the walls at each of your houses?

Setup, updates, maintenance, tech support, and uptime guarantees, just to name a few reasons that "the cloud" is better. A service like Wordpress.com or Wix beats the self-hosted Pi on all of these counts.

I interact with a lot of non-technical small business owners and am "that tech guy" in their minds. A question I'm hearing more and more frequently is _why even bother with a website when a Facebook page is much easier and they can see people interacting with it._

Their reasons are not all that different from why many tech savvy HN readers are using a Mac instead of Linux: convenience; less shit to worry about.

Hosting anything on a Pi plugged into the wall goes in the exact opposite direction from what these people want. The centralized services are winning because they pay attention to what the market wants, they build it, and they make it easy to sign up.

I don't literally mean a Raspberry Pi. The Raspberry Pi is the Apple II of what I'm imagining. I'm talking about some next-generation stuff: picture a Firestick running a much more refined iteration of Sandstorm, with distributed apps that have hardly even been conceived of today in 2017.

If my roommate can plug a Roku into the TV, and knows how to use Ableton Live and Squarespace, there's absolutely no reason he couldn't use something like that.

And there's no reason that those non-technical people couldn't continue to pay you for helping them use stuff like that.

And there's no reason that people can't continue to use stuff like Facebook. But I have a feeling that people are going to be over that way of doing things by the time the next two decades are over.

But you still have all the maintenance-related problems. How do you upgrade a hard disk or memory, go to their home? What happens if their home net is down or slow and no one can visit the site? Back in the day I was running a few web servers from my office directly, and it's a lot of extra work that is just not worth it.

There's still plenty of room for paid hosting services. But this lowers the bar in a major way.

If their home network is down, hopefully their office network isn't, or their business partners' networks aren't.

Also, these kinds of applications are practically begging for mesh networks. So what it means for a home network to be "down" could change a lot in the coming years.

Who cares?

The Internet is breaking all the time anyway. Every day at least one of the bigger / more important sites I visit has a temporary problem with something. Three days a week HN keeps returning CloudFlare errors to me. Even Facebook has some issues that break it every other week. The world isn't ending because of this, and it isn't going to end because the site I co-host with my other friend is down for the night.

As you grow to the point where close-to-perfect reliability matters, you'll be able to afford to get someone to handle the hosting for you, just like you do today.

Not denying you get these issues or errors with sites. But how come I never seem to have any issues with major sites? I do see Cloudflare stuff, but never for a top-1000 site.

Why would you need to upgrade a hard disk or memory?

Have you ever worked in a datacenter?

How many datacentres does the average "small local business" have?

For the price of hardware we're talking about, it'd just get replaced.

Let’s imagine a real path to market: some big 5 company offers a “Facebook accelerator” usb compute stick that you put on your network and it provides local access and resiliency.

You are trying to communicate a wonderful message with an extreme amount of vision to people who sling code and do devops all day. What many of them may hear is that you are asking them to do a bunch of work. But if you think 10-20 years into the future, with your server running something like a Phoenix-type environment, scaling wouldn't be as much of an issue. There are ways that updates and other problems can be abstracted away in a non-cloud environment. There are ways of running lean technologies on commodity hardware for servers.

The bottleneck I see is bandwidth and government regulation of the electromagnetic spectrum. If Ajit Pai gets his way then what you are talking about will be much more difficult.


My roommates are all self employed musicians/artists, dependent on cloud services. So yeah I'm thinking about their business needs!

I have thought about this a bit lately. Social media can be a very convenient way to keep up to date with people, and to allow people to keep up to date with you. But it has real costs.

I abandoned my Facebook account many years ago and refuse to set up a new one. As a result, it is difficult for my children to interact with me. They post their lives on Facebook, but they don't open that up to the public. So only specific Facebook accounts get to see what they share. That excludes me.

I have to open myself to Facebook to see their content. Alternatively, they could set up non-Facebook websites and "blog" their lives there. But that is harder than just using Facebook, and gets really hard if you want to keep it viewable only by select people. And then I would have to set up an account on their server to be able to see their content. Multiply that by all the people I would have to set up an account with and it's obviously unworkable pretty quickly.

I still hate Facebook, and refuse to set up an account with them. But I recognize that the alternatives are not pretty.

Edit: typos

This is why to solve the Facebook / social media silo problem, we really need to solve the web identity problem.

The solution needs to allow identities which are "register once, use anywhere" across the web, and portable so that you can migrate to a different identity host/provider/implementation without losing all your accounts. Ideally these identities would also allow you to reveal as much or as little information about yourself as you want, and not force you to reveal some unique property which is correlatable between colluding sites.

Obviously that's not an easy thing to create, and may actually be harder than creating a viable rival to Facebook, but if we don't solve the web ID problem, then governments and corporations are going to "solve" it for us, and we'll all be the worse as a result.

I liked Mozilla's Persona (BrowserID) for the fact that it protected privacy from the 3rd party provider. You're asking for privacy protections from the site seeking identity authentication.

It's a fascinating idea to have some means of identity authentication where the party seeking to authenticate your identity doesn't have enough identity information to connect the dots, and the party providing authentication doesn't even know about the party seeking authentication.

Uptime guarantees? Would that be the 25% rebate for a single month if your business is down for 7 hours or more? I have never seen people talk about it other than as a joke.

Setup, updates, maintenance, and tech support are indeed a thing, but there is always the fine print. The cloud generally does not maintain and update a website, and attacks today generally succeed by targeting poorly updated websites rather than servers. That leaves setup and tech support, two things which the website developer will often provide while they create and maintain the website.

You can abstract away the Pi and updates as easily as the cloud abstracts away everything. 90% of AWS users can't figure out, or don't care, how to distribute across multiple availability zones, much less data centers. Two Pis would be fine for 99% of this 90%.

Seems like the future to me, it's just not in any big tech company's best interest right now as they fight over data centers.

I don't know why we keep gravitating to Raspberry Pi's for this. IPFS works just fine on Windows clients. (By which I mean, it's no less stable than on Linux, neither of which runs particularly well at this early stage.)

What if publishing to your blog was as easy as installing a Chrome extension that travels with you? Suddenly every computer you have that runs the extension can pin a local copy of the site, and your visitors help out by virtue of the protocol. This doesn't exist of course, but thanks to ipfs-js, it could, very easily. There are obvious drawbacks to this approach of course, but my point is that the "Hello World" of this kind of app could easily be a much lower bar than setting up a Raspberry Pi.

Distributed systems are suited to static content or "append-only" mutable data - canonical examples include Magnet links, distributed hashtables, git, and the Bitcoin blockchain - they're all reliant on content-addressable storage. Not all web-applications can support this model, for example a banking app or online shopping cart site, which depend, respectively, on secrecy instead of complete transparency and mutable ephemeral state. What is the "Raspberry Pis in the walls" solution to those problems?

Well of course it depends a lot on your specific application.

Applications like Tox or Matrix (which uses servers, but not necessarily "centralized" servers) are great examples of dynamic p2p applications.

Or for example applications that use statically distributed javascript to facilitate dynamic p2p communications. Stuff like together.js, gun.js, freedom.js, etc.

Syncthing and Resilio Sync are also wonderful examples, and Resilio Sync has amazing encryption features: you can give out seed-only links to your data. People who use these links won't have permission to decrypt the content. They will only have permission to echo it. That's a "raspberry pi plugged into the wall at the coffee shop" solution to private, mutable content distribution.

As for the shopping cart example, this is something that could be conducive to a more centralized approach, especially if your physical distribution model is centralized and your payment system is centralized (traditional banks). In that case, you'd want to have a more direct connection with the physical distributor. If you want a direct, instantaneous connection with the shopping cart company's servers, then that's what you need.

But it's possible to have a situation where your product is not physical (like music or video), and you are using a decentralized currency (like bitcoin). There's absolutely no reason you couldn't facilitate that in a completely distributed way.

By the way, Bitcoin is a banking app... Have you ever used a browser based cryptocurrency wallet? Imagine a browser based cryptocurrency wallet that's hosted on IPFS. That's a pretty distributed banking app. If you want privacy too, use zcash or monero.

See the comment below https://news.ycombinator.com/item?id=15376665. It is not true that distributed systems are only good for static content or "append-only" data. "Mutable systems" can be built on top of immutable systems.

I agree with you, but your argument is deeply flawed. There's quite a far stretch between "Y can be built on top of X" and "X is good for Y".

To provide an argument that might fill this gap:

Most systems don't actually have a huge amount of data. Look at the data size and data growth of CRMs, special-purpose wikis, and so on: These are mostly smaller than 500 MB (excluding static content like images), and grow by less than 1MB even on a busy day. And that's the uncompressed size.

Also, most systems, despite being mutable, actually want (or need) an audit trail. So these are really append-only systems which merely have a "mutable look and feel" to the user.
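That "mutable look and feel" is easy to make concrete: derive the current state by folding an append-only log of operations, so every past value is preserved as the audit trail. A toy sketch in plain Python (no particular framework or protocol assumed):

```python
def replay(log):
    # Fold the immutable operation log into the current "mutable" view.
    state = {}
    for op in log:
        if op["op"] == "set":
            state[op["key"]] = op["value"]
        elif op["op"] == "del":
            state.pop(op["key"], None)
    return state

log = [
    {"op": "set", "key": "name", "value": "Alice"},
    {"op": "set", "key": "name", "value": "Bob"},    # a "mutation" is just a new entry
    {"op": "set", "key": "city", "value": "Berlin"},
    {"op": "del", "key": "city"},
]

assert replay(log) == {"name": "Bob"}
# Nothing is ever overwritten: the full log doubles as the audit trail.
```

The user only ever sees the replayed state, while the system underneath stays strictly append-only.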

Agreed that many CRMs etc. don't have a lot of data. And that's actually good, it makes the database size very manageable in the context of trustless, distributed networks.

I'm not following the logic of the argument here though, jumping from "X is good for Y" to "...don't actually have huge amount of data", perhaps you can elaborate?

With a merkelized append-only log (immutable DAG), there's always an audit trail. I agree with your point about "mutable look and feel", in a lot of use cases there's only a limited set of "writers" and updates happen infrequently.
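A merkelized append-only log is itself only a few lines: each entry commits to the hash of its predecessor, so any edit to history breaks every later link. A minimal illustration (a hypothetical encoding, not IPFS's actual object format):

```python
import hashlib
import json

def entry_hash(entry):
    # Hash a canonical JSON encoding of the entry.
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()

def append(log, payload):
    # Each new entry commits to the hash of the previous entry,
    # forming a merkelized append-only chain.
    prev = entry_hash(log[-1]) if log else None
    log.append({"prev": prev, "payload": payload})

def verify(log):
    # Recompute every link; any edit to history breaks the chain.
    for i in range(1, len(log)):
        if log[i]["prev"] != entry_hash(log[i - 1]):
            return False
    return True

log = []
append(log, {"op": "set", "key": "title", "value": "hello"})
append(log, {"op": "set", "key": "title", "value": "world"})
assert verify(log)

log[0]["payload"]["value"] = "tampered"  # rewrite history...
assert not verify(log)                   # ...and verification fails
```

That tamper-evidence is exactly why "there's always an audit trail" with this structure.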

Perhaps I should rephrase my previous comment, then, as "immutable systems are good for building mutable systems on top". Does that help to provide a better counter argument?

Here's my complete line of reasoning:

You can build mutable systems on top of immutable (append-only) systems. But is that a good idea? Yes, it is, for systems which don't have huge amounts of (non-static) data, and/or systems which need an audit trail anyway. And these are more systems than one may initially think.

"Here's my complete line of reasoning: You can build mutable systems on top of immutable (append-only) systems. But is that a good idea? Yes, it is, for systems which don't have huge amounts of (non-static) data, and/or system which need an audit-trail anyway. And these are more systems than one may initially think."

I disagree that immutability is a negatively defining factor here re. data size or capabilities of the database.

If you look at how many Big Data systems process data, you'll find that at the core of many is an append-only log. For example: Kafka is a log (https://engineering.linkedin.com/distributed-systems/log-wha...), and looking at Apache Samza's architecture, we can see how a log is at the core of it (https://www.confluent.io/blog/turning-the-database-inside-ou...). In less Big Data-oriented databases, there's always a log of operations (sometimes called a transaction log or replication log) to keep track of changes.

I think git is a great example of bridging the mutable/immutable gap. The "mutable" stuff happens locally in the ram, or on a local filesystem, as someone edits their files, debugs, whatever. A commit represents a save checkpoint. Somebody has decided that this state is worth snapshotting, that it would be a useful reference down the line. At this point an immutable version is made, ready to be shared.

As with git, even if a version (commit) is immutable, it doesn't mean it's worth saving. Lots of times, you might make a temporary branch locally to do some work. Then you'll merge it and push the merged version upstream. Later you might check out a new copy from upstream, not caring that your temporary working branch isn't there.
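The content-addressed core that makes git commits immutable can be sketched in a few lines: a snapshot is stored under the hash of its own bytes, so identical content always resolves to the same address and a stored version can never be silently rewritten. (A toy model, not git's real object format:)

```python
import hashlib

store = {}

def put(content: bytes) -> str:
    # Content addressing: the key is derived from the content itself.
    key = hashlib.sha1(content).hexdigest()
    store[key] = content
    return key

a = put(b"draft of my post")
b = put(b"draft of my post")  # same content -> same address, deduplicated
c = put(b"edited post")       # a new "commit" gets a fresh address

assert a == b and a != c
assert store[a] == b"draft of my post"
```

Unreferenced addresses (like an abandoned temporary branch) can simply be garbage-collected, which is exactly the "not every immutable version is worth saving" point.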

User friendly versioning is a major challenge for dynamic, distributed applications. How do we gracefully bridge the gap between long term (distributed) memory and short term (local) memory? Each specific application has its own needs and tradeoffs.

And how do applications communicate about which versions are compatible with the applications' needs? About which versions are worth holding onto?

I don't really get it. Sure, it's fine if one P2P app uses 3GB (1GB for the append-only log, 2GB for a database with indices that can actually be queried) of data. What if you have several apps? Let's say 10. Then you need 30GB, and because people only have 32GB to 64GB of storage on their phones, the discussion ends right here.

I didn't downvote you. But your data sizes are arbitrary.

Why would something like a chat or email app need to hang onto that much history?

Imagine a distributed "email" app that uses networks of mutually trusted peers to deliver encrypted messages ("emails") asynchronously. My device doesn't need to hang onto your emails indefinitely. It only needs to hang onto them until they've been received. This could be done via explicitly sending receipts, or probably in most cases by giving stuff simple expiration dates. The sender would have the most incentive to hang onto the original message until it's been delivered.

How this scales in terms of MB and GB is hugely dependent on how your application is configured, how frequently new data is emerging, the limits set by peers for how much they're willing to share, etc. But text is pretty cheap. I can't imagine storing 3GB of your or someone else's text emails on your phone, short term or long term. The Raspberry Pi plugged into the wall at your house has much more storage anyway ;)
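The receipt-or-expiration scheme sketched above can be modeled as a tiny relay queue. Everything here is hypothetical (made-up class and method names, not any real protocol), just to show why a peer's storage stays bounded:

```python
import time

class RelayQueue:
    # Toy store-and-forward relay: hold each message only until it is
    # acknowledged or its time-to-live lapses.
    def __init__(self):
        self.pending = {}  # msg_id -> (payload, expires_at)

    def hold(self, msg_id, payload, ttl, now=None):
        now = time.time() if now is None else now
        self.pending[msg_id] = (payload, now + ttl)

    def ack(self, msg_id):
        # Delivery receipt arrived: the relay can forget the message.
        self.pending.pop(msg_id, None)

    def sweep(self, now=None):
        # Drop anything past its expiration date.
        now = time.time() if now is None else now
        self.pending = {k: v for k, v in self.pending.items() if v[1] > now}

q = RelayQueue()
q.hold("m1", b"hello", ttl=60, now=1000)
q.hold("m2", b"world", ttl=10, now=1000)
q.ack("m1")        # receipt received: dropped immediately
q.sweep(now=1020)  # m2 expired at t=1010
assert q.pending == {}
```

Whatever the TTLs are set to, no relay accumulates history indefinitely; only the sender keeps the original until delivery.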

I don't really see why a CRM needs to be decentralised. You need to host it yourself to avoid a cloud vendor going out of business but other than that what problem do you solve by decentralising it?

Delivery. A cloud provides worldwide availability at the cost of trust. A distributed site can survive any one entity failing, which includes you, and it can serve from anywhere your users want it to.

You're right - you can model any data as append-only though. Granted, in many cases it will require you to seriously sit down and remodel your data. Nobody's claiming it's going to be SQL and ACID transactions :) It's more likely going to be collaborative append-only logs based on CRDTs.

There are examples of the use cases you mention being built with decentralized technologies. [1][2]

[1] The various cryptocurrency wallets and exchanges

[2] https://openbazaar.org

You're not describing a problem that users need fixed. You're describing how nice it would be for corporations to unload their own infrastructure requirements onto unsuspecting users to try to piggyback on their hardware. That's not a solution to any problem. That's in fact a problem being created for users. Consumers purchase hardware to fill their own needs, and there's nothing to be gained from wasting their resources and energy to power someone else's business.

It is a problem for sites facing censorship (gov't or corporate or social).

Isn't this the reason why they introduced Filecoin? To incentivize consumers to share their resources.

> Isn't this the reason why they introduced Filecoin?

That assumes that all users are morons. If they want to pay to use someone else's hardware and energy then they only need to pick up their wallet and offer real cash. Offering another Dogecoin competitor to convince people to give away their energy bill and hardware is just a fancy way to officially assert that the people who fall for these gimmicks are complete morons who give away their resources in exchange for glorified arcade tokens.

The "arcade tokens" can be reliably traded for cash since other people need them to access real resources.

well... discounts...

Since I've started using syncthing, I've also had similar thoughts. Say you want to start a private, static blog for a small group of friends ... why host it publicly, then lock it down? Why not just sync a shared repo?

The "problem" in this article's premise is coupling links with physical servers, for which DNS and CDNs have solved.

Distributed computing certainly has its role, but the added complexity of cache invalidation, versioning, and content synchronization is undeniable.

DNS and CDNs may have solved this problem in a limited context, but they're hardly accessible, and are too dependent on centralized infrastructure for a lot of peoples' needs.

I hear you on the added complexity of it all, and I think that's why it's taken so long for this stuff to be developed. It's been stewing for a long time. LANs have been around since before the internet, but a lot of stuff that's relatively easy on the internet is much, much harder in distributed and mesh networks. I think we're currently witnessing an epiphany.

Having the ability to recognize the same resource, and to request it, from multiple, disparate devices in just about any network setting is a totally different level.

And on a human level, I believe it could foster healthier social interactions around the sharing of digital content. When content is in its infancy, these kinds of applications create incentives, however slight they may be, to share it in closer physical proximity to people. It gives us back the awesome joy of the LAN party. Facebook says "you need to share your content via ISP and it must be stored on our servers at all times." Whereas distributed network applications invite us to imagine more humane, local, trusted environments for sharing digital content. If you want to share old family videos, get together for dinner, watch them on the TV and then share them privately over the WIFI. If you're in a band, make arrangements with local cafes to host your album. People who come for coffee will have the best bandwidth on it. If you're an artist, host your portfolio on your phone. Share it with people like you'd share a business card. If they like it, they may choose to help seed it, even without your asking. We're talking about the internet for LANs.

This stuff has the potential to refresh the art of digital collection and curation, something that's been co-opted by centralized content providers.

IPFS lets users contribute to the hosting too. I've had a lot of favorite sites go down in the past that I wish I and others could easily help host in a way that all the old links still worked.

Hey, is there a reason that you use both Resilio and Syncthing, instead of just the one?

I originally started using syncthing for my own stuff. It works well for my backup needs.

But the Android interface kinda stinks, so when I wanted to use it with "non-tech" people, I decided to just use Resilio instead. It's quite a bit more refined. I wish it was open source, but you can't have everything.

That's a fantastic answer, thank you. :)

"there's nothing in this text that says the web won't need servers"

That's, flatly, a lie. There's a whole passage in the middle about how needing servers is a weakness of HTTP that this gets beyond.

Instead, we could put Raspberry Pis in our walls? We can do that now.

The article says that http is dependent on one server per resource, and that this is a weakness.

In a distributed web you need seeds. Whether they are called "servers", "peers", "devices", or "clients" is dependent on implementation and semantics. For the purpose of my comments I chose to call dedicated seeds "servers", because it made sense to me.

And yeah, we can do that now! I've even used an old cell phone with a busted touchscreen as a low power syncthing device. Mostly it was meant to relay notes between my phone and laptop if one or the other was sleeping or dead. It worked great. Eventually, I stopped depending on it, because I have a headless machine running now (a "server") that does the same thing.

All I had to do to migrate was share my syncthing directories with the server. It was a breeze. I didn't have to worry about configuring addresses or anything.

Also, the server is useful as a relay, but if my phone and laptop are both on, say if I'm working at the library or something, they just communicate directly. So I don't depend on the server in the same way.

It's kind of funny that the example link to a "permanent" object still returns a 404 ("ipfs resolve -r /ipns/QmTodvhq9CUS9hH8rirt4YmihxJKZ5tYez8PtDmpWrVMKP: Could not resolve name.") I totally want the web to magically be distributed too, but clearly not even the author is bothering to host their IPFS content anymore...

> IPNS isn’t done yet, so if that link doesn’t work, don’t fret. Just know that I will be able to change what that pubkeyhash points to, but the pubkeyhash will always remain the same. When it’s done, it will solve the site updating problem.

Yeah, I saw that, but it's two years later and it still doesn't work? Not to be snarky, but my limited understanding is that IPNS is really the only novel part of IPFS anyway; if I just want to share a file peer-to-peer based on the hash of the content, bittorrent has existed for ages.

It just seems silly to talk about how HTTP is unreliable because your servers might go down, when the alternative "serverless" architecture you're hyping doesn't work either. I'm totally on board with the aims of IPFS and hope they accomplish all the things they're trying to do, but to say HTTP is obsolete when HTTP works and IPFS doesn't (yet) is just a little too much...

That's how these things always work, by saying "it will work one day."

If it would work, and the cost benefit ratio were there, people would adopt it quickly. That's what happens with just about everything else.

> If it would work, and the cost benefit ratio were there, people would adopt it quickly. That's what happens with just about everything else.

Great point! Just like:

* Betamax
* HD DVD
* Minidisc
* Hoverboards
* IPv6
* DNSSEC
* PGP & PKI
* Linux desktops
* Dvorak keyboards
* The metric system
* Decimal time
* [flavour-of-the-month programming language]
* [flavour-of-the-month database]
* [flavour-of-the-month cypher]
* ...

The factors that influence the proliferation of a technology are wildly divergent from the criteria 'works well, cost/benefit'. I'm not even sure those are weakly correlated proxy indicators of technology uptake.

> The metric system

While I agree with your sentiment, this one is a bad example.

I grew up with the metric system, as did the vast majority of the world. I have an intuition for "meter", "kilograms", "seconds", and so on.

I need to convert to cumbersome stuff like "miles", "inches" or "pounds" only when reading articles written by, you know, inhabitants of that strange, large country over there.

It's still a good example because the government in your country probably mandated it. People didn't just switch of their own accord.

IBTD. This is mostly an educational issue. Here in Germany the metric system was introduced in 1872 [1], and compared to other European countries we were already late to the party. That's plenty of time for a transition. The last generation who didn't grow up with the metric system has been dead for a very, very long time.

[1] The history is actually more complicated, but let's not get into that.

The metric system has been taught in US schools for decades. Still no one uses it, because no one else uses it. Breaking out of network effect traps requires coordination only a government can provide.

Nassim Nicholas Taleb on the logic of the imperial system: https://www.facebook.com/nntaleb/posts/10153932393103375

> A furlong is the distance one can sprint before running out of breath

That doesn't seem very logical at all, that's entirely subjective. I'm fairly certain a top sprinter would easily be able to sprint much further than my (admittedly) unfit self before running out of breath.

One, "furlong" actually comes from "furrow length", which is how long an ox could plow before tiring.

The point isn't that the measurement is precise, the point is that it's useful. The unit has an intuitive and tangible meaning in the real world that lets people ballpark. This doesn't mean we should start doing precision work in furlongs, but demanding that everyone switch away from measures that are still useful is silly. As long as the measurements are standardized using metric units, who cares that you have a funny name for 201.168m?

If there is one thing that people know deep in their guts today, it's how long an ox could plow before tiring.

Which is why nobody really uses furlongs anymore, but there are plenty of other units that are still in use. One example I think we're all familiar with is the "Rack Unit" for servers (i.e. 1U, 2U), where 1U is 44.45mm. I don't think there would be any additional clarity gained by saying, "I bought a few 88.9mm servers".

But that's a context specific unit, not intended for general use.

Metric is great for general use simply because of its multipliers: (...) 1 Gm = 1,000 Mm = 1,000,000 km = 1,000,000,000 m (...)

And also the simple way many units are related as well, like 1L of water having 1kg of mass (yes, with a certain temperature, pressure, yada yada yada)

I think the best situation is when you use sensible units for general situations, and when the funny units remain domain-specific.

Another example of a funny name is two-by-four, which - for some typically American reason - is understood not to actually be two inches by four inches...

> who cares that you have a funny name for 201.168m

You do care if you frequently have to convert between all those funny units.

Actually, it's pretty simple.

"Better" has to actually "be better ENOUGH" to warrant all of the retooling of existing systems. I've got plenty of clients who would happily run Windows 2003 ("it's paid for") if it weren't for changing standards that aren't compatible (newer TLS, Exchange, etc) and security breaches. They only upgrade because they have to. "E-mail is e-mail" to them.

But if you sell them some magical new technology that promises new features, like tons of data analysis tools and easy graphs and charts in a new version of CRM, they'll happily upgrade.

Another important factor for adoption/adoptability is how well the new system integrates with existing deployments of older systems. Ideally it completely interoperates with the older systems, while providing you with additional value right from the start.

Agreed with lgierth and I believe this is what sets IPFS apart from many similar technologies: integration path for existing technologies. As far as I can tell, it has been an important design decision from early on for IPFS.

That only answers some of those examples.

It's pretty visible in tech that it's not actually the only (or main) reason, especially when you see companies continuously switching from one crappy tool to another. Tech is a fashion-driven industry; companies use what is hot and/or what everyone else is using. Both of those create a positive feedback loop that amplifies brief spikes in popularity (easily exploitable through marketing) beyond any reasonable proportion.

The worst thing is, though, that it kind of makes sense from the POV of management. The more popular something is, the less risk there is in using it, especially when the decisionmaker doesn't have enough knowledge to evaluate the options. Also, the more mainstream a given technology is, the cheaper and easier it is to replace programmers.

It was a good list until...

> The metric system

Really? You know that the whole world is on it, right? And that it makes far more sense than whatever nonsense someone came up with before.

You're implying that the metric system and IPv6 aren't being used in great numbers today, which is false.

I guess the metric system is there to drive the point home to the Americans in the audience, and IPv6 as an example of something used, but not enough to matter.

(Here's my new conspiracy theory: lack of adoption of IPv6 is caused by SaaS companies colluding to keep people and companies from being able to trivially self-host stuff.)

Wow! What an ignorant view of technology adoption! Almost every revolutionary technology you see today (right from radio and AC current to personal computers and deep learning) did not work fine once upon a time. It is because people kept saying, "it will work one day", and continued working on them that we have these technologies making our lives simpler these days.

> "it will work one day", and continued working on them that we have these technologies making our life simpler these days.

You unknowingly make my point. I have nothing against people working on new technologies until they work. That's a strawman on your part.

But, this isn't a case of working on something until it works. The headline of this blog is, "HTTP is obsolete. It's time for the Distributed Web." IPFS is not ready to replace http, and it won't be until the cost vs. benefit ratio works out for enough people.

True for revolutionary new technology. Not true for incremental technology that aims to replace an existing similar technology, especially if the incremental tech is something most "normal" people don't really care about (such as hosting content on the internet).

Ask any non-technical person whether they've ever been bitten by link-rot! :)

Content-addressing doesn't alleviate the problem 100%, since content can still fall off the network - but it improves the structure of the network in a way that makes it tremendously easier to keep content around. It's not up to the original source of the content (the owner of the domain name) to keep the content around - anyone can help out by keeping a copy.

My colleague Matt addressed this beautifully in a recent talk at the NSDR Symposium: https://archive.org/download/ndsr-dc-2017/04_Speaker_3_Matt_...

A distributed web (including proper mesh networks) has the potential of changing the status quo from constantly worrying about data limits and "I don't have any wifi" to "normal" people having constant "internet" access everywhere they go.

I think they don't care much about the underlying technology, but they will notice when some apps work faster and without a mobile data connection when others don't.

> changing the status quo from constantly worrying about data limits and "I don't have any wifi" to "normal" people having constant "internet" access everywhere they go.

So will rapidly increasing data allowances and cellular coverage. Plenty of European countries have effectively unlimited data packages and effectively complete network coverage.

Crucially, whatever gaps there are in this are very likely to get filled through already underway progress much faster than a distributed web on mesh networks will get to a usable stage.

I am glad you are in a country where that seems to be the case or are just more optimistic.

The status quo, however, is colleagues of mine discussing which of the main carriers to choose to get decent 3G/LTE coverage in Berlin(!), after all these years of progress, so I think it's worth considering other options. Rural Germany is even worse. My data caps are also about the same (less than double) as they were 5 years ago.

Even if it won't take over as the dominant technology, it might create enough pressure on the carriers to act.

Couldn't people build webapps like this today by heavy caching and storing data locally?

To some degree yes, but with IPFS that becomes easier in my experience.

Even with static websites you usually need to have a web server you are able to connect to, or have to go out of your way to add a service worker that makes the site offline-capable. There, a single address that is served via an IPFS gateway behaves better with less additional tooling.

In terms of caching, think of IPFS as making every node in the network also a dynamic CDN, with content automatically moving closer to the people who use it - including into your LAN.

No it isn't. I was trying to get small and medium businesses interested in a thing called email in 1990, and it was a hard sell.

Of course it was a hard sell! The cost/benefit wasn't right for that business in 1990...

DNS does that too?

Funny? Yes. Relevant? No.

The post makes the case for why IPFS should replace HTTP -- has nothing to do with how well the technology works right now in its infancy.

I'm pretty sure that at this point in its lifetime, HTTP worked better than IPFS does now.

While this might solve distributing content, I'm still stuck transporting all my data through my single-point-of-failure + spying ISP, when all of my neighbors live within wifi range.

For a truly distributed network we should look into ways to make internet work more like a mesh.

Back in university, all the student dorms were connected to the same internal campus network. Back then the internet was slow, but you could still share files blazingly fast with all the other students on the network (using DC++ at that time). While this wasn't exactly a pure mesh either, it shows that local data works and beats the internet many times over, even with just a few thousand clients. Granted, DC++ was mostly focused on pirated content, but with a more human-friendly solution, like IPNS, it's not unimaginable that your average-Joe neighbor could, with one click, create a local mirror of the whole Wikipedia for you.

IPFS can discover other nodes in the same local network via mDNS -- you don't need to have an internet connection at all to share data locally.
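For what it's worth, in go-ipfs this is just a config toggle (key names taken from go-ipfs's config format; check the docs for your version):

```json
{
  "Discovery": {
    "MDNS": {
      "Enabled": true,
      "Interval": 10
    }
  }
}
```

With that enabled, two nodes on the same LAN find each other and exchange blocks with no internet uplink at all.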

Combine this with the recent work on PubSub [1] and CRDTs [2] and you can make many applications work locally that are otherwise annoyingly strongly coupled to internet services (think Etherpad, Google Docs, Skype, GitHub, etc.)

> For a truly distributed network we should look into ways to make internet work more like a mesh.

Yes! We'll be putting more work into the network stack (libp2p) in the coming months. IPFS itself has done a ton to rework how content is defined and moved around, and libp2p will do the same for the network connections underneath. Think overlay networks, cryptokey routing, packet switching.

[1] https://ipfs.io/blog/29-js-ipfs-pubsub/

[2] https://ipfs.io/blog/30-js-ipfs-crdts.md

Why are distributed filesystems like IPFS so popular again these days ? Freenet has been around (and super niche) for close to 20 years soon. Is it because the Bitcoin hype has reinvigorated crypto-anarchists?

Hmm, they are sort of different things. Freenet basically has hash-addressed content plus some relationship between human-readable strings and the hashes, so you/can/refer/to/stuff/like/this, making it easy to use HTTP on top of Freenet for navigation. In contrast IPFS makes the hashed content itself in charge of navigation by using git-like objects -- if you understand the way git objects work https://git-scm.com/book/en/v2/Git-Internals-Git-Objects then you understand that this allows you to navigate an immutable tree of content and an immutable history. In contrast, on Freenet particular files are immutable, but that's as much as they guarantee.
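To make the git-object analogy concrete, here's a tiny sketch (plain Python, not actual git or IPFS code) of how git derives a content address for a blob. The address is a pure function of the bytes, which is the property both systems build on:

```python
import hashlib

def git_blob_address(content: bytes) -> str:
    # git hashes a small header ("blob <size>\0") followed by the raw
    # bytes, so the address depends only on the content itself
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The same bytes always map to the same address, no matter who stores
# them or where -- that's what makes navigation by hash work.
a = git_blob_address(b"hello world\n")
b = git_blob_address(b"hello world\n")
c = git_blob_address(b"hello world!\n")
assert a == b and a != c
```

Trees and commits get addresses the same way, except their content is itself a list of other addresses, which is how you get an immutable, navigable hierarchy.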

They also differ in the way routing works. On Freenet you ask a (mostly) random neighbor whether they have a file with the hash you want. If they don't have it, they ask another (mostly) random neighbor. This can go on for a while, until it either finds the content or hits a maximum number of hops, in which case it backtracks. The only point of these rube-goldberg shenanigans is anonymity. Since IPFS is more concerned about performance it flips this on its head: instead of blindly asking nodes for content, it carefully keeps track of peers who advertise what they're looking for; aside from being much more efficient, this also allows you to choose not to do business with leechers, like in bittorrent.

Maybe IPFS is a reinvention of a past technology, but certainly not Freenet. (Does anyone know of something closer?)

Freenet paper: http://www.cs.cornell.edu/courses/cs414/2003sp/papers/freene...

IPFS paper: https://github.com/ipfs/ipfs/blob/master/papers/ipfs-cap2pfs...

>The only point of these rube-goldberg shenanigans is anonymity. Since IPFS is more concerned about performance it flips this on its head

Right, but therein lies the biggest issue that holds these systems back. Do you:

- Replicate and cache data freely between nodes, and by doing so open up scenarios where unpleasant content is stored on and served from people's nodes without their consent OR

- Limit replication and storage to elective manual choices made by the user and/or recent data they have explicitly accessed, and in doing so severely compromise the ability of your system to retain and serve data as that data ages.

IPFS in its current state is prone to as much, if not more, bit rot than the web as a whole - when nodes drop offline, the content they have pinned is unlikely to be present on any other nodes unless the original host has explicitly replicated to other nodes they also control and pin content on.

The only solution IPFS has for this currently is manual, elective pinning of content by other network participants. Realistically, if your replication and robustness scheme depends on manual user intervention it's not going to find wide adoption.

All of this is fine, IPFS still has usage scenarios it meets well when operating in the state it's currently in. But as far as some ideas being bandied about on how it's producing a censorship-resistant, bit-rot resistant persistent storage infrastructure that might replace HTTP... Nope, not unless a novel solution emerges to this specific problem.

Very interesting, thanks for the answer. I'll read the papers you linked.

> Freenet basically has hash-addressed content plus some relationship between human-readable strings and the hashes, so you/can/refer/to/stuff/like/this making it easy to use HTTP on top of Freenet for navigation. In contrast IPFS makes the hashed content itself in charge of navigation by using git-like objects [..] on Freenet particular files are immutable, but that's as much as they guarantee.

Actually there isn't that much of a difference here. Freenet manifests are analogous to git trees; knowing the CHK of a manifest file gets you to the metadata that identifies all the files under the tree. It's all immutable.

There are some noteworthy differences though. One that stands out in particular is that Freenet breaks large (>32 kB) files into chunks. Everything is encrypted too, so finding a file you want is not quite as simple as taking the hash of the plain, unencrypted file.
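A rough sketch of the splitfile idea (illustrative Python only, not Freenet's actual format, and skipping the encryption): break the content into fixed-size chunks, address each chunk by its hash, and keep an ordered manifest of chunk hashes:

```python
import hashlib

CHUNK_SIZE = 32 * 1024  # Freenet splits files larger than 32 kB

def split_and_address(data: bytes):
    """Return ({hash: chunk}, [hash, ...]) -- a chunk store and a manifest."""
    store, manifest = {}, []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        h = hashlib.sha256(chunk).hexdigest()
        store[h] = chunk
        manifest.append(h)
    return store, manifest

def reassemble(store, manifest) -> bytes:
    # fetch chunks by hash in manifest order to rebuild the file
    return b"".join(store[h] for h in manifest)

data = b"x" * 100_000
store, manifest = split_and_address(data)
assert reassemble(store, manifest) == data
assert len(manifest) == 4  # 100000 bytes / 32768 -> 4 chunks
```

The manifest itself is just more content with its own hash, so one small key is enough to reach an arbitrarily large file.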

Either way, you can easily build a git-like hierarchy of immutable content (and history) on Freenet, and this is more or less what happens under the hood anyway with manifests and splitfiles.

As a slight deviation from the norm, Freenet can also address (signed) content by its public key rather than by content hash. This is one way to enable mutable data; not entirely unlike git heads. They can still link to immutable content-hash keys.

> They also differ in the way routing works. On Freenet you ask a (mostly) random neighbor whether they have a file with the hash you want. If they don't have it, they ask another (mostly) random neighbor. This can go on for a while, until it either finds the content or hits a maximum number of hops, in which case it backtracks. The only point of these rube-goldberg shenanigans is anonymity.

It's worth pointing out that Freenet does have a simple but powerful routing system. Each network node has a virtual location in key space. Requests are routed towards the nodes closest to the requested key. With careful selection of peers, the network topology can make for very efficient routing. E.g. one could have a small number of nodes "far apart" in key space, to facilitate routing towards far-away keys. Then you have a larger number of relatively close nodes, so that when a request comes in "your general direction" from a far-away node, you're likely to have the right peer to route to.
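A toy model of that idea (illustrative Python only, not Freenet's actual algorithm): put node locations on a ring, give each node its immediate neighbors plus a couple of longer links, and greedily forward each request to whichever peer is closest to the key:

```python
N = 64  # nodes at integer locations 0..N-1 on a ring

def ring_dist(a, b):
    d = abs(a - b) % N
    return min(d, N - d)

# each node peers with its immediate neighbors plus two "far" links
peers = {i: [(i - 1) % N, (i + 1) % N, (i + 8) % N, (i - 8) % N]
         for i in range(N)}

def route(start, key):
    """Greedily forward towards the peer closest to the key."""
    path = [start]
    while True:
        here = path[-1]
        nxt = min(peers[here], key=lambda p: ring_dist(p, key))
        if ring_dist(nxt, key) >= ring_dist(here, key):
            return path  # no peer is closer: we've arrived
        path.append(nxt)

path = route(0, 37)
assert path[-1] == 37   # reaches the node closest to the key...
assert len(path) <= 12  # ...in few hops, thanks to the far links
```

The mix matters: the near links guarantee progress towards the key, while the far links keep the hop count low.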

It is true that some randomisation helps with anonymity.


I'll add that the flipside of a network that essentially enables leeching is that it's also good for retention of popular (or "popular") items. Requested data is cached en route, so it's sort of automatic load balancing. Soon enough popular resources are likely to be held by whoever is nearby. People won't need to manually pin the content, and it's hard to directly DoS those who share the content you're "after." Asking for it just makes it more available.

I think this is important to consider if we're discussing reliability of distributed networks.

I'm not saying what the implications are -- for they can be good or bad.

Thanks for the correction! I don't think manifests were mentioned in the whitepaper but I found some info on the wiki: https://github.com/freenet/wiki/wiki/Simple-Manifest.

It looks like I will also have to learn more about IPFS's routing. Clearly Freenet's has some merits, and I would hope that IPFS is strictly faster since it sacrifices privacy, but I don't understand it well enough to make sense of how it scales.

Because it's more practical. Freenet was just too slow 10 years ago. Maybe it's better now, but it could never achieve IPFS speeds.

From the article:

> IPFS doesn’t require every node to store all of the content that has ever been published to IPFS. Instead, you choose what data you want to help persist.

Freenet has a different set of design goals and tradeoffs. IPFS is much closer to BitTorrent, which AFAICT has never really waned in popularity.

For a couple of reasons:

- Because these centralized walled gardens were DOA; people just didn't realize it yet. Personally, I never saw any of the currently popular social network platforms as offering anything significant over the AOL of yesteryear.

- Now that it's been shown that Facebook and Google actively monitor and restrict content that doesn't align with their narratives, folks are becoming wary of centralization.

I'd run a freenet node if it wouldn't expose me to the risk of getting all my gear confiscated by some LEA.

I'm probably missing something: why would it?

Freenet's model places random content on every node. Sure, it may be encrypted, but if there's a sting and your IP distributes something bad during that sting, you will be investigated and possibly arrested/falsely accused.

It's the same dilemma that Tor exit node operators face. You are essentially loaning your IP out to the masses. Since this is what law enforcement uses to associate internet activity with individuals, there's a high risk that someone will eventually end up "borrowing" your IP to send something illegal.

Because Freenet is used for distributing CP.

I've been wondering that too. There isn't much allure to it beyond "our hosting/CDN will be cheaper if we let the customers pay for it". If anything it weakens privacy a little, since any mean-spirited gremlin gets to snoop on who visits and hosts what.

> There isn't much allure to it beyond "our hosting/CDN will be cheaper if we let the customers pay for it".

This is a very business/customer-centric perspective. What about resistance to censorship, ease of sharing without relying on central third-parties, resistance to linkrot of non-commercial content or even content from a business that went under, etc?

Censorship resistance is illusory without any anonymity. It'd make me a little sad if anyone gets into IPFS because of its censorship resistance when freenet and tor hidden services already exist. The central third party is preferable to potentially anyone being able to start logging who is viewing a given piece of content.

I agree that anonymity is an important aspect of censorship resistance, in particular the avoidance of being easily targeted. However, avoiding single points of failure is just as important, and this is what IPFS (currently) solves. Eventually, Tor and/or I2P integration is likely to happen.

I don't think Freenet and IPFS mutually exclude each other, though. Freenet is a specialized tool with low usability. IPFS aims to be a core, widespread infrastructural protocol with high usability. In this way, I think they can co-exist and be complementary. The existence of a harder to use and more anonymous tool doesn't negate the usefulness of having a more widely used tool with lesser guarantees.

Well, one allure for me is the fact that the topology of the Internet is getting ridiculous. Bits between nearby devices travel halfway across the globe for no reason but centralized control over the data. When I watch a cool YouTube video and want to send it to my friend, who happens to be on the same network, he shouldn't need to download it all the way from Australia again - it should just travel over the LAN.

A bit of that, a bit of new people realizing the potential and giving it their best shot. I've spoken to a lot of people who say that Freenet was fun, but just too damn slow to be usable.

Freenet and friends are certainly slow, or at least were when I tried them many years ago. It was never really fit for general adoption. IPFS differs in that it's trying to do this "at scale" in a way that may potentially be mass-adoptable.

Obsolete? Maybe not yet. But I do think putting forth effort to improving distributed protocols like IPFS could be very helpful in preventing internet censorship like we have seen from Comcast, YouTube, etc. recently.

Exactly. It's like, Woah, hold your horses there. "Obsolete"? I don't think so. To what extent can this IPFS serve an API or a dynamic database at this point? To what extent will it ever be able to do that?

I think HTTP/websockets are very good for these things. Static data is one thing, dynamic is a whole other story. It seems IPFS is just a new distributed way to archive data. So what? It doesn't help serve something like FB over a distributed network, does it?

And to what extent could some sort of "protocol" vulnerability stop these networks from being "uncensorable"? Are they truly resistant to censorship, or could they be effectively shut down somehow? Wouldn't DDoS attacks cripple these? I mean, that's a crucial flaw, right? You just have to look up all the nodes for a piece of content and constantly flood them with DDoS traffic and then, hey, you've censored the network, right?

Databases, and dynamic content in general, can be done with/on IPFS.

Take look at OrbitDB (https://github.com/orbitdb/orbit-db) - "Distributed peer-to-peer database for the decentralized web" or their blog post "Decentralized Real-Time Collaborative Documents - Conflict-free editing in the browser using js-ipfs and CRDTs" (https://blog.ipfs.io/30-js-ipfs-crdts.md).

And all that works in the browser without running a local IPFS in the background. That's pretty amazing imo.

In general? No. Just because you can does not mean you should use a distributed DB. Remember that distributed, open databases have very narrow use cases.

Leaving aside use cases like credit card information, there are a lot of user information that is illegal to share unless the user explicitly consents. In the EU you can't even share your access logs by default.

And how do you handle authentication? Passwords? How do you avoid user enumeration, or the collection of user emails and info?

Distributed filesystems and CDN in general are great, but let's use them for things that do not actually need a single bit of security, please.

> "Distributed filesystems and CDN in general are great, but let's use them for things that do not actually need a single bit of security, please."

The notion that distributed filesystems are inherently insecure, or can't be made secure, is way off. I would argue that with these technologies, such as IPFS, they can be more secure.

The use cases are not only "open databases" (by which I assume you mean open to public), private databases and data sets can be achieved just as well. Just because it's "distributed" doesn't mean it can't be private or access controlled.

Agreed on the comment re. "...illegal to share unless the user explicitly consents" and I believe this will turn out better in the trustless, distributed web, eventually. Our whole current approach is based on the client-server paradigm forcing us to put every user and their data into one massive centralized database. But we can change the model here. Instead, how about you owning your data(base) and controlling who gets to access it? "Allow Facebook to read your social graph?" "Oh, no? How about another social network app?". As a user, I would want to have that choice.

That bridges to your next point on authentication, which can be done at the protocol level with authenticated data structures. You can define who can read/write to a database by using public-key signing/verification. It could be just you, or it could be a set of keys. One good example of this is Secure Scuttlebutt (http://scuttlebot.io/). I highly recommend taking a look at and understanding the data structures underneath.
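As a rough illustration of such an authenticated data structure (toy Python; SSB really uses Ed25519 public-key signatures, and stdlib HMAC with a shared key stands in for signing here): each feed entry links to the hash of the previous entry and carries a signature, so the whole chain can be verified:

```python
import hashlib, hmac, json

KEY = b"feed-owner-secret"  # stand-in for a real private signing key

def append_entry(feed, content):
    # each entry points at the previous entry's hash, forming a chain
    prev = feed[-1]["id"] if feed else None
    body = json.dumps({"prev": prev, "content": content}, sort_keys=True)
    sig = hmac.new(KEY, body.encode(), hashlib.sha256).hexdigest()
    feed.append({"id": hashlib.sha256(body.encode()).hexdigest(),
                 "body": body, "sig": sig})
    return feed

def verify_feed(feed, key=KEY):
    prev = None
    for e in feed:
        ok_sig = hmac.compare_digest(
            e["sig"],
            hmac.new(key, e["body"].encode(), hashlib.sha256).hexdigest())
        ok_chain = json.loads(e["body"])["prev"] == prev
        if not (ok_sig and ok_chain):
            return False
        prev = e["id"]
    return True

feed = []
append_entry(feed, "hello")
append_entry(feed, "world")
assert verify_feed(feed)
feed[1]["body"] = feed[1]["body"].replace("world", "w0rld")
assert not verify_feed(feed)  # tampering breaks verification
```

With real public-key signatures, anyone can verify the feed while only the key owner can extend it, which is exactly the "unforgeable" property SSB advertises.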

The problem is not that only authorized clients can write, that is the easy part (signatures are enough for that).

The problem is limiting read access. Having a globally distributed db means that anyone can get a copy.

You can use ipfs for public data, and to store private encrypted data (with caveats: make sure you change the encryption key/nonce for every data change).

There is no way to modify private data depending on anonymous access without things like homomorphic encryption, and the whole system is completely void of any form of forward secrecy.

As someone who works with encryption and security, I cannot recommend storing anything private on distributed systems. It leaks way too much data, and there are too many caveats. You can have securely designed applications, but I see no way to safely port common websites completely onto distributed infrastructure without leaking a lot of data that today is supposed to be kept secret.

http://scuttlebot.io/more/protocols/secure-scuttlebutt.html

> "Unforgeable" means that only the owner of a feed can update that feed, as enforced by digital signing (see Security properties).

https://github.com/ssbc/patchwork

> You have to follow somebody to get messages from them, so you won't get spammed.

Doesn't that make it completely pointless because updates are still centralised? It merely shifted trusting a single provider to trusting each user which is not a scalable solution. The value add is so low you might as well just use IPNS and make people subscribe to IPNS addresses.

But it is scalable. On Scuttlebot you follow people just like you have friends in real life. I also don't need to ask the government for permission to talk to that person. That is DEcentralized for you right there.

Think of OrbitDB as a git repository where everyone has write access. Malicious nodes will spam hundreds of gigabytes of data into it until it's large enough that nobody can clone it. Even if you solve the spam problem, you still have the problem that you need to download a constantly growing dataset. The Bitcoin blockchain is already 160 GB even though its transaction throughput is anemic.

Full p2p applications can't offload computation: computations have to happen on your computer, and for computations you need the entire dataset. This is fine for messaging and static files.

Federation is a far better idea. You get the benefits of centralisation and decentralisation.

Noms is also a great example of a peer-to-peer database: https://noms.io

So, a couple of questions even after reading this...

1. The availability depends on the number of peers like BitTorrent? If so, and if no seed is available, how does one access the content, esp in the context of an intranet?

2. Any change to how we run infrastructure except not serving HTTP?

The usual solution proposed to fix number 1 is to pay someone to host your content. So basically as before, but with the added chattiness and overhead of a p2p protocol.

IMO Filecoin is the most potentially revolutionary thing to come out of the IPFS space, and it or something like it may have impacts that extend beyond keeping content distributed. We badly need a safe, self-organizing, apparently-persistent storage mechanism.

What if a "universal basic income" meant "get paid for sharing your excess disk space"? This could even be made transparent similar to OS page caches, and then everyone with a computer + internet connection would be a participant.

It's not like you upload something to the swarm then disconnect your own server and hope people will forever seed. The swarm is a backup and load balancer. You (or some hosting provider you pay) should still be the main seeder of your own content.

There are services like www.eternum.io (which I wrote) which you can pay to seed your content. Those ensure that it'll stay up even if some of your content is not very popular.

The one huge advantage of content-addressing over location-addressing (= URL with domain name and path) is that the original source isn't solely responsible for keeping the content online and the link working.

Anyone can help out by keeping a copy. With location-addressing additional copies of the content aren't just largely hidden, they also get into a weird mode of competition with the original URL. With content-addressing, additional copies instead forge stronger resilience.

> is that the original source isn't solely responsible for keeping the content online and the link working

No, the original source is still the one solely responsible, unless there's some agreement between parties to cache content for them.

Anyone can help, yes, but it's not their responsibility to keep content online unless there's some agreement between the original content source and other nodes, which is of course not the case by default.

Content-addressing doesn't alleviate the problem 100%, since content can still fall off the network - but it improves the structure of the network in a way that makes it tremendously easier to keep content around. You don't need control over the original domain name to help in distributing a certain piece of content.

My colleague Matt addressed this beautifully in a recent talk at the NSDR Symposium: https://archive.org/download/ndsr-dc-2017/04_Speaker_3_Matt_...

// edited to fix URL

This guy combined IPFS and Steem. He uses the Steem reward system to split payment between the content uploader and the storage host 75/25. There is a cost, but he figured out a way to sustainably host videos in a p2p environment.


> The availability depends on the number of peers like BitTorrent? If so, and if no seed is available, how does one access the content, esp in the context of an intranet?

There's a concept called pinning which keeps files available in a local share. You can pay other people to keep it pinned for a long amount of time, and they usually charge for the size of the file.
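A sketch of the pin semantics (illustrative Python, not IPFS's actual implementation): a node's blockstore holds both cached and pinned content, and garbage collection may drop anything that isn't pinned:

```python
import hashlib

class Node:
    def __init__(self):
        self.blocks = {}   # hash -> content (cached and pinned alike)
        self.pins = set()  # hashes that GC must never drop

    def add(self, content: bytes, pin=False):
        h = hashlib.sha256(content).hexdigest()
        self.blocks[h] = content
        if pin:
            self.pins.add(h)
        return h

    def gc(self):
        # unpinned blocks are merely cached and may be reclaimed
        self.blocks = {h: c for h, c in self.blocks.items()
                       if h in self.pins}

node = Node()
kept = node.add(b"my important site", pin=True)
temp = node.add(b"something I browsed once")
node.gc()
assert kept in node.blocks and temp not in node.blocks
```

Paying a pinning service is essentially paying someone to call `add(content, pin=True)` on a node they keep online for you.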

Does HTTP have to lose for IPFS to win? Except the cost argument, none of the points in this article prevent sites from serving things via HTTP with IPFS metadata for the long term (though I don't know how that would play with advertising).

One question that came to mind while reading this was "Why do we have to demean HTTP in order for the tenets of IPFS to succeed?"

HTTP is ubiquitous, and no purism or idealistic superiority of some other protocol is going to sway everybody to the New Hot Thing. This is not a knock on IPFS but rather a recognition of reality: you're going to have to work within the current system to supplant it. And that means not only tolerating the old thing while pushing for the advantages of the new thing, but accepting that absorption of the dynamics of the new thing within the old thing represents a victory for the new thing, even if it doesn't get the named recognition.

Maybe we're ready to evolve beyond the limitations of HTTP (and HTTP/2, which I see as a viable and feasible, if short-sighted, improvement to HTTP). How are you going to get Google, Facebook, Amazon, and everyone else to go along with you? If you offer the benefits as compatible add-ons to the existing norms, you will succeed. If you demand that we fully jettison HTTP to achieve something better, methinks you will have an insurmountably hard time.

I agree that HTTP doesn't have to die -- for many use cases it has serious flaws though, the most important being its coupling to location-addressing. You always need a domain name and path to construct a URL. If the host decides to change the path, or loses control over the domain, you're out of luck.

Fun fact: when URLs were being defined, some people expressed the opinion that location-addressing was a huge mistake to begin with. Almost all of the links and references in the respective mailing list archives are now broken.

Wonderful articulation.

But I suggest that the internet is 'centralized' not due to any technological function, but due to the nature of information, particularly the economics and business of information.

The 'protocol' used to pass information from A-B, were it changed to something better, would not yield what the author is suggesting. Google 'would still have all our stuff' - I believe.

When 'individuals' can 'run services' from a variety of physical locales, with robustness - we will see more decentralization, I think.

I believe this will happen probably by accident, gradually, as 'more tech' creeps into our homes, one day, most people will have enough 'gear' to run some kind of service from home/small office - and other places.

Finally - despite the obvious failings of HTTP ... incumbency is what it is. I'll bet we're stuck with it for a very, very long time.

If a powerful enough entity decided to change that - like Google AND Amazon together, or the US Government, or the Chinese Government ... that could change. Funny enough I think it's China that's best positioned to do it. They have the wherewithal and the tech momentum, they could do it by 'fiat' and in 10 years implement some kind of 'new, better network' that we'd all eventually move to.

Hey - America is still using 'swipe cards' and doesn't even use 'smart cards' though they've been in use for 40 years around the world :)

No. The Chinese government will fancy a 'centralized' internet over which the government has total control. It will try as hard as it can to make sure the internet is 'centralized' and dismantle anything that is distributed or P2P. Some ISPs in China don't even allow you to use a public IP. So I think this IPFS thing is most likely a huge threat to the GFW (Great Firewall) system of China. I hope this thing can eventually tear down the entire effort of Internet censorship in China.

I didn't mean to say that they would do something 'decentralized' - of course they wouldn't. But possibly something 'better' than HTTP.

One of the things about the net, is that it's transactionally open.

There's no inherent identity/or security, it was grafted on with SSL - 'kind of'.

I suggest it might have been better if identity were required to even make connections, to avoid many kinds of attacks. Hopefully, it could be done in an unbureaucratic manner, also wherein 'identities' can remain de-facto anonymous.

My main point was that 'http' is kind of old, but we're stuck with it unless a 'major power' does something about it.

I know what you mean. But this 'better' than HTTP thing would only be a dream in China. Most websites, including some very big websites in China, have no concern about security. It will take longer than you can ever imagine for them to embrace SSL.

Synchrony is an intended solution to centralisation, written back in 2015 after being in design since around 2011: http://github.com/psybernetics/Synchrony though this implementation also ships a peer-to-peer hyperdocument editor baked into the core of the UI, so the Python PoC is a bit of a mutant.

Currently aiming to release a Go implementation at some point, after canning a C implementation earlier this year in which p2p was instead accessed via a CONNECT proxy.

Is there any interest in this project?

How does IPFS deal with dynamic content? And how would you make sure that everyone uses an updated version of the website?

I was worried about that too.

Scheduled republication is my best answer so far.

If you promised to sign and republish the same file every day with a new timestamp, then people would know when they had the latest and when they didn't... they'd just have to wonder if you fell off the earth, which we sort of do already with all those abandoned free software projects online.

Republication may be cost prohibitive for large files, so instead you could republish a metadata file that pointed to the latest hash as of the metadata file's publication time.
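A sketch of that metadata-pointer idea (the names here are hypothetical, not any real IPFS API): the pointer record is tiny, so re-signing and republishing it daily stays cheap no matter how large the content it names is.

```python
import hashlib
import json
import time

def make_pointer(latest_hash, now=None):
    """Build a tiny pointer record naming the latest content hash and
    when it was published. A real scheme would also sign this record
    with the publisher's key so readers can verify freshness."""
    record = {
        "latest": latest_hash,
        "published": int(now if now is not None else time.time()),
    }
    return json.dumps(record, sort_keys=True)

# Republishing this ~100-byte record daily stays cheap even when the
# content it points at is gigabytes.
content = b"the actual large file..."
pointer = make_pointer(hashlib.sha256(content).hexdigest(), now=1443571200)
```

Readers who hold yesterday's pointer can tell at a glance whether a newer publication exists, without re-fetching the large file itself.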

For the "hit by a bus" problem (or for a server doing this automatically, the "hit by a comet" problem?), it'd be nice to include a dead man's switch from a third party, where they can publish a "FINAL -- EXPECT NO MORE UPDATES"...

But at that point you're trusting a third party. If you're willing to trust a third party, this is far easier. So that might be what we'd end up with... something like DNS providers, but suddenly they're managing indexes and metadata for hosted files? I don't know...

(Also, this has probably been worked out already by smarter people than me, I haven't looked at IPFS much, this was just a back of the napkin guess.)

See PubSub: https://ipfs.io/blog/29-js-ipfs-pubsub/

You can use it as the base for CRDT structures: https://ipfs.io/blog/30-js-ipfs-crdts.md
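For a sense of why CRDTs fit here, a minimal grow-only-set CRDT in Python (a toy illustration, not the js-ipfs implementation linked above): replicas can exchange and merge states in any order, any number of times, and always converge.

```python
class GSet:
    """Grow-only set CRDT: state is a set; merge is set union.
    Union is commutative, associative, and idempotent, so replicas
    converge regardless of message ordering or duplication."""

    def __init__(self):
        self.items = set()

    def add(self, item):
        self.items.add(item)

    def merge(self, other):
        self.items |= other.items

# Two replicas diverge, then sync in both directions and converge:
a, b = GSet(), GSet()
a.add("page-v1")
b.add("page-v2")
a.merge(b)
b.merge(a)
assert a.items == b.items == {"page-v1", "page-v2"}
```

PubSub provides the delivery channel; a structure like this provides the guarantee that everyone ends up with the same state.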

Does anyone know if Dat can be used over HTTP like IPFS can?

Yes, Dat sites can be rehosted over HTTPS. See for example https://github.com/beakerbrowser/dathttpd

Also: https://hashbase.io/

IPFS is using Filecoin, which in itself is a questionable ICO.

You got it the wrong way around - Filecoin will be built on top of IPFS, but IPFS itself works fine without Filecoin.

So, we're not commenting on "That hash is guaranteed by cryptography to always only represent the contents of that file"? Because we should be. The fundamental rule of hashing is that you are guaranteed to have collisions. The challenge is to find a good hashing function for the kind of file you're hashing, but on the internet you get EVERY TYPE OF FILE, so at the scale we're talking about for IPFS, we're dealing with a large enough number of files of every conceivable type that collisions are guaranteed.

You're thinking of non-cryptographic hashes. A cryptographic hash that could generate collisions in any practical situation would be considered broken. You'd need more files than there are atoms in the galaxy to have even a 1 in a billion chance of collision with a secure 512 bit hash.
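The claim is easy to sanity-check with the birthday bound (a back-of-the-envelope sketch, not a formal proof): the chance of any collision among n random b-bit hashes is roughly n²/2^(b+1).

```python
import math

def collision_probability(n_items, hash_bits):
    """Birthday-bound approximation of the probability of at least one
    collision among n_items uniformly random hash_bits-bit hashes:
    p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -(n_items * (n_items - 1)) / 2 ** (hash_bits + 1)
    return -math.expm1(exponent)  # accurate even when p is tiny

# Sanity check: 2^64 items into a 128-bit hash gives p ~= 0.39.
# One file per atom in the Milky Way (~1e69) hashed with a 512-bit
# function still leaves a collision probability far below 1e-9:
print(collision_probability(10**69, 512) < 1e-9)  # prints True
```

So "collisions are guaranteed" is true in the pigeonhole sense but irrelevant at any physically achievable scale.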

What about dynamic content?

Indeed it has been said that decentralization is the worst form of networking except all those other forms that have been tried from time to time.

(With apologies to Winston Churchill.)

Something like this? https://zeronet.io/

I can't tell whether the link "promising leads" was broken on purpose when the page was written in 2015 (to make a point), or if it itself is an example of centrally-served pages that slowly bit-rot over time.

An off-grid social network | https://news.ycombinator.com/item?id=14050049 (Apr 2017)

You can add redundancy to HTTP by having more servers; just add more IP addresses to the domain's A records. And files will be cached on the client (forever, if you let them).
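Concretely (a hypothetical zone snippet; the name and addresses are placeholders), round-robin A records look like this, and a Cache-Control header is what lets clients keep a file essentially forever:

```
; three servers answering for the same name; resolvers rotate among them
example.com.   300   IN   A   203.0.113.10
example.com.   300   IN   A   203.0.113.11
example.com.   300   IN   A   203.0.113.12

; HTTP response header for content that never changes:
; Cache-Control: public, max-age=31536000, immutable
```

This buys redundancy, but every mirror is still one the site owner must provision and keep in sync themselves.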

Yes, you can, but do you think that's a comparably easy thing to do? Especially when you take into account that IPFS also facilitates coordination between unrelated parties in making the caching happen.

So why aren't these guys as big as Wix, which is in the same space with crappy technology?

I would like distributed web methods like IPFS to be a w3c World Wide Web consortium standard.

Combine IPFS and cryptocurrencies for payments and we have a new distribution standard. Then one could have distributed movie sites like Netflix and YouTube where you pay royalties to legally file-share the content.

How would I use ipfs in a serverless model like AWS Lambda?

How would you deal with child pornography hosted on IPFS? What about IP infringement?

More importantly, how do you deal with a seeder that is (unwittingly) seeding such items?

Opt-in community based blacklist.

The same way you deal with them now. You go after the content discovery platform for which there is usually a small number of authoritative sources.

HTTP is not obsolete. It might be outdated, but words have meaning, and HTTP is one of the most used protocols there is; calling it obsolete is silly.

"adj. Outmoded in design, style, or construction: an obsolete locomotive. "[0]

So yes, it's obsolete.

[0] https://www.wordnik.com/words/obsolete

You are of course choosing to ignore the first definition, which says "no longer in use"

No, I'm not. Words can have multiple definitions. One definition does not apply, but a different one does. That's how language works. Hell, most adjectives mean more than one thing, that's just English.

> One definition does not apply, but a different one does.

This is you choosing to ignore the first definition. And it's NBD.

I do understand how language works. You can choose to focus on one definition and ignore another, as is clear by this thread.

> You're just being disingenuous to be a dick.

Yowza. No I'm just discussing a topic you emphasized. Don't take a disagreement as a personal attack.

"Pot, this is kettle. You're black."

If someone clarifies a meaning by saying "this is what I mean" and pointing to a dictionary definition, a more appropriate response might be "Oh, I didn't realise that's the way you meant it".

That's a sophisticated response. Though you might care to decide which meaning of "sophisticated" you think I'm referencing here.

(There are, in fact, words with widely varying, and even opposed, definitions. Arguing that someone means one that they clearly don't, in such a case, would be exceedingly uncharitable.)

Totally agree. You can make distributed apps that use HTTP. The IPFS project has interesting ideas about immutable links to data.

wipes tear from eye

It's time for Blockstack.

Ugh, servers!

Except, you're gonna have to have servers unless you've got the entire web backed up on everyone's computer. Otherwise, you don't and can't know how many copies of a page or other file are out there. But who's going to pay for servers to retain random people's and companies' web detritus? This whole project exists because that's not feasible in the long term...

It's not "crazy", it's pure, thoughtless hype.

Because IPFS uses a distributed hash table like BitTorrent, you don't have to know where stuff is--that's the problem with HTTP, which is location addressed. IPFS is content addressed: the hash is the location.
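The difference is easy to demonstrate (a toy sketch in Python; real IPFS uses multihash-encoded CIDs rather than bare hex SHA-256, but the principle is the same): the address is derived from the bytes themselves, so any node can serve them and the client can verify what it got.

```python
import hashlib

def content_address(data):
    """The name of the data *is* its hash -- no host name involved."""
    return "sha256-" + hashlib.sha256(data).hexdigest()

store = {}  # stand-in for any node's local blob store

def put(data):
    addr = content_address(data)
    store[addr] = data
    return addr

def get(addr):
    data = store[addr]
    if content_address(data) != addr:
        raise ValueError("host returned bytes that don't match the address")
    return data  # verified, regardless of which node served it

addr = put(b"<html>hello</html>")
assert get(addr) == b"<html>hello</html>"
```

Because the client re-hashes what it receives, it doesn't need to trust the node it fetched from -- which is exactly why *where* the content lives stops mattering.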

You never host anything unless you want to; you don't host random stuff.

It's 2017, if you're using Google Docs or an instant messenger program and lose access to the backbone, you can't communicate with somebody who's in the same room with you. That's kinda silly. IPFS solves that issue.

IPFS is censorship resistant because it's a distributed protocol that can use a variety of transports. If I run example.com, people can DDoS it; it's much harder to do that when hundreds or thousands of nodes have the same content and you can connect to any of them. Sure worked in Turkey: http://observer.com/2017/05/turkey-wikipedia-ipfs/.

Filecoin is a cryptocurrency that will be mined by providing storage via IPFS.

Folks may want to read the white paper before they make assumptions about what's possible and what's hype: https://filecoin.io/filecoin.pdf

Someone has to have replicated the file though, else it can be lost. Yeah, you still have the hash, but if no one is left that stored the file, what are you gonna do - brute-force search for a preimage of the hash to get your content?

As long as IPFS requires replication to be voluntary on the side of the nodes, the argument of the parent holds.

That's the same thing we've got now, except that going through the process of "replicating the file" usually means either paying a hosting company and learning how to maintain a server, or signing your control and rights over to a company like Facebook.

There's nothing preventing anyone from going through these same measures with IPFS or dat. It's just that you don't have to in order to get started hosting something.

Did you miss the part regarding Filecoin monetization? If I have content I care about, I can pay to have it hosted by other IPFS nodes.

They certainly have the funding to create a storage market that can rival what passes for distributed storage: https://www.coindesk.com/257-million-filecoin-breaks-time-re...

But that really only holds true for content that hundreds of thousands of nodes find interesting enough to hold on to. IPFS is definitely a step up in that it allows for more than just the originating party to preserve it, but this isn't too different from http mirroring save for the obvious advantages to discoverability.

"You never host anything unless you want to; you don't host random stuff."

Then explain that to the author of TFA, because they seem to imagine that websites and whatnot are just somehow going to be out there forever with IPFS, without having to rely on one's server. In reality, unless the content is popular and anyone's bothering to replicate it, it's going to still fall off the internet the moment the server's down in an IPFS-based web.

Except that with BitTorrent, in my experience, unless someone dedicates themselves to full-time "hosting" of a torrent, it will be seedless within a month or so after the initial wave of popularity.

Right, if you're running a site you'll still need to have your own infrastructure to host the authoritative source. IPFS is supposed to reduce the need for CDNs.

- Even assuming all this, a hybrid approach of HTTP + IPFS (or Dat) is still better than what we have now, since IPFS is essentially a worldwide CDN for static files. (Sorry: an inter-planetary one.)

- The content-addressing aspect makes it perfect for distributing commonly used libraries.

- We already cache all this content locally. What a waste! Why do I have to fetch jQuery from fricking California when it's sitting on my girlfriend's phone in the other room?

- This extends beyond the web: think about the benefits (both in security, practicality, and performance) of content addressing introduced into package managers (take it one step further even: combine this idea with the new move towards reproducible builds (https://reproducible-builds.org) and package managers like guix and nix and things get really interesting).

- It's actually easier to use for the average person. If you don't think this is the case I propose a simple experiment: download the beaker browser and set up a simple static site. I recently did this. It really is one-click hosting! Considering how complicated web hosting is to the average person (ever try to walk a friend through setting up a website? not. fun.) -- people would love to be able to set up personal websites this easily... and for free?

- As others have mentioned, there are many solutions being worked on for the mirroring of data (Filecoin etc).

- For websites that are visited regularly, this is not an issue -- all content is cached temporarily. It suddenly becomes basically free to serve an audience of millions... again: with one click.

- If history serves as precedent, if it does fail it would be in spite of being an objectively superior, practical solution. Getting a critical mass of people on this thing is the hardest problem to figure out. I suspect package management and academic data are the best places to start, then one-click personal hosting -- not even thinking about "apps" for now.

- Didn't you just read the web is about to go permanent? Do you really want to be archived for all history as one more nay-sayer? ;)

>since IPFS is essentially a worldwide CDN for static files.(Sorry: an inter-planetary one.)

Sorry but IPFS is interplanetary in the same way a Boeing 747 is capable of orbital flight.

Last I checked, IPFS will not tolerate minute-long latencies and requires bandwidth above several kilobits per second, which would immediately disqualify it for anything farther than the moon.

And I'm not sure it would work on the moon, since that's a 2-second latency, and I had issues with it when I used it on a mobile phone network with 800ms latency.

>I recently did this. It really is one-click hosting!

Except it isn't hosted unless at least one person keeps a copy online; otherwise it goes offline, or you pay money to some hoster or Filecoin (not that I think Filecoin isn't a huge scam at this point).

>- Didn't you just read the web is about to go permanent? Do you really want to be archived for all history as one more nay-sayer? ;)

Since the number of people interested in the content of this page declines with every passing decade, I'll bet it'll no longer be available on IPFS after a mere two decades.

> Last I checked IPFS will not tolerate minute long latencies and requires a bandwidth above several kilobits per second which would immediately disqualify it for anything farther than the moon.

> And I'm not sure it would work on the moon since that is a 2 second latency and I had issues with it when I used it on a mobile phone network with 800ms latency.

Fair points :) We'll be addressing this in the coming months with increased work on the network stack (libp2p).

> Last I checked IPFS will not tolerate minute long latencies and requires a bandwidth above several kilobits per second. [...] It isn't hosted unless atleast one person keeps a copy online.

And the vacuum tubes in my Colossus might overheat at that rate too! -- Damn, you're right, we're just not smart enough to solve those problems.


> Does sarcasm prove your point?

Fair enough, sarcastic Parthian shot removed. I get overexcited sometimes.

>And the vacuum tubes in my Colossus might overheat at that rate too! -- Damn, you're right, we're just not smart enough to solve those problems.

I was merely pointing out that it's not correct to call something interplanetary when it isn't. It's basically false advertising.

>Are those the ones that go whoosh?

Does sarcasm prove your point?

> Why do I have to fetch jQuery from fricking California when it's sitting on my girlfriend's phone in the other room?

Because your girlfriend probably values her battery life and data usage? I doubt we'll ever see phones hosting any IPFS content, for those reasons.

Funny you should say that. Just hours ago I released an app that helps you pin your most important IPFS hashes to your phone [0]. It works well together with IPFSDroid [1].

Battery usage is definitely noticeable, but my phone has good battery life overall, and IPFS on the phone is a priority for me.

Data usage would be a problem, but I only use it when connected to a portable WiFi hotspot I carry with me.

[0]: https://play.google.com/store/apps/details?id=com.hobofan.ip...

[1]: https://play.google.com/store/apps/details?id=org.ligi.ipfsd...

In a post-carrier world, she would be paid with tokens that she could use to buy faster network access later in the day or sell any surplus to heavy network users (probably indirectly through a brokerage, perhaps even run by a company that used to be a carrier). The tokens might even buy electricity from the neighbor's solar panel to charge the phone.

If data isn't embarrassingly cheap before mid-century we did something terribly wrong. Ditto electricity / battery life by the end of century.

(and again: content-addressing could drastically reduce data usage anyway.)

If I understand correctly, the argument isn't against content addressing but against sharing the content with anyone. While you have an incentive to store it locally and reach for that first, you (currently) have a negative incentive (=cost of data/electricity) to share it with someone.

Why do I have to fetch jQuery from fricking California when it's sitting on my girlfriend's phone in the other room?

Just for one? Because, if you have the ability to do that, your girlfriend's phone has the ability to detect whether anybody nearby is accessing any arbitrary file or page. It just has to host a copy of that page and see whether anyone pulls it.

I don't know the internals of the IPFS DHT implementation, but the whitepaper mentions Kademlia and Coral. Coral tries to optimize for ping latency (you're not literally fetching from your nearest geographical neighbor; I simplified to make a point).

Unless I misunderstand your point -- but honestly, it seems like people here are engaging more in "gotcha" naysaying than honest efforts at criticism... it would've taken you two minutes of googling to find out this is a non-issue.

It's not a non-issue just because people boosting a technology say it is. Ever heard of timing attacks?

I have, but I'm no security expert so perhaps I'm not seeing something obvious. -- Do you have a specific attack in mind? Is it an insurmountable vulnerability?

If so you can just explain it (or report it).

Nobody is saying we can't or shouldn't have servers.

"But who's going to pay for servers to retain random people's and companies' web detritus?"

The same people who do this now. Because nothing in this scheme says you can't have servers just like you do now.

The difference is that you don't need a server to get started collaborating. If you want to host something to have it available offline, that's a built in feature. If you and some friends want to host an event invite, grass roots, you can do that. No need for facebook, no need for big cloud platforms. Share around a web page. People will host it while it's needed.

Personally I'm more into dat than IPFS... but they each have their own use cases. I like stuff that's not going to be "permanent". There is plenty of room for stuff like that. Not everything has to be in some permanent public record for all time. We need more accessible ways to share stuff like that. We need good ways to share stuff privately. Skip this ridiculous idea that Mark Zuckerberg should be privy to everybody's private, personal information. No thanks. Share that stuff on LANs, on encrypted p2p connections. Keep it nearby.

If you want to publish, that's what IPFS is for. And if you want it to stick around, invest the resources to make sure there are servers seeding it, whether they be on Digital Ocean or a bunch of Raspberry Pis plugged into the walls at your and your friends' houses. That's on you as someone who's committed to publishing information.

"Because nothing in this scheme says you can't have servers just like you do now."

Pretty much all the defenders of IPFS here give me the impression they haven't bothered to read the article hyping IPFS that's linked at the top of this page.

I read the article.

What are you finding to be inconsistent?

I'm thinking of a "server" as a dedicated computer connected to the network. With IPFS, if you want to ensure your data is available, seed it with one or more dedicated computers, like you would now. Seed it on digital ocean or amazon even, if you want. Nothing prevents you from doing this.

In the future, there will be new ways of incentivizing groups of people to seed data that isn't naturally viral. But nothing is stopping people from using the time tested, old fashioned ways in the meantime.

I don't buy the idea that IPFS stuff is inherently permanent, but I don't care. Even the way that IPFS handles broken links is way better than how they're handled now. With IPFS you at least get a hash of what you're looking for. That's a lot more useful than what you get now. The only thing you get now is "404".

I'm seeing a specific disconnect between the people who are into it and the people who don't get it: the former have ways of answering the question "why would someone want to host your content?"

The people who are into this idea realize that the content itself often carries its own incentive to share. Given the right infrastructure, a lot of stuff will host itself because people will want to share it. That's how bittorrent works.

Also, there will likely be stuff that falls out of fashion. The test of time will not disappear, but these technologies make it much easier for people who care about preservation.

The primer is actually much better: https://ipfs.io/ipfs/QmWimYyZHzChb35EYojGduWHBdhf9SD5NHqf8Mj...

There are over 5 billion files hosted on IPFS and over 500 GB per day going through the IPFS gateway. Not bad for something that supposedly doesn't work and that's only been around for a few years.

When people can get paid to make content available on IPFS… well, that's going to be quite a thing.

I can purchase a droplet on Digital Ocean that has gigabit and a terabyte of transfer every month for $5. A terabyte is more than enough content for me. I am sure most people could afford that to manage their interneting.

>I can purchase a droplet on digital ocean that has gigabit and a terabyte of transfer every month for $5

>I am sure most people could afford that to manage their interneting

And before the spec has even seen real adoption, we've already seen it centralize into a few major providers.

A bit tongue-in-cheek, but it's a real issue. HTTP isn't the reason things are centralizing so much as economies of scale and convenience are. I see nothing about IPFS that fundamentally changes that, and think we'd likely see similar centralization over time.

Git, one of the inspirational technologies, is in theory distributed as well and in practice hyper-centralized to only a few major providers.

Network infrastructure is also a powerful driver of centralisation.

I'd much rather host content from my home (and, living in a very sunny Australian city, power that with PV and battery storage at a low amortised cost) - but I can't, because the network to support that isn't there.

I get about 7Mb/s down and 1Mb/s up - my link is highly asymmetric. When I finally get off ADSL and on to the new National Broadband Network, that'll still be asymmetric.

I can see why networks are built that way, given the current centralisation of infrastructure, but the build also reinforces centralisation.

Think back to 20 years ago when most business network connections, even for small business, were symmetric. Hosting stuff on-site (email servers, web servers, ...) was far more common.

Distributed technology keeps centralized providers honest. If github got complacent their customers could migrate their most important data in a very short time.

GitHub is complacent, but people haven't moved because it's difficult. The issue tracker is proprietary, and losing all of that plus the account references for the comments makes moving non-trivial.

> Git, one of the inspirational technologies, is in theory distributed as well and in practice hyper-centralized to only a few major providers.

Git repositories are replicated all over.

My laptop has mirrors of all my work's projects and many open source projects.

Imagine how many secure mirrors of, say, the React repository are out there. GitHub is basically just a conveniently located copy.

That's real and tangible decentralization. It's a magical property of the DVCS architecture that it's decentralized even when it's centralized, so to speak.

I agree that there are issues with central hubs though. Maybe the most significant one is that organizational structures and rules are defined in a centralized way on GitHub.

If you look at blockchains as another kind of DVCS that's totally focused on authority, legitimacy, and security, then it seems pretty likely that we'll end up using those to control commits and releases.

Git itself doesn't inherently provide the services that github does: discovery, project management, and social tools.

The kinds of tools that would make those aspects of GitHub distributable are precisely what this article is advocating for!

> who's going to pay for servers to retain random peoples' and companies' web detritus?

You mean Filecoin?

Anyway, I'm also a skeptic about this model but I do think there is a sliver of chance that it may work. There ALWAYS is a sliver of chance that something crazy may work. That's how it's always been.

IBM laughed when personal computer vendors and OS creators thought they would put computers on everyone's desk, and I'm pretty sure that if I had been around back then I would have thought the same.

Also, before criticizing some technology it would help to actually understand how the technology actually works. As far as I know, IPFS is working on all the problems you mentioned. Now whether they will succeed or not is a whole different issue, but it's not such a trivially obvious thing that one could easily say that it's a "thoughtless hype".

First, let me say that I think that IPFS is a good idea and that it has applications that are useful now. However, if I interpret the parent's "thoughtless hype" as "hopelessly naive", I'm pretty much in agreement.

Check out Freenet [0]. And while trying to maintain anonymity makes the problem even more difficult, there are fundamental problems in Freenet that make it basically unusable (mostly around cache coherency and latency). Freenet has been around since Ian Clarke's paper about distributed information storage and retrieval systems in 1999. They haven't managed to fix these fundamental problems in nearly 20 years of trying. I see absolutely no discussion of the same problems in IPFS (though abandoning anonymity is a good start).

It's one thing to say, "Hey distributed file system -- awesome". Then you can build all the easy bits and say, "Well, maybe cache coherency and latency won't be a big problem". But now look at what IPFS has to say about cache coherency on their wiki [1]. There is nothing at all that identifies or addresses the problems they will run into -- just a definition of the term and some links to random resources.

It's all well and good to say, "Eventual consistency", but what about guarantees of consistency? If I'm a vendor and I have a 1 day special offer, can I get a guarantee that caches will be consistent before my special offer is over? How do you deal with network partitions? Etc, etc, etc.

Before you start calling HTTP "obsolete", how about solving these kinds of problems? I have absolutely no problem with projects like these. They are awesome and I encourage the authors to keep working towards solving hard problems like the above. But announcing your solution before you've even realised that the problem is hard is pretty much the epitome of naivety.

[0] - https://freenetproject.org/

[1] - https://ipfs.io/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1m...

First, it was not IPFS people who said HTTP was obsolete. If you check out the original post, it's from Neocities blog.

Second, people tried "sharing economy" startups back in the web 1.0 era when everything came crashing down. But in 2017 we have Uber.

The freenet project doesn't change my argument at all because like I said, I'm not saying IPFS will succeed. I'm saying there's always a chance because the world is constantly changing. If you're lucky, you're at the right place at the right time building the right thing. If you're not, you fail.

In 1999 this wouldn't have worked of course, and that's my point. Successful projects succeed not just because of the product but also because of luck, timing, etc. There are so many new powerful technologies coming out nowadays, not to mention the societal change.

This is definitely a different world than what it was in 1999 and I'm saying just because it didn't work in 1999 doesn't mean it won't work in 2017.

One very important vector for adoption that often gets overlooked is interoperability. The cost of adoption can be significantly reduced by making sure the new thing nicely interoperates with the existing deployments. We're attempting to do this well with IPFS and libp2p.

HTTP doesn't solve cache coherency, so it's not naive for IPFS to claim to solve HTTP's issues while having orthogonal issues on the TODO list.

I agree.

In fact the “web” is consolidating onto those who have the capital to do servers at scale.

In some ways this is good as it makes computing power more accessible to the masses with good ideas but on the other hand it puts the power of what happens with that business in the hands of far fewer people.

What do you think of this actual proposal? You absolutely do NOT need the whole web replicated on every computer.

Exactly, it's like how bittorrent doesn't work at all.

BitTorrent works, but it has not supplanted centralized methods of distribution, and it doesn't work for all circumstances.

Bittorrent sure the Hell doesn't work anything like TFA imagines IPFS working...

> It's not "crazy", it's pure, thoughtless hype.

What a well thought out argument, thanks so much for your insight.
