HTTP is obsolete – it's time for the distributed, permanent web (2015) (ipfs.io)
510 points by Hakeemmidan 41 days ago | 325 comments



IPFS needs to decide what it wants to be. Is it about being a decentralized caching layer? Is it about permanently storing content? Is it about replacing my web server? Is it about replacing DNS? Is it about censorship resistance?

Right now it does none of those things well. The client chews through CPU and memory when seemingly doing nothing. If I try to download content, it is far slower than BitTorrent unless I go through a centralized gateway. If I add content it takes ages to propagate, making it utterly unsuitable as a replacement for a web server. There is no system to keep content alive so links will still die. The name system is byzantine and I don't think anyone uses it.

Unfortunately, they are now unable to pivot because they did that asinine ICO. The right thing to do is to give up on the FileCoin nonsense and build a system that solves a problem better than anything else, but that is no longer allowed because they already sold something else to their "investors".


IPFS's design makes it so that it's all of those things, or none of them. Picking one of them doesn't fit the shape of the technology. IPFS is basically the answer to the question "what is the RIGHT way to decentralize the web?". If you think about that question hard enough, then anyone can see that their way of doing it is the "right" way. It's just obvious. The problem is that all of these mutually supporting components need to be bootstrapped in order to make it work. You can't take the storage of content and federate it out across the entire internet without being able to refer to it by a content id. You can't just have content addressing by itself, because then it's inconvenient to find things on a day-to-day basis. So that means you need a DNS equivalent. And you can't assume a decentralized graph of network participants will bother to serve this information unless there's reward in it for them. So you come to filecoin. Etc.

Point being, they didn't just bite these pieces off randomly: they see a picture of how the internet _could_ work, and they're trying to realize it. If they can get it working, then boom! You have decentralized internet, and you also have a ton of bonuses that just fall out from this being the right way to do things: resistance to censorship, better archiving, reduced influence of web megacorps, etc. But you have to have it -all- to actually be better. The sum is WAY greater than the parts.

The trouble is, this statement:

> Right now it does none of those things well.

is true.

So I get what you're saying. To build user-adoption, they need to find a way to deliver an improved experience, not just an improved model that would be better if more people used it. But I object to the idea that the solution is to choose one of those things at the exclusion of the others. The whole idea doesn't make sense if they choose one. If I were advising them, I wouldn't tell them to reduce their scope in terms of "doing all the things", but rather reduce their scope in terms of doing all the things for the entire internet. They should find some kind of sub-network or community that gets extra value out of the decentralization, and prove out the concept there. Maybe it's a big company's intranet, or a network of (paging ARPA) universities?


> IPFS is basically the answer to the question "what is the RIGHT way to decentralize the web?"

There is no RIGHT way to decentralize the web. I don't think IPFS is the right way to do it either.

Tim Berners-Lee's Solid (https://solid.mit.edu/) offers a much more practical path to a decentralized web. The advantages of Solid's approach over IPFS are that:

Solid doesn't throw out what we already have, nor recommend a new layer on top of the internet (example: IPNS).

Solid handles access control, which pretty much every application needs (encryption is, by the way, a poor substitute for access control).

Solid has the ability to revoke access, and delete data (very important).

It can work in browsers without extensions.

Solid is not muddied with talk of the Blockchain. It's disappointing that cryptocurrency has very nearly hijacked this space.

Solid is conceptually simple. You own a pod that has a unique address (using familiar schemes). You put your stuff on it and allow access to people; like Dropbox, but standards-based. Companies can offer paid hosting services to run your pod - more space, bandwidth, etc.
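
To make that concrete, here's a rough sketch of what reading something from a pod could look like from a client's point of view (the pod URL and resource path below are made up, and real Solid clients also handle authentication for non-public resources):

    # Illustrative sketch: a pod is "just" an HTTPS server you control, addressed
    # with familiar URL schemes. Reading a public resource is an ordinary GET.
    # (The URL below is invented for the example, not a real pod.)
    import requests

    resource = "https://alice.example-pod-host.com/public/notes/hello.ttl"
    resp = requests.get(resource)
    resp.raise_for_status()
    print(resp.headers.get("Content-Type"))
    print(resp.text)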

IPFS is not commercialization friendly. IPFS performance is unlikely to be great, ever.

Disclosure: I am invested in an open protocol similar to Solid, but simpler. So not entirely unbiased.


What is that simpler protocol?


I'll do a Show HN in a month.

Although a reference implementation (https://github.com/webpods-org/podmaster) is kind of ready, there's no documentation yet (which will go up on webpods.org soon).

The only way to see the feature-set is to look at some of the tests. https://github.com/webpods-org/podmaster/blob/master/src/tes...

If you're interested, please email me. I'm looking for collaborators.


I realize you plan to do a Show HN but in case I don't see it, could you answer a few quick questions:

1. How is this meaningfully different to WebDAV?

2. Is the assumption that web apps export stuff to your pod from time to time, or actually use it as the primary storage? If the former, isn't it more or less the same idea as Google Takeout, if the latter how do apps handle the possibility of slow pods, outages or the need to use relational databases for storage? When building server side apps you do normally need tight control over storage.


> 1. How is this meaningfully different to WebDAV?

Webpods is more like git than WebDAV. It allows apps/users to store data in logs, and the data can be records (strings) or files. If Bob is syncing from Alice, he'd pull all entries after the commit-id up to which he has previously synced.

An app will store data in a pod (which has a unique hostname) such as instagram.jeswin.someprovider.com. Each pod can have multiple logs, such as "friends", "albums", "comments" etc. Logs have permissions attached to them, which control who can read those logs. There are similarities to WebDAV here, but again it's more like how we use git.
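
Purely to illustrate the shape of it (the endpoint paths and field names below are invented, not the actual webpods API), a sync might look roughly like this:

    # Hypothetical sketch of syncing a log from a pod since the last-seen commit id.
    # Endpoint paths and JSON field names are made up for illustration.
    import requests

    POD = "https://instagram.jeswin.someprovider.com"

    def pull_since(log_name, after_commit_id=None):
        params = {"after": after_commit_id} if after_commit_id else {}
        resp = requests.get(f"{POD}/logs/{log_name}/entries", params=params)
        resp.raise_for_status()
        return resp.json()["entries"]  # each entry: {"commitId": ..., "data": ...}

    # Bob syncs Alice's "albums" log from wherever he left off.
    last_seen = None
    for entry in pull_since("albums", last_seen):
        print(entry["data"])
        last_seen = entry["commitId"]  # remember the high-water mark for next time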

> Is the assumption that web apps export stuff to your pod from time to time, or actually use it as the primary storage? If the former, isn't it more or less the same idea as Google Takeout, if the latter how do apps handle the possibility of slow pods, outages or the need to use relational databases for storage? When building server side apps you do normally need tight control over storage.

Apps are expected to be local-first, though they aren't forced to be. You'd write to the local database, and simultaneously sync with the pod. Similarly, if you're pulling data from friends, those would (most likely) be stored locally as well.

Slow pods are a problem, but I hope people would generally prefer reliable pod service providers. In the same way Dropbox gives you some guarantees of reliability. If the app is designed to be local-first, the user is not immediately prevented from using the app while the network is down; and syncing can happen once connectivity is regained.

Relational databases and schemas are not supported on pods; it's just an immutable log. Most apps should do event-sourcing (https://martinfowler.com/eaaDev/EventSourcing.html), wherein they write to an event log. But this stream (of events) could be processed into a more easily queryable view.
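
As a tiny, generic example of that event-sourcing shape (not webpods-specific): the pod only ever sees an append-only list of events, and the app folds them into whatever queryable view it needs.

    # Generic event-sourcing sketch: append immutable events, fold them into a view.
    events = []  # the append-only log that would live on the pod

    def append(event):
        events.append(event)  # never update or delete, only append

    def current_friends(log):
        """Fold the event stream into an easily queryable view."""
        friends = set()
        for e in log:
            if e["type"] == "friend_added":
                friends.add(e["name"])
            elif e["type"] == "friend_removed":
                friends.discard(e["name"])
        return friends

    append({"type": "friend_added", "name": "alice"})
    append({"type": "friend_added", "name": "bob"})
    append({"type": "friend_removed", "name": "bob"})
    print(current_friends(events))  # {'alice'}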

Of course, this won't work for all kinds of apps. It works well for apps handling personal data or for collaboration tools; such as slack, project management tools, instagram, google photos, music collections etc. On the other hand, it's not a good fit for apps in which the data needs to be centralized. Such as ecommerce, banking, insurance, delivery services etc.


I see, thanks. Sounds like it's a sort of Apple Time Machine but for the rest of us?


I tried to use Solid, but having a protocol to store your data did not seem very useful without being able to swap the applications that use them at will. Each application needs to understand not only the Solid protocol, but the format you are using for your data too.

The specification naively says that the data is saved in interoperable formats. Sure, you can store your data in an interoperable format, suppose it is JSON, but it is of little use if the various applications do not know how to interpret and correctly manage the information contained therein.

It's been a while, have the applications improved?


> Each application needs to understand not only the Solid protocol, but the format you are using for your data too.

I don't think we'll be able to avoid that hurdle; we'll need to make sure that the protocol is really simple.

But having to know the data format of the app itself is to be expected. If an app "instagram-on-solid" stores data in a certain way, the alternate app will need to understand those schemas as well to be compatible. That's how interop has always worked, even in the pre-internet age when we were exchanging files on disk.

> It's been a while, have the applications improved?

I haven't looked at apps in a while - but that was indeed moving very slowly.


I would have preferred IPFS to do the storage and propagation of data really well and not implemented stuff like IPNS in the core.

To me this should be a separate codebase. And I see this in other features they have been including too.

I'm using IPFS quite heavily to store content generated on https://pollinations.ai but there is no way I could run it without a centralized node that I have control over at the moment because otherwise it would just be painfully slow and unusable.

I love the idea of content addressed decentralized data. I have no real use for IPNS at the moment and many of the other features of the core IPFS stack.


Have you had a chance to try Skynet? It provides all of the same features as IPFS, but it's also a lot more performant and higher uptime. You wouldn't need a centralized node that you run yourself.


Do you have any good links to Skynet? I could only find an old GitHub repo. Is it in active use?


Check out the pinned repos: https://github.com/SkynetLabs

Also https://docs.siasky.net/

Very much an actively developed project, with over 100,000 monthly active users.


The second link claims that it will store data without payment and allow extracting money in the future, with 0 mention of who is paying for the space. This is so suspicious that I almost cannot believe this is not addressed immediately.


Skynet operates on a freemium model where users get 20mbps and 100 GB of storage for free and significantly more if they pay (80mbps and 1 TB for $5/mo). This is what pays the bills, and also how things stay pinned.


I found this, I think it's what @Taek's referring to - https://siasky.net


I haven't used IPFS, so I was trying to see what the IPFS image URL would look like. But when I tried to inspect the images (right click -> inspect element), Chrome selects the whole div (class=MuiCardContent-root) in the Elements tab in the web console. I found that it's happening because of (style="pointer-events: none;"). When this style attribute is removed in the web console, I'm able to inspect individual elements like images and paragraphs correctly. Just out of curiosity, do you know why (style="pointer-events: none;") is used here?


Check a URL with results: https://pollinations.ai/p/QmbhMnkgrqqwQ39BP7S3F8qNSqHueaQz42...

The landing page is a bit of a mess


The problem is their goal isn't to do these things well by the usual metrics, it's to do them using a decentralised implementation. It's the implementation model that is the product, not rapid replication, not fast access, not long term access to data, not large scale storage.

The problem is that their architecture by its nature takes all the quality-related qualifiers out of those goals. Replication, but not rapid; access, but not fast; access to data, but not long-term or actually resilient; storage, but not large-scale. So its only advantage is if you value decentralisation above all other characteristics.


I agree with you, but I think that compared to HTTP or even BitTorrent, it's extremely easy to attack IPFS.

Anyone can register a lot of fake "seeders" for a file chunk, making it hard to download that chunk.

Same problem for the name server.


As in government, even if you have the mythical unicorn, the actual behavior of the users is what decides what is right and will push that forward.


Note that the Filecoin network (which was designed to be the incentive layer for IPFS storage) has been operational for some time. If you look at the current status at https://file.app/ , you can see that storage costs there are extremely low for large amounts of data. If you can get your data verified as open, public data by applying for datacap with a Filecoin+ notary, it's currently free. See https://plus.fil.org/ (you can get 32GB of free datacap to play with just for having a github account).

If you want to use the Filecoin network as a "provider of last resort" for IPFS data, there's https://estuary.tech which will mark your data as verified, sort out the deals with storage providers, and then mirror it to IPFS.

There's also third-party tools like https://fission.codes/ , https://docs.textile.io/powergate/ , https://web3.storage/ and https://www.pinata.cloud/ for making this easier.

(Disclosure: I work at the Filecoin Foundation.)


> If you look at the current status at https://file.app/ , you can see that storage costs there are extremely low for large amounts of data.

We were looking at this at work the other day.

We noticed the storage price vs S3 saying "0.03% the cost of Amazon S3" and then someone (who's been trying to get adequate performance out of IPFS for a while) said "0.03% of the price, and even lower performance".


Imagine if we said that about web sites at the beginning. The Web needs to decide what it wants to be? A platform to sell stuff? Contact people? Write? Listen to music?


But the web wasn't about listening to music at the start! It started with organizing documents on a network, at CERN. And it took over existing document platforms by adding a simple point-and-click browser user interface to them, inspired by HyperCard (where do you think the "hyperlink" got its name?)

The modern, more economic web wouldn't come until Netscape added form fields and cookies, at the behest of some of the original owners. And there were a ton of people at Netscape making these decisions about their vision of the future of the web. In-browser music listening wouldn't come until Macromedia, Disney and Microsoft pushed their vision for a "multi-media web"; browsers wouldn't build native support until much, much later.

So yes, we absolutely decided what the web would be about, and built technology to match that vision.


> inspired by HyperCard (where do you think the "hyperlink" got its name?)

I'm as much of a HyperCard fan as (almost) anyone else, but that is almost certainly not where the term "hyperlink" comes from. Ted Nelson used the word "link" back in the mid 1960s, in the context of another coinage of his, "hypertext". The historical record is already a little unclear about whether or not he was using hyperlink that early, but by the time HyperCard came to be, the term was already differentiated from a "simple link", with some level of implication caused by the "hyper" prefix that it was most likely on another computer/server. The most HyperCard could offer was a link into a different stack.

The "hyper" prefix predated Hypercard, and it's meaning in the context of information processing/retrieval/presentation meant more than the majority of links that HyperCard offered (even though they were also great). Yes, I know that the wikipedia page on the word "hyperlink" claims that HyperCard "may have been the first use", but the cited reference for that claim offers no evidence for it whatsoever.


I remember a whole fascinating section about various hypertext systems in a mid-80s issue of Byte. I spent hours poring over the screenshots in it.

EDIT: here's a good summary article of pre-WWW hypertext systems from the 80s https://fibery.io/blog/hypertext-tools-from-the-80s/


Exactly. To get the beginnings of real adoption, a technology has to do something better. A specific thing. Better enough that people switch.

The early web's competition was things like FTP, Gopher, and email-driven apps (e.g., Listservs, the Usenet Oracle). Plus paper-based stuff, like department phone books, mailing documents around, etc. It was hugely better than any of those for many common uses, so adoption was rapid.

Once you have a critical mass of users, then it can make sense to add other things in. But for that first audience, we can't be vague, selling some shining future that will happen eventually.


and much better, to justify switching costs.


Mosaic supported form fields, before Netscape even started. Eg, https://www.w3.org/People/Raggett/book4/ch02.html .

As I recall, one of the example CGI programs from NCSA presented a form to fill out a Papa John's order, which was then sent via the email-to-fax gateway. Which, now that I think of it, was indeed more "economic".

Cookies were definitely a Netscape thing, for profit-making - a shopping cart for MCI.


I don't see how the first two paragraphs support your conclusion here


The first two paragraphs reiterate that the early web was very much a reflection of the hypertext transfer protocol and hypertext markup language, in that it literally just handled text pages with links in them. And it did it pretty well. It wasn't designed to handle streaming video or client-side processing/page rendering via Javascript or any of the innumerable other elements added on to it later. It was designed to do one thing well.


- A complex system cannot be “made” to work. It either works or it doesn’t

- A simple system, designed from scratch, sometimes works.

- Some complex systems actually work.

- A complex system that works is invariably found to have evolved from a simple system that works.

- A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system.

Systemantics, John Gall, 1978


There are two problems with this line of argument:

1. The web succeeded but many things whose backers made similar comparisons failed. Knowing that one technology had a big impact doesn’t say that a given unproven technology will be the next one to go big. It’s more likely that you’re looking at the next Groove Networks or something like that.

2. The web was immediately useful for many people and you could get started easily. IPFS has some interesting but far from unique properties and trying to be a network increases the amount of adoption and maturity needed for it to be worth using for most people. This is especially true for peer-to-peer sharing where the most useful participation requires up-front risk and costs which many people aren’t going to want to accept. Without that, it’s basically just harder to use web hosting which may or may not be cheaper.


It's been 6 years now and IPFS is still stuck with the same problems it had at the start with no real pathway to being useful. Most technology does something useful early on. I don't know the timeline for the web but I don't imagine it involved 6 years of marketing and selling to investors while not serving any purpose well.


HTTP was invented in 1989.

Netscape was founded in 1994.

So, if you’re comfortable with either of those a starting point, 6 years is somewhere in the dot com boom.


I'd argue the WWW was instantly useful, even in its earliest days.

IPFS on the other hand is a horrible "jack of all trades" that has mediocre performance even in the best of times, and it hasn't really gotten any better since it first launched 6 years ago.

And that's not even bringing up the cryptocurrency cohort souring the project with its stench.

I don't object to the existence of IPFS, rather I prefer more efficient and focused projects instead. Someone in the comment threads mentioned Solid, which sounds like a decent decentralized information protocol or system of sorts.

And for those that want censorship resistance... who can forget Freenet? That project has been around since 2000 and seems to do a pretty bang up job, even if the performance is not much better.


Why is it that people trot out this argument in response to any criticism of any technology? You could apply exactly the same sort of reasoning to Microsoft Bob. Do you think that had the potential to be revolutionary?

I can't see the future. I can, however, look at the world and think about what I see.


Saying "imagine if we said that about a technology that doesn't suck" doesn't make a technology that clearly sucks, not suck.

The Web was a hit because it was really obviously useful for real work from the moment of its creation. IPFS is BitTorrent with magnet: links, and the seeding problems that implies, and the rest is janky nonsense.


> Imagine if we said that about web sites at the begining.

The web was fast (for documents on 56k) and extremely useful almost immediately. It was obvious to everyone watching that the technology was going to change everything.


I think you misremember. I was definitely on the net and the web with a 14.4k modem. It was not that useful. The web was a pretty small part of the net iirc until the mid-90s. I preferred IRC channels and BBSes then since I was very young and didn't have the patience for most websites to load and I couldn't instantly join the "conversation" like I could with a BBS or an IRC channel.


I do not understand the downvotes here. "It was obvious to everyone watching that the technology was going to change everything": how to disagree?


I wasn't a down-voter, but I didn't have 56k in 1994.

14.4k iirc. You wouldn't call it fast.


I think something regulatory changed about 1995, such that there were instantly tons of ISP startups, and I think that 33.6k modems were available about then.

I had a 14.4K Zoom modem, but I'm pretty sure the ISP I worked for around '95-'96 was buying lots of US Robotics 33.6 modems.

I agree that 56k came a bit later, and didn't necessarily work on any particular phone line.


IMHO what changed in 1995 was MS adding TCP/IP to Windows 95. Prior to that you had to fight with dial-up, Trumpet Winsock, PPP and PPTP to get on the internet at all. Most normal people still couldn't do it without help, but it moved into the realm of the possible.


During 95 or 96, I was explaining to customers over the phone how to set up Trumpet Winsock and MacTCP/PPP. Most people didn't instantly get Windows 95, so it wasn't the reason that the ISP existed. And it was already possible to access the internet to some extent through an established online service, I think I'd used Delphi, AOL, maybe others during high school.

Something made it feasible right then for anybody to set up a bank of modems in their apartment, to provide direct internet, and there was an explosive growth in small ISPs before they consolidated. At the time, I was kind of oblivious to the historic moment, but the one I worked for was literally a few modems in the closet of a crummy apartment downtown when I started and within months we'd moved to an office a few blocks away and were installing modems like mad.

I found this, not necessarily authoritative:

"In 1994 the National Science Foundation commissioned four private companies to build four public Internet access points to replace the government-run Internet backbone: WorldCom in Washington, Pacific Bell in San Francisco, Sprint in New Jersey, and Ameritech in Chicago. Then other telecom giants entered the market with their own Internet services, which they often subcontracted to smaller companies. By 1995 there were more than 100 commercial ISPs in the USA."

I think that was probably it - right then and there anyone could buy a pipe to the internet and connect some modems. It was around then that I heard the term "T1" which was a lot back then.

Maybe there was some connection to:

https://en.wikipedia.org/wiki/High_Performance_Computing_Act...


What was Apple's market share of desktops at the time? And whether you were talking people through a Trumpet Winsock install or I was doing it in person, it wasn't going to get done that way, just too slow. Sure, lots of physical infrastructure (modems etc...) had to be added, no disagreement there. Anyway, it's all a long time ago now. A funny aside, thinking of Apple in those days reminded me of "Cyberdog" - https://en.wikipedia.org/wiki/Cyberdog. I wonder if anybody has done a side-by-side of Cyberdog vs Safari. Things change.


>What was Apple's market share of desktops at the time?

I don't know, but probably significantly better than a few years later, due to how vast an improvement 95 was.

There really, in my view, and in some of the magazine reviews I read back then, was nothing to recommend Windows 3.1(1) except if (a) you couldn't afford a Mac, or (b) you wanted to run software that wasn't available for one.

Suddenly, once 95 got traction, people promoting Macintoshes had to make excuses for the lack of memory protection or pre-emptive multitasking, on top of the high prices. And Windows just wasn't as godawful ugly any more.

But at the moment that ISPs were all sparking into existence, I don't think the wave had quite arrived. I mean, people were getting it and I do remember vaguely the initial version of IE, but 95 wasn't even the majority of PC users for a little while.

A browser for 3.1 that I kind of remember from those days, that lapsed into obscurity, was Cello:

https://en.wikipedia.org/wiki/Cello_(web_browser)


Yea, I'm afraid that's not so. In 1993, I was running a WildCat BBS and I was way more hyped about that and its RIP graphics, lol. The only way I could get on the internet at all was through other people's university accounts, which required dial-up, Trumpet Winsock, and PPP. It was a chore to get running and was very slow on the 14.4k (and slower) modems of the day. 56k modems weren't introduced until the late 90s. So yea, between 90 and 95 other technologies seemed more appealing, like BBSs, Gopher and places like "The Well", at least to me.


In 1993, iirc, I borrowed a Mac and 2400bps modem from my high school that I used to call the library and AOL.

Before that, the way to get software (for me, because I wasn't a college student) was to go to a local computer store and copy their disks containing free or shareware.

https://en.wikipedia.org/wiki/Fred_Fish

But everything changed about 1995.


I think that the question of what user-facing purpose(s) a technology can be put to is somehow qualitatively different than what backend roles it might play. I could be wrong.


This only sounds ridiculous because you replaced the things the OP was talking about, with your own, to make it sound ridiculous.

"Is it about being a decentralized caching layer? Is it about permanently storing content? Is it about replacing my web server? Is it about replacing DNS? Is it about censorship resistance?"

Websites at the beginning did not decide to be a caching layer - they decided to be websites. They did decide to permanently store content. They did decide to use web servers. They did not decide to replace DNS. They did decide to be about censorship resistance.

Now imagine you put up a website, put some time into it, and it may or may not be up at a random time in the future. Not a product that's usable.


> There is no system to keep content alive so links will still die.

Torrent trackers solved this in a very interesting way. They created an economic system where bandwidth was the currency, incentivizing the permanent seeding of content. It was illegal to take more than you gave. I've even seen an academic paper studying their system!

Bandwidth as a currency eventually proved to be a failure. It enabled the rise of seedboxes, dedicated servers featuring terabytes of storage and connections to high capacity network links. Just like the IPFS centralized gateways you mentioned. They would eventually monopolize all seeding, removing any normal person's ability to gain currency. In some trackers, if you wanted to consume content, your only options were renting one of these seedboxes or uploading new content to the tracker. You always stood to gain at least as much bandwidth as the size of the content you uploaded. The seedboxes would monitor recent uploads and instantly download your new content from you so that they could undercut you. I suppose it was a form of market speculation.

They also failed to realize that there is no uploading without downloading. By penalizing leechers economically, they disincentivized downloading. This led to users being choosier: instead of downloading what they like, they'd download more popular stuff that's likely to provide higher bandwidth returns on their investment. Obscure content seeders would not see much business, so to speak, due to the low demand for the data. Users would stock up on popular and freeleech content so they could get any spare change they could. The more users did this, the less each individual user would get. Then seedboxes came and left them with nearly nothing.

This was eventually solved by incentivizing what was truly important: redundancy. Trackers created "bonus points" awarded to seeders of content every hour they spent seeding, regardless of how much data they actually uploaded to other users. These points can be traded for bandwidth. This incentivized users to keep data available at all times, increasing the number of redundant copies in the swarm. People will seed even the most obscure content for years and years. In some trackers, these rewards were inversely proportional to the amount of seeders: you made more when there were fewer seeders. This encouraged people to actively find these poorly seeded torrents and provide redundancy for them.
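
A toy version of that reward rule, with made-up numbers, just to show the shape of the incentive:

    # Toy model (numbers invented): hourly seeding reward that grows as the number
    # of seeders shrinks, so keeping rarely-seeded torrents alive pays best.
    def hourly_bonus(seeder_count, size_gib, base_rate=1.0):
        scarcity = 1.0 / max(seeder_count, 1)   # fewer seeders -> bigger reward
        return base_rate * size_gib * scarcity

    print(hourly_bonus(seeder_count=1,   size_gib=10))  # rare torrent: 10.0 pts/hr
    print(hourly_bonus(seeder_count=500, size_gib=10))  # popular one:  0.02 pts/hr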

We can learn from this. People should be compensated somehow for providing data redundancy: keeping data stored on their disks, and allowing the software to copy it over the network to anyone who needs it. The data could even be encrypted; there's no reason people even need to know what it is. Perhaps a cryptocurrency could find a decent application here. Isn't there a Filecoin? Not sure how it works.


> Bandwidth as a currency eventually proved to be a failure.

I'm not sure why everyone assumes that bittorrent was about currency. It wasn't, and that was its superpower. No matter how much you were able to contribute, you strengthened the system's redundancy and availability.

The only problem bittorrent had in practice was a legal one that finally led to a problem of availability of discovery.

All that filecoin hype will lead to only one thing, and one thing only: the people that are popular will decide what files are worth to the system, and the masses will simply dump everything else and forget about it.

Add an inflation for every modification of any file in the whole system, and you have a perfect way to destroy real incentives.

I don't believe in most web3 projects because they always think it's about trading files with each other. It is not. Discovery and access to data should not be limited by your financial capability to buy things. The internet exploded because people got access to vast amounts of knowledge, for basically free compared to before.


> I'm not sure why everyone assumes that bittorrent was about currency. It wasn't

Torrent trackers were and are to this day. The most successful trackers have proven to be those with a ratio economy. The late what.cd, often described as the Library of Alexandria of music, had the harshest ratio economy of them all.

> All that filecoin bullshit will lead to only one thing, and one thing only: the people that are popular will decide what files are worth to the system, and the masses will simply dump everything else and forget about it.

That's certainly a possibility. I don't know. I welcome discussion about this.

> Discovery and access to data should not be limited by your financial capability to buy things. The internet exploded because people got access to vast amounts of knowledge, for basically free compared to before.

Absolutely agree. Unfortunately, this is not possible with current copyright laws. This sort of utopia is currently only possible in underground networks such as private bittorrent swarms. It's been proven by history that some sort of incentive is necessary to get people to commit their personal resources -- storage and bandwidth -- to those networks. Ratio economies were created to address the leecher problem: users who simply downloaded what they wanted, without providing either bandwidth or redundancy in return.


Oh, very good comment. I didn't realize that trackers now incentivize availability (haven't used BitTorrent in a while). Very cool.

Reminds me of maker incentives in markets. Providing liquidity is sometimes paid for.


Yeah, I have to agree. We have working examples of immutable copy/append only web apps. Email/mailing lists didn't need crypto to succeed. Neither did bit-torrent. Baking in a coin seems like a me-too sort of move.


I think you've hit on the thing that many new techs have: they don't 100% cover the old techs. It's a bit of one and a bit of another, but not 100% of either.


What did they sell to investors? The idea that people will pay for hosting using Filecoin?


and recently, a cryptocurrency subsystem


Decentralized DNS, apparently:

https://handshake.org/


That is based on yet another shitcoin. IPFS is not.


Filecoin


Hence the "apparently"...


I like IPFS, I really do, but whenever I try to use it, it's either too slow to become usable or sometimes it plain doesn't work. I pinned a whole bunch of files on IPFS a while back to experiment with it and the system seems to work, but every time I try to fetch those resources from a location that hasn't cached the content yet, it takes several seconds to show me the HTML/JSON/PNG files.

HTTP may be inefficient for document storage, but IPFS is inefficient for almost everything else.

I like the concepts behind IPFS but it's simply not practical to use the system in its current form. I hope my issues will get resolved at some point but I can't shake the thought that it'll die a silent death like so many attempts to replace the internet before it.


I love IPFS. It's one of my favorite recent technologies, but I think people have unrealistic expectations about such a young idea.

Decentralized tech doesn't work well until the network effects build up.

IPFS has the interesting quality that the more popular a piece of content is, the easier it is to get ahold of. If millions of people were using IPFS, the most popular content would be served by many thousands of people, and finding it and downloading it would be extremely fast. Then subsequent viewings would be instantaneous because you can cache it for life.

This leaves interesting incentives for monetizing pinning and caching for less popular content.

It makes sense if you ask me. If I love a piece of music so much that I'm willing to give it to others for free then everyone benefits from being able to access it easier.

Content that people care about organically becomes more resilient and nearly impossible to remove.

Content that no one cares about is slow and inefficient because it has to be hauled out of cold storage the one time a year anyone cares.

If someone thinks that content is more important than people are giving it credit for they can host it or pay for someone else to do it.

If you have a website and you have "fans" that subscribe to you and help pin all your stuff, then your stuff becomes faster and easier to get. Your "fans" can even get paid for helping to serve your content.

So, to me, it's early days for IPFS, and the way to make it better is to try to build apps that increase its usage, so the power of the network effects is felt.


> IPFS has the interesting quality that the more popular a piece of content is, the easier it is to get ahold of.

That sounds like the worst of the current internet, but even worse


Or, you know, BitTorrent, which works perfectly fine.


There's a reason private trackers have to incentivize keeping the long tail alive.


Look at any public tracker and you'll find torrents that are very old, swarming strong.


Sometimes I try to get something that is a year old (or even 6 months) and it's stuck for weeks at 99%


Along with those that have 0 seeders.


IPFS solved that by making sure even popular stuff is difficult to get.


Thanks, quite a bit of coffee is now on my keyboard...


Dude this made my day. Thanks! xD


Survivor bias, there.


Yes, there is a reason: because they're private and have limited users.


It’s not a young idea though. The basic technology for p2p networks has been around for decades. DHTs, voting, vouching, etc. were all hot academia topics like 10-20 years ago. It’s just engineering skills at this point.

I remember popcorntime was as responsive as Netflix at the time it came out and it scared the shit out of the MPAA so they killed it with prejudice.

IPFS doesn’t have an excuse for sucking beyond a basic lack of engineering effort.


> If millions of people were using IPFS...

...then IPFS would just get even slower and use even more resources to manage the index and find content as I am pretty sure the DHT they are using doesn't scale the way you seem to think it does.


Why? IIRC from the time I studied them, DHTs scale pretty well (like log(size of the network) complexity for everything).


log(size of the network) still means it gets slower as it gets larger, without any of the aforementioned speed advantages for anything but the IPFS equivalent of Google-popularity-class content.


You seem to underestimate how slowly logarithms grow.
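
Concretely, assuming a Kademlia-style DHT where lookups take on the order of log2(n) hops:

    # How slowly log2 grows: rough order of lookup hops vs. network size
    # (Kademlia-style DHT assumption; constant factors ignored).
    import math

    for n in [1_000, 1_000_000, 1_000_000_000]:
        print(f"{n:>13,} nodes -> ~{math.log2(n):.0f} hops")
    # ~10 hops at a thousand nodes, ~20 at a million, ~30 at a billion.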


That doesn't matter, as it clearly grows faster than constant (as in, O(1)), which means that as the system gets larger it will take more time to do queries and maintain the index (which is ridiculously expensive on IPFS), not less - which was the claim we are contending is wrong, and a claim which would still be wrong even if the system somehow were magically constant. And any supposed advantages in "caching" don't fix this, except maybe for extremely popular files: at best, ones more popular than the median file, though my intuition tells me it is going to be some inverse log worse than that, and I also suspect it might be the mean file instead of the median. One should expect the number of unique files stored in the system, as well as the number of queries performed, to scale with the number of people using the system.


> think people have unrealistic expectations about such a young idea.

This is interesting given your description:

> IPFS has the interesting quality that the more popular a piece of content is, the easier it is to get ahold of. If millions of people were using IPFS, the most popular content would be being served by many thousands of people, and finding it and downloading it would be extremely fast.

This idea hasn’t been new since the turn of the century (BitTorrent offered exactly that in 2001) and nothing in that description explains why this is different than the many previous attempts. It’d be interesting to hear about how IPFS plans to maintain that without the problems with abuse and how it keeps competitive performance relative to non-P2P in a world where things like CDNs are much cheaper and easily available than they were around the turn of the century. Using P2P means giving up a lot of control for the content provider and that’s a challenge both from the perspective of the types of content offered and the ability to update or otherwise support it on your schedule.


IPFS is basically BitTorrent if all the torrents could share with each other. IPFS is as if each "torrent" is a single chunk of data instead of a siloed collection of stuff.

IPFS expands BitTorrent into a global filesystem.

You can mount IPFS on your filesystem and address files by pointing at local resources on your machine. So you could have an HTML file say `<img src="/ipfs/QmCoolPic" />`. You can't do that with BitTorrent.
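
For a sense of what that looks like in practice, here's a rough sketch against a local node (assuming a go-ipfs/Kubo daemon running with its API on 127.0.0.1:5001); the CID you get back is what a link like /ipfs/QmCoolPic points at:

    # Sketch: add a file to a local IPFS daemon and read it back by content id.
    # Assumes a go-ipfs/Kubo daemon running locally with its API on 127.0.0.1:5001.
    import requests

    API = "http://127.0.0.1:5001/api/v0"

    with open("cool-pic.png", "rb") as f:
        added = requests.post(f"{API}/add", files={"file": f}).json()
    cid = added["Hash"]  # derived from the bytes themselves, not from any location

    # Anyone can now fetch the same bytes by CID from whoever happens to have them.
    data = requests.post(f"{API}/cat", params={"arg": cid}).content
    print(cid, len(data))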


Okay, but it's not 2001 anymore. Bittorrent was useful because parallelizing uploads across a broad network increased speed to a degree that content hosts couldn't manage.

But that's not true anymore: most internet power-users are on broadband connections, many of which are symmetric, and transfer speeds up or down are no longer a limiter that pushes people towards decentralization.

So when considering a decentralized system like IPFS, the downsides of decentralization, like availability, edit control, and service support, are much more salient.

There are a lot of things that "could work if everybody uses it". You can never get there if the thing isn't desirable compared to existing alternatives.


BitTorrent v2 dedups chunks between torrents as a side effect of changes to the hashing algorithm.


> This idea hasn’t been new since the turn of the century (BitTorrent offered exactly that in 2001)

I feel like freenet (2000) is maybe a better comparison


Yes. I can't stand that IPFS basically took the Freenet model, stripped out all the anonymity & privacy, and added the weird "interplanetary" marketing push.


> If millions of people were using IPFS, the most popular content would be being served by many thousands of people, and finding it and downloading it would be extremely fast.

Except that most people outside tech would probably be using phones/tablets/crappy pcs, with upload speed that is 10% of their download speed.


It isn't that early. They have a stupendous amount of money and have been around since 2014. By now they should have something to show for their work.


> IPFS has the interesting quality that the more popular a piece of content is, the easier it is to get ahold of.

So niche content is difficult to get a hold of?

Sounds like a bad idea.


Yeah sounds like our existing situation where content is largely moderated by centralized governing bodies. Big -1


No more difficult than it already was to obtain.

The point being, obtaining the popular stuff is no longer subject to DoS because distributed caching is built into the protocol itself.


> I think people have unrealistic expectations about such a young idea.

It launched in Feb 2015.

Things that have launched since then:

The idea of Donald Trump as President of the USA

TikTok

Covid-19

Tesla Model 3

OpenAI

It's entirely possible that everything could change any day now. It's equally plausible that it's just a bad implementation of a decent idea, and something similar could come along and deliver on its promise.


>The idea of Donald Trump as President of the USA

I find this disturbing, because not only is it not true that it originated in 2015 - many people have commented on how The Simpsons predicted it in 2000...

https://en.wikipedia.org/wiki/Bart_to_the_Future

...but the reason they predicted it had a lot to do with Trump running in 2000, and more recently, he's reportedly been saying he is such a winner he won the first time he ran.

So it reminds me of the famous photo with Stalin and the "vanishing Commissar"...but what do you know - that has been deleted from Wikipedia recently!

https://commons.wikimedia.org/wiki/File:The_Commissar_Vanish...

It was there prior to 2016 though:

https://web.archive.org/web/20150516002908/https://commons.w...


You are missing the point of what I'm saying.

Trump wasn't taken seriously as a candidate in 2015. He didn't declare his candidacy until July 2015 at which point his odds were 150/1 and it didn't get above 66/1 in 2015.

https://www.bbc.co.uk/news/newsbeat-36392621

In 2000 his run wasn't taken seriously either. He had an approval rating of 7%. https://en.m.wikipedia.org/wiki/Donald_Trump_2000_presidenti...


Ok, fine, you wrote "the idea" and you meant "taking seriously".

But I suspect you had it right the first time. I sort of think "the idea" is the relevant stage. And that adds ~15 years to that particular thing.


> If millions of people were using IPFS

You criticize people for having unrealistic expectations, and then you make an unrealistic claim...


I don't think that's necessarily true. Ethereum 2 is using libp2p to facilitate p2p communication between nodes. IPFS also uses libp2p. That means that every Ethereum node could easily become an IPFS node so people may end up running an IPFS node without even realizing it.


Wonder if it's practical to "buffer" popular content on IPFS by copying it to normal HTTP servers.

Requesting an IPFS document would query a few popular repositories, then revert back to normal IPFS if it's not found.

These buffer servers would also track what's popular and shuffle around what they store accordingly.


I think this is exactly what Cloudflare's and ipfs.io's web proxies do. They won't cache your stuff forever, but they'll cache it as long as someone requests the content again before the content gets removed from cache.

The downside of this approach is that it only works with popular nodes and you'd be back to the old, centralised internet architecture for all real use cases.

I don't think you can accurately gauge what is and isn't popular in a P2P network like IPFS. You never have a view of the entire network, after all.

There's also the problem of running such a system. Who pays for the system's upkeep and do we trust them? If we'd use Cloudflare's excellent features, who says Cloudflare won't intentionally uncache a post criticising their centralisation of the internet, forcing the views they disagree with to the slow net while the views they agree with get served blazingly fast.

I don't think such a system would work well if we intend to keep the decentralised nature of IPFS alive. Explicit caching leads to centralization; that's the exact reason caching works.

Instead, the entire network needs a performance boost. I don't know where the performance challenges in IPFS lie, but I'm sure there are ways to improve the system. Getting more people to run and use IPFS would obviously work, but then you'd still only be caching popular content.

Edit: actually, I don't really want to see caching happen through popularity of the service either, because as it stands IPFS essentially shares your entire browsing history with the world by either requesting documents in plain text or even caching the documents you've just read. I wonder if that IPFS-through-Tor section on their website ever got filled in, because the last few times I've checked that was just a placeholder in their documentation.


How much were you paying for your IPFS pin? E.g., if you are getting something via HTTP, there's a server somewhere with that content just waiting for you to request it, typically stored on an SSD, etc. - vs. IPFS pins, which are typically packed onto massive disks shared with lots of other people.

IDK a whole lot about IPFS though. Maybe it was the metadata resolving / DHT lookup or whatever that was super slow. BitTorrent latency was always pretty high, but it didn't matter because throughput was also high


My IPFS pin was just one or two of my servers running an IPFS daemon. Since that daemon was running on Oracle's free VPS's, the answer is probably "a small fraction of what it costs for Oracle to have you in their database".

Paying for pinning sounds like something that could work but it would introduce some of the same problems that the real web suffers from back into IPFS. The idea "a web for the people, by the people" becomes problematic when you start paying people to make your content more accessible.


If it was slow running on a dedicated VPS, that's not super encouraging.

The thing I liked about the idea of IPFS pinning is that you are paying per byte stored vs. per byte accessed, as long as the p2p sharing works. I.e. hosting-via-pinning a website only you read would cost the same as hosting a website that the whole internet reads.


To be fair to the software itself, the system was never pegged for CPU usage or anything, and it wasn't a fast VPS to begin with.

From what I could tell the performance issue was mostly located in the networking itself, getting the client to resolve the content on the right server. That's something that could be improved through all kinds of algorithms without breaking compatibility or functionality, so there's hope.

I agree that pinning comes with some interesting ways to monetize hosting without the need for targeted advertising that the web seems to have these days. Small projects like blogs, webcomics and animations could be entirely hosted and supported by the communities around a work, while right now giant data brokers need to step in and host everything for "free".


> They won't cache your stuff forever, but they'll cache it as long as someone requests the content again before the content gets removed from cache.

"It stays in the cache as long as it stays in the cache"

??? What on earth does this mean?


Content is cached for a certain amount of time (default is 24 hours, I think?) before it gets deleted. If the content is requested again, the timer is reset.

This is opposed to long-term caches like Cloudflare's that'll cache the contents of your website regardless of how many requests come in. Cloudflare will happily just refresh the contents of your website even if nobody has been to your website for weeks, and quickly serve it up when it's needed.
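
In other words, a sliding-expiration cache. A minimal sketch of the idea (the 24-hour default above is already a guess, so treat the numbers as illustrative):

    # Minimal sliding-expiration cache: each hit resets the entry's timer,
    # so content that keeps getting requested never expires.
    import time

    class SlidingCache:
        def __init__(self, ttl_seconds=24 * 3600):  # ~24h, per the guess above
            self.ttl = ttl_seconds
            self.store = {}  # cid -> (content, expires_at)

        def put(self, cid, content):
            self.store[cid] = (content, time.time() + self.ttl)

        def get(self, cid):
            entry = self.store.get(cid)
            if entry is None or entry[1] < time.time():
                self.store.pop(cid, None)
                return None  # miss: the gateway would re-fetch over IPFS
            self.store[cid] = (entry[0], time.time() + self.ttl)  # reset the timer
            return entry[0]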


That’s not how Cloudflare normally works: the HTTP cache is demand based and does not guarantee caching. What you’re describing sounds like their Always Online feature which regularly spiders sites to serve in the event of an error.


I read it as saying that if someone downloads it before the cache timer deletes it, it resets the timer. So if the file is downloaded regularly, it is never removed from the cache.


The irony is that this and other IPFS problems will (must?) be fixed by recentralization. Cloudflare is doing this with IPFS Gateway, and Google will surely embrace/extend/usurp IPFS if it becomes popular. The user experience of bare IPFS is just not good enough.


I agree with a [previously] dead/deleted comment at this level:

"Doesn't matter. the point of ipfs is that when cloudflare and google shut down their gateway, the ipfs content is still available at the same address."


This really is one of the cruxes of decentralisation being built in at the protocol level. Even if centralised services exist, as long as one person exists who cares, the content lives on.

Without decentralisation being supported at the protocol level, as soon as the host dies, it's gone. This is particularly problematic because centralised services slowly subsume small services/sites and this either cuts off the flow to the other small sites or eventually something changes on the big centralised site and a bunch of these little sites break.


… if someone else paid to host a copy. Major companies hosting it makes that less likely, and if their backing increases usage, that also increases the cost of hosting everything, making it less likely that the content you want will be available. When Google shuts down their mirror, suddenly all of that traffic is hitting nodes with far fewer resources.

The underlying problem is that storage and bandwidth cost money and people have been conditioned not to think about paying for what they consume so things end up either being ad supported or overwhelming volunteers.


> suddenly all of that traffic is hitting nodes with far fewer resources.

One of the points of IPFS (and bittorrent before it) is that this is not a problem; each node that downloads the data also uploads it to other nodes, so having lots of traffic actually makes it easier to serve something (indeed, if it was already widely seeded by Google's mirror, there wouldn't be any sudden traffic).


I'm not particularly familiar with IPFS: does it have some solution for free-riding?

BitTorrent as many have noted is great for popular things, even not-particularly-popular things, but absent incentives to continue seeding (i.e. private trackers' ratio requirements) even once-popular things easily become inaccessible as the majority of peers don't seed for long, or at all.

I guess what I don't quite get is what IPFS adds vs. say, a trackerless BitTorrent magnet link that uses DHT? Or is it really just a slight iteration/improvement on that system?


> I guess what I don't quite is what IPFS adds vs. say, a trackerless BitTorrent magnet link that uses DHT?

Beats me! I think there might be support for finding new versions of things, but I'm not sure about the details or how it prevents authors from memory-holing stuff by saying "The new version is $(cat /dev/null), bye!".


No it doesn’t

If nobody pins a link it disappears, but there is no strong incentive; it just rides on abundant space and bandwidth, and wealthy Gen Xers who want to be a part of something.

The same group released filecoin which experiments with digital asset incentives.. and venture capital

Inconclusive results


Bittorrent use breaks Tor, IPFS download does not.

so that's one advantage to one audience


Doesn't matter. the point of ipfs is that when cloudflare and google shut down their gateway, the ipfs content is still available at the same address.


> Wonder if it's practical to "buffer" popular content on IPFS by copying it to normal HTTP servers.

I guess the approach would be to simply run IPFS on those servers, with the popular content in it, as a seed.


Sounds like it's working fine for you. "Several seconds" of lag is nothing for an "Inter-Planetary File System", in fact it's on par with other decentralised P2P networks.


> "Several seconds" of lag is nothing for an "Inter-Planetary File System", in fact it's on par with other decentralised P2P networks.

That's good enough for kicking off batch file transfers (assuming you mean P2P networks like BitTorrent), but there's no evidence that people will tolerate a slow web, and lots of evidence that they won't.


Skynet fetches files in under 100ms; you can definitely get a decentralized system going as fast as the centralized web if you build it right.

The main challenge for me with this comment is that you can't expect distributed/decentralized networks to win if you set an expectation that "things will just be slower than the normal web". Nobody is going to migrate to that.


> Skynet

I don't know Skynet. I first checked Google, got a Wikipedia link describing a movie, then checked the Wikipedia disambiguation page, but got nothing.

https://en.wikipedia.org/wiki/Skynet

Also, why would a project duplicate the efforts of IPFS rather than contribute to it?


https://siasky.net/ and https://docs.siasky.net/

IPFS has chosen an architecture which fundamentally keeps it non-performant, Skynet is built from the ground up in a different way, and gets 10-100x improvements on performance for content-addressed links, and 100-1000x improvements on performance for dynamic lookups (IPNS)


Try browsing the IPFS example "website". It opens for me in under a few hundred milliseconds.

    ipfs cat /ipfs/QmQPeNsJPyVWPFDVHb77w8G42Fvo15z4bG2X8D2GhfbSXc/readme


The IPFS network tends to run quickly if the file you are fetching is stored in the cache of either ipfs.io or in the cache of cloudflare. Everything else has lookup times of 30-60 seconds, sometimes more.

DHTs just aren't a good choice for massive data systems if you need low latency.


I think that's an unfair example because those files come pre-pinned or at least pre-loaded with most IPFS installs.

I've just pinned a 12MiB file filled with random bytes on one of my servers (`dd if=/dev/urandom of=test.dat bs=1M count=12; ipfs add test.dat; ipfs pin add <hash that came out>`). The server has a 50 Mbps uplink, so transferring the file to my laptop should take about two seconds.

Dumping this blog's contents over IPFS takes the server about 3 seconds (first time load) so the network seems to be in working order, at least when downloading data. `ipfs swarm peers` lists about 800 known peers. On the server itself, `ipfs cat /ipfs/redacted > /tmp/test.dat` runs in about a second, which is all perfectly acceptable overhead for a transfer that'll take two to three seconds anyway.

On my laptop, I tried to get the file but cancelled after waiting for 16 minutes. Halfway through the wait, I tried opening the file through the ipfs.io proxy, which finally gave me the file after a few minutes, but no such luck yet if I retry the ipfs command.

I don't know if it's the random file, the size, or something different, but if I'm launching a blog or publishing documents on IPFS, visitors should not be expected to wait five to ten minutes for the data to load. "After the first twenty visitors it'll get faster" is not a solution to this problem, because there won't be twenty visitors to help the network cache my content.

Maybe I'm expecting too much here; maybe the files shouldn't be expected to be available within half an hour, or before Cloudflare caches them. Maybe there's something wrong with my laptop's setup (I haven't done any port forwarding and I'm behind a firewall). Either way, if I can follow the manual and still buy a domain, set up DNS and hosting on my VPS, and send a link to a friend faster than I can fetch the file through P2P, I don't think IPFS will ever get off the ground. Fifteen minutes is an awful lot of time for a data transfer these days!

Edit: actually, now it seems ipfs.io and cloudflare have picked up the file in their caches. Data transfer is up to normal speed now. If you want to try to replicate my experiment, I've just uploaded a new test file to /ipfs/QmbBD872kjfoutAmTKFCxTCApw9LBB9qxxRyXpEGYzsqMH.

Edit 2: I realized that by saying I downloaded the file and that the file is random, I just announced my personal IP address to the world through the IPFS hash, so I removed it. That was pretty dumb of me, and also a pretty clear problem with IPFS in my book.


    $ time ipfs cat /ipfs/QmQPeNsJPyVWPFDVHb77w8G42Fvo15z4bG2X8D2GhfbSXc/readme
    (...)
    real    0m0.219s


I agree that the http caching layers are currently needed to achieve decent UX. But it’s also possible that won’t be necessary forever. The network could expand and get to the point where resolution time comes down to accessible levels. Only time will tell I suppose.


On what basis do you think this is likely? What part of IPFS gets faster when more users are online? If the file exists on a system in the network, it should ideally make a connection and start fetching the file within milliseconds. It's not like we are distributing multi-GB files where more seeders means more speed; this is all just slowness in setting up a connection.

Even torrents take 1-10 seconds to initiate downloads with an absolutely massive network.


If you want to optimize for latency, you can't use a DHT. You need something that goes point-to-point instead of routing through a series of machines.


Which is why a DHT system is not appropriate for web style uses. No one wants to wait multiple seconds for a page to even start downloading. It's fine for a large file download but not for frequent small requests.


When downloading content using a magnet link, downloads are usually quite slow to start, while torrent files usually start right away. It's not a fair comparison since the torrents are from a private tracker with high-quality peers, but it's noticeable that the DHT path is slower. Not sure how cjdns solves that.

Will the network be faster the more people join it?

Is centralized infra inevitable for fast things, because of all the assumptions and insight a centralized provider can exploit? BTC is also slow; I don't know of any cryptocurrency that can provide VISA transaction volumes, even though some of them use more power than entire countries.


Cryptocurrency is different from storage because cryptocurrency needs to provide global consistency guarantees.

Skynet is a decentralized network with lookup times that have a p50 TTFB of under 200ms. It achieves this by looking things up directly on hosts rather than routing through a DHT. There's a bit of overhead to accomplish this (around 200kb of extra bandwidth per lookup), but for a smooth web experience that tradeoff is more than worthwhile.


I keep IPFS companion turned on constantly but when I hit a site that's getting loaded through IPFS it often takes so long that I end up turning IPFS off so it just fetches it from the central server


This echoes my experience as well, and I (used to) run a pinning service.


Wireguard and the usual tools for file search and retrieval work fine.

Do we need all that comes with IPFS? Not just technically, but the user training and the pivot it asks of technical doers?

So many of these projects feel like programmer vanity projects; there's really little difference between them and a guy on the corner telling me why Protestants are wrong and I should join his flock.

That it's a technical project rather than entirely ephemeral nonsense doesn't matter; solutions already exist, we just don't implement things that way.


Pretty interesting. A lot of people here are focusing on permanence, but for me the main difference between addressing by content and addressing by host name is the loss of authorship it opens up for the web. Since a research paper or essay is referred to by its hash, the owner effectively gives away all control of the work when it's first published. There will be no editing, no taking down unwanted work, and no real way to build an interactive website that allows dynamic linking to other materials by the author.

It's interesting how the same people promoting the "creator economy" also tend to promote the cryptocurrency space and IPFS without an ounce of self-awareness. IPFS sounds awful for creators of all kinds in the same way BitTorrent was awful for artists. I can definitely see a use case for IPFS as file storage for trustless systems such as smart contracts, which are designed to be immutable and trustless.


You don't lose authorship, you lose ownership, which arguably you don't have on the regular internet either: once uploaded, anyone can mirror your content. Arguably, under IPFS you literally cannot lose authorship if you put the author's name in the content, because then authorship is part of the content hash used to address and load it.


IPFS claims to solve this with its name service, IPNS, which can be updated to point to a new hash for a revised file. The original hash can still be cached and used, but users can refer to the IPNS name and get the latest version. Last I saw, though, the IPNS name had to be republished frequently by the original server or it would go away.
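
For reference, publishing looks roughly like this with the go-ipfs CLI (key name and placeholders are made up); the record has a limited lifetime, which is why the publishing node has to keep republishing it:

    ipfs key gen blog                                  # generate a signing key
    ipfs add -r ./site                                 # note the root CID it prints
    ipfs name publish --key=blog /ipfs/<root CID>      # map the key's ID to that CID
    ipfs name resolve /ipns/<peer ID of the blog key>  # anyone can resolve it back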


IPNS doesn't work. Whenever you press them on this point, they'll admit it doesn't work.

But IPFS has the crypto problem of conflating the stuff that works now with the stuff that's hypothetical in its marketing, and not admitting that the latter is janky nonsense that doesn't bloody work.


Sorry, I don't know when you last tried out IPFS, but IPNS does in fact work: https://docs.ipfs.io/concepts/ipns/

Again, I'm not sure when it wasn't working, or when it began working (it's always worked since I've been around), but IPNS has made huge strides, I use it every day. Even https://ipfs.io is using IPNS, it's very popular.


Yeah this is what I saw 4 years ago. Shame it still isn't working. Insane how crypto projects spend all this effort on flashy landing pages, marketing, hype. But if you actually try to use the product, you find out it just doesn't actually work.


Seems like at that point it might be easier to just build a new http+ protocol that supports document signing and focuses on bringing caching back.

You could use all the current web/http/DNS infrastructure, and add certified/cacheable GET results.

Anyone could run their own proxy and cache what they see fit. Seems like an easier transition as it could be fully compatible with the current web.
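
You can already approximate the "certified GET" half with plain tools, assuming the publisher hands out an expected digest through some other channel (the URL and digest here are placeholders):

    curl -sLO https://example.com/article.html
    echo "<expected sha256>  article.html" | sha256sum -c -

The missing piece is a standard way for HTTP to carry that digest (or a signature over it) with the response, so any third-party proxy could cache and re-serve the result verifiably.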


The author claims that IPFS enables a "permanent web" and eliminates 404-like experiences. How does IPFS guarantee that all published content will be available forever? (In my beginner-level knowledge of IPFS, this is the first I've heard that claim. It seems absurd.)


And quite annoyingly, it's the opposite of how IPFS works. IPFS nodes only cache content as long as it's actively requested. Depending on the cache policy of the node, this can be as short as 24 hours. Yes, it could still exist on the private node that you're running from your laptop, but this is the equivalent of saying that all published content is available forever because it's on your hard drive.

Honestly, I like IPFS as a tech, and blockchain isn't useless, but the entire community just makes me want to puke because it's full of so many extremely ignorant people (at best), but mostly just fraudulent liars (at worst, and most common).


> Yes, it could still exist on your private node that your running from your laptop, but this is the equivalent of saying that all published content is available forever because it's on your hard drive.

No, the difference is that IPFS will use the same address to fetch content from anyone who's seeding it.

If Hacker News shuts down, it will no longer be accessible at 'news.ycombinator.com'; all existing links to that address will die, or worse will start showing some unrelated content (probably domain-squatter spam). That cannot be prevented by making a copy on my hard drive (or even the Wayback Machine).

On the other hand, an IPFS version will continue to exist at the same address for as long as anyone is seeding it. All links will remain working; anyone can join in the hosting if they like; and even if it eventually stops getting seeded, it may still re-appear if someone re-inserts it, e.g. if they insert the contents of an old hard drive they found in an attic (as long as the same hash algorithm is used, the address will stay the same).


Doesn't a forum need things like user login? Is there an example of such a thing as a forum existing fully on IPFS?


> Doesn't a forum need things like user login?

I was only talking about the HTML, not the backend server code. Still, you could look on https://awesome.ipfs.io/apps for ideas.


> On the other hand, an IPFS version will continue to exist at the same address for as long as anyone is seeding it.

I wonder how that would work, a naive direct translation seems impractical. An address identifies an exact piece of content, so a hacker news article gets a new adress every time a comment is added?


If the HTML differs, it would get a different address; just like the Wayback Machine, but the addresses are based on content rather than timestamp, and it's distributed. Anyone can do that right now, without any changes by (or permission from) the site operator (YCombinator in this case).

You couldn't host the "live" version of Hacker News (with user accounts, new comments, etc.) unless YCombinator opened up their databases and re-architected the system to work in a distributed-friendly way.

The latter is an interesting idea, but isn't required to fix HTTP issues like link rot.


Pulling this out of the ether: my guess is that during normal operation news.ycombinator.com would constantly create and propagate new content IDs as new comments are added (the front page simply fetches the newest IDs, and there is a system for querying for updated IDs which simulates how browser refresh works now). If news.ycombinator.com dies, it becomes impossible to create new IDs or post comments, but all already-propagated pieces of content are still accessible (possibly refresh still works on outdated content).

This is just a guess, though.


Bingo. The marketing strongly implies permanence, but the only permanence is that the same content chunks will have the same hash, not that you can always retrieve content for a given hash. Then you start digging deeper and realize you need a paid pinning service to ensure that your content remains available. And these pinning services are way more expensive per GB than traditional storage options.


To be fair, this is all hard work to get right. I think there’s a third, much larger category of people who are pursuing the idea but just haven’t solved all the problems yet.


IPFS doesn't give you any guarantees about whether the content is actually stored anywhere, but it does give you reasonable guarantees that the addresses for that content stay the same. Meaning if somebody finds an old backup tape decades down the road, they can just stuff it back onto IPFS and all the dead links start functioning again. That's something that is impossible with HTTP, as there isn't even a guarantee that the same URL will return the same content when you access it twice. With IPFS anybody can mirror the content and keep it alive as long as they want; they don't have to hope that the server that hosts it right now keeps running.

That said, IPFS isn't quite perfect here, as IPFS hashes do not actually point to the content itself: they point to a package that contains the content, and depending on how that package was built, the hash will change.
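
You can see that without uploading anything: `ipfs add --only-hash` just computes the address, and changing the import settings changes it for the same bytes (go-ipfs flags, made-up file name; the file has to be larger than one chunk for the last variant to differ):

    ipfs add --only-hash test.dat
    ipfs add --only-hash --cid-version=1 --raw-leaves test.dat
    ipfs add --only-hash --chunker=size-65536 test.dat

Three different CIDs, one file.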


One of the spinouts of IPFS is Multiformats, particularly multihashes. It is a standard to describe hashes and provide interop: https://multiformats.io/multihash/ (Or the IPNS dnslink: http://multiformats.io.ipns.localhost:8080/multihash/ )
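
The multihash format itself is tiny: a varint for the hash function, a varint for the digest length, then the digest. For sha2-256 that prefix is 0x12 0x20, which is why base58 CIDv0 hashes all start with "Qm":

    <varint: hash function>  <varint: digest length>  <digest>
     0x12 (sha2-256)          0x20 (32 bytes)          ...32 digest bytes...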


Seems like this pointlessly duplicates the "Naming things with hashes" ni: and nih: URI schemes, standardised as https://tools.ietf.org/html/rfc6920 . (IPFS actually uses a proprietary "package" representation as parent points out, so there's little harm in it not using ni: URI's. But standard hashes of the underlying resource should use those. "Magnet" links are similarly problematic, but these at least support a few convenience features.)


> Seems like this pointlessly duplicates the "Naming things with hashes" ni: and nih: URI schemes

Wow, they have an URI scheme for Not Invented Here. Amazing!


That still requires you to deliberately decide to archive a certain page, which you can do just as easily with HTTP. The SingleFile browser extension will grab everything that's linked from a page and build a locally stored directory with all the dependencies of that page.


> That still requires you to deliberately decide to archive a certain page

You don't "archive" a page with IPFS. With IPFS, everything you keep a copy of stays available under the same address, whether that happens automatically via the cache or via a manual 'pin'. That's fundamentally different from what HTTP does.

The archive copy you create of an HTTP site is your own personal copy and completely inaccessible to anybody else. Even if you put it online, people would have no clue where to find it. With IPFS the document never leaves the address space and stays accessible to everybody under the same address, no matter who decides to host it.

Another important practical difference is that IPFS has native support for directories, so you don't have to spider around guessing all the URLs; you can just grab the whole directory at once. That in turn has the nice side effect that .zip archives essentially become irrelevant, as you can just upload the directory itself.


It's the fundamental lie of IPFS. IPFS people will jump in and say that it's not a lie, what they're actually saying is yadda yadda, but then they turn around and say exactly that five minutes later. It's the motte-and-bailey ( https://en.wikipedia.org/wiki/Motte-and-bailey_fallacy ) of IPFS.

In a world of finite storage, nobody's going to keep up a copy of everything. Nobody will have a copy of most things. Even if IPFS worked acceptably, even if it worked as very narrowly promised, plenty of stuff would fall off the web. At best, we'd see somewhat fewer temporary disruptions of very currently popular content.


Yeah, the article gives an example of a video that was downloaded multiple times. That is supposedly so inefficient because "HTTP".

The solution for making it more efficient is that someone else should host it for you, ideally for free.

It is just an exercise in throwing around big numbers and utter ignorance to impress people, but the downloaded megabytes are not magically going away.

Just like torrents: no one wants to seed or pay hosting costs, everyone wants to download. There is no protocol that is going to fix that. Why is everyone mining BTC like crazy? Because they get money for it.


It absolutely guarantees nothing of the sort.

A big problem in the NFT space of late is that OpenSea sells NFTs that link to an IPFS URL, then doesn't bother seeding the image after it's sold - so a pile of NFT images no longer exist anywhere on IPFS.

https://www.vice.com/en/article/pkdj79/peoples-expensive-nft...


Isn't that a feature? Some NFTs are hashes of recordings of destroying some physical piece of art. Surely the next level is for NFT of jpegs where only the hash is left! It will be the next big thing to go to the moon! The uncertainty about someone finding a forgotten copy adds incredible depth to the sport.


One benefit is that you, or anyone for that matter, can seed NFT images on IPFS.

You also have to look into the smart contract code and see whether the baseURI is an IPFS link or someone's proprietary website.


I don't know about IPFS but Arweave solved the "permanent" part by asking nodes to periodically prove that they're actually storing what they're supposed to be storing: https://www.arweave.org/


The IPFS version of Arweave is Filecoin


What happens when the servers then fail to provide that promise? Isn't the data lost?


The idea is to incentivise hosting via crypto - with ARWeave, there's a bigger upfront fee but no ongoing fees. In theory, it's meant to secure 200 years of storage by putting aside most of the upfront fees for later and (conservatively) assuming storage costs decrease by at least 0.5% per year.


If the servers do provide the proof, they get a reward; if they can't, they lose part of their collateral (also called their stake).


Basically with ipfs you try to download some content by referring to it by a hash. This means that the content will be available forever - as long as there is someone willing to cache that content. Which is, obviously, something that won't always happen. But it _can_ theoretically happen, so people with a cache of some old web page may revive it, even after the original site is gone.


How do you refer to something "by name"? Say, the current version of Wikipedia's article on Lagrange multipliers - the data is changing, so you can't use a hash of the content.


With IPFS you access content by hash. So if you have a plain IPFS link to "Lagrange multiplier", it will always point to exactly the same version you are looking at right now. No way to change or update that.

If you want to update content, you have to point the user to a new hash. IPFS has the IPNS mechanism for that, this adds a layer of indirection, so instead of pointing to the IPFS hash directly, you point to the IPNS name which in turn points to the current hash. What an IPNS name is pointing to can be updated by the owner of that IPNS name.

Another option is to do it via plain old DNS and have the DNS record point to the current hash of the website.
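
The DNS route is just a TXT record (DNSLink); example.com and the CID here are placeholders:

    _dnslink.example.com.  IN  TXT  "dnslink=/ipfs/<current root CID>"
    ipfs resolve /ipns/example.com    # resolves to whatever the record points at now

Updating the site then means re-adding the content and pointing the TXT record at the new CID.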


The hash is not a literal hash of the content; I think it's more like a key that lets anyone looking for it verify that what they got is legitimate. The IPFS p2p search mechanism is then what lets you find it.


In what sense is it not a literal hash? Because it has a prefix at the start specifying which hash function was used and the digest length?


Because the file is chunked, it's more similar to a Merkle tree than a plain hash.
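
You can poke at that directly: for anything bigger than one chunk, the root CID points at a node whose links are the chunk blocks (the CID is a placeholder):

    ipfs refs <root CID of a large file>      # prints the CIDs of the child blocks
    ipfs refs -r <root CID of a large file>   # recursive: the whole DAG beneath it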


Thanks, I didn’t think of that


"of the content" meaning it's not just a sha-N of the file. Unless I misremember the IPFS protocol.


Thanks! I was thinking too vaguely of "hash", and didn't consider the requirement that the hash of something large must differ from the hash of that same thing with a big chunk of it replaced by, uh, the hash or some kind of digest of that chunk.


I wonder if it could make running a service like archive.org easier. Or if it could allow for a distributed archive.org service where nerds worldwide can contribute x GB storage to the cause?


This is one of the biggest stated use cases of IPFS: it could effectively replace archive.org with a better version that is more distributed, has better uptime and, more importantly, holds a much larger, nearly complete and pristine copy of the entire internet for conceivably as long as the internet itself exists.


It does only the first of those things: using content hashes means that anyone can populate an archive which is easily discovered.

For the rest, hosting takes money. People will not archive the entire internet for free and IPFS is not a magic wand which eliminates the need to have people like the skilled IA team. It could make their jobs easier but that’s far from “nearly complete” and no more or less pristine.


It gives you the tools to build an archive.org equivalent using volunteer storage though, rather than asking for monetary donations. All you need is a database of known content hashes and a database of volunteers; you randomly distribute content among the volunteers and periodically ensure a minimum number of clients are replicating each known hash.


“All you need” is only true at the highest level: IPFS gives you a great way to discover replicated content. It doesn't help you know that the list of known hashes is complete (consider how much work IA has spent making sure that they crawl sites completely enough to be able to replay complex JavaScript), handle the scale of that list (this is a VERY large database which updates constantly), or provide networked storage at a scale measured in the hundreds of petabytes.

Volunteer capacity at the scale of many petabytes of online storage is unproven, and the long tail of accesses means you're going to have to think not just about the high replication factor needed, but also about the bandwidth available to serve that content in a timely manner and to rebuild a missing replica before another fails.


> It doesn't help you know that the list of known hashes is complete

Right, which is why I said it would have to periodically scan the registered clients to ensure a minimum number of clients has each block to ensure redundancy.

> also the bandwidth available to serve that content on a timely manner and rebuild a missing replica before another fails.

I think a slow, cheap but reliable archive is better than "more expensive but lower latency", so I'm not particularly concerned with timeliness.


> > It doesn't help you know that the list of known hashes is complete

> Right, which is why I said it would have to periodically scan the registered clients to ensure a minimum number of clients has each block to ensure redundancy.

That's the easy problem, not the hard one I was referring to: doing what IA does requires you to be able to crawl web resources and identify everything which needs to be available for a page snapshot to be usable. IPFS only helps with that in the sense that you can tell whether you have the same URL payload without requesting it — you still need to handle dynamic behaviour and that's most of the work.

> I think a slow, cheap but reliable archive is better than "more expensive but lower latency", so I'm not particularly concerned with timeliness.

What I would be concerned with is “more expensive, higher latency, and greater risk of irrecoverable failure”. Relying on volunteers means that you need far more copies because nobody has a commitment to provide resources or even tell you if they decide to stop (“ooops, out of space. Let me clear some up — someone else must have this…”), and the network capacity isn't just a factor for user experience — although that can prevent adoption if it's too slow — but more importantly because it needs to be available enough to rebuild missing nodes before other ones also disappear.


> you still need to handle dynamic behaviour and that's most of the work.

I'm not sure what you think would be difficult exactly. You've said that archive.org has already done the programming needed to ensure dynamic resources are discovered, and now those resources are content ids rather than URLs. Nothing's really changed on this point.

> Relying on volunteers means that you need far more copies because nobody has a commitment to provide resources or even tell you if they decide to stop

Yes, but you would also have many more volunteers. Many people who wouldn't donate financially would donate CPU and storage. We saw this with SETI@home and folding@home, for instance.

> nobody has a commitment to provide resources or even tell you if they decide to stop

Why not? If you provide a client to participate as a storage node for archive.org, like SETI@home, then they would know your online/offline status and how much storage you're willing to donate. If you increase/decrease the quota, it could notify the network of this change.


> I'm not sure what you think would be difficult exactly. You've said that archive.org has already done the programming needed to ensure dynamic resources are discovered, and now those resources are content ids rather than URLs. Nothing's really changed on this point.

The point was that it's outside of the level which IPFS can possibly help with. IA actively maintains the code which does this and any competing project would need to spend the same time on that for the same reasons.

> Yes, but you would also have many more volunteers. Many people who wouldn't donate financially would donate CPU and storage. We saw this with SETI@home and folding@home, for instance.

That's an interesting theory but do we have any evidence suggesting that it's likely? In particular, SETI@home / folding@home did not involve either substantial resource commitments or potential legal problems, both of which would be a concern for a web archiving project. There's a substantial difference between saying something can use idle CPU and a modest amount of traffic versus using large amounts of storage and network bandwidth.

SETI@home appears to have on the order of ~150k participating computers. IA uses many petabytes of storage so let's assume that each of those computers has 10TB of storage free to offer — which is far more than the average consumer system — so if we assume all of them switch, that'd be 1.5EB of storage. That sounds like a lot but the need to have many copies to handle unavailable nodes and one factor controlling how many copies you need is the question of how much bandwidth the owner can give you — it doesn't help very much if someone has 2PB of storage if they're on a common asymmetric 1000/50Mbps connection and want to make sure that archive access doesn't interfere with their household's video calls or gaming. Once you start making more than a couple of copies, that total capacity is not looking like far more resources than IA.

> > nobody has a commitment to provide resources or even tell you if they decide to stop

> Why not? If you provide a client to participate as a storage node for archive.org, like SETI@home, then they would know your online/offline status and how much storage you're willing to donate. If you increase/decrease the quota, it could notify the network of this change.

All of what you're talking about is voluntary. One challenge of systems like this is that you don't know whether a node which simply disappears is going to come back or whether you need to create a new replica somewhere else. Did someone disappear because they had a power outage or ISP failure, tripped over the power cord for an external hard drive, temporarily killed the client to avoid network contention, just got hit with ransomware, etc., or did they decide they were bored with the project and uninstalled it?

Since you don't have an SLA, you have to take conservative approach — lots of copies, geographically separated, etc. — which reduces the total system capacity and introduces performance considerations.


> IA actively maintains the code which does this and any competing project would need to spend the same time on that for the same reasons.

I'm not sure why they would have to compete. They're literally solving the same problem in basically the same way. I see no reason to fork this code.

> That's an interesting theory but do we have any evidence suggesting that it's likely? In particular, SETI@home / folding@home did not involve either substantial resource commitments or potential legal problems, both of which would be a concern for a web archiving project.

But this isn't a concern of a web archiving project any more if content-based addressing becomes the standard, because pervasive caching is built into the protocol itself. Publishing anything on such a network means you are already giving up some control you would otherwise have in where this content will be served from, how it's cached, how long it lasts, etc.

> All of what you're talking about is voluntary. One challenge of systems like this is that you don't know whether a node which simply disappears is going to come back or you need create a new replica somewhere else.

Yes, you would have to be more pessimistic and plan for more redundancy than you would otherwise need. Each node in a Google-scale distributed system has a low expected failure rate, but they still see regular failures. No doubt they have a minimum redundancy calculation based on this failure rate. The same logic applies here, but the failure rate would likely have to be jacked up.

> Since you don't have an SLA, you have to take conservative approach — lots of copies, geographically separated, etc. — which reduces the total system capacity and introduces performance considerations.

Whether there would be performance problems isn't clear. Content-based addressing is already fairly slow (at this time), but once content is resolved, fragments of content can be delivered from multiple sources concurrently, and from more spatially close sources. Higher latency, but more parallelism.

I'm not willing to invest the time needed to gather the data you're asking about to actually quantify all of the requirements, but despite the points you've raised, I still don't see any real obstacles in principle.


> I'm not sure why they would have to compete. They're literally solving the same problem in basically the same way. I see no reason to fork this code.

The point was simply that the original comment I was replying to claiming that this made it easy to replace archive.org was really only relevant to one fraction of what an archiving project would involve. If IA is going strong on their side, it's not clear why this project would get traction.

> > > That's an interesting theory but do we have any evidence suggesting that it's likely? In particular, SETI@home / folding@home did not involve either substantial resource commitments or potential legal problems, both of which would be a concern for a web archiving project.

> But this isn't a concern of a web archiving project any more if content-based addressing becomes the standard, because pervasive caching is built into the protocol itself. Publishing anything on such a network means you are already giving up some control you would otherwise have in where this content will be served from, how it's cached, how long it lasts, etc.

That's a separate problem: the two I described are the commitment of storage, which is unlike the distributed computing projects in that it's only valuable if volunteers keep it up for more than a short period of time, and the legal consideration. If you run SETI@Home you aren't going to get a legal threat or an FBI agent inquiring why your IP address was serving content which you don't have rights to or which isn't legal where you live.

> The same logic applies here, but the failure rate would likely have to be jacked up.

Yes, that's the point: running a service like this on a voluntary basis requires significantly more redundancy because you're getting fewer resources per node, have a higher risk of downtime or permanent loss of a node, and replication times are significantly greater. Yes, all of those are problems which can be addressed with careful engineering but I think they're also a good explanation for why P2P tools have been far less compelling in practice than many of us hoped. Trying to get volunteers to host things which don't personally and directly benefit them seems like more than a minor challenge unless the content is innocuous and relatively small.

> Whether there would be performance problems isn't clear. Content-based addressing is already fairly slow (at this time), but once content is resolved, fragments of content can be delivered from multiple sources concurrently, and from more spatially close sources. Higher latency, but more parallelism.

The problem is bootstrapping: until you get a lot of people, those assumptions won't be true, and a worse experience is one of the major impediments to getting more people. In the case of something like web archiving, where the hardest part is crawling (which this doesn't help with at all) and there's a popular service which is generally well-liked, it seems like having a detailed plan for that is the most important part.


It's more that IPFS enables a permanent web, not that this technology immediately has permanence as a feature. It requires players like the Internet Archive, libraries, website hosts and individuals to pin content and develop interesting pinning strategies. For example, browsers could pin everything that is bookmarked and everything that is in the browser cache.


You can link to documents and keep them online, as opposed to publications with unobtainable references.


Simple: you set up a few centralized servers with backups...


The claims are too strong, but it is strictly more available than today: if the original host stays up, the content is guaranteed to stay, same as today. If the host disappears, the content may still stay, which is stronger than today (ignoring the Internet Archive).


I'll just point to the millions of torrent files today that will remain unseeded forever. Durability with P2P only works as a complement to centralization.


I think streaming is killing torrents, along with the fact that in many countries you need a VPN to avoid getting fined.


I agree with this. I haven't torrented anything in a very long time, as most of the things I want are streamable somewhere (and I don't care about file ownership), or set up for direct download on some forum somewhere.


There are other problems. I personally encountered this issue with Netflix: https://www.denofgeek.com/tv/community-netflix-and-hulu-remo...


Torrents need to be available over a browser interface to hide in the usual HTTPS traffic.


See WebTorrent for this


Named-data networking is literally what this is describing, except NDN works as a replacement for TCP, not HTTP. In the long term I'd rather have it work at a lower level and not have to think about it much. https://www.youtube.com/watch?v=gqGEMQveoqg


NDN is the way:

p2p, with no huge IP infrastructure needed between nodes

interest packets at the narrow waist (like human attention: scarce)

all packets signed, so it's more secure and easier to decide what to trust or not

hashes as pointers, so no surprises inside the containers

more cryptography applied, less web3 baloney

broadcast radio is native, no emulating a copper wire

data sits closer to where you want it: find it faster, keep it

the content you want without intermediation, instead of having to find a "where" among IP/UDP/TCP addresses

https://youtu.be/P-GN-pYfRoo?t=1825 node to node

https://youtu.be/yLGzGK4c-ws?t=4817 more application, less security hassle

http://youtu.be/uvnP-_R-RYA?t=3018 hash name the data

https://youtu.be/gqGEMQveoqg&t=3006 data integrity w/out need to trust foreign server


NDN can work as a replacement for IP too, running directly over Ethernet or other link layer technology.


Well, IPFS, or something like it, needs to establish the network/utility first, and then we can increasingly optimize it with dedicated stuff at lower layers. I don't think going straight to the hard parts is going to work without much more government policy to push through the investment -- it seems strictly harder than the zero-cost overlay-network-only approach, because clearly we are demand-constrained, not efficiency-constrained.

I was tipped off to https://twitter.com/_John_Handel/status/1443925299394134016 which I honestly think might be the right way to think about how networks, in a broader sense than regular telecommunications, get bootstrapped.


I wonder how people are incentivized to host content well?

Whether HTTP or IPFS, you want content:

(1) hosted/stored redundantly enough for availability but not over-hosted/stored because that raises costs.

(2) hosted “near” to requesters to make efficient use of the network, which limits costs and increases speed. (Near in terms of the network infrastructure.)

With HTTP I understand how this works (the content publisher figures out a hosting solution balancing what they are willing/able to pay and what kind of availability and speed they want/need).

Not sure with IPFS though. If people are choosing what to host, wouldn't the hosting be uneven, with some content under-hosted (or not hosted at all) and some popular content greatly over-hosted, leading to an inefficient system? Under-hosted content would be slow to retrieve or simply unavailable. Over-hosted content wastes resources, making the system more costly than it needs to be (somebody is paying for the storage and the servers that make that storage available).

I realize this is alpha/experimental, but without a strong answer to this, I don't see how the system can work at scale.


The "permanent web" goal didn't survive that well. I have worked closely with IPFS since 2018, and it's not part of the narrative anymore.

The distributed part is still going strong, but in my eyes distribution (or decentralization) should be a tool, not a goal.

The question is what kind of Internet we want to build with this distributed infrastructure. If it's just a clone of the current Internet, then I don't see what it brings.

My personal answer is that we should build a democratic Internet: one where people are members of the Internet, instead of mere users, and decisions are made in a democratic process, like in a state. This kind of Internet can only be implemented using distributed/decentralized tools.


So the minority of users see/talk about/hear only what the majority wants? So they are forced to just fork it and form their own distributed network, independent and isolated from the internet at large?


That's not how democracy works.


People hating on IPFS DHT performance probably aren’t using the experimental accelerated DHT client. https://github.com/ipfs/go-ipfs/blob/master/docs/experimenta...
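
If I remember the config key correctly it's something like the following, but double-check against that doc, since experimental names tend to move around:

    ipfs config --json Experimental.AcceleratedDHTClient true
    # then restart the daemon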


Probably because it’s experimental and is not mainstream yet.


> All the static content stored with the site still loads, and my modern browser still renders the page (HTML, unlike HTTP, has excellent lasting power). But any links offsite or to dynamically served content are dead. For every weird example like this, there are countless examples of incredibly useful content that have also long since vanished.

Perhaps I missed it, but I don't see where they claim that IPFS can provide a solution for dynamic content, which makes up a huge portion of what is served over HTTP.


It won't, but much of the web could work well on JAMStack, and there's no reason that couldn't be supported.


JAMStack isn't much of an option for services that can't trust the clients and cannot rely on high latency consensus algorithms.

It's also brutal for battery life.


JAMStack != static site served entirely by a CDN. There’s an A for API in there. Where’s the A for IPFS? There’s no such thing.


It can be built, but I don't think there are any frameworks that make it easy. A JSON file can be an API, but authentication and encryption using static, publicly available files is more challenging, especially for a large number of users. That is where blockchain solutions come into play.


Ironically, few links in this 2015 post are dead, and most of those that are dead are IPFS-related.

https://ipfs.io/ipns/QmTodvhq9CUS9hH8rirt4YmihxJKZ5tYez8PtDm... https://ipfs.io/ipns/ipfs.git.sexy/


> It's time for ... (2015)

The title shows clearly that they were wrong.


Economy of scale is hard to combat with a technically decentralized protocol.

Let's assume all goes well and in 2-5-20 years IPFS is the web. A random Joe has an IPFS server in his basement, because it's profitable or at least convenient for him. Most of the traffic never reaches AWS or Cloudflare. What do they do about it? They pretend to be random Joes and mirror the same setup.

Obviously Amazon will manage their servers more efficiently than an army of Joes, so nothing really changes: we end up with a decentralized protocol where 90% of traffic just happens to end up on AWS servers anyway.


At least in your universe, if AWS goes down, some things still work.


I love the idea of IPFS, but the last two times I tried using it, it simply didn't work. There was no "there" there. For the simplest use cases, like "I want file X from the ether, make it happen" the way you can with torrents -- nothing ever happened.

I read the docs and tutorials but had no luck. It felt like the docs were missing some special incantation, setup step or precondition. Or the CLI wasn't giving me feedback on some blocking error and was failing silently. Dunno. Gave up. Hope to revisit some day, because in theory it would be useful tech to have in my toolbox.


On a side note, does anyone know of research or a PoC in the direction of a globally distributed knowledge graph, or something of the sort? Or at least a more technical name I could google.

I think it's so bad that so much of the Internet's information is siloed in walled gardens and that pages have to be dynamically generated and routed across the globe every single time... Imagine how it's going to be when we get to Mars, for instance. Storing this knowledge graph in IPFS files seems like a logical step to me (as an absolute layman, though).


The Internet in a Box project is an interesting angle:

https://internet-in-a-box.org/


There's a project at MIT that seems to be precisely this. Underlay.

https://underlay.mit.edu/


Peergos is an interesting human-friendly project built on top of IPFS. Works great so far.

Of interest: they provide a trustless way to store your encrypted data on centralised boxes (S3, Backblaze) if you want.

https://peergos.org/


You don't search for locations with HTTP, any more than a shipping company searches for addresses to deliver to. The problem with bad metaphors is that they derail discussion, which I am arguably doing with this comment.


The analogy is just plain wrong.

"Search for content" presumes that there's a Google-like indexing of content, and then a DNS-like way to retrieve it.

It seems to me that IPFS, in this analogy, is simply a different way to address a web page / document.

Note: I am very familiar with IPFS. I just think that this analogy is really poor.


I did like the concept of IPFS around the time this was written but now it feels to me like it misses the wood for the trees.

The thing is that the web by its nature is already decentralised. The real issue with the web today isn't really some technical, architectural flaw.

Centralisation has emerged in this already reasonably well decentralised system because of network effect driven accumulation, the market and pre-existing wealth from investors tipping the scales.

Yes, IPFS is a more thorough attempt to create a distributed web, yet I doubt it is fully immune to the forces that captured and centralised the existing web, except insofar as it is currently unpopular. Or, if it really is that resilient to centralisation, it will remain unpopular, because there's no incentive for private enterprises to invest much in a system where they can't control their own corner of it, and few users like systems built on the cheap when flashy, well-funded alternatives exist. Whichever way you want to slice it.

The social forces that cause centralisation in the web are the same forces that cause e.g. monopoly and extreme wealth accrual in the rest of society. And the fix has to be social, not technical.


I like IPFS for making Libgen faster (with a little help from Cloudflare). If it also manages to solve hard problems of decentralization and content-addressable storage, that's just a bonus.


As long as there are no actual native iOS, Android and PC apps built on top of IPFS, or even an access GUI or browser interface on those platforms, IPFS can stop dreaming of replacing HTTP. Apart from the mere 1-2% of the world's internet users who are technical, no one is going to download a binary, run a CLI, and visit 127.0.0.1:8080/ipfs. For the majority of people, Chrome is the gateway to the internet (for them, HTTP is a prefix you put in front of a URL); unless IPFS gets to that level, HTTP is as relevant as ever. I know software devs who don't know they can access files from the browser via the file:// protocol.


We're actually working on that, and have made progress with Brave and Opera. You're right that for wide, mainstream adoption, we'll need to expand onto platforms like Android/iOS, and that's the dream, honestly. Working on URIs like IPFS:// and IPNS://, which are supported in Brave/Opera to an extent.

It's a long road, though, with lots of negotiations with lots of organizations. I do feel optimistic about it.


Maybe I'm making it too simple, but wouldn't native iOS, Android and desktop clients be a good start? Even a good desktop client could drive adoption like crazy. Torrents never required any negotiations; they never depended on organisations to implement support. Just package it well enough as a standalone app. IPNS can follow. If enough people are storing and accessing content on IPFS, organisations will implement interfaces on their own.


I wonder if the author of this ever heard about Project Xanadu: https://en.wikipedia.org/wiki/Project_Xanadu

The idea on how a decentralized web of hypertext documents should be done right isn't exactly new. AFAIK the fact that in the HTTP+HTML web stack, servers going down meant documents disappearing and links going stale was criticized even at the time. The HTTP+HTML stack winning out is probably one of the many examples of "good enough" winning over "perfect".


> “With HTTP, you search for locations. With IPFS, you search for content.”

Well, that's the difference between an URL and an URI, right? HTTP seems URL-oriented while IPFS seems URI-oriented.


Not quite. A URI is any identifier; a URL is an identifier that points to a location, while a URN is an identifier containing only a name. IPFS hashes are URNs, and both URLs and URNs are URIs.


Additionally, what the "for content" statement is trying to get across is that there is necessarily only one IPFS address possible for a given unique piece of binary content. Which is a quality not universal among URNs either.


Oh, I see. Thanks for the clarification!


  URL - Uniform Resource Locator
  URN - Uniform Resource Name
  URI - Uniform Resource Identifier

URI is the generic term: the "I" means "Identifier", which can be either a semantic/natural key or a technical/surrogate key, and in many cases it's the same Locator as in a URL.

IPFS is a CAS (Content Addressable Storage), so an "Identifier" there is a semantic key (i.e. the hash of the content).
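
Concretely (the ISBN one is the classic URN example from the RFCs; the IPFS path is the readme directory from elsewhere in this thread):

    https://example.com/paper.pdf                           <- URL: where to fetch it
    urn:isbn:0451450523                                     <- URN: what it is, by assigned name
    /ipfs/QmQPeNsJPyVWPFDVHb77w8G42Fvo15z4bG2X8D2GhfbSXc    <- content address: what it is, by hash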


As an outsider looking in: IPFS will struggle to catch on with non-technical people, like Tor has. It can be the best thing ever for privacy, access, or whatever; it still won't catch on. This is the Linux problem as explained by Torvalds.

The problem is that it isn't distinct enough from the existing experience, and the problems it claims to solve aren't something non-technical people care about. Web distribution in its current form works. Even if it's supremely shitty, grandma can still post photos on Facebook.

If a new distribution format wants to win, it needs to be different in a way an 8-year-old and grandma can equally understand. IPFS is not that. What 8-year-olds and grandma equally understand is that content is king, the layman's term for the client/server model.

More likely the future will purely be a focus on web technologies for application experiences and distribution. Kill the client/server model. If grandma wants to share photos with grandkids she doesn’t need Facebook at all. She only needs a network, a silent distribution/security model, and the right application interface. No server, no cloud, no third party required.


This is very badly written and I think the author might benefit from rewriting this.

Start with a description of the solution. If you can't cover it in 1-2 paragraphs at the beginning, you are in trouble. Don't focus so much on what everyone else is doing wrong - focus on what you are doing and let others be the judge of whether this is better.


I like the concept of immutable content, but HTTP eliminated top-down control of information?

Well, we're all dependent on the Internet, which can be shut down at any time by any government, and normally we are almost completely dependent on FAANG companies.

I applaud the freedom-activist spirit, but the only exciting technologies in this regard must be P2P and not carried over the Internet.


This has been bothering me for the longest time. I would love to learn more about the efforts being made to avoid the workarounds we have for certs expiring, DNS issues and such.

A lot of website hacking could've been avoided with a better design, such as having encryption as standard.


Well a lot of hacking could also be avoided if everything were just plain publicly-accessible static files. Which is what they're talking about with IPFS.

But that would be a very different web from today's, where most content comes from dynamic CMS/CMF, blog software, forum software, or other web apps. It addresses things like the old personal homepage of the mid-90s, but not much at all about the modern web.

It also doesn't address the major outages (like we've seen with cert revocations or DNS outages, etc.)


Just create an easy way to host stuff. Support a cultural shift toward dithered images and a universal text format. Make hiring a host cheap and easy, and make hosting locally easy. Small sizes make backing up easier and cheaper. Create a service where people can store a shard of the backup of the new internet and in return have their content added to the collective backup. I really think it's about a cultural shift more than anything, because without a collective realization that bandwidth and file sizes need to be reined in, the content on the internet will just live at the perpetually ascending ceiling of what the internet can handle. And in that situation there will never be a way to make the internet a robust, equitable and interesting place.


> file sizes need to be reined in

Tell that to all of the content creators, corporate or not.


What I would like, I think, is the ability to specify both the multihash and a place to request the data from.

Like, if I’m talking to someone over some chat setup which doesn’t have a built in “send this file directly to this person” feature, it would be nice to be able to say, give them a multihash of the file and my external ipv6 address (+ port? I’m not quite sure how routing works), and have them request the file from my computer.

Now, you might say “but how does that help with the situation where a bunch of people in a room want the same file from a distant location?”.

And, maybe it doesn’t as much? But I think if e.g. people on a local network had theirs try first if the local network already had it, and then check the given external address, that that could work?


If you have a hash of the underlying resource (note that this is different from how IPFS works) then http[s]://<authority>/.well-known/ni/sha-256/<base64-url hash> is supposed to enable this. The ni: URI scheme also allows for an authority field, which is used as a hint on how to locate the resource.
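
For what it's worth, computing the RFC 6920 form by hand is short (file name and host are placeholders; the digest is base64url without padding):

    H=$(openssl dgst -sha256 -binary file.bin | openssl base64 | tr '+/' '-_' | tr -d '=')
    echo "ni://example.com/sha-256;$H"
    echo "https://example.com/.well-known/ni/sha-256/$H"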


That does sound very much like what I want!

I’m somewhat surprised that I hadn’t heard of the ni scheme before tonight.

Is there some nice software (it can be command-line only, but ideally multi-platform) to do that when both parties are on separate residential connections, and where the receiving party's software automatically checks that the received data matches the hash?

Because if so, that seems to serve exactly the purpose I’m thinking of!

Thank you for the direction


You have just invented torrents!

> give them a multihash

magnet URIs

> place to request data from

trackers

> try local network first

Already baked in!


Just learned that apparently BitTorrent 2 supports multihashes in magnet URIs, including the Merkle tree thing, so yeah, that does sound like exactly what I'm looking for.

So it sounds like I ought to learn to use BitTorrent.
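
If I'm reading BEP 52 right, the v2 magnet form tags the info-hash with its multihash prefix, so a sha-256 one looks like this (the hash itself is a placeholder):

    magnet:?xt=urn:btmh:1220<64 hex characters of the sha-256 info-hash>

The 1220 is the same sha2-256 multihash prefix mentioned elsewhere in the thread.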

