
"Even if my server goes down, as long as these files are pinned somewhere, anyone should be able to use that IPNS name at another gateway and see the blog."

This is also, in my opinion, the biggest objection to IPFS - that it really doesn't necessarily lead to any kind of true decentralized hosting unless someone else has decided to pin your files.

And why would they? I understand that IPFS is supposed to prioritize connections based on seed/leech ratio like a typical torrent service, but I only torrent a few files at any given time and it's pretty trivial to set them to seed or disable manually; there's no way I'm going to make seed vs. don't seed decisions on every website I visit. So unless for some reason I specifically think to help share some webpage or other, it'll just get auto-deleted from my machine when it cycles out of the cache, as the automated stuff is what'll be keeping track of maintaining my seed/leech ratio, total disk usage for other people's content, upload limits, not leaving if you're one of the only seeders left, etc. In theory, IPFS could be even more prone to link-rot than the vanilla Web, depending on how many people try and actually depend on the decentralized hosting and end up having their files vanish once it's been a long enough time that nobody's still hosting them.

And when there's so many different webpages, as opposed to just a few torrents, why would I think to pin any one thing in particular? Torrents can work based on charity and the need to maintain a particular seed/leech ratio; but I don't have the mental energy to bother deciding whether to be charitable about every dang website I visit.

So this ends up meaning that IPFS works for popular, recent content, where there are enough people who have downloaded the content recently enough that it's still in their cache to meaningfully take the load off the original host. But you're not going to get dedicated long-term seeders of any particular site the way you do with highly-desirable files like pirate torrents. In general, you will always need your own centralized server for any content you want to upload and make sure stays online long-term.

As I understand it, this is sort of the problem Filecoin is trying to solve, but that has its own issues (it's hard to see how paying people to host your stuff on their own machines can ever be cost-competitive with paying AWS to do it).

You can rephrase this sentence:

> it really doesn't necessarily lead to any kind of true decentralized <data storage> unless someone else has decided to <store your data>

You might have an overly romantic idea of decentralization that doesn't necessarily align with the actual definition of decentralization. I would even argue that there is no solution for the idea you're suggesting. You can't have data that isn't stored anywhere.

The permanent bit in ipfs is actually referring to something else: The data isn't guaranteed to be available at all times, but the link is guaranteed to point to the correct data. Slightly anecdotal, but a while ago there was a discussion about a pdf in #cjdns. Somebody had an ipfs link, but nobody was seeding it anymore. A few hours later somebody dug out a pdf from an old archive but wasn't sure if it's the correct one, so we ran `ipfs add file.pdf` and the ipfs link started working again.
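To make the "link is guaranteed to point to the correct data" point concrete, here is a minimal Python sketch. It assumes a plain SHA-256 digest as the address, which is a simplification: real IPFS identifiers are multihash-based CIDs computed over chunked data, so the `address` and `matches_link` helpers below are hypothetical and will not reproduce real IPFS hashes.

```python
import hashlib


def address(content: bytes) -> str:
    """Hypothetical content address: hex SHA-256 of the bytes (not a real IPFS CID)."""
    return hashlib.sha256(content).hexdigest()


def matches_link(link: str, candidate: bytes) -> bool:
    """Check whether a recovered file is the one the link refers to."""
    return address(candidate) == link


original = b"%PDF-1.4 example document"
link = address(original)  # the "link" shared while the file was still seeded

# Later, someone digs a copy out of an old archive and is unsure about it.
recovered = b"%PDF-1.4 example document"
altered = b"%PDF-1.4 example document (edited)"

print(matches_link(link, recovered))  # True
print(matches_link(link, altered))    # False
```

Because the address depends only on the bytes, anyone holding a byte-identical copy can make the old link resolve again, which is what happened with the pdf.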

> A few hours later somebody dug out a pdf from an old archive but wasn't sure if it's the correct one, so we ran `ipfs add file.pdf` and the ipfs link started working again.

So the file was lost until it could be recovered from a different source and IPFS served the function of a glorified hash database, that's something very different from what it promises.

Bittorrent has the same mechanics with magnet URNs, but you never see them promising permanent storage.

> Bittorrent has the same mechanics with magnet URNs, but you never see them promising permanent storage.

To be fair, IPFS does not "promise permanent storage" either.

A link pointing to the data is fairly useless if nobody is seeding it.

I have torrent files that point to the correct data but it's impossible to obtain because the data they point to isn't being seeded. But just like in IPFS, if I have the correct file I can seed it again simply by adding it to my torrent client into the torrent job.

At least a third party can prevent link rot in this situation. Link rot in the standard web is really bad these days. We do have the internet archive that is trying to prevent it, but then you still need a tuple of (URL & access time) and you have to hope that one organization cached it.

The link rot still occurs on IPFS when the content isn't being seeded, I don't think that's an improvement.

Seems way more tractable to me than solving link rot in the regular web.

Not really, if some niche content with 20 people who read it goes offline, it is very unlikely the content will come back.

And then someone will have to put in the money to pin/host the content anyway if you want to keep it online.

Link rot is solvable with tools such as WARC Proxies.

> Not really, if some niche content with 20 people who read it goes offline, it is very unlikely the content will come back.

You think these 20 people will stay offline forever?

You think these 20 people will keep that niche website they read once in their cache forever or even pinned?

Torrents die all the time because nobody bothers to donate bandwidth to strangers for content they barely care about.

How is IPFS different?

It is different in that a completely unrelated third-party can pin this content of their own volition, and it will continue to be provided under the exact same address/hash. In the web, the domain owner is the single entity who can ensure survival of the content.

Torrents do the same thing. Dat also does that. Even better, with either you don't have to think about pinning. Why should IPFS be the one?

To be clear: I was referring to an overly romantic idea of IPFS I had actually had at one point, and which I acquired because lots of other people were talking about IPFS that way.

I am aware that IPFS does not actually do this, but a lot of people talk about it like it does, like IPFS means you can have websites without really needing to have/pay for any server yourself at all. (Which is, yes, technically sort of true, but only if visitors to your site reliably pin its content.)

What IPFS does do, however, is make it possible to host your website on a server with minimal capabilities - even your personal computer (provided it stays on) - because an increase in views will equally distribute the load.

That's not quite correct, the increase in views will still increase in load, especially if the increase is very sudden and not very many nodes have the content yet.

You don't have to actually pin things. Merely visiting will cache and rehost the files, and they will only be GCed at some later time. All pinning does is disable the GC for those files, which, as of a few months ago, weren't getting GCed at all anyway, so visiting something was equivalent to pinning it.
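The "pinning only disables the GC for those files" behaviour can be sketched with a toy cache in Python. This is an illustration under assumptions, not how any IPFS implementation is written: the `BlockCache` class, its least-recently-used eviction order and the size accounting are all hypothetical.

```python
from collections import OrderedDict


class BlockCache:
    """Toy size-bounded cache: oldest blocks are collected unless pinned."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = OrderedDict()  # key -> size, least recently visited first
        self.pinned = set()

    def visit(self, key: str, size: int) -> None:
        """Visiting content caches it (and so makes it available to rehost)."""
        self.blocks[key] = size
        self.blocks.move_to_end(key)
        self.gc()

    def pin(self, key: str) -> None:
        """Pinning only exempts the key from garbage collection."""
        self.pinned.add(key)

    def gc(self) -> None:
        """Drop unpinned blocks, oldest first, until the cache fits again."""
        for key in list(self.blocks):
            if sum(self.blocks.values()) <= self.capacity:
                break
            if key not in self.pinned:
                del self.blocks[key]


cache = BlockCache(capacity=3)
cache.visit("blog", 1)
cache.pin("blog")
for page in ("a", "b", "c"):
    cache.visit(page, 1)

print(list(cache.blocks))  # ['blog', 'b', 'c']
```

In this model the pinned blog post survives while the oldest unpinned page is collected. If the GC never runs, visited content stays around just as pinned content does.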

"So unless for some reason I specifically think to help share some webpage or other, it'll just get auto-deleted from my machine when it cycles out of the cache" - my original post above.

I was thinking about that content effectively "linkrotting" away if nobody visits it for a while, say it's old posts on a blog or something. I suppose how big of an issue this really is comes down to how long stuff would tend to stay in that cache in a real, practical usage scenario. I guess I'd been assuming that it'd only be day or so, but it could be longer in practice, depending on how much space is actually allocated to the local cache and how much each individual bit of unique content takes up and how many such unique chunks of content will be downloaded per day, along with maybe some more complex factors besides first-in-first-out for deciding what to collect.

(I elaborated a little more about this below, when I thought about it kinda working like an automated torrent manager that made sure you maintained a decent seed ratio, didn't exceed an upload data cap, etc. and shuffled stuff out when it was taking up too much space or had hit a target ratio and wasn't going to be seeded anymore, with some prioritization for "is there anyone else seeding this", "is this a highly-demanded bit of content", etc.)

And in any case, even though a centralized server is still needed for p. much any practical use case, it's not like that makes IPFS useless - it's still possible in principle for any given bit of content to stick around basically forever and its link to keep working, even if the server goes down, as long as other people want to save it.

Plus, even if almost nothing is going to really be truly decentralized, it could make hosting a site a heck of a lot easier because the traffic you personally have to deal with may be drastically reduced, as popular content gets uploaded to the first users who can then share with later users. You only have to be the exclusive host of content that's accessed more rarely than the period it'd typically stick around in the cache of the last person to ask for it. (depending on whether or not that last person is currently online.)

(It's easy to imagine a case where short-term decentralized hosting from user caches could be almost entirely sufficient - say, an IPFS-based version of 4chan, where threads are inherently short-lived and temporary objects anyway, and only in the very slowest and least-populated boards would you have threads that are checked so infrequently that you couldn't rely on the currently online user caches to contain that content. You'd still need a central server to do things like spam filtering, user authentication (to make bans stick), and updating the index of which file hashes and post hashes are in which threads and in which order, and which threads are currently alive on each board.)

> an overly romantic idea of decentralization

For my part, I think it's more romantic that individuals in the network might affirmatively choose to pin the content that is meaningful to them.

I know every mention of this seems to get downvoted because it's a cryptocurrency, but that's what the second half of the IPFS plan is, FileCoin. They've found a way to incentivize storing files by allowing people to 'mine' FileCoin. Mining in this sense isn't a waste of energy, or staking of coins. It's proving you have available space and proving you're making duplications of the data. This causes a race to the bottom for the pricing of storage while balancing it out with reliability.

If you want to host your website on IPFS, pop it up on a domain through Cloudflare and set up a bid for storage space.

The problem with FileCoin is that the math behind it - Proof-of-Spacetime and Proof-of-Replication - just ... doesn't exist yet. That's what the delay has been, trying to figure out how to actually turn those into usable cryptographic primitives. It's not clear that a practical, scalable implementation of the system described in the FileCoin paper is actually even possible.

(There's been some progress on getting proof-of-replication working, but it's still early stages and I haven't seen anything on proof-of-spacetime.)

Also, the economics of FileCoin don't make a lot of sense ... https://blog.dshr.org/2018/06/the-four-most-expensive-words-...

I've found they've been extremely quiet for the whole year, or longer. They did recently send out an email talking about their progress and showed some demos, but I just skipped over the email actually, didn't look into it to see if they talked about their POS 'Proof of [stuff]'.

I just dug it up and they say Proof of Replication is going great, but they don't show anything. They promise to open the code in the 'coming months' so people might see what progress has been made, if any.

No progress was mentioned for Proof of Spacetime.

Edit: The link to their update, the only substantial one since the ICO last year I think... https://filecoin.io/blog/update-2018-q1-q2/

Yeah, this is the kind of stuff I'd seen that had made me reluctant to talk about FileCoin as "IPFS has found a way to..."

They wrote a paper describing a plausible system that - if it existed - could do a thing. They've been much less successful actually finding a way to implement it.

Seems that proof of replication could be done by querying for a random subset of data from the stored file. Is anyone familiar with a hashing scheme or algorithm for determining if a subset of data is part of a larger file without having access to all the data?

There are ways to do that, proof of possession. What they need is proof you have multiple copies, but you can easily pretend to have multiple copies if they are identical by running the proof on another copy.
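One generic way to check that a chunk belongs to a larger file without holding the whole file is a Merkle inclusion proof. The Python sketch below is only an illustration of that idea, not the construction from the Filecoin paper. The helper names (`merkle_levels`, `prove`, `verify`) are hypothetical, and the odd-level padding rule (duplicate the last node) is a simplification that a hardened design would avoid.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(chunks):
    """Return all tree levels, leaf hashes first, root level last."""
    level = [h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect sibling hashes on the path from leaf `index` up to the root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling_is_right = index % 2 == 0
        proof.append((level[index ^ 1], sibling_is_right))
        index //= 2
    return proof


def verify(root, chunk, proof):
    """The verifier needs only the root, the challenged chunk and the proof."""
    node = h(chunk)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


chunks = [b"chunk-%d" % i for i in range(5)]
levels = merkle_levels(chunks)
root = levels[-1][0]

proof = prove(levels, 3)  # storer answers a challenge for chunk 3
print(verify(root, chunks[3], proof))  # True
print(verify(root, b"forged", proof))  # False
```

A verifier that keeps only the 32-byte root can challenge random chunk indices. As noted above, this shows possession of one copy, not that several independent copies exist.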

You can probably do it by mixing the data with some extra data so replicated copies actually look different, so you have to do proof of possession on each replica, and you don't even know they are replicas.

Why do we even want a given node to have multiple copies of the same data?

If I understand right, GP's saying we need proof that the nodes storing the data are collectively storing multiple copies (as opposed to being sybil identities for a single node that's getting paid multiple times for storing a single copy).

GP suggests making each copy unique. It seems to me that the difficult part is making it cheap for uploaders, verifiers and downloaders to translate between the original data and the various unique copies, without also making it cheap for storers to do so (otherwise they could just store the original data and generate parts of the copies on the fly when challenged).
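A minimal Python sketch of the difficulty described here, assuming a deliberately cheap per-replica encoding (XOR with a keystream derived from a replica id). The `keystream` and `encode` helpers are hypothetical and are not taken from any Filecoin design. They only show that a cheap encoding makes replicas look different without forcing anyone to store them.

```python
import hashlib


def keystream(replica_id: bytes, length: int) -> bytes:
    """Hypothetical keystream: SHA-256 in counter mode, keyed by the replica id."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(replica_id + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encode(data: bytes, replica_id: bytes) -> bytes:
    """XOR the data with a replica-specific keystream; applying it twice decodes."""
    return bytes(a ^ b for a, b in zip(data, keystream(replica_id, len(data))))


data = b"the same original file contents"
replica_a = encode(data, b"replica-a")
replica_b = encode(data, b"replica-b")

print(replica_a != replica_b)                   # True: replicas look different
print(encode(replica_a, b"replica-a") == data)  # True: cheap to decode

# The weakness: a storer holding only `data` can rebuild any replica on demand,
# so answering a challenge for replica-b proves nothing about a second copy.
regenerated = encode(data, b"replica-b")
print(regenerated == replica_b)                 # True
```

The encoding would have to be slow or expensive for the storer to redo on the fly, while staying cheap for everyone else, which is the hard part.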

A similar problem arises in memory-hard functions used for password hashing, such as scrypt and Argon2. Those functions are designed to ensure that you have to use a large amount of memory to compute the function - or at least, to ensure that a space/time tradeoff that allows you to use a smaller amount of memory is very expensive. I wonder if techniques from memory-hard functions could be useful in proof of (unique) storage?

Ah, got it. Thanks :)

You don't, but if you're the best bid for storing one copy you're the best bid for storing all n copies!

I think dat:// is a much more promising protocol than IPFS. The developers don't seem to delude themselves as much about the capabilities of the archive, and there is no ICO. They advertise it as what it is: torrents with updates via public keys.

With ipfs://, on the other hand, I most frequently hear people handwaving away problems like "who will pin unpopular content" or "how far does it scale" or "who will pay for gateway bandwidth once it gets big" or even "how to make content easily update" (I don't consider IPNS a good solution to that question).

I was going to post just this. While I'm very skeptical that either of them will become popular, dat seems to be more suited for publishing websites and sharing large files. For a website it's nice that you share the whole site and are able to update the content while keeping the link (it's possible to tell if the content has changed, but there is no version history). And dat is already my first pick whenever I need to copy large files privately over internet.

I blogged about my experience with dat a couple of months ago. Here's a link if anyone's interested. https://hannuhartikainen.fi/blog/dat-site/

Since then I've used dat for copying files a couple of times. I haven't really browsed the dat:// web and I'd guess nobody has visited my website over that protocol (but then I don't have analytics and estimate my blog to have a dozen visitors a month).

> who will pay for gateway bandwidth once it gets big

That's only a problem because IPFS isn't "big" enough for browsers to ship with an IPFS gateway integrated in them.

IPFS will need very big gateways because until that happens IPFS needs to grow a lot. And even then I doubt browsers would just ship a P2P application to autostart with the browser to the consumer.

This feels a bit to me like saying: "electric cars still have to charge their batteries somewhere, and they aren't necessarily being charged with renewable energy."

Electric cars provide the capability to power a car with renewable energy. Powering the grid with renewable energy is a separate problem.

IPFS provides the capability of distribution. Incentivizing pinning is indeed a separate problem, but providing this capability feels like a huge step forward to me.

Likewise if the process of distribution was automated there would be people complaining about opt-in being a waste of their own resources for content they don't care about, or something.

Exactly this. I don't understand why this is being thrown about like it's a bad thing; I gladly pinned all of Wikipedia, but it's also a huge amount of data and I made sure I was prepared to follow through before I asked my IPFS node to do it. If there's content out there that has earned my respect, I'll pin it gladly just to help the authors out, especially if they ask me politely.

To be fair, some of this is automated. If you browse using your own node, it does maintain a small cache, which helps with the load distribution when something gets really popular. This blog post is a good example; despite literally being the #1 item on HN, it's still up; that's the global IPFS cache at work. It's slower than the average site, sure, but it didn't outright vanish, even though the author clearly is using minimal resources to actually host it. I think that's awesome.

More realistically it would have the same complaints as services like Freenet. You are in legal limbo if you just host random stuff and it's not too unlikely to get all your equipment confiscated.

For web pages, you could imagine a sort of universal "like" button which pins the page locally and bookmarks it. I think it's less about incentives and more about convenience -- people are happy to help out if it's not too much trouble for them.

People don't give a damn about helping you, and will unpin any site as soon as it becomes a resource drain. And over time, they will learn that's unpredictable, and nobody wants to be a site admin, and so nobody will pin things.

I thought the last 20 years on the Internet have shown us that "humanity is noble, and good" isn't true, and we should assume "evil and/or lazy".

Yes, but keep in mind that, like BitTorrent, IPFS also includes a thing where nodes are given preferential treatment based on seed/leech ratio - so you're incentivized to help host content, and just the stuff you can share while downloading yourself might not be enough.

So yes, they will "unpin any site as soon as it becomes a resource drain" unless they're feeling particularly charitable - but that's exactly how it works with torrents, too! Generally you set up rules like "seed until a maximum up/down ratio is reached for that file, then don't bother", "restrict uploads to a certain maximum speed and/or certain maximum amount of data per day", that kind of thing. I figure you'd basically have an automatically-managed cache of limited size that you'd use to hold the stuff you were liking-to-pin, and stuff that was judged to be no longer worth seeding would get kicked off, as would old/unpopular content if the cache had filled up and you were shuffling in new content.
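The torrent-style rules quoted above can be written down as a small policy function. This Python sketch only illustrates those rules. The function name, its parameters and the byte-count units are assumptions, not settings from any real torrent or IPFS client.

```python
def should_keep_seeding(uploaded: int, downloaded: int, max_ratio: float,
                        uploaded_today: int, daily_cap: int) -> bool:
    """Hypothetical policy: stop at the daily upload cap or the target ratio."""
    if uploaded_today >= daily_cap:
        return False  # "certain maximum amount of data per day"
    if downloaded > 0 and uploaded / downloaded >= max_ratio:
        return False  # "seed until a maximum up/down ratio is reached"
    return True


# Downloaded 100 MB, uploaded 150 MB, target ratio 2.0, daily cap not reached.
print(should_keep_seeding(150, 100, 2.0, uploaded_today=50, daily_cap=500))  # True
# Target ratio reached: stop seeding this item.
print(should_keep_seeding(200, 100, 2.0, uploaded_today=50, daily_cap=500))  # False
```

An automatically managed cache could evaluate a rule like this per item and drop whatever is no longer worth seeding.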

So unpinning anything that becomes a significant resource drain is also - at least in the relatively-short-to-medium-term - fully compatible with other people being willing to "like" your site to pin it, if not perpetually, then for an extended period of time.

>IPFS also includes a thing where nodes are given preferential treatment based on seed/leech ratio - so you're incentivized to help host content

so you're disincentivized from using the service

Inside every cynical person is a disappointed idealist. - George Carlin.

The "pin" could be the new form of "bookmark" where you're almost guaranteed to never lose the page that you bookmarked

There's plenty of media/webpages I'd love to "save offline" on WiFi so I could then later access them without having to pay for mobile data (or access them at all on flights/road trips through areas with no phone service)

(Note that I am aware that for many pages you can, in fact, do this already. Downloading HTML is a thing. I just meant that since I can already do this, it would be nice if I could both do this and have it integrate seamlessly with my browser for navigating links, and host those files for other people on WiFi while I was at it)

This is the dream all us IPFS/decentralization nuthouses are chasing.

Idunno, "Save as HTML..." seems to go for that dream pretty well. And lots of other sites utilize service workers fairly well to work offline.

Save as HTML is often completely broken. Especially when javascript injects <script> tags to load even more javascript. You're better off by using archive.org's wayback machine.

There are a number of other offline tools available (google "archive team warc ecosystem"), they just need better integration.

I don't know what your personal circumstances are, but your tone here sounds like you need to take some time away from the internet, for your own mental health.

It's a huge oversimplification to describe people in just those terms. People are complicated and so are their motivations, and it's hard to tell in advance which new things will be successful or not.

For example, I once worked for a guy who hadn't heard of open source and, when I described the concept to him, couldn't wrap his brain around why so many people would take the time to write and maintain a bunch of software and then just give it away. This is not the work product of "evil and/or lazy" people.

I don't think it's a mental health issue for me that the tragedy of the commons extends to the Internet. This is not about "not knowing", this is about prioritizing self interest and actively abusing the system.

We know these two things happen, consistently. We have failed to account for them in much of the foundations of the Internet - it's the assumption of benevolent cooperation. Which was great for ARPAnet, but simply doesn't work in the wild. We have 20+ years of proof.

We cannot continue to put our heads in the sand and ignore that, because for better or worse, the Internet has become a major force shaping society.

The idea that naivete translates into mental health is certainly fascinating, but I believe the colloquial translation for that is "ignorance is bliss".

The tragedy of the commons depends a lot on the nature of the commons. People can and do work together to manage common resources. You can check out the work of Elinor Ostrom for some examples.

Not evil and/or lazy so much as self serving.

> pins the page locally and bookmarks it

If most people use smartphones to consume content on metered internet connections with limited battery life, where do they pin it such that it's available to others?

In one futurist vision, your home router/gateway modem could provide this service (and also be powerful enough to host game servers if you start up their clients on your phone while on that modem’s LAN.)

A lot of people still have a "real" computer they do a lot of browsing on - it's just increasingly that "real computer" is a laptop, rather than a desktop, computer.

(around 65% of Americans between 18 and 50 have PC/Mac laptops- see https://www.statista.com/statistics/228589/notebook-or-lapto...)

I'd be interested to know what fraction of laptops are fully powered on (i.e. not suspended) at any given time.

Here's a graph of average hours of computer usage per US household in 2009:


Let's assume these figures are useful as a first approximation of the number of hours of laptop usage per laptop owner in 2018. I'm not saying it's a perfect match, just that they're likely to be better than figures you or I would pull out of the air.

Taking the centre of each band, assuming "more than 10 hours" means "10 to 16 hours", and excluding the "no computers" band, gives a mean of 4.3 hours per day. So each laptop is active roughly 18% of the time.

That's more than I expected, but it still means you're going to need a lot of people to pin the content before you have a 99% chance of at least one copy being online at any given time.

Edit: As far as I can tell, with 18% uptime you need 24 copies for 99% reliability (15 copies only gets you to about 95%), assuming no correlation between the online times of the various copies, which is optimistic - in reality there will be strong daily and weekly cycles.
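The arithmetic can be checked with a few lines of Python. This sketch assumes, as the comment does, that each copy is online independently with the same probability. The function names are hypothetical. Under that assumption the 99% threshold is reached at 24 copies, while 15 copies gives about 95%.

```python
import math


def p_at_least_one_online(uptime: float, copies: int) -> float:
    """Probability that at least one of `copies` independent hosts is online."""
    return 1 - (1 - uptime) ** copies


def copies_needed(uptime: float, target: float) -> int:
    """Smallest n with 1 - (1 - uptime)**n >= target."""
    return math.ceil(math.log(1 - target) / math.log(1 - uptime))


print(round(4.3 / 24, 3))                         # 0.179, i.e. roughly 18% uptime
print(copies_needed(0.18, 0.99))                  # 24
print(round(p_at_least_one_online(0.18, 15), 3))  # 0.949
```

Correlated schedules (everyone asleep at the same hours) would push the required number of copies higher than this independent-uptime estimate.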

Maybe just publish what you've downloaded while charging. Much like most intensive things like say picture backups that often are done only on wifi and only when charging.

Or done charging.

Check out Beaker Browser. It uses dat instead of IPFS, but it's just 2 clicks to do just that.

This is a good description of IPFS for those of us who weren't aware of it before. But, I think if it works the way you're describing here (and I assume you have it right), then it's not going to fully succeed at being the "distributed web" that it desires to be for the reasons that you outline.

Having said that, I think IPFS sounds like a really good idea -- oddly, it's a lot like what I envisioned in a science fiction novel, where the notion of data being replicated across hundreds of storage nodes is so taken for granted that the main character has trouble conceiving of the notion of data that has a "location" (that is, is stored on only one device). But to get there, it needs to be largely content-agnostic: if the data is out there on the network, then it's replicated across those hundreds of storage nodes, regardless of popularity. IPFS proves that technology is basically already here in theory -- but in practice, I'm not sure it's feasible in terms of storage costs/requirements yet.

As a counter argument, your standard browser already caches a ton of stuff, might as well serve some of it back up if your upstream can support it. If you replace your browser cache with IPFS, and replace bookmarking with pinning you've got a functioning system. Your hosting requirements for content will likely go down, you get massive scale redundancy and reliability for free, no more slashdot effect and you might be able to claim 100% uptime due to just how unlikely it would be that someone hits your site while your root host is down and it's so low traffic that it's not bouncing around the network, even if it's just a standard consumer phylink.

You could still get a 'hug of death', but as the host you can just sit back and let the infrastructure work, distribute and recover, a la what it looks like to be the first seeder of a torrent. I predict this is unlikely in real-world situations though, as for someone to share your site they'd likely have visited it first, thus your content host is likely to not be the only host. I'm an optimist, mostly because I think it's stupid where we are at concerning self-hosting from our homes.

IPFS is basically a cross between BitTorrent, plus Git's datastructure for permanently and uniquely identifying particular versions of and updates to a given piece of content, plus a clever distributed hash table for keeping track of who has what content.

Am I the only one who finds this prospect very exciting?

Definitely not, hence the enthusiasm with IPFS, and Cloudflare's recent decision to launch an IPFS gateway.

One idea I have is to run health checks against the network from different points in the IPFS node graph. For example, you could add your file to your local node (connected), then spawn some geographically distributed ephemeral nodes to request the file. Once all of your ephemeral nodes have received it, they wipe, and you take your local node offline. The ephemeral nodes then heartbeat, at some interval t, requesting the file while the original source is offline. If some N of them don't receive the file within some timeout, reconnect your local node. In theory this would keep your file alive within the network for some x*t timespan; the question is what that timespan is and what the cost of running the heartbeats is.... I'm actually going to run this experiment and I'll reply here with my results.
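The decision step of that experiment can be sketched in Python. Everything here is a stand-in: a "probe" is any callable that returns True if an ephemeral node fetched the file within the timeout, and `reconnect` is whatever brings the local node back online. How a real probe would talk to an IPFS node is deliberately left out, and the names are hypothetical.

```python
from typing import Callable, Iterable


def failed_probes(probes: Iterable[Callable[[], bool]]) -> int:
    """Run each probe once; a probe returns True if it got the file in time."""
    return sum(1 for probe in probes if not probe())


def heartbeat(probes: Iterable[Callable[[], bool]], max_failures: int,
              reconnect: Callable[[], None]) -> bool:
    """One heartbeat round: reconnect the origin if at least N probes failed."""
    if failed_probes(probes) >= max_failures:
        reconnect()
        return True
    return False


events = []
# Stand-ins for three ephemeral nodes: one succeeds, two time out.
probes = [lambda: True, lambda: False, lambda: False]
heartbeat(probes, max_failures=2, reconnect=lambda: events.append("reconnect"))
print(events)  # ['reconnect']
```

Calling `heartbeat` every t seconds and counting the rounds before the first reconnect would give an estimate of the x*t lifetime mentioned above.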

> (it's hard to see how paying people to host your stuff on their own machines can ever be cost-competitive with paying AWS to do it).

The counterargument would be: there are cheaper ways to mine bitcoin than on aws infra. If you can incentivize that with money, you can incentivize storage with filecoin.

You also get censorship resilience (and maybe even better distribution if all goes excellently).

> This is also, in my opinion, the biggest objection to IPFS - that it really doesn't necessarily lead to any kind of true decentralized hosting unless someone else has decided to pin your files.

Perhaps not, technically, but it will turn hosting into a commodity. And it will remove any difference between centralized and decentralized from the user's point of view. Therefore, it's a huge win.

While it remains to be seen whether Filecoin will be price-competitive with AWS, IPFS has already proven itself to be far more censorship-resistant than any centralized provider can hope to offer.


The original IPFS whitepaper mentions Filecoin, so that is one possible incentive, but there are many other ways to incentivize people to keep files available: at the very least, a superset of the current incentives for hosting files.

In my experience, persistence is random.

"Fast Data Transfer via Tor"[0] has persisted for ~20 months. But "Fast Data Transfer via Tor (Methods)"[1] was gone soon after I took the hosting IPFS node down.

0) https://ipfs.io/ipfs/QmUDV2KHrAgs84oUc7z9zQmZ3whx1NB6YDPv8ZR...

1) https://ipfs.io/ipfs/QmSp8p6d3Gxxq1mCVG85jFHMax8pSBzdAyBL2jZ...

> it's hard to see how paying people to host your stuff on their own machines can ever be cost-competitive with paying AWS to do it

That's a big part of Amazon's AWS offerings, hosting other people's files on their own machines. Heck, Amazon could spin up IPFS host nodes and start earning filecoin for themselves. The goal is that file hosting becomes competitive, cheaper and more available. You won't have to choose AWS or DigitalOcean, you just throw some change at it and anybody can host it.

But, why would Amazon want to do that? Right now they get a lot of brand recognition from doing it themselves.

Because they have idle machines and there would be money to be made.

>it really doesn't necessarily lead to any kind of true decentralized hosting unless someone else has decided to pin your files

You can either pay somebody to pin the files, or a public organization like "archive.org" decides that a site is important enough to pin it. The typical surfer will seldom pin something but can keep low-frequency special-interest sites alive. There is no free lunch.

With HTTP it has become hard to mirror a modern site. In IPFS it is a built in feature. Makes a better Internet.

>And why would they?

Because they're reading the blog, and are interested in the content - i.e. because next generation IPFS-friendly browsers participate in the pinning, as they should.

I mean, think about it: there is already a copy of this web page out there, multiply redundant, in the form of our browser cache. All it really needs for IPFS to be viable is for the browser vendors to give the user the means to make those cached files part of their contribution to the Internet...

That would be beautiful, and hopefully realizable with a browser extension, which would get glommed into the browser core (a la Brave and Firefox) once the technology is proven.

But how do you solve NAT? Most end-user devices can't expose ports, and IPs are increasingly shared. Skype was uniquely successful there, but I doubt IPFS in its current form could make use of those tricks.

> But how do you solve NAT?

The solution to NAT is IPv6.

To make the data live after the original node goes offline, you have to push your data to random nodes which also push to random nodes.

Your node blindly accepts data, stores it and uploads it.

There are some P2P network implementations which did that and failed horribly in the end. A few abusers can DoS the network.

Oh, and there are illegal data and numbers. I'm a rather extreme freedom-of-speech advocate myself, but even for me there is some immoral data I will never knowingly help spread.

Having a network-connected device that you own be always available may not be such a huge problem. Many homes use a router that is always connected: this router could also serve as an IPFS content host!

I view IPFS as a better distribution network, not a better host - in contrast to the current internet's structure, if many people request the same content from me, they might only need to contact my server once and then share it among themselves, thus reducing overall load.

So the point of IPFS is not to store content for eternity, but to provide efficient distribution of content.

Basically - why would anyone else have your files pinned somewhere?

It's not just pinning though, is it? This is like saying "why would anyone host your file on BitTorrent", but the reality is that no one has to host your file on BitTorrent. Downloading naturally seeds[1]. Unless something has changed with the pitch, this is the same with IPFS. Not only do people seed while downloading, but it will (eventually) even seed locally first, such as within the same network.

IPFS promises nothing, nor tries to, in the way of permanent archiving. It's about reducing congestion and gaining benefits of immutability. It could archive something in the sense that something popular is difficult to remove.. but that's definitely not a guarantee.

[1]: ignoring aggressive non-sharing downloaders of course

Yeah, and that's entirely true as an explanation for why IPFS is still valuable - if you have popular content other people have accessed recently or are currently accessing, it can in theory take a ton of load off of your server by letting those people share some of the load to new users. Incredibly powerful for DDoS protection/avoiding the "Hug of Death", and potentially greatly reducing hosting costs by reducing traffic to your particular server at precisely those moments that would be most expensive - when you'd otherwise be serving lots of requests for some newly popular bit of content.

It's just a mistake to view IPFS as allowing for truly "decentralized" websites or as a decentralized file storage platform - unless you have the kind of content that makes other people want to follow the typical torrent model and actively long-term "seed" it by pinning, you'll still need to have your own personal central server to host the content on if and when nobody else is.

Which, yes, IPFS has explicitly never promised that - but a lot of people seem to think it does.

this explanation actually makes sense as to a possible use case for this tech. thanks!

You say: IPFS promises nothing, nor tries to, in the way of permanent archiving.

From the IPFS web site[0]:

Humanity's history is deleted daily


IPFS keeps every version of your files and makes it simple to set up resilient networks for mirroring of data.

This is clearly stating that the system keeps every version of your files, and it says nothing about "pinning", or the fact that the files will, in fact, not be kept. At best the web site is misleading, at worst it is simply lying.

I don't yet know what IPFS really is, nor what it really does, but it's statements like that on the IPFS web site that make me distrustful of the hype.

[0] https://ipfs.io/#why

Imo, the hype is irrelevant. Many people misunderstand IPFS, and the wording that you highlighted doesn't help. Luckily, I view IPFS as a very valuable "technology", and whether or not it specifically wins in this space, I don't care - I care that the technology is useful and I think it (or something like it) will eventually be the future of the web.

So I don't really buy into the hype-drama. So many people are concerned with hype.

Anyway, to your specific points - if you understand IPFS those comments are not entirely off base. However I can understand why they would lead people astray. In reality I see those comments, i.e. human history being deleted, as a reference to the mutable web. I can find a post on Reddit and today it is meaningful, tomorrow it might be deleted. In a general immutable system, if I reference an immutable address to the content I care about I will always find exactly that content. Whether or not it exists permanently is another issue, one that I don't care about honestly - I care that what exists can't change out from under you. Just viewing data in an IPFS-like system naturally makes you own it, as you effectively download a copy of it. No one can take that from you.

Now, whether or not you decide to permanently hold onto the data you want is another story. But again, permanency is not likely to be "solved" by anyone.. and honestly, given how so much "content" can be illegal, I don't think we ever can or should solve the permanency issue.

> ignoring aggressive non-sharing downloaders of course

It has not been reasonable to ignore such users for the last five years or so. Like it or not, the computing landscape has shifted to mobile devices which aren’t on all the time and have limited power and bandwidth. Perhaps BitTorrent is OK now, in its niche, but if it were to go mainstream for downloads you’d go from a small fraction of non-sharing downloaders to a high fraction.

Skype’s architecture is an interesting one in that space: it used to be distributed, with many computers all over the place nominated supernodes; but the shift to mobile made that architecture untenable, and so they had to shift to a centralised model, which generally performs worse, to keep it working at all—not for surveillance, but for scalability!

> It has not been reasonable to ignore such users for the last five years or so.

You're taking me the wrong way. I wrote that because I didn't feel like writing paragraphs going into explicit detail over the pros and cons of seed avoiders and how one might handle them, etc. It wasn't the point of the conversation I was trying to make, informing about the general design. BitTorrent works without you hosting the file, so does IPFS, that's the point. Nothing more, nothing less.

Which, I should have known better from HN, but /shrug. I guess in the future I need to be more explicit when I ignore a topic.

[1]: ignoring human nature of course


I enthusiastically pay money to Pinboard every year to maintain a snapshot of every page I've bookmarked (6,280 to date). I don't like trying to reference or look up something that's a few months or years old and not being able to find a current copy of it anywhere. The Internet Archive sometimes helps here, sometimes doesn't, depending on the individual site.

Pinning other people's pages that I've read seems like a reasonable way to scratch the same itch.

Idea: whichever bookmarklet you use to pin sites should also request an internet archive backup of the site at that moment.

Say you have an interesting page on your website that I want to reference on my website - then I would pin at least the page I am linking to, to ensure that my links continue to work (I really hate dead links).

Because you would reciprocate? I can imagine people forming networks where they trust each other and systematically pin each other's files.

Because the network can pay them, if you pay into a kitty.

It's called FileCoin. And it's the answer to what the grandparent says.

Is it possible to set up an IPFS caching server, which stores any content from any point in time, up to some configurable percentage of disk space? Ideally with some kind of mechanism to store poorly seeded data.

I don't know how IPFS works, but I'd like to contribute by setting up and forgetting an IPFS cache on the Internet to help the network grow.

Anytime a user adds content to an IPFS daemon, provide messages are sent out to the closest peers containing the cid of the newly added content. The peers will store the record, the cid of the content, and who told them about it. The peers will not request and download the content. However, you could set this up on your own.

It's not a configuration option though. You would either have to build up your own system using the ipfs libraries, or monitor the events from a daemon.

Unfortunately there is a bug that is not allowing the information required to be logged correctly in the events, so the actual cid of the content is not exposed.

If it were, you could look for the `handleAddProvider` or `handleGetProvider` operation through the event logs (`ipfs log tail`) and then inspect the object (`ipfs object stat`) to determine if you wanted to pin it.

> Ideally with some kind of mechanism to store poorly seeded data.

This would be a little more difficult, but you could attempt to query the dht to determine how many peers are already providers for the data (`ipfs dht findprovs`).
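A rough sketch of that "poorly seeded" check in Python, for what it's worth. It assumes the `ipfs` CLI is on the PATH, that `ipfs dht findprovs` prints one provider peer ID per line, and that `-n` caps the number of providers returned; the function names and the threshold of 3 are made up for illustration. The command runner is injectable so the logic can be tried without a running daemon. It does nothing about disk quotas.

```python
import subprocess


def count_providers(cid, max_providers=20, run=subprocess.run):
    """Count distinct peers the DHT reports as providers for `cid`.

    Assumption: `ipfs dht findprovs` prints one peer ID per line.
    """
    result = run(
        ["ipfs", "dht", "findprovs", "-n", str(max_providers), cid],
        capture_output=True, text=True, check=True,
    )
    peers = {line.strip() for line in result.stdout.splitlines() if line.strip()}
    return len(peers)


def is_poorly_seeded(provider_count, threshold=3):
    """Hypothetical policy: fewer than `threshold` providers is 'poorly seeded'."""
    return provider_count < threshold


def pin_if_poorly_seeded(cid, threshold=3, run=subprocess.run):
    """Pin `cid` locally only when few other peers provide it."""
    if is_poorly_seeded(count_providers(cid, run=run), threshold):
        run(["ipfs", "pin", "add", cid], capture_output=True, text=True, check=True)
        return True
    return False
```

You would still need some source of candidate cids to feed into `pin_if_poorly_seeded`, which is exactly the part the event-log bug above gets in the way of.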

>This is also, in my opinion, the biggest objection to IPFS - that it really doesn't necessarily lead to any kind of true decentralized hosting unless someone else has decided to pin your files.

This is why IPFS (w/ Filecoin), Storj and others use cryptoeconomic mechanism design and incentive structures to incentivise hosting / pinning data.

From what I've understood, IPFS decentralization = if you have enough money to host, you have a voice. If enough people like what you say, you have a voice. If either of those fails, good luck & hope the door doesn't hit you on the way out.

> This is also, in my opinion, the biggest objection to IPFS - that it really doesn't necessarily lead to any kind of true decentralized hosting unless someone else has decided to pin your files.

I think of it more like email: email doesn't work either unless you pay someone to host or manage your email (or set up your own). It's like email in that the protocols (multihash stuff etc.) are more like an RFC describing how the system works, and thus standardize how files are shared on the internet. Other tools (browsers etc.) can build on those primitives to have a better system of caching, and tooling doesn't have to keep re-inventing the algorithmic wheel in terms of how to do this stuff. Maybe IPFS isn't the address-files-by-the-hash-of-their-bytes protocol we end up with, but I think it's likely that we will end up with something(s) that do operate in that way and unify the disparate methods of managing that stuff we use today.
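To make "address files by the hash of their bytes" concrete, here is a toy sketch in Python. It is not how IPFS itself does it (real IPFS chunks files and uses multihash-encoded CIDs); the `ContentStore` class and the plain SHA-256 hex address are simplifications assumed purely for illustration.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the address IS the SHA-256 of the bytes."""

    def __init__(self):
        self._blocks = {}  # hex digest -> bytes

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Anyone fetching from an untrusted host can re-hash and verify.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data
```

The point is that it doesn't matter who serves the bytes: the same content always gets the same address, and any change to the content produces a different one.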

So as for who hosts the files you pin, I would figure you'd pay someone to do that in the near term in the same way you'd pay for S3. In the future these providers could compete on price by offering features like p2p load balancing to offer cheaper prices, CDN integration for better perf. Then in a further off future something like FileCoin[1] can move it to a decentralized system (we'll see how it plays out, and even if we don't get there I think it's still better than the smorgasbord we deal with now).

The cool thing is between all those transitions there should be minimal disruption in how your data is stored and accessible, and it should be common between a lot of different projects.

To me the big open issue is that I think IPNS isn't what I would do for what I call the labeling problem; there I'd go with something more like a git type of system. You really need a way to move the content hashes out of the URLs if you want them to be human readable, then the URLs themselves (which really probably ought to be URNs) should assert their own immutability, allow namespacing and versioning.

So I could say something like set-label 'mutable://blog.johnsmith.com' -> "content-hash|other-label" but have that action recorded into a blockchain or publicly accessible git repo like thing pointed at by DNS and have the name resolution system replacement be able to query that chain automagically when making requests (the browser for example is aware of this system and leverages it). Further for mutable urls (again probably ought to use urns) previous versions can be requested trivially with @version e.g. "mutable:blog.johnsmith.com@3" and so on.
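Roughly what I have in mind, as an in-memory Python sketch. Everything here is hypothetical: the `LabelRegistry` class, the `set_label`/`resolve` names, and the use of a plain dict where a real system would have a blockchain or public git-like log. Versions are 1-based and append-only, and a label may point at another label.

```python
class LabelRegistry:
    """Append-only label -> target history, resolvable by '@version'."""

    def __init__(self):
        self._history = {}  # label -> list of targets, oldest first

    def set_label(self, label, target):
        """Record a new target for `label`; returns the new version number."""
        versions = self._history.setdefault(label, [])
        versions.append(target)
        return len(versions)

    def resolve(self, ref, _seen=None):
        """Resolve 'label' (latest) or 'label@N' (version N) to a content hash."""
        seen = _seen if _seen is not None else set()
        if ref in seen:
            raise ValueError(f"label cycle at {ref!r}")
        seen.add(ref)
        label, _, version = ref.partition("@")
        versions = self._history[label]
        if version:
            index = int(version)
            if not 1 <= index <= len(versions):
                raise KeyError(ref)
            target = versions[index - 1]
        else:
            target = versions[-1]
        # A target may itself be another label; follow it.
        if target.partition("@")[0] in self._history:
            return self.resolve(target, seen)
        return target
```

Old versions are never overwritten, so "mutable:blog.johnsmith.com@1" keeps resolving to the same content hash no matter how many times the label is updated afterwards.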

We can do all this now ad-hoc with HTTP headers etc., but it's all disparate and not unified like it should be; for example it doesn't tie in meaningfully with your file system. If it did, then my file system on my computer at the label level (not the inode implementation level) would also use the same scheme. I write to a folder there, it automatically updates the label tree of that folder into the naming system and it's all consistent.

1: My hunch is that AWS/Azure/etc will have economies of scale too hard to beat in terms of the actual hardware but they may be eclipsed / acquire a startup that does the user facing implementation of this.

Yeah, I don't think anyone believes that IPNS is good enough as a naming solution. Regarding the blockchain-like thing, I remember seeing something called NameCoin that sounded pretty similar to what you're talking about, and offered the ability to securely assign a human-readable namespace to stuff like Tor hidden services or IPFS hashes.

There is also Ethereum Name System, which works quite nicely:


If I'm not mistaken there were already some attempts to connect it to a TLD hierarchy.

Perhaps you're confusing IPFS and Filecoin? IPFS provides the storage and retrieval layer, while Filecoin allows for incentivised markets, to your point.

Sure, IPFS isn't a great archive of uninteresting content that needs to be kept forever.

But imagine the normal thing is to give 1GB of cache to IPFS. Granted, if a planet's worth of 1GB caches isn't enough to save your content then it's not interesting enough.

The thing I'm worried about is niche content that is very interesting/useful, but only rarely - the kind of stuff that old Web1.0 personal sites, blogs, forum threads, and university-faculty webpages can be a treasure trove of, and which is especially prone to linkrot. It is precisely this content that I would most want a "permanent Web" to preserve. This is the "long tail" of content - any individual piece is only rarely interesting and only to some people, but in aggregate they actually make up a substantial fraction of the value of the Web to me.

So if IPFS aspires to be the "permanent Web", but is virtually useless for preserving these, it hardly seems to qualify. With a few exceptions (for very large content that would require a ton of bandwidth to download and host, or to very complex content that requires preserving the structure of a whole large website to stay functional instead of individual pages, or for inherently server-side content), the kind of stuff that would be popular enough for someone to pin virtually never vanishes from the Web - even if links to it do break - because that's the kind of stuff that has almost certainly already been saved locally & reposted to other websites. And the stuff that's served unpinned from user caches is, very clearly, not permanent.

So it's a "permanent Web" ... which is, at best, barely less fragile and prone to linkrot than the current web, in that if content is popular at one point and then the site layout changes or the original host goes down entirely it can still be kept up at the same link if someone was smart enough to save it locally, and specific versions of a website can be linked to specifically (even if nobody's necessarily hosting them anymore). But in all other respects, it is exactly as ephemeral as the current Web, and the fancy decentralized parts of it that are different than the current Web are among the most ephemeral parts, while the option to still have the boring old fragile centralized-server solutions where you host your own content personally are the durable ones.

I guess the Internet Archive could just throw everything it has into IPFS and pin it. It's not much better than the current situation (in which the Archive is the backup for all those old academic sites and forum threads that have rotted away) but on IPFS instead.

If literally no one in the world cares enough about a website to dedicate some of their resources to it, does it really matter that much?
