
Wonder if it's practical to "buffer" popular content on IPFS by copying it to normal HTTP servers.

Requesting an IPFS document would query a few popular repositories, then fall back to normal IPFS if it's not found.

These buffer servers would also track what's popular and shuffle around what they store accordingly.
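Something like this sketch, say, where the "buffer" servers are just HTTP mirrors keyed by CID (the gateway URLs here are only stand-ins for the idea, and the fallback is the local daemon's own gateway):

    import urllib.request

    # Hypothetical "buffer" servers mirroring popular content over plain HTTP;
    # public gateways are used here purely as stand-ins.
    BUFFERS = [
        "https://cloudflare-ipfs.com/ipfs/",
        "https://ipfs.io/ipfs/",
    ]
    # Last resort: normal IPFS via the local daemon's gateway.
    LOCAL_GATEWAY = "http://127.0.0.1:8080/ipfs/"

    def fetch(cid):
        # Try the fast HTTP buffers first, then fall back to normal IPFS.
        for base in BUFFERS + [LOCAL_GATEWAY]:
            try:
                with urllib.request.urlopen(base + cid, timeout=10) as resp:
                    return resp.read()
            except Exception:
                continue  # not buffered here (or unreachable), try the next one
        raise LookupError(cid + " not found on any buffer or via IPFS")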




I think this is exactly what Cloudflare's and ipfs.io's web proxies do. They won't cache your stuff forever, but they'll cache it as long as someone requests the content again before the content gets removed from cache.

The downside of this approach is that it only works with popular nodes and you'd be back to the old, centralised internet architecture for all real use cases.

I don't think you can accurately gauge what is and isn't popular in a P2P network like IPFS. You never have a view of the entire network, after all.

There's also the problem of running such a system. Who pays for its upkeep, and do we trust them? If we used Cloudflare's excellent features, who's to say Cloudflare wouldn't intentionally uncache a post criticising their centralisation of the internet, forcing the views they disagree with onto the slow net while the views they agree with get served blazingly fast?

I don't think such a system would work well if we intend to keep the decentralised nature of IPFS alive. Explicit caching leads to centralisation; that centralisation is the exact reason caching works.

Instead, the entire network needs a performance boost. I don't know where the performance challenges in IPFS lie, but I'm sure there are ways to improve the system. Getting more people to run and use IPFS would obviously help, but then you'd still only be caching popular content.

Edit: actually, I don't really want to see caching driven by popularity either, because as it stands IPFS essentially shares your entire browsing history with the world, both by requesting documents in plain text and by caching the documents you've just read. I wonder if that IPFS-through-Tor section on their website ever got filled in; the last few times I checked, it was just a placeholder in their documentation.


How much were you paying for your IPFS pin? E.g., if you're getting something via HTTP, there's a server somewhere with that content just waiting for you to request it, typically stored on an SSD, etc., vs. IPFS pins, which are typically packed onto massive disks shared with lots of other people.

IDK a whole lot about IPFS, though. Maybe it was the metadata resolution / DHT lookup or whatever that was super slow. BitTorrent latency was always pretty high, but it didn't matter because throughput was also high.


My IPFS pin was just one or two of my servers running an IPFS daemon. Since that daemon was running on Oracle's free VPSes, the answer is probably "a small fraction of what it costs for Oracle to have you in their database".

Paying for pinning sounds like something that could work, but it would introduce some of the same problems the real web suffers from back into IPFS. The idea of "a web for the people, by the people" becomes problematic when you start paying people to make your content more accessible.


If it was slow running on a dedicated VPS, that's not super encouraging.

The thing I liked about the idea of IPFS pinning is that you are paying per byte stored vs. per byte accessed, as long as the P2P sharing works. I.e., hosting-via-pinning a website only you read would cost the same as hosting a website that the whole internet reads.


To be fair to the software itself, the CPU was never pegged or anything, and it wasn't a fast VPS to begin with.

From what I could tell, the performance issue was mostly in the networking itself: getting the client to find the content on the right server. That's something that could be improved through all kinds of algorithms without breaking compatibility or functionality, so there's hope.
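(For what it's worth, one rough way to see how much time goes into "finding who has it" vs. "fetching it" is to time the local daemon's RPC calls. This sketch assumes a Kubo/go-ipfs daemon with its RPC on the default port 5001; the CID is just a placeholder, and endpoint names have shifted a bit between versions.)

    import time
    import urllib.request

    API = "http://127.0.0.1:5001/api/v0"   # default Kubo RPC address
    CID = "QmYourContentHashHere"          # placeholder CID

    def timed_post(path):
        # Kubo's RPC endpoints expect POST requests.
        req = urllib.request.Request(API + path, method="POST")
        start = time.monotonic()
        with urllib.request.urlopen(req) as resp:
            resp.read()
        return time.monotonic() - start

    # Content routing: time to find a peer that provides the CID.
    print("findprovs:", timed_post("/dht/findprovs?arg=" + CID + "&num-providers=1"))
    # Retrieval: time to actually pull the content.
    print("cat:", timed_post("/cat?arg=" + CID))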

I agree that pinning opens up some interesting ways to monetize hosting without the targeted advertising the web seems to rely on these days. Small projects like blogs, webcomics and animations could be hosted and supported entirely by the communities around a work, whereas right now giant data brokers need to step in and host everything for "free".


> They won't cache your stuff forever, but they'll cache it as long as someone requests the content again before the content gets removed from cache.

"It stays in the cache as long as it stays in the cache"

??? What on earth does this mean?


Content is cached for a certain amount of time (default is 24 hours, I think?) before it gets deleted. If the content is requested again, the timer is reset.

This is opposed to long-term caches like Cloudflare's that'll cache the contents of your website regardless of how many requests come in. Cloudflare will happily just refresh the contents of your website even if nobody has been to your website for weeks, and quickly serve it up when it's needed.
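Roughly a sliding-expiry cache, in other words. A toy sketch (the 24-hour default is just the guess above):

    import time

    TTL = 24 * 60 * 60       # guessed default: 24 hours
    cache = {}               # cid -> (content, expiry timestamp)

    def get(cid, fetch_from_ipfs):
        now = time.time()
        entry = cache.get(cid)
        if entry and entry[1] > now:
            content = entry[0]              # still cached
        else:
            content = fetch_from_ipfs(cid)  # expired or never seen: refetch
        cache[cid] = (content, now + TTL)   # every request pushes the expiry out
        return content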


That’s not how Cloudflare normally works: the HTTP cache is demand-based and does not guarantee caching. What you’re describing sounds like their Always Online feature, which regularly spiders sites to serve in the event of an error.


I read it as saying that if someone downloads it before the cache timer deletes it, it resets the timer. So if the file is downloaded regularly, it is never removed from the cache.


The irony is that this and other IPFS problems will (must?) be fixed by recentralization. Cloudflare is doing this with IPFS Gateway, and Google will surely embrace/extend/usurp IPFS if it becomes popular. The user experience of bare IPFS is just not good enough.


I agree with a [previously] dead/deleted comment at this level:

"Doesn't matter. the point of ipfs is that when cloudflare and google shut down their gateway, the ipfs content is still available at the same address."


This really is one of the cruxes of building decentralisation in at the protocol level. Even if centralised services exist, as long as one person who cares exists, the content lives on.

Without decentralisation being supported at the protocol level, as soon as the host dies, it's gone. This is particularly problematic because centralised services slowly subsume small services/sites, which either cuts off the flow of traffic to the other small sites or, when something eventually changes on the big centralised site, breaks a bunch of these little sites.


… if someone else paid to host a copy. Major companies hosting it makes that less likely, and if their backing increases usage, that also increases the cost of hosting everything, making it less likely that the content you want will be available. When Google shuts down their mirror, suddenly all of that traffic is hitting nodes with far fewer resources.

The underlying problem is that storage and bandwidth cost money, and people have been conditioned not to think about paying for what they consume, so things end up either being ad-supported or overwhelming volunteers.


> suddenly all of that traffic is hitting nodes with far fewer resources.

One of the points of IPFS (and BitTorrent before it) is that this is not a problem; each node that downloads the data also uploads it to other nodes, so having lots of traffic actually makes it easier to serve something (indeed, if it was already widely seeded thanks to Google's mirror, there wouldn't be any sudden traffic).


I'm not particularly familiar with IPFS: does it have some solution for free-riding?

BitTorrent, as many have noted, is great for popular things, even not-particularly-popular things, but absent incentives to continue seeding (e.g. private trackers' ratio requirements), even once-popular things easily become inaccessible as the majority of peers don't seed for long, or at all.

I guess what I don't quite get is what IPFS adds vs., say, a trackerless BitTorrent magnet link that uses DHT? Or is it really just a slight iteration/improvement on that system?


> I guess what I don't quite get is what IPFS adds vs., say, a trackerless BitTorrent magnet link that uses DHT?

Beats me! I think there might be support for finding new versions of things, but I'm not sure about the details or how it prevents authors from memory-holing stuff by saying "The new version is $(cat /dev/null), bye!".


No, it doesn’t.

If nobody pins a link, it disappears. There is no strong incentive; it just rides on abundant space and bandwidth and wealthy Gen Xers who want to be a part of something.

The same group released Filecoin, which experiments with digital asset incentives... and venture capital.

Inconclusive results


BitTorrent use breaks Tor; IPFS downloads do not.

So that's one advantage, for one audience.


Doesn't matter. The point of IPFS is that when Cloudflare and Google shut down their gateways, the IPFS content is still available at the same address.


> Wonder if it's practical to "buffer" popular content on IPFS by copying it to normal HTTP servers.

I guess the approach would be to simply run IPFS on those servers, with the popular content in it, as a seed.
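E.g. a sketch, assuming a Kubo/go-ipfs daemon runs on each such server with its RPC on the default port 5001 (the CID list is hypothetical):

    import urllib.request

    API = "http://127.0.0.1:5001/api/v0"      # default Kubo RPC address
    POPULAR_CIDS = [                           # hypothetical list of popular content
        "QmSomePopularContentHashOne",
        "QmSomePopularContentHashTwo",
    ]

    for cid in POPULAR_CIDS:
        # Pinning fetches the content and keeps it locally, so this server
        # keeps seeding it to the rest of the network.
        req = urllib.request.Request(API + "/pin/add?arg=" + cid, method="POST")
        with urllib.request.urlopen(req) as resp:
            print(resp.read().decode())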



