
Hugo and IPFS: how this blog works (and scales to serve 5k% spikes instantly) - mscasts
https://withblue.ink/2019/03/20/hugo-and-ipfs-how-this-blog-works-and-scales.html
======
matt2000
The basis of this post isn't too sound:

* The "5k% spikes" being discussed here are on a tiny base load. His peak is 6K page views in a day. That's 4 pages a MINUTE. A raspberry pi could handle that. Anything can handle that. There's no load here.

* It's not clear what IPFS is getting him, but it's certainly not performance. Any performance improvements are coming from using the Cloudflare caching proxy, which is basically just using Cloudflare as a CDN. You can get that with a normal website behind Cloudflare with a lot less hassle.

It's fine to check out the implementation details for interest's sake, but
let's just be clear that it all could have been done with a $5 VPS.
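
As a rough check on the arithmetic above, here is a minimal sketch that converts the quoted daily peak into an average per-minute rate. The 6,000 figure comes from the comment; the function name is only illustrative, and the calculation assumes traffic is spread evenly over the day.

```python
def views_per_minute(views_per_day: float) -> float:
    """Average page views per minute for a given daily total."""
    minutes_per_day = 24 * 60  # 1440 minutes in a day
    return views_per_day / minutes_per_day


# 6K page views spread evenly over one day
print(round(views_per_minute(6000), 2))  # 4.17
```

A real spike would be burstier than a flat average, but even a tenfold burst stays in the range of tens of requests per minute.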

~~~
marknadal
The question is, is CloudFlare caching IPFS or not?

If CloudFlare is running IPFS naked, then IPFS seems to be scaling.

If CloudFlare is caching IPFS, then it is CloudFlare saving their butts.

Can we confirm which it is?

I'm trying to chat with CloudFlare about my own P2P protocol, which I've seen
handle HackerNoon's 15M monthly users (I saw about 10K concurrent users per
second at peak load), to run GUN (
[https://github.com/amark/gun](https://github.com/amark/gun) ).

I do think it is important to test all these protocols out at larger scales,
since that makes it easier to debug and fix problems (assuming CF isn't
caching).

~~~
joecot
> If CloudFlare is running IPFS naked, then IPFS seems to be scaling.

> If CloudFlare is caching IPFS, then it is CloudFlare saving their butts.

They are caching IPFS for people who don't support IPFS. They are not caching
IPFS for people who do support IPFS, and instead passing the user through to
the IPFS servers.

Which means Cloudflare is indeed saving their butts, because as soon as IPFS
has widespread support, this falls flat on its face. Same as traditional
hosting would without a CDN. Their IPFS setup may be able to distribute the
load enough to avoid an internet hug crippling it (that's part of IPFS's
point), but this setup in no way demonstrates that.

------
WhatIsDukkha
So the way I think of IPFS (disclosure: haven't deepdived) is something like
the world we could have if we had a broadly supported CDN protocol for the
Internet (this seems really overdue).

This very naturally answers the repeated questions from people about the why
of IPFS.

Yes you COULD serve this via 1 or 5 of the other 20-30 incredibly individually
complex stacks with CDN functionality as well. But why would you if there was
a common CDN protocol?

What would the upsides be if your ISP could offer you an opt in localish IPFS
instance?

IPFS is a functional expression of the long term academic research into
creating an internet oriented around content blocks vs server socket oriented
as it is today -

[https://en.wikipedia.org/wiki/Content_centric_networking](https://en.wikipedia.org/wiki/Content_centric_networking)

We, of course, need both, so it's not an either/or.

We are backing our way into this world via javascript SRI -

[https://en.wikipedia.org/wiki/Subresource_Integrity](https://en.wikipedia.org/wiki/Subresource_Integrity)

Look at that, we need blocks of code that we know are what we want but we want
flexibility in where it comes from.
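
To make the SRI idea concrete, here is a minimal sketch of how an integrity value can be computed and checked. It assumes the usual `<algorithm>-<base64 digest>` form of the `integrity` attribute; the function names `sri_hash` and `matches` are hypothetical helpers, not part of any library.

```python
import base64
import hashlib

# Hash algorithms assumed here to be the ones SRI accepts
ALLOWED = {"sha256", "sha384", "sha512"}


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity string such as 'sha384-<base64 digest>'."""
    if algorithm not in ALLOWED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches(content: bytes, integrity: str) -> bool:
    """Check that content hashes to the given integrity string."""
    algorithm, _, _ = integrity.partition("-")
    if algorithm not in ALLOWED:
        return False
    return sri_hash(content, algorithm) == integrity


script = b"console.log('hello');"
tag = sri_hash(script)
print(matches(script, tag))        # True
print(matches(b"tampered", tag))   # False
```

The point is the same one the comment makes: once the hash is pinned, the bytes can come from any server, because the client can verify them itself.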

~~~
acdha
I agree this is an interesting research area and it’s likely to stay one for
a while until the underlying economic blockers are well solved: storage,
bandwidth, and support all cost money.

If you host it yourself you have easy answers for that but if you’re relying
on generous strangers you are going to hit limits: how many people are going
to mirror large amounts of content, deal with DMCA takedowns and other legal
requirements, etc. before giving up? Performance could in theory be
competitive but there’s no guarantee and it’s notoriously hard to deal with
unreliable nodes — which again can be dealt with if you have skilled support
staff available. You’ll get some volunteers for the right cause but that only
gets you so far – that’s why there’s a couple of decades of these projects
dreaming big and failing once they get popular.

------
oliwarner
I've handled being Slashdotted back when being Slashdotted was still a thing.

Yeah, thanks, I'm ancient. Before I fart myself to sleep reminiscing the good
old days, we got ~10k visits _in an hour_, on an ASP.NET website (admittedly
with good caching) but on the absolute dirt-cheapest of _shared_ hosting. It
didn't skip a beat. I've done similar with PHP and Django. As long as you can
avoid hitting the database every request, you can scale to the moon with very
little work.

Hugo is static. Your toaster should be able to support 5k users.

------
e12e
> one issue I’ve experienced with running an IPFS node is that it can use
> quite a bit of bandwidth, just for making the network work (not even for
> serving your content!). This has been greatly mitigated with IPFS 0.4.19,
> but my Azure VMs are still measuring around 160GB/month of outbound traffic
> (it was over 400 GB with IPFS 0.4.18).

I'm not sure if that's 160gb/vm or 160gb across all vms - but either way
that's 10-30 usd/month in bandwidth?

Pretty steep compared to a (single) 5usd/month vm or free (free tier of cdn
provider).

Interesting write-up, though. Nice to see cf support ipfs.

~~~
Scoundreller
I wonder why EC2 bandwidth costs so much more than s3 bandwidth (and the same
equivalent questions for Azure).

------
AndrewStephens
Good on the author for an excellent write-up but I really don't see the point
of this except as a learning exercise.

Instead of running a single static server, they now have a cluster of three
servers plus Cloudflare to serve up a few static files to a fairly small
number of people. And even then they have problems with caching and IPFS-
related bandwidth. And almost nobody actually accesses the site via IPFS,
making the whole thing rather pointless.

------
Zopieux
What proportion of the load was just Cloudflare CDN being a CDN and loading
from cache?

This article seems to be mostly about the benefits of IPFS, but I can only see
the benefits of Cloudflare caching. What am I missing?

------
writepub
Where is the analysis on the percentage of traffic served via IPFS to back up
any claims of the benefits of IPFS?

It appears to me that Cloudflare accepts IPFS as (another) source for its CDN
operations & any benefits of flat CPU utilization are likely from the CDN and
not IPFS.

In fact, 3 VMs in 3 regions is complete overkill. If the traffic numbers are
to be believed, one could simply post their Hugo site to GitHub Pages or
Netlify with zero extra steps or dollars spent. No IPFS needed.

------
hinkley
The stupid-simplest solution I ever saw for this was all the way back in the
Slashdot era, where people would filter on Referer headers and redirect all
traffic from news sites to the Google cache version of their page.

That works great for static pages. Hugo and Jekyll are basically precompiling
static pages.
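
The Referer-based redirect described above could be sketched roughly as follows. Everything named here is an assumption for illustration: the list of news hosts, the `static.example.com` mirror URL, and the `redirect_target` helper are hypothetical, and a real deployment would more likely do this in the web server configuration.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical list of referring news sites that tend to send traffic spikes
NEWS_SITES = {"news.ycombinator.com", "slashdot.org"}
# Hypothetical location of a cached/static copy of the site
STATIC_MIRROR = "https://static.example.com"


def redirect_target(referer: Optional[str], path: str) -> Optional[str]:
    """Return a mirror URL if the visitor came from a news site, else None."""
    if not referer:
        return None
    host = urlparse(referer).hostname or ""
    from_news = host in NEWS_SITES or any(
        host.endswith("." + site) for site in NEWS_SITES
    )
    return STATIC_MIRROR + path if from_news else None


print(redirect_target("https://news.ycombinator.com/item?id=1", "/post.html"))
# https://static.example.com/post.html
print(redirect_target("https://example.org/", "/post.html"))  # None
```

Visitors arriving from anywhere else still hit the origin, so only the spike traffic is diverted.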

For a more dynamic site? I know some people who do special handling for bot
and spider traffic. The bots get not exactly static content but much less
dynamic content. I could see rerouting all traffic, especially for everyone
without session cookies, to that version during a big spike.

Those solutions behave a little bit like the eventual consistency you see on
very large websites, where values are approximated or cached with a very short
TTL.

As others have commented, the simplest way to get that on a small site is to
pony up money for a CDN. Maybe not the cheapest, but certainly the simplest.

~~~
momack2
genius. why a focus on Referer headers from news sites instead of using some
content identifier to indicate whether to route to a more static or dynamic
page?

------
NKosmatos
Nice write up and includes a lot of links with technical details on how to
deploy your own IPFS node(s).

I agree that the benefit of this implementation (3x IPFS + Cloudflare) over
the dedicated VM/VPS might not be obvious for the amount of traffic/visits
this specific blog is having, but it's a good alternative to know :-)

On a side note... Wouldn't it be great if OpenStreetMap/Google
maps/your_favorite_map_provider was hosted on or via IPFS and there was an
easy interface between the IPFS network and the www? This way the distribution
of the tiles would be peerless, the CDN would be huge and each user would have
locally the tiles most frequently used and serve them at the same time. No
more dependency on big companies/providers, immune to
DDoS/blocking/restrictions and free for all ;-)

------
mark242
I really appreciate the work of getting this up and running; it's certainly a
good exercise in making a fully-distributed content site.

What I'm missing is where a collection of static files couldn't just be served
up from an S3 bucket and Cloudfront or Cloudflare on the front end -- you
arguably have the same caching performance if not better since Amazon and
Cloudflare have real SLAs for getting your bits from your bucket to a user's
browser.

IPFS seems like unneeded complexity when there are a huge number of options
available. If you personally don't like Amazon, you can use Github pages, or
DigitalOcean, or Netlify, or Fastly, or the list goes on and on.

Does anyone have a use for IPFS that isn't already covered by existing
hosting+cdn options?

~~~
benmanns
You can have both. Cloudflare has an IPFS gateway that you can use with a
Cloudflare domain. Content hosted in IPFS and CDN-ed via Cloudflare's network.
An IPFS-capable client will bypass Cloudflare via the dnslink entry.

See
[https://developers.cloudflare.com/distributed-web/ipfs-gatew...](https://developers.cloudflare.com/distributed-web/ipfs-gateway/connecting-website/)
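
For readers unfamiliar with the dnslink entry mentioned above, here is a minimal sketch of how a client might interpret one. It assumes the conventional DNS TXT record value of the form `dnslink=/ipfs/<hash>`; the `parse_dnslink` helper and the `QmExampleHash` value are placeholders, not real identifiers.

```python
from typing import Optional, Tuple


def parse_dnslink(txt_value: str) -> Optional[Tuple[str, str]]:
    """Split a TXT value like 'dnslink=/ipfs/<hash>' into (namespace, id)."""
    prefix = "dnslink="
    value = txt_value.strip().strip('"')
    if not value.startswith(prefix):
        return None
    # "/ipfs/<hash>" splits into ["", "ipfs", "<hash>"]
    parts = value[len(prefix):].split("/", 2)
    if len(parts) != 3 or parts[0] != "" or not parts[1] or not parts[2]:
        return None
    return parts[1], parts[2]


# Placeholder hash, not a real content identifier
print(parse_dnslink("dnslink=/ipfs/QmExampleHash"))  # ('ipfs', 'QmExampleHash')
print(parse_dnslink("v=spf1 -all"))                  # None
```

An IPFS-capable client that finds such a record can fetch the content by hash from any peer, which is why it bypasses the CDN.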

~~~
joecot
Except then you still have to cover the demand from requests which bypass
Cloudflare through the dnslink entry.

Essentially it's a gimmicky system which is propped up by using Cloudflare,
which ultimately fails once it actually gets user adoption, in a way the
traditional setup (CDN + basic or even free hosting) would not. This only works
because most users don't have IPFS support currently, and if they did the
servers would grind to a halt just as they would with no CDN at all.

~~~
kevincox
IIUC the theory is that the clients using native IPFS will be caching the
content for a period of time and serving it to other IPFS clients. Therefore
the load should be shared across viewers leading to a minimal increase on
"origin" load.

~~~
joecot
Yes, theoretically that's the goal. Given they're showing 160GB of transfer on
their servers, it doesn't seem like that's working particularly effectively.
Regardless, this implementation does not demonstrate at all that IPFS would
save a site from an internet hug -- just that Cloudflare can.

------
leetrout
I love Hugo and the direction it has been heading and always appreciate notes
about how easy it is to install and how fast it is (mainly because it is
written in Go).

I have been struggling to get adoption with students working on web sites at
the university I work at but when we get it up and working it is so great for
90% of the things we do.

------
apiudit
IPFS looks great, but the whole article smells like (not that) subtle
Cloudflare advertising.

