
How to build your own CDN with Kubernetes - ilhaan
https://blog.insightdatascience.com/how-to-build-your-own-cdn-with-kubernetes-5cab00d5c258
======
Pfhreak
The reasons this article gives for not using a third-party CDN are:

1) It might have bugs. You are trusting someone else to run the
infrastructure.

2) They might choose profits over customer experience "because they are a
business".

3) They might capture and sell/leak metadata on where/when your customers
visit your site.

The author then goes on to describe using Route 53 and Amazon EKS to deploy
your own CDN. (Which doesn't seem to address points 1, 2, or 3 at all?)

There's not a single mention of CloudFront.

Something doesn't add up here.

~~~
justicezyx
Also, why call it kubeCDN? I mean, it's creating workers using EKS, through
Terraform, so shouldn't it be called TFCDN?

The kube part, to me, seems superficial and insubstantial.

------
reilly3000
I love Cloudflare, but wow, they have a tremendous amount of control over the
sites they proxy. While the folks in this thread have done a fine job
explaining why this wouldn’t work or save money, I am curious if there are any
legit alternatives to a major CDN. It seems like an open/community approach
could be viable if people were to pool resources for edge servers and
bandwidth bills. It would require infrastructure throughput that can be
audited. I know, blockchain... pitchfork... but maybe CDN is an actually
awesome use case for a distributed system. If torrents weren’t the charred
earth of the DRM wars, it could probably work as well, if caching could be
optimized well enough through the entire route.

It’s hard to imagine that any distributed solution could compete on first byte
and overall latency; Cloudflare is too amazing at that. I really wish you
could just check out my site to your device and receive a stream of updates,
maybe as a micropayment per update for web publishers. That way content
procurement is a one-time affair for the consumer, and ongoing consumption is
billable in bites/bytes, not by monthly subscriptions. What if you could
distribute source binaries for micropayments as well, funding bandwidth and
development efforts with fees associated with actual use? Damn, there I go
sounding like a blockchain pitch again.

I see the publishing industry running towards membership models and quickly
going to be warring over a diminishing wallet share. Apple now looms large.
Something has to be done, or FANG efforts to ‘help the news industry’ will
result in greater control of media consumption at every level. Bloody
megacorps deciding how to optimize their Q4 with our brains.

Is working with a commercial CDN the best thing for the web we want to give to
our grandkids?

Seeing a big player showing up and offering ‘free protection’ reminds me of
The Godfather. And who gets hurt when they need to raise Series F?

~~~
sudhirj
A lot of what’s being described here is in IPFS, including distributed hosting
of your content, and a distributed name resolution system.

~~~
balboah
Which is also supported by Cloudflare, so a bit of both worlds:
[https://www.cloudflare.com/distributed-web-gateway/](https://www.cloudflare.com/distributed-web-gateway/)

------
nrmitchi
I appreciate that this was attempted, and feel that there are some good
learnings in this article, but I feel that it should be reframed. The
concepts addressed in this piece are useful if you want to distribute a
stateless application geographically, and route a user to the closest access
point. While this is a necessary attribute of a CDN, it is not sufficient in
itself. I think this project ignored a main purpose of a CDN: serving static
content (or acting as a reverse-proxy/cache to your upstream services) in
_massive_ quantity.

Bandwidth and latency are the main concerns of a CDN, and this approach
addresses latency, but completely ignores bandwidth. Honestly, a CDN is
probably one of the worst possible applications that could be built on AWS (or
any other cloud hosting provider) simply due to the extreme bandwidth costs.
EC2 egress costs are already an order of magnitude greater than CloudFront
costs, and not even fairly comparable to non-"cloud" options.

Basically, if you want to build a CDN, the cloud providers are horrible
options, whether you use Kubernetes or not.
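
To put the bandwidth point in rough numbers, here is a minimal sketch of how a
per-GB price gap scales with traffic. The two prices are hypothetical
placeholders picked only to show a 10x difference; they are not real quotes
from any provider, and the flat per-GB model is an assumption (real pricing is
usually tiered).

```python
def monthly_egress_cost(tb_per_month: float, usd_per_gb: float) -> float:
    """Return the monthly egress bill in USD for a flat per-GB price."""
    gb = tb_per_month * 1000  # decimal terabytes to gigabytes
    return gb * usd_per_gb


# Hypothetical prices, chosen only to illustrate an order-of-magnitude gap.
PRICE_EXPENSIVE_USD_PER_GB = 0.10
PRICE_CHEAP_USD_PER_GB = 0.01

for tb in (1, 100, 1000):
    expensive = monthly_egress_cost(tb, PRICE_EXPENSIVE_USD_PER_GB)
    cheap = monthly_egress_cost(tb, PRICE_CHEAP_USD_PER_GB)
    print(f"{tb:>5} TB/month: ${expensive:>12,.2f} vs ${cheap:>12,.2f}")
```

The gap is linear in traffic, so a difference that is negligible at 1 TB per
month dominates the bill at CDN-scale volumes.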

> edited for formatting

~~~
MuffinFlavored
> the cloud providers are horrible options

Just to be clear: AWS/DO are horrible options, but CloudFront is a good
option?

~~~
nrmitchi
I'm not 100% sure what your question is. CloudFront is already a CDN in
itself. You would not use it as a platform for building a new CDN.

I'm not making any claims here about whether or not CloudFront is better or
worse than any other CDN, but it can be a perfectly valid choice if you're
already within the AWS ecosystem and need a CDN.

------
rubiquity
Kubernetes solves the problem of deployment orchestration and that is probably
problem #5209 on the list of problems you’ll need to solve to run a CDN.

------
slashink
A CDN is really about reach and edge bandwidth. Hosting a bunch of containers
on AWS in a couple of regions only gives you part of the benefits of a good
CDN. There’s a reason Amazon offers CloudFront which has a wider network than
there are AWS datacenters.

Also the EC2 outbound bandwidth cost... this is not a good idea compared to
using other solutions.

------
nerdbaggy
A real CDN is hard. I wager that basically any CDN will outperform a homegrown
CDN.

Maintaining low latency connections with all the networks in the world is no
easy task.

LinkedIn has what I think are some good CDN blog posts:
[https://engineering.linkedin.com/performance/how-linkedin-us...](https://engineering.linkedin.com/performance/how-linkedin-used-pops-and-rum-make-dynamic-content-download-25-faster)
[https://engineering.linkedin.com/network-performance/tcp-ove...](https://engineering.linkedin.com/network-performance/tcp-over-ip-anycast-pipe-dream-or-reality)

~~~
rhizome
How are you defining "outperform?" Because this sounds like a challenge.

------
notyourday
Wait. You are building a CDN on a platform that is not just pay-per-byte but
charges one of the highest per-byte prices in the world, because it lets you
solve an _orchestration_ problem?

------
sbr464
I think one interesting point from a compliance perspective is the technical
requirement to provide a CDN with your private certificates. It’s not a
showstopper, but it’s definitely a point to consider. I wish it could somehow
be removed from the equation, similar to how you don’t need to provide your
certs to a managed DNS provider.

~~~
nerdbaggy
This is why a lot of websites use a separate domain for assets: a separate
cert, and it keeps cookies from leaking. And at that point the CDN can already
change the content of what is being served, so them having the private key
doesn’t really change much.
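
As a rough illustration of why a separate asset domain keeps cookies from
leaking, here is a simplified sketch of the cookie domain-matching rule from
RFC 6265 (section 5.1.3). The hostnames in the usage lines are made up, and
real browsers apply further checks (public suffix list, path, Secure flag)
that this sketch omits.

```python
import ipaddress


def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """Simplified RFC 6265 domain-match: would a cookie scoped to
    cookie_domain be sent on a request to request_host?"""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    if host == domain:
        return True
    try:
        ipaddress.ip_address(host)
        return False  # IP addresses only ever match exactly
    except ValueError:
        pass
    # The host must end with ".<domain>", so "badexample.com" does not
    # match a cookie scoped to "example.com".
    return host.endswith("." + domain)


# Hypothetical hostnames for illustration only.
print(domain_matches("www.example.com", "example.com"))
print(domain_matches("assets.example-static.net", "example.com"))
```

Under this rule, session cookies scoped to the main site's domain are never
attached to requests for the asset domain, so the CDN serving those assets
does not see them.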

------
quickthrower2
It's fun to know someone has tried this, but this doesn't seem a very
practical thing to do. You are either in the CDN business and you'll have your
own hardware, or you are not and you'll want to just get the professionals to
take care of it. This in-between state is a bit odd unless you have weird
requirements.

------
blissofbeing
This looks like creating a problem to solve rather than actually solving the
problem at hand.

------
mychael
Building your own CDN probably makes sense for ~1% of the people reading this
blog post.

~~~
wolco
More like ~0% but it is interesting.

------
justinsaccount
You could build your own CDN on top of k8s across multiple cloud platforms and
use BGP and anycast DNS to completely own the infra... but this isn't that.

------
asasidh
Makes you ask: DIWhy?

------
halayli
The point of a CDN is that it has thousands of PoPs spread globally to be
closest to the user, and its network is highly tuned to route to the
nearest PoP using AS networks.
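
As a toy illustration of "nearest PoP", here is a sketch that picks the
closest point of presence by great-circle distance. The PoP names and
coordinates are hypothetical, and geographic distance is only a stand-in:
real CDNs steer traffic with anycast/BGP and measured latency, which this
sketch does not model.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points on Earth, in kilometres."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(a))


# Hypothetical PoPs with approximate (latitude, longitude) coordinates.
POPS = {
    "pop-frankfurt": (50.11, 8.68),
    "pop-virginia": (38.95, -77.45),
    "pop-singapore": (1.35, 103.82),
}


def nearest_pop(lat: float, lon: float, pops: dict = POPS) -> str:
    """Return the name of the PoP geographically closest to (lat, lon)."""
    return min(pops, key=lambda name: haversine_km(lat, lon, *pops[name]))


print(nearest_pop(48.85, 2.35))  # a user located in Paris
```

With only three locations, a user in Tokyo lands on the Singapore PoP several
thousand kilometres away, which is the gap that thousands of PoPs are meant
to close.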

