
My Own Private CDN - zrail
https://www.petekeen.net/my-own-private-cdn
======
1996
The main problem is often GeoIP, and properly using the results of said GeoIP.

Having a dozen POPs around the world is a trivial weekend project.

Routing the client to the fastest one and scaling up to local demand while
optimizing for bandwidth price is not trivial. Not everything is Digital
Ocean: what about Japan, for example? Or Brazil?

~~~
zrail
Absolutely. This project leans heavily on AWS route53’s latency-based routing.
I don’t do the magic, route53 does.

------
giancarlostoro
I wonder how long it will be until shared hosting, or some similar kind of
cloud hosting that handles a lot of this nuance for you, takes off. CDNs can
really reduce one of the heaviest burdens a single server faces, which is
serving static files. I'm not familiar enough with everything offered on the
cloud, just with the few tools I've used, so anyone who already knows of one
can let me know. I know the author mentions Amazon's Route53, but I'm talking
about an all-inclusive solution like one would find in shared hosting that
could include:

* SSL - Free tier would bring Let's Encrypt SSL Certs by default, the end-user shouldn't even think about how it works or anything of the sort, it should just work.
* CDN - no need to even _think_ about how it works, ServerlessCDN type of stuff.
* Host any kind of app (maybe some sort of serverless approach or docker, though the less you have to mess with even more dev tools the better)
* Intuitive admin panel for when SHTF with detailed logging.

I'm surprised we're still stuck in the era of cPanel and L*MP stacks. Imagine
if anyone could install and run Discourse as easily as they do WordPress, same
with Ghost. DigitalOcean comes pretty close in this regard, but maybe
something more foolproof for the average joe would be great.

~~~
Ayesh
One of the most painful things to work on would be clearing caches. I have my
own setup in which I invalidate the _entire_ set of assets just to make sure
the end users do not get stale assets: I just ++ a build ID and all the URLs
are uncached at the edges. Not all sites hosted on shared hosts can do this.
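
A minimal sketch of the build-ID trick described above, assuming the ID is
carried in a query parameter. The `versioned_url` helper, the `v` parameter
name and the example hostname are all hypothetical; the comment does not say
whether the real setup uses the query string or the path.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def versioned_url(url: str, build_id: int, param: str = "v") -> str:
    """Return url with the build ID set as a query parameter.

    Edge caches usually key on the full URL, so bumping build_id makes
    every asset URL look new and forces a fresh fetch from origin.
    The parameter name "v" is an assumption made for this sketch.
    """
    parts = urlsplit(url)
    # Drop any previous build ID so the parameter is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, str(build_id)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Calling `versioned_url("https://cdn.example.com/app.css", 42)` gives
`https://cdn.example.com/app.css?v=42`. This only works if the site generates
its asset URLs through one helper like this, which is probably why it is hard
to do on arbitrary shared-hosting sites.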

~~~
zrail
My plan for this is to just tell the cache nodes to delete the nginx cache
folder entirely via their one-minute check-in cycle. Each node will get a flag
in the database that says "needs clearing", and when it next checks in, its
update script will include an `rm -rf /nginx/cache/directory/*`. Extremely
blunt but also easy and effective.
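
Roughly, the node side of that plan could look like the sketch below. This is
only an illustration under stated assumptions: the `node` record, its
`needs_clearing` key and both function names are hypothetical, and the real
update script is a shell script rather than Python.

```python
import shutil
from pathlib import Path


def clear_cache_dir(cache_dir: Path) -> int:
    """Delete everything inside cache_dir but keep the directory itself.

    Roughly what `rm -rf cache_dir/*` does, except that dotfiles are
    removed too. Returns the number of top-level entries removed.
    """
    removed = 0
    for entry in cache_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed


def handle_check_in(node: dict, cache_dir: Path) -> bool:
    """Clear the cache if the (hypothetical) node record is flagged.

    Returns True when a purge happened. The flag is reset afterwards so
    the next one-minute check-in does not purge again.
    """
    if not node.get("needs_clearing"):
        return False
    clear_cache_dir(cache_dir)
    node["needs_clearing"] = False
    return True
```

The flag is reset only after the purge finishes, so a node that crashes
halfway through will simply try again on its next check-in.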

------
oneplane
I think the N part of the CDN is going to be the real thing here, the CD part
is easy, especially with the various FOSS options out there.

~~~
fooblat
This. Building a "CDN" on top of an existing network is not building a CDN.
Nice project to learn some parts of CDN management software tho.

------
michaelgv
As someone who just recently went through CDN hell, and rebuilt our entire CDN
network from the ground up (software and hardware), I was wondering why you
picked RoR?

~~~
zrail
It’s what I know best and what I’m most productive in. The project is to get
something running and learn a handful of new things, and learning a new
framework would be a detriment to that first goal.

The manager app is not in the hot path with this design so performance doesn’t
matter all that much.

~~~
michaelgv
Are you designing this CDN to pull from origin and cache temporarily? Or to
pull from a local file and put a strong cache on it?

If you need a hand let me know, I’ve built _pretty large_ CDNs before (10M r/s
at peak)

~~~
zrail
The former to start but I want to add push zones and/or “s3sync” zones that
proactively sync an s3 bucket to local disk.

Thanks for the offer! I might just take you up on it :)

~~~
michaelgv
Just be careful, understand that if you do a PULL only CDN, you're not going
to gain big benefits. If you do want a pull only CDN, have a background task
runner to retrieve the files, and update them locally.

~~~
tatersolid
> understand that if you do a PULL only CDN, you're not going to gain big
> benefits.

This statement makes no sense. A CDN edge node is just a cache; its size and
your access patterns determine the hit ratio.

At $dayjob we get Nginx cache hit ratios on our edge in excess of 99% for “an
origin fetch” setup. That is a very large benefit.

Cloudflare works entirely on origin fetch. They seem to be doing okay.
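
To make the "just a cache" point concrete, here is a toy origin-fetch cache
with LRU eviction. It is a sketch only: the `PullThroughCache` class and its
`fetch_origin` callback are made up for illustration, and real Nginx caching
is keyed and evicted differently.

```python
from collections import OrderedDict
from typing import Callable


class PullThroughCache:
    """Toy edge cache: serve from memory, fetch from origin on a miss."""

    def __init__(self, fetch_origin: Callable[[str], bytes], capacity: int):
        self.fetch_origin = fetch_origin  # hypothetical origin callback
        self.capacity = capacity          # max number of cached objects
        self.store = OrderedDict()        # path -> body, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        if path in self.store:
            # Hit: mark as most recently used and serve locally.
            self.store.move_to_end(path)
            self.hits += 1
            return self.store[path]
        # Miss: pull from origin, keep a copy, evict the LRU entry if full.
        self.misses += 1
        body = self.fetch_origin(path)
        self.store[path] = body
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)
        return body

    @property
    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

With one hot object requested 100 times, the origin sees one request and the
hit ratio is 0.99. Shrink `capacity` below the working set and the ratio
drops, which is the size-versus-access-pattern trade-off in miniature.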

------
zshenker
You could also look at something like Apache Traffic Control, which came out
of Comcast, and is used by a number of CDNs.
[https://trafficcontrol.apache.org/](https://trafficcontrol.apache.org/)

~~~
jsjohnst
> which came out of Comcast

Not that it matters, but some fun history for you. ATC is built on top of
Apache Traffic Server. Before being donated over to ASF, ATS was known as YTS
(Yahoo! Traffic Server). Of course the story doesn’t stop there, it was
originally known as Inktomi Traffic Server, Inktomi having been acquired by
Yahoo! in the early 2000s.

------
xxdesmus
"NET::ERR_CERT_COMMON_NAME_INVALID"

off to a good start...

Your cert is valid only for corastreetpress{.}com

~~~
zrail
Hmmmmm! Thanks for the bug report!

Edit: fixed. 'Twas a dumb copy and paste error.

------
djhworld
> Deploy onto the server in my basement on my ZeroTier network

I've read as much as I can handle of the website of this ZeroTier thing and I
still can't fully grok it.

What's the difference between this and your own private VPN?

~~~
zimbatm
A VPN is point-to-point. You fire up the client on your laptop and connect to
a server. That server is usually also acting as a bridge into a network. It
gives your machine a presence into that other well defined network.

ZeroTier is an overlay network. What it does is create a new encrypted
network, with its own address space and everything, on which you can connect
through the controller (a default one is provided by ZeroTier). It doesn't
matter where the nodes are. If it detects that two nodes are on the same LAN
it's going to route the traffic directly. The overlay network is encrypted all
the time, even when it goes over your LAN.

~~~
jlgaddis
Sounds kinda like DMVPN.

------
ksec
I looked through the goals. Apart from the self-learning experience,
bunnycdn.com seems to fit the bill without the hassle. And despite its
pricing, it is pretty damn fast as well.

~~~
zrail
I can get 4TB of transfer from Vultr or Digital Ocean for half that price.

In any case, the cost considerations are somewhat secondary. I wanted to learn
some stuff and this is a practical way to do it :)

~~~
ksec
> I wanted to learn some stuff and this is a practical way to do it :)

Fair enough.

> I can get 4TB of transfer from Vultr or Digital Ocean for half that price.

You still need multiple POPs; with a droplet in each of those regions, the
minimum cost is still going to be higher.

------
anotherrobot
Don't see the point if it's not distributed across networks.

~~~
hegz
Digital Ocean has a bunch of locations.

~~~
jsjohnst
Location is far from the only factor; peering can make just as much of a
difference. Not saying DO is bad, but there are more factors at play than just
location diversity.

------
greyman
Good exercise, but I doubt this will be cheaper than a commercial offering
like Cloudflare, for example.

~~~
powmonk
The skills you pick up from even attempting these things are probably the
biggest reason for doing it. And bragging rights maybe?

