
Rebuilding My Personal Infrastructure With Alpine Linux and Docker - kristianp
https://www.wezm.net/technical/2019/02/alpine-linux-docker-infrastructure/
======
drdaeman
In my experience, "slim" Debian images (like `python:slim`) aren't
significantly larger than Alpine-based ones, but save lots of time and
headache when something assumes glibc and breaks (or, worse - _subtly_ breaks)
with musl (or doesn't have a binary distribution for musl so every time
image's built you have to build from source).

Also, I'm not sure what the benefits are of going `FROM alpine` and
installing nginx, versus just starting `FROM nginx:alpine`. The latter has
the benefit of more straightforward update logic when a new nginx version is
released - `docker build` will "just" detect this. It won't notice that the
Alpine repos have an upgrade, though, and will reuse cached layers for
`RUN apk add nginx`.
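
To make the caching behaviour concrete, here is a rough sketch of the two
approaches (image tags, file names, and the `CACHE_DATE` build-arg trick are
illustrative, not from the article):

```dockerfile
# Option A: official image; `docker build --pull` picks up new nginx releases.
FROM nginx:alpine
COPY nginx.conf /etc/nginx/nginx.conf

# Option B (a separate Dockerfile): plain Alpine. The RUN layer below gets
# cached, so a newer nginx in the Alpine repos is NOT picked up unless the
# cache is deliberately busted, e.g. with:
#   docker build --build-arg CACHE_DATE=$(date -I) .
#
#   FROM alpine:3.9
#   ARG CACHE_DATE=1970-01-01
#   RUN echo "built on $CACHE_DATE" && apk add --no-cache nginx
```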

Just saying.

~~~
nickjj
> In my experience, "slim" Debian images (like `python:slim`) aren't
> significantly larger than Alpine-based ones, but save lots of time and
> headache.

I came to pretty much the same conclusion too.

For years I was using Alpine but as of about a year ago I've been going with
Debian Slim and haven't looked back.

I'd much rather have the confidence of using Debian inside of my images than
save 100MB on a base image (which is about what it is in a real web project
with lots of dependencies).

~~~
drdaeman
In my experience, the difference is sometimes even less than 100MiB (which is
quite a lot). For the current ("real-world") project I'm working on, it's
about 25MiB - something like 325MiB for Alpine and 350MiB for slim Debian base
images.

Either way, it's not 1.12GiB I was getting with a fat `FROM python:3` base
image.

------
Ao7bei3s
What's the best way to handle updates?

I would like to switch to a dockerized setup, but running everything on
Debian/stable has the advantage of unattended-upgrades (which has worked
absolutely flawlessly for me for years across dozens of snowflake VMs). Not
going back to manual upgrades.

I tried a Registry/Gitea/Drone.io/Watchtower (all on same host, rebuilding
themselves too) pipeline and it worked, but felt patched together. Doing it
wrong?

~~~
alias_neo
In-container upgrades seem to be an issue for many; we've had this issue at
$work.

From what I've seen, there is no consensus on the "right" way to do it.

You could run the upgrades in the container and lose them when it's re-upped,
or you need to continuously deploy the containers.

This alone is one major reason I'm in favour of statically linked binaries in
from-scratch containers, where possible.

------
fulafel
> An aspect of Docker that I don’t really like is that inside the container
> you are root by default

PSA: everyone should turn on the userns option in docker daemon settings. It
messes with volume mounts but you can turn it off on a per container basis
(userns=host) or arrange a manual uid mapping for the mounts.

[https://docs.docker.com/v17.12/engine/security/userns-remap/](https://docs.docker.com/v17.12/engine/security/userns-remap/)
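
Per the linked docs, enabling the remap is a one-line addition to
`/etc/docker/daemon.json` (the value `"default"` tells Docker to create and
use a `dockremap` user for the UID/GID mapping):

```json
{
  "userns-remap": "default"
}
```

Individual containers can then opt out with `docker run --userns=host ...`
where the remapped mounts get in the way.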

~~~
barrystaes
I wondered if this is what Unraid already does: yes, it's mentioned at
[https://wiki.unraid.net/UnRAID_6/Overview#Containers](https://wiki.unraid.net/UnRAID_6/Overview#Containers)

> The cornerstone of Docker is in its ability to use Linux control groups,
> namespace isolation, and images to create isolated execution environments in
> the form of Docker containers.

If you want to run Docker containers at home, I suggest you give it a try.
All you need is an old computer and a USB stick (it runs in RAM). Unraid
basically is Linux with a (happy little) web interface for NAS shares + apps.

~~~
fulafel
The web page doesn't read like it's enabling userns-remap, to me.

------
gravypod
I'd suggest that people looking to do something similar check out Caddy as a
reverse proxy for your services. It'll manage grabbing SSL certs for you, and
some people have already wrapped it into a nice Docker container [0].

[0] - [https://github.com/wemake-services/caddy-gen](https://github.com/wemake-services/caddy-gen)
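
For a sense of how little configuration this takes, a minimal Caddyfile
sketch (v1-era syntax; `example.com` and the `app:8080` upstream are
placeholders) - Caddy obtains and renews the certificate for the named host
automatically:

```
example.com {
    proxy / app:8080 {
        transparent
    }
}
```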

~~~
lytedev
I recently had the pleasure of using Traefik[0] as my reverse proxy, which
similarly handles SSL automatically via LetsEncrypt. Lovely piece of software!

[0]: [https://traefik.io](https://traefik.io)
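
A hypothetical docker-compose fragment showing the Traefik pattern
(1.x-style labels; service names, the domain, and the image are placeholders,
and the ACME/LetsEncrypt flags are abbreviated - see the Traefik docs for the
full set):

```yaml
version: "3"
services:
  traefik:
    image: traefik:1.7
    command: --docker        # discover services via the Docker socket
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
  blog:
    image: my-blog           # placeholder application image
    labels:
      - "traefik.frontend.rule=Host:blog.example.com"
      - "traefik.port=8080"
```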

~~~
windexh8er
I recently upgraded my simple web facing CV site over to a couple containers
and front-ended with Traefik. What's really nice about it is if you run the
host as a single-node Swarm you get a lot of freebies with regard to service
discovery. I also use Traefik in my internal network to front-end Heimdall [0]
in a container. This affords me a very elegant internal network dashboard,
bastion host (proxy) and presents all my internal services with LetsEncrypt
valid certificates (no more internal self-signed cert warnings).

I've been meaning to start blogging again and use this as a first topic.

[0]: [https://heimdall.site/](https://heimdall.site/)

------
adtac
>Note that Alpine Linux doesn’t use systemd, it uses OpenRC. This didn’t
factor into my decision at all. systemd has worked well for me on my Arch
Linux systems. [...]

How does systemd, or any init for that matter, come into the picture if you're
running everything inside docker? Containers don't use any init, right? They
just execute the binary in the host environment (but containerised). Or am I
missing something?

Edit: nevermind, OP is using Alpine as the host OS as well.

~~~
freedomben
Correct. Docker containers typically only contain one process, and that
process runs as PID 1. If you need an init system, tini is very popular, and
is now built into Docker itself[1]. Systemd is _way_ too heavy and overkill
inside Docker.

[1] [https://github.com/krallin/tini](https://github.com/krallin/tini)

~~~
inetknght
Specifically _why_ is it way heavy though? What does systemd provide which
isn't needed?

~~~
JoshuaRLi
Primarily, you would run an init system in a Docker container in order to
correctly proxy signals to the one process in question, which would otherwise
run as PID 1. For example, sending SIGTERM to a `docker run` process whose
PID 1 has no registered handler does nothing, because Linux won't apply the
default action (killing it) to PID 1.

Secondarily, if you want to be neat and save some PIDs and kernel memory, you
need an init system to wait(2) on orphaned zombie processes.

These are the only two use cases AFAIK, which a small init system such as tini
satisfies, without the complexity and size of systemd.
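
Concretely, tini can be enabled either at run time (no image change) or baked
into the image; a sketch, where `/app/server` stands in for whatever the
container's real process is:

```dockerfile
# At run time, Docker >= 1.13 can inject tini with no image change:
#   docker run --init my-image
# Or wire it in explicitly (Alpine packages tini as /sbin/tini):
FROM alpine:3.9
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["/app/server"]
```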

~~~
inetknght
That doesn't answer the question: what does systemd provide which isn't
needed?

I have multiple network devices. I want some to be controlled by processes
running in a container; effectively I want some processes to run under a user
account but still provide root (root-like?) access to the specified network
device(s). I want to be able to give a specific (containerized) user _full_
control over one or more specific network devices. My (naive?) understanding
is that the init daemon takes care of bringing the network online and then
subsequent management of it. For systemd, that would be Network Manager? Or do
I misunderstand?

------
chvid
I haven't really used Docker - so here is a dumb question: suppose one makes
a setup like the author here, then what does a deployment of a new version
look like?

Suppose the author updates one of his Rails apps and there are some database
schema modifications.

Is that handled by docker?

How long does a deployment take? (Minutes, seconds ... basically is the tool
able to figure out what is changed and only apply the changes or does it
remove the old installation and build the new from scratch?)

~~~
sbhat7
Deployment of a new version would depend upon your setup. Assuming a setup
similar to the author, you can have a new Docker images with the new version
of your code and run it in parallel. All you have to do after that is point
the traffic from the old version to the new version (By just running `docker
compose`).

If you have a more complex setup, e.g. if by using Kubernetes, you can do
things like run both the version at the same time, person A/B testing or have
canary deployments to ensure the new version works .

Time for deployment would be most likely in seconds unless the setup is
complex/convoluted.

Schema modifications are another beast. For small use cases, you could run a
specialized one time container that performs the modifications, but once you
need high availability, you'd have to consider a more complex approach. See
[https://queue.acm.org/detail.cfm?id=3300018](https://queue.acm.org/detail.cfm?id=3300018)
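
The "specialized one-time container" approach is typically a single command
against the new image, along the lines of (the `app` service name and Rails
migration task are illustrative):

```
docker-compose run --rm app bundle exec rake db:migrate
```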

------
Mister_Snuggles
This article was interesting to me because a lot of my personal infrastructure
runs on FreeBSD. While I don't host anything publicly accessible, I do have
some similar needs.

The author mentions the Docker port for FreeBSD. According to the FreeBSD
Wiki, it's meant to run Linux Docker images and relies on FreeBSD's Linux ABI
layer to do so. To me, this is the wrong approach.

FreeBSD already has good container technology; what it really needs is good
tooling around it. Since the author ended up building his own Docker images,
I suspect that he'd be happy with a FreeBSD-equivalent way to declaratively
build and manage jails.

~~~
waz0wski
iocage is a decent CLI utility for managing jails, automating some common
operations, and the associated ZFS bits for storage

[https://github.com/iocage/iocage](https://github.com/iocage/iocage)

[https://dan.langille.org/2015/03/07/getting-started-with-ioc...](https://dan.langille.org/2015/03/07/getting-started-with-iocage-for-jails-on-freebsd/)

~~~
Mister_Snuggles
This does look to be better than ezjail - I'll have to give it a look! The
vnet support is something that will be very useful.

------
ruduhudi
GitLab offers a free private Docker registry with your non-commercial
projects, and it's fairly easy to build and deploy containers using their CI
when hosting the Dockerfiles there.

~~~
dsumenkovic
Thanks for mentioning that. Here's some more info about it, an official doc
[https://docs.gitlab.com/ee/user/project/container_registry.h...](https://docs.gitlab.com/ee/user/project/container_registry.html)
and a blog post [https://about.gitlab.com/2016/05/23/gitlab-container-registr...](https://about.gitlab.com/2016/05/23/gitlab-container-registry/).

------
drej
I would love to Alpine all the things, it's just so fast to work with, BUT
musl makes things hard, especially if you're over in Pythonland: packages
with C dependencies, if prebuilt, are only built for glibc, so to install
them on Alpine one has to have a compiler, development headers, etc. That
makes things too slow and error-prone.

Otherwise it's a great distro and I use it for non-Pythonic stuff or for
Python with no dependencies.

~~~
Gigablah
With multi-stage Docker builds nowadays it’s easier to have an intermediate
builder environment with all the build deps installed.
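
A rough sketch of the multi-stage pattern for the Python-on-Alpine case
discussed above (image tags, package names, and paths are illustrative): the
compiler and headers live only in the throwaway builder stage, and the final
image gets just the installed packages.

```dockerfile
# Builder stage: has the compiler and headers needed for C extensions.
FROM python:3.7-alpine AS builder
RUN apk add --no-cache build-base libffi-dev
COPY requirements.txt .
RUN pip install --prefix=/install -r requirements.txt

# Runtime stage: slim, no build toolchain.
FROM python:3.7-alpine
COPY --from=builder /install /usr/local
COPY . /app
CMD ["python", "/app/main.py"]
```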

~~~
drej
True, but it's still a pain having to do that, all I want is to install a
package in an interpreted language, I'm not building a project in a compiled
language.

~~~
yjftsjthsd-h
> all I want is to install a package in an interpreted language, I'm not
> building a project in a compiled language.

I sympathize, but it rather sounds like the problem is that you're _not_
using a purely interpreted language anymore.

------
timClicks
Question for those who have migrated to Kubernetes - at what point did you
look for something bigger/better than what Docker Compose (or Mesos) can
offer?

~~~
gravypod
Once you need more than one machine, more than one engineer, and a desire to
use existing tooling.

K8S is a platform to build things. Because of this most of the amazing
features you have access to are built by the community (service mesh for
example).

Docker Compose is a mess once you have 10+ services with each having different
containers backing it.

------
nathankunicki
I don't necessarily agree with his decision to build all images by hand as
opposed to using those available on Dockerhub, however it's a personal choice
and I respect it.

Given that he's using docker-compose, I wonder why he's chosen to host his
images in a repository at all, instead of just specifying the Dockerfile in
the yaml and having them locally.

~~~
poxrud
He mentions that this way he can more easily push his images to AWS ECR when
they're ready for production.

~~~
nathankunicki
Ah, thanks, I missed that comment.

------
humbleMouse
It's all fun and games using Alpine Linux until you run into weird networking
issues that are caused by slim images.

I was a big fan of slim images until unsolvable bugs started popping up. Like
others have said, there's not much benefit in shaving off a few hundred MB in
the age of fiber.

------
ggm
I am in a remarkably similar situation to this 'prior' state, being FreeBSD
11 hosted, with elements of other distributed services.

I also looked at Docker and gave up. I like bhyve, and have considered a low-
pain migration to bhyve instances to package things into Linux, and then
(re)migrate into Docker. A way to avoid the pain and cost of a duplicate site
to build out and integrate.

I wish something as logistically simple as docker-compose existed in a BSD-
compatible model to build packages for. I'd like the functional isolation of
the moving parts, and the redirection stuff.

Nice write-up. I wonder how many other people are in this 'BSD isn't working
for me as well' situation?

~~~
frio
I use FreeBSD for my NAS/utility server at home, and am considering switching
over to Linux now that ZFS seems pretty stable there. NixOS is my happy place
these days.

~~~
ratling
I've had not-great experiences with ZFS on Linux - things like it being
broken for a month on a standard distro's kernel.

For personal use it's probably fine, but I wouldn't use it in prod again.

~~~
ggm
I'm half-yes half-no on this. I have successfully carried non-root ZFS
partitions into Debian. But I just lost 15TB to an unexpected
multipath/iscsid zpool import, so now I am unsure how I think this story
goes.

Debian ZFS is not easy to install as a root FS, which is disappointing. It
would be nice if it was integrated into the netinstall .iso as a legitimate
disk install option.

------
gumby
He finds ansible annoying — anybody suggest a good alternative?

I need to re-spin a set of standard utilities on local hardware from time to
time so am looking for the best way to manage the config files (bind, Apache,
and the like)

~~~
mongol
I have lately been thinking that an Ansible-like approach, but using
compiled code (maybe Go), could be a way to go. So I went looking for that,
and found Sup. It was not what I had in mind, but worth a closer look.

[https://news.ycombinator.com/item?id=12183370](https://news.ycombinator.com/item?id=12183370)

Edit: I have not tried it

~~~
StreamBright
Same here. We should create an efficient compiled version of Ansible: same
feature set, much faster execution, a single configuration-file flavor (YAML
only).

~~~
mongol
My idea is different. Instead of playbook YAML, you'd write playbook Go
source that compiles to a fat binary that is transferred over SSH. That
reduces the dependencies on the target hosts to SSH only (no Python). The
framework would include an idempotent API that matches all the tasks that
Ansible provides.

~~~
StreamBright
Yes, you are thinking about the same thing. I would just use YAML for the
things that can be configured.

------
vorpalhex
I use a somewhat similar setup, although I kept FreeBSD à la FreeNAS.

I use a FreeNAS server to manage storage pools and the bare-metal box, run
Ubuntu VMs on top of it, and then manage top-level applications in Docker in
the VMs via Portainer.

This is nice because the VMs get their own IP leases, but can still be
controlled and very locked down (or not) depending on their use.

Docker volumes are mounted over NFS from the underlying NAS, and the docker
level data is backed up with the rest of the NAS.

------
xvilka
There is also another option: Consul/Nomad/Vault/Terraform. The only thing I
wasn't able[1] to figure out was how to set up a private Docker registry with
Terraform/Nomad. The rest, including the GitHub organisation, DNS zones, etc.,
can be defined as code.

[1] [https://github.com/hashicorp/nomad-guides/issues/50](https://github.com/hashicorp/nomad-guides/issues/50)

~~~
Svenstaro
That's an orthogonal option in my opinion. It still allows you to run a full
Docker setup; it just influences how you run and manage everything. That
being said, I really like the HashiCorp stack and more people should run it.

------
lalos
What about the performance hit now that all of the services are running on
the same server? It might be interesting to monitor them and put resource
limits on specific services.

~~~
wezm
This is the CPU load on the server right now:
[https://imgur.com/a/nuM3CqX](https://imgur.com/a/nuM3CqX)

Being on HN front page has pushed it up from a baseline of 7.5% utilisation to
about 12.5%.

~~~
fineline
What kind of instance is that on?

Thanks for the article, very informative.

~~~
wezm
Thanks for reading. It's a 2-CPU, 4GB RAM instance. It's probably over-
specced...

------
ElijahLynn
You can use `docker-compose up --build` with build options in
docker-compose.yml to avoid the need for a container registry.
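
A sketch of what that looks like (service name and paths are placeholders):
the `build:` key points at a directory containing a Dockerfile, and
`docker-compose up --build` rebuilds the image locally instead of pulling it.

```yaml
version: "3"
services:
  blog:
    build: ./blog        # directory containing the Dockerfile
    ports:
      - "8080:8080"
```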

------
redsavagefiero
This is pretty classic amateur hour. He basically says 'I now know docker and
everything looks like a container! Look at how productive I can be by
discarding this set piece environment with out of fashion, slow and deliberate
provisioning for docker images that I can create once and run forever with
auto provisioning and dutch oven magic + an editor!'

------
z3t4
Virtual machines are supported on most platforms, and give more freedom than
Docker.

~~~
icebraining
Docker is just setting a few kernel parameters, VMs run a full OS over the
host, there's a cost to that. If you don't need the extra flexibility, there's
no point.

Mind that you don't need to use Docker to use containers, there's always LXD
and others.

~~~
z3t4
Many virtualization technologies use hardware acceleration supported on
modern server-grade CPUs. It's not like running on "bare metal", but pretty
close. One use case for VMs is running obsolete software, like some old
proprietary OS; it's a weird feeling when the software runs 100x faster than
the hardware it used to run on.

------
oarsinsync
> My sites are already quite compact. Firefox tells me this page and all
> resources is 171KB / 54KB transferred

16,323 bytes of text plus 6,933 bytes of vector graphics comes to 23,256
bytes. That's 22KB of content out of 171KB total, so 87% of the total is
potential bloat.

It could be worse, but there's almost certainly room for improvement.

------
jaimex2
I applaud the dedication if this really is just for personal infrastructure.

------
coleifer
This is what existential terror looks like. In the face of something
incomprehensible and impersonal, OP decides to spend a while rearranging the
deck chairs. Nothing has changed, but perhaps OP felt a little better about
himself for a while.

~~~
dwaltrip
What is the "incomprehensible and impersonal" thing that the OP is facing? I
skimmed the post and I have no idea what you are talking about. If I was
feeling adventurous, I might claim you are projecting a tad :)

------
gcb0
In sum, zero benefits from using containers.

You still have to decide on a single OS to reduce maintenance problems. He
could just have installed all the services (which are all available as
packages) and managed the configuration files, instead of configuration
files + Dockerfiles + S3 costs for a Docker image containing nothing but the
base OS, one package, and a configuration file.

~~~
Jureko
That's not a fair assessment and I'm surprised this is the top comment. In
this case, the huge advantage of a containerized setup is that everything is
now easily portable. If his server goes down, or he just decides to move, OP
can now deploy all of his websites onto another server instantly. He also
cites the ability to build (and test) locally before shipping images to
production, which is a really neat workflow. Improved security comes as an
added bonus.

As for the "s3 costs of docker image", it's a few cents per month.

~~~
NhanH
Concerns about the server going down or changing cloud provider are, imo,
not a particularly interesting or even useful advantage to mention for
_personal_ infrastructure. Considering that we're likely to change our
personal infrastructure less than once a year, and that I've never seen a
case where an unmaintained Docker setup still runs 6 months later, I'm not
sure the value of portability is that high.

~~~
geezerjay
> Concerns about server going down or changing cloud provider imo is not
> particularly interesting or even useful advantage to mention for personal
> infrastructure.

Why? Personal projects aren't more stable or bound to a single provider. If
anything, personal projects may benefit more from a deployment strategy that
makes it quite trivial to move everything around and automatically restart a
service in a way that automatically takes dependencies into account.

> Considering that it's likely we might change our personal infrastructure
> less than one every year

In my experience, personal projects tend to be more susceptible to
infrastructure changes, as they are used to experiment with stuff.

> and I've never got a case when an unmaintained docker setup can run 6 months
> later,

The relevant point is that the system is easier to maintain when things go
wrong and no one is looking or able to react at a moment's notice. It doesn't
matter if you decide to shut down a service 3 or 4 months after you launch
it, because that's not the use case.

> I'm not sure if the value for portable is that high.

That assertion is only valid if you compare Docker and docker-compose with an
alternative, which you didn't. Compared with manual deployment, there is
absolutely no question that Docker is by far the better deployment solution,
even if we don't take the orchestration functionalities into account.

------
ratling
I blacklist Vultr/Choopa in every environment I manage. They make zero effort
to deal with botnets and bad actors. They are the first org for which I went
to the trouble of blacklisting the entire ASN, so new ranges get blocked as
soon as MaxMind updates.

Everything else matches patterns we use. WebPageTest is probably the most
hilariously janky application that’s the best at what it does that we use.
Standing it up locally so you can test internal stuff is a revolting
experience I’ve had several times.

TBH I found Kubernetes easier to use than docker-compose, mainly because I
saw little reason to learn the syntax when I was already using Kubernetes
YAMLs and kubeadm makes it so easy to stand up. What you have looks pretty
simple though, so I may take another look at it.

You can actually ship images as tarballs and reimport them. That’s what I do
on personal stuff instead of standing up an ECR/registry. As long as you
version tag your containers it should be fine.
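
The tarball round-trip looks roughly like this (image name, tag, and host
are placeholders):

```
docker save my-app:1.2 | gzip > my-app-1.2.tar.gz
scp my-app-1.2.tar.gz server:
ssh server 'gunzip -c my-app-1.2.tar.gz | docker load'
```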

HTTP/2 is pretty much a nonissue for everyone except some janky java clients
at this point. We turned it on a couple of years ago with only minimal issues.

~~~
barbecue_sauce
Out of curiosity, how does one go about blacklisting an ASN?

~~~
steventhedev
MaxMind (and others, probably) regularly publish lists of IP blocks mapped
to the owning ASN: basically GeoIP, but instead of mapping to geographical
regions, it maps to the ASN. If you want to roll your own, you can probably
start from a public route reflector and go from there.

~~~
broknbottle
whois -h whois.radb.net -- -i origin AS20473 | /usr/bin/egrep '^route' | awk '{print $2}'

------
edoo
The real sauce in my infrastructure is simply 'generational' storage. No more
clean installs with a backup dir of the system. All my data volumes get
mounted into my machine/s as needed. I can swap out all my hard drives with
new ones without any downtime too which is nice. I could reinstall my desktop
each morning and it would barely set me back 15 minutes. If I also properly
stored my general system settings as code (cfengine, puppet, chef, etc) I
could practically just launch ephemeral desktop instances and have basically a
native experience.

~~~
dman
Could you elaborate a bit what generational storage means? If you get a brand
new machine - how does it get configured in 15 mins?

~~~
edoo
For me a long time ago an upgrade meant buy new hard drives, do a reinstall,
and copy either the entire old drive or main data dirs onto the computer.

Now my data volumes are separate from the system. I use RAID for redundancy
on the physical machine, with another drive for differential filesystem-level
backups (a snapshot every day for 60 days, with only differential storage
cost; see rdiff-backup). When I upgrade I just add new drives, assign them
into the RAID, wait for them to sync, and remove the old drives. By
generational I mean that my storage, and how I do things, no longer change
when I reset the system like they used to. All my data lives outside the
system I'm using and must be attached.

I can for example, spin up an AWS machine, run the package install commands
for the apps I normally use, VPN to my fileserver and mount my home directory
and project volumes, and in quite literally about 15-30 minutes have the exact
same environment as I do at my house. CFEngine et al would speed that up quite
a bit.

~~~
dman
Thanks for the explanation, super helpful.

~~~
edoo
You are welcome. I'd have to say the biggest benefit for me personally is
that the distinct separation and permanence lead me to keep it very
organized. When I had my files 'in' my system, the lines blurred and things
got spread all over the place. I still keep my download dir on the base
system, and only things I really care about make it to the data drive, so
there isn't much pollution. I'm a bit scatterbrained in that regard and this
method accidentally solved that.

~~~
Dahoon
That sounds like a good setup to keep things where they belong. Thanks.

------
sureaboutthis
And after all that, RedHat dumps Docker and Docker adopts Kubernetes. He
should have stuck with the stability of FreeBSD--which he prefers--and used
jails.

~~~
sgt
But Kubernetes still uses Docker, don't they?

~~~
1337shadow
Kubernetes relies on containerd which works with docker, rkt or anything that
works with containerd.

The preferred alternative to Docker is rkt.

~~~
sascha_sl
No they don't. Containerd is the code Moby and Docker are built from,
developed at Docker Inc.

Kubernetes will take anything that implements CRI.

[https://cri-o.io/](https://cri-o.io/)

[https://kubernetes.io/docs/setup/cri/](https://kubernetes.io/docs/setup/cri/)

~~~
1337shadow
All right ! Thanks for the heads up, deeply appreciated.

However, I still stand with rkt rather than docker...

~~~
sascha_sl
It can be quite complicated if you use a docker version without containerd's
CRI. And you do if you follow the version recommendations (because docker has
a lot of regressions). GKE does it, so we do it.

Kubelet actually has a translation layer baked into it that it starts in-
process when detecting docker, which provides the gRPC CRI interface on a real
filesystem socket.

[https://github.com/kubernetes/kubernetes/tree/master/pkg/kub...](https://github.com/kubernetes/kubernetes/tree/master/pkg/kubelet/dockershim)

