Everything I know about Kubernetes I learned from a cluster of Raspberry Pis (jeffgeerling.com)
470 points by alexellisuk 15 days ago | 77 comments

A good method for learning how clusters work and playing with them without having to spend a lot of time re-building when you break things is kind (https://kind.sigs.k8s.io/).

Each node is a Docker container, but the version of Kubernetes running inside it is vanilla Kubeadm, so it's quite representative of what "real" clusters would look like.

The great thing about it is you can spin up a cluster in < 2 mins locally to try things out, then it's a single command to delete.
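A rough sketch of that workflow, for anyone who wants more than the default single node: the snippet below only generates a kind config file. The `kind.x-k8s.io/v1alpha4` config format and the `kind create cluster` / `kind delete cluster` commands are kind's own; the helper name `kind_config`, the file name in the comments, and the node count are placeholders made up for illustration.

```python
def kind_config(workers: int = 2) -> str:
    """Return a kind cluster config with one control-plane node and N workers."""
    lines = [
        "kind: Cluster",
        "apiVersion: kind.x-k8s.io/v1alpha4",
        "nodes:",
        "- role: control-plane",
    ]
    # Each extra entry becomes one more Docker container acting as a node.
    lines += ["- role: worker"] * workers
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Save the output as e.g. kind-multinode.yaml (hypothetical file name), then:
    #   kind create cluster --config kind-multinode.yaml
    #   kind delete cluster
    print(kind_config(workers=2))
```

Deleting the cluster removes the node containers, so breaking things is cheap.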

I use Kind (and Minikube, and a number of other solutions too), but this is kind of my "Kubernetes-the-really-hard-way" fun project.

Note that I maintain a parallel configuration that runs a multi-node cluster on Vagrant for local development [1] as well as a docker environment built for CI purposes[2] using almost all the same images, with the same Ansible playbooks to configure them all across platforms.

[1] https://github.com/geerlingguy/raspberry-pi-dramble/tree/mas...

[2] https://github.com/geerlingguy/raspberry-pi-dramble/tree/mas...

Thank you for all of your great Ansible work! I use a lot of it at work and it has been great! (I’ll be playing around with these!)

yeah I think this approach will have fewer limitations than kind, and also it's fun to try different approaches!

BTW I think your ansible book is great, really helped me get to grips with it.

Sorry, late to the party.

Thanks for sharing. I fully agree with the fun, and the really-hard-way part.

However, if we ignore that part for the moment, would you think having an R-Pi cluster handy is worth it for training/sandboxing/testing with the aim of applying the skills in your day job?

Are there any advantages to having access to your own R-Pi kubernetes cluster, despite having stuff like Kind and Minikube available online?

How does this compare to k3s? https://rancher.com/docs/k3s/latest/en/

Disclaimer: k3s creator

Kind is intended to be a development tool to run fully upstream clusters using kubeadm in Docker containers.

k3s is a custom distro that is oriented towards lightweight environments. The scopes of k3s and kind are different (distro vs tool to run k8s in docker). There is another project called k3d (https://github.com/rancher/k3d) that more directly compares to kind. k3d will run k3s clusters in Docker. k3d creates clusters significantly faster and uses less mem/cpu resources than kind.

Kind is explicitly not for production workloads[0]:

  Non-Goals: Being “production workload ready” - kind is meant to be used:

    for testing Kubernetes itself
    for testing against Kubernetes (EG in CI on Travis, Circle, etc.)
    for “local” clusters on developer machines
    NOT to host workloads serving user traffic etc.

K3s seems happy to support production workloads. It's a single binary and gets you a cluster in one command:

  curl -sfL https://get.k3s.io | sh -
  sudo kubectl get nodes

Personally, I use both Kind and k3s. Kind is great for running locally and k3s is awesome for single-node kubernetes on a vps.

[0] - https://kind.sigs.k8s.io/docs/contributing/1.0-roadmap/#non-...

"not meant to be used" mostly just means that when something breaks, you're on your own.

Which, you know, you were anyway.

What are the benefits you find in using Kubernetes for a single-node cluster on a vps?

I host a few personal websites and Kubernetes allows me to abstract the server setup. I don't need a highly available cluster and found k3s on a single node to be a nice replacement for manually configuring a reverse proxy, systemd services, deployment scripts, etc.

In that scenario it's kind of like a much nicer systemd for docker containers.

KIND uses kubeadm, which is the upstream standard for bootstrapping. It also supports multiple masters and even launches an haproxy load balancer for kube-apiserver! K3S uses its own bootstrapping, as well as non-upstream binaries. It also replaces etcd with its own shim.

Non-kubeadm bootstrapped clusters are snowflakes in my opinion. You will find many docs that will not work with k3s due to missing systemd. You also lose some flexibility. For example there is no way to change the container runtime cgroup driver.

KIND is designed for local use on your workstation. Cluster API is the project to use for production deployments. You can drive Cluster API with KIND for testing though!

k3s has no direct relationship to systemd; I'm not sure what things not working you are referring to.

k3s has full flexibility to change any k8s option or component; it merely defaults to settings that meet its goals of being a simple, lightweight, and secure distro.

> k3s has no direct relationship to systemd

Exactly, this differs from every kubeadm-bootstrapped cluster, which uses systemd to run the kubelet, as well as using common config locations and behavior.

> k3s has full flexibility to change any k8s option or component

Quick Example: How do you change the kubelet's cgroup-driver? This looks to be hard-coded into the k3s binaries.


k3s is usually (AFAIK) run on VMs for production purposes, rather than being used for testing, which is where kind shines.

The major advantage of kind is you download a single binary, run one command and quickly have a functional, disposable cluster, anywhere that Docker runs.

k3s also shines in dev/test/CI. It's smaller and spins up faster and has specific features to preload applications on startup. It's a single ~50mb binary also so it's fast to download and run in CI. Many users have moved from kind to k3s in CI because of its speed, flexibility, and simplicity.

+1 to this. k3s is an extremely lean and simple k8s. It is a fully certified k8s distro, with backward-compatibility code removed.

perfectly fine running on rpi or production.

Everything I know about Kubernetes I learned from a single-node (pseudo)clustered refurbished ThinkPad X201.

It cost me 90€ at an IBM refurbished sale. It is downstairs by the router, and it has been hosting everything for me except my email and my blog (which I want to host, but I'm not sure about the reliability of my ISP's service; this morning it went down for 5h12m, without prior notice or anything).

It is amazing how much you learn from doing stuff. I'm currently in my 3rd year of university for CS, so I've tried the academia-style learning process, reading books on my own, and doing stuff on my own. The last is the best method by far.

I only really learn by doing stuff. I usually read books to get started, but mostly I quickly have to turn to building something, then revisit the book as I go. If I only read the book/paper/whatever, I usually ‘think’ I get it, but nearly always that isn’t the case :-)

3 years ago, I also wanted a bare metal cluster for my homelab. I wanted x86-64, low power consumption, small footprint, and low cost. I ended up building this 5 node nano ITX tower:


I think that the exposed boards add to its charm. Doesn't help with dust though.

Yours is a lot neater than the four node bare cluster I built a few years ago: https://rwmj.wordpress.com/2014/04/28/caseless-virtualizatio...

One issue with caseless machines is the amount of RF they emit. Makes it hard to listen to any broadcast radio near one and probably disturbs the neighbours similarly.

I'm now using a cluster of NUCs which is considerably easier to deal with although quite a lot more expensive: https://rwmj.wordpress.com/2018/02/06/nuc-cluster-bring-up/

Very nice; I've considered doing something similar and running some production sites on it from my home, but the limitation has always been my terrible Internet bandwidth through Spectrum.

We almost got Verizon gigabit fiber a few years ago... then AT&T ran fiber to the front of my neighborhood last year, and then never ran it to the houses. As it is, I'm stuck with 10 mbps uplink, which is not enough to be able to do most of what I would want to do with a more powerful local cluster.

This is very cool. Curious, roughly, how much did this setup cost?

Great article! Never stop tinkering.

Here's how I got to know Kubernetes:

By the end of 2016, the Kubernetes hype was just about to pick up real steam. As somebody who always liked the idea of running something like my own cluster in the cloud, I attended KubeCon Europe in early 2017. The event was sold out, and took place in a venue almost too small for the number of attendees. It was great. During the event I was just about to finish the Hobby Kube[1] project. Back then there weren’t any guides that addressed all the problems encountered when running Kubernetes on a budget, using low-cost VPSes on random cloud providers. So I dived into the subject in the second half of 2016 and started writing a guide, including automated provisioning using Terraform. I discovered WireGuard in the process of looking for a way to efficiently secure network traffic between hosts. This still makes me feel like I was an early adopter of something that’s becoming hugely popular.

If somebody would like to add a Terraform module for deploying to Raspberry Pi, please ping me or open a PR here[2].

[1] https://github.com/hobby-kube/guide

[2] https://github.com/hobby-kube/provisioning

I learned k8s with some NUCs we had laying around at work. Might be easier than Pi, but not as cheap. Some things I used:

https://metallb.universe.tf/ (LoadBalancer)

https://cilium.io/ (Networking)

https://rook.io/ (Persistent Storage)

I also use MetalLB and Rook. Can't say enough good things about them.

I have also used Kube-Router (https://kube-router.io - Digital Ocean's non Virtual networking plugin for bare metal environments; it puts your containers on your physical network, which is freaking neat) and loved that, but since I started deploying kubernetes with Rancher I've found for dev clusters I'm not caring about what networking is used. (currently running Canal).

Not sure what we will decide on when we go to production.

I've had a lot of performance issues with Rook. I set up a k8s cluster six months ago on a few ThinkCentre Tinys (more powerful than a Pi). They each have a single 120GB SSD, so I've allocated a directory on the filesystem (ext4) to Rook rather than a dedicated device.

Originally I only had 2GB of RAM and Rook ended up using most of this. Even now the CPU usage by Rook is quite high even when the system is idle. It hasn't lost data though, even though one of the SSDs died and was unrecoverable, so props for that. To be fair I haven't tried upgrading since I installed it so maybe my issues would be resolved by that.

Overall Rook feels somewhat overkill for a homelab user, but I haven't seen any recommended alternatives.

You need to set a memory target for your OSDs, or they will gobble up as much as possible for their cache. In Rook you do that by specifying the resource limits on your CephCluster[0].

Sadly, Rook enforces some pretty high minimum values (2GiB per OSD, 1 GiB per mon), which is fine for production but can make homelab experimentation annoying.

[0]: https://rook.io/docs/rook/v1.1/ceph-cluster-crd.html#cluster...
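To see why those minimums hurt on small machines, here is a back-of-the-envelope check. The two constants are simply the figures quoted in the comment above (they may differ between Rook versions), and the function names are made up for this sketch. It ignores the OS, the kubelet, and every other pod on the node.

```python
# Minimums quoted in the comment above, in GiB; treat them as approximate.
MIN_GIB_PER_OSD = 2
MIN_GIB_PER_MON = 1


def rook_min_memory_gib(osds: int, mons: int) -> int:
    """Lower bound on the memory the Ceph OSD and mon daemons would be given."""
    return osds * MIN_GIB_PER_OSD + mons * MIN_GIB_PER_MON


def fits(node_ram_gib: float, osds: int, mons: int) -> bool:
    """True if the minimums alone fit in the node's RAM (nothing else counted)."""
    return rook_min_memory_gib(osds, mons) <= node_ram_gib


if __name__ == "__main__":
    # A node with one OSD and one mon, as on a small homelab box.
    print(rook_min_memory_gib(osds=1, mons=1))
    print(fits(node_ram_gib=2, osds=1, mons=1))
```

With one OSD and one mon per node, the minimums already exceed the 2GB of RAM mentioned earlier in the thread.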

Does Rook give you the equivalent of EBS root volumes for your nodes then? Is that the function you have it providing? Does it offer something beyond using local host storage and minio?

I ask because I've generally been confused about the use case for Rook despite having read the "what is Rook?" paragraph many times on the project home page. My assumption is that it lets you build your own internal cloud provider. Is that correct?


Rook will give you:

- Ceph: Block Storage, Shared storage, Object storage

- EdgeFS: Object but apparently will function as Block and File

- Minio: Object

- NFS4: Shared

- Cockroach: Database

- Cassandra: Database

- YugabyteDB: distributed SQL database (new to me)

Only the first two are marked as stable.

I'll mention that they're all independent projects under rook's umbrella.

Not directly, rook provides a storage backend for in-cluster workload persistent volumes.

In theory you could manage kvm machines with kubevirt and back the machines with PVCs from rook. I have not tried this and would be curious how the performance is.

I would say they're functionally quite equivalent, you just wouldn't pull them in as a root volume.

You should be able to point your default StorageClass at Ceph and have it create RBD block storage devices for you, which would auto-create the PVCs that you mount your actual data in, in your kube manifests.

The parent comment was asking about root volumes for kubernetes nodes

> Does Rook give you the equivalent of EBS root volumes for your nodes then?

I think we are saying the same thing if i'm not mistaken?

I doubt the parent really cares about whether or not it's a root volume, but for the record you can mount it to root.

It'll just essentially be empty, which means you have to add a step of populating root with something useful. You'd have to either try to use an initContainer to copy in a filesystem at runtime, have ceph give you a pre-populated directory, probably via thin provisioning, or do something out of band (I've done this in k8s).

This is usually more effort than it's worth though, as container runtimes populate the root for you with whatever you want anyway. Plus, if you start treating containers as blobs like VMs, you'll end up in the situation where you don't know where your important variable data is, which leads to situations where people forget to back it up and test it.

I only said "root volumes" as it's a common use case for EBS volumes. For instance, with etcd running in a container you would want the host volume to be an EBS volume, since it's a critical K8S component.

Right, but you would only need to mount the etcd data partition (/var/lib/etcd afaik), rather than the entire node and/or container.

The main problem you have here is chicken and egg. How do you use a StorageClass, kubernetes PVC, or rook to provision Block Storage for etcd, when you need etcd for kubernetes to function, and you need kubernetes to function for rook, et al.?

At some point, you need to bootstrap the world, which is why people start off with either cloud APIs, ansible, or PXE.

>"The main problem you have here is chicken and egg. How do you use a StorageClass, kubernetes PVC, or rook to provision Block Storage for etcd, when you need etcd for kubernetes to function, and you need kubernetes to function for rook, et al.?"

I totally agree. The EBS lifecycle management is generally handled by something like Terraform. That's why I was wondering if the use case for Rook is primarily bare-metal Kubernetes, since AWS/GCP et al. already provide these. So I'm wondering: even in a bare-metal environment, where you still need config management tools like Ansible/Terraform to do things like provision block storage, what's the upside of Rook over existing iSCSI/Ceph/minio installations?

>"Not directly, rook provides a storage backend for in-cluster workload persistent volumes."

Right, so is it fair to say that the use case for Rook is if you are running Kubernetes on bare metal? For instance, if I'm running a Kubernetes cluster on AWS, then AWS already provides PVC via EBS and S3 volumes. Or am I overlooking a use case where you would run Rook on a cluster running on AWS/GCP?

Yes, rook is great for bare metal and would enable dynamically provisioned persistent volumes.

Rook runs a ceph cluster inside your kubernetes cluster to provide the storage. The downside is this consumes cpu/memory (and obviously storage) resources to run, whereas on AWS, EBS is integrated into the platform so it does not "run" inside your cluster (other than the aws cloud-provider that provides the integration).

If you wanted to run the same storage backend on bare metal and AWS, rook would enable that.

Thankfully these days, places like Digital Ocean and Linode have managed K8s where you only pay for the compute nodes, for $5/month each.

So it's fairly cheap and easy to learn on a "real" cluster without having to build one.

Building a cluster is a fun and relatively easy project. You learn much more by having the hardware at your fingertips. You can simulate network failures and power failures or you can crash an important daemon. By causing problems you can see how K8s responds and manages itself when it does not have the nodes it expects. It is important to know these things because, for example, if you create a pod instead of a replica set then the loss of a node will mean the inability to use the pod assigned to that node. You need to know how a pod and a replica set differ in order to create a self-healing stack. You can learn all of that with the cloud solutions but the ability to answer your own questions will always be the superior means of learning.
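The pod vs. replica set difference can be sketched with a toy model. This is not how Kubernetes is implemented; `Cluster`, `schedule`, `fail_node` and `reconcile` are names invented for this sketch, and the "scheduler" just picks the least loaded node. It only illustrates the idea that a controller keeps recreating pods until the desired count is met, while a bare pod has nothing watching over it.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Cluster:
    nodes: set
    pods: dict = field(default_factory=dict)  # pod name -> node name
    _ids: count = field(default_factory=count, repr=False)

    def schedule(self, prefix: str) -> str:
        """Place a new pod on the node currently running the fewest pods."""
        placed = list(self.pods.values())
        node = min(sorted(self.nodes), key=placed.count)
        name = f"{prefix}-{next(self._ids)}"
        self.pods[name] = node
        return name

    def fail_node(self, node: str) -> None:
        """Remove a node and every pod that was running on it."""
        self.nodes.discard(node)
        self.pods = {p: n for p, n in self.pods.items() if n != node}

    def reconcile(self, prefix: str, replicas: int) -> None:
        """ReplicaSet-style loop: create pods until the observed count matches."""
        while sum(p.startswith(prefix + "-") for p in self.pods) < replicas:
            self.schedule(prefix)


if __name__ == "__main__":
    cluster = Cluster(nodes={"pi-1", "pi-2", "pi-3"})
    bare = cluster.schedule("bare")   # standalone pod, no controller behind it
    cluster.reconcile("web", 2)       # two pods owned by a controller

    cluster.fail_node("pi-1")         # pull the plug on two of the three nodes
    cluster.fail_node("pi-2")
    cluster.reconcile("web", 2)       # the controller notices and replaces pods

    print(bare in cluster.pods)       # the bare pod is not recreated
    print(sorted(cluster.pods))
```

On real hardware the same experiment is just unplugging a node and watching `kubectl get pods`.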

The Agile movement has convinced many that the solution that gets you up and running today is the best. A good engineer is not afraid of working and learning instead of just buying a pre-baked implementation and calling themself an experienced user. The cluster that I know like the back of my hand is my “real” cluster. The production cluster is someone else’s that I just rent.

Through my work on this project, I've learned a ton about kubectl, kubeadm, the control plane, etc. that I never learned when managing EKS clusters at a previous job. It's good to know more of the 'guts' of an insanely complex ecosystem like Kubernetes, which is why I often recommend people follow Kelsey Hightower's "Kubernetes the Hard Way" guide to install Kubernetes from primitives at least once.

I bought into the rhetoric about the superiority of using real tech when I was studying for IT. It will make you more competent when you actually get on the job. It gets you used to encountering issues that come up in real environments far more than in virtualized/simulated ones, e.g. loose cables.

Yet to be honest I wouldn't recommend going the real tech route, because I don't think it justifies the time and money investment when you can learn the lessons it teaches you on the job. One thing I realised is that it doesn't really help you pass certs or look good in interviews, aside from your ability to say "I made a k8s cluster out of raspberry pis". I might endorse it from a strict fun standpoint.

Sure. It's not necessarily skipping all those things you mention that's the motivation. Just the desired order and upfront effort to learn things about K8S.

I LOVE Jeff's work and I own his Ansible book. The setup is pretty awesome. FWIW I think you could do this with local VMs a little more easily and provision with Vagrant. Having said that, a cluster of Pis is super fun!

What do you know, that's exactly how I do local development work for the cluster ;) https://github.com/geerlingguy/raspberry-pi-dramble/tree/mas...

Aha! Makes sense.

Yeah, using local vms is going to greatly expand access to this kind of learning. A current developer laptop is more than beefy enough, and I'd suggest that there aren't many useful lessons to learn from the pi hardware that can't be learned fully virtualized.

But if you're having fun, of course, more power to you :)

Just an aside, your parts list for the pi dramble has 4 Pi 4B's but micro usb power cables

Oops! I forgot to update that when I switched everything out for the 4 Bs. I'll update that in a little bit.

It has been updated.

This seems like a fun project and a great way to learn Kubernetes, but if I'm dropping this much money on it, I'd like it to have some productive purpose afterwards.

I'm a full-stack web dev primarily using node/react/postgres. I've also got some projects currently hosted on a Linode instance. Ideas on fun/productive uses for this cluster after I've built it and messed around with Kubernetes?

Move the projects to your cluster and see what happens!

Consider the fact that, if you make improvements to the cluster, all of your apps will see that same lift. So if you were to set up backups on the cluster's persistent volumes and its databases, you'd get free backups for all of the projects you've moved. Same with monitoring, autoscale, and so on.

RPI's will handle Node surprisingly well, from my experience. A small cluster of them would be well suited to any number of web projects. A few years back, I played around with haproxy and a few Pi's running Node servers. You may be surprised how well they work as servers, as long as you're not expecting Xeon-level speeds.

The Kubernetes ecosystem is evolving rapidly, you might want to keep it around to play with different CNIs, CSIs, service meshes, operators for clustered software lifecycle, and more.

I suppose I was lucky dabbling in one of my side-projects because I had money to burn. I put it all in Google Cloud and then learned just how much you have to complicate the stack (beyond the complication of K8S) to lower infra costs.

Suddenly I wasn't using basic Kubernetes any more, I was setting up a new ingress controller so Google wouldn't launch a new load balancer instance for every public service I exposed. Those things aren't cheap.

It was an amazing way to experience just how much you can suffer in the cloud, and just how far you can go down the rabbit hole with this kind of tech.

Wonder if some old netbooks could be used for this purpose. Doubt I am the only one with a small pile of them laying around.

From experience, anything with 2GB or more RAM can be a master node. Workers can even have 1GB and work just fine.

Be warned, though, in my experience etcd requires a somewhat decent read/write latency, or else it's going to fail, and when etcd fails everything fails. Your changes don't apply, etc.

_Technically_ 1GB is the limit, but yeah, at 1GB I was hitting OOM errors every day or so (this was on the Pi 3 B+). I was very happy when I found out the Pi 4 had 2 or 4GB models! No matter what, you should also make sure to not schedule Pods on the master node(s) when you have limited memory (or in most cases, really...).

Or set tight resource limits.

Setting resource requests also enables the use of HorizontalPodAutoscalers.

Very true. The Pi Dramble's web pods are set to scale using the HPA.
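For context on why requests matter here: the HPA's CPU utilization target is a percentage of the pod's CPU request, and the Kubernetes docs give the scaling rule as desired = ceil(current * currentMetric / targetMetric). A minimal sketch of that arithmetic follows; the function names and the numbers are made up, and the real controller also applies a tolerance band and stabilization windows that are left out here.

```python
import math


def cpu_utilization(usage_millicores: float, request_millicores: float) -> float:
    """Utilization as a percentage of the pod's CPU request."""
    if request_millicores <= 0:
        # Without a request there is nothing to measure utilization against.
        raise ValueError("a CPU request is needed to compute utilization")
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas: int, current_util: float, target_util: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """ceil(current * currentMetric / targetMetric), clamped to [min, max]."""
    raw = math.ceil(current_replicas * current_util / target_util)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    util = cpu_utilization(usage_millicores=200, request_millicores=250)
    print(util)
    print(desired_replicas(current_replicas=2, current_util=util, target_util=50))
```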

Thanks Jeff for your great Ansible roles!

This is also how my alma mater (Cal Poly SLO) teaches Hadoop. Building real-world clusters is expensive, and giving each student their own is difficult. However, small clusters of Raspberry Pis are cheap, and it's also very easy to demonstrate how unplugging one affects the cluster.

@alexellisuk almost all of my interest in Kubernetes is due to your work. Thank you for everything you do!

You can also build a cluster on your computer if it's beefy enough by running multiple VMs. I bought an HP Z820 workstation (16 cores, 128 GB) for $1000 a few years ago and that's my k8s experiment land.

I don’t fully understand where the line goes from “all I need is a Docker container for nginx + Postgres + Redis + my services” to “I need Kubernetes”

When does one need to go from just Docker containers to container orchestration like k8s?

Do you care about availability? Do you care about rolling upgrades? Do you want to practice either of those? Then it might be time to make the switch.

Yes, Docker has Swarm Mode. And yes, it is easier to get started with than K8s.

The problem is that it locks you into a lot of choices when you first spin up each (Docker) Service, that can't be changed without taking down the Service and then recreating it with the new option. Want to change how upgrades are rolled out? Too bad!

Kubernetes still has a few settings like that, but since (K8s) Services are just responsible for routing traffic to separately-managed Pods, you can always fall back to rolling the upgrade manually, or doing a blue-green deployment.
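The point about Services only routing traffic can be shown with a toy version of label-selector matching. This mimics the idea rather than the real implementation; the pod names, labels and the `endpoints` helper are invented for illustration.

```python
def endpoints(selector: dict, pods: list) -> list:
    """Names of pods whose labels contain every key/value pair in the selector."""
    return [p["name"] for p in pods
            if all(p["labels"].get(k) == v for k, v in selector.items())]


pods = [
    {"name": "app-blue-1", "labels": {"app": "web", "version": "blue"}},
    {"name": "app-blue-2", "labels": {"app": "web", "version": "blue"}},
    {"name": "app-green-1", "labels": {"app": "web", "version": "green"}},
]

service_selector = {"app": "web", "version": "blue"}
print(endpoints(service_selector, pods))   # traffic goes to the blue pods

# Blue-green cutover: only the Service's selector changes; the pods are untouched.
service_selector["version"] = "green"
print(endpoints(service_selector, pods))   # traffic now goes to the green pod
```

Because the pods are managed separately from the Service, switching back is the same one-field change in reverse.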

I do blue-green deploys with Docker and I deploy the services as “app-1”, “app-2”, “app-3”, etc.

How do you route dependencies to the correct service? If you're using a stack then you end up needing to blue-green the whole thing, if you're not then you still need to blue-green the whole dependent subset of the dependency graph.

And for anything exposed outside of the swarm you'd need to use some reverse proxy, since only one service (in the whole swarm) can bind a port at any one time.

> How do you route dependencies to the correct service?

Reload the nginx config with proxy_pass switch from blue -> green or green -> blue.
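A minimal sketch of that switch, assuming the two deployments listen on fixed local ports. The addresses, the `UPSTREAMS` table and the helper names are hypothetical; only `proxy_pass` and `nginx -s reload` are real nginx features. The script just renders the config fragment; writing it into place and reloading nginx is left to the deploy script.

```python
UPSTREAMS = {
    "blue": "http://127.0.0.1:8001",   # hypothetical address of the blue deployment
    "green": "http://127.0.0.1:8002",  # hypothetical address of the green deployment
}


def render_location(active: str) -> str:
    """Render an nginx location block that proxies to the active colour."""
    return (
        "location / {\n"
        f"    proxy_pass {UPSTREAMS[active]};\n"
        "}\n"
    )


def other(active: str) -> str:
    """Return the colour that is not currently live."""
    return "green" if active == "blue" else "blue"


if __name__ == "__main__":
    live = "blue"
    # Deploy the new version to the idle colour, then point nginx at it.
    live = other(live)
    print(render_location(live))
    # After writing this fragment into the nginx config: nginx -s reload
```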

As soon as you want load balancing for high availability, but I personally am not a fan of Kubernetes.

What's an alternative?

I can also recommend MagicSandbox (https://www.msb.com/) which provides a lot of learning content alongside a real k8s remote environment.

For me it was a Blue Gene/L.

Thanks Jeff!
