Each node is a Docker container, but the Kubernetes running inside it is bootstrapped with vanilla kubeadm, so it's quite representative of what "real" clusters look like.
The great thing about it is you can spin up a cluster in < 2 mins locally to try things out, then it's a single command to delete.
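For reference, the whole lifecycle on a machine that already has Docker is roughly:

kind create cluster    # boots a single-node cluster (a Docker container running kubeadm-built Kubernetes)
kind delete cluster    # tears the whole thing down again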
Note that I maintain a parallel configuration that runs a multi-node cluster on Vagrant for local development, as well as a Docker environment built for CI purposes using almost all the same images, with the same Ansible playbooks to configure them all across platforms.
BTW, I think your Ansible book is great; it really helped me get to grips with it.
Thanks for sharing. I fully agree with the fun, and the really-hard-way part.
However, if we ignore that part for the moment, would you think having an R-Pi cluster handy is worth it for training/sandboxing/testing with the aim of applying the skills in your day job?
Are there any advantages to having access to your own R-Pi kubernetes cluster, despite having stuff like Kind and Minikube available online?
Kind is intended to be a development tool to run fully upstream clusters using kubeadm in Docker containers.
k3s is a custom distro that is oriented towards lightweight environments. The scope of k3s and kind are different (distro vs tool to run k8s in docker). There is another project called k3d (https://github.com/rancher/k3d) that more directly compares to kind. k3d will run k3s clusters in Docker. k3d creates clusters significantly faster and uses less mem/cpu resources than kind.
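For a rough side-by-side, assuming a recent k3d (v3+, where the CLI was reworked into noun-verb form):

kind create cluster --name demo    # kubeadm-based Kubernetes in Docker containers
k3d cluster create demo            # k3s-based Kubernetes in Docker containers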
Non-Goals: Being “production workload ready” - kind is meant to be used:
for testing Kubernetes itself
for testing against Kubernetes (EG in CI on Travis, Circle, etc.)
for “local” clusters on developer machines
NOT to host workloads serving user traffic etc.
- https://kind.sigs.k8s.io/docs/contributing/1.0-roadmap/#non-...
curl -sfL https://get.k3s.io | sh -   # installs k3s (server) as a systemd service on the node
sudo kubectl get nodes                # the kubeconfig lands in /etc/rancher/k3s/k3s.yaml, hence the sudo
Which, you know, you were anyway.
Non-kubeadm-bootstrapped clusters are snowflakes, in my opinion. You will find many docs that won't work with k3s because systemd is missing. You also lose some flexibility; for example, there is no way to change the container runtime cgroup driver.
KIND is designed for local use on your workstation. Cluster API is the project to use for production deployments. You can drive Cluster API with KIND for testing though!
k3s has full flexibility to change any k8s option or component; it merely defaults to settings that meet its goals of being a simple, lightweight, and secure distro.
Exactly; this differs from every kubeadm-bootstrapped cluster, which uses systemd to run the kubelet as well as common config locations and behavior.
> k3s has full flexibility to change any k8s option or component
Quick Example: How do you change the kubelet's cgroup-driver? This looks to be hard-coded into the k3s binaries.
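For comparison, on a kubeadm-bootstrapped cluster this is just a field in the KubeletConfiguration you hand to kubeadm init (partial sketch, not a complete config):

# kubeadm-config.yaml (partial), used as: kubeadm init --config kubeadm-config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd    # or "cgroupfs", to match whatever your container runtime uses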
The major advantage of kind is you download a single binary, run one command and quickly have a functional, disposable cluster, anywhere that Docker runs.
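And if you want more than one node, it's a small config file (assuming a recent kind release; older ones used a different apiVersion):

# kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker

kind create cluster --config kind-config.yaml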
Perfectly fine running on an RPi or in production.
It cost me 90€ at an IBM refurbished sale. It is downstairs by the router, and it has been hosting everything for me except my email and my blog (which I want to host but I'm not sure about the reliability of my ISP's service; this morning it went down for 5h12m, without prior notice or anything).
It is amazing how much you learn from doing stuff. I'm currently in my 3rd year of university for CS, so I've tried academia-style learning, reading books on my own, and doing stuff on my own. The last is the best method by far.
I think that the exposed boards add to its charm. Doesn't help with dust though.
One issue with caseless machines is the amount of RF they emit. Makes it hard to listen to any broadcast radio near one and probably disturbs the neighbours similarly.
I'm now using a cluster of NUCs which is considerably easier to deal with although quite a lot more expensive: https://rwmj.wordpress.com/2018/02/06/nuc-cluster-bring-up/
We almost got Verizon gigabit fiber a few years ago... then AT&T ran fiber to the front of my neighborhood last year, and then never ran it to the houses. As it is, I'm stuck with 10 mbps uplink, which is not enough to be able to do most of what I would want to do with a more powerful local cluster.
Here's how I got to know Kubernetes:
By the end of 2016, the Kubernetes hype was just about to pick up real steam. As somebody who always liked the idea of running my own cluster in the cloud, I attended KubeCon Europe in early 2017. The event was sold out and took place in a venue almost too small for the number of attendees. It was great. During the event I was just about to finish the Hobby Kube project. Back then there weren't any guides that addressed all the problems encountered when running Kubernetes on a budget, using low-cost VPSs from random cloud providers. So I dived into the subject in the second half of 2016 and started writing a guide, including automated provisioning using Terraform. I discovered WireGuard in the process of looking for a way to efficiently secure network traffic between hosts. This still makes me feel like I was an early adopter of something that's becoming hugely popular.
If somebody would like to add a Terraform module for deploying to Raspberry Pi, please ping me or open a PR here.
I have also used Kube-Router (https://kube-router.io - Digital Ocean's non-virtual networking plugin for bare-metal environments; it puts your containers on your physical network, which is freaking neat) and loved that, but since I started deploying Kubernetes with Rancher I've found that for dev clusters I don't care much about what networking is used (currently running Canal).
Not sure what we will decide on when we go to production.
Originally I only had 2GB of RAM and Rook ended up using most of this. Even now the CPU usage by Rook is quite high even when the system is idle. It hasn't lost data though, even though one of the SSDs died and was unrecoverable, so props for that. To be fair I haven't tried upgrading since I installed it so maybe my issues would be resolved by that.
Overall Rook feels somewhat overkill for a homelab user, but I haven't seen any recommended alternatives.
Sadly, Rook enforces some pretty high minimum values (2 GiB per OSD, 1 GiB per mon), which is fine for production but can make homelab experimentation annoying.
I ask because I've generally been confused about the use case for Rook despite having read the "what is Rook?" paragraph many times on the project home page. My assumption is that it lets you build your own internal cloud provider. Is that correct?
Rook will give you:
- Ceph: Block Storage, Shared storage, Object storage
- EdgeFS: Object but apparently will function as Block and File
- Minio: Object
- NFS4: Shared
- Cockroach: Database
- Cassandra: Database
- YugabyteDB: distributed SQL database (new to me)
Only the first two are marked as stable.
In theory you could manage KVM machines with KubeVirt and back the machines with PVCs from Rook. I have not tried this and would be curious how the performance is.
You should be able to point your default StorageClass at Ceph and have it create RBD block storage devices for you, which would auto-create PVCs that you mount your actual data into in your kube manifests.
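Roughly, the consuming side looks like this (a sketch; "rook-ceph-block" is the RBD-backed StorageClass that Rook's example manifests create, and the claim name is made up):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data                        # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: rook-ceph-block    # omit this if you've made it the default StorageClass
  resources:
    requests:
      storage: 10Gi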
> Does Rook give you the equivalent of EBS root volumes for your nodes then?
I think we are saying the same thing, if I'm not mistaken?
It'll just essentially be empty, which means you have to add a step of populating root with something useful. You'd have to either use an initContainer to copy in a filesystem at runtime, have Ceph give you a pre-populated directory (probably via thin provisioning), or do something out of band (I've done this in k8s).
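The initContainer variant would look something like this (purely a sketch; the image names and paths are made up):

apiVersion: v1
kind: Pod
metadata:
  name: prepopulated-example            # hypothetical
spec:
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-data                # the Ceph-backed PVC
  initContainers:
  - name: populate-root
    image: my-rootfs-image              # hypothetical image carrying the files you want
    command: ["sh", "-c", "cp -a /seed/. /data/"]
    volumeMounts:
    - name: data
      mountPath: /data
  containers:
  - name: app
    image: my-app                       # hypothetical
    volumeMounts:
    - name: data
      mountPath: /var/lib/app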
This is usually more effort than it's worth though, as container runtimes populate the root for you with whatever you want anyway. Plus, if you start treating containers as blobs like VMs, you'll end up not knowing where your important variable data is, which leads to people forgetting to back it up and test it.
The main problem you have here is chicken and egg: how do you use a StorageClass, Kubernetes PVC, or Rook to provision block storage for etcd, when you need etcd for Kubernetes to function, and you need Kubernetes to function for Rook et al.?
At some point, you need to bootstrap the world, which is why people either start off with cloud APIs, Ansible, or PXE.
I totally agree. The EBS lifecycle management is generally handled by something like Terraform. That's why I was wondering if the use case for Rook is primarily bare-metal Kubernetes, since AWS/GCP et al. already provide these. So I'm wondering: even in a bare-metal environment, where you still need config management tools like Ansible/Terraform to do things like provision block storage, what's the upside of Rook over existing iSCSI/Ceph/MinIO installations?
Right, so is it fair to say that the use case for Rook is if you are running Kubernetes on bare metal? For instance, if I'm running a Kubernetes cluster on AWS, then AWS already provides PVCs via EBS and S3 volumes. Or am I overlooking a use case where you would run Rook on a cluster running on AWS/GCP?
Rook runs a ceph cluster inside your kubernetes cluster to provide the storage. The downside is this consumes cpu/memory (and obviously storage) resources to run, whereas on AWS, EBS is integrated into the platform so it does not "run" inside your cluster (other than the aws cloud-provider that provides the integration).
If you wanted to run the same storage backend on bare metal and AWS, rook would enable that.
So it's fairly cheap and easy to learn on a "real" cluster without having to build one.
The Agile movement has convinced many that the solution that gets you up and running today is the best. A good engineer is not afraid of working and learning instead of just buying a pre-baked implementation and calling themselves an experienced user. The cluster that I know like the back of my hand is my “real” cluster. The production cluster is someone else’s that I just rent.
Yet to be honest I wouldn't recommend going the real-hardware route, because I don't think it justifies the time and money investment when you can learn the lessons it teaches you on the job. One thing I realised is that it doesn't really help you pass certs or look good in interviews, aside from your ability to say "I made a k8s cluster out of Raspberry Pis". I might endorse it from a strict fun standpoint.
But if you're having fun, of course, more power to you :)
I'm a full-stack web dev primarily using node/react/postgres. I've also got some projects currently hosted on a Linode instance. Ideas on fun/productive uses for this cluster after I've built it and messed around with Kubernetes?
Consider the fact that, if you make improvements to the cluster, all of your apps will see that same lift. So if you were to set up backups on the cluster's persistent volumes and its databases, you'd get free backups for all of the projects you've moved. Same with monitoring, autoscaling, and so on.
Suddenly I wasn't using basic Kubernetes any more, I was setting up a new ingress controller so Google wouldn't launch a new load balancer instance for every public service I exposed. Those things aren't cheap.
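The usual fix, for what it's worth, is one ingress controller behind a single load balancer, with everything else exposed through plain Ingress objects (sketch; hostnames and service names are made up, and older clusters use networking.k8s.io/v1beta1 or extensions/v1beta1):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: blog                    # hypothetical
spec:
  rules:
  - host: blog.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: blog          # hypothetical Service, reached through the shared controller
            port:
              number: 80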
It was an amazing way to experience just how much you can suffer in the cloud, and just how far you can go down the rabbit hole with this kind of tech.
Be warned, though, in my experience etcd requires a somewhat decent read/write latency, or else it's going to fail, and when etcd fails everything fails. Your changes don't apply, etc.
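If you want a rough idea of whether your disks are up to it, etcdctl has a built-in benchmark (assuming etcdctl v3, pointed at your cluster's endpoints and certs):

ETCDCTL_API=3 etcdctl check perf    # runs a short load test and reports pass/fail against etcd's expectations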
Setting resource requests also enables the use of HorizontalPodAutoscalers.
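E.g., a sketch assuming a reasonably recent cluster with metrics-server installed; names and numbers are made up:

# in the Deployment's pod template:
resources:
  requests:
    cpu: 100m
    memory: 128Mi

# then the autoscaler can target CPU utilisation relative to that request:
apiVersion: autoscaling/v2    # autoscaling/v2beta2 on older clusters
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80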
When does one need to go from just Docker containers to container orchestration like k8s?
Yes, Docker has Swarm Mode. And yes, it is easier to get started with than K8s.
The problem is that it locks you into a lot of choices when you first spin up each (Docker) Service, that can't be changed without taking down the Service and then recreating it with the new option. Want to change how upgrades are rolled out? Too bad!
Kubernetes still has a few settings like that, but since (K8s) Services are just responsible for routing traffic to separately-managed Pods, you can always fall back to rolling the upgrade manually, or doing a blue-green deployment.
And for anything exposed outside of the swarm you'd need to use some reverse proxy, since only one service (in the whole swarm) can bind a port at any one time.
Reload the nginx config with the proxy_pass switched from blue -> green or green -> blue.
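Something like this (hypothetical upstream names and ports):

upstream blue  { server app-blue:8080; }
upstream green { server app-green:8080; }

server {
    listen 80;
    location / {
        proxy_pass http://blue;    # flip to http://green, then: nginx -s reload
    }
}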