A single 3600X will grossly outperform this cluster (and cost less) with fewer headaches (you don't have N physical machines): use KVM to deploy a few virtual machines and Kubernetes to orchestrate and allocate within those VMs. You'll also have a lot less latency between nodes running in VMs on the same physical host.
Another thing that unfortunately sucks about Raspberry Pis (less with Pi 4, but still mostly applies) is really shitty I/O performance...
I spent a large amount of time over the past summer and fall trying out various ideas to have a "cluster" at home that was both practical and useful. While the Pis were nice, they never really amounted to much more than a demo. Latency and I/O become real problems for a lot of useful interconnected services and applications.
Honestly, if Ryzen 3000 hadn't come out, I'd still think Pis were a solid choice for cheaper cluster builds (~$300-400), but... Ryzen 3000 is just so fucking fast with a lot of cores, it's truly hard to beat.
Addendum: to touch on used servers, yes your power bill will go way up, no joke, but for some applications like large storage arrays it's hands down the cheapest/easiest route. Search by case, not by processor; it sounds weird, but the case is likely the most valuable part of the old server, whether that's 20+ SAS2 drive bays for $500 or PCI-E slots that GPUs can fit into.
Server hardware vendors have traditionally not given two shits about their servers being north of 90 decibels, and I'm pretty sure I've witnessed a few that were pushing 100.
That Raspberry Pi is probably going to absorb more noise than it makes.
Isn't the whole point of using these layers of abstraction over hardware that you pay someone else to manage it?
The only times I can imagine needing hardware on my desk are when latency is super important or when I constantly need to manipulate the hardware (swap parts, play around in the BIOS, etc.). In either case, I would not use k8s to run my software.
Information radiators make no sense at all if you view them from a purely objective standpoint. Wouldn't it be faster to just open the web page on your computer? They only make sense because of the way humans interact with the world (and I suspect in particular, the doorway effect).
The last Pi clustering demo I saw, the guy fiddled with the blink rate and color of an LED on each board. Sounds boring, but as a demo it summed up a whole lot of crap into a simple visual.
That person is still needed in the future, and that person has to start learning somewhere. A good start is messing around with a few Raspberry Pis.
I have had the same setup for over 10 years: a PC in the garage running Ubuntu with lots of RAM. I just upgrade it every so often with a last-gen CPU and motherboard and swap in the RAID controller.
I script all of my dev environments; if I need a K3S cluster it's literally 2 minutes with Bash, Python and some Ansible (see the sketch below).
Want a set of RHEL machines? No problem, it's another script and 30 seconds.
For training you can't beat it. If I have to get out the "big guns" I have an old Dell tower server with 128GB of memory that I got cheap and has 12 cores / 24 threads. Sure it's a power hog, but I use it surgically.
Pis are great and you get a nice, limited blast radius, but the SD cards go wrong, single-threaded performance is slow, and they don't really have enough memory (the 4GB Pi 4 being an exception). You can't beat x64 for compatibility either.
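Roughly, the K3s part of such a script can be as small as this sketch. It assumes a few freshly booted VMs reachable over SSH with passwordless sudo; the hostnames are placeholders, and get.k3s.io is the standard K3s installer:

```python
#!/usr/bin/env python3
# Rough sketch of a "cluster in 2 minutes" provisioning script.
# Assumes: VMs already exist, SSH keys and passwordless sudo are in place,
# and the hostnames below are placeholders for your own environment.
import subprocess

SERVER = "k3s-server.lan"                        # control-plane VM (made-up name)
AGENTS = ["k3s-agent-1.lan", "k3s-agent-2.lan"]  # worker VMs (made-up names)

def ssh(host: str, cmd: str) -> None:
    """Run a command on a remote host and fail loudly if it breaks."""
    subprocess.run(["ssh", host, cmd], check=True)

# 1. Install K3s on the server node (official install script from get.k3s.io).
ssh(SERVER, "curl -sfL https://get.k3s.io | sh -")

# 2. Grab the join token the server generated.
token = subprocess.run(
    ["ssh", SERVER, "sudo cat /var/lib/rancher/k3s/server/node-token"],
    check=True, capture_output=True, text=True,
).stdout.strip()

# 3. Join each agent to the cluster.
for agent in AGENTS:
    ssh(agent, f"curl -sfL https://get.k3s.io | "
               f"K3S_URL=https://{SERVER}:6443 K3S_TOKEN={token} sh -")
```

Swap the SSH loop for an Ansible playbook and it's the same idea, just more declarative.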
Also, gotta factor in electricity costs.
I think it's far more realistic to pick a hypervisor (KVM, raise the roof), install and configure it, run N virtual machines on a single box, provision and slice the hardware, set up and manage the network between them, put the disks in a RAID so the VMs don't starve for I/O and the data is replicated, and then run Kubernetes (or whatever personal hell you want) inside your "virtual cluster." That's the whole "running a cluster" part that all of these Pi Kubernetes projects miss, because they're running on consumer-grade hardware.
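To make that concrete, here's a rough sketch of carving one box into a few KVM nodes with virt-install. The image path, bridge name, and sizes are placeholders, not recommendations:

```python
#!/usr/bin/env python3
# Sketch: carve one physical box into a few KVM "nodes" with virt-install.
# The base image, bridge name, and sizes are placeholders for your setup.
import shutil
import subprocess

BASE_IMAGE = "/var/lib/libvirt/images/ubuntu-base.qcow2"  # hypothetical pre-built image

for i in range(1, 4):  # three virtual "nodes"
    name = f"k8s-node-{i}"
    disk = f"/var/lib/libvirt/images/{name}.qcow2"
    shutil.copy(BASE_IMAGE, disk)  # give each node its own copy of the disk
    subprocess.run([
        "virt-install",
        "--name", name,
        "--memory", "4096",         # MiB
        "--vcpus", "2",
        "--disk", f"path={disk}",
        "--import",                 # boot the existing image, no installer
        "--os-variant", "ubuntu20.04",
        "--network", "bridge=br0",  # hypothetical bridge shared by the nodes
        "--graphics", "none",
        "--noautoconsole",
    ], check=True)
```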
Your hosting provider is surely not using thousands of small under-powered machines running bare metal installs of your applications.
Want to find out how resilient your cluster is? Just fucking kill one of the VMs. Boom. Instant feedback. Now try that with a Pi: you'll have to unplug it and then wait five minutes for it to boot back up. It becomes tedious fast.
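On libvirt, that kill-a-VM test is about this much code. A rough sketch, assuming your node VMs have "node" in their names:

```python
#!/usr/bin/env python3
# Sketch: hard-kill a random node VM to see how the cluster copes.
# Assumes libvirt/virsh and VM names containing "node" (adjust to taste).
import random
import subprocess

running = subprocess.run(
    ["virsh", "list", "--name", "--state-running"],
    check=True, capture_output=True, text=True,
).stdout.split()

victims = [vm for vm in running if "node" in vm]
victim = random.choice(victims)

# "destroy" is libvirt-speak for yanking the power cord, not deleting the VM.
subprocess.run(["virsh", "destroy", victim], check=True)
print(f"Killed {victim}; now watch what the cluster does about it.")
```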
Clustering and distributed systems are extremely complex and difficult... just having physical machines doesn't even begin to scratch the surface of the problems you'll face. Just my two cents.
Some deeper integration between Kubernetes and the hardware (acceleration/offload ASICs maybe), branding of k8s + this hardware as a unified product, and this would literally just be a mainframe. Which is not a terrible idea! Maybe Kubernetes is the mainframe operating system of the future.
For a long time I have been watching the ebb and flow between peer to peer and client server and it’s gotten quite a bit fuzzy lately. I suppose if you treat cloud providers as a large amorphous server, it sort of still fits the mold.
It's a neat form factor but you could just buy some regular Raspberry Pis and an Ethernet switch.
$189 is a little expensive for what you get.
The main idea behind Turing Pi is to deliver compute to the edge. If we look at cases where some compute has to run low-latency, highly available, internet-independent apps to automate processes, often in hard-to-reach places, then classic servers are not a solution. Turing Pi is an early version of an edge computer with a cloud-native architecture. Why is that important? Because if you are a business with services running in the cloud and you want your edge computing to coexist organically with your cloud stack, then edge clusters could be a great choice. The speed at which you can innovate and deploy code into production in both cloud and edge environments could be a critical component.
The existing Turing Pi model is oriented more at forward-thinking developers who want to learn and push cloud-native to the edge. Why Raspberry Pi computers? They are not the most powerful computers, but they can definitely lower the entry point for developers by offering a huge and well-documented software ecosystem.
> The nodes interconnected with the onboard 1 Gbps switch. However, each node is limited with 100 Mbps USB speed.
Not only that, but the Compute Module 3+ is limited to 1GB of RAM; is it really expected that someone could run a realistic workload? How stable is the control plane node with such limited resources?
It seems like picking up 3 RasPi 4s (4GB RAM each) and powering them via PoE would be a much better result.
edit-before-actually-posting (sorry I'm a bad person for typing a reply before clicking the link):
Wait, they're selling these things? Ok, then I'm stumped. What would you do with 8 RPis that you couldn't do with one?
Any idea how the specs compare?
These are interesting ideas, but until the compute modules start having more on board RAM, you'd be much better off working with a few RPi 4's. They will be faster (1Gbps ethernet, more RAM), and cheaper since you only need a gigabit switch to connect them together, not a custom carrier board.
If the compute modules start to get more powerful, then having to deal with only one power supply and ethernet uplink would be nice. It's a very appealing idea. But, you're going to almost always have a better experience with a bunch of standard RPis.
I 3D-printed trays for rackmount fiber cassette blanks and set it all in a network rack.
It's also pretty cool being able to turn the Pis on or off by cycling the switch port from your network control plane (UniFi in my case).
You could even do it dynamically via the API for a "bare-metal autoscaling" type workflow.
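For the curious, that API call can look something like the sketch below. The UniFi controller API is unofficial and changes between versions, so the login endpoint, device path, and payload shape here are assumptions modeled on common community tooling rather than documented behavior:

```python
#!/usr/bin/env python3
# Sketch: toggle PoE on a switch port via the (unofficial) UniFi controller API.
# Endpoint paths and payload are assumptions; they vary by controller version.
import requests

CONTROLLER = "https://unifi.local:8443"  # hypothetical controller address
SITE = "default"
DEVICE_ID = "abc123"                     # _id of the switch in the controller
PORT_IDX = 7                             # port the Pi is plugged into

s = requests.Session()
s.verify = False  # most home controllers run a self-signed cert

# Log in to the controller.
s.post(f"{CONTROLLER}/api/login",
       json={"username": "admin", "password": "changeme"})

# Turn PoE off on one port ("auto" turns it back on).
s.put(f"{CONTROLLER}/api/s/{SITE}/rest/device/{DEVICE_ID}",
      json={"port_overrides": [{"port_idx": PORT_IDX, "poe_mode": "off"}]})
```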
You can get one large 5V PSU and go directly to the pins on the Pi. No need to bother with individual USB adaptors...
This looks fairly similar.
And can we all just pause for a moment and look at that heat sink on the ethernet controller? Holy cats, what's goin' on there?
Personally, for my multi-node test clusters, I just run VMs on cheap x86 hardware.
But there are reasons to run containers on VMs in production too. Hypervisors and container orchestration tools solve very different problems. Depending on what problems you are trying to solve, it might be useful to leverage both.
The default pod limit per node is 110. This is by design and considered a reasonable upper limit for the kubelet to reliably monitor and manage everything without falling over into a NotReady/PLEG-unhealthy state.
If your node has a ton of cpu and memory, then 110 pods will not come close to utilizing all the metal. You can go down the path of tuning and increasing the pod limit, but this is risky and often triggers other issues on components that are designed for more sane defaults.
It also means that if your node goes NotReady (not a hardware failure), it's now a much bigger deal because you have fewer total nodes and many more pods to re-schedule at once.
This is solved by splitting up these massive nodes into smaller nodes via virtualization.
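If you want to see where your nodes sit relative to that default, here's a quick sketch using the official kubernetes Python client (110 is the stock kubelet maxPods; your cluster may have tuned it):

```python
#!/usr/bin/env python3
# Sketch: count pods per node and compare against the default 110 maxPods.
from collections import Counter

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

DEFAULT_MAX_PODS = 110  # kubelet's stock limit; clusters may override it

counts = Counter(
    pod.spec.node_name
    for pod in v1.list_pod_for_all_namespaces().items
    if pod.spec.node_name  # skip pods that aren't scheduled yet
)

for node, n in counts.most_common():
    print(f"{node}: {n}/{DEFAULT_MAX_PODS} pods "
          f"({n / DEFAULT_MAX_PODS:.0%} of the default limit)")
```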
It's also nice having an api-driven layer to manage and upgrade the vms versus shoehorning a bare-metal solution. I would argue it also encourages immutable infrastructure by making it much more accessible.
There are bare-metal solutions but it's often more complicated and slower than launching/destroying vms.
The main reason being on-premise deployments. Our workload is primarily machine learning models; they are well suited to Docker containers due to good support and being able to quickly build images, deploy, and tear down.
Additionally, whilst the task would be well suited to a Kubernetes deployment, which is ideal in the cloud, on prem we're likely running on a single box with multiple VMs. Hence we use containers within the VMs.
It isn't the perfect solution and definitely has its downsides, but it is good enough and better than the alternatives when everything is considered.
You could write your application multiple times, and publish multiple images for the same application across different infrastructures.
Or, just build your application once to run in containers, and run those containers on k8s everywhere.
It's basically just NUC machines in a rack mount.
It's a lot more powerful than an RPi, quite power-efficient (compared to an actual server rack) and dead silent.
More importantly, this setup can handle some actual workloads.