More interesting to me is the line in the first graph labeled "no isolation", which has lower memory still. I would expect operators to be trusted, so I don't mind the security implications of running them without isolation, if that's what that really means. Is that a viable way to reduce resource use even further?
> Developers want to use Kubernetes in the edge, but it uses too much memory for most devices
...wow, I had absolutely no idea that this was a thing people wanted or a problem folks were out there trying to solve. No judgement, who am I to say with so little information whether it’s a good idea or not, just — what a concept eh
For clarification, I'm talking about devices such as home automation gateways, planes, and production lines.
These devices often have bespoke toolchains for deploying and managing software on them. This is an additional barrier to entry: cloud teams need to learn a new toolchain to move to the edge. Moreover, these toolchains often require changes to how release management works. So naturally, the question arises: "why can't we just use the same tools that we use in the cloud?"
Right now, Kubernetes isn't the greatest fit for the edge. Distributions such as K3s and KubeEdge try to solve some of the problems with Kubernetes itself, but that's only half of the picture. A large advantage of Kubernetes is the ecosystem of third-party operators. So that's what we're trying to address.
What does the edge mean? I've always understood it as: you are most likely using cloud hosting of some sort, and edge means "an instance of some code running in the geolocation closest to your customers"?
I can't think of many projects I've worked on where I've come across that need for performance. How do you handle connecting to a database or cache? Do you have a bunch of databases sharded at the edge? Does the edge talk back to your database that isn't at the edge?
Buzzwords: Kubernetes, WebAssembly, edge
Not saying that with any negative connotation. Interested in learning the typical everyday problems that require this. I thought edge was for, like... getting static HTML/CSS/JS/images/video to customers as quickly as possible (sub-50ms)
Chick-fil-A and Target run k8s in each store. CFA's experience doing so had them providing lessons-learned feedback to the DoD, so the DoD could replicate it for on-aircraft software deployments.
For this project, I'm specifically referring to low-power devices sitting inside of the user's premises. For example, home automation gateways, factory machines, drones etc.
I'm not really talking about CDNs or offerings such as Fastly Compute@edge.
Right now, most of these devices have bespoke solutions for deploying and updating software. Switching those devices to Kubernetes allows developers to use the same tools and processes from the cloud.
It means varying levels of "not a central data center". For anyone wanting to do edge, you first need to understand what it means to them. Sometimes the edge is distributed sites that still have a rack of hardware, say in factories or distribution centers. Sometimes it is one or two computers at all of your stores, but still "general purpose" hardware. And sometimes it is purpose-built devices, like point-of-sale systems or a radio on every train. Etc.
Each form factor has different considerations, and each application different design patterns. For example, at a retailer I worked at, each store had a message broker, and orders would be sent through that broker up to the central data center. If/when the store was disconnected, it could continue to queue up orders until the connection was restored. As opposed to housing everything in the data center and needing to fall back to a paper process if the internet goes down.
I would define it as “the closest place to the end user that you can deploy software”. In the case of a web app, yes, that may be a CDN point of presence. In the case of Walmart, there are servers in the store running business apps. On a farm there might be something running on a fancy tractor.
* much lower latency for most operations
* graceful degradation when the network or cloud is down. You don't want the checkouts to stop working just because the cloud is down.
If literally every action requires a round trip to the cloud, then edge computing isn't for you. That's not the case for a lot of functionality, however.
I was confused by that too. I've seen "edge" mean certain things in the web space (Cloudflare workers, Fastly compute, etc.), but in this article it apparently means "IoT" (Internet of Things, tiny devices with internet connections).
We are currently developing a Kubernetes operator in Kotlin and exploring methods to reduce its memory footprint. GraalVM Native Image gives us a reduction from 400 MB to less than 200 MB. But next to your numbers it still seems bloated.
We were more familiar with Kotlin than Go. Also, the Go Operator Framework looked more or less like a code generator. With the Java Operator SDK, many of these tasks are integrated into the framework. In the end, every operator is just making and receiving REST calls against the Kubernetes API.
I don't understand how using WebAssembly can reduce memory consumption, since container runtimes don't add any. Running your app natively or in Docker should not change the memory usage.
I think in this scenario "edge" is a general term for a localised server. I imagine the problem people are trying to solve is just wanting to manage infrastructure in modern GitOps/Infrastructure-as-code ways.
Ultimately people want to deploy all their tech infrastructure in such a way that they can address the "treat X as cattle, not pets" problem. No one wants to be SSH-ing into boxes to update container image SHAs. Kubernetes is one way to do this.
You get a lot of stuff for free with k8s that you'd need to reinvent the wheel for otherwise.
Someone like Chick-fil-A has tens of thousands of cash registers that are really just little Linux computers. They want to make sure all of those registers are running the most up-to-date cash register code, have access to internal Chick-fil-A data (inventory, finances, menu, etc.), and do it all in a fault-tolerant way, so a store losing internet access will eventually recover when it gets back online.
Kubernetes is perfect for this kind of scenario--Chick-fil-A HQ runs a k8s control plane that all of the stores and registers connect to as nodes, receiving explicit code and other state to run. From a central command they can instantly update everything, monitor it, add/remove nodes, etc. They can do it all with just k8s and kubectl; they aren't bodging together piles of shell scripts, ansible scripts, custom tools, etc.
Kubernetes is not perfect for this scenario at all? Kubernetes is meant for clusters of interchangeable computers, to make sure a networked app stays available when losing some nodes. It provides unified networking (all pods in the same network, even if on different nodes), service discovery, scaling, and zero-downtime upgrades.
None of that works or even makes sense if you have end-user devices. Nodes are not interchangeable, you can't route around breakage (if one point of sale goes down, that's one physical device with a dark screen, adding one pod somewhere else won't help), the networking is mostly useless, zero-downtime impossible (you have only one physical screen to use, can't have two containers grab it during a rolling upgrade), service discovery is irrelevant unless you are running your database on those cash registers for some reason.
Use Ansible/Salt/... instead. Kubernetes is hype, and it's what people know, and it can technically do some of those things, but it's a terrible tool for this job.
lol you're right, I do. Sorry, was just waking up when I wrote that!
What I'm trying to say is:
> Nodes are not interchangeable
They don't have to be. You just need to make sure that certain pods run on all nodes that match certain criteria. That can be easily achieved with DaemonSets, taints and tolerations.
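As a concrete sketch of what I mean (the names, labels, and image below are all invented for illustration):

```yaml
# Run the register software on every node labeled as a POS device, and only there.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pos-agent
spec:
  selector:
    matchLabels:
      app: pos-agent
  template:
    metadata:
      labels:
        app: pos-agent
    spec:
      nodeSelector:
        device-class: pos          # schedule only onto nodes carrying this label
      tolerations:
      - key: device-class          # tolerate the taint that keeps other pods off POS nodes
        operator: Equal
        value: pos
        effect: NoSchedule
      containers:
      - name: register
        image: registry.example.com/register:1.2.3
```

Taint the POS nodes with `kubectl taint nodes <node> device-class=pos:NoSchedule` and nothing else lands on them, while the DaemonSet keeps exactly one copy of the pod on every matching node.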
> you can't route around breakage (if one point of sale goes down, that's one physical device with a dark screen,
Sure, that's the same regardless of how you deploy software to the POS.
> adding one pod somewhere else won't help)
DaemonSets won't do that, so it's not a problem.
> the networking is mostly useless, zero-downtime impossible (you have only one physical screen to use, can't have two containers grab it during a rolling upgrade)
I believe your argument here is that k8s provides more features than what you need. That can be said about anything, all the way down to the hardware. Surely there are a lot of things the CPU on the POS can do that are mostly useless in this context.
> service discovery is irrelevant unless you are running your database on those cash registers for some reason.
As I said, it "can technically do some of those things, but it's a terrible tool for this job".
You can also use Kubernetes as a key-value database, wrapping values in ConfigMaps. You can do that, it will work. It is a terrible tool for that job too.
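To make it concrete, this is what that abuse looks like (a made-up example; the name and values are invented):

```yaml
# (Ab)using a ConfigMap as a key-value store.
apiVersion: v1
kind: ConfigMap
metadata:
  name: session-cache
data:
  user-1234: "last_seen=2021-06-01T12:00:00Z"
  user-5678: "last_seen=2021-06-01T12:03:41Z"
```

Every read and write goes through the API server and etcd, and a ConfigMap is capped at 1 MiB, so it "works" with none of the properties you'd actually want from a KV store.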
It's definitely not meant to be used as a key-value database and I agree it's a terrible tool for that - although that capability can be useful in certain situations.
That said, it actually is meant to deploy software onto computers, and it's not bad at all at that. I see nothing in your arguments that shows why it's a terrible tool for this job. This is mostly personal opinion anyway. You say using Ansible/Salt/etc. is better for this job. It might be, depending on many factors, but it's not great either. I imagine Kubernetes is solving real problems for those who are using it in this kind of context.
Note that I'm not defending it, I just don't see why it would be so bad.