UKL: A Unikernel Based on Linux (redhat.com)
156 points by perbu 6 months ago | 37 comments

This is great progress towards making unikernels non-runtime specific.

However, I am still skeptical of the idea that unikernels will ever be production-friendly. The biggest deficiency, by the definition of a unikernel, is the complete lack of debugging tools available. There is no top, perf, etc. in a unikernel. In Docker, I can still exec into a running container to debug and investigate. Unikernel issues lend themselves to a “just restart it” rather than a “let’s debug and fix it” mindset.

I’ve run a k8s environment with Kata Containers for a while, and as time has gone on, I’ve found that Solaris Zones (SmartOS) are probably the most scalable way to achieve process/VM isolation for applications.

You would debug a unikernel just like you can debug the OS kernel. Either it brings its own internal "remote debugging API", or the hypervisor helps you with that and provides interactive tracing/debugging features and dumps of CPU and memory state that you load into a local debugger.

Kernel debugging technology has existed for a long time now, and I don't think it's far-fetched to see public clouds exposing those abilities in a secure manner, and unikernels including their own tools for remote debugging. Any kernel can be debugged, and unikernels are no exception.

Then the question becomes, should all application developers become kernel developers/debuggers? Wouldn't that impose a significant barrier to adoption?

Probably (hopefully) no more than having a standard library forces you to be a "standard library debugger". That is to say, a little bit, but you can easily ignore the parts you don't care about.

htop and such don't make sense inside a single-process system; logging usually gets sent to remote syslog/ELK/etc. many APM solutions work out of the box

one of the biggest misconceptions about unikernels is that they are somehow a slimmed-down Linux - it's easier to think about them as individual programs provisioned as VMs - would you ssh into a process? why?

all 'debugging' tools work perfectly fine w/unikernels although I'd draw a fine line between real application level debugging which should never happen on production and ops tooling

> I'd draw a fine line between real application level debugging which should never happen on production and ops tooling

Sorry, but that's just plain wrong. One very common reason people do that is that performance debugging may give very different results on dev, staging, and production systems. That's why people want tooling that gives them application-level debugging information that is as accurate as possible without perturbing the behavior of the system. It's a very common problem in large-scale systems.

I think that can be solved by good tracing and logging facilities. Both of which are already available in Linux (eBPF, ftrace, tracepoints, perf events...). You may have to customize them or polish them further though and provide more user friendly interfaces/frontends for them.
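
As an illustration of what a more user-friendly frontend over such facilities could look like at the application level, here is a minimal sketch of a per-subsystem trace gate. All names here (the `TRACE` variable, `trace_enabled`, `trace`) are hypothetical, not a real eBPF/ftrace interface:

```rust
use std::env;

// Hypothetical sketch: trace output is gated per subsystem by a
// comma-separated TRACE environment variable (e.g. TRACE=net,disk),
// so tracing can be enabled selectively instead of logging everything.

/// True if `subsys` appears in the comma-separated selector `sel`.
fn trace_enabled(sel: &str, subsys: &str) -> bool {
    sel.split(',').any(|s| s == subsys)
}

fn trace(subsys: &str, msg: &str) {
    let sel = env::var("TRACE").unwrap_or_default();
    if trace_enabled(&sel, subsys) {
        eprintln!("[{subsys}] {msg}");
    }
}

fn main() {
    trace("net", "packet received, len=128"); // emitted only when TRACE contains "net"
    trace("disk", "block flushed, lba=42");
}
```

Real kernel-side tracing (eBPF, ftrace) works very differently under the hood; this only sketches the selective on/off interface a unikernel could expose.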

Though the amount of tracing/logging to equate to debugging is so much that you will want to turn it on selectively.

And you will want a mechanism to iterate quickly to see effects of tweaks.

Eventually you have built an interactive debugger.

Maybe. But KGDB and friends do exist in Linux too. And (AFAIK) not everyone likes using those; some just like throwing a few printks here and there. That is not universal, obviously :) My point is that yes, unikernels may not be as easy to debug at this point as regular usermode code, but it is something that can be worked around, and I think if people begin to use them en masse, frameworks and solutions will emerge.

Yes. It is a deficiency/roadblock to adoption.

One that could improve.

> would you ssh into a process? why?

Oh god, I wish. It's more common than you probably think.

You have a lot of options approximating this with thicker runtimes like Java/JVM and Erlang/BEAM, or with interpreted, REPL'd languages like Python or Lisps. And they're incredibly useful.

Pulling up a REPL into a running process like a JS console in the browser is basically the same thing.

Yes, I agree. We have to refactor the way we debug and realize it’s debugging a single process, and that process is the application.

Sounds like adding the concept of debug builds and opt-in instrumentation of release builds for unikernels would mostly address your concerns. I think it’s fair to say they’re not ready yet, but “never” seems unlikely to me.

Yes, how do you know EXACTLY what code/service/diagnosis/communication library you'll need in the future?

Answer: you don't. You don't know what corner you're painting yourself into.

The Linux kernel, outside of its excessive amount of drivers, is full of code that is useful for computers, VMs, and containers in lots of different situations: normal operation, compromised, at load, in distress, etc.

I doubt you can even make a list of the utilities you'd need ahead of time for getting stats on the container's state. network? disk? processes? memory?

Linux has obviously scaled from very constrained computers (386/486s with a couple megs of RAM were the PC state of the art when Linux was initially developed) up to the current supercomputers and large VMs on AWS and other vendors.

A lot of the size bloat in Linux for container images is the drivers. For containers and VMs, one really doesn't need all the driver variants, because a VM/container should just be presented a limited virtual hardware interface. Then you could greatly reduce the driver portion of the monolithic kernel.

Once you get rid of that, Linux should probably concern itself with a couple of "power of 1000" kernels.

Kilohertz/kilobytes of RAM (maybe not even bother with this), aka the 80s computer. Since this is an 8- or 16-bit computer, Linux may not be practically back-scalable to this mode of computing, but I don't know Linux history well enough.

Megahertz/megabytes of RAM, aka the 90s computer, aka the 32-bit era.

Gigahertz/gigabytes+ (2000s to modern), aka the 64-bit era.

Terascale is basically served by 64-bit kernels, AFAIK.

Your typical container will basically fit into one of these profiles I would guess. But the basic linux/unix model should work for every one of them... because it has, since the 70s, on machines from the PDP-7 on up.

So IMO, container focused derivatives of Linux should concentrate work on tailoring to these levels.

Container applications should basically be targeted at one of these levels.

Granted, maybe the 1000 factor jump is a bit big. Economically there is a big difference between 1, 10, 100 MB and gigabyte memory spaces in particular, and what you can cram into a machine or pod. But the KB image should be able to overlap with the low end MB (although that takes ugly segmentation pointers and other memory extension hacks). The MB image can DEFINITELY encroach on the low end GB image.

Recent paper on being able to debug unikernels: https://dl.acm.org/citation.cfm?id=3267845

And a blog writeup on it: https://blog.acolyer.org/2018/11/14/unikernels-as-processes/

disclaimer: I'm interested in nabla containers and have an open PR on the runtime.

Link to GitHub repo:

The article itself should have linked to it. It's a bad look to announce a project and only link to a bunch of other "competing" projects. Makes it looks like vaporware, but apparently there is code out there.

Why post a link which is not clickable?

Cannot see any repo. Just his email.

How does this differ from LKL (the Linux kernel library : https://github.com/lkl/linux )?

A simple scan of the repo's description and the first sentence of the linked article should answer that question. LKL is a library that allows you to do Linux things (read/write Linux filesystems) from outside of Linux, essentially re-using the kernel's code. UKL is a unikernel, which essentially allows you to compile an application into a standalone OS that runs it (not sure if that's entirely correct).

Now that I think about it a little more, they're kind of similar, but I think unikernels are more a way to get an application to run on hardware without all the extra bloat of a monolithic kernel (like if you're running a web service on a small virtualized machine), and LKL is just for re-using Linux code.

Seeing as how you can already run normal Linux like a unikernel on N-1 cores, and get all of the debugging and management goodness on core 0, I don't see the point. At all.

the big idea is that most companies are heavily virtualized - if you are using aws/gce you most definitely are and many private datacenters are as well - I'm not going to speak for ukl in particular but in general unikernels are vastly more secure than a normal process and much faster (if you are already virtualized which you are on aws/gce) - there is absolutely no reason to have 2 layers of linux if you are in 'the cloud'

Except for the fact that most production-level code relies on the OS running scripts or some other hack to keep the code running.

I have yet to see a production stack without this.

How is this different from UML?

It's the other direction.

UML runs the Linux kernel in ring 3. It emulates hardware protection mechanisms in order to make debugging easier and to run VMs when there is no kernel access.

UKL runs userspace apps in ring 0. It discards hardware protection mechanisms in order to make processes run faster, decrease boot time, and decrease management overhead.
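
A minimal sketch of what that means in practice (hypothetical names, not UKL's actual API): because the kernel is linked into the same image as the application, a "system call" collapses to an ordinary function call, with no trap instruction and no ring transition:

```rust
// Conceptual sketch only: in a unikernel, the kernel-side implementation
// that a normal process would reach via a syscall trap is just another
// function in the same address space and the same binary.

// Stand-in for a kernel-side write implementation (driver work,
// buffering, etc. would happen here in a real unikernel).
fn kernel_write(_fd: i32, buf: &[u8]) -> usize {
    buf.len()
}

// The libc-style wrapper becomes a plain call instead of
// `syscall(SYS_write, ...)` -- no mode switch, no register save/restore.
fn uk_write(fd: i32, buf: &[u8]) -> usize {
    kernel_write(fd, buf)
}

fn main() {
    let n = uk_write(1, b"hello");
    println!("wrote {n} bytes");
}
```

This is where the faster-boot/lower-overhead claims come from: the per-call trap cost and per-process context-switch machinery simply disappear.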

No one knows what that is

UML used to refer to an entity/relationship+methods modeling language.


Thank goodness we hear almost nothing about UML any more.

Just last week someone on our project was drawing a sequence diagram for an interaction between lambda calls on a cloud provider.

UML is pretty much still in use in the enterprise space.

Why the hate for UML diagrams? They're quite useful.

This is really exciting. Great write-up, and I'll be following with interest.

That seems like it could be an interesting way to launch Rust applications. Typically (and I'm assuming UKL is no different), unikernels don't have any memory protection between applications and the OS. That's usually a bad thing; however, if your compiler can verify that memory accesses are all in-bounds, then all the hardware memory checks that an MMU would do in a regular OS are just unnecessary overhead. So, with UKL and a Rust application, you could get approximately the same level of safety as before, but without the performance cost of all that runtime memory protection. Entering the kernel can be just a regular function call, context switches are cheap, etc... That could be a huge performance boon for some apps.

In this sort of setup, unsafe Rust is like kernel code -- it effectively has privilege to read/write to any memory.
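
A small illustrative sketch of that argument: safe Rust indexing carries a compiler-inserted bounds check, so an out-of-range access is caught in software rather than by an MMU page fault, while `unsafe` opts out of the check entirely:

```rust
// Sketch of the safety argument above: with compiler-enforced bounds
// checks, a bad index panics (or returns None) deterministically instead
// of scribbling over kernel memory -- which is what lets a unikernel
// drop per-process MMU protection without giving up memory safety.

fn checked_read(buf: &[u8], i: usize) -> Option<u8> {
    buf.get(i).copied() // bounds check in software, no MMU needed
}

fn main() {
    let buf = [10u8, 20, 30];
    assert_eq!(checked_read(&buf, 1), Some(20));
    assert_eq!(checked_read(&buf, 9), None); // caught here, not by a page fault

    // `get_unchecked` skips the check; in a single-address-space
    // unikernel a bad index here could touch kernel memory, which is
    // why unsafe blocks are effectively "kernel code".
    let v = unsafe { *buf.get_unchecked(2) };
    assert_eq!(v, 30);
}
```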

I'd also feel better if the kernel itself were written in something like Rust.

Most unikernels are written in type-safe languages; as such, the type system plays the role of an MMU.

all unikernel implementations I'm familiar with are single-process by nature, and thus the protection that you need on a multi-process system such as Linux is completely unnecessary, especially since they are always deployed as VMs

Well, strictly speaking it's safer to have a system where the application can't write into kernel memory space and the kernel can't (accidentally) write into application space.

You don't have the risk of leaking secrets or malicious interference between applications that are supposed to be isolated if you've only got one app, but I could still see where someone might rather have memory protection than not in a single-user/single-application environment.
