Hacker News
LightVM – A new virtualization solution based on Xen (neclab.eu)
215 points by fanf2 on Nov 1, 2017 | 55 comments



Great job and nice to see Romania featuring in the news!

To those who just spent the last two years retraining your teams and retooling your infrastructure explicitly for docker (who may show up in this thread embracing and enhancing with a large marketing budget shortly), do take this opportunity to learn the architectural and management/maintenance value of abstraction. ;)


You do realize that Docker containers have an abstract interface and can be run on all OCI runtimes right?


VMs tend to lose the overlay layered filesystem, which can dramatically reduce disk usage. Having the filesystem reset to a clean state for every new container is a huge feature of containers. And VMs tend to need predefined, dedicated resources for things like memory; a process in a container only allocates memory when it needs it and can free it up for other processes to use. It's not all about startup speed.

That said, VMs have their place, and Docker has the option to switch out backends. It's entirely possible to replace runc with some other tool that starts VMs instead of containers. (That's already happening today with Windows containers.)
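The copy-on-write behavior those overlay layers provide can be sketched as a toy model. This is purely an illustration of the idea (shared read-only lower layers, a per-container writable upper layer, whiteouts for deletions), not Docker's actual overlay2 driver:

```python
# Toy model of an overlay/union filesystem: reads fall through to the
# first layer that has the file, all writes go to the topmost
# (container-private) layer, and resetting a container is just
# dropping that upper layer. Illustrative only.

class OverlayFS:
    def __init__(self, *lower_layers):
        # lower_layers: shared, read-only image layers (dicts of path -> data)
        self.lowers = list(lower_layers)
        self.upper = {}          # per-container writable layer
        self.whiteouts = set()   # deletions recorded in the upper layer

    def read(self, path):
        if path in self.whiteouts:
            raise FileNotFoundError(path)
        if path in self.upper:
            return self.upper[path]
        for layer in reversed(self.lowers):  # topmost lower layer wins
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        self.whiteouts.discard(path)
        self.upper[path] = data  # copy-on-write: lower layers untouched

    def delete(self, path):
        self.upper.pop(path, None)
        self.whiteouts.add(path)

    def reset(self):
        # "Clean state for every new container": drop the upper layer.
        self.upper.clear()
        self.whiteouts.clear()


base = {"/etc/hostname": "image-default"}
fs = OverlayFS(base)
fs.write("/etc/hostname", "container-1")
assert fs.read("/etc/hostname") == "container-1"
fs.reset()
assert fs.read("/etc/hostname") == "image-default"  # base image unchanged
```

The disk-usage win in the real thing is that the lower layers are shared between all containers started from the same image; only the upper layer costs per-container space.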


An example of such an alternative backend is runv, which can apparently be used with Docker (though I couldn't get it to work well; that was probably just my fault).

https://github.com/hyperhq/runv#run-it-with-docker


>Having the filesystem reset to a clean state for every new container is a huge feature of containers.

Could you use file system snapshots for this? Maybe also for the layers?


If it works for docker[1], it probably works for this.

[1]: https://docs.docker.com/engine/userguide/storagedriver/zfs-d...


Xen is very much inspired by exokernels (you could even make the argument that it is an exokernel), so it makes sense that someone would push it more in that direction.

That being said, if you're going to go that way, it's too bad that there isn't more inspiration from the past 20 years of OS design. A capability-based security/object management interface would be nice. I also really like Akaros's VM threads model; IMO that'll be the way we end up running what we currently call unikernels.


> it's too bad that there isn't more inspiration from the past 20 years of OS design. A capability-based security/object management interface would be nice.

Agreed, and seL4 comes to mind. It's capability-based, quite fast, and secure. For that matter, it's also quite small.


Akaros looks pretty interesting- the M:N aspect is certainly where I think we need to go.



If I'm reading this right, that's pretty major. The isolation benefits of a VM with a boot speed faster than Docker?


https://hypercontainer.io is also in this space. Doesn't boot as fast as docker, but faster than a normal VM.



Right, Clear Containers is a feature of both rkt and cri-o container runtimes. rkt has had that feature for a couple years, if I recall correctly.


With the debugging benefits of a brick.


As long as debugging isn't fundamentally incompatible with a unikernel approach, it's just a matter of maturity. And it's pretty silly to criticize a new, prospective technology for immaturity.


> As long as debugging isn't fundamentally incompatible with a unikernel approach, it's just a matter of maturity.

Well no, it's not "fundamentally" incompatible, it just doesn't seem very practical. And practicality matters IMHO.

Maybe I'm missing something but it seems the debug mode of these systems would be very similar to how you debug a regular kernel. I.e. you have some GDB stub provided by your hypervisor. That might be fine for GDB, but what about all the tools to observe a running system? How would I ftrace, perf, strace, netstat, tcpdump, poke around in /proc and /sys etc?

Sure, there is nothing stopping you from developing equivalent tools. But it's not very practical. Building a truly isolated container environment for Linux (something like Zones) would also give you isolation and provisioning speed, but you get to keep all your tooling.


> Well no, it's not "fundamentally" incompatible, it just doesn't seem very practical. And practicality matters IMHO.

I don't think it's fundamentally impractical either, since a unikernel is simply the parts of the OS that you need compiled into the application binary. Maybe this means it is actually running an SSH server, but I think it will look more like a debug service running in the application (if only because unikernels likely wouldn't have processes or files or other concepts that are designed for a multi-tenant world).

> But it's not very practical.

I think this is the question--do the gains justify the costs. And considering the gains include a reduced attack surface, simpler orchestration, reduced resource allocation, etc. I think this is the case in the cloud market, which is only growing.

> Building a truly isolated container environment for Linux (something like Zones) would also give you isolation and provisioning speed, but you get to keep all your tooling.

You don't get to keep all of your tooling with containers anyway--unless you're rolling your own orchestration and container runtime, you're almost certainly using the tools provided by or building off of your container runtime. This is especially true if your service runs on dozens or hundreds of hosts--you're probably not just SSHing onto prod servers to do your debugging with your ordinary tools; you're relying on something that abstracts over those hosts (of course, there are exceptions, but they're just that: exceptions).


Try it with the LING Erlang-on-Xen unikernel and see for yourself if it's not debuggable.

Looks like they've got a solution for debugging: https://twitter.com/erlang_on_xen/status/641628659657371648


Again, I'm not saying it can't be done. If Erlang+Xen=Debug then that's great.

But if we're comparing unikernel-VMs to containers, my guess is that most users are after the "take my $LANG binary and concatenate it with half a Linux kernel"-workflow. And for that general case, most standard tools are off the table.


I haven't been on the server side in a while, but 1) isn't Xen falling out of favor and 2) is docker boot speed a big problem?


Xen is having a tough time keeping up with qemu/KVM which can take advantage of all the development time Intel spends on the kernel and the scheduler.

Obviously Xen will have a long life due to AWS, but I am failing to see anything here on the performance-oriented side that systemd-machined and KVM don't offer.

There is probably some use case, but the Xen time-slice model doesn't seem to know enough about frequency binning or the impact of high-energy/high-heat instructions on those bins to be very competitive in the applications I can think of.

Docker's main problem is an increasing number of use cases, and its engineering decisions not meeting everyone's needs. To be honest, if it weren't for faux container support needs on Windows/macOS, I am betting systemd-nspawn would take over that market because of this.

For the most part, Docker performance is limited by concurrency and Amdahl's law (including filesystem performance). I don't see Xen solving these issues, but there are very real security benefits.

One very real use case I can think of is that any person or process that can launch a container on Docker is effectively given password-less sudo access, particularly because you can't disable the privileged flag. The attack surface on Xen would be much lower in that case but once the exploits start arriving docker could change that choice and reduce their attack surface quite a bit.


Boot speed is a big selling point for Docker, but Docker's isolation story is poorer than a VM. If you can make a VM boot as quickly as Docker while preserving isolation, then Docker loses a big selling point.


It’s called Zones (on illumos, prev. on Solaris)


I'm a huge fan of Illumos and SmartOS and think that Zones + LX Branding are a far superior technology than the hodgepodge of cgroups + namespaces + userland container technology. HOWEVER---the amount of driver support in the OpenSolaris forks, the awful package management system, the ancient IPF system etcetera make it a non-starter for most environments. I have used SmartOS in production (along with the Triton ecosystem) and found it to be well thought out and compelling, albeit woefully immature and buggy.

I think that these reasons, plus the EOL nature of Solaris "upstream", would easily put off most people in charge of making a long term technology commitment. I know it did for me.


Well, you are right, but remember how old SmartOS actually is... I kinda hope more people get it and start investing in that amazing open-source technology. I always liked the BSDs and Solaris more than Linux (which I find really messy and chaotic).


Solaris as an upstream was dead long ago; Oracle stopped publishing source years before killing their own product. The heirs of OpenSolaris seem to have managed well enough.


Imagine you wanted to run ImageMagick in a docker image in order to reduce the attack surface. The container startup time could become a significant fraction of your image processing time.
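To put rough numbers on that (the timings below are illustrative assumptions, not measurements from the paper), the boot overhead for a short-lived "run once, then exit" workload can be computed directly:

```python
# Back-of-the-envelope: what fraction of total wall-clock time is spent
# booting for a run-once ImageMagick-style job? All timings here are
# assumed for illustration, not benchmarked.

def boot_overhead(boot_s, work_s):
    """Fraction of total time consumed by boot/startup."""
    return boot_s / (boot_s + work_s)

work = 0.2  # assumed seconds to process one image

for name, boot in [("traditional VM", 1.0),
                   ("docker run", 0.15),
                   ("millisecond-boot VM", 0.004)]:
    frac = boot_overhead(boot, work)
    print(f"{name:20s} boot={boot * 1000:7.1f} ms  overhead={frac:.0%}")
```

With a one-second VM boot, startup dominates the job; at millisecond-scale boots (the regime LightVM targets), it becomes negligible.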


Maybe faster than docker, but how about Intel's Clear Containers?

https://lwn.net/Articles/644675/


The Tinyx tool mentioned in the paper doesn't seem to be published anywhere.

It doesn't help that the name was already in use (by a minimal X11 server).


> We achieve lightweight VMs by using unikernels

One problem with unikernels is a lack of debugging/tracing tools (like DTrace/eBPF): https://www.joyent.com/blog/unikernels-are-unfit-for-product...


That blog post is pretty crappy.

See discussions here: https://news.ycombinator.com/item?id=10953766


The blog post has some flawed arguments, but still, there are almost no observability/tracing tools for unikernels.


Some do, some don't. But linking to this as the 'super-argument' every time somebody mentions unikernels is pointless and wrong.

It's also simply not relevant for many cases.


Try it with the LING Erlang-on-Xen unikernel and see for yourself if it's not debuggable.

Looks like they've got a solution for debugging: https://twitter.com/erlang_on_xen/status/641628659657371648


How is this different than zerovm?

http://www.zerovm.org/


Maybe that ZeroVM is a NaCl sandbox, while LightVM is a Xen VM?


ZeroVM, thanks for the link. I wonder if this solves glibc-type dependencies across platforms; it seems unclear.


It does not appear to be very similar at all.


Super excited about this. Amazing progress. Kudos to authors.


So, unikernels?


No, not unikernels. They use unikernels and small Linux builds as examples to show their improvement of Xen itself.


> We achieve lightweight VMs by using unikernels for specialized applications

Unikernels appear to be part of the solution...


Yes, but it's more "we made Xen faster so you can use unikernels even better". Unikernels aren't the new thing here.


In case the ACM link isn't available to everyone, here is a copy hosted by NEC: http://cnp.neclab.eu/projects/lightvm/lightvm.pdf


Thanks, we've updated the link from https://dl.acm.org/citation.cfm?id=3132763&CFID=824760366&CF.... I'm not sure I've seen the ACM block traffic like that before.


I believe the CFID in the URL is a user-specific token, and user-specific URLs shouldn't be fetched from all over the world, triggering the block.


It would be nice if the mods could update to this link since it's accessible by anyone.


To save you some time, this is VMs + Unikernels.


Unikernels are mentioned in passing, but are not the cause of the speedup. The main point of the paper is instead the introduction of "LightVM, a complete re-design of the basic Xen control plane optimized to provide lightweight virtualization".


Unikernels were my first thought. Thank you for the summary.


It's not an accurate summary though, see the sibling comment to yours for a better one.


ACM can't handle the Hacker News flood.


I have had temporary bans from the ACM for opening 8 papers in new tabs. ACM can't handle the me.


Sounds like ACM needs to learn how to Internet.



