Where Sony pushes its own special formats like Memory Stick, Ubuntu pushes Upstart, Juju, and now LXD. In the end I don't think this is entirely helpful to the ecosystem as a whole: Ubuntu pushes its own special formats of things while not bringing all that much more to the table. How systemd compares to Upstart might be worth considering alongside my Upstart comparison, but essentially all I see from Ubuntu is pushing their own brand of tooling, and most of it is very Ubuntu-specific.
On the other hand, Red Hat usually releases software that can (and usually does) make it to other distributions and ecosystems. I'm not entirely sure what LXD is going to bring to the table beyond what Docker or similar utilities offer, and like many Ubuntu projects it currently seems to be VERY light on documentation. Actually, where do I even access the documentation for this project? Oversights like this are what killed Juju and MAAS for me, and yet Ubuntu pushes those projects like crazy at every conference I've seen them at (GopherCon, for instance).
Red Hat is a different business model. Their venturing into the cloud is more recent. Historically, they've been more into the support business, and this necessitated having a lot of people fix bugs in the Linux ecosystem. That and their acquisition of Cygnus Solutions means they're the de facto gatekeepers of the Linux kernel and much of userspace.
Canonical is a more Apple-like company. They care about being internally consistent and formulating their own brand, interacting with the outside only where necessary.
So I don't think the comparison with Sony holds any water.
Intel processors have VT-x, which provides hardware support to speed up virtualization, isolate memory, and so on; AMD has something similar (AMD-V). You can break out of a Docker container and get to the host OS and other containers. With a hardware-assisted hypervisor, it is possible to hide container memory from other containers at a level lower than the "host" OS.
If I understand Docker and VT-x correctly, hardware-assisted virtualization can be used to run N instances of a container while only having one instance in memory. VT-x can rewrite memory reads/writes transparently and deny writes to certain locations of memory.
Docker containers share the kernel with the host and depend on it for isolation. This would add the hardware assisted isolation of containers without the overhead of another kernel per container, plus the other benefits of docker.
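Since the thread keeps coming back to VT-x and AMD-V, a quick aside: on Linux you can check whether the host CPU advertises these extensions by scanning the feature flags in /proc/cpuinfo. A minimal sketch (not LXD code, just a generic check):

```python
# Sketch: detect hardware virtualization support on a Linux host by
# scanning CPU feature flags. "vmx" = Intel VT-x, "svm" = AMD-V.
def has_hw_virt(cpuinfo_path="/proc/cpuinfo"):
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("flags"):
                    flags = line.split(":", 1)[1].split()
                    return "vmx" in flags or "svm" in flags
    except FileNotFoundError:
        pass  # no Linux-style /proc filesystem available
    return False

if __name__ == "__main__":
    print("hardware virtualization:", "yes" if has_hw_virt() else "no")
```

Note that the flag only tells you the CPU supports the extensions; the BIOS/firmware can still have them disabled.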
For example, if there is a local kernel privilege escalation / DoS / etc. bug that can be triggered by a non-privileged user (or a root-inside-container user), will those exploits still run inside LXD?
DoS is still a problem, but containers should provide mitigation for that. You can make the VMM prevent DoS, but it's better to keep the VMM small and light.
As for local kernel privilege escalation, yes, it would still run, but it might not matter. In theory, the VMM can isolate all virtual machine resources such that rooting a VM only gives you that VM. I can't figure out how they extend that protection to containers yet since VT-x was made for full virtual machines and containers share a kernel.
Even with Xen or KVM you do have an attack surface:
* guests can send network packets to the host, which interact with the networking code on the host. If that's exploitable, you get to execute code on the host or DoS it. Hopefully not, because then so could any other remote machine.
* guests can execute instructions which get emulated or need extra privilege checks done in the hypervisor. See the recent vulnerability regarding MSRs in Xen.
* guests execute hypercalls which obviously interacts with the hypervisor. Bugs here, if exploitable, can be nasty.
* guests need to read/write their data to disk. Are we sure they can't read the data of another (possibly already deleted) VM?
* guests read/write from memory ... was the memory of previously deleted/crashed/migrated guests properly scrubbed? Can any of the hypercalls/etc. be used to read another guest's memory, or access uninitialized memory containing pieces from old guests?
Of course the attack surface of a hypervisor is smaller than that of a full kernel (where you also have a lot of syscalls, disk formats, etc.), but that doesn't mean hypervisors are suddenly bulletproof.
I think this quote sums it up quite well: http://marc.info/?l=openbsd-misc&m=119318909016582&w=2. Also have a look at the PDF referenced in that thread.
The question is where LXD stands from a security POV between these simplified categories (no order implied):
- running multiple different processes as same user
- running processes in different LXC containers as root-in-container on same host
- running processes in different LXC containers as non-root on same host
- running multiple processes as different users
- running root processes in different KVM VMs on same host
- running non-root processes in different KVM VMs on same host
- running root processes in different Xen/domU VMs on same host
- running non-root processes in different Xen/domU VMs on same host
Or in other words if you get an account/container/VM on a shared machine from a hosting provider using technology X, how does that compare to getting an LXD container from a hosting provider?
(provided that other unknown users can run LXD containers on the same machine as yours).
In the pure sense, a hypervisor doesn't need to do anything except create a virtual machine. It doesn't need a way to interact with a user, or even with the VM once it is created. I have written a bare-metal, type-1 hypervisor that did nothing but keylog; the guest never made a hypercall and wasn't aware that it was a guest at all. Side note: I'm not an expert. Hypervisor research is just for fun.
We know there is an attack surface on LXD immediately because of the REST API and its interaction with containers. Any resource mediation also exposes an attack surface. Resource mediation is difficult, but not impossible. The attack surface really depends on implementation.
With my limited knowledge of the Linux kernel, I can imagine a kernel running in its own VM, a VM for every container, and every container sharing read-only access to the single kernel. Each container could also be isolated via the same memory protection. I don't know enough to say whether that's possible. I think you're more knowledgeable than I am about LXC and the kernel in general. Any thoughts on this?
And I'm worried about the privileged kernel/hypervisor parsing/interpreting data from the unprivileged container.
In that sense the situation is not much different from a server: if you can exploit a bug in the server you can run/perform actions with the server's privileges.
Same situation with the kernel.
I'd wait until there are some more design/architecture docs about what LXD is exactly to say more though.
That I think is a key point to this discussion -- in order to contribute to an open-source project run by Canonical, they insist upon you giving them more rights to the code you give them than they're willing to give you. Many people are understandably put off by that.
The fact that systemd comes after Upstart is I think less germane to the point that parent was trying to make about Sony than the fact that Canonical insists on being in control of their projects, and puts up rather high barriers to anyone who wants to contribute patches. I am sure that I will hear responses about various patches systemd maintainers have refused or said they wouldn't be receptive to, but that's a difference of kind (not just degree) from insisting upon assigning Canonical the ability to re-license the code outside of the GPL.
I think it's more about ownership and IP than re-licensing. Copyright assignment adds value to Canonical as a company, e.g. in case it is acquired.
Now, I have no problem with anyone who wants to sign the CLA and believes that Canonical is acting in good faith. But Canonical is asking for more value from contributed code than what the GPL provides, and isn't compensating people for this value. Some people have a problem with that, and it makes it harder for Canonical-hosted projects to get community involvement or to be adopted by other distros, where maintainers have to choose between signing a CLA so patches get accepted upstream, or continuing to maintain their patches themselves.
I agree with you, but I remember reading that argument backing the CLA (defense in court, to increase the value of the company in case they want to sell it and to be able to close the code); I can't find a link though.
Including Red Hat, which is where systemd was developed!
The original success of Ubuntu really came from incremental improvements: they took Debian as a stable base and improved on the installer and the defaults, fixing a large number of the small obstacles that prevented non-technical people from using the system.
Last time I looked, they were rewriting it from Python to Vala to fit mobile devices. Before that, they rewrote it from Perl to Python to fit the modern development ecosystem.
IMHO, Ubuntu is going round in circles.
They are not pushing their enterprise solutions on Latin America anymore. Don't know what they are doing.
But then there's the line about "working with silicon companies to ensure hardware-assisted security and isolation for these containers" -- WTF?! If using OS-based virtualization, why would you need hardware assistance for "security and isolation"?! And if that "hardware assistance" is being used for something so basic as proper containment, what happens if you don't have that assistance? Is LXD then vulnerable to privilege escalation? And who are the "silicon companies" we're talking about (like, is this Intel or is this not Intel?), what is the ISA, when does it tape out, how is it being validated, etc. etc. etc.
It's very frustrating for an announcement to be so putatively technical and yet provide so few answers; is there a deeper technical explanation of LXD somewhere?
I would guess that Canonical is talking about getting companies to contribute Linux kernel patches for cgroup interfaces to various northbridge-managed hardware virtualization tech (e.g. IOMMU tech like Intel's VT-d.)
So you could have, say, one virtual ethernet card per container (letting you run a container as a promiscuous-mode packet filter for its own VPC subnet, while still not being able to snoop on other VPCs' traffic) or one virtual GPU per container (allowing you to containerize OpenCL apps), while still having your containers acting like regular processes otherwise.
I am not convinced that this is what they are talking about though...
Today most OS-based virtualization uses "hardware assistance", often for memory and I/O device management (even passthrough). I'm not sure if this is _the_ assistance they mention, but it's an example of how it could work.
I think the best guess is what derefr posited, above: that they are using HW network virt as a way of avoiding building in proper network stack virtualization like that found in Crossbow. Then again, given the degree to which LXD appears to be aspirational rather than actual, we might be overthinking it: perhaps the conversations with "silicon companies" are like LXD itself -- a daydream about what might be rather than a concrete reality.
> And it’s going to be a real hypervisor?
> Yes. We’re working with silicon companies to ensure hardware-assisted security and isolation for these containers, just like virtual machines today. We’re working to ensure that the kernel security cross-section for individual containers can be tightened up for each specific workload.
Sorry, but WTF? Is it a hypervisor or not? From a security perspective, is it one kernel per container, or shared-kernel LXC? If the latter, as the rest of the announcement seems to imply, what is the "work with silicon companies" about? Either compromising Linux allows you to get access to other containers on the machine, or it doesn't. It can't be both.
They then go on to state that LXC is a "real hypervisor" with live migrations and such. What? Did they take an established Linux household name, with a Wikipedia article and everything, and name their new semi-related project identically?
And if it's a para-virtualized solution they're pushing, are they really competing against Xen? I'm not sure it's wiser than competing with Docker.
Anyone from Canonical here and can explain what's going on?
Edit: You know Docker doesn't use LXC by default anymore, right? It uses a different container library called libcontainer.
This could be a marketing problem, but it's the impression OpenStack gives me whenever I look at any of it.
Even "just" individual components like Swift make me want to bang my head against a wall just from looking at an architecture diagram.
Of course, for a large deployment you may end up needing all that complexity. The difference is that with OpenStack you need to figure out what you can disable, while with the Docker ecosystem you get to figure out what you need to add as you build. The latter approach is much more friendly.
It always astounds me how some people massively over-estimate the size and influence of Docker's marketing... Why yes, of course! The way we got Google, Microsoft, Amazon and IBM to integrate it in their products is by ghost-writing PR fluff. That's also how we got 600 people to contribute 9,000+ pull requests over 18 months. Not bad for marketing monkeys!
Seriously - after seeing so many hackers work so hard to improve the project every day, the "it's all marketing fluff" crap always gets to me. It's just plain disrespectful. How much legitimate engineering work do you need to see before you start respecting other people's work?
Docker resonates with people because it focuses on the aspects of virtualization that people care about: development and deployment. It alleviates the need for complicated configuration management tools by providing layered images, and encourages fine-grained containers by supporting first-class volume sharing.
The fact that it integrates well with other virtualization stacks is proof that it's largely orthogonal to them.
My logical opinion says that Docker is a useful tool which, though flawed in many ways, provides real value to a large number of users. I've even recommended Docker be used for new projects in my company, on the basis that it fits in well with what we're trying to do.
My emotional opinion is that Docker trades on trends in startup-world, systems engineering and the open source movement for the sole purpose of eventually generating revenue. This capitalistic perversion of what were before two idealistic and noble things (open source, engineering) is, quite honestly, abhorrent to me.
So to answer your question: while I might eventually respect its engineering accomplishments, I despise it on principle. I hope one day it turns into a simple useful tool that people can decide to use or not use without being cajoled by developer evangelists.
Except that the last thing anyone with a monetary stake in your business will do is tell you to open source your main product. The Docker project has fought damn hard, and continues to, to make sure that a carefully curated line between business and the open source project exists (see for instance the creation of the Docker Governance Advisory Board).
Compare that to the combined marketing budgets of HP, Dell, Rackspace, Red Hat, etc. I've probably had more spent on me by OpenStack marketing (taking flights, lunches, etc. into account) than the marketing budget of Docker prior to their recent funding round.
If you take "marketing" to mean random 3rd parties writing how they use Docker to solve actual problems, then yeah - I see a lot more of that than I do for OpenStack.
I agree Docker is easier to deploy though!
Well established within the last year or so.
But yeah, it's good to have more stuff. We'd still be talking on AT&T rotary-dial phones from the '60s if we didn't have competition.
Then again, maybe I don't understand how LXC works at all.
"LXD - the Linux Container Daemon"
"Published on 4 Nov 2014
Dustin Kirkland, Product Manager at Canonical, introduces LXD (lex-dee), a new hypervisor that delivers capabilities to LXC containers that cloud users demand in scale-out infrastructure. LXD is a persistent system daemon developed to enable the secure management and live migration of LXC (lex-cee) containers via an easy-to-use command line interface and REST API."
The concept is relatively simple: it's a daemon exporting an authenticated REST API, both locally over a unix socket and over the network using https. There are then two clients for this daemon; one is an OpenStack plugin, the other a standalone command line tool.
The main features (and I'm sure I'll be forgetting some) are:
- Secure by default (unprivileged containers, apparmor, seccomp, ...)
- Image based workflow (no more locally built rootfs)
- Support for online snapshotting, including running state (with CRIU)
- Support for live migration
- A simpler command line experience
This work will be done in Go, using the great go-lxc binding from S.Çağlar.
Now as to what this means for LXC upstream:
- A new project will be set up at https://github.com/lxc/lxd.
- Code to this project will be contributed under an Apache2 license, no CLA is required but we will require contributors to Sign-off on their commits as always (DCO).
- Discussions about lxd will happen on lxc-devel and lxc-users.
- Contributions to github.com/lxc/lxd will happen through github pull requests only and reviews will happen on github too.
This is kept separate from the main tree because, at least initially, I believe it's best to have a separate release schedule for both of those, and because it tends to be easier for Go-only projects to live in their own branch.
In order to be a good hypervisor, we also need to make containers feel like they are their own system and so we'll be spending quite a bit of time figuring out how to improve the situation. Some of the work presented at Linux Plumbers is going to contribute to that, like cgmanagerfs to provide a reasonable view of /proc and a fake cgroupfs, Seth's unprivileged FUSE mounts and all the cool things mentioned in Serge's earlier post about
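To make the "containers feel like their own system" idea concrete: a cgmanagerfs-style fake /proc could synthesize a per-container meminfo from the container's memory cgroup instead of exposing host-wide numbers. A toy sketch under that assumption (the function name and input values here are invented for illustration; this is not cgmanagerfs's actual code):

```python
# Toy sketch: synthesize a container-scoped /proc/meminfo from cgroup
# memory accounting instead of showing host-wide numbers. In a real
# implementation, the inputs would come from the container's memory
# cgroup (e.g. its memory limit and current usage files); here they are
# just passed in as numbers.
def fake_meminfo(limit_bytes, usage_bytes):
    total_kb = limit_bytes // 1024
    free_kb = max(limit_bytes - usage_bytes, 0) // 1024
    return ("MemTotal: %8d kB\n" % total_kb +
            "MemFree:  %8d kB\n" % free_kb)

if __name__ == "__main__":
    # A container limited to 512 MiB that is currently using 128 MiB:
    print(fake_meminfo(512 * 1024 * 1024, 128 * 1024 * 1024))
```

The point being that a tool like `free` running inside the container would then report the container's own budget rather than the host's total RAM.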
Now as for the next steps: we will be creating the repository on GitHub over the next few hours, with Serge and I as the initial maintainers. Once the project is properly started and active, we will promote some of the most active contributors to committers.
The first few commits in there will be text versions of the specifications we came up with until now. This should also serve as a good todo list for people who want to get involved.
Over the next few days/weeks, the existing code which was used for the demo at the OpenStack summit in Paris will be submitted through pull requests, reviewed, and merged.
see more: https://lists.linuxcontainers.org/pipermail/lxc-devel/2014-N...
and check the thread here:
(source: http://www.zdnet.com/ubuntu-is-working-on-a-new-secure-conta... )
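The "daemon exporting a REST API over a unix socket" design described in the announcement is a common pattern. Here is a self-contained sketch of it in Python, with a stand-in toy server plus a client, the way an LXD-style CLI would talk to its local daemon. The `/1.0/containers` endpoint and its payload are invented for illustration; this is not LXD's actual API:

```python
# Sketch of a daemon serving HTTP over a unix socket, plus a client.
# The endpoint and response are made up for the demo.
import http.client
import json
import os
import socket
import socketserver
import tempfile
import threading
from http.server import BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"containers": ["web1", "db1"]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

class UnixHTTPServer(socketserver.UnixStreamServer):
    def get_request(self):
        # Give the HTTP handler a fake (host, port) client address,
        # since unix sockets don't have one.
        request, _ = super().get_request()
        return request, ("local", 0)

class UnixHTTPConnection(http.client.HTTPConnection):
    def __init__(self, sock_path):
        super().__init__("localhost")
        self.sock_path = sock_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.sock_path)

def query_daemon(sock_path):
    conn = UnixHTTPConnection(sock_path)
    conn.request("GET", "/1.0/containers")
    return json.loads(conn.getresponse().read())

if __name__ == "__main__":
    sock_path = os.path.join(tempfile.mkdtemp(), "demo.socket")
    server = UnixHTTPServer(sock_path, Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(query_daemon(sock_path))
    server.shutdown()
```

A standalone CLI like the one the announcement describes would be a thin wrapper around exactly this kind of client, with TLS added for the over-the-network case.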
This press release seems to have been written in 5 minutes.
This is good for everyone. Docker doesn't even use liblxc by default anymore; it uses libcontainer. I wonder why Ubuntu isn't getting behind libcontainer. In any case, the stuff being pushed to upstream projects, like the kernel, will flow back down to Docker, and everybody can enjoy new awesomeness.
Because not everyone wants what Docker offers. Some people prefer the more VM-esque behaviour provided by LXC.
To me LXC is the real deal, while docker offers limited convenience at the cost of flexibility and platform lock-in. And I have zero interest in that.
I hope Ubuntu continues to offer good LXC support, and then Docker (or whatever the other hip thing of the month is) can do whatever Docker does, because it's external to whatever distro people are running.
> To me LXC is the real deal, while docker offers limited convenience at the cost of flexibility and platform lock-in.
I'm not really sure what makes liblxc the "real deal" and libcontainer not. Would you care to expand on that? The true flexibility you are alluding to is, I believe, provided by the kernel itself. Are you deriding libraries that abstract interfacing with those features? Where is the platform lock-in coming from? Docker has been making inroads into many non-Linux platforms, even Windows recently.
> and then docker (or whatever the other hip thing of the month is)
Are you suggesting Docker is a "flavour of the month"? That's... a unique perspective. In any case, as a counterpoint, I'd like to offer up that Red Hat itself has partnered with Docker via OpenShift. If one were looking for the Linux flavour of the month, Red Hat would be the LAST place to look.
At a data center where I worked a while back, I saw thousands of OpenVZ containers on boxes that could only manage maybe sixty KVM guests. If the issues around security and flexibility can be fixed, there is opportunity for orders-of-magnitude improvements in density and power utilization.
Running the same kernel/version of OS is fine across my host system and the guests.
Please stop using containers/virtualization as a security mechanism. It never worked, and it never will.
( source: http://osdir.com/ml/general/2014-11/msg06783.html )