Tuning JVM for a VM - Lessons Learned, Directly from VMware

sciurus · on April 12, 2012

Efficiently using memory when you have three levels (Hypervisor->OS->JVM) that are largely opaque to each other is an real challenge. Doing ballooning at the JVM level [0] instead of the OS level makes sense for VMs primarily running Java applications. VMware often seems to be the first to get a new technology out, but their competition quickly follows (e.g. KSM for transparent page sharing and compcache for memory compression). I wonder if Oracle or Red Hat are working on their own JVM ballooning implementation.

[0] http://pubs.vmware.com/vfabric5/topic/.../vfabric-tc-server-...

rrdharan · on April 12, 2012

[Disclosure: I worked at VMware from 2002 through 2011.]

I take issue with the claim that the competition "quickly followed" on for (at the least) the transparent page sharing feature. VMware was shipping transparent page sharing in their type 1 hypervisor since ESX 1.5, which was released before I joined the company.

KSM was first proposed in 2008, and I believe it didn't actually ship until 2009 (note: the delay was at least in part because the developers wanted to avoid the possibility of exposure to patent litigation since VMware held a patent on the technology, software patents are evil, blah blah).

You can tell a similar story for VMotion/live migration of running virtual machines; VMware first shipped it in 2003 and it was at least 2007 before any competing hypervisors were shipping a similar feature (Hyper-V didn't have it until 2008 R2, Xen had it sooner - possibly in 2007?).

HackR · on April 13, 2012

Out of curiosity what made you leave VMware?

sciurus · on April 13, 2012

Thanks for the correction!

Yeroc · on April 13, 2012

Azul Systems has been marketing a different solution to this same problem in the Zing VMs for some time now. I believe their solution is based on adding extra APIs to the underlying kernel etc. so the JVM's heap isn't so opaque to the OS/hypervisor.

The VMWare solution sounds like a simpler, pragmatic "hack" to solve the problem. It would be interesting to hear from someone that has used both technologies to see how they compare in practice.

JoachimSchipper · on April 13, 2012

As I understand it, Azul rather solves this problem by giving control of the memory-mapping unit to the VM. Which allows some really neat tricks (I'd expect it also allows VM->kernel privilege escalation, but their machines are Java appliances anyway.)

kristianp · on April 13, 2012

That link is broken, can you fix it please?

sciurus · on April 13, 2012

Sorry, try http://pubs.vmware.com/vfabric5/topic/com.vmware.ICbase/PDF/...

gaius · on April 13, 2012

Is it just me or is running a VM in a VM just crazy? The application stack now looks like

OS -> Hypervisor -> VM -> OS -> VM -> App

Where "App" might be a service so you might have this stack duplicated dozens of times for an actual app that a user can use. And you don't save anything either, rather than "processes" your sysadmins manage "VMs". What happened to

OS -> App

emmelaich · on April 16, 2012

Yes it bothers me a bit. I think the 'everything to a VM' culture at the moment unfortunately misses doing isolation in a lighter way. (jails,zones,lxc,lguest etc). Some of those are not quite mature but still. Relatedly sometimes license requirements are per cpu, and the license accepts vmware restrictions but not cgroup or resource manager type restrictions.

I hate licensing.

guelo · on April 13, 2012

Or you could just run the JVMs on the base OS and use good old file system security if you really need sandboxing. The supposed advantages of adding a OS VM layer seem iffy if you're just running one app per.

rickette · on April 13, 2012

It always amazed me that the JVM doesn't have any options to restrict CPU resources (something VMware does have). The only alternative I can think of is to use the unix taskset utility to limit JVMs to specific CPU cores. Anybody got experiece with a setup like this?

Limiting memory usage and file permissions with the JVM is indeed easy.

sciurus · on April 13, 2012

Create a cgroup hierarchy with the CPU subsystem attached, then put each JVM in its own cgroup in that hierarchy. By default, each JVM will get an equal amount of CPU time. A JVM can be given relatively more CPU time by increasing cpu.shares for its cgroup. You can similarly manage memory and i/o with the memory and blkio subsystems.

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6...

guelo · on April 13, 2012

Let the OS manage the CPUs, you can use nice if you need to prioritize.

ntoshev · on April 13, 2012

There is also cpulimit if you want to set hard limits.

derleth · on April 13, 2012

So, how reasonable is it right now to run a JVM directly on a VM?

gaius · on April 13, 2012

Oracle are running JRockit directly on the hypervisor, no OS.

http://docs.oracle.com/cd/E23009_01/doc.200/e22518/introduct...

_3u10 · on April 13, 2012

Pretty reasonable as proof of concept, MS runs the .NET VM as an operating system, see Singularity. There's also JavaOS, JNode, JX

In practicality the issue is drivers, databases, webservers, etc.

derleth · on April 13, 2012

There are a few ways to solve this; the simplest might be to replace the OS with a library (a libos) linked into the JVM that implements whatever the JVM needs from an OS API in terms of what a specific VM offers. This is the idea of the MIT Exokernel Operating System, XOK.

http://pdos.csail.mit.edu/exo/

zokier · on April 13, 2012

http://labs.oracle.com/projects/guestvm/

edit: see also LiquidVM, possibly aka JRockit VE