
Running Java in a Container - edw519
https://mesosphere.com/blog/java-container/
======
r4um
Not just the JVM: a lot of libraries build thread pools based on the number of
processors, and some of them even hard-code the multipliers (X times the number
of processors, etc.).

Setting up each one becomes a lot of work fast, so we wrote an LD_PRELOAD[1]
hook that overrides the sysconf[1] _SC_NPROCESSORS_* calls, which report the
number of processors available/online, so that they return a specified value.
The hook is baked into the Docker image by default during builds.

[1] [http://man7.org/linux/man-pages/man3/sysconf.3.html](http://man7.org/linux/man-pages/man3/sysconf.3.html)
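
For code you control, a lighter alternative to intercepting sysconf is to route every pool size through one helper that honours an explicit override. The sketch below assumes a hypothetical system property named `app.cpus` (not a JVM or library setting); it does nothing for third-party libraries that call `availableProcessors()` themselves, which is the case the hook above targets.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Sizes thread pools from an explicit override instead of the raw processor count. */
final class PoolSizing {
    // Hypothetical property name, chosen for this example only.
    static final String OVERRIDE_PROPERTY = "app.cpus";

    /** Returns the override if it parses to a positive integer, else the detected count (at least 1). */
    static int effectiveCpus(String override, int detected) {
        if (override != null) {
            try {
                int n = Integer.parseInt(override.trim());
                if (n > 0) {
                    return n;
                }
            } catch (NumberFormatException ignored) {
                // Not a number: fall through to the detected value.
            }
        }
        return Math.max(1, detected);
    }

    /** Reads the override from the system property and the detected count from the JVM. */
    static int effectiveCpus() {
        return effectiveCpus(System.getProperty(OVERRIDE_PROPERTY),
                Runtime.getRuntime().availableProcessors());
    }

    /** Builds a fixed pool with `multiplier` threads per effective CPU (multiplier must be >= 1). */
    static ExecutorService newPool(int multiplier) {
        return Executors.newFixedThreadPool(effectiveCpus() * multiplier);
    }
}
```

Started with `java -Dapp.cpus=2 ...`, `PoolSizing.newPool(4)` would create a pool of 8 threads regardless of how many processors the JVM detects.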

~~~
MaxBarraclough
Can we expect that kind of horror to fade away as Java evolves?

In the .Net world there's the ThreadPool class to manage the single process-
wide thread-pool, and the Task class to enqueue and orchestrate concurrent
jobs (it uses ThreadPool and almost completely abstracts away the thread
management).

(You _could_ write your own thread-pool, of course, but for most code that
wouldn't make sense.)

As I understand it, the JVM is rather behind in this department. (Not to
mention async/await.)

~~~
pjmlp
java.util.concurrent is actually more advanced than the TPL, with contributions
from Doug Lea.

Java just lacks async/await primitives.
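
As a rough illustration of the point, `CompletableFuture` in java.util.concurrent covers much of the enqueue-and-orchestrate role described above without any manual thread management. This is a minimal sketch with made-up class and method names, not a feature-by-feature comparison with the TPL.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncCompose {
    /** Runs two independent jobs concurrently and combines their results. */
    static CompletableFuture<Integer> sumOfSquaresAsync(int a, int b) {
        // Without an explicit executor, supplyAsync normally runs on
        // ForkJoinPool.commonPool(), whose default parallelism is derived from
        // Runtime.availableProcessors(). If that pool cannot support a
        // parallelism of at least two, a new thread is created per task.
        CompletableFuture<Integer> left = CompletableFuture.supplyAsync(() -> a * a);
        CompletableFuture<Integer> right = CompletableFuture.supplyAsync(() -> b * b);
        return left.thenCombine(right, Integer::sum);
    }

    public static void main(String[] args) {
        // join() blocks until both jobs have finished and been combined.
        System.out.println(sumOfSquaresAsync(3, 4).join());
    }
}
```

Because the default pool is sized from the detected processor count, this is also one of the places where a wrong CPU count inside a container changes behaviour.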

~~~
bunderbunder
java.util.concurrent certainly has more buttons and levers you can fiddle
with. I could probably burn at least a week trying to grok the whole thing.

------
ognyankulev
The just-released Java 10 has improvements in the container case:
[https://www.opsian.com/blog/java-on-docker/](https://www.opsian.com/blog/java-on-docker/)

edit: at the end of the article, it is acknowledged that there are
improvements in Java 10 and the following link is provided:
[https://bugs.openjdk.java.net/browse/JDK-8146115](https://bugs.openjdk.java.net/browse/JDK-8146115)

~~~
ChristianGeek
There was a discussion about this yesterday on Reddit:

[https://reddit.com/r/java/comments/85t7dt/java_on_docker_wil...](https://reddit.com/r/java/comments/85t7dt/java_on_docker_will_no_longer_suck_improvements/)

------
jrudolph
This. I've seen container builds for Java applications broken in the described
ways fail in production over and over again. Just because Docker makes
building an image very easy doesn't mean that you will end up with a
production ready image using three lines of code in a Dockerfile. Most of the
time people don't even bother to use a non-root user to execute the jvm in
their container...

That's why I feel platforms like Cloud Foundry are a much better fit for teams
that don't have tons of container experience but want to get the benefits of a
containerized runtime. The CF java buildpack[1] for example automatically
handles OOM and heap settings calculation while building your application
container.

Disclaimer: co-founder at meshcloud; we offer a public cloud service for Cloud
Foundry and K8s hosted in German datacenters.

[1] https://github.com/cloudfoundry/java-buildpack

~~~
nerdwaller
I wrote up my experience[0] on containerizing JVM based applications a bit ago
using the cloud foundry java buildpack’s memory calculator. Fortunately the
JVM now has a way to respect cgroup memory[1] making it a bit simpler.

[0]: [https://medium.com/@matt_rasband/dockerizing-a-spring-boot-a...](https://medium.com/@matt_rasband/dockerizing-a-spring-boot-application-6ec9b9b41faf)

[1]: [https://twitter.com/codepitbull/status/934384652806221825?s=...](https://twitter.com/codepitbull/status/934384652806221825?s=20)

------
rdsubhas
Fabric8 has really good base Java images[1] with a script that simply sets
environment variables with the right GC and CPU parameters before launching
Java, with sane defaults.

I heavily encourage anyone running Java in containers to use their base image,
or, for larger organizations, to create standard base image Dockerfiles that
set these JVM envvar parameters. A simple contract is: ENTRYPOINT belongs to the
base image, CMD belongs to downstream application images (unless something
else is essential).

Just don't use vanilla "FROM openjdk:8-jre" and expect it to work. That's the
worst way to kill application performance and reliability in a container.

1: [https://github.com/fabric8io-images/java/blob/master/images/...](https://github.com/fabric8io-images/java/blob/master/images/centos/openjdk8/jre/run-java.sh#L3)

~~~
deepakhj
JDK 8u131+ and JDK 9 support detecting CPU and memory limits to set heap size
and core usage.

[https://blogs.oracle.com/java-platform-group/java-se-support...](https://blogs.oracle.com/java-platform-group/java-se-support-for-docker-cpu-and-memory-limits)

~~~
rdsubhas
We tried those flags in the beginning, since they were introduced in the Docker
openjdk image[1].

When we dug in further, we found it's just not trouble-free (i.e. experimental).
The default is to use 1/4 of RAM for heap, which is entirely inefficient [2].
The "MaxRAMFraction" parameter only lets you specify a 1/n fraction, so it is
not possible to efficiently use 65% or 75% of memory. The only practical place
to start is MaxRAMFraction=2, and that already means only 50% of memory is used
for heap. That produces a lot of wastage. A lot of resource efficiency is gained
by starting with 65% or 80%.

OpenJDK 10 is introducing a new option, "MaxRAMPercentage" [3], and that goes
closer to making a script unnecessary.

TL;DR - The default flags are still experimental in JDK 8/9, and deemed to be
better on Java 10. A script is just better for consistency.

1: [https://github.com/docker-library/docs/pull/900](https://github.com/docker-library/docs/pull/900)

2: [https://news.ycombinator.com/item?id=16636544](https://news.ycombinator.com/item?id=16636544)

3: [https://bugs.openjdk.java.net/browse/JDK-8186248](https://bugs.openjdk.java.net/browse/JDK-8186248)
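
The granularity problem is easy to see with plain arithmetic. The sketch below only models the 1/n versus percentage calculation for an assumed 1 GiB container limit. It does not reproduce the JVM's actual ergonomics, which apply further rules, and the class and method names are made up for the example.

```java
final class HeapSizing {
    /** Max heap when sized as 1/n of available memory (the MaxRAMFraction model). */
    static long byFraction(long limitBytes, int n) {
        return limitBytes / n;
    }

    /** Max heap when sized as a percentage of available memory (the MaxRAMPercentage model). */
    static long byPercentage(long limitBytes, double percent) {
        return (long) (limitBytes * percent / 100.0);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long limit = 1024L * mib; // assumed 1 GiB container memory limit
        for (int n = 1; n <= 4; n++) {
            System.out.println("1/" + n + " of limit: " + byFraction(limit, n) / mib + " MiB");
        }
        System.out.println("75% of limit: " + byPercentage(limit, 75.0) / mib + " MiB");
    }
}
```

With a 1 GiB limit the fraction model can only land on 1024, 512, 341 or 256 MiB for n = 1 to 4, so nothing between 50% and 100% is reachable, while the percentage model can ask for 768 MiB directly.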

------
mcgin
These options were also backported to Java 8 in update 131 -
[https://blogs.oracle.com/java-platform-group/java-se-support...](https://blogs.oracle.com/java-platform-group/java-se-support-for-docker-cpu-and-memory-limits)

------
xcq1
Anyone here actually use Java containers in production?

Sadly the article mentions very little in terms of practical advice. We've
tried running some small Java 8 Spring Boot containers in Kubernetes which are
configured to use at most ~50M heap and ~150M total off-heap, yet they decide to
use more than double that memory, so we end up with either a lot of OOMKills or
overly large memory limits.
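
One way to start narrowing down where the memory goes is to log what the JVM itself accounts for and compare it with what the container reports. The sketch below uses the standard `java.lang.management` beans; the class name is made up. As far as these beans go, the non-heap figure covers JVM-managed pools such as metaspace and the code cache but not thread stacks, direct buffers or native allocations, so the container's resident memory can be noticeably higher than the sum printed here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

final class MemoryReport {
    /** Summarises heap, non-heap and thread figures as seen from inside the JVM. */
    static String report() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        MemoryUsage nonHeap = bean.getNonHeapMemoryUsage();
        long mib = 1024L * 1024;
        return "heap used=" + heap.getUsed() / mib + " MiB"
                + ", heap max=" + Runtime.getRuntime().maxMemory() / mib + " MiB"
                + ", non-heap used=" + nonHeap.getUsed() / mib + " MiB"
                + ", live threads=" + ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Logging this periodically next to the container's own memory metric shows how much of the usage falls outside the heap and non-heap pools.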

~~~
olavgg
Honestly I don't feel there is any need to run Java in containers. The war/jar
file is its own container with its own dependencies. The JVM still makes the
same syscalls as it would inside a Docker/Kubernetes container.

In fact I would rather look at serverless architecture before considering
docker/Kubernetes.

~~~
cp9
when you run a polyglot stack with java/python/go/node on top of a cluster of
machines, you will love to have them containerized and uniform. It makes
scripting and CI so much easier.

or, when you have a legacy app that relies on java 6, but you want everything
else to run on java 8, the ability to drop everything into a container with
its runtime is a life saver.

source: I'm the devops person that's responsible for making this work

~~~
olavgg
We already run a polyglot stack at our company, and we use Docker
(nvidia-docker) for our Python environment. With Java there is no need, and
without containers it is a lot less work updating and upgrading the JVM and our
Java applications. I would use Docker for Java 6 though.

------
matttproud
We created a model and applied unsupervised ML to tuning the JVM for execution
in containers and eventually open sourced the project:

[https://github.com/sladeware/groningen](https://github.com/sladeware/groningen)

Unfortunately the experiment state persistence management capability is
broken.

------
Edmond
The real killer app would be the ability to fully containerize all jvm
instances running on any given box.

It will be a setup where one jvm instance on the host basically serves the
role of "master" in terms of class data and shared object loading while each
container instance uses its memory allotment only for running computations
specific to the application in that container while sharing memory objects
with other containers as much as possible.

It is possible to do something similar at the moment but it requires going
through a hodge-podge of painful hacks. A seamless solution to this would
basically make the jvm an out of the box poor man's polyglot PaaS platform.

~~~
jjtheblunt
Aren't you describing every OS since Solaris 2.x that uses shared pages across
processes as well as Solaris 2.x did?

~~~
Edmond
My OS internals knowledge is rusty so I am not sure they are quite the same
but likely similar.

If you think of a typical JVM application (true for non-JVM apps as well), a
significant chunk of class data will be shared, since apps are typically using
the same libraries (with deltas in versions). Allowing easy reuse of class
data across all container instances on a host would be a major scalability
advance.

~~~
karambahh
There is a Java feature called class data sharing that has done that since
Java 5, but I've never seen it in the wild.[0]

Java 10 will bring AppCDS via JEP 310.[1]

[0][https://fosdem.org/2018/schedule/event/class_data_sharing](https://fosdem.org/2018/schedule/event/class_data_sharing)

[1][http://openjdk.java.net/jeps/310](http://openjdk.java.net/jeps/310)

------
stuff4ben
Have to say I really liked this article, but most of it was spent on how
containers actually work on the backend. Which I think is fantastic as it's
one of the most concise and easy to understand articles I've seen on the
subject. I forwarded it to my team because a few of them still think Docker is
some black-magic voodoo.

------
Skunkleton
Wasn't the original promise of the JVM to provide container like services to
applications? This field is way outside where I normally work, I would love to
hear some expert commentary about this.

~~~
hibikir
It kind of does: Java EE was all about having single Java application servers
running on each host, and deploying applications/services on top. In practice
every service was built by the same company though, so focus moved toward
shared libraries in containers and things like that: search for OSGi.

The issue is both that the security concerns were way softer than they are
today, and that all your dependencies better be deployable in a JVM too.
Modern ideas of having databases deployed along with the services that need
them don't work quite as well as just using OS-level virtualization. That said,
many a crufty old company still deploys hundreds of services to production by
loading .war files into a cluster of servers running JBoss or WebLogic.

~~~
gunnihinn
> and that all your dependencies better be deployable in a JVM too.

Remember: Never use the standard library when a forgotten, half-assed CPAN
module you can run via Perlito will kind of do.

------
joerg84
One of the authors here: First of all, thanks for the great discussion here; it
really motivated us to write another follow-up article. There were some comments
asking for more practical advice, which is totally fair, as this was supposed to
be more informational and to create problem awareness. What else would you like
to see covered in a follow-up?

~~~
spotlmnop
A base Java image and a set of recommendations. Most of us here (I assume) are
running containers in K8s or Swarm. Our issue is that there is no base Docker
image which the Java community can embrace. I have pods which restarted 35
times in the last month.

------
allsunny
We use shaded/fat jars where I work because it's easier to ensure we never run
into issues with missing dependencies. It does however come at the cost of
longer build times. I've been hoping that Docker might be able to help with
this by allowing us to keep all of the dependencies in the container base
image and then just add the new class files in the build process. Is that a
reasonable assumption?

------
adrianmonk
I wonder if it might be a good general rule to never give just 1 CPU to _any_
multi-threaded application (JVM or otherwise).

Often, there's a mix of threads, some of which are doing CPU-intensive stuff
and some which just need to do some quick thing to unblock something else,
like start some new IO when some IO completes. With 1 CPU, any time any CPU-
intensive stuff is happening, these quick things have to wait their turn until
the next time slice. With 2 CPUs, you have twice the work and twice the
likelihood of having to deal with CPU-intensive stuff, but CPU-intensive
moments don't necessarily happen at the same time, so you have better odds of
having one of those CPUs immediately available to do the small, quick stuff.

------
mboehm
>> But in the case of containers with a hard memory limit, the entire
container will simply be killed without warning.

At least in my experience that is not true. I quite often run into this issue
and the OOM-killer will only kill one of the processes inside the container,
not the entire container.

>> The same is true for default memory limits. The JVM looks at the host
overall memory and uses that to set its defaults.

Well, I guess if you launch a JVM anywhere without setting appropriate memory
settings, you are doing something fundamentally wrong.

~~~
rad_gruchalski
>> At least in my experience that is not true. I quite often run into this
issue and the OOM-killer will only kill one of the processes inside the
container, not the entire container.

Hence one does not run multiple processes in the container, and/or one handles
crashes of potential child processes correctly.

~~~
mboehm
Well, I guess, sometimes you need to break the rules. But you are right.

------
fokinsean
What a great read! I'm mostly in application land all day, and even though I
have daily interaction with Docker containers I don't have a good
understanding of what is going on under the hood. Now I partially understand why
my coworker who manages our K8s cluster added some of the VM arguments to our
Java application images.

------
kev009
It would be nice to have some kind of cooperative API for runtimes like Java,
Go, etc., through which the OS or container manager can provide watermark hints
for the heap size and for when to run collections.

~~~
techman9
How does the JVM determine the number of cores on a Linux system by default?

~~~
joerg84
One of the authors here: This is the OpenJDK 8 implementation:
[http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/tip/src/os...](http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/tip/src/os/linux/vm/os_linux.cpp#l4967)
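
To see the mismatch from inside a container, the count the JVM reports can be compared with the CPU quota the cgroup actually grants. The sketch below assumes a cgroup v1 layout mounted at `/sys/fs/cgroup/cpu` (paths differ under cgroup v2 or custom mounts) and uses made-up class and method names; it illustrates the quota arithmetic and is not what the linked HotSpot code does.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class CgroupCpus {
    /** quota / period rounded up, capped at reportedCpus; a non-positive quota or period means no limit. */
    static int fromQuota(long quotaUs, long periodUs, int reportedCpus) {
        if (quotaUs <= 0 || periodUs <= 0) {
            return reportedCpus;
        }
        long limit = (quotaUs + periodUs - 1) / periodUs; // ceiling division
        return (int) Math.max(1, Math.min(reportedCpus, limit));
    }

    /** Reads a single number from a file, returning the fallback if it is missing or malformed. */
    static long readLong(Path file, long fallback) {
        try {
            return Long.parseLong(new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim());
        } catch (IOException | NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Assumed cgroup v1 mount point; adjust for other setups.
        Path base = Paths.get("/sys/fs/cgroup/cpu");
        long quota = readLong(base.resolve("cpu.cfs_quota_us"), -1);
        long period = readLong(base.resolve("cpu.cfs_period_us"), -1);
        int reported = Runtime.getRuntime().availableProcessors();
        System.out.println("JVM reports " + reported
                + " processors; cgroup quota allows " + fromQuota(quota, period, reported));
    }
}
```

On a JVM without container support the two numbers can differ widely; on one that already honours the quota they should match.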

