
Comparing Java GC Collectors - ingve
http://eivindw.github.io/2016/01/08/comparing-gc-collectors.html
======
hyperpape
I know that profiling using VisualVM (or most other profilers) can
substantially change the garbage collection behavior of an application
([http://psy-lob-saw.blogspot.com/2014/12/the-escape-of-arrayl...](http://psy-
lob-saw.blogspot.com/2014/12/the-escape-of-arraylistiterator.html)).

I don't know if using VisualVM in the way this article does has the same
effect -- it sounds like the author is not doing any actual profiling, so it
may not be an issue. Additionally, since this is a synthetic benchmark, you
might argue that it's not important. Still, I'd tend to think it makes more
sense to use OS-level tools to monitor the CPU usage and avoid changing the
behavior of the JVM.

~~~
dpratt
I concur. These results would have been better if they'd been generated using
the native GC logging format and then run through an analysis tool.

Similarly, the article mentions that the synthetic benchmark was run for only
a minute. I am highly suspicious of this - any results you get here will be
useless until HotSpot has had a bit of time to optimize the running code. I
strongly suspect that the G1 throughput would have been significantly higher
if the benchmark had been allowed to run longer.
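
The warmup effect is easy to demonstrate. A minimal sketch in plain Java (JMH,
as mentioned below in the thread, is the right tool for real measurements;
`churn` here is a made-up allocation-heavy workload standing in for the
article's benchmark):

```java
import java.util.ArrayList;
import java.util.List;

public class WarmupDemo {
    // Hypothetical allocation-heavy workload, a stand-in for the article's benchmark.
    static long churn(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
            if (list.size() > 1024) list.clear();
        }
        return list.size();
    }

    static long timeIt(int n) {
        long start = System.nanoTime();
        churn(n);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long cold = timeIt(1_000_000);                  // first run: interpreted / still compiling
        for (int i = 0; i < 20; i++) churn(1_000_000);  // let HotSpot compile the hot path
        long warm = timeIt(1_000_000);                  // same work, now JIT-compiled
        System.out.printf("cold: %d us, warm: %d us%n", cold / 1000, warm / 1000);
    }
}
```

On most HotSpot builds the warm run is markedly faster than the cold one,
which is exactly why a one-minute benchmark conflates JIT warmup with GC
behavior.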

~~~
iheartmemcache
Not only did he not give HotSpot time to warm up, but we have no idea what
the overhead of the Dropwizard stack is (i.e., if it's heavily IO-dependent,
that's obviously going to shift things), what Application<T> looks like, or
what the emitted code is. If it's anything like Clj, the overhead of the
package runtime is not insignificant.

The native GC logging is way better than using log4j (which is great for
logging, but abysmal in situations like this for obvious reasons...). He's on
OS X; if he wanted to conduct this with even the slightest bit of rigor he
would have used DTrace and probed a few application runs with appropriately
placed probes. As has already been mentioned, he should be using whatever the
JDK (or ideally JMH, IME) gives him to bench rather than that log4j
derivative, because that could potentially introduce disk-seek variance
between the two GCs. One run could simply be buffering, capturing stdout's
FD, and flushing once, while the other does 60 very significant 'get FD,
(potentially block until you can acquire a mutex on it, then potentially wait
on the seek), write, flush/sync' cycles -- I have no idea about the library
he's using to log, nor have I used an OS X file system in production (unless
it uses UFS1, which I used for a bit when I was 13 or 14).

He's writing to a tty, so that's probably not going to impact performance too
much, I'd imagine -- but if you've ever real-time cat'd a 200 MB file out to
a tty, you'll see that the bottleneck is actually output to the screen rather
than the throughput of the disk (again, not sure why; it probably has a lot
to do with the X11/Xorg framebuffer being slammed and/or the bloat of the
terminal you're using -- you'd absolutely see variance between something
minimal like xterm + sh and gnome-terminal + zsh). Anyways, yeah, he should
have read some Oracle docs to minimize the number of independent variables
between his two cases to make this even moderately definitive. (Also, I'm
baffled why he put in any IO whatsoever if the whole point was to bench the 4
different modes of GC...)

Anyways, instead of tearing apart his post by pointing out the complete lack
of rigor and experimental-design foresight he exhibited, something fun to do
is to take the Sun JVM, IBM J9, OpenJDK, Graal (or something along those
lines) and Zulu (if you can scare up a Zulu license you're in for a pretty
interesting treat). It's amazing how one IL can yield so many different
performance results.

~~~
hyperpape
I don't see anything about log4j -- he's using the JVM's built-in GC logging.
And can you explain the DTrace reference? I wouldn't have thought it's what
you'd use for Java GC analysis. Is this some kind of odd sarcasm?

~~~
iheartmemcache
Absolutely nothing sarcastic about it.

RE: Log4j. I haven't written Java professionally in almost a decade, so my
mind substituted in "something similar to log4j". Slf4j, log4j 2.0 -- both
libs serve (as I understand it; am I horribly off base here?) the same
purpose: your kernel will be forced to make a syscall to write, in this case
to his stdout, introducing unnecessary variance when your goal is to contrast
two states of GC.

RE: DTrace. It's a fantastic tool for making granular benchmarks. Again,
there's nothing sarcastic about it. As I previously mentioned, JMH would be
ideal, confirmed by Oracle[1] as the method to use. (Read the second section,
starting a few paragraphs up from Listing 6.) He basically gives a
prescriptive method by which one can use the same tool I mentioned to combat
the same issues I brought up, especially for situations like this (micro-
benching). DTrace would take it one step further (again, assuming you're
skilled enough to use it appropriately) -- you'd end up with measurements
effectively on par with Intel PIN. I suppose that's drastic overkill for this
situation, but no, there was nothing sarcastic (nor incorrect, as far as I
can tell) about anything I said.

Edit: As I re-read my response to him, I'm baffled as to how you could think
this was some sort of sarcastic jab. The dude made an awful post, and I tried
to diplomatically raise a few of _many_ issues with his method. I guess I was
being a bit of a dick by just hammering away at him. He's probably just some
19-year-old, and I should have been constructive instead by spending 15
seconds searching for one of the plethora of sources available on how to
properly benchmark the variance of different styles of garbage collection on
stack-based VMs. I suppose my foul mood and an easy blog post as a target
make for poor manners. Apologies, OP

[1] [http://www.oracle.com/technetwork/articles/java/architect-
be...](http://www.oracle.com/technetwork/articles/java/architect-
benchmarking-2266277.html)

~~~
hyperpape
Sorry for the bit about sarcasm, I was just really confused by some things,
and said something silly and obnoxious.

 _The native GC logging is way better than using log4j (which is great for
logging, but abysmal in situations like this for obvious reasons...)_

He's using the native GC logging. The log4j clone is application level
logging, which is what originally confused me.

I see you're worried about variance, but I'm not sure a super synthetic
benchmark is better. With GC measurement, you're concerned about how an actual
application produces garbage--unlike writing a controlled benchmark with JMH.
I'd rather see an actual Dropwizard app/servlet hit with JMeter or something
like that, and enough runs to smooth the variance--the opposite of tearing out
the logging.

 _he would have used Dtrace and probed a few application runs with
appropriately placed probes._

Still confused by this. Hotspot does GC logging. How do you propose he improve
on that with DTrace?

------
eivindw
Main point of the article was really to show how to use GC logging correctly
- calculating max and avg pause times plus throughput. The VisualVM stuff was
just added for fun. I ran without it, and the numbers are the same. Also,
running for a minute was just to get images to compare.

In a real application you just need GC logging and a tool to calculate
metrics. Point is that you have to measure - my numbers are only valid for a
1-minute running test app.
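
The "tool to calculate metrics" part can be tiny. A hedged sketch that
computes max and average pause from log lines in the classic
`-XX:+PrintGCDetails` shape, where each stop-the-world event ends in
`, <seconds> secs]` (the sample lines below are made up):

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogStats {
    // Matches the trailing pause time of a classic-format GC log line.
    private static final Pattern PAUSE = Pattern.compile("(\\d+\\.\\d+) secs\\]");

    // Returns {max, avg} pause in seconds over the given log lines.
    static double[] pauseStats(List<String> lines) {
        double max = 0, sum = 0;
        int count = 0;
        for (String line : lines) {
            Matcher m = PAUSE.matcher(line);
            if (m.find()) {
                double secs = Double.parseDouble(m.group(1));
                max = Math.max(max, secs);
                sum += secs;
                count++;
            }
        }
        return new double[] { max, count == 0 ? 0 : sum / count };
    }

    public static void main(String[] args) {
        List<String> sample = List.of(   // made-up lines in the classic format
            "[GC (Allocation Failure) 65536K->1024K(251392K), 0.0031420 secs]",
            "[GC (Allocation Failure) 66560K->1088K(251392K), 0.0019653 secs]",
            "[Full GC (Ergonomics) 1088K->900K(251392K), 0.0104771 secs]");
        double[] stats = pauseStats(sample);
        System.out.printf("max=%.4fs avg=%.4fs%n", stats[0], stats[1]);
    }
}
```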

------
kilink
There are so many other factors that ultimately determine GC performance. The
GC must be tuned based on the workload. For instance, the out-of-the-box
defaults for things like NewRatio [1] are not ideal for web services with
lots of short-lived objects.

[1]
[https://blogs.oracle.com/jonthecollector/entry/the_second_mo...](https://blogs.oracle.com/jonthecollector/entry/the_second_most_important_gc)
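
On a HotSpot JVM you can check the effective value of a flag like NewRatio at
runtime via the HotSpotDiagnosticMXBean; a hedged sketch (HotSpot-specific,
not portable to other JVMs such as J9):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class ShowNewRatio {
    public static void main(String[] args) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // getVMOption throws IllegalArgumentException if the flag doesn't
        // exist on this JVM, so hedge with a try/catch.
        try {
            System.out.println("NewRatio = " + bean.getVMOption("NewRatio").getValue());
        } catch (IllegalArgumentException e) {
            System.out.println("NewRatio not available on this JVM");
        }
    }
}
```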

~~~
eivindw
Yes, choice of collector is just one of many tuning options. Point is you need
to log and measure your own app for every change you make.

------
joncrocks
Article is trash.

There are different collectors. They are different.

No really?

~~~
nomadlogic
Also, I was unable to find a reference to the version/build of Java the
author was using - which undermines any numbers/benchmarks presented.

------
Scarbutt
What is the default GC for the 1.8 JVM?

~~~
hyperpape
Parallel. G1 will be default in 1.9.

Edit: the part about 1.8 is wrong. 1.8 doesn't have a standard default.

~~~
lovelearning
Not sure what you meant in the edit, but the first sentence is correct.
HotSpot 1.8 does default to the Parallel collector in the absence of any
explicit GC-related VM arguments.

[1]:
[http://hg.openjdk.java.net/jdk8u/jdk8u60/hotspot/file/37240c...](http://hg.openjdk.java.net/jdk8u/jdk8u60/hotspot/file/37240c1019fd/src/share/vm/runtime/arguments.cpp#l1569)
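
One way to see which collectors a given JVM actually selected, rather than
reading the source, is to list the GarbageCollectorMXBeans at runtime -- e.g.
names like "PS Scavenge" / "PS MarkSweep" for Parallel, or "G1 Young
Generation" / "G1 Old Generation" for G1 (a sketch; the exact names vary by
JVM and version):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class ShowCollectors {
    public static void main(String[] args) {
        // One MXBean per active collector; names identify the GC in use.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName());
        }
    }
}
```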

~~~
hyperpape
That's what I initially thought, but then I looked at

[http://docs.oracle.com/javase/8/docs/technotes/tools/unix/ja...](http://docs.oracle.com/javase/8/docs/technotes/tools/unix/java.html)
and confused myself.

