
Peter Lawrey Describes Petabyte JVMs - ancatrusca
http://www.infoq.com/news/2015/03/petabyte-jvms?utm_source=hacker%20news&utm_medium=link&utm_campaign=petabyte%20news
======
Alupis
There are some interesting challenges when scaling this large with any
system.

Peter referenced some of Azul Systems' work and their custom GC.

Cliff Click from Azul has done some great Google Tech Talks about JVM
development and scaling JVMs out to really big systems.[1][2]

[1] A JVM Does That? -
[https://www.youtube.com/watch?v=uL2D3qzHtqY](https://www.youtube.com/watch?v=uL2D3qzHtqY)

[2] Java on a 1000 Cores -
[https://www.youtube.com/watch?v=5uljtqyBLxI](https://www.youtube.com/watch?v=5uljtqyBLxI)

------
jkot
Peter also has some great off-heap libraries. It is relatively easy to bypass
GC completely.

~~~
Alupis
Here's Peter's HugeCollections repo[1].

I expected to see some native code being called via JNI, but it's all done in
Java.

[1] [https://github.com/peter-lawrey/HugeCollections](https://github.com/peter-lawrey/HugeCollections)

~~~
aardvark179
Well, the native code is replaced by the use of sun.misc.Unsafe to do the
off-heap allocation and access (look in the io & storage classes in
[https://github.com/OpenHFT/Java-Lang/tree/master/lang/src/ma...](https://github.com/OpenHFT/Java-Lang/tree/master/lang/src/main/java/net/openhft/lang/io)).

I wonder what this sort of library will look like when Panama is done,
assuming it provides the sort of structured off-heap storage support I'm
hoping for.
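
To make the off-heap idea above concrete, here is a minimal sketch using only the standard `java.nio.ByteBuffer.allocateDirect` API. It is not how HugeCollections or Java-Lang do it (they go through sun.misc.Unsafe, as noted above); the class name `OffHeapLongArray` and its methods are hypothetical, chosen just for illustration. The element data lives outside the Java heap, so the collector never scans or copies it, although the small buffer object itself is still an ordinary heap object.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical fixed-size array of longs whose storage lives outside the Java heap. */
public class OffHeapLongArray {
    private final ByteBuffer buffer;
    private final int length;

    public OffHeapLongArray(int length) {
        this.length = length;
        // allocateDirect reserves native memory; multiplyExact guards against int overflow.
        this.buffer = ByteBuffer.allocateDirect(Math.multiplyExact(length, Long.BYTES))
                                .order(ByteOrder.nativeOrder());
    }

    public void set(int index, long value) {
        // Absolute put: writes at a byte offset without touching the buffer position.
        buffer.putLong(checkIndex(index) * Long.BYTES, value);
    }

    public long get(int index) {
        return buffer.getLong(checkIndex(index) * Long.BYTES);
    }

    public int length() {
        return length;
    }

    private int checkIndex(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return index;
    }

    public static void main(String[] args) {
        OffHeapLongArray squares = new OffHeapLongArray(1_000_000);
        for (int i = 0; i < squares.length(); i++) {
            squares.set(i, (long) i * i);
        }
        System.out.println("last element: " + squares.get(squares.length() - 1));
    }
}
```

A single direct buffer is capped at 2 GB because it is indexed by `int`, which is presumably one reason the libraries above use Unsafe with `long` addresses instead.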

------
jinmingjian
Azul's work on Java and the JVM is really great, but in the end it comes down
to its commercial focus.

Java has seen many improvements in recent years. I have summarized most of the
high-performance aspects of Java in one of my unofficially published projects:

[http://land-z.org](http://land-z.org)

The problem is that the evolution of the Java language itself is still run in
the style of a single company, and hacking on Java internals takes deep
expertise.

> Work is already underway for atomic value types, classdynamic, a new FFI
> layer, and a next-gen, fully programmable JIT (Graal).

Most of these are already in other languages. Never mind Java 9: can you tell
me which of them will come to Java 10? Is 2018 a fair guess? That's why people
say Java is slow.

~~~
pjmlp
> Azul's work on Java and the JVM is really great, but in the end it comes down
> to its commercial focus.

Just like any other programming language that tried to bring research into the
mainstream.

If it weren't for the money invested by companies into the C++, Java, .NET and
JavaScript ecosystems, we would most likely all be doing C.

All the better alternatives to C in its day died from lack of investment.

As for the waiting:

It is no different from C++ developers eagerly waiting for concepts lite and
modules, which might be available across major compilers (not only desktop
systems) around 2020 if all goes well.

Besides, how many companies in the world have to worry about petabyte JVMs?
Very few.

------
ryanobjc
This is really cool and awesome, but after seeing the slow pace of JVM
development over the last 6 years I really despair of doing major
systems/database work in Java. It just feels like we are 10+ years ahead of
where the VM is at.

~~~
pron
> the slow pace of JVM development

Just in the past couple of years we've had a terrific new GC and the most
powerful profiler I've ever seen (Java Flight Recorder) added to the JVM (the
latter, unfortunately, not to OpenJDK yet). In a year, we'll have modules, JIT
caching/AOT (possibly) and runtime control over JIT optimizations[1]. Work is
already underway for atomic value types, classdynamic, a new FFI layer, and a
next-gen, fully programmable JIT (Graal). Many of those enhancements are not
only ahead of other production quality offerings, but truly state of the art
if not groundbreaking. Even when I look hard I can't find another runtime
that's less than 5 years behind (and that's being generous).

[1]: [http://openjdk.java.net/jeps/165](http://openjdk.java.net/jeps/165)

~~~
needusername
> we've had a terrific new GC

I assume you're referring to G1. AFAIK the first paper on G1 was published in
2004 when they already had working code — in the past couple of years it has
merely stopped crashing on a regular basis. I remember times during JDK 8
development when they would fix one or two crashing bugs every week. In
addition it has a 200ms stop-the-world pause that you can't get around (you
can try to lower it but not into the 10ms area).
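
For anyone who wants to check pause behavior on their own workload, here is a rough sketch using the standard `java.lang.management` API. The 200 ms figure above presumably corresponds to G1's default pause-time goal, which can be changed with the real HotSpot flag `-XX:MaxGCPauseMillis` (a soft target, not a guarantee). The class name `GcPauseReport` and the toy allocation loop are hypothetical, and `getCollectionTime()` reports approximate accumulated collection time per collector, not individual stop-the-world pauses, so treat the numbers as a coarse indicator only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that prints accumulated GC counts and times for this JVM. */
public class GcPauseReport {

    /** Sums the approximate collection time (ms) across all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Toy workload: mostly short-lived garbage, with a small retained fraction.
        List<byte[]> retained = new ArrayList<>();
        for (int i = 0; i < 2_000_000; i++) {
            byte[] chunk = new byte[1024];
            if (i % 1000 == 0) {
                retained.add(chunk);
            }
        }

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("retained chunks: " + retained.size());
        System.out.println("total GC time (ms): " + totalCollectionMillis());
    }
}
```

Running it as `java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 GcPauseReport` and again with the default goal gives a crude comparison; GC logs give per-pause detail that this summary cannot.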

> and the most powerful profiler I've ever seen

JFR has been in JRockit for a long time. It took them four years to port it to
Oracle JDK. I doubt it will ever be part of OpenJDK and assume you'll always
need an Oracle Java support license if you want to run it in production. I
dare you to get a quote for an Oracle Java support license.

~~~
pron
What's your point? Are those technologies less groundbreaking or not years
ahead of anyone else because they've been developed for years?

My experience with G1 has been higher CPU consumption overall, but a very
significant reduction in worst-case STW pause times. It works especially well
for "session scope" objects: objects that are born together and die together,
but may live for a relatively long time (seconds to minutes). I am not aware
of any other runtime with GCs that even come close to the JVM's.

I am also not aware of any other profiler with such depth of reporting and
such low overhead as JFR.

The work on Graal will continue for years, but when it is ready, it will be
way ahead of the curve.

Innovation in the JVM is vibrant, and we're getting new, extremely powerful
features at a good pace. The fact that each of those features takes years to
implement well only proves the emphasis on big advancements (although minor
advancements, like the JIT control feature in Java 9, are also made
regularly).

~~~
needusername
> What's your point?

JVM development is slow. Things take anywhere from years to a decade. If we
want to have a GC that's better than G1 a decade from now, we have to start
now. G1 does not have an order-of-magnitude improvement in STW pauses left in
it. It may be good (or acceptable, depending on where you stand) today, but a
decade from now we need something better.

The best features in the world are useless if you can't use them because of
prohibitive licensing costs.

~~~
pron
If you mean development _latency_, then you're right (_throughput_ is quite
high). And there are people working on next-gen GCs and next-gen JITs already.

------
josho
What surprised me is that when scaling vertically they stuck with Intel. Have
IBM/Oracle really become that irrelevant?

~~~
gnu8
I think the idea is to bring this kind of scale to a wider audience at a lower
price point.

------
Corporate_Shill
tl;dr: our huge, highly-dynamic financial models are really expensive to run.
Please use your own resources and consider helping us get the Java ecosystem
up to a state where we could do something like this without reinventing Java
inside our niche. =)

