JVM tuning is very touchy and highly specific to the application. Why not let the computer figure it out for itself?
"Groningen is a framework to run automated experiments on servers that run on the Java Virtual Machine (JVM) such that the most optimal JVM settings outcome can be reached with the least amount of human effort and time while maximizing safety."
I went through the code and some of the info on GitHub (1), and it is still not clear to me how it tests all the different JVM settings for a specific application.
Having spent a fair bit of time optimizing JVM GC, I understand that there is a bit of dark magic to it. That said, with a little bit of work you can typically find the right configurations for most use cases.
This is due to a wide variety of GC options and a long history of iterating on best practices. I'm curious how well newer languages that focus on performance but offer GC (Go, Rust, etc.) have dealt with this. Have they just jumped to some "best of breed" GC configuration set? Or do programmers in these languages just get left without all the lessons learned on the JVM?
I'm honestly curious, what is the state of the world in non-JVM languages and GC optimization?
In Rust, garbage collection is just a library type (akin to the Boehm GC for C++). And because the language expects to get by 99.9% of the time without any GC at all, other facets of the language design will necessarily restrict the strategies of GC that can be supported. For example, Rust allows you to store pointers directly into data structures, which means that a moving GC would need to be able to know about all references to any data that has been moved and update all pointers in memory appropriately. Rust isn't a language that's willing to pay that sort of dynamic bookkeeping cost, so the GC in Rust's stdlib will probably never be moving or compacting.
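To make the interior-pointer point concrete, here is a minimal sketch (the `Pair` type and the `points_inside` helper are made up for illustration, not part of any Rust GC). It shows an ordinary reference whose address lies inside another value's storage, which is the kind of pointer a moving collector would have to find and rewrite.

```rust
// Hypothetical two-field struct, used only for illustration.
struct Pair {
    first: u64,
    second: u64,
}

/// Returns true if `inner` points somewhere inside `outer`'s storage.
fn points_inside(outer: &Pair, inner: &u64) -> bool {
    let start = outer as *const Pair as usize;
    let end = start + std::mem::size_of::<Pair>();
    let addr = inner as *const u64 as usize;
    addr >= start && addr < end
}

fn main() {
    let pair = Pair { first: 1, second: 2 };
    // A plain reference into the middle of `pair`, with no header or handle.
    let inner: &u64 = &pair.second;
    assert!(points_inside(&pair, inner));
    println!("first = {}, interior pointer = {}", pair.first, points_inside(&pair, inner));
}
```

If `pair` were relocated by a collector, `inner` would dangle unless the collector knew about it and patched it, which is the bookkeeping cost described above.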
There is scope for moving/compacting GC, as long as creating a reference from a GC pointer is registered and pins a GC allocation in place until the reference disappears (i.e. it will theoretically only be a cost with GC pointers, not anything else).
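A rough sketch of that pinning idea, assuming a made-up `GcSlot` type and `Pinned` guard (neither exists in Rust's standard library, and this is not how any actual Rust GC is implemented): taking a reference out of the GC-managed slot bumps a pin count, dropping the guard releases it, and a hypothetical collector would only relocate slots whose count is zero.

```rust
use std::cell::Cell;
use std::ops::Deref;

/// Hypothetical GC-managed slot: the value plus a count of live pins.
struct GcSlot<T> {
    value: T,
    pins: Cell<usize>,
}

/// Guard handed out when a plain reference is created from the GC pointer.
struct Pinned<'a, T> {
    slot: &'a GcSlot<T>,
}

impl<T> GcSlot<T> {
    fn new(value: T) -> Self {
        GcSlot { value, pins: Cell::new(0) }
    }

    /// Registers the borrow, pinning the allocation in place.
    fn pin(&self) -> Pinned<'_, T> {
        self.pins.set(self.pins.get() + 1);
        Pinned { slot: self }
    }

    /// A moving collector would check this before relocating the slot.
    fn can_move(&self) -> bool {
        self.pins.get() == 0
    }
}

impl<'a, T> Deref for Pinned<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.slot.value
    }
}

impl<'a, T> Drop for Pinned<'a, T> {
    /// Unpins the allocation once the reference disappears.
    fn drop(&mut self) {
        self.slot.pins.set(self.slot.pins.get() - 1);
    }
}

fn main() {
    let slot = GcSlot::new(String::from("hello"));
    assert!(slot.can_move());
    {
        let r = slot.pin();
        assert_eq!(r.len(), 5);
        assert!(!slot.can_move()); // pinned while `r` is alive
    }
    assert!(slot.can_move()); // guard dropped, slot may move again
}
```

The cost lands only on code that goes through `pin`; values that never touch a GC pointer pay nothing, which matches the "only a cost with GC pointers" remark.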
Also it's worth noting that Rust does not have a GC yet, and I don't believe anyone has really got in there and iterated on possibilities yet. (Graydon, Felix (pnkfelix) and I have all done some experimentation with it, but all proof-of-concept/very basic, not really bullet-proof/production-ready things.)
Go's GC has not been terribly fast compared to more advanced ones like Java's, but it's getting better and it's also a LOT easier to keep from throwing off so much garbage in Go. I have production Go services that run in < 10MB of space. I don't think I could run even trivial Java servers in that space.
Sure, I have quite a few programs that run in JVMs with 64 MB of RAM (the default). I could probably go lower for those, but I never even looked into it. Java programs can use very little RAM if designed appropriately for the problem; don't forget Java 2 ME.
In Rust we usually just don't use GC; the language doesn't need it for safety. It'd be nice to offer one though.
Note that we finally have a moving/compacting GC API which uses SpiderMonkey's GC for Rust objects as part of Servo, so you can view that as a Rust GC of sorts if you like.
As of 1.3, Go has a fully precise garbage collector with a parallel mark phase (not sure about the sweep). I imagine the next step will be a generational collector; however, the extensive use of internal pointers may prevent further development into a moving and/or compacting collector.
"Groningen is a framework to run automated experiments on servers that run on the Java Virtual Machine (JVM) such that the most optimal JVM settings outcome can be reached with the least amount of human effort and time while maximizing safety."
https://code.google.com/p/groningen/