If you're interested in these topics, I strongly suggest you take a look at the source code of Facebook's Presto. Unfortunately, there is no documentation of Presto's internals, but the developers implemented and extensively tested these techniques in order to achieve C/C++-like performance in Java. The slice library (Presto's underlying memory management library) might be a good place to start: https://github.com/airlift/slice
Whenever I read something like this, I have a bad feeling about it. They are trying to get "C/C++" performance in Java by bypassing the JVM's memory management and garbage collection. They have even essentially reimplemented malloc() to manage their own heap. Why not just ditch Java itself?