I think Java is seriously being downplayed by the HN crowd.
I was pretty much exclusively a Java programmer for the first decade and a half of my career, before moving to Node and TypeScript. I don't think I could ever go back at this point. Most importantly, this is the first time the entire code base (front end and back end) has been in the same language and toolchain, and I think it is the single most important thing I've seen in years for improving team productivity. The ease with which engineers can move between front end and back end is an incredible boon that shouldn't be underestimated.
JS also uses several times more memory, is slower, and has a terrible (nonexistent) threading model. Yes, you can run multiple instances of Node or whatever, but sharing objects between them requires message passing, which is orders of magnitude slower.
Until JS has a good threading model I'm never using it for backend. It's too expensive to use a bunch of single core machines to make up for it.
All of our devs use TypeScript and Java daily for the front end and back end; the only overhead is making sure the objects we're passing around match on both ends. The only advantage to using the same language for everything is hiring inexperienced devs that don't know both, IMO.
> the only overhead is making sure the objects we're passing around match on both ends
Seems like it is a big deal, based on the first sentence.
I've never been convinced of the single language argument. Sharing code between frontend and backend sounds good, but in practice there's little overlap... models have subtle differences, there's extra logic server side... All in all it's not very practical.
To me the most awesome part of a fully TS project is that you can use the same interfaces everywhere. If you take the time to define them for any input / output, everything is pretty much guaranteed to be sound.
> the type erasure problem is far worse than Java
Check out RunTypes, amazing for guarding any incoming data.
There is a big difference between "this API can blow up" and "it can blow up everywhere".
Don’t Node.js worker threads solve exactly this?
Which means you are second guessing the compiler, which provided you with such great guarantees.
You should do your best to minimize this kind of code.
It completely succeeded in that!
Java (well JVM) developers today can
- Write code on any of Windows/Linux/macOS
- Deploy that code on any of Windows/Linux/macOS
Not a lot of languages/platforms can claim this amount of success, let alone with such an amazing set of tools and ecosystem.
In practice, how many developers write C# on a non-Windows platform? I'd say a very, very tiny minority.
On the other hand, Java is being written on all platforms and being deployed on many as well.
Plenty do - think of deploying C# web services on Linux servers / containers.
I write C# on Linux and I am basically the only person I know who does that.
Whether that effort pays off, only time will tell.
You can also compile JavaFX apps AOT for iOS! It's called Gluon Substrate, check it out.
Java really does run on a lot of stuff, even if OpenJDK itself may not.
That's what people say, but I don't see that.
The same Java code is extremely portable across Windows, macOS, Linux, etc., on the server, CLI, and GUI app side.
It's just that its UI libs have historically been over-engineered shit like Swing.
Java owes 90%+ of its ubiquity and longevity to its success on the server.
Applets were very much a victim of various power struggles within the browser industry, combined with Sun's general lack of competence on the desktop - for instance, their online upgrade engines have always sucked. Though in fairness, nobody got that right until Chrome.
I wonder if it was because, at the time, the DOM API was so different and immature in each browser that unifying it into a single API was too big a task. The applet-as-black-box region for rendering was the easiest MVP. Of course, with hindsight, that turned out to be a piece of crap.
That crown properly belongs to the UCSD P-System, which was the Java of the 1980's. It was the same idea as Java - compilation to a bytecode which an interpreter ran. It failed because the interpreter performance penalty was too high.
Java also started out as an interpreter, which made it too slow. Steve Russell of Symantec invented a JIT for it, and like the lumbering Allison-engined P-51 getting a supercharged Merlin, it brought Java to life.
You could run p-system on a lot of machines - Apple II, IBM PC, TI-99/4A, PDP11... but how would you (and why would you) distribute your code across machines with such different storage media?
I transferred files from my PDP-11 (8" floppies) to my PC (5.25" floppies) using Kermit.
I coded in UCSD-P quite a bit (and played a few games written in it, Wizardry on the Apple 2 anyone?).
But UCSD-Pascal never reached a tiny fraction of the audience that Turbo Pascal did.
Are you talking about the host OS and the fact that Turbo Pascal was Windows only, as opposed to Pascal UCSD which was a VM?
I've done it daily for most of those 25 years, developing on Windows and deploying across multiple flavours of UNIX.
In my experience, the engineers that can do that and still produce solid code are very, very rare.
In retrospect, as an early adopter of languages like Scala/Groovy, I really like how Java just waited and watched for a few years to see what was good in those languages and let them make mistakes on the way to building something stable and then adopted a lot of things that made those languages fun.
Java from 11 onwards has been a great mix of developer productivity, a stable core (other than people writing trivial projects, most people want something that lasts for years without random bugs), portability, and great tooling (especially from the IntelliJ side, as well as from the debugging side).
I'd much more openly recommend Java as a loved language now than back in 2010 (though Elixir is the new and shiny project I'm playing around with right now ;)).
But I was wrong - the most popular language by far for the programming test was Python.
In high-traffic environments, that ignorance punishes you. I've always felt Java and the JVM are of the mindset that you need a Ph.D. to even understand how it works or how to configure it, and if you can't get it, then you're just a bad programmer.
You need to know if you're blocking threads, if there's memory contention, and if libraries you pull in are using the forkjoin common pool (which you're likely using as a default threadpool). And when something blows up, finding the reason (even for any of the above issues) is really tough. You can use flight recorder, heap dumps and gc logs all day, but good luck navigating it all unless you're a genius. I've seen too many devs end up shrugging and hoping the issues are transient.
Even figuring out proper threadpool usage isn't straightforward. Look at the number of concurrency abstractions just to model concurrency in your system: https://www.youtube.com/watch?v=yhguOt863nw. It's ridiculous.
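To make the common-pool hazard above concrete, here's a minimal sketch (class and helper names are my own, hypothetical) showing that `CompletableFuture.supplyAsync` silently runs on `ForkJoinPool.commonPool()` unless you hand it an executor you own - the pool that parallel streams and many third-party libraries also share:

```java
import java.util.concurrent.*;

public class PoolDemo {
    // Returns the name of the thread a supplyAsync task actually ran on.
    static String poolOf(Executor executor) throws Exception {
        CompletableFuture<String> f = (executor == null)
            ? CompletableFuture.supplyAsync(() -> Thread.currentThread().getName())
            : CompletableFuture.supplyAsync(() -> Thread.currentThread().getName(), executor);
        return f.get();
    }

    public static void main(String[] args) throws Exception {
        // With no executor argument, supplyAsync runs on the shared common pool
        // (typically a thread named like "ForkJoinPool.commonPool-worker-...").
        System.out.println(poolOf(null));

        // Blocking or heavy work belongs on a dedicated pool you own and can size.
        ExecutorService io = Executors.newFixedThreadPool(4);
        System.out.println(poolOf(io));
        io.shutdown();
    }
}
```

The design point: any library that also defaults to the common pool is silently competing with your code for the same threads, which is exactly the kind of contention that's hard to spot after the fact.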
Lots of large tech companies "seem" to "make it work." But if my experience is at all similar, they're just relying on a handful of Ph.D.'s to hold the hands of the rest of the company when it comes to troubleshooting.
Part of the reason I fell in love with Elixir/Erlang and the BEAM is that it provides a simple (actor) concurrency model (with a single concurrency primitive, a process) and guardrails (time-slice scheduling) to prevent libraries from shooting you in the foot. OTP's observer makes finding bottlenecks a breeze.
For the web, taming concurrency feels way more important than any cpu-crunching perf gains the JVM can give you. I'm too stupid for the JVM; I'll stick to tools that take away numerous categories of complexity and get me closer to mastering my system.
Huh? Java is used in huge-traffic backends, from HFT where minimal latency is required all the way to Google and Twitter scale.
If anything, Java is much faster and lower level than the typical languages used for huge high-traffic services -- Rails, Python, etc. -- never mind what's used in "low-traffic situations".
And, to that, I'd agree. I've built high-traffic stuff in Java. We built it, load tested it, and it was terrible. After multiple rounds of profiling, tweaking GC settings, tweaking threadpool sizes, rewriting things to be async, finding out that a client library wasn't reusing connections properly, etc., we finally had acceptable performance... that was still less than I'd gotten out of the box from similar, IO-bound services written in unoptimized Erlang.
I've seen the same troubles with alternatives, just without the amazing tools, featuresome standard library or widely-accepted conventions.
Erlang is amazing and places concurrency in a more central position. I'm hopeful Project Loom will greatly diminish the gap while carrying legacy code forward unchanged.
Part of the reason for its success has been its strong commitment to backward compatibility, so it's to be expected that it might accumulate many ways of doing things. Python wisdom tells us this is often a Bad Thing. 
I imagine Java's approach to concurrency and parallelism might be quite different if it were designed today.
Probably not, actually. Project Loom's initial goal was to rethink concurrency on the JVM from scratch. What they came up with was:
* Make threads really, really cheap
* Make thread locals work better (as scoped locals)
* Add a few Executor utilities to help you control sub-tasks better (structured concurrency)
It turns out that Java concurrency is pretty damn good already. It provides all the different paradigms you might want to explore, is efficient and well specified. Meanwhile they realised that many of the alternative approaches to concurrency are in reality trying to work around the high cost of kernel threads. When you make threads really cheap, a lot of the motivation for other approaches falls away and the existing set of tools in the JDK come to the fore.
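A minimal sketch of the "really cheap threads" point, assuming JDK 21+ where virtual threads are final (the class name and task are mine, for illustration): thousands of blocking tasks, one virtual thread each, written in the plain blocking style with no async machinery.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualDemo {
    // Runs `n` blocking tasks, one virtual thread each, and counts completions.
    static int runBlockingTasks(int n) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        // try-with-resources: close() waits for all submitted tasks to finish.
        try (ExecutorService ex = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < n; i++) {
                ex.submit(() -> {
                    try {
                        Thread.sleep(10); // blocking call: the virtual thread
                                          // unmounts from its carrier here
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    done.incrementAndGet();
                });
            }
        }
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 10,000 sleeping kernel threads would be expensive; 10,000 virtual
        // threads are cheap because they park without holding an OS thread.
        System.out.println(runBlockingTasks(10_000));
    }
}
```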
There are, however, a few things in Java's early concurrency support that make various things harder, including Loom, and we're having to put some extra effort into grappling with them.
Probably the most obvious is the fact that the language and VM require every object to have a monitor lock that can be synchronized on and waited/notified. In 1996 this was viewed as "Ooooh, sophisticated, building locking and concurrency support into the platform!" In recent years this has started to get in the way. Really only a very few objects are used as locks, but the _potential_ for every object to be locked is paid for by the JVM.
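A tiny illustration of that built-in monitor (class and counter are hypothetical): any arbitrary object can serve as a lock for `synchronized`, whether or not it was designed for it, because the capability is baked into every instance.

```java
public class MonitorDemo {
    // A plain Object works as a lock: its monitor comes from the JVM itself.
    static final Object lock = new Object();
    static int counter = 0;

    static int incrementFromThreads(int threads, int perThread) throws InterruptedException {
        counter = 0;
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    synchronized (lock) { // built-in monitor, no library needed
                        counter++;
                    }
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        // Without the synchronized block, the increments would race and the
        // total would usually come up short.
        System.out.println(incrementFromThreads(4, 1000)); // 4000
    }
}
```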
It also intrudes on Project Valhalla, which is trying to define "identity-less" inline types (formerly, "value types"). Ideally, we'd want all conventional objects and inline objects to be descendants of java.lang.Object. But Object has the locking APIs defined on it, and locking is intertwined with object identity. So, does Object have identity or not? There are some solutions, but they're kind of weird and special-cased.
Another issue is that the locks defined by the language/VM ("synchronized") are implemented differently from locks implemented by the library (in java.util.concurrent.locks). Loom supports virtual threads taking library-based locks, in that when a virtual thread blocks on a lock it will be dismounted from the real thread. This can't be done with language/VM locks, so there's an effort underway to migrate those locks to delegate to library code for their implementation. This isn't an insurmountable problem, but it's yet more work to be done, and it's a consequence of some of the original designs of Java 1.0's concurrency model.
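For contrast, here's a sketch of the library-based kind of lock, `java.util.concurrent.locks.ReentrantLock` - the flavor implemented in Java code rather than in the VM (the account example is invented for illustration):

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockDemo {
    // Semantically close to a synchronized block, but the lock lives in
    // library code (java.util.concurrent.locks) instead of the VM.
    private final ReentrantLock lock = new ReentrantLock();
    private int balance = 0;

    int deposit(int amount) {
        lock.lock();
        try {
            balance += amount;
            return balance;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    public static void main(String[] args) {
        LockDemo account = new LockDemo();
        account.deposit(100);
        System.out.println(account.deposit(50)); // 150
    }
}
```

The lock/try/finally shape is more verbose than `synchronized`, but in exchange you get features like tryLock, timed waits, and fairness - and, per the comment above, a lock the runtime can handle more flexibly.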
When I think about Java concurrency today I tend to think of java.util.concurrent or the JMM. Perhaps that's odd.
As for compatibility: why is Java 8 market share so high in 2020?
I believe that the reason why Java 8 is so popular is because there were a lot of backward compatibility problems with Java 9, compounded by the fact that Java 11 (the next LTS after Java 8; both Java 9 and Java 10 had a very short life) removed many APIs deprecated by Java 9.
All libraries that matter on the Java ecosystem are already on Java 11.
Worse is that Kotlin fanboys don't get that, without access to modern Java, their Java FFI is worthless, as all Java 8+ libraries on Maven Central will slowly become useless on Android no matter what.
Additionally, the language cannot expose JVM capabilities unless they add yet another backend.
So it will be stuff like value types, JNI replacement, proper generics, customized JIT and SIMD on the JVM, and plain old Java 8 on ART.
First by adding their own features to the JDK, and today simply by making Kotlin the main language to program in on Android.
Android is completely unshackled from Java today.
> Android is completely unshackled from Java today.
That is precisely my point.
Android has completely unshackled itself from Java development. Between its reliance on OpenJDK and Kotlin, it literally has zero dependencies on Java.
If the Android team plans to rewrite all of them in Kotlin, be my guest.
Maybe they will manage before Fuchsia goes live and Flutter wipes the floor, and then everyone will be doing Dart anyway.
Have you noticed how shitty all the languages designed at Google are?
Thankfully someone that was there since Java 1.0 days bought its rights.
GraalVM would have been killed at birth.
I am also looking forward to the complete Android development environment running on top of Kotlin/Native; otherwise it will be so funny having to port Studio and everything else that depends on the JVM to modern versions, while Android itself is frozen into a Kotlin ecosystem + Java 8 subset.
What are you talking about?
Android developers can use Maven Central like any other Java developers, without caring about what JDK these dependencies were compiled with, nor even whether they were written in Kotlin (most were not, obviously).
> I am also looking forward to the complete Android development environment to be running on top of Kotlin/Native, otherwise it will be so funny having to port Studio and everything else that depends on the JVM to modern versions, while Android itself is frozen into a Kotlin ecosystem + Java 8 subset.
Again, what are you talking about? Android development happily upgrades to the latest version of Kotlin without any trouble. Porting Studio? What? Do you even understand anything about any of these matters?
My point is simply that Android development today has zero dependencies on Java, but you seem to have a thick chip on your shoulder and are determined to spew toxic bile at Java and its ecosystem, while feeling some vague hate at Google in general.
I have zero interest in this debate, have fun tilting at these windmills.
Stating otherwise just proves that you don't know Java.
Android Studio and the complete Android toolchain run on top of a JVM implementation. As the JVM moves forward, JetBrains will be forced to update IntelliJ to take advantage of newer JVM versions, which will force Google to update their entire Android development environment.
Just for kicks, they are already being forced to do this.
Again, another proof of a total lack of knowledge regarding Android.
Toxic bile at Java?!?
Quite the contrary: I have loved Java since 1996; it is my third pillar alongside .NET and C++. What I completely hate is that Google played a Microsoft move with their flavor of Android Java (aka Google's J++), helped drive Sun into bankruptcy by cutting off the revenue stream from Java deployments on Android, didn't bother to rescue Sun, hoping that it would close doors without a hiss, now forces Java developers with its Android Java to create special versions of their libraries tailored to Android, and has a bunch of Silicon Valley fanboys supporting their actions that damage the Java ecosystem.
What were your apps on the Play Store again?
That's not really fair. The point of the Erlang language was its novel and opinionated approach to concurrency. Java wasn't trying to be like Erlang, it was trying to lure programmers by having significant similarities to C/C++.
But when it comes to remote debugging, and more specifically, a general "I want to understand what is happening in production", the ability to attach a REPL, alongside your tools, is amazing. I can insert a breakpoint, sure (if I for some reason built my production instance with debug info), but just as easily (without any debug info compiled in!), and more usefully, I can query actor state, mailboxes, etc, fire a message to a process to see what happens, etc...all the things you'd get with a REPL running locally in your dev environment, basically. Do stuff like query for internal state for a process, then call a function with it to see what happens to the data, all in isolation from the normal execution flow (since immutable data gives you a degree of safety to actually run that live code, with copies of the live data, and see what happens). I can even remotely load new code, if I want, effectively allowing me to deploy a hotfix without taking the node down. And I can do all of this in prod. All of this is, of course, super dangerous, but with great power etc etc.
If you write the service stateless, it's incredible what you can achieve with a couple of small instances of a default Spring Boot container.
Can you point to another language that has anything remotely comparable to `java.util.concurrent`? Also, Java is getting green threads by means of Project Loom.
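As a small taste of what `java.util.concurrent` gives you out of the box, here's a hedged sketch (the class and word-counting task are mine, invented for illustration) combining a ConcurrentHashMap, a CountDownLatch, and an ExecutorService:

```java
import java.util.concurrent.*;

public class JucDemo {
    // Shared state via ConcurrentHashMap (atomic merge, no explicit lock),
    // coordination via CountDownLatch, scheduling via an ExecutorService -
    // all from the standard java.util.concurrent package.
    static int countWords(String[] words) throws InterruptedException {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        CountDownLatch done = new CountDownLatch(words.length);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String w : words) {
            pool.submit(() -> {
                counts.merge(w, 1, Integer::sum); // atomic read-modify-write
                done.countDown();
            });
        }
        done.await(); // block until every task has counted down
        pool.shutdown();
        return counts.getOrDefault("java", 0);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWords(new String[]{"java", "go", "java"})); // 2
    }
}
```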
Not sure how to interpret this comment... If high concurrency and high performance matter, that is precisely where Java shines. The only other reasonable option would be C++, but it brings so much pain with it that Java is the way to go.
If traffic is low and performance doesn't matter (which is most sites), then sure, use whatever favorite scripting language.
Or am I misinterpreting the argument?
To be fair, docker is already a pain on my machine (using Fedora 32). I gave up on using docker at some point.
What's next for Java should be relatively little change; let a language like Kotlin, without all the baggage, be the way ahead on the JVM. There's a remarkably good compatibility story there, way better than basically any other language ecosystem out there; that's the real legacy of Java.
On the other hand, they need to have everything. In many other languages, it's common to just use a library written in a different language. For some reason, the foreign function interface of Java seems to have been designed to be hard to use, so instead of using an already existing library, Java developers tend to go through the route of "Rewrite It In Java".
Java is big because it has been around a long time and was decent when it came out. Java was the Go of its generation. Nothing radically new but wrapped up in a way people liked and was familiar with in large part due to the success of C/C++ prior.
Whatever achieves mass adoption after Java will also be behind the times by the time that happens, and as geeks we will have moved on to whatever is newer and cooler.
I am pretty neutral towards Java as a language. My biggest issue is with the software culture of over-engineering and complicating things. Java guys seem very dogmatic about how to design software.
Very early in Minecraft's development, people were already decompiling/modifying/injecting their own mods, and a lot of frameworks (Bukkit, Spigot, etc.) emerged to provide a common API for modding.
The large modding community arguably had a very positive impact on Minecraft's early success -- Although I don't have any quantitative metrics to reinforce that point, I fondly remember early Minecraft as having a relatively technical community that tinkered with the game as a sandbox for countless custom experiences.
Sadly, in the real world, it seems that Scala is mostly relegated to the Spark world.
Nothing competes with Java. Nothing. Because Java wasn't about destroying the competition; Java was about creating a reality that otherwise did not and could not exist. It was about imagining the "what could have been", and then creating that.
And somehow it is still widely used within Google, Facebook, Amazon, Twitter...
But for a service which needs to handle 500 concurrent requests at maximum and doesn't have to deal with TLS anymore it will be fine. And there's enough of those services out there.
A lot of the Java code in bigger companies is also written against older frameworks like earlier versions of the Servlet spec and J2EE. Those programs will also not make any use of async mechanisms, preferring a simple programming model instead.
Threads (and pointers, which you compared them to) are the abstraction at the hardware level - everything else has to be built on top of them in one way or another. Just because you have access to threads (or pointers) doesn't mean you have to make poor architectural decisions. I'd like to draw your attention to Doom Eternal which takes the thread pool model through to its logical conclusion. (https://twitter.com/axelgneiting/status/1241487918046347264) I hope you'll agree that's an example of meeting the needs of real software. (I'm sure it's not the first or only example of that approach, it was just on my mind because it came up recently.)
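The thread-pool-plus-small-jobs shape described above can be sketched with nothing beyond the Java standard library (the class name and the toy task are hypothetical): one pool sized to the hardware, many small independent jobs submitted to it.

```java
import java.util.concurrent.*;
import java.util.ArrayList;
import java.util.List;

public class JobPool {
    // One pool sized to the cores, many small jobs - a rough sketch of the
    // job-system shape, not a claim about how Doom Eternal implements it.
    static long sumSquares(int n) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        List<Future<Long>> jobs = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            final long x = i;
            jobs.add(pool.submit(() -> x * x)); // each job is small and independent
        }
        long total = 0;
        for (Future<Long> f : jobs) total += f.get(); // join the results
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumSquares(10)); // 385
    }
}
```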
Most of the highest performance server code across many industries is written in Java. So...
I wonder what you consider could possibly compete with Java in this space?
This is a different GC design from V8 and Go, which use older collector designs with higher overhead. They need to collect very frequently because their stop-the-world pauses get longer with more garbage. Java's new collectors are near constant time, even with terabytes of garbage, so it's much more efficient to wait until the heap builds up and collect much less frequently. Ironically, Java appears to use tons of RAM because it has better garbage collectors.
When you configure the Java GC to collect frequently, it turns out Java uses 2-3x less RAM than JS, and far less than Python, Ruby, etc. It does use more than Go, about 2x. But the point is it uses a lot less RAM than most other popular languages for web dev.
Unfortunately, "Java uses too much RAM" is used in defence of things like JS and interpreted languages, when in actuality it uses much less if you configure it to.
The various factors I see are:
- JVM overhead - 10-15 MB seems to be required just to get off the ground, which directly relates to the JVM replicating a bunch of OS functionality (an inevitable tradeoff for portability)
- Missing value types - any complex data structure ends up with big overhead from storing references
- Stack frames - if you need a lot of threads you need a lot of stack, and a surprising number of threads run
* JVM overhead - all these advanced optimizing compilers and GCs have non-zero footprint. They need RAM to run their code and they need RAM to perform their tasks.
* Compiled code cache - JVM keeps both the original bytecode and generated machine code in RAM.
* OOP overhead - each object has 2 or 3 words of overhead for the object header vs zero in languages like C or Rust. Even when you don't need dynamic dispatch or object locking, you pay for it.
* Inability to compose bigger structures other than by allocating separate objects and using pointers to reference them - these pointers need space and are not cheap on 64-bit architectures. This is going to probably partially improve with Valhalla, but at this point it is mostly guessing and it has been in development for years.
* No support for packed arrays.
* The smallest unit of loadable code is a class. If you needed a single function, the JVM loads a whole class containing it, and its required dependencies. This is not only bad for memory usage but also for startup time. Unless you pay a lot of attention, it is easy to load 80% of code in order to just display a help message (this is based on a real issue I worked on - I'm not making this up). Compare that to code in languages like C - the OS loads code that gets executed.
And back to GC - I agree the default settings are often to blame, but there is a reason JVM defaults to using RAM so aggressively. GC becomes very inefficient when it doesn't have enough "room". And low pause GCs achieve their low pause goals by trading throughput. Switch from parallel STW GC to G1 and your maximum sustainable allocation rate goes down by a few times.
Java doesn't have a "struct". If you want to represent an array of 64-bit signed integers, Java has you covered with its primitive arrays. But if you want to represent an array of anything more interesting than that, (say, a tuple of a double and a long), you have to serialize and deserialize those objects to and from parallel primitive arrays or byte buffers. Because if you do the language-natural thing and use an Object array, you're paying a huge price in memory: 4 or 8 bytes per pointer in that array, plus a 16 byte Object header on each Object. And, of course, those Objects are all individual allocations, not necessarily contiguous. That's a lot of overhead!
Of course, Java programmers concerned with memory usage don't put up with this. Lots of solutions have been devised; OpenHFT's Chronicle Values is one example I came across recently. But this feels like fighting with the language compared to how easy it is to be efficient with memory in C. If you told a beginner C programmer to make an array of compound objects, it's not unlikely that their array would take up exactly as much space in memory as it intuitively seems like it should: (8-byte double + 8-byte long) * 100 values = 1600 bytes in C, no fuss. If you asked the average Java programmer for that, they'd give you something that takes up 3 times as much memory. And because Java makes that behavior natural, it "uses too much ram". It doesn't matter as much that it's possible to convince it not to.
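A rough sketch of that workaround, with hypothetical names: the "natural" Object-array layout next to the parallel-primitive-array layout the comment describes, which stores exactly the payload bytes and nothing else.

```java
public class SoaDemo {
    // The Object-array way: each element is a separate heap allocation with
    // an object header, plus a pointer to it in the array.
    static class Point {
        final double d;
        final long l;
        Point(double d, long l) { this.d = d; this.l = l; }
    }

    // The "struct of arrays" workaround: two parallel primitive arrays hold
    // the same data as Point[] with 16 bytes of payload per logical element
    // and no headers or pointers.
    static double sumDoubles(double[] ds) {
        double s = 0;
        for (double d : ds) s += d;
        return s;
    }

    public static void main(String[] args) {
        int n = 100;
        double[] ds = new double[n];
        long[] ls = new long[n];
        for (int i = 0; i < n; i++) { ds[i] = i; ls[i] = i; }
        // (8-byte double + 8-byte long) * 100 = 1600 bytes of payload here,
        // versus roughly 3x that for a Point[100] once headers and
        // references are counted.
        System.out.println(sumDoubles(ds)); // 4950.0
    }
}
```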
Plus many people overlook just writing a couple of native methods and be done with it.
Somehow this is how people write "Python".
Also Valhalla and Panama are around the corner.
I think this information is a bit dated. Go has a highly advanced concurrent collector with very low pause times (~1-100μs). V8 also has incremental marking, concurrent marking, and parallel compaction. Its pause times are more like 100-1000μs. V8's GC has been tuned more and more to save memory (i.e. smaller heaps) because people have so many tabs open these days.
Uh, most of the rant about Go's GC below is out of date; they made some huge improvements from 2016 to now. I'm leaving it up because someone replied to it.
Unless something has changed in the last year or two, Go's GC is similar to Java's old CMS collector which is being deprecated.
Go's GC is non-generational and non-compacting, lacking both hallmarks of modern GC algorithms. It's not a modern "moving" collector. It also has several stop-the-world phases. It's basically a design from the 70's. The GC uses old, simple algorithms because the team was under a time crunch when it was developed. This may have changed, but that's how it was in the 2018-2019 timeframe.
The pause times are short because the GC runs very frequently. It has to, because performance with large amounts of garbage is quite bad. This results in significant GC CPU overhead.
I don't know much about the V8 collector besides that it is generational and compacting, so more modern than Go's. But it's still a "stop the world" design. In that regard it's closest to Java's old CMS collector.
Java's new collectors, ZGC and Shenandoah, both have near-constant-time stop-the-world phases. You can collect terabytes of garbage with only a few milliseconds of pause. In V8 or Go, this would be many seconds of pause time, as the mark phase is "stop the world" in both.
You can find benchmarks that show one way or the other, but in badly behaving or allocation-heavy applications, Java's new collectors or the older G1 will perform far better. V8 and Go are dishonest about their GC performance by showing average pause times with high collection frequency. The important GC cycles are the long ones, so you really want to measure worst-case pause time under load.
Under heavy load Go's design falls over. It's not compacting, not generational, and the mark phase pauses the application. IMO, it's just not a good GC. V8 is better - it is generational and compacting - but its mark phase is still STW. Java's ZGC isn't generational but, importantly, the mark and sweep phases don't stop execution. No matter how big your heap is and how much garbage there is, your GC pauses will be short.
> ... In V8 or Go, this would be many seconds pause time as the mark phase is "stop the world" in both.
Like I said before, your information is outdated. V8 has both incremental and concurrent marking. I even mentioned it in my comment, but apparently you didn't read that either. V8 only stops the world for semispace evacuation and compaction. It doesn't compact the entire old generation at once, but decides on a per-page basis.
For Go's GC, I am going by public information presented by one of its primary designers, Rick Hudson, who has since retired.
You can argue with his slides if you want. https://blog.golang.org/ismmkeynote
Java's new GCs sound fantastic! It's great for the field in general. However, I would encourage you to spend less time misrepresenting other people's work and making up numbers.
For Go, I'm going to be a bad HN user and not read the whole article. Sorry, it's just too long for this time of night. It does appear that my understanding of Go GC is out of date. There's been many improvements in the last couple years. Some strange behavior due to not enough knobs to configure GC, but it appears to have a near constant GC pause? https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i...
I'm annoyed that Google doesn't offer much benchmarking data for V8. Huge articles about improvements made, with a single benchmark image. And they didn't use standard benchmarks for either, so it's unclear what they're even benchmarking. The Go slides you linked include benchmarks from some guy's production server he tweeted images of, plus a bunch of standard benchmarks, but they only show % throughput improvement, and no pause times.
Well unfortunately you are still misunderstanding, so let me be more precise so we are talking about the same thing. V8 uses incremental marking (i.e. splitting mark work into smaller chunks and interleaving those chunks with mutator time) as well as concurrent marking (i.e. multiple parallel collector threads marking in the background, concurrent with the mutator). Not mentioned in the article, but sweeping of pages is also incremental (i.e. dead space reclaimed on-demand when free lists run empty) and concurrent when idle (i.e. in the background). So the statement "V8 still uses STW for mark sweep" is just wrong. Like I said before, V8 only stops the world for semispace scavenges (fast, < 1ms) and compaction (slow, ??ms), but compaction is less frequent than mark/sweep, which is incremental and concurrent.
You also misunderstood what is reported here. That 50ms main thread marking time is cumulative, meaning those 50ms are spread over the entire garbage collection cycle, split up into small increments so that the mutator (main thread) is not stopped the entire time. It's explained there in the text and illustrated in the second-to-last diagram.
> quite bad compared to ~5ms or less
Again, it is not 50ms pause, it's 50ms work, split into much, much smaller incremental pauses, typically less than 1 ms each. That number is not presented in your linked article but is pretty typical. The V8 GC needs sub-millisecond pause times because it has a soft realtime requirement in that it may end up on the critical path for frame rendering (60fps = 16.6ms).
> For Go, I'm going to be a bad HN user and not read the whole article.
FTA "...The August 2017 release saw little improvement. We know what is causing the remaining pauses. The SLO whisper number here is around 100-200 microseconds and we will push towards that. If you see anything over a couple hundred microseconds then we really want to talk to you and figure out whether it fits into the stuff we know about or whether it is something new we haven't looked into. In any case there seems to be little call for lower latency. It is important to note these latency levels can happen for a wide variety of non-GC reasons..."
TLDR: if you see pause times of more than a couple hundred microseconds, call the red phone.
Also, please note, I am just trying to provide accurate information about the collectors I do know about, designed by people I work(ed) with. I don't know enough about ZGC or Shenandoah to confidently assert anything about their performance characteristics, but based on what I read I am actually very excited to see them make it into production. I consider advances in GC to be a good thing for everyone overall, and would encourage you to be more open to learning the advantages and disadvantages of the various systems, with less derision and without trying to pick sides.
There are times when I'd happily trade more frequent GC pauses for a smaller per-process memory footprint. How do you find a reasonably small Xmx that doesn't lead to OutOfMemoryError exceptions?
That's tough to figure out. In new versions of Java, I think 14+, if you use the ZGC collector it will return unused memory to the OS. Memory options vary depending on the collector, but new versions of ZGC support a "soft max" heap size and uncommit. Together they might be close to what you're looking for https://malloc.se/blog/zgc-softmaxheapsize
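For example (a sketch; `app.jar` is a placeholder, and on JDKs before 15 ZGC also needs `-XX:+UnlockExperimentalVMOptions`):

```shell
# Hard cap at 4 GB, but let ZGC collect and uncommit aggressively
# down toward 1 GB whenever the application doesn't need more.
java -XX:+UseZGC \
     -Xmx4g \
     -XX:SoftMaxHeapSize=1g \
     -XX:ZUncommitDelay=60 \
     -jar app.jar
```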
I should mention the GC situation was worse until the last few years. Until ZGC and Shenandoah came around, Java still didn't collect frequently, but when it did there were long pauses. This is what V8's and Go's collectors were designed to avoid: they have more overhead from collecting frequently, but low pauses. With the new Java collectors you get the best of both worlds.
You can set heap ratios and such for older collectors to decrease Java memory use with those, but IMO you're better off using ZGC and uncommit these days
Indeed. That fact obviates this option in most cases. You have to spend time tweaking obscure, unstable knobs (the X in Xmx means Oracle is free to alter its meaning at any time) and risk either a) serious failures in production or b) poor results, because the conservative choices necessary to avoid 'a' achieved little improvement and you wasted your time. The real world for most enterprises is a vast herd of communicating components, and toying with GC switches multiplied by N things is a nonstarter.
So while you're technically correct that excessive memory use by conventional JVMs is "not always true," in practice you are wrong. That reality comes with a real cost that appears on a real bill every single month.
Go's approach of minimal knobs leads to unfixable problems in production. Java gives you more options to tune the GC for your use case.
I didn't bring up Go, but since you did the thing I see is that Go -- a much younger language -- is going places Java never has, or did so only haltingly. Caddy is a case in point. Here is Go taking on nginx, haproxy, Envoy, etc.
The people that once imagined using Java for such things have retired or moved on to other battles. No one seriously ponders attempting 'systems' tasks with Java any longer; that whole space was ceded to more efficient languages. My opinion is that Java's poor efficiency -- a big part of which is its excessive memory consumption -- is the reason for this.
That's my opinion. What I know for fact is that today, when people are making design decisions about new services and their deployment, Java is a problem; it is understood that anything implemented in Java is going to sort right to the top of the list of memory pigs in the cluster, and you can only afford so many of those.
This feels like G1GC erasure. I'll also say we've tried out ZGC, and while pause times were low, it had a huge CPU overhead and the performance of our application was notably worse, so we went back to G1GC. We're still on Java 11, so maybe we'll see some magic when we eventually try the newer versions.
Give it a try.
Newer versions of ZGC support class unloading and have some other performance optimizations
In my limited tests I never saw a GC pause over 5ms. I was basically hammering a Spring Boot application with an HTTP load tester.
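If you want to check pause times yourself, unified JVM logging (JDK 9+) records every pause; `app.jar` below is a placeholder for your application:

```shell
# Log all GC events, including individual pause durations, to gc.log.
java -XX:+UseZGC -Xlog:gc*:file=gc.log:time,uptime -jar app.jar

# Then inspect the individual pause lines:
grep Pause gc.log
```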
While not-pausing is generally a huge improvement (after several decades of GC development), now what about thrashing of CPU caches?
On another note, I'm not aware of other freely available GCs in other languages that are able to easily scale to multi-GB/TB scale memory usage. A while ago, I benchmarked an open source key/value golang project and it performed miserably when it reached GB level memory usage.
By aggressive you mean they actually do that now, right? As far as I know, before ZGC no GC did that, and they're still back-porting that feature to G1, right?
Edit: I'm actually quite pleased with ZGC. I have the Eclipse language server use it, and my editor's memory usage on average is so much lower.
GraalVM is showing enormous promise in this area, alongside efforts like Project Valhalla.
I think the emergence first of microservice and then of FaaSes has lit a fire underneath OpenJDK folks and others in the ecosystem.
Although Java applets didn't pan out, they gave people a glimpse of the future, paving the way for Shockwave, Flash and the rich interactive web applications that dominated the 2000s. As Java pivoted to the server, it also ushered in the next generation of enterprise web applications.
Happy 25th, Java! From a language many first experienced via scrolling web tickers to a rock-solid server-side platform that went on to dominate the enterprise. Java will remain ubiquitous for many years to come -- even if many don't even know it's there.
When googling tutorials, I see the same material I found 12 years ago. A lot must have happened since then.
What's a good resource to learn Java for somebody who already knows how to program? I'm interested in ecosystem, tooling, best practices, common pitfalls etc.
Yes. As others have said, 'Java is stable', which means the old stuff still works. Which in turn means that a lot of people are still using it and still writing blog posts about it. But much of that old material is obsolete, or needlessly complex, or just 'lesser than': it doesn't support certain nice features, or supports them very badly, or has other significant downsides -- 25 years of experience does lead to insights, after all.
20 years ago, you loaded your JDBC driver with `Class.forName`. You _STILL_ see this in many examples (and it hasn't been necessary for 15+ years).
These days, you:
* Use a connection pool like HikariCP.
* Consider raw JDBC as basically nuts as far as an API goes, and you use JOOQ or JDBI. Or JPA/Hibernate of course, if you don't want SQL/want DB independence and don't think you'll need to performance-tweak queries too much.
* You use serializable transaction isolation and toss _all_ the code that interacts with the DB into a lambda, so that the framework can handle retries for you.
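The transaction-body-in-a-lambda pattern from that last bullet can be sketched like this -- the names are made up for illustration, not any framework's real API, and the JDBC work is faked so the retry logic stands alone (real frameworks such as Spring or JDBI manage the connection and commit/rollback for you):

```java
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicInteger;

public class TxRetry {
    @FunctionalInterface
    interface TxBody<T> {
        T run() throws SQLException;
    }

    // Retry only when the DB reports a serialization failure (SQLState 40001),
    // which is expected and recoverable under serializable isolation.
    static <T> T inSerializableTx(TxBody<T> body, int maxAttempts) throws SQLException {
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return body.run(); // a real impl would BEGIN/COMMIT around this
            } catch (SQLException e) {
                if (!"40001".equals(e.getSQLState())) throw e; // not retryable
                last = e; // a real impl would ROLLBACK and maybe back off here
            }
        }
        throw last;
    }

    public static void main(String[] args) throws SQLException {
        AtomicInteger calls = new AtomicInteger();
        // Simulated body: hits a serialization conflict twice, then succeeds.
        String result = inSerializableTx(() -> {
            if (calls.incrementAndGet() < 3) throw new SQLException("conflict", "40001");
            return "committed";
        }, 5);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```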
And that's just DBs. As a general trend:
Libraries tend to wax and wane. Right now Spring is _very_ popular. JSP is the kind of outdated crud that is just a straight up 'do not use this right now' (even if it still kinda works). For date stuff, use java.time. Libraries in general are more focused on configure-via-Java-code, and dip more into code generation and annotations (example: JOOQ). You don't use the JSONObject API; you use Jackson or Gson. The list is very long.
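As one small example of the java.time point, the modern API is immutable and far less error-prone than java.util.Date/Calendar:

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class ModernDates {
    public static void main(String[] args) {
        // Immutable value types replace the mutable java.util.Date/Calendar.
        LocalDate release = LocalDate.parse("1996-01-23"); // Java 1.0 release date
        LocalDate anniversary = release.plusYears(25);     // returns a new value
        System.out.println(anniversary);
        System.out.println(ChronoUnit.YEARS.between(release, anniversary));
    }
}
```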
I have no particular advice on how to know all this stuff as someone not familiar with the (modern) java ecosystem, though. Just pointing out that 'stable' doesn't translate to 'not much new in the past 15 years'.
Try kotlin instead of Java, and if you must use java, read the book Effective Java.
If you use kotlin, check out ktorm for db access, http4k for http api’s
Try Scala instead of Kotlin, it's much more powerful, and you can safely avoid the mad Scala libraries jam-packed with symbol infix notation.
I'm the perfect case of what you said re. old devs -- I've been using various JVM languages for ~15 years, but still didn't think of replacing JDBC :) Will check those out!
OpenJDK is the name of Oracle's (one and only) Java implementation project (take a look at the logo at http://openjdk.java.net/). Oracle JDK is the name of the commercially supported product built from OpenJDK, and Oracle also distributes the JDK under a 100% free license (http://jdk.java.net/).
While OpenJDK has been the open-source part of the Sun/Oracle JDK since 2007, Oracle recently completed open sourcing the entire JDK, so that there are no more paid features. The JDK used to be part-free and part-commercial, and now it is completely free; you only pay Oracle -- or other companies -- for support if you want it. Other companies contribute to OpenJDK as well, but Oracle still contributes ~90% of the work, and all OpenJDK builds by all vendors are licensed by Oracle. So while you absolutely don't need to pay Oracle (or Sun, as you did before) for using the JDK any more now that it's 100% open, you should at least know that Oracle is the company that (primarily) funds and develops OpenJDK.
(I work on the JDK, i.e. OpenJDK, at Oracle)
The real damage that Oracle caused by their Android lawsuit, and by their JDK licensing scheme change, will reverberate for a long time.
> and by their JDK licensing scheme change
The JDK licensing change was that Oracle changed the JDK from part-commercial part-open to 100% open for the first time in Java's history. On the commercial side, the change was from part-upfront, part-subscription to just subscription, which cut the price for customers by a factor of 5, I think.
What's important to remember about Java is that it's huge, and many companies make money off of it, and so companies have an interest in creating FUD over Oracle's involvement. I can't speak for other parts of the company, but there's near consensus among Java users that Oracle has been a better steward of Java than Sun, both in terms of technical investment and licensing.
Or Dropwizard if you want microservices that are more lightweight and performant than Spring.
Or Vert.x or Play if you want to code reactive microservices.
> No one does inheritance anymore composition's all the rage.
Hmmm, that is an oversimplification IMHO. When OOP first gained popularity, there was a lot of emphasis on inheritance. Inheritance got misused, resulting in systems that became too rigid to change over time. The proverbial pendulum has swung the other way, resulting in this sentiment of "no one does inheritance anymore", but inheritance is still very powerful and productive provided that you learn how to use it correctly. Think of inheritance as an advanced feature best left for more senior engineers to use.
Reactive is probably a mistake with Loom looming around the corner.
It might not be the trendiest, but it's definitely the most popular (as in "in demand").
Modulo usual caveats about not promising anything and safe harbours and forward-looking statements, VMware is interested in Spring efficiency up to a very high level in the org chart. Watch this space.
Disclosure: I work for VMware.
Does maven still have issues with snapshot builds and classpaths?
I've managed to avoid Spring, everyone I know that's worked with it complains it's difficult to work with.
For microservices I like simple, single-purpose pieces like sparkjava.
What do you mean by this?
Startup times, or developer onboarding?
Because I've had this conversation before if the latter. To me, Spring seems like the last gasp of "Enterprise" Java. Too much is implicit and obscure (aspect-oriented programming is an anti-pattern, IMHO), too much is configured (yuck, XML).
Each to their own I guess.
Java is fundamentally a slow adopter of new techniques (it just got lambdas in JDK 8), but a lot of the Google libraries fill in the gaps.
Note that if you are learning Java for Android development, that's a whole different sub-discipline. In that case I recommend the Android tutorials since most of the work is dealing with the Android SDK.
Interesting use of the word 'just'. At the risk of making readers feel old... Java 8 was released _6 and a half years ago_.
It could be anecdotal, but I've found in practice vendors and companies are conservative about their JDK upgrades. I haven't seen anything prior to JDK 6 in a while, but I don't think the upgrade cycle is as fast as say, python minor version upgrades.
As for guice, my preference for reflective runtime injection is Weld since it's the standard reference implementation.
There are a lot of features that have been subsumed into the JDK, and you should usually prefer the JDK implementation where available. Guava has deprecated the redundant functionality, so if you pay attention to your IDE you will be fine.
I did forget about the caches though! Good call.
Protobuf (another Google Java product) also made big breaking changes in libraries between 2 and 3.
The relatively recent adoption of the 6 monthly release cadence is helping a lot - particularly with the small feature additions that used to get stuck behind the release train.
That said, for most orgs, the biggest changes you might see would be in libraries and frameworks used. There will most likely be less XML, better build tools and more modular library usage than 12 years ago.
Regarding the "bite the bullet". I was also a bit afraid, but Java is a great language. Yes, it's a bit verbose but that's compensated by its amazing tooling, specially IntelliJ IDEA.
- The current release of Java Standard Edition (SE) is 15, but many applications are still using version 8 or something in between.
- The enterprise version (EE) of Java is now called Jakarta EE and is part of the Eclipse Foundation, with a focus on cloud native deployments (Kubernetes, Docker, ...).
- naming schemes: Java 1.8 is just Java 8; since Java 9 the "1.x" prefix has been dropped entirely.
- the versioning cadence of Java has changed with version 10 (in 2018) from "every few years" to "twice a year", that's how we got to version 15 in such a short period of time.
- The officially recommended way to build Android apps today is Kotlin, but there is still support for Java.
- The JDK is used for developing apps in Java, the JRE just to run those apps. The two seem to be converging: everybody just downloads the JDK. The JDK contains a debugger, a shell, a document generator, a compiler, etc.
- OpenJDK: the project containing the Java source. It is available in two builds, OpenJDK and Oracle JDK (the difference seems to be in commercial support from Oracle, not in code/functionality). Oracle is the primary contributor to OpenJDK.
Not really. That’s the point of Java. It’s stable.
Check the Wikipedia page on Java history to see the handful of new features then just look for tutorials on those that you need to use. Most of these features are things you can pick up as you go.
1) It finally got anonymous functions in Java 8, so you no longer have to use anonymous classes and complicated design patterns as substitutes.
2) The distribution model changed from targeting a preinstalled JRE to bundling your own runtime with jlink and jpackage (i.e. the same as native applications and .NET Core).
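To illustrate point 1, here's the same sort written pre-Java-8 style and with lambdas (the names are chosen arbitrarily for the example):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class Lambdas {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>(List.of("Gosling", "Steele", "Joy"));

        // Pre-Java-8 style: an anonymous class just to pass one method.
        names.sort(new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return a.compareTo(b);
            }
        });

        // Java 8+: a lambda, or even a method reference / built-in comparator.
        names.sort((a, b) -> a.compareTo(b));
        names.sort(Comparator.naturalOrder());

        System.out.println(names);
    }
}
```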
For Spring, Jakarta EE, Micronaut or Quarkus, the docs on their respective websites are enough.
HotSpot is a pretty great codebase actually. It's very easy to read compared to the CLR. The issue with it isn't that it's bad code or fragile, it's just that it's very complex, but they're reducing complexity over time by removing obsolete optimisations (obsolete, or so they argue).
The SVM codebase is also nice, but it's a very different model. Over time the codebases may merge, as Graal the compiler gradually replaces C2. But that could easily take a decade.
Java and GraalVM come from two different parts of the company. They aren't done by the same people.
No need for xcode on os x or gcc on Linux, or clang on bsds?
A significant reason for Java to be relevant even today is Android.
But Java is the majority language on the back end too.
Except for periodically updating my Java AI book  (5th edition was released July 2020), I don’t much use Java because most of my customers want to use a Lisp language.
Where should Java go now? I think both OpenJDK and also Oracle are doing a good job adding new features. I would vote for faster startup time; keep improving language conciseness; better data initialization literals.
Just a few months ago I dusted off an old project from 1997, loaded it up in IntelliJ IDEA, built it, and ran it. It worked! And that's Java's best feature, it's long-term language and library stability. I worry that it is at risk now with Oracle's new 6-month release cycle.
Funny how that never worked out. Even the few Java desktop apps that don't look like 30-year-old SunOS apps (IntelliJ is probably the best-looking Java app) have to have substantially different versions for each platform.
A few people in the research groups tried re-writing a few apps in Java, like Acrobat Viewer, but nothing ever came of it.
It's rare for any software to achieve a kind of stable sustainability that allows it to continue development for as long as 25 years.
It gives me hope for some of the big open source projects - hope that maybe, eventually, all the bugs will be fixed - even if it takes decades.
Java is like your wife/husband/spouse of many years. It's not sexy, and you probably don't get much of a thrill doing things with it. But it's dependable and reliable. And aren't you where you are now thanks to it?
Newer languages.. yeah they're sexier, more fun to play with, make people think you're cool when with them, but they might end up wasting your time :)
Sorry to hear about your marriage going bad. Maybe some counseling will help?
Based on the current state of affairs, Oracle and the JVM seem very far away from mainstream ML.
MXNet also makes deep learning a first-class citizen in Java, but yes the research community is firmly entrenched in Python atm.
I see JDK languages as being in a decent spot for deploying ML/DL, and maybe an "emerging" language for training DL models.
Folks like Jeremy Howard have increasingly been expressing growing pains with Python, and TensorFlow has been looking for something with a better type system, like Swift.
If I had to take a guess, I'd say that Python is likely going to be sharing the ecosystem with a language with a better type system and performance like Julia or something on the JDK.
Instead, Julia should really be first to mind when considering that question. Although I might expect the author to respond with something about the mysteries of 'production'.
The book below provides a nice overview of using Java for DS and ML