An Opinionated Guide to Modern Java, Part 3: Web Development (paralleluniverse.co)
226 points by dafnap on May 15, 2014 | 163 comments

It's funny. An "Introduction to Modern Java Web Development" sounds like a primer for the Play Framework, Scala, and Akka. Each of the examples looks like the starter documentation for Play, in that you've got your JSON manipulation, routing, connecting to a database, DI, actors, etc.

Java devs -- you seriously owe it to yourselves to spend the time investigating and ramping up on Scala and Play. This is where the future of Java web app development is being driven from, by Typesafe and the Play / Spray / Akka open source developers. You do yourself a great disservice by sticking with Spring, Hibernate, JBoss, and the old standbys.

First of all, your attitude is off-putting and condescending. You don't convince people by telling them "You guys are living in the past; look how I do things, you should do the same".

Please enough of that, especially coming from the Scala community. Let's act as professionals and judge tools on their merits instead of instigating flame wars.

Second, I'm doing both (web Java at work, web Scala in my spare time) and in my experience, there is really no clear winner. What's especially interesting is that I can find about the same number of positive things to say about the Scala tool stack (Scala/Typesafe platform/Akka/Play) as negative things about each part of it. Every time I'm happy about something in the Scala world, I find something I'm not happy with that counterbalances it (tooling, slowness of template recompilation, the unproven actor model, Play's arguable step backward in v2 compared to v1, etc...).

Java is impossibly verbose and has a very limited type system compared to Scala but man... do I develop things quickly with it. There is close to zero friction to get from nothing to something workable, maintainable and fast. And the tooling is top notch, the environment and compilers are super stable, and Java 8 is numbing a lot of the pain I used to feel. In contrast, I feel that I'm often fighting against the Scala compiler whenever I write Scala code. It feels nice in the end to see how concise and neat the code looks compared to Java, but I'm never really convinced the pain was worth it.

So, back to your original point: I've tried (and continue to experiment with) all these "new" technologies that Scala is claiming to bring to the table, and so far, I'm unconvinced that they are a clear improvement over what we currently use in the Java world.

And given that Scala continues to be a marginal language on the JVM, I don't think I'm the only one doubting that Scala represents the future.

Don't take my comment as "quit living in the past". Take it as "here's how exciting the future is". I'm not telling you your StarTAC phone is crap, I'm telling you there's this great new thing called the iPhone/Android/smartphone.

The real benefits of Play and Akka become apparent when a third-party service you send API calls to starts acting flaky. Or when your web service that you offer to clients is suddenly being hit by an iPhone game that just got purchased by a million people. Or when that nasty little bug that winds up throwing an exception from time to time rears its head all of a sudden. Play and Akka give you a resilience that is very hard to come by in the Java world.

Typesafe's home page has, in giant type, "Applications are becoming Reactive", and they are absolutely spot f'ing on. Writing blocking code, waiting on threads, assuming you're running on one JVM -- these things are great and all, and very easy to do because all of those templates are built into IntelliJ and you just have to click a few buttons to get all that. But wow, the performance difference once you get into the Play world is just frighteningly good.

I don't find mark242's comment off-putting or condescending at all. He's just telling it like it is.

Of course there are Java developers who can kick ass with Spring, Hibernate, Servlet containers, and war deployments, but all of these tools are huge, bloated, and starting to show their age. Modern frameworks like Play make things simpler for those of us who don't have 7+ years of JEE experience with Spring and Hibernate.

The comment is off topic since the article is not about Spring or JBoss and it recommends against using Hibernate. The article is about a modern lightweight stack (JAX-RS, Jetty, Dagger, JDBI).

No it's not; he's rebutting another poster who is commenting about Play (a modern Java web development framework) in an article that is about modern Java web development frameworks. Your comment is counterproductive to this conversation, and to the purpose of Hacker News in general.

He's just telling it like he thinks it is.

There, fixed that for ya.

I agree with the Java vs Scala poster. And unlike him, I have a 30,000-line Scala production project under my belt. Everything he says is true, and then some. Ever split up a file into chunks because IntelliJ can't edit it effectively?

Yeah, didn't think so. (Just telling like it is)

> And unlike him, I have a 30,000-line Scala production project under my belt. Everything he says is true, and then some. Ever split up a file into chunks because IntelliJ can't edit it effectively?

I have a 20k-line Scala project in production and editing it has certainly never been an issue. Why does IntelliJ have problems with large files?

I've written several large Scala projects and never had IntelliJ choke on large files.

Incrementally compiling a 3000-line file in 2011 as you edited it.

I'm sure the JetBrains folks have fixed that right up, but at the time, breaking it into 500-line chunks was required, due to the exponential nature of the slowdown we experienced.

The basic reality is that Scala compiles are slow. Syntax and red-line highlighting isn't cheap, and no number of brilliant Russian IDE developers can totally fix everything.

2011 was a long time ago.

scalac certainly isn't fast, but I just clean-compiled our 20k-line project in 75 seconds.

I'm not claiming that's great or anything, I just don't find it to be a showstopper.

> Incrementally compiling a 3000-line file in 2011 as you edited it

Heh, that's brilliant. Read the SBT compilation guide: one source file per class (obviously that's not set in stone, but it's a good guideline to follow).

Also, SBT is boss. Enabling automatic build in one's IDE is asking for pain. Why, why is the IDE blocking when I save the file? That's why.

> The basic reality is Scala compiles are slow

For deployment, sure, but not a showstopper either (20K LOC in around a minute on a warm JVM).

For incremental builds Scala is not even remotely slow, particularly if you follow best practices and break your application up into modules (sub projects in sbt world).

Anyway, things have changed (a lot) since 2011, we're not in the stone ages anymore -- if you want that, go check out Haskell where you'll get no tooling, no stack traces, and eternal compile times ;-) With a superior type system, brilliant community, yada, yada if that's your thing.

Same here, we have multiple large Scala systems, never had issues editing files in IntelliJ.

> ... Java 8 is numbing a lot of the pain I used to feel.

Interestingly, the biggest thing I wish I had when writing Java is algebraic data types and pattern matching. A few years ago I would have said it was lambdas, but having sum types would make expressing many things so much easier.

Been there, done that. In the previous company I worked at, some new projects were done in Play. The result: frequent API changes in Play, bad Maven integration, illogical APIs from the Java perspective (an artefact of Play being written in Scala), little documentation, and I don't know how it is now, but back then it was very hard to make a minimal REST application without pulling in a lot of baggage.

We rewrote these applications using JAX-RS (RESTEasy) and they were much simpler, easier to maintain, without API breakage in new framework versions. It's just a bunch of methods with annotations. XML and JSON serialization is automatic (via JAXB and Jackson). And you can use the same lightweight ORM as Play (Ebean), since it's an external project.

Well-thought out, mature technology is often better for getting work done than the hype of the day.
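The "just a bunch of methods with annotations" style can be illustrated with a toy dispatcher. This is a sketch, not real JAX-RS: the @Path annotation below is a hypothetical stand-in for javax.ws.rs.Path, and the reflection loop stands in for what RESTEasy/Jersey do at request time.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

public class AnnotationStyle {
    // Hypothetical stand-in for javax.ws.rs.Path -- NOT the real annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Path { String value(); }

    // A resource is just a bunch of methods with annotations.
    public static class UserResource {
        @Path("/users") public String list() { return "[\"alice\",\"bob\"]"; }
        @Path("/ping")  public String ping() { return "pong"; }
    }

    // Stands in for the framework's request-time matching: find the method
    // whose @Path matches, invoke it, return its result.
    static String dispatch(Object resource, String path) throws Exception {
        for (Method m : resource.getClass().getMethods()) {
            Path p = m.getAnnotation(Path.class);
            if (p != null && p.value().equals(path)) {
                return (String) m.invoke(resource);
            }
        }
        return "404";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(dispatch(new UserResource(), "/users")); // ["alice","bob"]
        System.out.println(dispatch(new UserResource(), "/ping"));  // pong
    }
}
```

The real frameworks add content negotiation, parameter injection, and JAXB/Jackson serialization on top, but the programming model is this simple.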

I've just started using Play in the last two weeks, but so far I really don't understand the criticism of Java vs Scala or of the documentation. True, so far I've only dealt with JSON / DB / file upload & download, but even that covers quite a vast area, and it was pretty obvious every time that Play had both a simple and a documented way of doing those, even in Java.

I don't know. I loved Play Framework 1. Simple, lean, and lightning-fast development. Play 2 came out and it looked strange, with various Scala things mixed in. It's a lot more complicated and bloated, and more buggy. The core piece is in Scala. The build is now sbt, yet another different thing to maintain. The template engine is Scala, slow to develop with and slow to compile. The much-touted async feature is a meh. Most web apps don't have the high-scalability requirements that async would help with. If I really needed high-performance async support, I would go with Vert.x.

Play 1 was almost perfect. Really hope someone would fork Play 1 and continue with it.

Personally I have become quite fond of Ninja Framework ( http://www.ninjaframework.org/ ); as simple as Play, but pure-Java (Java 8 too!). Even has a similar hot-reload mechanism as Play!

Author here. As I explain in the article, I cannot recommend any framework that encourages asynchronous code. It's simply the wrong approach, no matter what functional tricks are used to make it more palatable. And the blocking Play APIs both feel foreign to Java, and provide no benefit over the standard, and widely implemented, JAX-RS.

Blocking in Play should be foreign. On modern hardware architectures, writing code that blocks is the equivalent of throwing up your hands and saying "I can't trust myself to write efficient code so I'll just scale out my hardware and hope for the best". This is how Amazon winds up making so much money off Java developers who get constrained by thread pools and wind up spinning up a million instances of m3.medium machines.

This is, imho, what makes Play so fantastic. That you can have def index = Action { hardComputation() } for blocking code, and change it to def index = Action.async { Future { hardComputation() } } and you have magically changed your application to be asynchronous through the request-as-actor paradigm of Play. Play makes it easy to write async code. Play makes it way easier to control configuration for things like thread pools, dispatchers, and the like vs. Jetty. The performance of the nonblocking async code is incredible, and winds up saving a ton of money and developer time.

Doing a 'hardComputation' async provides no advantage. The CPU is limited in how much computation it can do, and if you have enough of those 'hardComputations' running simultaneously, whether sync or async, you will hit the limit of the CPU.

Async is only beneficial in terms of IO.
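A minimal sketch of this point in plain Java (no framework): wrapping a CPU-bound hardComputation() in a Future frees the caller's thread, but the executor still pays the full CPU cost for every task, so total throughput is unchanged. Async only helps when the task spends its time waiting on IO.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class CpuBoundAsync {
    // Deliberately CPU-bound: sums 0..n-1 in a loop.
    static long hardComputation(long n) {
        long acc = 0;
        for (long i = 0; i < n; i++) acc += i;
        return acc;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2); // pretend 2 cores
        // Four "async" tasks on two cores: the caller is free, but the
        // pool still grinds through every cycle of every computation.
        List<CompletableFuture<Long>> futures = IntStream.range(0, 4)
                .mapToObj(i -> CompletableFuture.supplyAsync(
                        () -> hardComputation(1_000_000), pool))
                .collect(Collectors.toList());
        futures.forEach(f -> System.out.println(f.join())); // 499999500000, four times
        pool.shutdown();
    }
}
```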

Preemptive multitasking makes this much more valuable than you portray it. Play also makes it really easy and you don't have to seriously think about it.

You should read the end of the post, then. The asynchronous, non-blocking approach is always wrong. Regardless of your hardware architecture.

"The asynchronous, non-blocking approach, is always wrong."


If you have a method that relies upon a DNS query, and your DNS server starts to respond to requests very slowly, your blocking application will run out of threads very quickly, and you'll be spending time poring over thread dumps trying desperately to see why all of your threads are waiting when the load average on your app server is basically zero. Your monitors will be going off like crazy because it appears that your application is flapping, sometimes able to handle a request and sometimes not. You will be tearing your hair out (and Java devs have all done this, though maybe not because of DNS).

If you have a non-blocking application, you will get an alarm that one of your pages may be acting a little slower than normal (even though the rest of your app is running normally). You'll look into the controller source, see that it's performing a DNS query, and poof, your debugging is done and you're off to restart bind.

I think this is an argument for timeouts, failover, and load shedding, not asynchrony.

If you have a sufficiently slow DNS you will chew through all the memory available to your non-blocking application in an amount of time that isn't functionally different from how long it will take you to exhaust your threadpool.

The whole point of the lightweight-thread approach with shrinking stacks is that you will use the same amount of memory for both, and they will fail at roughly the same point.
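The timeout approach can be sketched with nothing but java.util.concurrent. The sleeping task below is a hypothetical stand-in for the slow DNS lookup in the scenario above; bounding the wait and cancelling frees the worker thread instead of letting it pile up.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // Stand-in for a DNS server that has started responding very slowly.
        Future<String> lookup = pool.submit(() -> {
            Thread.sleep(5_000);
            return "93.184.216.34"; // hypothetical resolved address
        });
        try {
            // Bound the wait: fail fast instead of tying up a thread.
            System.out.println(lookup.get(100, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            lookup.cancel(true); // interrupt the worker, reclaim the thread
            System.out.println("lookup timed out, failing fast");
        }
        pool.shutdownNow();
    }
}
```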

You won't run out of (lightweight) threads because you can have millions of them. Blocking code on top of lightweight threads gives you exactly the same scaling characteristics as nonblocking code.

I know what you mean in the post and in your current assertion that non-blocking is "always" wrong regardless of architecture, but I think there are a few caveats that you really need to apply here.

You are talking about a problem domain where concurrency is what you are scaling and where the things that would block are orders of magnitude slower than processor time.

If you were working in a problem domain where latency is what you are scaling and the blocking calls are on the same order as processor time, non-blocking approaches can be best, as the blocking mechanism can still carry overhead even with lightweight threads.

Maximizing throughput is another beast entirely as well.

Would you mind substantiating why asynchronous code, successful in plenty of places and pretty much every new environment you're likely to run into, is so dogmatically the wrong approach?

> You do yourself a great disservice by sticking with Spring, Hibernate, JBoss, and the old standbys.

Given the amount of money you get with them, doing consulting in Germany, I think they will stay around for quite a while.

I'm curious if this is comparable to Rails rates.

For the UK market, check out jobserve.co.uk. From what I've seen there, the rates for a typical strong dev are about the same in the Java world and in the web scripting languages world (JS/Node, Ruby). However, for rare and/or exceptional experience, pay in the Java world can skyrocket. Also, finance is hiring mostly Java/.NET/C++ guys, and their rates are much higher than what everybody else is willing to pay.

You can get into projects paying €60K working for others; freelancing pays more.

No, it isn't. I was a Scala enthusiast, but experience has taught me to like Java and its maturity and to dislike Scala's unstable complexity (maybe it's more stable now than back when I was on board). By design, the language embraces abstraction building that looks nice superficially but is deeply hard to understand when different features interplay, which also makes the tooling harder to get right.

Also, Spring is not that bad, though sometimes I have a hard time understanding some of the initialization/runtime problems. Hibernate and JBoss I'd rather skip; I use Tomcat and Dalesbred for relational data access.

My favorite: write Scala (or even Kotlin) with Spring (the new versions of Spring), or Java EE 6/7 with the non-standardized Jersey MVC (I hope they'll add it in Java EE 8).

I think Play / Spray and Akka are great but let's differentiate between the language and the framework.

You can write Play / Akka with Java (they added Java 8 support recently) and you can write Spring / Java EE with Scala (I do that all the time and it works great).

More than that, I have mixed Java / Scala projects and it works like a charm (once you configure Maven / Gradle correctly, that is).

The "new" Spring versions are nothing like Spring used to be, and the same goes for the new Java EE versions: no XML configuration files, all convention over configuration. And I sometimes prefer a @GET + @Path or @RequestMapping annotation over concatenating routes with ~ like they have in Spray.

It's just a matter of preference and personal taste, but let's separate the language discussion from the framework / ecosystem discussion; they are not the same.

> you can write Spring / Java EE with Scala

It'd be nice if we could also write Gradle with Scala, or some other decent JVM language.

The article doesn't even recommend Spring, Hibernate, or JBoss; in fact, it specifically recommends against Hibernate. Instead, it recommends Dropwizard, JDBI, and Dagger among others.

I put my money on Spring, Hibernate, JBoss lasting longer than Scala, Akka etc.

Those technologies have stayed on, while other trends have come and gone.

Actually, it is basically Pivotal versus TypeSafe, not just Spring versus Akka or anything like that.

I put my money on TypeSafe.

Pivotal still (05/2014) doesn't have Java 8 support for some of its frameworks (Grails: https://jira.grails.org/browse/GRAILS-11063 ). Typesafe has pre-released versions of Play that take full advantage of Java 8 features.

Granted, if you need to run a Java 1.4.2 app on top of JBoss (or Glassfish!), then the types of technologies that these two companies are pushing are solving problems in a different domain.

And arguably the PermGen change in Java 8 would benefit Grails/Groovy the most.

Grails 2.4 will support Java 8, and RC2 was released today!

Been there, done that. I switched from Java to Scala and Play. Back to Java 8 and Spring MVC and couldn't be happier to be back.

> in that you've got your JSON manipulation, routing, connecting to a database, DI, actors,

While I agree with you about the others, 'JSON manipulation' is not an area where Play can even be compared to Jersey (Jackson). Jersey transparently converts the request body to the required input type, whereas with Play you still have to unmarshal the body while taking care of errors (admittedly that can be 'hidden' away using Action composition, but it's still work the developer has to worry about). With Jersey it's magic that just works.

Well, he doesn't recommend any of Spring, Hibernate or JBoss, if you read the article. And I think his point was modern _Java_ development.

Agreed, Java, JDBC, HTML/CSS/JavaScript - this is my preferred stack, simple, linear, understandable, maintainable.

Scala, Lisp, Haskell, Smalltalk do not register in my puny brain. C, C++, C# (which I don't EVER do any more), Java, those resonate and make sense to me. Bash and Python too for scripting.

I'd rather put my money on Spring learning lessons from Play / Scala / Akka. If they don't, the writing may indeed be on the wall.

I don't feel like Spring can fix the core problems here, unfortunately--they're Java Problems, and Java Problems grow more and more acute as time goes on (though Java 8 is a good step in the right direction).

The pleasant, compositional interaction borne of Scala's syntactical flexibility is why Play's wonderful to use (and Akka as the backend is largely behind-the-scenes). I used to be a detractor early in Play2 because of some large-scale-unfriendly decisions, but they've pretty much all been fixed.


For your point to stand, you'd have to spend some time explaining why you think Play is superior to DropWizard.

A very good article; well written and explained.

For Java web development, it is worth re-considering Spring Boot though * (http://projects.spring.io/spring-boot/)

Spring MVC and Data (et al) powered entirely by annotations, and (e.g.) Thymeleaf for templating, can lead to some fairly powerful yet concise apps.

We just started using this where I work, and it's a great step forward compared to old style, XML driven Spring MVC and Hibernate.

Also Spring Boot does bring quite a lot of support for REST, via RestTemplate/RestOperations, along with support for consuming and producing JSON and/or XML at the controller levels.

* Edit: I say re-considering since the article only briefly mentions it.

Full JavaEE development is pretty analogous - POJOs and annotations. I almost gave up on enterprise Java when every EJB required multiple classes, interfaces and often XML configuration too.

Modern JavaEE isn't nearly as heavy as the article seems to think ... but I can't argue with the article's underlying pragmatism. Use the simplest set of tools that accomplish the task!

I am especially fond of his view on database layers. There's JDBC, which is extremely verbose and needs scaffolding to make it usable, and then ORMs, which will quickly fall apart the moment you actually need to do anything interesting with the database.

The Spring solution to this problem was always my favorite part of their stack: JdbcTemplate removes most of the error-prone boilerplate from JDBC, adds a couple of features, and still lets you write SQL directly, which is probably what you should be doing in almost all the cases where a relational database becomes valuable.

I can't wait for shops to start to use Java 8. The removal of so much boilerplate when trying to build functional interfaces should make the Java toolsets move forward very quickly.
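The pattern JdbcTemplate is built on can be sketched in a few lines of plain Java. This is a toy illustration, not Spring's actual API: FakeResource is a hypothetical stand-in for a java.sql.Connection. The helper owns acquire/close (the error-prone part), while the caller supplies only the interesting query logic as a lambda.

```java
import java.util.List;
import java.util.function.Function;

public class TemplatePattern {
    // Hypothetical stand-in for a JDBC Connection.
    static class FakeResource implements AutoCloseable {
        List<String> rows() { return List.of("alice", "bob"); }
        @Override public void close() { /* release the resource */ }
    }

    // The "template": acquire, run the caller's work, always close --
    // even when the work throws. This is the boilerplate JdbcTemplate
    // removes from every call site.
    static <T> T withResource(Function<FakeResource, T> work) {
        try (FakeResource r = new FakeResource()) {
            return work.apply(r);
        }
    }

    public static void main(String[] args) {
        // Caller writes only the query logic; no try/finally in sight.
        List<String> names = withResource(r -> r.rows());
        System.out.println(names); // [alice, bob]
    }
}
```

Spring's real JdbcTemplate adds SQL parameter binding, RowMapper callbacks, and exception translation on top of this same shape.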

Again, just use jOOQ. It's beautifully executed design, written by smart people.


I never cared for the library until I realized I was spending a lot of time on the developer's blog; he is prolific and very knowledgeable about everything database and Java. I was sold on it soon after.

That is precisely why I created this little project. It generates the boilerplate code to leverage JdbcTemplate and organize the data access objects.


Also, this fellow HNer has been experimenting with Java 8 features. Pretty cool. https://github.com/benjiman/benjiql

I think MyBatis does something similar: it abstracts JDBC and lets you write SQL easily.

I think that thread scheduling is really not the long pole in the tent for large numbers of threads. It's stacks. If you want to have a million stacks and each is allocated to the highest watermark that thread ever reached, you will run out of memory.

I suspect that task size has to be smaller than the typical bit of web processing code for context switching overhead to really dominate, or even be expensive enough to matter. That says as much about how expensive common tools and frameworks are as it does about the cost of context switching.

That is kind of the issue in Go at the moment. They have been oscillating between default stack sizes and also switched from green threads to real OS threads backing goroutines.

EDIT: scratch the last statement about Go using OS threads for goroutines. I was thinking of something else.

Sometimes on Linux systems, even allocating a large stack size doesn't actually consume that as physical memory. Malloc might give you the memory block, but until you actually make function calls or allocate data on the stack, it might not consume the memory.

Someone should really compile a list of all the non-toy languages/environments that started with green threads and switched to native threads, with explanations of what they were trying to do and what happened and why they switched. It seems to be a popular path. Before the next person thinks "oh, this would be so simpler and more problem-free if I use green threads", that person should really review the prior art, heh.

Usually, when people say "green threads" they mean 1:N scheduling (i.e. the application consumes one kernel thread and schedules N threads on top of it). Lightweight threads, on the other hand, employ M:N scheduling, i.e. they use multiple kernel threads and schedule even more fibers on top. That's the approach taken by Erlang, Go, and Quasar. And Erlang has done this well for a long time.

Now, fibers are not perfect: their main shortcoming is integration with legacy code. The optimal approach is scheduler activations, or "user assisted" kernel scheduling.
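The M:N model described above has since landed in the JDK itself as virtual threads (Java 21+), which schedule huge numbers of plainly blocking tasks over a small pool of carrier kernel threads -- the same idea as Quasar's fibers. A minimal sketch, assuming a Java 21+ runtime:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ManyBlockingThreads {
    public static void main(String[] args) {
        AtomicInteger completed = new AtomicInteger();
        // One virtual thread per task; 10,000 platform threads would be
        // prohibitive, but 10,000 virtual threads are cheap.
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                exec.submit(() -> {
                    try {
                        Thread.sleep(10); // plain blocking call, no callbacks
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        System.out.println(completed.get() + " blocking tasks completed");
    }
}
```

Each sleep parks only the virtual thread; the carrier kernel thread is released to run other tasks, which is exactly the "blocking code with nonblocking scaling" claim above.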

On the one hand, I've observed that operating system implementers have repeatedly tried and rejected green threads for their pthread implementations (Solaris many-to-many threads and FreeBSD KSEs are both now historical footnotes). In Linux around 2002, there was a new many-to-many pthread implementation called NGPT that was backed by big players like IBM and Intel, until a couple of Red Hat developers (Ulrich Drepper and Ingo Molnar) obsoleted it with NPTL and some accompanying kernel optimizations. So at the OS level, a 1-to-1 correspondence between kernel-level and user-level threads seems to be what mature implementations settle on. Also, early JVM implementations for Unix used green threads, but that's also now a historical footnote.

On the other hand, Go does use green threads, and the main developers of Go are by no means naive. So maybe green threads do have some benefit, but only when implemented in something higher-level than libc and pthread.

It seems like Go uses its own threads, but backed by OS threads, rather than actually pure green threads. Yes?

But yeah, a whole lot of people seem to think "man, OS threads suck, surely we can do better with green threads," only to find out why OS threads suck, and that many developer-years of work have gone into making them not suck more.

OS threads by no means suck, but they can't make certain assumptions that lightweight threads can about thread behavior. They're great for a lot of processing, but not so great when they have to constantly block and unblock.

IIRC, SQL Server was their first product to use this.

Relevant HN thread:


BTW, I wonder if Paul Turner's work on fibers will change this equation if/when it makes it into the production kernel.


Google has had very promising initial results with it.

I've been wondering about this as well. Goroutines start with an 8KB stack size now; native threads seem to use an 8MB _virtual_ stack size, of which often only one or two pages (of 4KB) will be used. So is the primary remaining advantage of goroutines vs threads the efficiency of the Go scheduler?

FWIW, musl libc uses a much smaller default thread stack size (80KB). But apparently, Rich Felker wasn't able to find any good data on how big thread stacks actually need to be (see http://git.musl-libc.org/cgit/musl/commit/?id=13b3645c46518e...).

EDIT: So I guess we should run more big multi-threaded server apps with musl and see what breaks.
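The back-of-the-envelope arithmetic behind the question above, using the numbers quoted (8 KB goroutine stacks, 8 MB native reservations, roughly two touched 4 KB pages per native thread -- the page-touch count is an assumption, real usage varies):

```java
public class StackBudget {
    public static void main(String[] args) {
        long threads = 1_000_000L;
        long gib = 1L << 30;

        long goroutineResident = threads * 8 * 1024;        // 8 KB each
        long nativeResident    = threads * 2 * 4096;        // ~2 touched 4 KB pages
        long nativeVirtual     = threads * 8 * 1024 * 1024; // 8 MB reservation each

        System.out.printf("goroutines, resident: %.1f GiB%n", (double) goroutineResident / gib);
        System.out.printf("native, resident:     %.1f GiB%n", (double) nativeResident / gib);
        System.out.printf("native, virtual:      %.1f GiB%n", (double) nativeVirtual / gib);
    }
}
```

With these assumptions the resident memory comes out identical (~7.6 GiB either way); the 1000x difference is all in virtual address space, which only matters on 32-bit or with strict overcommit settings -- consistent with the suggestion that the scheduler, not the stacks, is the remaining advantage.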

When did Go switch from green threads to real OS threads backing goroutines, and why? Can you at least point to a mailing list post or version control commit?

Sorry apparently that is not the case, I am an idiot, I was thinking of something else.

Deferred allocation of stack pages only buys you so much. Each thread is going to have bursts of usage causing stack space to be committed, and once it is committed the memory is gone until the thread is reclaimed.

If you are in a million thread scenario it is usually because most of them are idle and retaining state for some eventual activity. That also tends to mean that they are long lived.

This applies to both goroutines & native threads, right? I guess with both you could periodically re-shrink the stack / release memory back to the OS, but I don't think either Go or any native threading system I'm aware of does that.

It would be interesting to do this for native threads at the kernel level: it's very similar to swapping, except for stack data you could just throw the data away if you could figure out that it was indeed dead.

Go is a moving target, but I believe it is committed to having segmented stacks that shrink.

Native threads don't have a way to reclaim stack memory other than swap. My personal opinion is that this should be solved in the kernel: kernel threads should have an option for reclaiming stack space so I can run as many as I want.

I don't get the point of saying that application servers are dead and then... embedding an application server within the application. If applications should be isolated, they can easily be deployed on different application servers. The result will be the same, with the difference that we get all the advantages of easy application deployment using war files, and some nice tools supporting the whole process.

Some of the problems with application servers: port separation and process isolation. When a change to the context is desired, it might affect all the apps, so you have to be very careful. Having your web app be its _own_ app is hugely advantageous.

You can, and usually do, do exactly that with app servers.

It's true that app servers were originally conceived as a way of hosting multiple apps in a single JVM, but it quickly became apparent that this was a terrible idea, and nobody does it. You run one instance of the app server per app. The app server is really just a great big bundle of useful libraries and a web framework.

so why separate the two if there is always 1 app per app server?

(I'm guessing that was "So why separate ...", and you've just had root canal)

They're separated in the sense that the app server is something you download that exposes an API, and the app is something you write on top of the API. It's much the same as the way the JVM and the class files are separated, or the way the OS and the JVM are separated. It's a fairly straightforward, pragmatic application of layering.

Maybe we mean different things by "separated". I want the app server bundled into the app, not an assumed dependency that the app has on the target environment.

Rack in the ruby world exposes an API that ruby web apps build on top of, but you never install an app server then install your app into the server.

Aha, yes, that makes sense. I don't think there's anything fundamentally wrong with packaging the app server along with the app. It's just not traditional in the Java world.

My preferred solution would be to package the app and the app server as operating system packages, and have the app depend on the app server. That makes the dependency explicit, but doesn't leave you with a gigantic deployable.

Here is one reason I like the bundled approach. I've been using golang recently which compiles into a machine executable. I was building a REST API that many third parties would use. When they wanted to use the API for development all I had to do was send them the bundle and tell them how to start. There were no other dependencies at all. It didn't assume a particular package manager, OS, or anything else. Take this file and execute it, that's it.

This in general leads to an even worse practice: creating one huge standalone app instead of several specialized web apps. When you develop a service with an embedded server, your endpoint will have a unique port (as each app will have to allocate its own unique port number) and the whole idea of RESTful URIs gets broken.

You lost me. How does embedding the app server break URIs?

In embedded mode you have one application server per app, so you have one app at port 80, another at 81, etc... so you can't just have a nice URL pattern for your applications, because each one will have a different port number. Unless you set up an additional reverse proxy...

You put nginx in front of it, which you should do anyway for most things.

Yep agreed, it's called reverse proxying, I do this with Tomcat as well.


It's great to see how Java web development has progressed. I tried to make a Spring MVC web app work like a decent, modern, readable and productive web application two years ago and it just wasn't possible.

Anyone interested in modern JVM web development I would still advise to invest time in learning Scala (or Clojure). They can work with your legacy code and libraries, but my productivity and general joy in programming shot through the roof when I started using Scala.

You can actually make a good Spring MVC-based webapp (with Thymeleaf, AngularJS, etc.); check out the JHipster Yeoman generator: https://jhipster.github.io/

Man, I wish that would have existed 2 years ago.

> but my productivity and general joy in programming shot through the roof when I started using Scala.

That was exactly my experience. Maintaining Java code from time to time makes me realise just how big the difference is.

As a modern java developer I was surprised to find no mention of Vert.X which to me seems like one of the most modern and forward-thinking java web frameworks. Not only does it support scaling your app out-of-the-box but also provides a simple way to deal with concurrency.

I've been stuck using Vert.x for a year and half. I recommend using something else.

It's a questionable framework mashed together with a poorly thought out messaging system.

I would rather use Spring Integration, Quasar, or Akka.

Would be interested in a more concrete description of the problems. I'm considering using Vert.x in a project, mainly because of the polyglot features (the ability to be extended by many different people, some of whom only know Python, others only Java, and some perhaps only Javascript, etc.).

Non-standard build/deploy/run.

Classloader per verticle type makes dependency injection painful. "I don't know what a Spring "ApplicationContext" is" -- Tim Fox

Uses multiple classloaders so you can "run multiple versions of the same module at the same time". Not something I have ever done or would do.

Uses Hazelcast for clustering but uses tons of classloaders so using anything other than simple types in Hazelcast is hard and inefficient.

Message bus is tightly coupled to application cluster through Hazelcast.

Messaging is missing the most useful features of AMQP like wildcard topics and queues. Only possible to do very basic message routing. Many messages have to be sent multiple times to mimic advanced routing.

Dividing everything into 'verticles' encourages use of callbacks for everything. This increases code size and complexity which increases the need for testing. 'Callback hell'

Encourages polyglot programming.

Writing Vertx apps with Groovy makes me feel like I could add a whole chapter to 'How to Write Unmaintainable Code'.


Wow - thanks for taking the time to put those down.

Some of those I consider advantages - polyglot is the whole reason I looked into it, Groovy is the main language the app I'd be integrating it into is written in, callbacks are the mainstay of many other frameworks (try writing anything in Node.js without a callback!).

That the clustering is poorly designed is probably not an issue in itself since in my case it's a web server embedded in another app, but it's a concern that the framework isn't well thought out and I don't like that (it sounds like) the whole thing is heavily entangled with Hazelcast (which I have no need for, or interest in).

Thanks again for writing down your thoughts!

You're most welcome.

> try writing anything in Node.js without a callback!

I wouldn't even try writing anything in Node. Server side javascript is not an option I would consider. http://www.youtube.com/watch?v=bzkRVzciAZg

Hazelcast on its own is pretty useful if you need a distributed data grid in Java, although it's not without its quirks. Unfortunately it's much less useful in Vertx because of the classloaders.

Callbacks as an antipattern is one of the main points of this HN post.

> Encourages polyglot programming.

Some of these "problems" aren't problems at all, e.g. what's wrong with polyglot programming?

> Writing Vertx apps with Groovy makes me feel like I could add a whole chapter to 'How to Write Unmaintainable Code'.

Vert.x enables many JVM languages to program it. Perhaps the real problem's with the language you've chosen. Groovy's business purpose is to ensure jobs (i.e. consulting contracts and conference seat sales) for life for its promoters at SpringSource. Choose a language with another mission and that could make all the difference.

> Vert.x enables many JVM languages to program it. Perhaps the real problem's with the language you've chosen. Groovy's business purpose is to ensure jobs (i.e. consulting contracts and conference seat sales) for life for its promoters at SpringSource. Choose a language with another mission and that could make all the difference.

The framework and the language are both problems in my opinion. If it was up to me I would have chosen a different framework, messaging system, language, and more. Wasn't my decision to make though.

Hmm, I find it surprising, hasn't been my experience at all.

Isn't Vert.X async a-la Node?

It is asynchronous and event-driven like Node, but it can process events on multiple threads.

How do asynchronous futures end up being more complicated, or in any way worse, than the blocking threaded approach?

    for {
       user       <- asyncGetUser(1)
       company    <- asyncGetCompany(user.companyid)
       longresult <- asyncProcess(company.getSomething)
    } yield longresult
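For readers coming from Java, the same chaining shape is available in the JDK as CompletableFuture.thenCompose. A sketch, where asyncGetUser and friends are hypothetical stand-ins for real async calls:

```java
import java.util.concurrent.CompletableFuture;

public class AsyncChain {
    // Hypothetical async lookups standing in for the real asyncGetUser etc.
    static CompletableFuture<String> asyncGetUser(int id) {
        return CompletableFuture.supplyAsync(() -> "user" + id);
    }
    static CompletableFuture<String> asyncGetCompany(String user) {
        return CompletableFuture.supplyAsync(() -> user + "-corp");
    }
    static CompletableFuture<Integer> asyncProcess(String company) {
        return CompletableFuture.supplyAsync(company::length);
    }

    public static void main(String[] args) {
        // Each thenCompose plays the role of one <- line in the for comprehension
        int result = asyncGetUser(1)
                .thenCompose(AsyncChain::asyncGetCompany)
                .thenCompose(AsyncChain::asyncProcess)
                .join(); // block only at the very end
        System.out.println(result); // prints 10 ("user1-corp".length())
    }
}
```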

His issue seems to be the potential for modifying or referencing mutable state in the yield block (or map or flatMap or whatever).

It can be a problem. In Akka actors, referencing sender() from a future will be unpredictable because sender() could've changed in the meantime.

I think there are three strong solutions which address this problem:

1) Immutable state. Solves this problem completely but accidental capture remains an issue.

2) Hiding mutable state within actors. I hesitate to present this as a general solution though since it really requires going all-in with actors (I don't consider that to be a bad thing, necessarily).

3) Projects like Scala Spores [1] aim to tackle this at the compiler level by capturing mutable references, executing async closures in an immutable environment, and prohibiting accidental capture. IOW, turning accidental capture into a compiler error. I'm excited about this one.

[1] https://speakerdeck.com/heathermiller/spores-distributable-f...

You can (and should) use Futures independently of Akka actors. You're correct, hidden mutable state with Actors can be a problem, as with mutable state in general.

Absolutely, but people make mistakes.

It's also somewhat unavoidable when you're dealing with IO, unless you stick to .pipeTo(self)

Yep, that's very nice code, except:

    user = getUser(1)
    company = getCompany(user.companyId)
    longresult = process(company.getSomething)
is a) simpler (because that's what your normal code looks like), b) performs exactly the same (as fibers basically do the same thing, only transparently), and c) retains context (like ThreadLocal variables).

So if that was the only way, I'd say, fine. But once you have lightweight threads, they are always preferred (even Scala now has its own poor-man's lightweight threads in the form of async/await).

> simpler (because that's what your normal code looks like)

Actually, no. My code will always look like the for comprehension, be it async or not, because my 'get' methods will always return an Either, Option, or some other monadic context to represent the computation succeeding or not.

> performs exactly the same (as fibers basically do the same thing, only transparently),

Yes, I agree.

> c) retains context (like ThreadLocal variables).

Thread-local variables are almost always a code smell. If you find yourself using them, there is almost certainly a better way of accomplishing the same thing.

I don't see how having a Fork/Join threadpool that 'powers' the futures would be any slower than lightweight threads either.

When I learned to program (back in the 80s), I learned that if you want to tell the computer to do operation X and when that's done, do operation Y, you just write both statements one after the other. If you're comfortable using for comprehensions to achieve that same goal -- that's great. To me it seems that your code simply replicates what a thread does. If you have a problem with your thread's implementation -- fix it. If you want the compiler's help -- use the compiler to write the for comprehensions, or monads for you. Personally, I don't think you should need monads for "do X then Y", and neither does Scala, by the way. Scala says that you only need them if you want to say "do X then Y, but don't use OS threads".

I wanted to ask you, what about running multiple operations in parallel? I've long been contemplating async/futures vs lightweight threads and yet to have come to a conclusion.

With futures I can say, start operation 1 and operation 2 in parallel, then chain a callback to execute using both pieces of data, saving me some latency of doing the operations serially. How do you do this using quasar? Now that's a trivial example... what about arbitrary dependency trees? This falls out naturally with futures but I don't know of a nice way to do this with lightweight threads. e.g. 1 operation branches off into 5 other parallel operations which then chain some of their own additional callbacks for processing before finally bringing all the results back together, perhaps to form a json response. Does this example make sense?

Seriously, just use Scala Futures. Here's one of many super easy ways to do parallel computations in Scala:

    val list = getItemsToProcess()                          // List[SomeObj]
    val futures = list.map(so => Future { processorHeavyMethod(so) }) // happening in parallel now
    val finished = Future.sequence(futures)                 // Future[List[...]]
    Await.result(finished, 10.seconds)

With lightweight threads you can spawn as many fibers as you like. Creating and starting a new fiber is basically free. You can start fibers and join them, in any dependency tree structure.

Of course, you can keep using futures (what I call semi-blocking API), only futures that block the fiber rather than the thread, when you join them.

Okay so to run operations in parallel you have to go back to futures, and for more complex dependencies you would need callbacks/transforms. So this means with fibres you could use the simpler synchronous model for serial operations, but future model for parallelism, i.e. a hybrid model. My thoughts are that it might be simpler to adopt a single model rather than two. On the flip side you could argue that with fibres you don't need to use the more complicated parallelism model all the time and only when needed.

If you want to run operations in parallel and then "join" them, you need some joining mechanism. A fiber itself can be joined and return a value, so in Quasar, Fiber implements Future. But that's semantics: with fibers, if you want to run operations in parallel, you spawn more fibers; if you then want to join those operations -- you join the fibers.

Of course you can have long-running fibers that interact with one another through channels. See: http://blog.paralleluniverse.co/2014/02/20/reactive/
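The spawn-many-then-join pattern described above is the same one plain threads and executors use; as a sketch with the JDK's ExecutorService standing in for fibers (fibers expose the same shape, just far cheaper to create):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class JoinTree {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();

        // Fan out: five operations running in parallel
        List<Future<Integer>> branches = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            final int n = i;
            branches.add(pool.submit(() -> n * n));
        }

        // Join: bring the results back together
        int sum = 0;
        for (Future<Integer> f : branches)
            sum += f.get();
        System.out.println(sum); // 1 + 4 + 9 + 16 + 25 = 55

        pool.shutdown();
    }
}
```

Arbitrary dependency trees fall out of the same two primitives: spawn where the tree branches, join where it merges.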

Thanks, this is the kind of answer I was looking for. I'll have a read once I'm on a bigger screen.

Man, am I ever fascinated with the complexity Java devs have to put up with.

Maybe it's just been a long time since I had to deal with Java-land, but you need to know about so many different things just to get off the ground. Granted, this may be easier with newer frameworks, but still.

> but you need to know about so many different things just to get off the ground.

I think that's probably true of just about any mature environment these days, isn't it?

Nobody wants it to be true. You make something new, thinking, this time i'll do it right, it'll be so simple, easy to get going with. And it is at first. Then you add stuff to deal with all the things that turned out to be pain points, all the edge cases and use cases that someone had that seemed reasonable after all, before you know it, it's a monster again.

To get off the ground with Rails 3/4, you sure need to know a lot. That wasn't neccesarily true in Rails 1/2 when some of us got started, and we were able to learn the new stuff gradually as it was added in, but to get started from scratch, let's say you don't even know ruby very well, oy, it's a lot now.

Yes, it's true of just about any mature environment these days.

And why? Because we want a mature environment to do a bunch of things for us, so that we don't have to (badly) re-implement everything. But the price of that is, we have to learn how to get the framework to do those things for us.

It's certainly not true with Go so far in my experience. Download, grab some packages that do specific things, put together a solution. Low noise.

To be honest most of the java code I write is just gluing a bunch of libraries together to accomplish the desired task.

If all you need is to get a minimal set of code handling HTTP requests, plain servlets aren't much bigger or more complex than most web microframeworks. "Here's your request and response objects, handle it."
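As an illustration of how little a minimal handler needs, the JDK even ships a tiny HTTP server (com.sun.net.httpserver, not the Servlet API, but the same "here are your request and response objects" shape). A self-contained sketch that starts a server, requests its own endpoint, and shuts down:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class MinimalHttp {
    public static void main(String[] args) throws Exception {
        // Bind to an ephemeral port so the example can't collide with anything
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/hello", exchange -> {
            byte[] body = "Hello World!".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();

        // Call our own endpoint to show the full round trip
        int port = server.getAddress().getPort();
        URL url = new URL("http://localhost:" + port + "/hello");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            System.out.println(in.readLine());
        }
        server.stop(0);
    }
}
```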

Jersey is extremely simple to get running, even on App Engine (if you're into that). Getting Flask, uwsgi, and nginx running on OpenShift isn't particularly difficult, but I had to follow a detailed blog post to get it working.

Jersey is very simple to get running, but it has pretty big drawbacks: there's not much you can do with it. Dropwizard and similar ameliorate this a bit, but compared to Play (which does have onboarding challenges) you have a long row to hoe to go from Jersey to a nontrivial app.

And, personally, the problems don't end there--I think I'd quit programming before going back to any Java-based hat-on-top-of-JDBC. Scala's Anorm/Slick have given me enough of the "there's a better way" to make going back really unpalatable.

For the last few years my apps have trending towards components that don't do much individually, but can be made to work well together. I haven't used Jersey on anything major, but it fits the profile I'm looking for.

For what it's worth, Akka is even more relevant to what I'm doing.

That's fair, but Jersey's not really very good at many of those individual things. It's not a great tool for a JSON API unless your consumption target is pretty much the same domain library on the other side because you'll end up neck-deep in @JsonProperty. It's not great as a frontend because Freemarker, StringTemplate, and Velocity are all pretty gross.

Play, for me, gives me good tools for doing stuff, and I can make individual services as big or as small as I want.

Even simpler, Spark Java. You can't make REST on any platform or in any language I've tried any easier and still be building on a solid foundation.

Looks really slick. Thanks!

You're welcome, but the guy who invented it, Per Wendel, and those who help him maintain it are the ones who deserve the credit. If you're into lambdas (I'm not yet, but trying to bend my mind to want to use them), he's working on adding them for version 2, which should make it even simpler.

Version 2 was released some days ago :-)

That's because, after getting into the air, you will be bought, and then you will find out that you wish you had started with solid technologies and architectures.

Could the person or persons who downvoted my comment explain why or what you disagree with here?

As opposed to?

Well… checkout Flask, for instance:

  from flask import Flask
  app = Flask(__name__)

  @app.route('/')
  def hello_world():
    return 'Hello World!'

  if __name__ == '__main__':
    app.run()
Of course, you still need to know how Python Does Things, and The Thousand Ways To Deploy An App, and Package Management and so on, but for this trivial example it's a lot more conceptually lean than the semi-equivalent offered above. I don't need to know about `Bootstrap<Configuration>` or a `JModernConfiguration`.

It's fair to point out that there will be a corresponding text file or option somewhere else, but…

    import static spark.Spark.*;

    public class HelloWorld {
        public static void main(String[] args) {
            get("/hello", (request, response) -> {
                return "Hello World!";
            });
        }
    }
Like I said above, "Granted, this may be easier with newer frameworks, but still." :) Good to see you can have some lighter loads.

TFA mentioned this was how Modern Java Web Dev With Instrumentation is Done (worth mentioning that Javaland instrumentation is world class), so I was just going by the examples he provided.

I think that is more a difference in which frameworks you use as opposed to which language. The examples in the article are using individual components to build out the system. This provides unlimited flexibility and will be very valuable if the system is going to scale in complexity.

My example on the other hand uses Spark, a micro framework that is not very flexible and hides a lot of the complexity, meaning that if your own solution becomes more complex you may have to abandon Spark entirely.

I've seen that dichotomy in every web framework I've encountered.

>I think that is more a difference in which frameworks you use as opposed to which language.

I don't think it has much to do with which language per se but more to do with what the language community considers acceptable. You can write mostly head-ache free frameworks filled with "magic" in Java like the rest of them, but it's not a strong cultural value.

To turn to the microframeworks example, I can't speak to Spark but in Ruby land it is not hard to flesh out your sinatra app into a more complex beast should the need arise - converting Sinatra half way to a mini Rails is not something I would recommend, but mostly because you lose easy-library support and ease-of-update down the road (you shouldn't be writing your own custom framework period, sort of thing), and not necessarily because of a lack of flexibility.

That's not much different from the hello world in the article. Let's see...

You import some stuff... check. You create an instance... check. You define a route... check. You define a method to implement the route... check. You return a String... well the example returns a map, and converts it to JSON. You start the app in a main method... check.

Sure, the Java is a little more wordy and there are some extra params which are not used (the IDE would fill those in automatically when you implement the interface), but the complexity is just the same as far as I can see.

Looking past shorter syntax and names, and a bit of magic to intuit the response content type, that's not a whole lot different from the code that gets Undertow spun up [1]. I suppose if you're worried about syntax and short names, Java is not right for you.

[1] http://undertow.io/

Got to include ruby!

  require 'sinatra'

  get '/' do
    "Hello World!"
  end

You had me until you said Asynchronous I/O is faster than Synchronous I/O in Java.


The Rob Van Behren slide is priceless: dude started building an async server and at the end realised he wrote the foundation for a threading package. I once asked myself the same question and did a little micro benchmark https://github.com/alpeb/io_benchmarks concluding IO was faster. Even so, I ended up doing all my stuff in Scala+Play because it's an immense pleasure and it's so easy to scale.

I worked with rvb closely last year and we always used to make fun of him whenever this debate came up. Then one day we realized he had co-authored a paper with Eric Brewer on the topic: https://www.usenix.org/legacy/events/hotos03/tech/full_paper...

It's a highly cited paper, and just as wrong. Blocking I/O is implemented via poll(); NIO uses epoll. Blocking I/O has to copy the array at least once (on the stack for <64kb, else malloc/free). Most NIO implementations use heap ByteBuffers and a multitude of copies, which is their downfall.

Blocking IO cannot have predictable latency under load; at the very least, you are left at the mercy of the OS thread scheduler. For various reasons (e.g. mutator threads should not block GC and compiler ones), thread priorities are not honored.

I'd argue that a well-written NIO implementation (and there are virtually no good open source NIO implementations) will flat out beat any blocking one. NIO is both faster and offers better, more predictable latency under load.

Ok, now I'm curious: what is a good implementation? Also, what's wrong with ByteBuffers? I was under the impression that they are usually memory-mapped and should be 0-copy.

There are 2+1 major types of ByteBuffers:

Heap: backed by byte[] (or char[], int[], etc.).

Direct: backed by C memory allocated via mmap (on Linux). mmap can map to RAM or to a file. Memory-mapped files are not an interesting case for a NIO impl that works with sockets. (On a flip note: FileChannel.transferTo(SocketChannel) doesn't involve memory mapping when the kernel supports it. Windows never supports it, though.)

Most impls use heap ByteBuffers; parsing then requires state machines, which are often simplified by copying the buffers. Blocking IO doesn't really need a state machine, as the stack serves that purpose. Then there is some reactive-like pattern (submitting tasks to an ExecutorService) that costs some more latency. Certainly, it's easier to work with and reason about, yet the more hand-offs there are, the worse the performance/latency is. There are minor issues like the choice of a good queue. It is an important one, as Java lacks multi-producer/single-consumer queues out of the box, or even single-producer/single-consumer. Java does have MP/MC queues (CLQ is an outstanding one) but one has to pay some extra price (incl. false sharing sometimes) to use them.

Ultimately, blocking IO cannot be "faster" than NIO per se, since under the hood it uses poll(2) with one socket. Before that it copies the Java byte[] to a new location (for smaller byte[] it's the stack). Technically one can blow up the JVM if the stack is very small while entering socket.getOutputStream().write(byte[])

Lastly, Selector.wakeup() has a stupid issue that involves entering a synchronized block each time, even if there is an outstanding wake-up request already. Wakeup requests are implemented via pipes on Linux (and a socket pair on Windows), which requires a kernel mode switch. During the wakeup, all the threads attempting to carry the task block on that very selector for no real reason. It can be worked around with a CAS, so only one thread actually enters the monitor.

I will repeat myself: blocking IO doesn't have predictable latency, and that cannot be enforced. In the end it's all about latency, as bandwidth can be bought and more machines deployed, but you can't buy latency.


Some internal stuff on heap vs direct buffer: http://stackoverflow.com/a/11004231
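The heap vs. direct distinction described above is easy to poke at with the JDK API directly:

```java
import java.nio.ByteBuffer;

public class Buffers {
    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(1024);         // backed by a Java byte[]
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // backed by native memory

        System.out.println(heap.isDirect());   // false
        System.out.println(direct.isDirect()); // true
        System.out.println(heap.hasArray());   // true: exposes its backing byte[]
        System.out.println(direct.hasArray()); // false: no accessible byte[]
    }
}
```

Writing a heap buffer to a socket forces at least one extra copy into native memory, which is the copy the parent comment refers to.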

Ok, I was aware of the difference between Direct vs. Heap ByteBuffers, and I guess I understand the argument about poll/epoll. Now what I don't quite get is why most open source projects chose to use the slower implementation. Don't they know any better? Is it a portability issue? Bugs in certain JVM/OS combinations? Netty.io claims to be 0-copy capable, so I guess that this must be one of the good ones that are available?

After seeing the message I decided to check netty.io's code, and I am pleasantly surprised. It has been ages since I checked the project. They use almost all the tricks in the book: CAS around Selector.wakeup(), handling the zero returned keys, a ref-counting buffer allocator, even an SC/MP queue.

Only a couple of downsides: 1) there appears to be a lack of bounded queues, and it's a non-trivial one; bounded queues are important to ensure proper back-pressure on producers and/or to kill slow peers. 2) the encoding pipeline may require serializing the same message multiple times when sending to multiple clients, even if the serialization results in the same byte stream. However, this is really a minor issue.

Like I've said I'm pleasantly surprised.

This article comes up every time someone mentions I/O "speed". You have to be very careful with making blanket claims in either direction.

These tests were a particular kind of network traffic, and were testing for max throughput across many connections. Minimizing latency, smaller message sizes, and/or fewer connections can make the decision to use NIO vs standard IO libraries come down in different places. (Not to mention that you cannot program the standard IO libraries in a no alloc form).

That's not what I said, or at least not what I meant. If you have blocking operations (that take a long time), then thread-blocking (as opposed to fiber-blocking or async) IO will require too many threads.

I've heard of Quasar before and had a general idea of what it is, but didn't look at the documentation carefully until now. My understanding is that I can run arbitrary synchronous code in Fibers? For example, consider the MongoDB client library:

  DBObject r = collection.find(query);
It blocks while getting the results of the query. If I do something like this:

  for (int x = 0; x < 100000; x++) {
    new Thread(() -> { DBObject r = collection.find(query); }).start();
  }
it's going to start 100,000 threads and freeze my computer. However, with Fibers I can do:

  for (int x = 0; x < 100000; x++) {
    new Fiber(() -> { DBObject r = collection.find(query); }).start();
  }
and it will work fine since these are lightweight threads (like Go goroutines). I guess my main question is, can I use arbitrary unmodified synchronous code like this to run in Fibers or would the library have to be modified to support it? In this case, would someone have to update MongoDB library to add support for Fibers?

Hi. You don't have to change the library in order to make it work through fibers; you have to wrap it. If the library has an efficient implementation of an async API, you implement the fiber-synchronous version using the asynchronous API. If it doesn't, you wrap it with a thread pool. You can take a look at the implementation of the JDBC wrapper here: https://github.com/puniverse/comsat/tree/master/comsat-jdbc

The short answer is that you can't just use any blocking code. The Comsat project (https://github.com/puniverse/comsat) contains integrations of popular libraries with Quasar fibers, without change to their API. Under the hood, this is done using transformations made available in the FiberAsync class. Like eitany said, you can use FiberAsync to integrate the library yourself, or wait for an integration module in Comsat.

If a library provides asynchronous APIs, it will yield great performance when integrated with Quasar fibers. If not (like JDBC), it will work as well as it does on regular threads, but it won't interfere with all the other great stuff fibers can do.
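The general trick of putting a blocking facade over a callback-style API can be sketched with plain JDK classes. Here findAsync is a hypothetical callback API; Quasar's FiberAsync does essentially this, except the wait parks a fiber instead of a kernel thread:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

public class AsyncBridge {
    // A hypothetical callback-style API, standing in for a real driver
    static void findAsync(String query, Consumer<String> callback) {
        new Thread(() -> callback.accept("result:" + query)).start();
    }

    // Blocking facade: complete a future from the callback, then wait on it.
    // FiberAsync has the same shape, but blocks only the fiber.
    static String find(String query) {
        CompletableFuture<String> f = new CompletableFuture<>();
        findAsync(query, f::complete);
        return f.join();
    }

    public static void main(String[] args) {
        System.out.println(find("users")); // prints result:users
    }
}
```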

Agreed - for example, the canonical chat application (long-lived connections, small amounts of data). But that's still not performance, that's scalability? As in "how many active chat sessions could one server handle?".

The opposite canonical example being a static, small-file web server. Short connections. How many files can you serve per second?

Great work, Ron. These should become a small (or maybe a full) book some day.

Java is not my main language, but I liked the Advanced Topic: Blocking vs Non-blocking section and in general have been following Parallel Universe technology stack.

If anyone is interested more in the async vs actors, there is a nice new podcast with Ron Pressler, Fred Hebert (Learn You Some Erlang For Great Good author), Kevin Hamond and Zachary Kessin


There is a discussion about concurrency, how Quasar works underneath (hint: there is an interesting bytecode transform that takes place during loading), shared state, Erlang, Clojure and Datomic. Anyway, highly recommend.

Thank you for sharing the amazing work. I do not know if there is a part 4 coming or not but do you have a good pattern for validating users' input?

I will say though, dismissing Rob Von Behren in the area of async vs. sync IO with an anecdotal paragraph is a ballsy move. (i.e. "wrong approach")

I've been misunderstood. I have not commented at all about whether sync or async IO is better. I meant that if you have a long operation that blocks for a while, letting it consume a kernel thread is bad.

Do you really need to write a controller and all its actions for each resource with Jersey?

For static resources you can load them like this:

    public class WebAppConfig extends ResourceConfig {
      private final String[] mimeTypes;

      public WebAppConfig() throws IOException {
        // load static resources from the htdocs dir
        Collection<File> files = FileUtils.listFiles(new File("./htdocs"), null, true);
        ArrayList<String> mimeTypeList = new ArrayList<String>();
        for (File file : files) {
          final byte[] contents = FileUtils.readFileToByteArray(file);
          Resource.Builder resourceBuilder = Resource.builder("/" + file.getName());
          final ResourceMethod.Builder methodBuilder = resourceBuilder.addMethod("GET");
          String mimeType = Files.probeContentType(Paths.get(file.toURI()));
          if (!mimeTypeList.contains(mimeType))
            mimeTypeList.add(mimeType);
          methodBuilder.produces(mimeType)
              .handledBy(new Inflector<ContainerRequestContext, byte[]>() {
                public byte[] apply(ContainerRequestContext req) {
                  return contents;
                }
              });
          registerResources(resourceBuilder.build());
        }
        // load dynamic resources implementing the Webpage interface
        Reflections reflections = new Reflections(new ConfigurationBuilder()
            .filterInputsBy(new FilterBuilder().include(FilterBuilder.prefix("com.example.mycompany"))));
        Set<Class<? extends Webpage>> webpageClasses = reflections.getSubTypesOf(Webpage.class);
        for (Class<? extends Webpage> webpageClass : webpageClasses)
          register(webpageClass);
        mimeTypes = mimeTypeList.toArray(new String[0]);
      }

      public String[] mimeTypes() {
        return mimeTypes;
      }

      public synchronized void start() throws Exception {
        WebAppConfig config = new WebAppConfig();
        HttpServer httpServer =
            GrizzlyHttpServerFactory.createHttpServer(URL, config, false);
        CompressionConfig compressionConfig =

Thanks, but I was referring to say a dynamic database-backed resource.

With SDR, you have an @Entity, and a @Repository for that entity, and SDR handles the whole HTTP part, i.e. you don't need to write a @Controller to expose the repository.

Just wondering if Jersey has anything similar.

I ask because with Spring Data REST, you don't have to write one.

Is there a preferred way to submit typos other than in comments?

of course, email us at info@paralleluniverse.co - thanks!

>Let me put this clearly: asynchronous APIs are always more complicated than blocking APIs.

IMO I would rather use a single-threaded server and manage asynchrony explicitly with callbacks, composable promises, comprehensions, monads, and "other functional shenanigans" than have to sync shared state across threads and manage thread pools.

If you keep your data immutable you can have the best of both worlds.

Have you used a STM?
