I'm really excited about this. While Node.js is a great project, it still doesn't have the awesome instrumentation and tooling around it that the JVM does. Additionally, real threading is damn nice, and the JVM definitely has that.
Combining this with languages like ruby, clojure, and scala seems like a definite win.
Async-by-default doesn't seem like a great model for programming the server-side part of most things that are done over HTTP. Three use cases jump out at me:
2. Threads are scary.
3. You want to make certain actions asynchronous over HTTP (i.e. client says "start", then server maybe says "done" later).
Now consider Clojure:
2. Clojure makes a lot of threaded operations pretty non-scary. Its native data structures are all immutable and it has constructs for concurrent state. These are not any harder to work with than the async/callback model. I find it more natural, but I'm used to Clojure so that could be bias.
3. Async-when-desired is already easy in Clojure. Futures provide a very easy way to do stuff in a thread pool without blocking. Agents provide the same thing for state changes. It really is as easy as (future (do-blocking-thing)) and (send-off an-agent do-blocking-thing some-args).
Finagle works fine. Thank you. Love it. But it does not take off. Why?
What all the JVM Node.js clones are missing, and what sets Node.js apart, are async libraries. There is no async (MySQL) JDBC driver, for starters. If your IO drivers are not async, your async container is not very useful in real life.
This. The biggest difference is in the two cultures: blocking is anathema to the Node.js community - they will literally reject libraries or code that blocks because it destroys the entire model; the JVM community does not value non-blocking code - most of the core (JDBC, Networking in general, File system operations) is all written in a blocking style - the JVM community accepts this with the implicit assumption that threads will help assuage those issues.
Python, Ruby and Perl all have the same cultural tolerance for blocking code. The Node.js community has a complete lack of tolerance for blocking code.
I work with the JVM every day (Clojure) and wish it was different wrt the common use of non-blocking code, but it's going to be a long road to get there on the JVM.
Java Executors combined with Guava's ListenableFutures easily turn any blocking operation into an asynchronous one.
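To make the Executor idea concrete, here is a minimal sketch. It uses the JDK's own CompletableFuture as a stand-in for Guava's ListenableFuture so that it runs without extra dependencies; blockingLookup is a made-up placeholder for something like a JDBC call, and the pool size of 4 is arbitrary.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

class AsyncWrap {
    // Hypothetical stand-in for a blocking call such as a JDBC query.
    static String blockingLookup(String key) {
        try {
            Thread.sleep(50); // simulate blocking I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-for-" + key;
    }

    // Run the blocking supplier on the pool and return a future immediately.
    static <T> CompletableFuture<T> async(ExecutorService pool, Supplier<T> blocking) {
        return CompletableFuture.supplyAsync(blocking, pool);
    }

    // The caller attaches a continuation instead of waiting on the call itself.
    static String demo() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            return async(pool, () -> blockingLookup("user:42"))
                    .thenApply(String::toUpperCase)
                    .join(); // join only so the demo can return the result
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the blocking still happens, just on a pool thread, so this only scales as far as the pool does.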
Netty's entire model is asynchronous, and Java 7 now has AsynchronousChannels for IO which, I assume, Netty will make use of.
All in all, the JVM has a much more solid and performant foundation than anything Node can provide. The whole difference will come down to a programming style preference. I am not entirely sure why Vert.x adopted the Node style rather than the proven servlet container, as I'm sure both styles provide comparable performance. I guess each may shine under different loads/usage patterns (my guess is that Vert.x/Node can squeeze more performance from a single thread, but servlets are more scalable).
Is that supposed to be bad? The programming style will be the same. If threads are done right and the JVM can manage their affinity well (especially on NUMA architectures), it's best to use them and pass a relatively small amount of data between them. They can then provide much better performance than accessing the same large piece of RAM from many threads (which is what happens if you simply replicate a single event-loop thread with asynchronous IO).
Maybe I am missing something, but how can you possibly have an async SQL driver without threads like this? This sounds like a case of your Node.js database driver hiding the exact same behaviour described here within C code.
If the wire protocol for the driver is published, then you can write a 100% async driver for it. I.e. no threads blocking, ever. In fact, I already did this for redis and vert.x (I will dig out the code for this some time).
If you are dealing with something where you don't know what the wire protocol is and you just have a blocking client library to play with (e.g. JDBC - JDBC is, by definition, blocking - see the JDBC API), then you can't do much except wrap the blocking API in an async facade and limit the number of threads that block at any one time.
This is exactly what we do in vert.x. We accept the fact that many libraries in the Java world are blocking (e.g. JDBC), so we allow you to use them by running them as a worker.
This is one area where we differ from node.js. Node.js makes you run everything on an event loop. This is just silly for some things, e.g. long running computations (remember the Fibonacci number affair?), or calling blocking apis.
With vert.x you run "event-loopy" things on the event loop but you can run "non event-loopy" things on a worker. It's a hybrid.
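The hybrid can be sketched in plain Java. This is not vert.x's actual API, just an illustration of the idea under my own assumptions: one thread plays the event loop, a small bounded pool runs the blocking work, and the result is handed back to the event loop thread. The names HybridLoop and runBlocking are invented for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.function.Supplier;

class HybridLoop {
    // One thread acts as the event loop; a bounded pool runs blocking work.
    private final ExecutorService eventLoop =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
    private final ExecutorService workers =
            Executors.newFixedThreadPool(2, r -> new Thread(r, "worker"));

    // Run blocking code on a worker, then process its result on the event loop.
    <T, R> CompletableFuture<R> runBlocking(Supplier<T> blocking, Function<T, R> onLoop) {
        return CompletableFuture.supplyAsync(blocking, workers)
                .thenApplyAsync(onLoop, eventLoop);
    }

    void shutdown() {
        eventLoop.shutdown();
        workers.shutdown();
    }

    static String demo() {
        HybridLoop loop = new HybridLoop();
        try {
            return loop.runBlocking(
                    () -> Thread.currentThread().getName(),                    // runs on a worker
                    ranOn -> ranOn + " -> " + Thread.currentThread().getName() // runs on the loop
            ).join();
        } finally {
            loop.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "worker -> event-loop"
    }
}
```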
A limited number of threads will not scale the way real async wake-on-data connections do. If demand exceeds your thread pool, and your web response builds on async backend requests, your site will be down.
(notice the Connect method on line 325 of binding.cc)
At some level a client-server database driver isn't all that different from any other network client; you send a request over a socket and wait for a result. There's no reason you have to block while waiting.
Moreover some databases (like Postgres) let you receive asynchronous notifications signaled by transactions on other connections; that's how trigger-based replication systems like Bucardo do their thing.
Because it would be based on asynchronous socket responses. So you wouldn't iterate like you currently do w/ a ResultSet but rather have a simple "RowHandler" of sorts. However you still run into the trouble you do w/ node if you decide to do a lot of blocking work in there instead of just sending the row to some ExecutorService thread to get worked on.
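Roughly what I have in mind, as a sketch only: the RowHandler interface and the query method below are invented for illustration and are not part of JDBC or any real driver. The fake driver pushes rows synchronously from a list, whereas a real one would call the handler as rows arrive off the socket.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class RowHandlerSketch {
    // Hypothetical callback interface replacing ResultSet iteration.
    interface RowHandler {
        void onRow(String row);
        void onEnd();
    }

    // Stand-in for an async driver that pushes rows to the handler.
    static void query(List<String> fakeRows, RowHandler handler) {
        for (String row : fakeRows) {
            handler.onRow(row);
        }
        handler.onEnd();
    }

    static List<String> demo() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ConcurrentLinkedQueue<String> processed = new ConcurrentLinkedQueue<>();
        query(Arrays.asList("alice", "bob", "carol"), new RowHandler() {
            @Override
            public void onRow(String row) {
                // Keep the handler cheap: hand the real work to the pool.
                pool.execute(() -> processed.add(row.toUpperCase()));
            }

            @Override
            public void onEnd() {
                pool.shutdown(); // no more rows; queued work still finishes
            }
        });
        pool.awaitTermination(5, TimeUnit.SECONDS);
        List<String> result = new ArrayList<>(processed);
        Collections.sort(result); // completion order across workers is not guaranteed
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```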
I am talking about the cultures surrounding these languages and frameworks. Node's community rejects blocking libraries. Java's does not. I've used the non-blocking frameworks in Perl (POE and my own), C (select, and some of the poll variants), Ruby (event machine) and they are fine if you can avoid blocking libraries -- in these communities it is generally acceptable to write blocking libraries. I don't see it as a technical hurdle, I see it as a cultural one.
You need to start backing up your claims with actual data. What networking libraries are blocking in Node.js that are not blocking in Twisted? Moreover, what can't you do with a Twisted Deferred that you can do in Node.js?
There are two main things that block in computing:
You better believe that Node blocks on CPU, so what I/O does Node not block on that Twisted does?
We run a major site on Java, had some thread trouble years ago in the very beginning, works very well now when tuned. Threads work.
BUT: I assume people will move to backend services with REST, combining REST backend results into a page. This increases IO a lot and will kill your latency and default thread models when you do sync code. You'd need to use async IO and composable futures to manage latency and thread count. And if you do async backend REST, why not do async JDBC etc.? But there are no libraries.
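By composable futures I mean something like the sketch below. fetchProfile and fetchUnreadCount are hypothetical backend calls; here a thread pool merely simulates their responses, where a real setup would use an async HTTP client so no thread sits waiting.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ComposePage {
    // Hypothetical backend call; a real one would be an async REST request.
    static CompletableFuture<String> fetchProfile(ExecutorService io, int userId) {
        return CompletableFuture.supplyAsync(() -> "profile#" + userId, io);
    }

    // Second hypothetical backend call, independent of the first.
    static CompletableFuture<Integer> fetchUnreadCount(ExecutorService io, int userId) {
        return CompletableFuture.supplyAsync(() -> userId % 10, io);
    }

    // Both requests are in flight at once; the page is built when both complete.
    static CompletableFuture<String> renderPage(ExecutorService io, int userId) {
        return fetchProfile(io, userId).thenCombine(
                fetchUnreadCount(io, userId),
                (profile, unread) -> profile + " unread=" + unread);
    }

    static String demo() {
        ExecutorService io = Executors.newFixedThreadPool(2);
        try {
            return renderPage(io, 42).join(); // join only so the demo can return a value
        } finally {
            io.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "profile#42 unread=2"
    }
}
```

Page latency then tracks the slowest backend call rather than the sum of all of them.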
>This. The biggest difference is in the two cultures: blocking is anathema to the Node.js community - they will literally reject libraries or code that blocks because it destroys the entire model
Really? So they reject any kind of library that does anything except call a callback? Because everything else, from calculating 2+2 to creating a template blocks. And it doesn't matter when it happens, when it happens it blocks.
Your async container still can be useful, since you can have worker threads executing blocking operations. This works fine as long as your JDBC queries are fast enough to feed the event loop without creating tons of threads.
In my experience, these are a small percentage of cases so the majority of the app can use blocking I/O -- even long-polling code, as long as you are using the async framework for the client connection.
Finagle looks like a somewhat lower-level framework compared to vert.x. While Finagle aims to be a framework that supports services using multiple protocols, vert.x appears to be much more tailored for HTTP web services.
But take a look at the Finagle example from Heroku and compare it to the example from vert.x. There's a lot of boilerplate in the Finagle version because Finagle is a general purpose async service framework, which was my entire point.
The topic was the non-existence of drivers in other platforms, you brought quality up. Node itself isn't 1.0 yet. I don't see how not having anything is better than having something in development.
transloadit.com, from the module maintainers, has been using it for a year+. I don't have any magic insight into what modules sites use, but judging from the activity in their repo it's quite popular (~1000 watchers, 100+ forks): https://github.com/felixge/node-mysql
Maybe I'm one of the weird ones, but I absolutely love types and would probably write everything in JS if it had typing similar to Java or C++. As it is, I'm using Java on Jetty instead for my current web application but would love to see a really solid evented, Node.js-style typed framework. With that said, at this point I'm not sure I'd give up my Servlets, frameworks (like CometD) for doing WebSockets, and the other niceties that a true servlet container gives me. I can't wait to see where this goes though!
Not really, typically you'd have one thread handling all sendFiles through a single selector over the destination sockets. As a matter of fact, it's actually really painful to do sendFile in Java from a single thread, because when the channel isn't ready for sending, rather than returning EAGAIN and letting you busy-loop or wait/retry or whatever, it throws an exception. So you have to use a selector to do sendfile, and in that case, why not use multiple tasks with the same selector?
(Response to jbooth, but for some reason I can't reply to that directly.)
The whole point of sendfile is to make one system call to send all the data in one stream to another, which in general may block. If you're polling and sending only small chunks at a time (whatever you can write without blocking), is it really that much of an advantage over read/write on the same poll? (If you're not doing that, then you have to block, and you have to dedicate a thread to it.)
If the socket you're writing to has been set to nonblocking, then sendfile exhibits the behavior I described, returning EAGAIN sometimes (check man sendfile). This means you typically want to put a selector in front of it and poll the selector, then send to any sockets that are writable, loop back and poll again.
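In Java terms the poll-then-send loop looks roughly like the sketch below. It assumes FileChannel.transferTo is backed by sendfile where the OS supports it, which I believe is the case on Linux but is up to the JVM. The loopback server, the temp file, and the 1 MiB size exist only to make the demo self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SelectorSendfile {
    // Send the whole file over a non-blocking socket, waiting on the
    // selector whenever the socket cannot take more data.
    static long sendAll(FileChannel file, SocketChannel socket) throws IOException {
        socket.configureBlocking(false);
        long position = 0;
        long size = file.size();
        try (Selector selector = Selector.open()) {
            socket.register(selector, SelectionKey.OP_WRITE);
            while (position < size) {
                selector.select();               // wait until the socket is writable
                selector.selectedKeys().clear();
                // May transfer zero bytes or only part of the remainder.
                position += file.transferTo(position, size - position, socket);
            }
        }
        return position;
    }

    // Self-contained demo: push a 1 MiB temp file over loopback and count bytes received.
    static long demo() throws IOException, InterruptedException {
        Path tmp = Files.createTempFile("sendfile-demo", ".bin");
        Files.write(tmp, new byte[1 << 20]);
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            long[] received = {0};
            Thread reader = new Thread(() -> {
                try (SocketChannel in = server.accept()) {
                    ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        received[0] += n;
                        buf.clear();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            reader.start();
            try (SocketChannel out = SocketChannel.open(server.getLocalAddress());
                 FileChannel file = FileChannel.open(tmp, StandardOpenOption.READ)) {
                sendAll(file, out);
            }
            reader.join();
            return received[0];
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

With many destination sockets you would register them all on the same selector and keep a per-socket position, which is the "multiple tasks with the same selector" case.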
It's still an advantage over read/write because you're getting the 0-copy behavior.
"real concurrency" is a silly term, but I assume he means threads and therefore a multicore concurrency model vis-a-vis thread-level parallelism, allowing a single VM to utilize all cores in the system.
This is opposed to Node, which must run at least one VM for each CPU in order to utilize all of the cores in a system. The exception is when your only use of threads is ThreadPoolExecutor-style pools, in which case you can use the horrible hack that is threads-a-gogo.
Yes, I meant threads ;) E.g. A web server using node.js on a 32 core server. You would have to manually manage 32 instances of node, and use a load balancer or the cluster module in order to route requests to the instances. With vert.x you just start one instance and from the command line you tell it how many instances to start. It then scales over your cores, no glue code or cluster module to write. (There's an example of this on the front page of the website).
Having VM per core may be quite beneficial -- you get more fault tolerance, immunity to GC glitches and one tier less when scaling over several machines. And there are nice tools to manage multiple processes.
Frustrating to see a library like this be called "Next generation" when the code structure is a step backwards as far as I can tell. We have had green threads, and more than that green threads that can multiplex over multiple cores for a long time. Let's move on, people.
> InfoQ: What about running a real-time app on the JVM vs. on Node.js, with respect to debugging, monitoring and operations?
> Answer: I'd say monitoring and operations are really the concern of the environment in which you deploy vert.x than vert.x itself. e.g. if you deployed vert.x in a cloud, the cloud provider would probably provide monitoring for you.
I'm pretty sure I don't want to use a web server by people who think a 5 line demo that gives unrestricted access to the host's file system is the best way to show off its capabilities. Sorry, but that's just stupid.
OK. I think I know what I did wrong. On GitHub (download link on the website) I clicked "Download .tar.gz" assuming I was getting the latest version. I might have gotten the latest version, but it seems to be the latest source. I should have chosen one of the packages.
Fibers (or equivalent constructs) aren't supported by all the languages that Vert.x supports (e.g. Java) so we can't really support something like that until we can do it in all the langs.
I know Fibers/Green threads are all the rage right now, and it is certainly something to keep an eye on, but I am not entirely convinced that roll-your-own threading is going to be any more performant than what the kernel can do.
If we can find a way of implementing fibers efficiently, that supports millions of fibers on a single JVM instance, I would be interested.
There's really no conflict between the statement, "every request gets its own thread" and the use of thread pools.
The point is that a given request in a servlet container is handled in its own thread. That thread will probably come from a pool and be reused to handle another request of course, but that's sort of immaterial.