The thrust of Ted's argument (as I saw it, and completely agree with) is that telling people "inexperienced programmers can create high performance systems" is misleading or dangerous without pointing out that you can easily shoot yourself in the foot if you don't understand what is (and isn't) handled "magically" for you.
Using an algorithm that performs poorly in PHP or Python isn't (necessarily) going to magically perform well under node.js's non-blocking execution strategy. Abstracting away blocking network or I/O calls isn't going to help if you don't know (and notice) that your problem is the O(2^n) algorithm you're using.
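To make that concrete, here's a hedged sketch (function names are mine, not from the post) contrasting the naive O(2^n) recursive Fibonacci with the O(n) iterative one; no execution model rescues the former:

```javascript
// Naive recursive Fibonacci: exponential, because it recomputes subproblems.
function fibNaive(n) {
  return n < 2 ? n : fibNaive(n - 1) + fibNaive(n - 2);
}

// Iterative Fibonacci: a single linear pass.
function fibIter(n) {
  let a = 0, b = 1;
  for (let i = 0; i < n; i++) [a, b] = [b, a + b];
  return a;
}

console.log(fibNaive(20) === fibIter(20)); // same answer, wildly different cost
```

The point isn't the constant factor of the runtime; it's that fibNaive(40) does roughly a billion calls no matter which event loop hosts it.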
I don't think he contends that. He was just using Fibonacci as an example to show that IO is not the only way to block a process. Although he doesn't go into it, this issue is one thing that you have to weigh when you decide to use eventing instead of threads: can you break up any potentially long running computation to maintain availability?
My main beef with node.js is the hype, to be honest. Many of the people who are really excited about it seem completely unaware that this kind of thing is hardly new. Tcl has been based around an event-loop for ages! Erlang was made in the mid-80s, and has a very mature (and thoroughly integrated) event-loop-based architecture. Also, its focus is on fault-tolerance / error handling in a multi-node system. I don't know what progress Node.js has made there, so far.
His preferred infrastructure (once I cut through the swagger) is:
Highly-tuned request dispatch --> multiple processing threads (or processes) --> back out to requesting client
For certain workloads, e.g. long-polling, pubsub or high-number-of-client workloads, unix dispatch overhead is actually very significant. One cannot instantiate 500 python threads on most unix boxen without severe doom.
For these workloads, essentially ones where you need to push text around with a minimum amount of stream processing on top of it, node is totally, totally brilliant. It is crazy fast. Magically so, even.
On the other hand, if you started down the road with node, and realized at some point that you needed to refactor some of your slow code, you could do so easily. Doing the reverse (scaling an alternate scripting architecture) is not always so trivial.
AFAIK, nothing's stopping anyone from writing a CGI module for Node. It took years for Python to come up with WSGI and Ruby to come up with Rack. People are fed up with CGI and I don't blame Node's developers for excluding it.
Indeed, nothing has stopped several people from doing it:
$ npm search cgi
NAME            DESCRIPTION                                                    AUTHOR
cgi             A stack/connect layer to invoke and serve CGI executables.     =TooTallNate
fastcgi-stream  Fast FastCGI Stream wrapper for reading/writing FCGI records.  =samcday
koku            Node.js bindings for the Mac finance app Koku                  =cgiffard
nodeCgi         A fastcgi-like server designed to accept proxied requests from a web server and exe
scgi-server     SCGI (Simple Common Gateway Interface) server                  =yorick
- 1 request to http://server/fiboslow
- 1000 requests to http://server/fast-response
The point is that node will not process any of those 1000 requests until the first one is finished, while any multithreaded/multiprocess server will do just fine and process those 1000 requests in parallel.
It's funny that the same people who criticized Java's AWT EDT as poorly designed are now praising the same thing in Node.js ;)
Standard Python: 1:07.31 :)
python 2.6: 58.2 sec
python 2.7: 49.8 sec
python 3.2: 51.4 sec
I couldn't believe when folks on Twitter seemed to be buying his "argument."
(Apologies for the self-promotion. I had literally just published mine when I saw this.)
Yes, there are other ways of blocking apart from IO, but IO tends to be the main culprit: our web apps are mostly waiting for something, whether a database connection, a file system, or another HTTP request.
Making that asynchronous means we don't waste CPU time waiting. We let those external systems do their thing, and when they're done, the rest of the code runs. The important thing is that IO isn't the bottleneck in node.js: it can do something else while the IO is doing its thing.
This means that node.js is better at dealing with IO-laden processes than the typical gamut of web frameworks.
CPU-bound processes are going to block unless they are performed outside of the main event loop. Dziuba is pointing that out, without offering up the obvious approach.
The way to deal with CPU-intensive tasks is to delegate them to something that can run asynchronously. If your platform of choice is multi-threaded, spin up a thread and run them there.
The node.js way, as I understand it, is to use IO to offload that intensive process somewhere else (at least until web workers is bedded in and ready to use). Since the IO is non-blocking, node.js doesn't consume much resources in waiting around for a response from the server/framework dealing with the CPU intensive activity.
The more I use it the more I see node.js as a pipeline connector between IO resources. Those other IO resources can either be other frameworks, or separate node.js instances that do one small job well. (So an IO process could just be a separate node.js instance that performs a CPU intensive task. In this way it doesn't affect the main request recipient in receiving more incoming requests).
One multi-core server can have a dozen or more instances of node running, each doing their specialised tasks and talking to each other asynchronously via IO.
Sure, it's not beginner-level stuff. Even Dziuba himself didn't point out the better approaches to his naive solution, in node.js or any other language. node.js is as bad as every other framework when it comes to naive implementations of recursive algorithms. But offloading the calculation out of the main event loop through non-blocking IO is a different solution that node.js offers. That's one key differentiator.
It would be interesting to see Dziuba demonstrate the implementation of his concocted problem in his framework/language of choice.
If your server runs in a single thread AND you do expensive calculations per request, you're doing it wrong. You should be caching results, and you need to either move the processing work to a backend server (written in C or something) or shard the frontend across a lot of cores (multiple processes / webworkers). In any of these configurations, nodejs will work great on the front end.
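A minimal caching sketch along those lines (in-process memoization; a real deployment might use an external cache, which is an assumption beyond the comment):

```javascript
// Memoize expensive results so repeated requests for the same input
// never redo the computation.
const cache = new Map();

function fibMemo(n) {
  if (n < 2) return n;
  if (cache.has(n)) return cache.get(n);
  const v = fibMemo(n - 1) + fibMemo(n - 2);
  cache.set(n, v);
  return v;
}

console.log(fibMemo(50)); // 12586269025, computed in linear time
```

The first call populates the cache; every later request for an already-seen n is a Map lookup, which is cheap enough to stay on the event loop.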
But most of Ted's post was needless bile. The argument he made isn't justified by the evidence he gave. Don't bother trying to argue with him, it's not worth your time.
Situations where blocking is going to be an issue:
1. You have roughly equivalent CPU usage per request, which means that you just have too many requests. Threaded or evented will both get bogged down here.
2. Most requests use a small amount of CPU, but every once in a while a request will require an ENORMOUS amount of CPU. You want the one person with the crazy demand to feel slow, but everyone else to be unaffected. How often does this happen, exactly?
Most modern web apps spend the majority of their time talking to databases, which in NodeJS is a nonblocking operation. If you need to steamroll your CPU frequently, then perhaps NodeJS is not for you, but I don't think that's a common use case.
If the 500ms request is rare, then occasionally some requests will be delayed vs a threaded setup. Keep in mind that V8 is about an order of magnitude faster than CPython/PHP.
If you use webworkers for cpu-intensive tasks (like fib), nodejs will perform just as well as all the other web frameworks out there.
And if you do consider workers (because you don't have a better algorithm), what happens for the cheap version? Do you offload it to a worker as well, potentially incurring a spin-off cost greater than the cost of the computation itself, or do you end up with two different codepaths (one sync and one async, just to ensure you're getting as complex as you can) depending on the computation's expected duration?
How exactly would you avoid that 500ms block in other platforms?
Big deal. Use workers to solve that. End of story.
In a browser I'd go with a setTimeout or setInterval hack to "fork" another thread. Is this the same in node?
It outlines one solution where the calculation is split into callbacks dispatched through the event loop, making use of process.nextTick()
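The pattern that post describes can be sketched like this (I've used setImmediate rather than process.nextTick, since in current node a recursive nextTick chain runs before pending IO and would still starve the loop; the chunking idea is the same): one loop iteration per event-loop turn, so other callbacks can interleave:

```javascript
// Iterative Fibonacci, yielding to the event loop between steps so
// concurrent requests keep getting serviced during a long computation.
function fibAsync(n, cb) {
  let a = 0, b = 1, i = 0;
  function step() {
    if (i++ >= n) return cb(a);
    [a, b] = [b, a + b];
    setImmediate(step); // reschedule; IO callbacks can run in between
  }
  step();
}

fibAsync(10, (result) => console.log(result)); // 55
```

Each turn does a tiny slice of work, which trades raw throughput for availability, exactly the eventing-vs-threads trade-off raised earlier in the thread.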
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
Yes. He was just illustrating a point about blocking on the CPU.
Dziuba is the very paradigm of an ill-informed lazy troll. He has made his reputation giving other lazy people reasons to not learn new things.