I think a better response to Ted's argument would have been to rewrite the Fibonacci number generator as a forked-off process with a callback that updates the requesting page on completion.
The thrust of Ted's argument (as I saw it, and I completely agree with it) is that telling people "inexperienced programmers can create high performance systems" is misleading or dangerous without pointing out that you can easily shoot yourself in the foot if you don't understand what is (and isn't) handled "magically" for you.
An algorithm that performs poorly in PHP or Python isn't (necessarily) going to magically perform well under node.js's non-blocking execution strategy. Abstracting away blocking network or IO calls isn't going to help if you don't know (and notice) that your problem is the O(2^n) algorithm you're using.
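To put rough numbers on that, here is a minimal sketch in Python (not Node, and not Ted's original code) that counts how many calls the doubly recursive definition makes compared with a plain loop. The names `fib_naive` and `fib_linear` are made up for illustration.

```python
def fib_naive(n, counter):
    """Doubly recursive Fibonacci; counter[0] tallies every call made."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)


def fib_linear(n):
    """Iterative Fibonacci: n additions, constant extra memory."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    for n in (10, 20, 25):
        calls = [0]
        value = fib_naive(n, calls)
        assert value == fib_linear(n)
        print(f"n={n}: fib={value}, naive calls={calls[0]}, loop iterations={n}")
```

The call count grows exponentially with n while the loop grows linearly, and no scheduler or IO model changes that.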
> He contests that because a shitty Fibonacci number generator performs poorly Node.js is worthless. I contend that shitty Fibonacci generators are shitty in whatever language or framework.
I don't think he contends that. He was just using Fibonacci as an example to show that IO is not the only way to block a process. Although he doesn't go into it, this issue is one thing that you have to weigh when you decide to use eventing instead of threads: can you break up any potentially long running computation to maintain availability?
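For what "breaking up a long running computation" can look like, here is a rough sketch using Python's asyncio as a stand-in for a single-threaded event loop (it is not Node code). The `ticker` task is a hypothetical placeholder for other requests, and the chunk size is an arbitrary assumption.

```python
import asyncio


async def ticker(state):
    """Stand-in for other requests: counts how often the loop lets it run."""
    while True:
        state["ticks"] += 1
        await asyncio.sleep(0)


async def chunked_sum_of_squares(n, chunk=10_000):
    """Sum i*i for i < n, yielding to the event loop after each chunk."""
    total = 0
    for start in range(0, n, chunk):
        for i in range(start, min(start + chunk, n)):
            total += i * i
        await asyncio.sleep(0)  # hand control back so other tasks can run
    return total


async def main(n=200_000):
    state = {"ticks": 0}
    task = asyncio.create_task(ticker(state))
    total = await chunked_sum_of_squares(n)
    task.cancel()
    return total, state["ticks"]


if __name__ == "__main__":
    total, ticks = asyncio.run(main())
    print(f"total={total}, ticker ran {ticks} times while computing")
```

The cost is that the computation has to be restructured around yield points, which is exactly the trade-off being weighed against threads.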
This response misses the point, which is just that it's trivial to show that you can write blocking code in node. Any program that does significant computation will still block and pause all network IO (without setting up side processes, etc.). "The stupid fibonacci benchmark" is just a cliche way to chew up the CPU.
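The same effect is easy to reproduce in any single-threaded event loop. A minimal sketch with Python's asyncio (an analogy, not Node itself), where the hypothetical `ticker` task plays the role of all the other network IO:

```python
import asyncio


def fib(n):
    """Deliberately slow, synchronous, CPU-bound work."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


async def ticker(state):
    """Stand-in for other requests waiting on the same loop."""
    while True:
        state["ticks"] += 1
        await asyncio.sleep(0)


async def main():
    state = {"ticks": 0}
    task = asyncio.create_task(ticker(state))
    await asyncio.sleep(0.01)          # ticker runs freely here
    before = state["ticks"]
    fib(22)                            # no await inside: the loop is stuck
    during = state["ticks"] - before   # ticks that happened while fib ran
    await asyncio.sleep(0.01)          # ticker gets to run again
    after = state["ticks"] - before
    task.cancel()
    return before, during, after


if __name__ == "__main__":
    before, during, after = asyncio.run(main())
    print(f"ticks before={before}, during fib={during}, after={after}")
```

While `fib` runs, the ticker cannot advance at all, because nothing hands control back to the loop.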
My main beef with node.js is the hype, to be honest. Many of the people who are really excited about it seem completely unaware that this kind of thing is hardly new. Tcl has been based around an event-loop for ages! Erlang was made in the mid-80s, and has a very mature (and thoroughly integrated) event-loop-based architecture. Also, its focus is on fault-tolerance / error handling in a multi-node system. I don't know what progress Node.js has made there, so far.
Virtually every request made to the server is blocking. The point of a server request is to retrieve/store data, process it, and respond. That's a blocking operation by definition. I can't respond before processing the data. I can't process it before retrieving it. You get the picture.
He totally misunderstands node's reason for existence, and also why it can be awesome.
His preferred infrastructure (once I cut through the swagger) is:
Highly-tuned request dispatch --> multiple processing threads (or processes) --> back out to requesting client
For certain workloads, e.g. long-polling, pubsub or high-number-of-client workloads, unix dispatch overhead is actually very significant. One cannot instantiate 500 python threads on most unix boxen without severe doom.
For these workloads, essentially ones where you need to push text around with a minimum amount of stream processing on top of it, node is totally, totally brilliant. It is crazy fast. Magically so, even.
On the other hand, if you started down the road with node, and realized at some point that you needed to refactor some of your slow code, you could do so easily. Doing the reverse (scaling an alternate scripting architecture) is not always so trivial.
I like node.js, but this post completely misunderstands "Node.js is Cancer". The point of that post isn't that node.js is slow; its point is that node.js does concurrency, but not parallelism; i.e., if your code is CPU-bound, it can only serve one request at a time.
It's a bit extreme to conclude that Node is "cancer" just because its built-in HTTP server works this way. People still use it because it suits their purposes; there's nothing wrong with that.
AFAIK, nothing's stopping anyone from writing a CGI module for Node. It took years for Python to come up with WSGI and Ruby to come up with Rack. People are fed up with CGI and I don't blame Node's developers for excluding it.
> AFAIK nothing's stopping anyone from writing a CGI module for Node.
Indeed, nothing has stopped several people from doing it:
$ npm search cgi
NAME DESCRIPTION AUTHOR KE
cgi A stack/connect layer to invoke and serve CGI executables. =TooTallNate
fastcgi-stream Fast FastCGI Stream wrapper for reading/writing FCGI records. =samcday fc
koku Node.js bindings for the Mac finance app Koku =cgiffard
nodeCgi A fastcgi-like server designed to accept proxied requests from a web server and exe
scgi-server SCGI (Simple Common Gateway Interface) server =yorick SC
I put Ted squarely in the category of "contrarian" rather than troll. I think it's important to have people calling out the Emperor for having no clothes, even if the Emperor is wearing clothes. It makes for debate, which in turn makes for reasoned decision making.
The point is that node will not process any of those 1000 requests until the first one is finished, while any multithreaded/multiprocess server will do just fine and process those 1000 requests in parallel.
It's funny that the same people who criticized Java's AWT EDT as poorly designed are now praising the same thing in Node.js ;)
Not sure whether pointing out the slowness of other languages is a considered argument to make here.
Yes, there are other ways of blocking apart from IO, but IO tends to be the main culprit: our web apps are mostly waiting for something, be it a database connection, a file system, or another HTTP request.
Making that asynchronous means we don't waste CPU time waiting. We let those external systems do their thing, and when they're done, the rest of the code runs. The important thing is that IO isn't the bottleneck in node.js. It can do something else while the IO is doing its thing.
This means that node.js is better at dealing with IO-laden processes than the typical gamut of web frameworks.
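A small sketch of that overlap, again using Python's asyncio rather than Node, with `asyncio.sleep` as a hypothetical stand-in for real database, file, and HTTP calls (the names and delays are assumptions):

```python
import asyncio
import time


async def fake_io(name, seconds):
    """Hypothetical stand-in for a database, file, or HTTP call."""
    await asyncio.sleep(seconds)   # waiting costs no CPU; the loop is free
    return name


async def handle_request():
    started = time.perf_counter()
    # All three waits are in flight at once on a single thread.
    results = await asyncio.gather(
        fake_io("database", 0.1),
        fake_io("filesystem", 0.1),
        fake_io("http", 0.1),
    )
    return results, time.perf_counter() - started


if __name__ == "__main__":
    results, elapsed = asyncio.run(handle_request())
    print(f"{results} finished in {elapsed:.2f}s (sequential would be about 0.3s)")
```

The total wait tracks the slowest call rather than the sum, which is the whole benefit; it says nothing about CPU-bound work.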
CPU-bound processes are going to block unless they are performed outside of the main event loop. Dziuba is pointing that out, without offering up the obvious approach.
The approach to dealing with CPU intensive tasks is to delegate it to something that can be asynched out. If your platform of choice is multi-threaded, spin up a thread and run it there.
The node.js way, as I understand it, is to use IO to offload that intensive process somewhere else (at least until web workers are bedded in and ready to use). Since the IO is non-blocking, node.js doesn't consume many resources waiting around for a response from the server/framework dealing with the CPU-intensive activity.
The more I use it the more I see node.js as a pipeline connector between IO resources. Those other IO resources can either be other frameworks, or separate node.js instances that do one small job well. (So an IO process could just be a separate node.js instance that performs a CPU intensive task. In this way it doesn't affect the main request recipient in receiving more incoming requests).
One multi-core server can have a dozen or more instances of node running, each doing their specialised tasks and talking to each other asynchronously via IO.
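As a rough illustration of that pattern (in Python's asyncio rather than Node, so treat it as an analogy), the sketch below pushes the slow Fibonacci into a child process and awaits its output over a pipe. The `WORKER` script and the `ticker` task are hypothetical stand-ins.

```python
import asyncio
import sys

# Hypothetical worker: a one-off child interpreter that prints fib(n).
WORKER = (
    "import sys\n"
    "def fib(n):\n"
    "    return n if n < 2 else fib(n - 1) + fib(n - 2)\n"
    "print(fib(int(sys.argv[1])))\n"
)


async def fib_in_child(n):
    """Run the CPU-heavy work in another process and await its output."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", WORKER, str(n),
        stdout=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate()
    return int(out)


async def ticker(state):
    """Stand-in for the main loop continuing to accept requests."""
    while True:
        state["ticks"] += 1
        await asyncio.sleep(0.001)


async def main(n=22):
    state = {"ticks": 0}
    task = asyncio.create_task(ticker(state))
    value = await fib_in_child(n)
    task.cancel()
    return value, state["ticks"]


if __name__ == "__main__":
    value, ticks = asyncio.run(main())
    print(f"fib result={value}, main loop ticked {ticks} times meanwhile")
```

Spawning a process per request is expensive, so a real setup would more likely keep long-lived workers around; this only shows that the main loop stays responsive while the child does the work.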
Sure, it's not beginner-level stuff. Even Dziuba himself didn't point out the better approaches to his naive solution, in node.js or any other language. node.js is as bad as every other framework when it comes to naive implementations of recursive algorithms. But offloading the calculation out of the main event loop through non-blocking IO is a different solution that node.js offers. That's one key differentiator.
It would be interesting to see Dziuba demonstrate the implementation of his concocted problem in his framework / language of choice.
I had the pleasure of missing Ted's original post. I read it before this one, and it was pretty misinformed. I could write a similar article arguing that assembly is useless because it takes so much code to implement a web server using it.
If your server runs in a single thread AND you do expensive calculations per request, you're doing it wrong. You should be caching results, and you need to either move the processing work to a backend server (written in C or something) or shard the frontend across a lot of cores (multiple processes / webworkers). In any of these configurations, nodejs will work great on the front end.
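On the caching point, a minimal sketch in Python using the standard library's `functools.lru_cache`; the same recursive definition becomes cheap once each distinct input is computed only once. Whether an in-process cache is the right choice for a given server is an assumption here.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Same recursive definition, but each distinct n is computed once."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    print(fib(80))
    print(fib.cache_info())
```

Repeat requests for the same value then cost a dictionary lookup instead of a recomputation.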
But most of Ted's post was needless bile. The argument he made isn't justified by the evidence he gave. Don't bother trying to argue with him; it's not worth your time.
So, this may be a silly question, but what exactly are you doing that's going to use so much freakin' CPU?
Situations where blocking is going to be an issue:
1. You have roughly equivalent CPU usage per request, which means that you just have too many requests. Threaded or evented will both get bogged down here.
2. Most requests use a small amount of CPU, but every once in a while a request will require an ENORMOUS amount of CPU. You want the one person with the crazy demand to feel slow, but everyone else to be unaffected. How often does this happen, exactly?
Most modern web apps spend the majority of their time talking to databases, which in NodeJS is a nonblocking operation. If you need to steamroll your CPU frequently, then perhaps NodeJS is not for you, but I don't think that's a common use case.
I don't think using, say, 500ms worth of CPU is all that unusual of a need. If you try to do that with node.js, everything in your program will be blocked completely for 500ms. It doesn't make node.js useless, or 'cancer', but it is a real limitation.
But it does not fix the issue of e.g. having used a quadratic algorithm which is breaking your server because in production a user is shoving an order of magnitude more data than you expected (or tested for): you used that algorithm in-request because it was fast. Now it's not fast anymore and your production is getting killed. That's all there is to it, until you fix your code the application is not degraded, it's DOS'd every time that user does something.
And if you do consider workers (because you don't have a better algorithm), what happens for the cheap version? Do you offload it to a worker as well, potentially incurring a spin-off cost greater than the cost of the computation itself, or do you end up with two different codepaths (one sync and one async, just to ensure you're getting as complex as you can) depending on the computation's expected duration?
The platform does it: other requests are handled by other threads or processes (pooled or not), one request is going to consume 500ms and the rest will keep on trucking concurrently. Unless you've gone above the capacities of the machine itself, other requests will be affected little or not at all.
Is there a reason that these examples are using some sort of deeply nested call stack? Is it just to enforce 'slowness' in the function calls?
# iterative Fibonacci: n additions, no recursion
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
I strongly disagree with what Ted wrote, but I really wish he would come out and enter a dialog with the community. There's a thread on the mailing list, twitter, and blog responses floating around now, but no substantial effort by Ted to respond to or even acknowledge the replies he's received.