
Node.js - A giant step backwards? - jemeshsu
http://fennb.com/nodejs-a-giant-step-backwards
======
danh
Allow me to be a little grumpy:

I hate tabloidy headlines with question marks. As in "Queen Elizabeth: Is She
a Transvestite?", "The Moon: Is It Made of Cheese?" or "Linkbait: Will It Ever
End?".

~~~
dmotz
<http://en.wikipedia.org/wiki/Betteridges_Law_of_Headlines>

~~~
peter_l_downs
Well sourced. Thank you.

------
peterhunt
The big reason callback-based systems are hard (despite the fact that built-in
flow control constructs need to be re-implemented) is that functions are no
longer composable. That is, if I write a bunch of code that doesn't need to do
any I/O, I'll just return a value to the caller. If at any point in the future
the spec changes and _any_ function in this call stack needs to do any I/O,
all the code that ends up calling this function needs to be refactored to use
callbacks.

There really should be language support for this sort of thing (like
coroutines) so these sorts of cascading changes don't need to happen.
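
A minimal sketch of that cascade (all names here are hypothetical, including
the stand-in `db`): a pure function grows an I/O dependency, and every caller
up the stack must be rewritten in callback style.

```javascript
// Before the spec change: pure functions compose by returning values.
function getPrice(item) { return item.base * 2; }
function getTotal(items) {
  return items.reduce(function (sum, item) { return sum + getPrice(item); }, 0);
}

// Stand-in async data source (calls back synchronously for brevity).
var db = { lookupMarkup: function (id, cb) { cb(null, 2); } };

// After getPrice needs I/O, it and every caller must grow a callback.
function getPriceAsync(item, cb) {
  db.lookupMarkup(item.id, function (err, markup) {
    if (err) return cb(err);
    cb(null, item.base * markup);
  });
}
function getTotalAsync(items, cb) {
  var sum = 0, pending = items.length;
  items.forEach(function (item) {
    getPriceAsync(item, function (err, price) {
      if (err) return cb(err);
      sum += price;
      if (--pending === 0) cb(null, sum);
    });
  });
}
```

The bodies barely changed, but every signature and every call site did; that
is the cascade described above.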

~~~
bascule
There are ways to compose functions, like the deferrable pattern. They just
all kind of suck and are poor replacements for having a stack.

I'm all for using coroutines to solve this problem. That's the approach taken
by my Celluloid::IO library:

<https://github.com/tarcieri/celluloid-io>

Unfortunately Ryan Dahl is adamantly opposed to coroutines so that's not going
to happen in Node any time soon.

~~~
peterhunt
Deferreds make your code nicer, but they still don't magically make your code
composable.

What would make Node more attractive is if it supported copy-on-write
multithreading and gave me a way to cheat and use asynchronous I/O (like a
wait(myFunctionThatTakesACallbackOrDeferred) function).

~~~
bascule
There are many ways to compose deferrables, primarily by grouping several of
them together into something which is also a deferrable. See the em-scenario
gem for examples:

<https://github.com/athoune/em-scenario>

Note: I still think this approach sucks.

V8 provides a really awesome shared-nothing multithreading scheme via web
workers. It's just that nobody uses them.

~~~
peterhunt
Oh right that's definitely true and is much more elegant. I was talking about
when a function written in synchronous style (in a long stack of synchronous
calls) needs to call something asynchronous.

------
hasenj
In my experience, node.js is a terrible choice for most web applications. If
you really need real-time, sure, go for it. But if you're interested in node
as an alternative to rails/sinatra/django/flask, then stay away from it. The
cool stuff you get for free like coffeescript/jade/stylus can also work with
the ruby and python frameworks.

Node sucks for general web apps because you have to program everything
asynchronously. This is a major step backwards, and quite frankly it feels to
me like trying to program in assembly. It's not expressive at all. You have to
write your program in some pseudo code, then translate that to async code. And
what for? What's the advantage you'll get? scalability? Who says you will need
it? This is exactly where you should remember that premature optimization is
the root of all evil.

~~~
dahlia
> Node sucks for general web apps because you have to program everything
> asynchronously.

To be fair, the exact problem is not async itself, but forcing CPS
(continuation-passing style) for serial routines. For example, gevent and
eventlet use greenlets (coroutines for Python) to avoid unnecessary callbacks
in serial routines.

~~~
robfig
I'm curious -- how would the author's example look using greenlets?

------
mixu
This is really old - discussion from 6 months ago:
<http://news.ycombinator.com/item?id=2848239>

Basically, async I/O gives you more options than "block the whole world while
you go read this stuff", and that means that old idioms aren't effective.

You gain more control: can choose when to block, when to limit concurrency and
when to just launch a bunch of tasks at the same time. At the same time, you
need to adopt a few new patterns, since you can't/don't feel right blocking
execution every time you use an external data source. It's definitely a
tradeoff and not a magic bullet.

For my longer take on this, see <http://book.mixu.net/ch7.html>
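
One concrete shape of that control, as a hedged sketch (the helper name is
invented, not from any library): run a list of callback-style tasks with at
most `limit` in flight at once.

```javascript
// Hedged sketch: run callback-style tasks with at most `limit` in
// flight at once, collecting results in input order.
function runLimited(tasks, limit, done) {
  var results = [], running = 0, next = 0, finished = 0;
  function launch() {
    while (running < limit && next < tasks.length) {
      (function (i) {
        running++;
        tasks[i](function (err, result) {
          running--;
          results[i] = result;
          if (++finished === tasks.length) return done(null, results);
          launch(); // a slot freed up; start the next queued task
        });
      })(next++);
    }
  }
  launch();
}
```

With `limit = tasks.length` this is "launch everything at the same time"; with
`limit = 1` it degrades to strictly serial execution, which is exactly the
choice the comment above is describing.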

~~~
bascule
You don't get to choose when to block. Blocking becomes an error which hangs
the event loop. The only option for blocking calls is to use a thread pool
that sends events back to the event loop when a thread finishes running.

~~~
mixu
You're right, I should've said "emulate blocking by checking, when each task
finishes, that all of the tasks we queued have completed before proceeding,
while allowing the event loop to run" instead of "blocking". It's not really
blocking the event loop, only controlling the flow of a particular path of
execution. Only a few native APIs provide synchronous versions that block the
entire process until they complete - like the filesystem API's
fs.readFileSync.

------
kls
While I agree that event programming is very different, some of the issues the
author brings up can be dealt with by architecting a program for an event-based
system. I understand that this is the author's gripe: it is sometimes hard for
someone coming from a non-event control flow background to adapt at first. But
items like the loop example mix half control-flow and half event-based
programming. What should be done in that situation is not to return a list,
but to return a promise that gets resolved on completion of the list. Or, a
more elegant solution would be to notify listeners when a new item of the list
is parsed, so that they can observe it and see if it is an item they are
interested in. I understand the author's frustration, but it appears to me
that some more articles on best practices would help bring clarity on how to
deal with these kinds of patterns.

------
vvcephei
I hope this isn't too pedantic, but I wish the author would stay away from
calling Node "concurrent". The whole point of Node is that it's asynchronous
but not concurrent: there is a single thread of execution for your whole
program, which is what lets you ignore locking, etc.

In fact, when you program for Node it's really important to keep this in mind,
since (contrary to another statement from the article) not all libraries are
asynchronous. If you select a synchronous db driver or write a long-running
loop, it will block the rest of your program.

In general, though, I thought it was a good piece. I'm sure many heads have
exploded on first introduction to node (and JS in general).

~~~
bascule
You're confusing concurrency and parallelism. Concurrency describes having
several tasks or operations that contend on resources or events. They may or
may not execute in parallel, and certainly don't in Node. In contrast, Netty
provides a thread pool for executing events and thus provides concurrency and
parallelism.

~~~
devs1010
As far as I know, by definition (coming from someone with a predominantly Java
background) concurrency requires multi-core processors, or, for example, a
distributed application with at least 2 nodes that communicate with the same
central server / "hub" (which would still require a multi-core processor, as
far as I know, on the server). The central tenet mentioned with concurrency is
often that of "race conditions", where it cannot be predicted which thread or
node will access a resource first. If a task isn't executed in parallel, such
that it has to wait for the other to finish before it starts / proceeds, I
don't see how it could cause a race condition. Admittedly, though, I don't
have deep knowledge of how high-level code translates to actual CPU
instructions, so it could be possible, or even likely, that if the processor
is switching between tasks in such a way that each line of code that runs is
from a different method, concurrency issues would occur even on a single-core
processor. A language like Java has robust and mature facilities to deal with
this, so I would be wary of using something like node.js if this wasn't well
documented. Concurrency issues are nasty and I have seen firsthand the mayhem
they can cause in legacy applications I've worked on.

~~~
ww520
Concurrency doesn't require multi-core. Parallelism requires multi-core.

------
EGreg
This guy doesn't know much about javascript, I am guessing. I made some gists
that take his code and add minimal changes to it, that fix the problems he
complains about:

"Two different code paths, can't do DRY" really?
<https://gist.github.com/1678395>

"Oh noo, I can't return the results because they are async". That's what
callbacks are for. You know what you CAN do? Do I/O in parallel that's what!
Node makes it easy. <https://gist.github.com/1678415>

Anyway I hope this illustrates the point. The guy says it exactly right in one
place: "Once you get your head around thinking in async terms, node.js starts
to actually make a lot of sense." And therefore it is not a giant step
backwards.

There are more elegant ways to write this (see
<http://qbix.com/plugins/Q/js/Q.js>) but these are just minimal changes to his
own code.

------
CoffeeDregs
I tend to agree with the post, but I find the one-language-to-rule-them-all
thing too compelling to fret too much over asynchronicity. Although it'd
probably lead to lots of synchronous code, it would be nice if it were easier
in nodejs to be synchronous sometimes and asynchronous sometimes.

I would refactor the code to something like (still not as simple as
synchronous):

    
    
        asynchronousCache.get("id:3244", function(err, myThing) {
          var useResult = function(err, _myThing) {
              // We now have a thing from DB, do something with result
              // ...
          };

          if (myThing)
             useResult(null, myThing);
          else
             asynchronousDB.query("SELECT * from something WHERE id = 3244", useResult);
        });

------
jrockway
The problem is that the author simply didn't notice the refactoring and
abstraction opportunities available. (One question to ask yourself: "How do I
test this?" If you can't answer that question, the code is wrong.)

We'll start with the synchronous example:

    
    
        myThing = synchronousCache.get("id:3244");
        if (myThing == null) {
          myThing = synchronousDB.query("SELECT * from something WHERE id = 3244");
        }
    

This is verbose and tedious. We should really make the API look like:

    
    
        myThing = database.lookup({'id':3244}, {'cache':cache_object});
    

Let's apply this idea to his asynchronous example. We want the code to look
like:

    
    
        database.lookup({'id':3244}, {'cache':cache_object}, function(myThing) {
            // whatever
        });
    

So instead of writing this:

    
    
        asynchronousCache.get("id:3244", function(err, myThing) {
          if (myThing == null) {
            asynchronousDB.query("SELECT * from something WHERE id = 3244", function(err, myThing) {
              // We now have a thing from DB, do something with result
              // ...
            });
        
          } else {
            // We have a thing from cache, do something with result
            // ...
          }
        });
    

We need to refactor this. Remember, node.js is a continuation-passing-style
language. So let's set a convention and say that every function takes two
continuations (success and error).

Then, to compose two functions of one argument:

    
    
       function f(x, result, error)
       function g(x, result, error)
    

To:

    
    
       h = f o g
    

You write:

    
    
       function compose(f, g){
           return function(x, result, error){
               g(x, function(x_){ f(x_, result, error) }, error);
           }
       }
    

(Data flows right-to-left over composition, so "do x, then do y" is written:
"do y" o "do x".)

Now we can cleanly write a complex program from simple parts. We'll start by
creating a result type:

    
    
        result = { 'id': null, 'value': null, 'not_found': null }
    

Then, we'll implement cache functions that take keys (as results of this type)
and return values (as results of this type). Looking up an entry in cache
looks like:

    
    
        cache.lookup = function(key, result, error){
            new_key = key.copy();
            cache.raw_cache.lookup(key.id, function(value){
                new_key.value = value;
                new_key.not_found = false;
                result(new_key);
            },
            function(error_type, error_msg){
                if(error_type == ENOENT){
                    new_key.not_found = true;
                    result(new_key);
                }
                else {
                    error(error_type, error_msg);
                }
            });
        };
    

Looking up an entry in the database looks about the same. The key feature is
that the "return value" and the "input" are of the same type. That makes
composing, in the case of "try various abstract storage layer lookups in a
fixed order", very easy. (Yes, the example is contrived.)

    
    
        dbapi.lookup = function(key, result, error){ ... };
     

Now we can very easily implement the logic, "look up a value in the cache, if
it's not there, look it up in the database":

    
    
        cached_lookup = compose(dbapi.lookup, cache.lookup);
        cached_lookup(1234, do_next_step, handle_error);
    

You can, of course, generalize compose to something like:

    
    
        my_program = do([cache.lookup, dbapi.lookup, print_result]);
    

Writing clean and maintainable code in node.js is the same as writing it in
any other language. You need to design your program correctly, and rewrite the
parts that aren't designed correctly when you realize that your code is
becoming messy.
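
The generalized version can be sketched as a fold over the compose defined
above (called `doAll` here, since `do` is a reserved word in JavaScript; the
step functions are illustrative):

```javascript
// compose from the parent comment, plus a generalized doAll.
function compose(f, g) {
  return function (x, result, error) {
    g(x, function (x_) { f(x_, result, error); }, error);
  };
}

function doAll(steps) {
  // Reverse first so the steps listed left-to-right run in that order,
  // even though data flows right-to-left over composition.
  return steps.slice().reverse().reduce(compose);
}

// Illustrative steps in the two-continuation convention:
function inc(x, result, error) { result(x + 1); }
function dbl(x, result, error) { result(x * 2); }

var pipeline = doAll([inc, dbl]);
pipeline(3, function (r) { /* r is (3 + 1) * 2 = 8 */ }, function () {});
```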

Continuation-passing style is pretty weird, but you do get some benefits over
the alternatives. Writing a program with coroutines involves deferring to the
scheduler coroutine every so often, littering your code with meaningless lines
like "yield();". Using "real" threads is even worse; your code looks like
single-threaded code, but different parts of your program are running
concurrently. (Did you share any non-thread-safe data structures, like Java's
date formatter? Hope not, because you won't know you did until the production
code dies at 3am.) Continuation-passing style lets you "pretend" that you are
executing multiple threads concurrently, but the structure of the code ensures
that only one codepath is running at a time. This means that libraries that
don't do IO don't have to be thread safe, since only one "thread" runs at a
time.

All concurrency models involve trade-offs over other concurrency models. But
when comparing them, make sure you're comparing the actual trade-offs, not
your programming ability with each model.

~~~
vasco
I think you've actually demonstrated how something that would be really simple
in python, gets horribly complicated in javascript. Or maybe it's just me...

~~~
jrockway
If you use a continuation passing style in Python, then the code looks about
the same. Most Python programmers use threads (and let the GIL give them a bit
more thread safety than C++ and Java programmers get) or Twisted (with
Deferreds).

I think you'll write better JavaScript if you know Python because Python
encourages you to use named functions instead of lambdas. JavaScript fanbois
get very excited about anonymous functions and overuse them; Python doesn't
let you use anonymous functions for anything useful, so you tend to name
things. (object.method is also nice syntax for working with callbacks.)

Anyway, Python and Node feel about the same to me, except for the fact that
Python has nicer syntax.
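
To make the contrast concrete, here is the same cache-then-db lookup written
both ways (every object below is a stand-in invented for the sketch):

```javascript
// Stand-ins so the sketch runs: a cache that always misses, a db that
// always returns one row, and a render that just records its argument.
var sql = 'SELECT * from something WHERE id = 1';
var rendered = [];
var cache = { get: function (key, cb) { cb(null, null); } };
var db = { query: function (q, cb) { cb(null, [{ id: 1 }]); } };
function render(v) { rendered.push(v); }

// Nested anonymous functions:
cache.get('id:1', function (err, value) {
  if (value == null) {
    db.query(sql, function (err, rows) { render(rows[0]); });
  } else {
    render(value);
  }
});

// The same logic with named callbacks, Python-style:
function onCacheResult(err, value) {
  if (value == null) db.query(sql, onDbResult);
  else render(value);
}
function onDbResult(err, rows) { render(rows[0]); }
cache.get('id:1', onCacheResult);
```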

------
prodigal_erik
Transforming functional or imperative code into continuation-passing style
isn't that big a deal. If javascript weren't such a pain in the ass just to
parse, there would probably be tools to do that. Maybe coffeescript will do
it, but this is why macro-extensible languages are a big win—I'd have the
right hooks to easily do it myself rather than waiting for the implementors to
officially update the language (or get elbow deep in their internals and hope
they accept a huge patch).

~~~
chc
There's already a branch of CoffeeScript, called Iced CoffeeScript
(<http://maxtaco.github.com/coffee-script/>), that does this. The example in
the post would look like:

    
    
      for blogPostId in recentBlogPostIds
        await asynchronousDB.getBlogPostById blogPostId, defer(err, post)
        templating.render post
    

You still can't simply return the result (you'd have to use a continuation),
but it does make it simple enough to use CPS in general.

------
shangaslammi
To make things easier, you can use node-fibers
(<https://github.com/laverdet/node-fibers>) to structure your asynchronous
code with coroutines or, if you are feeling less adventurous, async
(<https://github.com/caolan/async>) is an excellent helper library for common
asynchronous code patterns and it works on the client-side as well.

------
sek
This callback programming is the reason I quit node.js. It is easy to get
something up and running, but then it feels like I never get out of the chaos
I created.

And this whole thing on top of the mess JavaScript already is? I never liked
it to begin with, but there is no real alternative until Dart is ready. Every
time something comes out for JavaScript it adds another abstraction and more
chaos, in my opinion. jQuery, for example: really impressive to begin with,
but when you see what a horrible mess you can create with it...

There is a reason why big companies never adopt these things; I can't imagine
what it would be like to take over a node.js app from someone else.

Now you can argue that this takes practice. Crockford may write JS from
heaven, but I don't want to invest my time in this language. These
inconsistencies are not fun to deal with, and when Dart is here, companies
will drop it very fast.

I am now stuck with Scala, which is the complete opposite. It is complicated
to get into, but once you get it, you have a gigantic toolbox to solve every
problem the way you want. For web programming I recommend Lift, but if you
want to get in fast and are a fan of async, try Play 2.0. Node.js made async
popular; it should get credit for that.

~~~
jonknee
> There is a reason why big companies never adopt these things; I can't
> imagine what it would be like to take over a node.js app from someone else.

That's incredibly inaccurate. Big companies adopt a lot of crazy things,
craziness isn't much of a deciding factor. Node.js is used by plenty of large
companies and despite your tastes, Javascript in general is ridiculously
popular in companies of all sizes.

~~~
sek
References please; client-side doesn't count. Google forbids it, and I don't
know of any big company that has SSJS in production. It's definitely not
ridiculously popular.

~~~
marekmroz
Apparently Walmart also uses it for mobile:
<http://venturebeat.com/2012/01/24/why-walmart-is-using-node-js/>

------
MatthewPhillips
Use recursion. Fixed:

    
    
      asynchronousCache.get("id:3244", function doThing(err, myThing) {
        if (myThing === null) {
          asynchronousDB.query("SELECT * from something WHERE id = 3244", function(err, myThing) {
            // We now have a thing from DB, do something with result
            doThing(err, myThing);
          });
          return;
        } else if (err !== null) {
          // Handle error.
          return;
        }

        // We have a thing.
      });

~~~
CoffeeDregs
Were you implying that your solution was simpler?

------
ok_craig
Tame JS (tamejs.org) makes asynchronous code in node.js very very easy.

~~~
pstuart
Thank you!

------
nagnatron
I thought that this was going to be a node.js bashing.

What a let down.

------
danbmil99
The problem is that JavaScript continuation syntax is ugly and verbose. All
the nested indentation fails to map to our human sensibilities about what the
code is actually designed to do.

Someone needs to fix this.

------
wicknicks
I am not convinced by the blog post's example. If ordering and null entries
really matter, then you must add appropriate instructions to handle them.
Ordering can be tackled by adding the blogPostId to each entry and sorting the
resulting collection by this key (assuming that you don't pick up a million+
posts).
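
That ordering fix can be sketched in a few lines (getBlogPostById is a
hypothetical stand-in for the post's DB call; it calls back synchronously
here just to keep the sketch runnable):

```javascript
// Hypothetical stand-in for the async DB call from the blog post.
function getBlogPostById(id, cb) { cb(null, { id: id, title: 'post ' + id }); }

var ids = [3, 1, 2];
var posts = [];
var pending = ids.length;

ids.forEach(function (id) {
  getBlogPostById(id, function (err, post) {
    posts.push(post); // arrival order is whatever the I/O decides
    if (--pending === 0) {
      // Everything is in: sort by the key each entry was tagged with.
      posts.sort(function (a, b) { return a.id - b.id; });
    }
  });
});
```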

I advocate polyglot programming, and using node.js for tasks other than what
it's designed for (server-side async programming) may produce unfavorable
results.

iPods are not lousy because one can't text with them.

------
gexla
My take on this post is that the author was calling Node a giant leap
backwards because he believed all code should have the look and feel of
Python.

But now, he's not so sure.

Did I miss anything?

ETA: I understand that coding for Node looks and feels weird, but so does
coding for Lisp, Smalltalk, Haskell and a long list of other programming
languages.

~~~
batista
The problem is that the problems Node solves can be solved better and more
cleanly than with a spaghetti of callbacks (with coroutines, for example).
It's just that Node and JavaScript are not up to the task.

Now, the thing Node has going for it is that, despite being inferior to other
similar technologies, it has a big following (community matters), lots of libs
(libs matter), and it's based on a language easy and familiar to many.

------
pimeys
I kinda like em-synchrony for Ruby. It handles the callbacks with fibers, so
my actual code doesn't have the callback hell of JavaScript. Although the
implementation of Ruby fibers is not so nice at this point; I hope they'll fix
it in the next versions.

~~~
bascule
em-synchrony's problem is that it has people wrap asynchronous libraries one
at a time. There are a few problems with that:

You're exposing a synchronous API, but still can't take advantage of the huge
ecosystem of Ruby libraries that already expose synchronous APIs.

Wrapping libraries becomes a one-off chore. Each individual library must be
wrapped to work in an em-synchrony system, and if the libraries aren't both
asynchronous and wrapped in fibers you can't use them. This not only shrinks
the ecosystem of libraries further, but is also more error-prone than
providing a general coroutine abstraction around socket IO.

Providing a generalized abstraction for doing synchronous I/O with
sockets/fibers and an evented backend is exactly what I'm working on in
Celluloid::IO:

<https://github.com/tarcieri/celluloid-io>

~~~
pimeys
This is a very interesting project. I'll have to dig deeper into it soon.

------
ww520
Async code, like threaded code, is different from simple synchronous code.
These different coding styles exist to take advantage of the concurrency the
system offers. Developers coming from a simple single-flow-of-control
background often complain about the extra flow-control complexity when it's
outside their comfort zone. Think of it as leveling up your skill.

There are libraries out there that add syntactic sugar to make async code look
like sync code. Like,

    
    
        group (
            asyncfunc1()
            asyncfunc2()
            asyncfunc3()
        )
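
A minimal version of that kind of sugar in plain JavaScript (a hypothetical
helper, not from any particular library): call `done` once every function in
the group has finished.

```javascript
// Run a group of callback-style functions; fire done when all complete.
function group(fns, done) {
  var pending = fns.length;
  fns.forEach(function (fn) {
    fn(function () {
      if (--pending === 0) done();
    });
  });
}

// Each "async" function takes a completion callback:
var log = [];
group([
  function (cb) { log.push('one'); cb(); },
  function (cb) { log.push('two'); cb(); },
  function (cb) { log.push('three'); cb(); }
], function () { log.push('done'); });
```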

~~~
mattadams
I think my biggest gripe is that these things are only possible by using one
of the many libraries or rolling your own solution. The rather disorganized
state of the node.js libraries is far too confusing for most of us who don't
do Node 24x7.

~~~
tlrobinson
Agreed. The solution needs to be part of the language, or at least node
itself, not a 3rd party library (or 10)

------
rpledge
It takes some work to get used to the change in flow control, but it seems to
be worthwhile (at least it has been so far for my project). Coming from a real
time/embedded background seems to have helped me, because typically those
systems are heavily event based. I don't know if node.js will change the world,
but I think it's worth at least playing with just to get some experience with
the programming model.

------
ricardobeat

        getPosts = (ids, cb) ->
            res = []
            stash = (err, post) ->
              res.push post
              cb(res) if res.length is ids.length
            db.getPostById(id, stash) for id in ids

        getPosts [...], (posts) ->
            # go on...


use `res[i] = post` (iterating with `for id, i in ids`) if there is some
implicit ordering.

------
nwjsmith
Declaring node a 'giant' step backwards is a stretch. Callback spaghetti isn't
the problem it set out to solve. It is meant to provide easy(-er?)
concurrency. If you measure it against its goals, I think it's pretty good.

------
scriby
I wrote a module based on fibers to help with these sorts of problems. Take a
look at <https://github.com/scriby/asyncblock>

------
brendoncrawford
This is a rather sensational article. The author's first example is pretty
easily solved:

    
    
      getFromCache = function (id, callback) {
        asynchronousCache.get(['id', id].join(':'), function(err, myThing) {
          if (myThing == null) {
            asynchronousDB.query("SELECT * from something WHERE id = $id", {id:id}, function(err, myThing) {
              callback(myThing);
            });
          }
          else {
            callback(myThing);
          }
        });
      };
    
      getFromCache(3222, function (myThing) {
        console.log('myThing:', myThing);
      });

~~~
brendoncrawford
Or if you want more re-usability:

    
    
      getFromCache = function (id, query, callback) {
        asynchronousCache.get(['id', id].join(':'), function(err, myThing) {
          if (myThing == null) {
            asynchronousDB.query(query, function(err, myThing) {
              callback(myThing);
            });
          }
          else {
            callback(myThing);
          }
        });
      };
    
      getFromCacheSomething = function (id, callback) {
        var query = buildQuery("SELECT * from something WHERE id = $id", {id:id});
        getFromCache(id, query, callback);
      }
    
      getFromCacheSomething(3222, function (myThing) {
        console.log('myThing:', myThing);
      });

~~~
davej
Don't forget to bubble errors up the callback chain along with return values
(I always forget that too):

    
    
        getFromCache = function (id, query, callback) {
          asynchronousCache.get(['id', id].join(':'), function(err, myThing) {
            if (myThing == null) {
              asynchronousDB.query(query, function(err, myThing) {
                callback(err, myThing);
              });
            }
            else {
              callback(err, myThing);
            }
          });
        };

------
phzbOx
I rarely use ifs and whiles in JavaScript. There are better, higher-level
libraries that take care of it for you. As a bonus, some of them let you make
it *parallelized* with the same syntax. Obviously, when you switch to a new
language, you need to learn its new paradigms / designs.

~~~
adgar
> As a bonus, some of them let you make it *parallelized* with the same
> syntax.

JavaScript is single-threaded, so parallelism within a single JS VM is not
possible.

~~~
phzbOx
Hence the *

