

Bullet-proof Node.js coding - ideamonk
http://stella.laurenzo.org/2011/03/bulletproof-node-js-coding/

======
tlrobinson
While this is a good article, things like this make me wonder if Node's
asynchronous programming model is a good one. Certainly I don't think it's
worth the mental overhead when you aren't dealing with highly concurrent IO
situations.

I constantly feel like I have to jump through hoops when programming against
async APIs. None of the code in the article feels "elegant" to me.

Of course the popular alternative, preemptive multithreading with shared
mutable state, is probably worse.

~~~
jerf
I've criticized Node.js in the past on this site but I actually only recently
realized what is really _wrong_ with it in a strongly-grounded computer
science way rather than an intuitive way. Or rather, wrong with the
asynchronous event-based programming style in general that Node.js adopted,
rather than Node.js in particular (which is a fine implementation of the bad
idea).

Does anyone remember the structured programming arguments that occurred, oh,
ten years before I was born? There was a lot of arguing back and forth, but
one of the arguments made was that when your program was structured, your
position in the program means something. Just making up some quick pseudocode:

    
    
      def x(y: int, z: int):
          for i in 1 to 10:
      -->      print i * y + z
    
      def a(b: int):
          for i in 1 to 5:
               print i
               x(b, 3)
    

At the line that I marked, the very fact the program counter is pointing
there, and by extension the stack trace up to the point given, means certain
things. We know we have a y and a z which aren't just "undef", we know we're
in a call from a (in this case), and therefore, any preconditions or
invariants that are provided by those functions are also in effect at this
point in time. One of several reasons spaghetti code is bad is that the
program counter means much less; we don't know how we got there, we don't know
what invariants are in effect, in the old school assembler case we know almost
nothing at all, for all we know we just leapt here with a goto.

As is almost always the case, this is a small trivial example, but when you
start layering many things on top of each other, layering in functions that
provide safe file handling or other resource management or state machines or
any of the other tools we've built on structured programming or its smarter
child Object Orientation, it becomes difficult to in some cases impossible to
follow all this in your head. Or deal with the work to make this stuff compose
together properly without layers actively stomping on each other. After
spending some time with the modern functional world and their increasing focus
on composability, working with async event programming feels like stepping
back in time 20 years.

Asynchronous event-based code is not as bad as old-school assembler, because
the event handlers themselves are still using the ideas of structured
programming internally. But the code as a whole is spaghetti code. I'm not the
first to say that, but it turns out upon reflection it's not a slur or a
metaphor, it is actually descriptive and fair. Async event code actually does
share many of the critical properties of spaghetti code, as the term was first
used. You don't have a call stack, you don't have the invariants, you're just
adrift in the code.
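
A minimal sketch of the lost call stack, written in JavaScript since the
thread is about Node. The function names (`whereAmI`, `syncCaller`,
`handleRequest`, `onTimer`) are made up for illustration, and `setTimeout`
stands in for any callback-based API. It captures a stack trace once inside an
ordinary nested call and once inside a timer callback:

```javascript
// Hypothetical names throughout; setTimeout stands in for any async API.

// Structured case: the stack records who called us.
function whereAmI() {
  return new Error("probe").stack;
}
function syncCaller() {
  return whereAmI();
}

// Event-based case: by the time the callback runs, the function that
// scheduled it has already returned and is no longer on the stack.
function handleRequest(done) {
  setTimeout(function onTimer() {
    done(new Error("probe").stack);
  }, 0);
}

const syncStack = syncCaller();
// Reports whether the caller's frame is present in the synchronous trace.
console.log("sync stack names its caller:", syncStack.includes("syncCaller"));

handleRequest(function (asyncStack) {
  // Reports whether the scheduling function survives into the callback's trace.
  console.log("async stack names its scheduler:", asyncStack.includes("handleRequest"));
});
```

In a plain Node run the callback's trace should show only `onTimer` and the
runtime's timer internals, so whatever `handleRequest` had established is not
recoverable from the stack.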

Of course with massive amounts of discipline you can function anyhow. You can
program structured code in assembler with massive amounts of discipline. But
A: you are spending valuable developer mindpower maintaining that discipline
which is better done by a language/runtime/VM, no matter how smart you are
you're still better off spending your smart on something other than raw
plumbing and B: it's actually harder than you think. We've so thoroughly,
utterly internalized structured programming since even before I was born that
we can't hardly even see what we're getting out of it, and consequently we
don't easily realize what we're giving up when we adopt this style. We aren't
any better _people_ than our assembler ancestors, we're not particularly more
disciplined than them, and they switched to structured programming for a
reason.

I'm this critical because I'm actually trapped in this style at work,
fortunately just in one of several subprograms but it's still annoying as hell
and the one that always takes far longer to work with than I'd like. It's an
excess of experience, not a deficit, causing me to be this critical. And I've
really come to loathe this style. Compiling is for computers, not humans.

That's what Erlang and similar languages that can take care of the
asynchronousness at the VM/language level bring to you; they bring you _at
least_ back up to structured programming in power and safety, and possibly
beyond. And the arguments about how wonderful async event based programming is
and how you've got it all under control and how it's performant and not a
problem sound to me like an absolute repeat of assembler programmers ranting
against structured programming back in the day, to an almost scary degree...
and every bit as correct and likely to win the future.

~~~
stellal
Very good points. I wrote the article and have struggled a lot with how
_right_ node is as a general purpose tool. My interim conclusion, for many of
the reasons you point out, is that to the extent that it can be made
relatively safe, it is a good tool for those situations where my other choice
would have been to write a massively scalable, non-blocking C program. This
class of problems does not come up every day, but it is coming up more and
more as the web and mobile arena move to more of a fully connected mesh of
clients and the switchboards that arbitrate their exchanges need to become
more flexible and efficient.

On the flip side, building massively scalable systems based on blocking IO is
rife with problems and I've found that enforcing architectures that make it
manageable have a way of obscuring the code and making it difficult for people
to intuitively get right as well. I personally find the explicit functional
style to be just as intuitive as structured and/or OO and therefore find that
when I need it, a platform like node that makes the knife's edge that is
asynchronous programming explicit and manageable with a good functional style
is a good trade-off in the world where I am trying to find less bad options to
a hard problem.

Sometimes you have to step back 20 years in order to break through the layers
of assumptions that were added in the interim. I doubt node is the last word
on the topic, but it is a refreshing interlude to what was becoming an
unwieldy calcification of the theory of how to program computers. The water
will find the right course eventually as we experiment with the different
styles.

~~~
jerf
"it is a good tool for those situations where my other choice would have been
to write a massively scalable, non-blocking C program."

Agreed. I'd take a Node.js implementation over straight C any day.

"On the flip side, building massively scalable systems based on blocking IO"

Please remember that's an implicit false dichotomy. "Blocking" has to do with
the runtime and the VM, _not_ the visual appearance of the programming
language syntax. Erlang is _non-blocking_ , but it is _also_ structured (and
functional). In fact, by my personal standards Node.js is _still_ a blocking
language; you have to jump through hoops to get non-blocking behavior and you
only get it when you ask for it. In Erlang or Haskell, you simply _get_ it. Go
ahead and do a lengthy math computation if you'd like. Take several minutes.
You won't block anything else in the meantime. And you don't have to manually
break the computation up yourself. Just do it.
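
For contrast, here is a rough sketch of what "manually breaking the
computation up" tends to look like in Node. `sumInChunks` is a hypothetical
helper, not a Node API; it uses the real `setImmediate` call to hand control
back to the event loop between slices of work:

```javascript
// Hypothetical helper: sums 1..n in slices so the event loop can run
// other callbacks between slices. A plain loop would instead run to
// completion before anything else gets handled.
function sumInChunks(n, chunkSize, done) {
  let total = 0;
  let next = 1;

  function step() {
    const end = Math.min(next + chunkSize - 1, n);
    for (; next <= end; next++) {
      total += next;
    }
    if (next > n) {
      done(total);
    } else {
      setImmediate(step); // yield, then continue with the next slice
    }
  }

  setImmediate(step); // always finish asynchronously, even for small n
}

sumInChunks(1000, 100, function (total) {
  console.log("sum:", total);
});
// Other callbacks, such as this timer, may run between slices instead of
// waiting for the whole sum to finish.
setTimeout(function () {
  console.log("timer fired");
}, 0);
```

The bookkeeping (`next`, `total`, the re-scheduling in `step`) is exactly the
plumbing that a preemptive runtime would otherwise do for you.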

I say Node is stepping back 20 years not because it feels primitive compared
to C#, but because it feels primitive compared to the "really existing, I use
it in production code" Erlang runtime. Also Haskell, except I can't say I use
that in production code. And probably Go (still waiting for someone to
confirm), and Stackless Python, and several other things.

The real reason I speak up so often on Node.js articles is not that I hate
Node.js, it is that I hate the _hype_ because it is shot through with
falsehoods, which I'm rapidly upgrading to "lies" as the same blatant
falsehoods continue to get around without the community fixing them. Your
alternative is not async event based or synchronous; there's a third option
where the async is managed by the compiler and runtime and it _works_. In fact
it's Node.js that has to catch up to even the previously existing async
frameworks, and Erlang is simply miles beyond Node.js in every way, except it
isn't Javascript. (Which I freely admit is a problem. Erlang-the-language
could stand to be improved, even as it is hard to borderline impossible to
match the total environment.)

(If you know the choice exists and you don't choose it for some reason, hey,
great. Like I said in my first message above, I've got my own async-event
based program I have to maintain, and I'm the original author, it was my
choice, because when I took all the issues into account, it was the right
choice. But you ought to choose with an understanding of the full picture and
all the relevant choices, and understand all the tradeoffs, not because you
think that async event-based programming is better than everything else at
everything. It's got some really serious drawbacks, and there are really-
existing, production-quality, "web-scale" things that can be used that do not
have those drawbacks.)

~~~
stellal
I can certainly see your point, and I am aware that I am playing a little fast
and loose with some of the terminology around blocking vs non-blocking. To be
precise, in node's case, it presents a consistent, callback based API to non-
blocking IO. Given that that API is reflected exclusively in JavaScript the
language, this impacts the style and appearance of Node programs. The
resultant style is imprecisely referred to as asynchronous, non-blocking, etc.
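
As a small illustration of that callback-based API, the sketch below uses
Node's `fs.readFile`. The path is an assumption chosen so that the read fails;
the point is only the shape of the call: it returns immediately and reports
its outcome later through an error-first callback.

```javascript
const fs = require("fs");

// Assumed not to exist, so the read is expected to fail.
const missingPath = "/nonexistent/bulletproof-example.txt";

fs.readFile(missingPath, "utf8", (err, text) => {
  // Node convention: the error, if any, is the first callback argument.
  if (err) {
    console.log("read failed:", err.code);
    return;
  }
  console.log("read", text.length, "characters");
});

// readFile returns without waiting; this line runs before the callback above.
console.log("read requested");
```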

Using Erlang vs Node on my current project was a serious consideration. It is
well thought out and its message passing and lightweight process based design
is more evolved than anything node has or likely will have. It's also Erlang
and I have found productively programming in it to be incomprehensible. Maybe
it's a personal problem, but it is what it is. Ditto for Haskell. I was
surprised to find that my brain didn't bend that way.

~~~
viraptor
There's always Scala/Java with Akka as a way to get both the actor model and a
"more c-style" language. (<http://akka.io/>)

------
ddispaltro
I'm curious about node fibers. They seem like a second-class citizen in node.
Any chance they will become mainstream, or will it always be like Twisted is
to Python?

~~~
tlrobinson
Most of the Node core team is vehemently opposed to anything they consider a
language extension, which fibers arguably are, so I doubt you'll be seeing
them baked into Node any time soon.

See:
[https://groups.google.com/d/msg/nodejs/GDqkQzmnwHM/FKETaPivX...](https://groups.google.com/d/msg/nodejs/GDqkQzmnwHM/FKETaPivXH4J)

------
Cyranix
Cache link:
[http://webcache.googleusercontent.com/search?q=cache:http://...](http://webcache.googleusercontent.com/search?q=cache:http://stella.laurenzo.org/2011/03/bulletproof-node-js-coding/)

------
stephenhuey
This arrived just in time since I deployed my first Node app on Joyent last
night! Great examples--they're all pretty new to me, but I'm planning to
revisit your tips when I get further along.

------
jhrobert
I'm experiencing "node anxiety", let me explain:

There really are two kinds of functions in a node program.

Synchronous functions and asynchronous functions.

This has major consequences during refactoring when what used to be a
synchronous function now needs to become an asynchronous function => all
synchronous functions that used to call the synchronous function need to be
turned into asynchronous functions themselves.
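
A sketch of that ripple effect, with made-up names (`PRICES`, `loadPrice`,
`lineTotal`, `orderTotal`) and `setImmediate` standing in for a real I/O call.
Once the innermost lookup takes a callback, every caller above it has to take
one as well:

```javascript
// Hypothetical price lookup; all names here are illustrative.
const PRICES = { apple: 2, pear: 5 };

// Before: everything is synchronous and composes with plain return values.
function loadPriceSync(item) {
  return PRICES[item];
}
function lineTotalSync(line) {
  return loadPriceSync(line.item) * line.qty;
}
function orderTotalSync(lines) {
  return lines.reduce((sum, line) => sum + lineTotalSync(line), 0);
}

// After: loadPrice now goes through a callback (imagine a database call),
// so every function above it in the call chain changes shape too.
function loadPrice(item, cb) {
  setImmediate(() => cb(null, PRICES[item]));
}
function lineTotal(line, cb) {
  loadPrice(line.item, (err, price) => {
    if (err) return cb(err);
    cb(null, price * line.qty);
  });
}
function orderTotal(lines, cb) {
  let pending = lines.length;
  let sum = 0;
  let failed = false;
  if (pending === 0) return setImmediate(() => cb(null, 0));
  lines.forEach((line) => {
    lineTotal(line, (err, total) => {
      if (failed) return; // report only the first error
      if (err) { failed = true; return cb(err); }
      sum += total;
      if (--pending === 0) cb(null, sum);
    });
  });
}

const lines = [{ item: "apple", qty: 3 }, { item: "pear", qty: 1 }];
console.log(orderTotalSync(lines)); // 11
orderTotal(lines, (err, total) => {
  if (err) throw err;
  console.log(total); // same value, delivered later through the callback
});
```

Only `loadPrice` actually needed to change, yet `lineTotal` and `orderTotal`
both had to be rewritten, and the asynchronous versions are noticeably longer.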

Sometimes the ramifications go way beyond first expected.

Sometimes the ramifications turn out to be massive.

This becomes worse when one realizes that synchronous functions are much more
readable, half the code and about 5 times faster than asynchronous ones (see
<http://jsperf.com/asynch-cost>)

Eventually the idea of refactoring a synchronous function into an asynchronous
function becomes a source of worries.

And that, my friends, is not something easy to figure out and is what I call
"node anxiety"

------
cpr
Excellent, meaty article with a lot of practical examples and hard-won
knowledge.

------
noacctplz
For #1, why not supply a success and failure callback function to your
doSomeAsyncCall function? There is no reason to muddle those two lines of
logic together (which is the real reason for the described error).
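
A sketch of what that suggestion could look like. The name `doSomeAsyncCall`
comes from the comment above, but its signature and behavior here are assumed,
since the article's code is not reproduced in this thread:

```javascript
// Hypothetical signature: separate callbacks for the two outcomes.
function doSomeAsyncCall(input, onSuccess, onFailure) {
  setImmediate(() => {
    if (typeof input !== "number") {
      onFailure(new TypeError("input must be a number"));
      return;
    }
    onSuccess(input * 2);
  });
}

doSomeAsyncCall(
  21,
  (result) => console.log("ok:", result),
  (err) => console.error("failed:", err.message)
);
```

Each path gets its own function, so the success handler never has to inspect
an error argument before using the result.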

And for #2, relying on JavaScript function hoisting isn't a smart way to keep
your code organized.
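
For readers unfamiliar with hoisting, a general illustration follows. It is
not taken from the article, and the names are made up. Function declarations
are hoisted together with their bodies, while a `var` holding a function
expression is hoisted without its value:

```javascript
function run() {
  const results = [];

  // Works: the declaration below is hoisted with its body.
  results.push(declared());

  // The var is hoisted, but its assignment has not happened yet.
  results.push(typeof expressed);

  function declared() {
    return "declared ran";
  }
  var expressed = function () {
    return "expressed ran";
  };

  // After the assignment, the variable holds the function.
  results.push(typeof expressed);
  return results;
}

console.log(run()); // [ 'declared ran', 'undefined', 'function' ]
```

Code that leans on the first behavior reads top-down, but it breaks as soon as
someone converts a declaration into a function expression.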

------
BillSaysThis
Fairly amusing that a page with this title returns a 404 at this moment (and I
tried reloading twice more).

------
sausagefeet
Anyone used node-fibers? What are the gotchas?

~~~
grayrest
Main gotcha is listed in the "Garbage Collection" section of the readme:

<https://github.com/laverdet/node-fibers>

------
allan_
seems like laurenzo's bulletproof node.js code just got caught in the eventloop

~~~
pjscott
The site is running on Wordpress. Incidentally, one of the major reasons for
Wordpress blogs to go down under heavy traffic is excessive keep-alive times
causing too many Apache threads to be open. It's exactly this lots-of-
connections problem that node specializes in handling gracefully.

~~~
stellal
That was embarrassing. I do know how to craft a good apache config, and am
just going to claim a case of being stupid.

It is ironic that node is in fact designed to handle this type of
memory/connection scalability problem reliably, but in this case it was pure
administrative (me) error.

~~~
_pdeschen
Throw nginx in front :-)

[http://blog.rassemblr.com/2011/01/wordpress-need-for-speed-o...](http://blog.rassemblr.com/2011/01/wordpress-need-for-speed-optimization-with-nginx-caching/)

