The emperor's new clothes were built with Node.js (ericjiang.com)
91 points by erjiang on June 6, 2014 | 58 comments



Just because lots of people say "non-blocking, good concurrency, event-based model" without understanding it doesn't mean it's not correct. With a single-threaded, event-based model, you can get an awful lot of concurrency at reasonably consistent latency without ever thinking about threading issues or managing thread pool size to balance memory and concurrency -- and often with less memory, too.

Yes, there are downsides: it's not a natural model for many people; you do still have to consider many complex concurrency-related issues; and control flow libraries are usually needed to make complex code much more readable. But these are either easy to deal with or intrinsic to the complex problem at hand, and the above still stands: once you understand how to program with this model, the "default" implementation usually scales very well.

Yes, Node was not the first to do this, but that's not relevant to anything.

Besides all that: I'd put Node's postmortem debugging and dynamic tracing capabilities up against any other dynamic environment.

As a side note: it's enormously condescending to tell people that a very broad technical choice is not only wrong for their technical problem -- without even knowing what it is! -- but that they're also part of some mass delusion for even thinking that there's anything to it. I hate arguing about this stuff. You keep trashing Node, and we'll keep using it to build large distributed systems.


Hi, I attempted to address your argument in the section roughly titled "Node.js has super concurrency!". You may have to scroll down "below the fold", so to speak.

I presented two different ways of writing a single-threaded, event-based function that can handle "an awful lot of concurrency without ever thinking about threading issues or managing thread pool size to balance memory and concurrency". Essentially, it is possible to get good concurrency in a single thread without the use of many nested callbacks or abandoning the language's natural control flow primitives. It's not necessarily the case that you can't have both event-based concurrency and sane code.
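
To make the contrast concrete, here is a minimal sketch (not the article's code) of one three-step sequence written first with nested callbacks and then as a flat promise chain. The names `step`, `stepP`, `nestedStyle`, and `flatStyle` are made up for illustration, and `setTimeout` stands in for real I/O. Both versions run on a single thread without blocking.

```javascript
// Hypothetical async step: pretends to do I/O via a timer, then
// reports its result through an error-first callback.
function step(name, input, callback) {
  setTimeout(function () {
    callback(null, input.concat(name));
  }, 0);
}

// Style 1: nested callbacks. Each step nests inside the previous one.
function nestedStyle(done) {
  step('task1', [], function (err, a) {
    if (err) return done(err);
    step('task2', a, function (err, b) {
      if (err) return done(err);
      step('task3', b, done);
    });
  });
}

// Promise wrapper around the same callback-based step.
function stepP(name, input) {
  return new Promise(function (resolve, reject) {
    step(name, input, function (err, out) {
      if (err) reject(err); else resolve(out);
    });
  });
}

// Style 2: a flat promise chain over the same steps.
function flatStyle() {
  return stepP('task1', [])
    .then(function (a) { return stepP('task2', a); })
    .then(function (b) { return stepP('task3', b); });
}
```

Whether the second form counts as "sane code" is a matter of taste, but it shows that event-based concurrency and deep nesting do not have to go together.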


Thanks. Far from not having scrolled far enough, I was actually responding directly to several points in that section, starting with the suggestion that people who say "non-blocking", "super concurrency", and "evented" may not understand what they're saying. I was also responding to your point that Node wasn't the first runtime to do so.

Yes, other environments have compelling concurrency models. As for control flow: it doesn't matter much to me whether the control flow abstraction is built into the language or merely a ubiquitous construct in the code I use. (In fact, that's a tradeoff, too: I like that I can go inspect, modify, and extend these non-primitives easily, and I've done so to add better observability.) The Go example is interesting, but less explicit -- that, too, is a tradeoff. It's all tradeoffs; to pretend like one thing is strictly worse (or better) than all others is not founded.


If you read the interviews with Ryan Dahl, one big reason he chose javascript for his scripting language after considering a number of other options was that javascript did not already have a huge body of blocking libraries.

If you're trying to script a nonblocking system where everything is nonblocking all the way up and down the stack, you're going to end up creating a parallel ecosystem to the normal one, and confuse people by not allowing them to do the normal things they would do in language X.

Javascript didn't already have a huge blocking ecosystem.

And speaking of ecosystem, that's another really good reason to like node. The stream programming that is coming out of the community is brilliant, and the focus on constructing large projects by writing thin modules that use other thin modules ('onion oriented programming') is fantastic.

I've been doing a lot of Scala recently, and it's extremely noticeable that despite the language's power to do things in many different ways, the Scala community frequently seems to choose a strangely convoluted path filled with hidden magic going on behind the scenes, where every new library you depend on feels like learning a new language from scratch.

It might well be that the Go community is also as good as the node community for some of this stuff, but the community and library ecosystem is a definite plus point.

Personally I've never had a problem with Javascript coding. It's perhaps easier to write stupid code in it than a lot of other languages, but I know it well enough to know where to be disciplined, and it almost never surprises me now.


This is the article I'm going to refer people to to explain the problems that I have with nodejs. All of these (absolutely true) observations aside, I really like how the nodejs trend has brought concurrent async-style thinking into the mainstream of web programming.

By "all of these observations aside," I mean that I hate working with it personally, and I'll be using Erlang/Webmachine thank you, but I learn new ways to think about modern problems every day from many of the great people that are part of the nodejs juggernaut.


Sarcasm is pretty heavy in this article, but I think it's a little overly negative...

He makes a lot of comparisons with Go, but Go was created after Node. Many of the languages he notes as possible replacements for the non-blocking rant are obtuse (I love Haskell, but there's no way it's not obtuse)

[EDIT] - turns out they were made pretty much at the same time 2009... Learned something today, I guess. Felt like Go came after node to me but I guess it didn't

Also, I must be the only one that doesn't really mind callbacks. Maybe it's stockholm syndrome.

I also think node should get kudos for at least seeming different enough to prompt the move from enterprise customers. Any move away from burdened "enterprisey" software is worth praising I think. Not because java is bad/lame by default, but instead because it might prompt other companies to lust after something new that might be better (whether it's python, node, go, or haskell is irrelevant)


I picked Go not because it's my dream tool (not having 'map' is kind of sad) but because it's proof that you can have your cake and eat it too. Otherwise, people think that you have to choose between programming sanely and high-performance, but not both.


Yes, I agree -- but Go's got its own warts. I think they picked up that (err,result) common function signature from node (or they just thought of it themselves, either way).

However, I think once you start trying to manage goroutines and all their channels, and all the messages that are flying around, Go will start looking less utopian. Of course, a lot of that is subject to code architecture (well architected code will be easy to follow), but I feel like no one's found enough of the dark corners of Go yet.

Oh one thing I wanted to mention - Why didn't you include any samples of goroutines in your code?

Unless I am misunderstanding Go immensely, a statement like "task3(task2(task1()))" wouldn't actually be asynchronous; you would need to spawn dependent goroutines, right, and wait on completion on the channels?


> a statement like "task3(task2(task1()))" wouldn't actually be asynchronous

Correct, because you're passing the result of task1() into task2(), and the result of that to task3(), afaik.
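
The same ordering holds in any language that evaluates arguments before making a call. A tiny JavaScript sketch (hypothetical task functions, not the article's Go code) makes it visible: each task records when it runs, and the nested call can only proceed innermost-first because each call needs the previous result.

```javascript
// Hypothetical tasks that record the order in which they run.
var order = [];
function task1() { order.push('task1'); return 1; }
function task2(x) { order.push('task2'); return x + 1; }
function task3(x) { order.push('task3'); return x + 1; }

// Arguments are evaluated before the call, so this runs strictly
// in sequence: task1, then task2, then task3.
var result = task3(task2(task1()));
```

The data dependency is what forces the sequence. Running the tasks concurrently only helps when they do not need each other's output.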


Node hasn't impacted enterprise customers. It has impacted a few high profile silicon valley companies. Also, Java is not synonymous with enterprise - look at Twitter for example.

I imagine Twitter every day looks online at these node articles and just laughs. That's a team that realizes Java != JEE and that JVM programming is a great thing to be doing.


> Java != Java EE

Not every usage of Java involves Java EE, but that doesn't mean Java EE isn't a really valuable tool for many of us. It has had a bad name in the past, but the latest versions have totally changed that and it's now a very sane tech.

Twitter of course is a highly specialized thing that operates at a scale that, almost by definition, not every web site out there will be at. It makes sense to build a lot of their own tools and only use lower level infrastructure as a base.

So the JVM and Java is low enough for them to build their stuff on, but many parts of Java EE may be too high level for their very specific use case.


> I also think node should get kudos for at least seeming different enough to prompt the move from enterprise customers. Any move away from burdened "enterprisey" software is worth praising I think. Not because java is bad/lame by default, but instead because it might prompt other companies to lust after something new that might be better (whether it's python, node, go, or haskell is irrelevant)

The infuriating thing is that it's not the technology that makes the difference, it's the clean slate and getting away from stupid policy. I bet the Paypal people could have written a better version in Java than the Node version, if they weren't required to use the existing internal library and follow existing coding standards and deploy using approved infrastructure and...


> He makes a lot of comparisons with Go, but Go was created after Node.

It's not like Node invented evented concurrency. Something like Twisted predates Node by many years, for instance.


Yes, very true -- Tornado also.

However, if twisted were its own language, it may end up looking much like node. Twisted is just a library (a damn good one, yes), but if you were to try and transform it into the base framework for an entire ecosystem people might also rail against it from time to time, due to the async-by-default nature.

Also, to me, twisted seems like a much more high-level tool (replicating protocols over simple parallelization/concurrency), than something like gevent


What I'm trying to say is, even in the event that Go was created after NodeJS, the "but Go could learn from Node's mistakes" argument implied here doesn't work. There were enough event-based concurrency systems before the advent of Node, and I'm sure the creators of Go were aware of them.


Also, if you're going to choose a language to compare to, don't choose one where you can't even implement map, filter, etc. I'll take callbacks over having to use loops for everything.


"(I love Haskell, but there's no way it's not obtuse)" I invite you to drink the kool-aid, then you won't think Haskell is obtuse anymore.

Love, Haskell fan-boy


Oh I have partaken of said kool-aid. Maybe I haven't had enough of it.

I've done toy examples, and watched lectures (Erik's C9 lectures are amazing), and I just haven't found anything to write in Haskell.

At this point Haskell is starting to seem like the sketchy guy in the corner at a party. "Oh you think I'm obtuse? just drink a little bit more of this kool-aid here"


I agree with a lot of the arguments in the article, but I do sense a very negative opinion of Node.js in it. I don't believe Node.js is that bad.

The 2 points I always hear from Node.js are:

1) Node.js is fast. It is really not that fast. It all depends on what you are comparing it to.

2) It is easy to learn. It is not easy to learn. The differences between browser javascript and V8 javascript are considerable. Not everybody will pick it up as easily as you imagine, especially if you are choosing Node.js to allow frontend devs to work on backend code. It is a very different mindset.

I like the idea that is Node.js, but I don't think it is mature enough to take over the backend. I think there are other more mature languages right now that do the job without making programs look like spaghetti code.

I think people should take this article with a grain of salt, and keep getting their hands dirty in Node.js. I think it has a future in the cloud.


Why is it that most of the articles which pop up complaining about node's concurrency options don't even mention streams? For most data-manipulation tasks, FRP-style composition of higher order operations on streams turns out to be the best choice; not callbacks, promises, etc. Not even bringing them up in a discussion of concurrency in node makes it hard to take the article seriously.
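
For readers who have not seen the style, here is a minimal sketch of composing operations on Node streams. It is an assumption-laden illustration rather than anything from the article: `upperCase`, `collect`, and `run` are hypothetical names, and `stream.pipeline` and `stream.Readable.from` come from newer Node releases than were current when this thread was written.

```javascript
var stream = require('stream');

// Hypothetical transform: upper-cases each chunk passing through.
function upperCase() {
  return new stream.Transform({
    transform: function (chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    }
  });
}

// Hypothetical sink: collects everything written to it into `parts`.
function collect(parts) {
  return new stream.Writable({
    write: function (chunk, encoding, callback) {
      parts.push(chunk.toString());
      callback();
    }
  });
}

// Compose source -> transform -> sink. `done` is called once, either
// with an error or with the concatenated output.
function run(words, done) {
  var parts = [];
  stream.pipeline(
    stream.Readable.from(words),
    upperCase(),
    collect(parts),
    function (err) { done(err, parts.join('')); }
  );
}
```

Each stage only sees one chunk at a time, and error handling is centralized in the single pipeline callback instead of being repeated at every step.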


"everything they call is callback based, returns a Q promise, returns a native promise, is a generator, is a pipe, or some other weird thing because it’s Node.js. (Just tell them to check the type signatures.)"

I think this addressed your question indirectly.


Not sure I've run across what you're describing. Are you talking about using streams as an option to get around callback hell? Have a link describing this?



Yes, streams/pipes/channels/whatever can be great for certain classes of problems. I omitted them because I very very rarely see them used in Node.js. They are probably under-utilized and I simply didn't have enough examples to think about, since the overwhelming majority of the libraries and code I've encountered use callbacks. It could be a great topic to write about if you have the time.


But what about when you need the whole object in memory before you want to work with it? Also, many of the libraries written for nodejs are not necessarily stream-ready -- but I would argue most are callback-ready.


Computer Programming: A million wrong ways to do something and a million right ways to do something.


And where the only correct answer is "It Depends"


Those "million wrong ways" are not distributed equally amongst languages/environments :)


Isn't it just the case that nodejs is the least worst way of writing a concurrent web server? we know it's not perfect, but what is?


I'd like to think that if we weren't all watching Node we'd have more people to pitch into other things. I hear a lot of bluster about "learn new things" but I often just see people sticking with what they know. When the bright minds of web development are figuring out new ways to make JavaScript less awful, those minds aren't focusing on other solutions.

FWIW I think Elixir holds a lot of promise for the future. Projects like https://github.com/dynamo/dynamo look really cool. Of course, it's new, and alpha. But maybe we could all give it a try?


Erlang pretty much blows Node.js out of the water, in terms of sophistication of concurrency primitives (to mention one thing). Erlang is really just syntax on top of BEAM, which is pretty tailored to the language (you'll notice Elixir and Erlang share identical types etc).

Elixir adds some nice metaprogramming features, and I love that it exists. Erlang is used for a lot of Important Things in production, and its value to many developers is stability and fault tolerance. Erlang moves slow. It's nice to see a sister language that can more rapidly develop new language features (maybe some of the good ones will get added to Erlang :)

Joe Armstrong's post about Elixir is pretty interesting [1]

[1] http://joearms.github.io/2013/05/31/a-week-with-elixir.html


A whole class of programmers have learnt to write continuation-passing code in Javascript for asynchronous operations, as that's the way AJAX works. Moreover, it's the only way you can do things in the browser.

Node is web-browser style javascript programming, detached from the browser GUI and given the ability to accept inbound connections. Along with this has come a whole slew of fun wheel-reinvention around frameworks and templating engines.


I'd argue the simple approach of having a web server (nginx, apache) send requests to a pool of workers (perl, php, python, ruby) is considerably superior.


the problem is can workers do something else while waiting for the db to respond? AND are all libs built for that use case? the only advantage of nodejs is you have no choice but to write code a certain way.


The biggest webservices of today like google search, facebook, aws are all written using models that your parent suggests. Are you saying if these guys rewrite all their code in Go, they will basically cut down 50% of their operation costs or something? What exactly does 'worker can do something else' translate to?


Technically, yes - the workers can be multithreaded. If you put multithreaded workers behind an event-loop web server (both nginx or apache can do this), then they can handle many http keep-alives as well. It's not the same thing as using an event loop throughout the whole stack, but it's pretty effective.


When you take the time to learn relational databases, sql, normalization and maybe some caching, suddenly you don't find yourself spending very much time waiting on the DB.

If something indeed is going to take that long, perhaps you shouldn't be making the user wait on it either. Queue that shit.


I've spent a lot of time porting node.js code to Go. While the Go applications typically get an unfair benefit of being the result of the refactor, they have without fail (in my XP) had similar to drastically better runtime characteristics, while I would argue (and my teammates agree), being much simpler to read and maintain.

I have written a few web apps in Go, as well as spending most of my time writing things like high performance proxies, CDNs and databases in Go, and IMHO Go is a "more least worst" way to write highly concurrent code compared to node.


I love your ending :) "Does your local Node.js Meetup group need a presenter? I am available for paid speaking engagements. Email me for more info."


I prefer http://silkjs.net

It handles each request in a forked subprocess so it's a lot more straightforward.


without sounding like a web 3.0 idiot -- how do you manage to scale that? subprocesses are not cheap


I'm sure someone else reading can answer your question based on more experience and expertise than me... anyhow the subprocesses are re-used so there's no overhead in their creation. And if your server is doing I/O bound work (or is just waiting a lot) then it should perform ok. If everything is CPU bound, then you'd want to not create more subprocesses than cores. SilkJS lets you define the number of subprocesses you want.

More info:

http://silkjs.net/sync-vs-async/


Thanks for the additional info -- though I would argue that it starts to sound like apache+fcgi (not that it's bad), or unicorn+wsgi... I think node pushes async to an extreme that makes it truly different from what's out there. Unfortunately, it results in leaky abstractions, and lots of misguided enthusiasm as the post goes into...

You could think of pure functional programming (haskell/ml) as a similar extreme -- Haskell gets a lot of concurrency/parallelism guarantees for free just because of how the language was built (pure functions,etc)... Is it wrong to think that node took a similarly crazy step, but in their case, they weren't able to rebuild the language, and that's what you get?

If I wanted to event-loop, or lightweight thread stuff in python, I'd need some libraries, and many of the ones that you might find are not very easy to learn.


> subprocesses are not cheap

This statement depends heavily on the runtime environment. Consider either Erlang or fibers as implemented in C/C++, for example.


We usually only refer to OS-level processes as "subprocesses". You'll see terms like "lightweight threads", "greenlets", "fibers" for concurrency units managed by the application runtime instead.


Exactly what I meant -- thanks... Are you the erjiang that wrote the blog post? If so, thanks, I thoroughly enjoyed it


I don't know about silk but forking subprocess model usually involves caching the subprocess. So it's going to be as light weight as any other cgi thing.


I use ToffeeScript. It's a nice alternative to nested callbacks that looks like normal synchronous code.


Node is popular because it is fairly simple to learn and use. That really is the main reason why I have seen most people pick it up.


I assume you read what I wrote and I wasn't clear enough. Node is actually quite DIFFICULT to learn and use. It's very very easy to write code that's broken in some way if you don't have a deep understanding of Node.js.

Fun unrelated anecdote: someone was showing off their concurrent Node.js web scraper and as requests piled up, it was apparent that it could handle no more than 5 concurrent requests. Why? The HTTP library's internal concurrency limit defaulted to 5.


Node is popular because it feels light and easy to use.

    http.createServer(...a simple function...)
Holy crap! I just created an http server and now I can respond to events with just a few lines of code!

Once you get deep enough into it to realize that it creates a tendency towards complex, deeply nested code and that making a good, maintainable application is a non-trivial task... it's far too late. Because of the 'fast' feel of it, you're convinced that these are just things you have to figure out and learn best practices for - it couldn't be that the 'fast' feel was deceptive to begin with.


How many concurrent requests can the six-line HTTP server on Node's front page serve? (I don't know the answer. I've taken a quick stab a few times on a few platforms, but I keep running into local TCP port exhaustion, leading to multi-second latency bubbles, long before I run into any Node.js or physical limit.)


Not sure if you changed the context from HTTP clients to HTTP servers intentionally, but it's true that the default of five connections is an outgoing connections restriction, not related to the HTTP server.

This client restriction is deeply irritating. A project I was involved with is about creating a general purpose hypermedia API, and building up documents involved following the links in parent documents back through the API, which meant that a single request can end up requiring hundreds of further calls to satisfy, which seemed like no problem at first with node.js' easy concurrency (and an auto-scaling fleet of EC2 servers), but the client connection limit turned out to be the cause of a lot of timeouts and prompted a lot of working around.


The client "limit" is just a tunable with a bad default value. You can tune this up and completely eliminate that limit. (The failure mode is pretty bad, but this comes from a lack of observability in abstraction design that pervades most code.) The server example, on the other hand, demonstrates the point that the default way to write a "hello world" web server actually scales very well.
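
As a sketch of the tuning described above: Node's `http.Agent` has a `maxSockets` setting that caps concurrent sockets per host for outgoing requests. The value 50 and the `optionsFor` helper are illustrative assumptions, not recommendations; the right number depends on the workload and on what the remote side tolerates. Later Node releases also raised the default.

```javascript
var http = require('http');

// A dedicated agent with a higher per-host socket cap.
// 50 is an arbitrary example value.
var agent = new http.Agent({ maxSockets: 50 });

// Hypothetical helper: builds request options that route through
// the dedicated agent instead of the shared default one.
function optionsFor(host, path) {
  return { host: host, path: path, agent: agent };
}

// Alternatively, raise the cap on the shared default agent, which
// affects every request that does not specify its own agent.
http.globalAgent.maxSockets = 50;
```

Passing the result of `optionsFor(...)` to `http.get` or `http.request` would then use the larger pool for that call only, which keeps the change local.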


Node.js along with other callback and promise based frameworks are insidious. They are so because they look good in small examples. "Oh look, here is how to serve a file in Node.js". You just have one callback for the request, then another for the error and so on.

In larger cases it turns into a mess, but by that time there is already a buy-in. Run a short little demo, looks fast, read about how "cool it is" on the web some place. Ok let's rewrite everything in it.

Quite often, once people have started heading down the wrong path, it becomes hard to look back and say "Yeah this is the wrong way, let's turn back". That is just basic psychology.

At the same time, I think Node.js is great. It is great as a differentiator when you ask people why they picked it and see what they say. If they come back with the list in the article (basically list "evented","async","webscale" jargon), then that tells me they don't quite understand what is going on underneath. Which is not bad, people have the right to be confused, but it just lets me know at what level to converse with them.


It's simple to use when one has an understanding of networking principles. I wouldn't say it's as simple to use as "PHP", where basically as long as you can download WAMP, write echo and use FTP you're good to go.

It's definitely not simple to write webapps in nodejs, where a simple exception can take down your entire node server.

I'd even say Go's concurrency model is simpler and easier to use than nodejs.
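
On the point about exceptions: an uncaught throw inside a request handler propagates to the top of the process, which is why one bad request can stop a single-process server. One partial mitigation, sketched here with a hypothetical `guard` wrapper, is to catch synchronous throws and turn them into a 500 response. This is an assumption-level illustration, and it does not catch errors thrown later inside asynchronous callbacks.

```javascript
// Hypothetical wrapper: converts a synchronous throw inside a request
// handler into a 500 response instead of letting it escape.
// Errors thrown later, inside async callbacks, are NOT caught here.
function guard(handler) {
  return function (req, res) {
    try {
      handler(req, res);
    } catch (err) {
      res.statusCode = 500;
      res.end('Internal Server Error\n');
    }
  };
}
```

It would be used as `http.createServer(guard(myHandler))`, where `myHandler` is whatever request function the application already has.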


We learnt that Node wasn't cancer, but it also wasn't a good technology either. Still, it was useful in keeping the clueless muppets away, lest they fuck up other tech ecosystems.


These articles always seem to boil down to only one thing- "I'm bad at callbacks".

I mean, really, is there any other criticism here? There is a valid point about the true nature of Walmart and Paypal's rewrites, but that's pretty tangential.

There's kind of a thing about some completely rookie mistakes in error handling, but that's really not relevant to anything, and again, ties into the "I'm bad at callbacks" thing.

I'd really be interested in some criticisms of node that aren't variations on stories of the author's past difficulties with callbacks, as most people who use node have no problems with them.

EDIT: Downvotes? Please, point out to me the part of the article that is not about callback troubles.



