Hacker News
Ask HN: Is NodeJS stable enough to build the next Twitter?
52 points by SatyajitSarangi on May 25, 2013 | 56 comments
I've already written all of my APIs and the entire backend using NodeJS and Postgres. My biggest fear is what happens if it breaks down. As my startup is something akin to, say, Twitter (not really, but I thought that's a better example when it comes to stability and a good API), I'm worried about whether Node is good enough to build the next Twitter.

I'm well versed in Python, and if things go balls up, I can rewrite the entire backend in a weekend in Python. But having used Python before in many other projects and been rather pissed with its overhead of realtime operations, I picked Node as the way to go.

But am I right? If things have gone wrong for people here using NodeJS, where did it happen? Which parts should I be most careful with when architecting the project?


Thanks for the answers so far. To clarify, no, I'm not building the next Twitter. I took it as an example to explain a worst-case scenario (or best-case scenario, depending on how you look at it). Recently Jeff Atwood wrote about why he chose Ruby, and he made a valid point that it isn't the cool language anymore. It has matured, is stable, and all of that. Considering that I haven't been using Node since it came into existence, I can't comment on its maturity.

Most Node examples available on the web try to cram everything into one file. So, I wanted to know the best practices when it comes to building an architecture in Node.

We've built the control plane to Rackspace Cloud Monitoring in Node.js, and overall the experience has been positive.

A few things to look out for:

1. Error handling. In node you can't just wrap your code in a try/catch. And even if you register a listener for uncaught exceptions, you almost certainly have state shared between requests (for example, a database connection pool), which makes trying to handle such exceptions risky. To use node effectively, you need to be very careful to prevent exceptions from being thrown in unexpected locations.

2. Code rot. It is a lot less obvious what is idiomatic in JavaScript as compared to Python, etc. It's easy to end up with a wide range of styles and patterns, which makes maintenance difficult.

3. Missed or double callbacks. These are interesting mostly because they are not something you would see in synchronous code, and they can be quite difficult to troubleshoot (especially double callbacks).
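For point 3, one common defensive trick is to wrap callbacks so that a second invocation gets reported instead of silently running the continuation twice. The following is only a minimal sketch: `onceOnly`, `onDouble` and the deliberately buggy `parseJson` are hypothetical names made up for illustration, not something from our codebase or from any library.

```javascript
// Wraps an error-first callback so that a second invocation is detected
// and reported rather than running the continuation twice.
// `onceOnly` and `onDouble` are illustrative names, not a library API.
function onceOnly(callback, onDouble) {
  let called = false;
  return function guarded(...args) {
    if (called) {
      if (onDouble) onDouble(new Error('callback invoked more than once'));
      return undefined;
    }
    called = true;
    return callback.apply(this, args);
  };
}

// A deliberately buggy function: the missing `return` in the catch branch
// means the callback fires twice for invalid input.
function parseJson(text, callback) {
  let value;
  try {
    value = JSON.parse(text);
  } catch (err) {
    callback(err); // bug: should be `return callback(err);`
  }
  callback(null, value);
}

const results = [];
const doubleCalls = [];
parseJson('{not json', onceOnly(
  (err, value) => results.push({ err, value }),
  (err) => doubleCalls.push(err)
));
console.log('callback runs:', results.length, 'double calls caught:', doubleCalls.length);
```

A guard like this does not fix the bug, it only makes it visible, which is usually the hard part with double callbacks.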

Mitigating these issues is as much a cultural challenge as it is technical. Lint everything, review code aggressively, and don't merge code that doesn't have tests. Choose libraries carefully (the ecosystem has come a _long_ way in the last few years).

All of that being said, these are things you should be doing anyway. Develop a good tech culture, but get your product out and grow your user base. If you become the next Twitter you'll have the resources to undo any mistakes you make now.

We are using Node at https://starthq.com and I agree with most of the points above.

Error handling is the major issue, because you need to handle all errors manually, i.e. you can't use try catch to trap all errors further down the stack. If you don't handle all errors your Node process will terminate and you may lose some state. Even if you're confident in the stability of your code, I strongly advise that you use a watchdog process like supervisor to start a new process if the current one terminates.
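To make the watchdog idea concrete, here is a rough sketch of what such a process does. It says nothing about how supervisor itself is implemented; `supervise`, `maxRestarts`, `onDone` and `server.js` are hypothetical names, and only `child_process.spawn` is a real Node API.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: runs `command` and starts it again whenever it exits
// with a non-zero code, up to `maxRestarts` times. When the child exits
// cleanly or the restart budget is used up, `onDone(code, restarts)` is called.
// Spawn failures (the child's 'error' event) are not handled in this sketch.
function supervise(command, args, maxRestarts, onDone) {
  let restarts = 0;
  function start() {
    const child = spawn(command, args, { stdio: 'inherit' });
    child.on('exit', (code, signal) => {
      if (code === 0 || restarts >= maxRestarts) {
        onDone(code, restarts);
        return;
      }
      restarts += 1;
      console.error(`child exited (code=${code}, signal=${signal}), restart #${restarts}`);
      start();
    });
  }
  start();
}

// Example wiring (not executed here):
// supervise('node', ['server.js'], 5, (code) => process.exit(code || 0));
```

A real watchdog would also want some backoff between restarts so that a process that crashes on startup does not spin in a tight loop.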

We've handled this issue and kept business logic code simple by using https://github.com/olegp/common-node which uses fibers to present synchronous APIs, allowing us to use exceptions for error handling.

Be very careful when choosing third party packages, since if they don't handle all errors, again your process will terminate and there's nothing you're able to do about it, even if you're using fibers.

One last issue is changes to the core APIs. Since some of them are still in flux, it is advisable to provide an abstraction layer above them so as to be able to weather any changes. For example when streams2 came out, we only needed to upgrade Common Node, with no changes to the application itself.

Have you tried a Promises implementation like deferred https://github.com/medikoo/deferred to manage callback spaghetti?

Or q? And wouldn't using promises help handle exceptions?
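To illustrate the question, here is a small sketch of how a promise chain funnels thrown exceptions into a single handler. It assumes a runtime with a built-in `Promise` (which Node did not ship at the time of this thread; deferred or q would play that role), and `parseConfig` and `loadPort` are made-up names.

```javascript
// Hypothetical function that returns a promise instead of taking a callback.
function parseConfig(text) {
  return new Promise((resolve) => {
    // A throw inside the executor turns into a rejection of the promise.
    resolve(JSON.parse(text));
  });
}

function loadPort(text) {
  return parseConfig(text)
    // Throws a TypeError if `server` is missing; that also becomes a rejection.
    .then((config) => config.server.port)
    // One handler covers the parse error and the TypeError alike.
    .catch((err) => ({ failed: true, reason: err.name }));
}

loadPort('{"server":{"port":8080}}').then((port) => console.log('port:', port));
loadPort('{oops').then((result) => console.log('handled:', result.reason));
```

This only covers exceptions thrown inside the chain. An exception thrown from an unrelated callback elsewhere in the process still ends up as an uncaught exception.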

It's been great at http://geekli.st and we run a full node stack. It is true error handling is an issue sometimes, but that's overshadowed by the quality of engineers looking to hack in node. Most common reason I hear for someone leaving a company... "I wanted the opportunity to hack in node at work" - not once, twice but dozens of times. In an age where finding top notch teams to build your stack is really hard, what code base you use is really important to attracting top talent.

Thanks! In fact, I knew that Rackspace uses Node and thus am incredibly happy that I got an answer from you folks. Any particular architecture you guys followed in building your app?

I'm doing as many things right as possible, but I would rather take as much advice as I can get than be sorry later. Node, like Python, isn't really forgiving.

Node.js itself is fine.

The real problem with node.js is the libraries. Just don't use them.

A huge portion of existing libraries are full of hidden bugs, shortcomings, race conditions, unhandled edge cases, and security issues, or are unscalable or unmaintainable (and unmaintained).

This is exacerbated by the fact that npm makes it really easy to publish a library.

Many small buggy libraries.

Core modules are too low level (e.g. http), and you really don't want to use an overlay library.

Not to mention that doing something not trivial fully asynchronously is not as fun as it sounds. You will spend a significant time tracking bugs, fixing edge cases, and making your code stable.

There is still no way to make async code better in core (no promises); and there are a handful of incompatible promises implementations.

Oh, and node.js is not really fine actually. It's not doing everything using asynchronous I/O as you would expect. Node.js uses a thread pool for things like DNS resolution and disk I/O. Only 4 threads by default, for all those things with very different latencies. This means that 4 DNS queries can occupy node.js's 4 hidden worker threads, and block your disk I/O for minutes.

It isn't hard to find out which libraries are maintained and which ones aren't. Maintained libraries are of fairly good quality. This isn't very different from other languages, except that there is significantly more activity around node.

Promises not being in core is a good thing. Eventually many of those use cases will switch to using ES6 generators.

If you want to scale node, you would use multiple processes.
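For reference, the usual way to run multiple processes is the core cluster module. Below is a minimal sketch of a master that forks workers and replaces any that die. `superviseWorkers` and its `clusterApi` parameter are hypothetical names (the parameter exists only so the policy can be tried with a stub); `cluster.fork()` and the `'exit'` event are the real API.

```javascript
const cluster = require('cluster');

// Forks `count` workers and forks a replacement whenever one exits.
// `clusterApi` defaults to the real cluster module.
function superviseWorkers(count, clusterApi = cluster) {
  for (let i = 0; i < count; i += 1) {
    clusterApi.fork();
  }
  clusterApi.on('exit', (worker, code, signal) => {
    console.error(`worker exited (code=${code}, signal=${signal}); forking a replacement`);
    clusterApi.fork();
  });
}

// Typical wiring (not executed here; `handler` is a placeholder):
// if (cluster.isPrimary) {            // `cluster.isMaster` on older versions
//   superviseWorkers(require('os').cpus().length);
// } else {
//   require('http').createServer(handler).listen(8000);
// }
```

Each worker is a separate process with its own event loop and thread pool, so in-memory state is not shared between them.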

Your DNS example is a corner case. There are discussions around it, and such issues impact all frameworks.

As for security issues, unscalable, unmaintainable etc, those are too generic in nature to comment. I can say this though; node is in production at some of the largest companies in the world and they are talking about it too.

> Your DNS example is a corner case

One of the many corner cases that will kill your application or open it to DoS (malicious or not).

I.e. you can DoS any nodejs application if

    * you can trigger it into making 4 DNS queries
    * and it does disk I/O (or uses any other core module that uses the thread pool)

> There are discussions around it

I've seen tickets on this that have been open for more than a year, without anything showing a willingness to improve it. Version 0.9 even removed the possibility to increase the number of threads (which they re-added in 0.10).
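For anyone who wants to try the workaround, here is a sketch of the two usual mitigations, assuming a Node version where the pool size is configurable through the `UV_THREADPOOL_SIZE` environment variable. The value 16 is an arbitrary example and `resolveWithoutPool` is a made-up name.

```javascript
// The pool size is read when the thread pool is first used, so this has to
// run before any pooled work (fs, dns.lookup, ...). Setting it in the shell
// (UV_THREADPOOL_SIZE=16 node app.js) is the more robust option.
process.env.UV_THREADPOOL_SIZE = process.env.UV_THREADPOOL_SIZE || '16';

const dns = require('dns');

// dns.lookup() goes through getaddrinfo on the thread pool, while
// dns.resolve4() queries DNS servers over the network without occupying
// a pool thread. Trade-off: dns.resolve4() does not consult /etc/hosts.
function resolveWithoutPool(hostname, callback) {
  dns.resolve4(hostname, callback);
}

// Example (not executed here):
// resolveWithoutPool('example.com', (err, addresses) => console.log(err || addresses));
```

Neither of these removes the underlying issue; a bigger pool only makes it take more slow lookups to starve disk I/O.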

> such issues impact all frameworks

When you start using node, you don't expect that your bottleneck is a thread pool.

In non-async frameworks you know you'll have this kind of problem, you can design around it, and a DNS query in some module can't block I/O for the whole application.

> If you want to scale node, you would use multiple processes.

By "unscalable" I meant libraries using O(n) or O(n^2) algos, with 'n' the number of users or the size of your data, where it would have been easy to do it in O(log) or O(1).
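A toy illustration of the kind of thing I mean, with made-up data and function names: a per-request lookup that scans every user versus one that uses an index built once.

```javascript
const users = [
  { id: 'u1', name: 'ada' },
  { id: 'u2', name: 'grace' },
  { id: 'u3', name: 'linus' },
];

// O(n) per lookup: scans the array on every call.
function findUserSlow(id) {
  return users.find((user) => user.id === id);
}

// O(1) on average per lookup: build the index once, then use Map#get.
const usersById = new Map(users.map((user) => [user.id, user]));
function findUserFast(id) {
  return usersById.get(id);
}

console.log(findUserSlow('u2') === findUserFast('u2'));
```

With three users nobody notices the difference; with 'n' the number of users of a popular service, the first version is the one that falls over.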

> Promises not being in core is a good thing

Why?

> Eventually many of those use cases will switch to using ES6 generators

I hope it will improve, but we are discussing the current state of nodejs.

Premature optimisation is the root of all evil. If you're building something that will be as big as Twitter, the programming language will be the least of your problems. Figuring out how to scale all of the connected system horizontally, independently, will be more important than whether you chose Node or C# or Java or Python.

It does make a difference, and choosing good tools and architecture early can make a hell of a difference because today's code becomes tomorrow's legacy code. It's also very rare for a complete rewrite to happen, especially into a new language/framework. Look at Facebook; their codebase is still in PHP even if it's transcompiled to C++.

This also isn't a case of premature optimization at all, this is just about making good choices that will persist with the project for a long time. Premature optimization used to mean unnecessarily writing assembly or borderline obfuscated C in the name of performance, which led to programs being difficult to comprehend, hence it being the root of all evil. Today this has been perverted to mean "hey, buddy, if you think about performance you're optimizing prematurely!"

Let's say you're automating squirrel cloning. The quickest anyone can do this at the moment is 24 hours. Your solution takes 12 hours using SquirrelClonr. Somebody in the pub reckons he could produce a Squirrel Clone-o-matic which does it in 1 hour, but it would take 6 months to build.

Do you launch today with SquirrelClonr, or delay launch by 6 months to switch to the Squirrel Clone-o-matic? Since it's possible to increase the speed by 12 times, surely you should switch to the Clone-o-matic, right?

It depends on your customers. If you can be profitable by selling squirrels cloned in 12 hours, why wait when waiting could allow a competitor to monopolise the squirrel-cloning market? Plus, if you find out that nobody wants cloned squirrels, however long they take to produce, you've saved six months.

So premature optimisation (in this sense) means "thinking about performance of what you're trying to build before you've even established whether or not you should be building it". If you have no customers, it doesn't matter how quickly your code runs.

Premature optimization doesn't mean anything of the kind.

The difference is it doesn't take much more to write something with Node than it does Python or anything else, and thinking about good architecture (particularly database architecture) is prudent. I said it before, but it's worth repeating: today's code becomes tomorrow's legacy code, and certain early architectural decisions can make a hell of a difference later.

Your argument is like saying you're going to build a house but don't know if anyone will want to live in it, so instead of doing it properly you're going to use anything to build it regardless of how suitable, and regardless of how well the construction will stand up against the weather. If someone chooses to live in it, well, crumbling is a great problem to have because you got there first. If you're going to say this analogy is stretched (I'd disagree), think about it in terms of security. Writing decently secure software takes a little bit more mental overhead. Is it worth it to write a secure application from the ground up in spite of not knowing whether users will adopt the service? (The answer is always yes.)

Also, first mover advantage is a myth. There are plenty of examples of the first, or early mover, being toppled by someone who came along later.

Premature optimization means creating code that's difficult to comprehend without justification in the name of performance, not thinking about which language/framework/vm is going to yield a decent performance profile overall.

> Your argument is like saying you're going to build a house but don't know if anyone will want to live in it

The first houses in a complex are usually sold before being built, on the basis of a prototype - be it blueprints, 3D walkthrough, or a show-home. If they can't sell those houses, they don't build the rest. It's the cheapest way of establishing whether there's a market for the properties they intend to sell, and it's much cheaper than finding out there's no market having built 50 of them.

> "today's code becomes tomorrow's legacy code"

Worry about it tomorrow. It's a nice problem to have as it means you're still in business. Most businesses started today won't be.

Until you've established that your product is going to have customers, the performance is irrelevant.

My favourite example is the fake 'Buy now' button. You don't need to build the payment process until you know people want to pay, and you definitely don't need to worry about how quickly it runs.

If people want it, they'll pay even if it's slow. If people don't want it, why are you building it? Nobody buys a product just because it's quick (unless performance is the central feature.)

Houses are sometimes built like that, but I've seen indie builders just build the house, rent it out, and then sell it. There are multiple ways to build and sell a house.

Writing code that's maintainable, secure, and performs well isn't all that difficult, and it's certainly not some horrible burden that will eat up all your time at the expense of customer acquisition. There's no excuse for sloppy work, and yes, customers can jump ship if they think your product is substandard, and response time adds to that feel.

You're also not going to be less busy in the future and having a codebase that's difficult to maintain is going to put excessive pressure on you or your team. Sloppy code is harder and more expensive to debug than well written code in all respects. You're also more likely to have major problems (performance, security, grim bugs) that could have been avoided simply by thinking a little up front. It's not all that much more expensive to write decent code. It's also not appreciably more time consuming to write a decent NodeJS app than it is to write a Rails or Django app.

I strongly suspect you're conflating over-engineering with making wise decisions that don't require a huge time burden and make life easier in the long run.

Exactly. Your biggest challenge today is getting some users. Build a product that allows you to do that, and then worry about whether it scales.

It works for us, but it comes with a lot of issues, some of which are pretty major, although this holds more true in the ecosystem than in core. The problem I see is that either no one seems to know how to solve them (it's a callback, no it's node, no it's v8), or it takes so much time, or people just plain don't care (socket.io, I'm looking at you), so you're on your own if you encounter something critical.

In either case: start getting traction, don't over-optimize at the start, and watch out for memory leaks. There are tiny bits of best practice that you must follow (always consume the response? check. close the request appropriately? check. don't crash the whole node process? check.), some of which are not really well documented, and which can ruin your day if they are not implemented correctly when you get some important press.
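As an example of the checklist above, here is a sketch of an outgoing HTTP request that drains the response, handles request errors, and cannot call back twice. `fetchStatus` is a made-up helper and the 5 second timeout is an arbitrary value; the `http` calls are from the core module of a reasonably recent Node.

```javascript
const http = require('http');

// Fetches only the status code of `url`.
function fetchStatus(url, callback) {
  let done = false;
  const finish = (err, status) => {
    if (done) return; // never call back twice
    done = true;
    callback(err, status);
  };
  const req = http.get(url, (res) => {
    // Always consume the response, even when the body is not needed,
    // so the underlying connection is released.
    res.resume();
    res.on('end', () => finish(null, res.statusCode));
    res.on('error', finish);
  });
  // Without an 'error' listener, a refused connection would be thrown
  // and take down the whole process.
  req.on('error', finish);
  // Close the request if the other side stalls.
  req.setTimeout(5000, () => req.destroy(new Error('request timed out')));
}

// Example (not executed here):
// fetchStatus('http://localhost:8000/health', (err, status) => console.log(err || status));
```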

Take a look at the issue tracker of the libraries you are going to use, check if there's something that can affect you and perhaps contribute back!

I run http://jsonip.com. It's a single node process running on a VPS. It supports more than 10 million requests a day and barely stresses the system. Granted, it's a relatively simple app, but it's raw node.js. I can easily scale it in a few simple ways, like adding a caching layer and/or load balancing across a few extra servers. Haven't needed to yet.

I doubt there's a language out there that couldn't do 100qps of this complexity on a single modern core. It doesn't really say anything positive about Node.

Why do people always take a figure like this and assume a near linear distribution?

I do agree with the sentiment of your point though, but I'd be interested to know what the parent handles at peak.

Ha, I love how /about is also represented as JSON. Also, good call adding the 'Pro' field. Are you getting any traction there?

Mozilla uses node.js for several parts of their architecture: https://hacks.mozilla.org/category/a-node-js-holiday-season/ (recommended read).

It should be stable enough for building a large scale application but maybe not easily for an app as big as Twitter (we are talking about a shit-ton of requests per second). You are not there yet though, are you? :)

I'm not sure how you're defining 'stable' here, but I'll comment on another aspect of your concern.

Twitter started with Rails, and at some point decided it was more efficient to do an incremental rewrite on the JVM.

> I can rewrite the entire backend in a weekend in Python.

I'd bet it took Twitter a bit longer to replace their infrastructure, and they survived just fine.

Build in whatever is rapid, for you. At this very moment. Using your resources.

Keep things service-oriented, decoupled. It'll be easy to replace things one component at a time, if needed.

in summary..

I think it's a perfectly cromulent platform to build on in terms of speed and scalability. But, in case you and I are wrong, follow my advice about staying decoupled and it won't hurt as much.

> cromulent

I had to look that word up on urbandictionary, wiktionary, and tvtropes (apparently a Simpsons reference).

From what I gather, if that's the word you meant, then it doesn't seem that you believe Node.js is a viable platform. By default, of course, I'll assume the expression was just beyond me!


sorry, I didn't intend it to be obscure. :)

it started from a Simpsons reference long ago (17 years, wow) but over time has fallen into (infrequent) usage to mean[1] acceptable. :)

[1]: http://dictionary.reference.com/browse/cromulent

I think by 'cromulent' he means acceptable

Yes, it is stable enough. BUT, that assumes you know how to write code that will work under various conditions that can and do arise.

The main thing that can cause shit to hit the fan is not properly handling errors. I highly suggest using domains, and that when an error occurs, if you can, that you gracefully exit. If not, then all other requests will just abort and that isn't very user friendly.
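A rough sketch of the "gracefully exit" part, written against `process.on('uncaughtException')` rather than domains so that it stays short. `makeFatalHandler`, `timeoutMs` and the injectable `exit` are hypothetical names; the idea is just to stop accepting connections, let in-flight requests finish, then exit and let a supervisor start a fresh process.

```javascript
// Builds a handler for fatal errors. `server` is expected to look like an
// http.Server (it needs a close(callback) method). `exit` is injectable
// only so the sketch can be exercised without killing the process.
function makeFatalHandler(server, { timeoutMs = 10000, exit = process.exit } = {}) {
  let shuttingDown = false;
  return function onFatal(err) {
    if (shuttingDown) return; // a second error during shutdown is ignored
    shuttingDown = true;
    console.error('fatal error, shutting down:', (err && err.stack) || err);
    // Force the exit if open connections keep close() from completing.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref();
    server.close(() => {
      clearTimeout(timer);
      exit(1);
    });
  };
}

// Typical wiring (not executed here; assumes `server` is an http.Server):
// process.on('uncaughtException', makeFatalHandler(server));
```

Exiting with a non-zero code is deliberate: after an unexpected exception the process state cannot be trusted, so a restart is safer than carrying on.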

You will also want a way to be notified of errors, so you can stay on top of them and fix them right away. I use winston and have the error level set to email me.

If you want to talk in more detail, contact me... address is on my profile.

Domains are 0.10 specific I believe, and 0.10 is still buggy in a few areas. I'd stay on 0.8.23 for quite some time since we are talking about stability.

I'm using 0.10 in production and haven't had any bugs... what bugs are you referring to?

Also domains are in 0.8

Is yours a large or small app? Do you use streaming/piping/remote services?

Regarding bugs, there are too many to list, just see the tracker and spend some time browsing through them https://github.com/joyent/node/issues?direction=desc&pag... I'm fairly sure you can find pretty much anything.

I'm considering removing all the npm modules from my project and go raw, since most of the obscure bugs are in the ecosystem anyway.

Somehow I missed domains are also in 0.8, thanks!

It is large, but doesn't use any streaming or piping. As for remote services, just a bunch of remote APIs, and they work fine.

With the changes to streams in 0.10, I can understand it having some issues. That slipped my mind... but I don't do much streaming so I just assumed it worked.

Whatever you write now would never scale to be Twitter as it is now, there's no point even considering that as you aren't thinking on anything like that scale conceptually. I doubt most developers can. But then writing something that could scale to Twitter scale when you don't have a business or users or revenue would be pointless.

Specifically to answer your question, no. Node could work as a thin publishing veneer on a much larger stack but you just don't get what you need from Node.js end-to-end.

> Node could work as a thin publishing veneer on a much larger stack but you just don't get what you need from Node.js end-to-end.

I'd be interested in you expanding on that claim. I'm indifferent towards node, but this is my area of concern. I don't see anything inherent to node/v8/js that would be limiting.

There are definitely going to be things that Node is slower for, and others that it excels at, just like everything else. I would recommend that anyone wondering about this check out the great presentation given by Ryan Dahl, the creator of Node, in which he talks about the concurrency model, how it is achieved, and what the consequences of this are. The short story is that Node performs well under high concurrency and an IO-bound workload with lots of small files. It is not going to excel so much at computationally bound work or at serving huge files. It's always going to be about knowing which is the best technology for the job at hand.


> I don't see anything inherent to node/v8/js that would be limiting.

The concurrency model.

If you have enough traction that this is a problem, you can get Ryan Dahl / Joyent / et al to help you.

Node's pretty well tested by now for heavy traffic. It's not my personal cup of tea but I imagine that the obvious bugs and performance degradations have all been squished.

Yes. Voxer uses it at scale, they are doing realtime voice over HTTP

Also lots of other important companies use it behind the scenes.

If you are not handling every possible error correctly, then you might end up with errors taking down your app with no idea how they were initially caused, so your app can suffer significant downtime without you knowing where to start in terms of fixing it.

If things go wrong in python, you'll probably have an easier time identifying it and fixing it due to actual helpful stack traces.

You should be very, very careful when architecting your project and make sure you understand error handling to a T.

Stable enough? Sure, for a given definition of "enough".

The advantages to using Python instead of NodeJS are going to be less about stability and more about maintainability and ecosystem.

We're a full node stack at http://geeklist.com. One thing not being mentioned that I'd like to share is the extremely supportive and extraordinarily brilliant community of node.js enthusiasts. This has an intrinsic value that we could never replicate with any other code base. Around the world thousands upon thousands are excited to learn and hack in node. Guys like @mikeal @izs @indutny @dscape @dshaw @substack and so many other great developers in the node community jump in to help everyone else all the time. (Just try reaching out to any of them on Twitter and you'll be amazed by the support.) Yes we did finally just move up to but running 0.10 caused some hiccups we just don't have the bandwidth to attend to right now so we're waiting just a tad longer. In sum, node is great and you'll find developers absolutely love hacking in it, which means they are enjoying working on your project/business... which is priceless.

Node is sitting at the core of the new Viki platform. It's been a pretty flawless part of the stack. We do zero-downtime deploys thanks to the cluster module, which also keeps workers running (I haven't seen an unhandled error take down a worker in a while, but as with most any stack you need to be pretty vigilant about your error handling). At over 3K API requests per second to a single box, we hover at around 0.05 load (our machines range from e3-1230 to dual 2620s, so it's hard to get an exact number). When asked to handle more requests, the impact on load is pretty linear.

We're also dealing with servers in 4 different locations and some requests need to be proxied to a central location. With a ~300ms round trip from Singapore to Washington, Node's asynchronous nature has been a win when it comes to handling concurrency in the face of backend latency.

This sounds like a Maserati problem http://www.urbandictionary.com/define.php?term=Maserati%20Pr.... To answer your question though, many big companies now use Node.js and it is stable enough.

I think that most people that have lots of experience using Node would say that it is stable enough for the most part (that's what I tend to read anyway, big grain of salt with that). Cross-platform compatibility is much, much better than it was in the distant past. It will be the idea and not the technology you choose that will determine whether or not it becomes the 'next Twitter', as long as it is developed soundly. It shouldn't really matter. If you did end up scaling up before the final kinks are worked out of Node.JS and commonly used libraries, then like you say: Just write it in Python over the weekend and see if you can do better.

I think the only problem I have with any of this, is the next Twitter part. If that's what you're thinking, planning around, projecting scaling issues based on, I'd argue you have already ensured failure.

I can't really speak for Twitter but we use node.js happily at Soundkeep. It works great for our use case of streaming audio processing. It has come a long way since it started. Obviously it isn't as mature as Ruby/RoR or Python/Django but not nearly as bloated either. The comparison is weak though since it isn't a framework but more of an environment. A lot of people like to think of it in comparison to the big frameworks though. Node.js is more about smaller libs and modules than any single framework.

I'm in the streaming field too; how can you say it's working great if you haven't launched yet to the public?

Our beta is actually coming this weekend, our alpha is live currently though. http://alpha.soundkeep.com. It works great for us because we can keep server memory low since we can process and transcode audio as it passes through our server without having to buffer the entire file at once.
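For readers who haven't used streams, this is roughly the shape of it. The sketch below is not our transcoder: `createUppercase` is a trivial stand-in that upper-cases ASCII text chunk by chunk, and `processFile` is a made-up helper. The point is that each chunk is handled as it passes through, so memory use stays flat regardless of file size.

```javascript
const fs = require('fs');
const { Transform, pipeline } = require('stream');

// Stand-in for a real transcoder: upper-cases each chunk as it arrives.
// Assumes ASCII text; a multi-byte character split across two chunks
// would need a proper decoder.
function createUppercase() {
  return new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    },
  });
}

// Streams a file through the transform without buffering it whole.
// pipeline() forwards errors from any stage to `callback` and cleans up
// the other streams.
function processFile(sourcePath, targetPath, callback) {
  pipeline(
    fs.createReadStream(sourcePath),
    createUppercase(),
    fs.createWriteStream(targetPath),
    callback
  );
}

// Example (not executed here; the paths are placeholders):
// processFile('in.txt', 'out.txt', (err) => console.log(err || 'done'));
```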

Why'd anyone want to build something like Twitter again?

He said "the next Twitter" to mean the next big thing with possibly tons of requests per second. It would not necessarily behave like Twitter or share any features with it.

Why does everybody here seem to stick to words in such a "nerdy" way? :) Use common sense sometimes, guys.

I am also working on a similar project using node.js and redis. Please make sure that you are using a perfectly fitting data model. I bet most of the load will come from the database and not from the application servers. We went for an in-memory solution for all timelines and graphs.

Twitter was built on Rails when Rails was much younger and less stable. I think you'll be fine with Node.

It is the people, your developers, who make a platform stable, not technology X.

How you manage the database sharding could be more important than if you use nodejs, C, php, cgi + perl...

AirBNB does a lot of node.

Much of their stack is node.

Like others have said, 'too many users' is a first world problem that many would like to have.

Why not use gevent or Tornado? Wouldn't that eliminate all your real time concerns when it comes to Python?

Or twisted.

Sure. So is PHP. Yes I'm making a somewhat sarcastic point. Yes, I'm also quite serious.
