

How LinkedIn used Node.js and HTML5 to build a better, faster app - jc123
http://venturebeat.com/2011/08/16/linkedin-node/

======
cygwin98
_One reason was scale... The improvements the team saw were staggering. They
went from running 15 servers with 15 instances (virtual servers) on each
physical machine, to just four instances that can handle double the traffic.
The capacity estimate is based on load testing the team has done._

Though Rails is not known for high performance, this sentence raises a
warning flag in my mind: something must be terribly wrong with their Rails
implementation.

~~~
aneth
I'm a 5+ year Rails developer, and this doesn't actually surprise me. Ruby
and Rails are just incredibly slow in my experience unless you cache
_everything_ , which can greatly increase complexity. Right now I'm struggling
with the creation of 2000 active record objects taking 10 seconds, which is
absolutely ridiculous. And that's down from 30 seconds - which was happening
because require was being called for each of those object instantiations by a
third party library. So a 20 second gain by removing 2000 calls to require -
on Heroku - something is wrong there.

The JSON that's generated with all that data is converted to backbone objects
instantaneously in a browser, and the few database calls are instantaneous -
it's all Ruby. I will probably need to create custom lightweight objects just
because ruby is so darn slow. Much of the time is spent in gsub resolving
paths for paperclip attachments - work that would take most languages a few
milliseconds can take 10 seconds in ruby. I know there are many ways to
optimize this but I should not have to at this point.

I find with ruby I constantly run into performance bottlenecks and need to
revert to optimization that I would not have to do on other platforms.

Yours is the typical answer from rails fans. But what could be wrong with his
Rails implementation that could be _that_ slow, assuming they are reasonably
competent - a valid assumption I think given their successful port to node?

~~~
cygwin98
_I find with ruby I constantly run into performance bottlenecks and need to
revert to optimization that I would not have to do on other platforms._

Agree. Ruby tends to be more likely to be CPU-bound than other languages. That
demands a more in-depth understanding of Ruby itself. For example, something
as simple as string concatenation with '+' vs '<<' can easily trip up lots of
people ('+' allocates a new string each time, while '<<' appends in place).

 _Yours is the typical answer from rails fans._

I don't quite like the tone of this statement. Although I've been working with
Rails at work for the past five years, I don't consider myself a Rails fanboy.
I consider it a valuable tool that brings high productivity. I am quite
aware of its strengths and weaknesses.

 _But what could be wrong with his Rails implementation that could be that
slow, assuming they are reasonably competent - a valid assumption I think
given their successful port to node?_

We all know that scaling up Rails is doable though not an easy task. To handle
web-scale traffic like LinkedIn's in Rails, you cannot get very far following
the traditional ActiveRecord way. Very likely, you will introduce a
high-performance middleware, written in say Java/C/C#, to prepare the data for
rendering. For the frontend, you have jQuery or whatever JavaScript library
you like. You just leave Rails to handle routing. Some people may question
whether Rails is even necessary in this architecture, but I think Rails can
still make a case in such scenarios, which is basically what GitHub does.

~~~
aneth
I think you answered the question. You scale Rails by slowly replacing it. The
problem with their architecture is that they hadn't done that yet. Node is
handling their web scale traffic as-is.

~~~
cygwin98
Maybe I didn't make myself clear. My point was that their Rails implementation
may not be well written, i.e., not architected properly from the beginning.
You cannot simply throw together a few tables and scaffolds and hope it
scales. That may work for small apps, but definitely not for web-scale apps
like this.

To be fair, Node.js also has its own quirks. Besides the callback maintenance
issues, I also mentioned in another post that Node.js is not as scalable as a
lot of people present it to be.

------
jinushaun
"One reason was scale... The improvements the team saw were staggering. They
went from running 15 servers with 15 instances (virtual servers) on each
physical machine, to just four instances that can handle double the traffic.
The capacity estimate is based on load testing the team has done."

Nice to see an example of Node in the wild. It's so fun and easy to develop in
Node, that it sometimes feels like a toy.

~~~
chrisledet
> It's so fun and easy to develop in Node, that it sometimes feels like a toy.

and using CoffeeScript makes it even easier!

~~~
spjwebster
I'm confused. There was no mention of CoffeeScript in the article. How does
using CoffeeScript make it even easier?

~~~
chrisledet
I was responding to jinushaun's comment on how Node.js is easy to develop on.
CoffeeScript compiles to JavaScript. Give
<http://jashkenas.github.com/coffee-script/> a good read and you'll convert
too!

~~~
grannyg00se
From github: CoffeeScript: number = -42 if opposite

Javascript: if (opposite) number = -42;

How is coffeescript an advantage in this example? The javascript seems more
readable to me as it more closely follows regular english grammar.

CoffeeScript: square = (x) -> x * x

Javascript: square = function(x) { return x * x; }

In this example, again the javascript seems more clear. The function actually
has the word function in the declaration. I like that. Essentially
coffeescript took out the word 'function' and 'return' and replaced curly
braces with other symbols.

~~~
stephth
Inline conditionals on the left and the function keyword? From all the
features that CoffeeScript offers it's kinda odd to focus on those minor
details of taste and habit.

~~~
grannyg00se
But it's one of the first things that is presented on the overview so
naturally it's going to get lots of focus.

~~~
stdbrouw
It's a bloody overview mate, it mentions everything in more-or-less random
order :-)

------
akavlie
"Also, the development time was unusually fast."

That's surprising -- my impression has been that one tradeoff with Node.js vs.
frameworks like Rails & Django is a lot more work to implement functionality
they ship with out of the box -- it works at a much lower level.

It also tends to be slower going for a while as you get accustomed to the non-
procedural approach.

~~~
rino_jose
I'm one of the LinkedIn engineers who rewrote the mobile server in node. One
of our biggest concerns going into this was the amount of infrastructure work
we might have to develop to get us back to par with what we had with rails. We
were pleasantly surprised by how much was already there. The Express
framework, in particular, was very nice in terms of features and quality (it's
basically Sinatra for node). For most of what we needed, we were able to find
a node module that could do the job.

We were able to develop this quickly for a few reasons. One was that we
already knew Rails. We used a lot of Rails patterns throughout our project.
Any Rails dev who looked at our directory structure would feel at home.
Another reason was that we had a good understanding of our domain. We were
able to anticipate many features and tasks that would have been expensive
(time-wise) to implement later. A third reason is that we had 2-week release
cycles with demo-able features at the end. This kept things moving at a good
pace.

And of course, we have high caliber staff ;-)

~~~
recusancy
"We were able to anticipate many features and tasks that would have been
expensive (time-wise) to implement later."

So your opinion, so far, is that it is easier to modify the application, after
it's launched, in Rails than it is in Node?

------
riprock
I'm confused -- isn't Node.js's ruby equivalent EventMachine? Why are they
comparing Node.js, an asynchronous I/O library, with a MVC web framework? I
don't think this is a fair comparison unless they tell us the MVC framework
their Node.js is using, and the server stack their Rails app was using.

~~~
colin_jack
Has anyone done a good comparison of Node and EventMachine (a couple of use
cases load tested)?

------
badmash69
I've often thought of doing a high volume messaging project with Node.js but I
seem to be addicted to jBoss Netty.

~~~
cygwin98
Depends on the traffic you need to work with. I've kept an eye on Node.js for
some time. So far, among the failures involved with Node.js are Plurk and
SyncPad. Plurk switched to JBoss Netty, while SyncPad used Erlang instead.
Though that was almost a year ago, things may have improved.

Not to bad-mouth Node.js here. Just to give some counter examples to anybody
who wants to try Node.js in production.

~~~
tszming
Plurk? Seems they changed from Netty to Node JS, not vice versa?

~~~
simonw
They switched back to Netty again: <http://amix.dk/blog/post/19577>

~~~
tszming
Thanks for the link.

It seems running multiple instances of node.js and using a more up-to-date
version would help (maybe).

Anyway, I always think that lack of threading support is definitely a feature
rather than a shortcoming. Threading is hard and average developers should avoid
it. (Unless you are doing scientific computing). If you need to scale,
eventually you will need multiple servers anyway.

------
kennystone
I wonder why they used Node instead of EventMachine, given all the ruby code
they already had.

~~~
rino_jose
EventMachine was definitely on the table, but a big reason we chose node was
that it's asynchronous all the way through. Node modules tend to be
asynchronous by default.

In other evented frameworks, it's hard to make sure that all of your code is
asynchronous (especially when you use existing libraries). If your server runs
into a slow, synchronous function at a critical time, it might come to a
screeching halt and then fall over.
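
As a rough illustration of that failure mode (a sketch only, not LinkedIn's code): `slowSync` below is a hypothetical stand-in for a slow synchronous library call, and `measureTimerDelay` is an illustrative helper that reports how late a 10 ms timer fires while the event loop is blocked.

```javascript
// Hypothetical stand-in for a slow, synchronous library call.
function slowSync(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Busy-wait: no other callback can run on this thread meanwhile.
  }
}

// Resolves with how long a 10 ms timer actually took to fire when a
// synchronous call of `blockMs` milliseconds runs right after scheduling it.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 10);
    slowSync(blockMs); // the timer cannot fire until this returns
  });
}

// Usage: the reported delay is at least the blocking time, not 10 ms.
measureTimerDelay(200).then((delay) => {
  console.log("timer fired after " + delay + " ms");
});
```

Every pending request on the process is stalled in the same way as that timer, which is why one synchronous call in a hot path hurts an evented server so much.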

------
jscheel
This is pretty cool to see that LinkedIn used node.js for their mobile
interface for pretty much the exact same reasons I used node for the last
mobile interface I built. To echo the article's sentiment, node works really
well when you are interfacing with a bunch of other services.

~~~
hello_moto
Can you explain a little bit more about Node's ability to interface with
services?

Let's say I have a few services, Service A that does Invoicing, Service B that
does Payments, and the system has to communicate with 2-3 web-services as
well. Let's say there is a need to create some sort of "portal"-ish solution
(mobile or not). In this situation, what would be the advantage using Node?

Would you mind sharing when Node is not the right choice as well?

I'm interested to know more about Node :)

UPDATE: Thank you for the replies.

FYI, I do understand the concept of node.js in terms of the async callback and
the argument that client-side and server-side can use the same code.

I'm not a heavy Rails user (more of a Java/Python guy) but when it comes to
creating a typical CRUD web-app, Rails would be more productive?

~~~
jimrhoskins
The advantage is that Node uses async/non-blocking IO.

For example, say your portal needs to collect info from 3 sources A, B, and C.
In a synchronous system, you would have to request service A, wait for a
response, then request B, wait, and so on. This means your request is
utilizing the system for all that time, unable to serve other requests,
without extra threads or processes.

When Node and other async systems make an IO call, such as a request to a
service, a callback is registered for the result. So instead of having to wait
for the result, the system becomes available for other actions, like other
requests.
So in this case you could request service A, B, and C all at once, and while
you are waiting for the responses, the system can be handling other requests.
When any of those services completes, it calls back to your code so you can
handle the results, and give a response to the client.

So the advantage is that instead of a request taking A + B + C + extra time,
it can take max(A, B, C) + extra time to serve the request, while serving
other requests concurrently.
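
A minimal sketch of that difference in present-day JavaScript, using promises rather than raw callbacks. `callService` is a hypothetical stand-in for a remote call, and the 30/50/40 ms latencies are made up for illustration.

```javascript
// Hypothetical stand-in for a remote service call taking `ms` milliseconds.
function callService(name, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(name), ms));
}

// Sequential: each call starts only after the previous one finishes,
// so the total is roughly A + B + C.
async function sequential() {
  const start = Date.now();
  const a = await callService("A", 30);
  const b = await callService("B", 50);
  const c = await callService("C", 40);
  return { results: [a, b, c], elapsed: Date.now() - start };
}

// Concurrent: all three calls are started at once, so the total is
// roughly max(A, B, C).
async function concurrent() {
  const start = Date.now();
  const results = await Promise.all([
    callService("A", 30),
    callService("B", 50),
    callService("C", 40),
  ]);
  return { results, elapsed: Date.now() - start };
}

// Usage: compare the two elapsed times.
sequential().then((s) => console.log("sequential ms:", s.elapsed));
concurrent().then((c) => console.log("concurrent ms:", c.elapsed));
```

Both versions return the same three results; only the waiting overlaps in the concurrent one.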

Node is not the only way to achieve this; many async systems exist, like
Tornado for Python, EventMachine for Ruby, and many others. But the JavaScript
in node can be particularly fun to work with especially if you are also doing
the front-end JavaScript, as it pretty much brings the context-switching in
your brain to almost nothing.

~~~
agscala
Here's one thing that never made sense to me: let's say a request comes in and
data needs to be collected from 3 different sources, A, B, and C before a
response can be given back to the client. You say that it would take max(A, B,
C) in order to retrieve all the data. Consider this pretty standard snippet:

    
    
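        function get(source, callback) {
            // getData stands in for whatever call fetches data from `source`
            var result = getData(source);
            callback(result);
        }
    
        // Each request starts only inside the previous request's callback.
        get(A, function(a) {
            get(B, function(b) {
                get(C, function(c) {
                    // CREATE RESPONSE WITH a, b, AND c HERE
                });
            });
        });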
    

Wouldn't you have to wait for the data to be retrieved before running the next
callback resulting in a time of A + B + C anyways? Am I missing something
about the way to retrieve data from multiple sources? I don't see how max(A,
B, C) is possible while still knowing for sure when all the data has been
collected.

~~~
mikeryan
This method has basically turned an async request into a synchronous request.
One idea is that you make requests A, B, and C concurrently, parse the data,
then have a handler which waits for the 3 to complete and then combines them
into a single response, as opposed to cascading all the callbacks.
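
One way such a handler could look, written by hand in callback style. This is a sketch under assumptions: `get` here is a hypothetical async fetch simulated with `setTimeout`, and `getAll` is an illustrative name, not a Node API.

```javascript
// Hypothetical async fetch: calls back with (err, data) after source.delay ms.
function get(source, callback) {
  setTimeout(function () {
    callback(null, "data from " + source.name);
  }, source.delay);
}

// Starts every request immediately and calls `done` exactly once: after the
// last result arrives, or on the first error.
function getAll(sources, done) {
  var results = new Array(sources.length);
  var pending = sources.length;
  var failed = false;
  if (pending === 0) return done(null, results);
  sources.forEach(function (source, i) {
    get(source, function (err, data) {
      if (failed) return;          // an earlier error was already reported
      if (err) {
        failed = true;
        return done(err);
      }
      results[i] = data;           // keep request order, not arrival order
      pending -= 1;
      if (pending === 0) done(null, results);
    });
  });
}

// Usage: B finishes first, but the results come back in request order.
getAll(
  [{ name: "A", delay: 30 }, { name: "B", delay: 10 }, { name: "C", delay: 20 }],
  function (err, results) {
    console.log(err, results);
  }
);
```

The counter is the whole trick: each callback records its result and the last one to finish triggers the combined response, so the total wait is set by the slowest request.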

~~~
agscala
I understand that, but I feel like a majority of the javascript async code
I've seen has all been in this format. By chance, do you have an example of a
handler which would wait for multiple async requests to complete?

~~~
fatjonny
You could write your own way of doing it or use an existing flow control
library. There are lots. :)

<https://github.com/joyent/node/wiki/modules#wiki-async-flow>

For a specific example, look at async.parallel:
<https://github.com/caolan/async>

~~~
agscala
I wish I knew about this kind of thing last time I hacked on node.js... I
would have had such better & cleaner code. This clears up a lot. Thanks

------
badmash69
I would be interested in hearing how LinkedIn gets around Node.js issues. For
example:

Isn't Node.js single-threaded? Would it not under-perform, say compared to
Erlang or Netty, on a multi-core CPU?

~~~
bnoordhuis
> Isn't Node.js single-threaded?

Yes (for now, we might take up V8 isolates).

> Would it not under-perform, say compared to Erlang or Netty, on a
> multi-core CPU?

No. You can spin up multiple processes and handle the load with (for example)
cluster[1].

[1] <http://learnboost.github.com/cluster/>
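
A rough sketch of the multi-process idea, using the `cluster` module that ships with current Node.js releases rather than the LearnBoost library linked above. `startCluster`, the port, and the worker count are hypothetical choices for illustration.

```javascript
const cluster = require("cluster");
const http = require("http");
const os = require("os");

// Hypothetical helper. In the primary process it forks `workers` children;
// in each child it starts an HTTP server. The children share the listening
// port, and each runs its own single-threaded event loop.
function startCluster(port, workers = os.cpus().length) {
  // `isPrimary` is the current name; `isMaster` is the older alias.
  const isPrimary =
    cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (isPrimary) {
    for (let i = 0; i < workers; i++) cluster.fork();
    cluster.on("exit", (worker) => {
      console.log("worker " + worker.process.pid + " exited");
    });
    return "primary";
  }
  http
    .createServer((req, res) => {
      res.end("handled by pid " + process.pid + "\n");
    })
    .listen(port);
  return "worker";
}

// Usage (not run here): call this at the top level of the script, because
// every forked worker re-executes the same file.
// startCluster(8000, 4);
```

Each worker is still single-threaded; the multi-core use comes from running one process per core.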

~~~
reginaldo
> No. You can spin up multiple processes and handle the load with (for
> example) cluster[1].

Which is what I speculate they're doing, as they say they're running just four
instances. Probably, they have servers capable of running 4 threads at the
same time.

------
neovive
Could someone explain the following: "Connections are all stored locally, also
for speed and so if you’re offline, you can still access them."? I'm confused
as to how a connection to a remote resource can be accessed offline. Is all of
the data cached locally? Further down they mention "We don’t use the browser’s
caching system" so I'm assuming they have custom built the cache.

~~~
jtbigwoo
"Connection" is LinkedIn-speak for "Friend."

------
rufugee
I'd love to see someone put together a guide to Node and the current
javascript world for the Rails developer. I've looked at Node a few times, but
it's so much lower-level than Rails...I'd think it would be more valid to
compare it to Rack. What about the other components of a Rails app? What
should you use as an ORM? What about views? What about routing? Etc, etc.

~~~
jasonkostempski
Here's a good list of available modules
<https://github.com/joyent/node/wiki/modules>. If you get involved in the
community you'll quickly discover which ones are widely used in your area of
concern. I've played with express <http://expressjs.com/> routing and views
with the jade <https://github.com/visionmedia/jade> view engine. I found them
quite easy to learn. When I got into node I also got into MongoDB
<http://www.mongodb.org/> so I haven't played with any ORM tools because there
was no mapping of objects to a relation db needed, no mapping at all really
since I pretty much keep everything in JSON format from front to back.

------
laconian
Wow, that's some awful performance from their Ruby solution. Is synchronous
code the norm in many shops?

~~~
virmundi
Yes, synchronous code is the norm for many. In Java, which is where I'm from,
this is due to a limitation on the JEE spec. While some servers do provide
thread pools (JBoss and WebSphere), that behavior is beyond the spec and
breaks the ability to switch to any container you like.

Some of the issues Ruby/Rails faces are due to both the language and the
framework. It is not uncommon for JEE apps to pull back 10k objects, show 10
to a user and pitch the rest. ORMs like Hibernate tend to be fairly quick,
especially when tuned. As a result, synchronous code is not that problematic.

This might start to change with the advent of Scala in JVM on the server. The
actors it provides MIGHT make Oracle rethink the spec a bit. But I doubt it.

~~~
hello_moto
I believe an asynchronous programming model made it into Java EE 6, and I
expect more to be added in Java EE 7.

[http://java.sun.com/developer/technicalArticles/JavaEE/JavaE...](http://java.sun.com/developer/technicalArticles/JavaEE/JavaEE6Overview_Part3.html#asynejb)

