If Python/Django or Ruby/Rails can get your app out the door and into customer hands faster, it is almost always the right thing to use.
1. There are certainly other, very valid, technical reasons for choosing node.js over other technologies early. But let those reasons be about the problems you are solving today, not the ones you might need to worry about when you have 50 servers to deal with.
In most cases, you're not. You're mostly making a big logic spaghetti mess on both the client and the server AND making your pages load slower, especially for the initial load (client performance, JS loading times, etc.).
>What makes you think refactoring code is going to give you a 10x improvement in efficiency?
Because even correctly implementing just the caching layer, with nested/micro-caching, can give you up to 1000x improvement in efficiency in the first place.
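To make the micro-caching point concrete, here is a minimal sketch in TypeScript. It is not anyone's production code: `MicroCache`, `renderPage`, the URL, and the one-second TTL are all made up for illustration. The only idea it shows is that a very short-lived cache in front of an expensive render turns a burst of identical requests into a single render.

```typescript
// Micro-caching sketch. All names and numbers here are hypothetical.
type CacheEntry<V> = { value: V; expiresAt: number };

class MicroCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without real waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string, compute: () => V): V {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value; // still fresh: skip the expensive work
    }
    const value = compute();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}

// Simulated burst: 1000 requests for the same page inside one TTL window.
let renders = 0;
const renderPage = (): string => {
  renders += 1; // stands in for templating plus database queries
  return "<html>profile 42</html>";
};

let clock = 0; // fake clock, in milliseconds
const pageCache = new MicroCache<string>(1000, () => clock);
for (let i = 0; i < 1000; i++) {
  pageCache.get("/profile/42", renderPage);
}
console.log(`renders for 1000 requests: ${renders}`);
```

In this toy run the page is rendered once for the 1000 requests. How close a real system gets to that ratio depends entirely on how many requests share a key within the TTL.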
When you have real data on the client instead of just a bunch of markup, you can be a lot smarter about how and when you make additional AJAX requests. Optimistic updates can make a huge difference in perceived performance.
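A rough sketch of what an optimistic update looks like, in TypeScript. `Post`, `ClientStore`, `like`, and `sendLike` are hypothetical names, and `sendLike` stands in for whatever AJAX call the app really makes. The local state changes first, so the UI can redraw immediately, and the change is undone only if the server rejects it.

```typescript
// Optimistic update sketch. `Post`, `ClientStore`, and `sendLike` are hypothetical.
type Post = { id: number; likes: number };

class ClientStore {
  private posts = new Map<number, Post>();

  add(post: Post): void {
    this.posts.set(post.id, { ...post });
  }

  likes(id: number): number {
    const post = this.posts.get(id);
    if (post === undefined) throw new Error(`unknown post ${id}`);
    return post.likes;
  }

  // Apply the change locally right away and hand back an undo function
  // to call if the server later rejects the request.
  likeOptimistically(id: number): () => void {
    const post = this.posts.get(id);
    if (post === undefined) throw new Error(`unknown post ${id}`);
    post.likes += 1;
    let undone = false;
    return () => {
      if (!undone) {
        undone = true; // make a repeated undo harmless
        post.likes -= 1;
      }
    };
  }
}

// The UI reads from the store immediately; the network call settles later.
async function like(
  store: ClientStore,
  id: number,
  sendLike: (id: number) => Promise<void>, // assumed AJAX call
): Promise<void> {
  const undo = store.likeOptimistically(id);
  try {
    await sendLike(id);
  } catch {
    undo(); // server said no: roll the local state back
  }
}
```

In a click handler you would call `like(store, postId, sendLike)` and re-render from `store.likes(postId)` straight away, without waiting for the round trip.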
You will probably get the same mess. There is plenty of literature about that, e.g. a complete rewrite is what killed Netscape.
That's a meme started by a Joel Spolsky article, and maybe an allusion to the "second system effect" (which is about something else altogether). Hardly "plenty of literature about that".
That is, something an order of magnitude more difficult than LinkedIn or 99% of web properties out there.
There have been TONS of successful rewrites. Especially in the web space, it's almost trivial to rewrite your webapp or parts of it. To name but a few:
Twitter, the new Digg, SoundCloud, Basecamp, etc etc.
"That's a meme started by a Joel Spolsky article, and maybe an allusion to the "second system effect" (which is about something else altogether)."
That said, the "second system effect" is not merely about the risks of rewriting, but especially about architecture and design choices. From the Wikipedia article itself:
"""People who have designed something only once before, try to do all the things they "did not get to do last time," loading the project up with all the things they put off while making version one, even if most of them should be put off in version two as well."""
That is, if you design your rewrite _without_ wanting to build a bigger, more involved product, but merely a cleaner and more cleanly made product, this does not apply.
Another quote from the very article: """The second-system effect refers to the tendency of small, elegant, and successful systems to have elephantine, feature-laden monstrosities as their successors."""
This is not the case we refer to here. By version 4 (and even before that), Netscape had become an ungodly mess, not a "small, elegant" system. And Mozilla/Firefox, the rewrite, is cleaner and more elegant than Netscape was.
Consider a 100 line Python script. People can rewrite it from scratch in 100 different ways, while improving upon it with no problem. At some point of complexity this stops being true, but Brooks was talking about huge projects, built by enormous teams, like OS/360 and such. Not some 20K - 100K line web project.
What do you mean?
We wouldn't have those if it weren't for the rewrite. The old code was a mess even before version 4 (by its developers' own admission), and it could never have gotten to the point of competing in the engine space again.
That is, with the old Netscape rendering engine it would not be possible to extend it to compete in the modern HTML5/Canvas/GPU acceleration/CSS3/add-ons/separate contexts for each tab/etc era.
The biggest improvements came from persistent connections to redis/mongodb and from polling for updated information independently of requests, so there were no cached-or-fetch shenanigans at all in some areas.
I'm really puzzled by this. A single dedicated large .NET box (16GB RAM and 6 Core Processor, 1TB raid, etc.) runs $150 a month. When I was looking at PaaS, Heroku came in more than five times that much for similar capability. $1,000 a month gets me SIX of these dedicated boxes that can be tied together as needed.
Why did your dedicated setup cost so much that you could _save_ that much, never mind why you were spending that much in the first place?
I'd actually be very surprised if NodeJS running on Heroku (which is built on EC2) performs better than compiled .NET code running on dedicated Windows hardware.
The equivalent in .NET I guess would be BackgroundWorkers that are independently prefetching the data required most of the time but I could never get them to Just Work.
Specifically for my use case 99+ percent of requests receive some data, do some light manipulations and then push the data to redis about 8,000 to 12,000 times a second. With .NET I could only push to locally running software instances because anything remote couldn't keep up with the connection volume (without throwing even more hardware at the problem).
It's really just like using memcached or the built-in .NET caching, rather than a write-through cache: new data reaches each dyno either via the periodic refresh or when it tries to create that record and finds it already exists. Writes are done immediately and the caches don't get updated, because there are 8 - 16 dynos, so why bother updating only the single dyno that created the record?
I did originally use redis pub/sub to push out updates to everything but I ended up removing it because it was unnecessary.
Here's some example code, it pre-fetches all the leaderboards (not scores) every 30 seconds: http://pastebin.com/asq6eExu
Higher up in the same script is the api for the leaderboard data with stuff like: http://pastebin.com/gsfvsZsv
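For anyone who can't open the pastebin links, the general shape of a periodic prefetch is roughly the following TypeScript sketch. It is not the code from those links: `Leaderboard`, `LeaderboardCache`, `loadAll`, and `fetchLeaderboardsFromDb` are hypothetical, with `loadAll` standing in for the redis/mongodb query. Request handlers only ever read the in-memory snapshot, and a timer replaces that snapshot every 30 seconds.

```typescript
// Periodic prefetch sketch. `Leaderboard`, `LeaderboardCache`, and `loadAll`
// are hypothetical; `loadAll` stands in for a redis/mongodb query.
type Leaderboard = { id: string; title: string };

class LeaderboardCache {
  private byId = new Map<string, Leaderboard>();
  private stopTimer: (() => void) | undefined;
  private loadAll: () => Promise<Leaderboard[]>;
  private intervalMs: number;

  constructor(loadAll: () => Promise<Leaderboard[]>, intervalMs: number = 30 * 1000) {
    this.loadAll = loadAll;
    this.intervalMs = intervalMs;
  }

  // Build the new map off to the side, then swap it in with one assignment
  // so readers never see a half-filled cache.
  replaceAll(rows: Leaderboard[]): void {
    const next = new Map<string, Leaderboard>();
    for (const row of rows) next.set(row.id, row);
    this.byId = next;
  }

  async refresh(): Promise<void> {
    this.replaceAll(await this.loadAll());
  }

  start(): void {
    const tick = (): void => {
      // On failure, keep serving the previous snapshot.
      this.refresh().catch(() => undefined);
    };
    tick();
    const handle = setInterval(tick, this.intervalMs);
    this.stopTimer = () => clearInterval(handle);
  }

  stop(): void {
    if (this.stopTimer !== undefined) this.stopTimer();
    this.stopTimer = undefined;
  }

  // Request handlers call this; it never touches the database.
  get(id: string): Leaderboard | undefined {
    return this.byId.get(id);
  }
}

// Usage (not started here so the example exits immediately):
//   const boards = new LeaderboardCache(fetchLeaderboardsFromDb);
//   boards.start();
```

The trade-off is the one described above: reads can be up to one interval stale, in exchange for never waiting on the database inside a request.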
This is hardly surprising.
The Node.js ecosystem has a lot of potential, but the variety of off-the-shelf add-ons is severely limited compared to either of those more mature frameworks.
If you think Node.js is easy, you've never really experimented enough to understand what makes Rails so effortless. It's a lot easier to produce a production-ready application with Rails than it is with Node.js as it is today.
In four years, as Rails starts to add less critical features, Node.js may well have caught up.
I think in the long run Node will beat the pants off of Rails but it's going to take an enormous amount of work to make that happen.
The missing 5% is mostly things that make your development process more effortless.
I've found that it's easy to get a first cut of an application out inside of two weeks with Rails, but you will probably need more time or lower expectations when working with something more limited like Django or Sinatra.
Since Rails imposes a lot of conventions, applications are easier to organize if you follow the rules. Sinatra is far more open to interpretation, so if you're not disciplined it can turn into a bit of a mess.
I'm a big fan of the DRY principle and it's much easier to apply within Rails than in other environments. A lot of this relates to how Ruby is a lot easier to meta-program than other languages.
Pro tip: Don't put this on your resume.
For those that love to build things from the ground up or to carve out new solutions, Node is a great place to be. I think it's got enormous potential and is the biggest opportunity since Python and Ruby took off years ago.
Just don't think because you can create the same sorts of apps with Node that it's as easy.
Ruby on Rails even three years ago was laughable compared to today's toolset. Node is catching up quickly, covering ground faster than Ruby ever did, but still lagging.
If you already know Python or Ruby, then ya, using Node.js would be silly. But if you have to pick one because you're equally familiar with all of the languages, then Node.js isn't a bad choice at all.
You sick bastard!
Depending on how they're using node they could very well keep this segregation. But some implementations with node have server/client sharing code. I know I've seen examples of this. I imagine that's how they're doing things.
This argument for using Ruby (or any language that makes programming easy but runs like treacle) to get something out the door a few months earlier is bullshit. I would rather release something a few months later that I didn't have to totally rewrite 10 times down the line.
The reason why "getting it out the door" is so important is because nobody behind the door generally has any idea about what people will really pay for. By making it fast (read: cheap) to make, if you find that you don't get it right, you can try again, and again, and again.
Now, you can argue that "they shouldn't be releasing without knowing they will be successful", but if they have the ability to see the future, they should just take that VC money and put it in the market. People like Steve Blank and Eric Ries have a lot of evidence that running most startups (software startups particularly) with a "build it and they will come" attitude is rife with failure.
Now, if you are just building for fun, that's different. I knew a guy building a Dropbox clone in C. More power to him (though I wouldn't have touched it, because I think it is too hard to get security right all the time in a language like C).
Will making it faster increase the chances of more people using it? Also this attitude of building shit products (Yes software is a product) because..well the chances are it will fail.. is precisely why so many startups fail. This is why you get one crapola startup after another........
"Show HN: Crap.ly -- we built this in 3 weeks using Ruby, it will scale up to 20 users before craping out, its a twitter scraper that connects to app.net and diaspora showing how much Money your Kick-starter project has made and includes quotes from Paul Graham about how to build a successful startup, because he has built so many"
Imagine Linus had built Linux using fucking node.js
The hard things to solve require a lot of user-facing iteration. Basically, the faster you can try new things, the more likely you'll get product-market fit before you run out of money.
My feelings on Rails are decidedly mixed, but fast prototyping is one of the things they got right.
The basic notion is that every bit of code is better the second time it's written, and C development is just too slow for the first iteration.
Really? C for a problem that is largely string manipulation and database access?
The problem is folks, as soon as you get a large number of users every compromise you made by using a toy language or database is magnified 1000 times. Google didn't implement in some scripting language to get to the market a few months early.
Some Rough Statistics (from August 29th, 1996)
BackRub is written in Java and Python and runs on several Sun Ultras and Intel Pentiums running Linux. The primary database is kept on an Sun Ultra II with 28GB of disk. Scott Hassan and Alan Steremberg have provided a great deal of very talented implementation help. Sergey Brin has also been very involved and deserves many thanks.
-Larry Page page@cs.stanford.edu
There's something to be said about rapid prototyping and evaluation.
I don't believe all companies can survive with a python or ruby solution. I do think that, as technologists, we worry too much about the "optimal" solution to technical problems when, in most cases, businesses are made or lost in people problems. People problems are really hard because there are few "right" answers. Instead, it is an optimization game, and optimization games require agility.
If you are so good at C or C++ that you can iterate amazingly quickly with them, then use them! That will give you a leg up later. I've been developing software for 15 years and have used everything from C and C++ to Java to Python to Objective-C, and I've seen a massive difference in my ability to iterate with each of them.
Optimize for what works best for you, but don't be surprised if you choose C++ and a competitor who doesn't care about the "perfect" solution runs circles around you in the market because they chose something different (even if they re-write in C++ in 10 years, after they've stolen all of your customers).
That's actually exactly what they did.
>This led them to a model where the application is essentially piping all of its data through a single connection that is held open for as long as it is needed.
So it sounds like they moved from a request/response-driven architecture to a streaming architecture. They also moved away from Rails and adopted an event-driven approach.
It is a well-known fact that it is almost impossible to stream stuff with Rails currently, so adopting an event-driven approach made sense. I can see how these two factors alone would contribute hugely to performance.
If you throw a Postgres database into the mix, which supports asynchronous queries, one can pretty much beat 20 Passenger instances serving similar requests. The problem, though, is that doing non-blocking IO does not reduce database load. So they likely re-architected that bit as well.
Now, they apparently did evaluate EM, and according to them node.js performed 20 times better than EM/Twisted in certain tests, so they went ahead with node.js. I am as curious as anyone else to see what those tests were.
So according to the original article, Node.js did not perform 20 times better than the existing Rails-based backend. According to Prasad, it performed 20 times better than alternatives to Node such as EventMachine and Python Twisted (they did evaluate both of them).
Now, I am having a hard time believing node.js can outperform EventMachine or Twisted by 20 times. Most benchmarks I have seen and done tell me node is marginally ahead. I would obviously like to see what they benchmarked and how.
If they found a 20x performance difference there is probably a bug in one of those competing libraries.
While I'll freely admit v8 is much faster than MRI Ruby, the efficiency gains are likely more related to 1) the rewrite factor 2) moving to non-blocking 3) the fact that the original server ... um, needed love
EDIT: Mobile only. Maybe the title should be updated to reflect that.
I can believe it, if their caching strategy was garbage before and part of a massive rewrite was rethinking that. If you fail to use your caches, you'll pay for it.
I mean it's not like LinkedIn is about split-second changes, you could probably statically generate most of it and serve it with a single nginx instance.
"For our inevitable rearchitecting and rewrite, we want to cache content aggressively, store templates client-side (with the ability to invalidate and update them from the server) and keep all state purely client side."
A better understanding of the problem and experience running the system were probably key to building the new high-performance architecture. Obviously, the old one lacked these big advantages.
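On the quoted plan to store templates client-side with server-driven invalidation, here is a guess at the general mechanism, sketched in TypeScript. This is not LinkedIn's implementation: `TemplateStore`, the version strings, and the manifest shape (template name mapped to a version) are all assumptions. The client keeps templates keyed by version, the server sends a small manifest, and the client drops and refetches only what changed.

```typescript
// Client-side template cache sketch. `TemplateStore` and the manifest shape
// (template name -> version string) are assumptions for illustration.
type CachedTemplate = { version: string; source: string };

class TemplateStore {
  private templates = new Map<string, CachedTemplate>();

  put(name: string, template: CachedTemplate): void {
    this.templates.set(name, template);
  }

  get(name: string): string | undefined {
    const cached = this.templates.get(name);
    return cached === undefined ? undefined : cached.source;
  }

  // Compare the cache with the server's manifest, drop anything outdated,
  // and return the names that must be fetched again.
  invalidate(manifest: Record<string, string>): string[] {
    const toFetch: string[] = [];
    for (const name of Object.keys(manifest)) {
      const cached = this.templates.get(name);
      if (cached === undefined || cached.version !== manifest[name]) {
        this.templates.delete(name);
        toFetch.push(name);
      }
    }
    return toFetch;
  }
}

const templates = new TemplateStore();
templates.put("profile", { version: "v1", source: "<h1>{{name}}</h1>" });
templates.put("feed", { version: "v1", source: "<ul>{{items}}</ul>" });

// The server says "profile" is now v2 and "feed" is unchanged.
const stale = templates.invalidate({ profile: "v2", feed: "v1" });
console.log(stale); // only "profile" needs to be fetched again
```

After the first visit, the server then only has to ship data plus a tiny manifest, which is where the "cache content aggressively" part would pay off.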
If you rewrite it in a different language, then you get to blame the old language and framework for all the problems. No harm, no foul-- and you get to put the new shiny thing on your resume.
If I had wanted to add Rails to this comparison I would have compared apples to apples and used Metal instead of including the entire stack.
And this is really interesting:
The article is touted as praise for a stack, but my gut says that it's really a smart restructuring of how they serve mobile. Either way, good on them for the efficiency boost.
I am not unhappy with my choice but I do not have enough data to compare with other tools. I do not think a lot of people have either. Once you start with a tool you tend to keep it since you invested a lot of time learning it as well as developing something with it. I think few can afford switching tools (e.g. FB switched from HTML5 to native recently for their mobile interface).
I'm confused by that being an advantage - I guess that's better than using some language you don't know at all, but it's still a bit of a different approach to JS, no?
"focus on simplicity, ease of use, and reliability; using a room metaphor; 30% native, 80% HTML; embedded lightweight HTTP server; "
Does going from 30 servers to 3 actually make an impact to a $13B company?
The title could be improved by reflecting that this is for LinkedIn mobile, but the rails->node is correct.
What really surprises me is that they got such a huge gain. Most projects I've worked on are DB or I/O bound. Maybe they store everything in RAM.