
An expensive line of code: scaling Node.js on Heroku (2013) - diegorbaquero
http://micheljansen.org/blog/entry/1698
======
brendangregg
"And then it hit me"... ok, a great story, but you have to wonder what people
would do if they didn't just guess the answer.

That much slowness may be disk I/O bound, which would show up in the USE
method, or even just my Linux performance checklist[1]. It'd also show up by
tracing blocking events, even an off-CPU flame graph, but that's overkill.

[1] [http://techblog.netflix.com/2015/11/linux-performance-
analys...](http://techblog.netflix.com/2015/11/linux-performance-analysis-
in-60s.html)

~~~
tayo42
This is on heroku though. You can't really do any of that?

~~~
brendangregg
Ah, you may be right! I have not used heroku. I'd hope they'd make some of the
metrics available somewhere.

------
daurnimator
Not responding. Google cache:
[https://webcache.googleusercontent.com/search?q=cache:4cZEfU...](https://webcache.googleusercontent.com/search?q=cache:4cZEfUFT2rUJ:micheljansen.org/blog/entry/1698+&cd=1&hl=en&ct=clnk&gl=us)

~~~
smarx007
[http://archive.is/MSAav](http://archive.is/MSAav)

------
alanning
Careful what conclusions you draw from this post. The comments have further
discussion that is very relevant:

> Edward Muller says: 1 October, 2013 at 20:29

Was/Is this app located in the EU region? If so, we recently (yesterday
2013/9/30) fixed a problem where each log line would result in a separate,
almost synchronous post to logplex. This problem should now be solved.

> Michel (ed: Blog Author) says: 21 October, 2013 at 10:33

@Edward and Fred: the app is indeed running in the EU region. I haven’t gotten
around seeing if the performance has improved (the app in question was very
ephemeral to begin with), but I’m glad this helped you close in on this issue!

Also, another commenter tested the blocking vs. non-blocking property of
stdout and found the default is "pipe" on Heroku [1].

So the simplest conclusion is that this was a problem affecting Heroku
logging in the EU region that was identified and resolved back in 2013.

1\.
[http://micheljansen.org/blog/entry/1698#comment-522151](http://micheljansen.org/blog/entry/1698#comment-522151)

------
eyelidlessness
Without digging into sources, since Node is single-threaded, that drastic a
performance difference for an IO-bound task would have me looking at whether
logging is synchronous. Node's APIs discourage synchronous IO, but don't
prevent it.

~~~
nprescott
Zeke mentions this in the comments[0] on the article:

 _> The console functions are synchronous when the destination is a terminal
or a file (to avoid lost messages in case of premature exit) and asynchronous
when it’s a pipe (to avoid blocking for long periods of time)._

Which still seems to be the case in the LTS[1].

[0]:
[http://micheljansen.org/blog/entry/1698#comment-522124](http://micheljansen.org/blog/entry/1698#comment-522124)

[1]:
[https://nodejs.org/dist/latest-v4.x/docs/api/console.html#co...](https://nodejs.org/dist/latest-v4.x/docs/api/console.html#console_asynchronous_vs_synchronous_consoles)

~~~
k__
So, always pipe before write?

------
partycoder
Node is meant to be used in scenarios where you hand most of the work off to
libuv. However, if you do CPU-intensive work in JavaScript itself, such as:

\- serialization

\- encoding

\- cryptography

\- compression

\- etc.

...then you are better off doing it in a lower-level language. The Express
logger counts as a form of serialization, and is therefore slow.

If you absolutely have to do CPU-intensive work, then limit per-tick
execution time and break the work into multiple ticks; otherwise you will
block the Node event loop and your system will degrade until it stops
processing requests.
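
One common pattern for breaking work into multiple ticks is to chunk it and
yield with `setImmediate` between chunks; a minimal sketch (function names and
chunk size here are illustrative):

```javascript
// Process a large array in small chunks, yielding to the event loop
// between chunks via setImmediate so other requests can still be served.
function sumInChunks(items, chunkSize, done) {
  let total = 0;
  let i = 0;
  function step() {
    const end = Math.min(i + chunkSize, items.length);
    for (; i < end; i++) total += items[i];
    if (i < items.length) {
      setImmediate(step); // yield; continue on a later tick
    } else {
      done(total);
    }
  }
  step();
}

sumInChunks([...Array(1000).keys()], 100, (total) => {
  console.log(total); // 0 + 1 + ... + 999 = 499500
});
```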

You can find these bottlenecks by profiling. Just use a profiler, like
nodegrind or flame graphs, and the hot spots will show up very quickly.
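
As one concrete option, a minimal sketch using Node's built-in V8 sampling
profiler (tools like nodegrind and flame graphs build on similar tick data;
the inline script here is just a stand-in for your app):

```shell
# --prof writes a V8 tick log (isolate-*.log) into the working directory.
node --prof -e 'let s = 0; for (let i = 0; i < 1e7; i++) s += Math.sqrt(i);'

# Post-process the tick log into a readable report of where CPU time went.
node --prof-process isolate-*.log > profile.txt
```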

~~~
aikah
Or just ditch Node and write your server in Go. There is no reason to use
Node server-side today unless the task is specifically JavaScript-related.
There is zero advantage in using Node.js.

~~~
roelvanhintum
A universal React application would be one...

------
josteink
I know this is an old piece but this line here struck me as odd:

> Furthermore, as each request needs to be stored in a database, it was not a
> matter that can easily be solved using caching. I needed a backend that
> could deal with 100+ requests per second without breaking a sweat.

I thought this was why we had databases in the first place. They've been able
to deliver this performance without any issues for decades now.

What's the issue?

> I’m a bit surprised that a bit of logging has such a severe impact on
> performance on Heroku and even more surprised that they recommend you to
> enable logging in their Express tutorial.

While I agree that was certainly a surprise to me too, I'm not surprised by
the advice. Not all apps need to scale massively, and how else are you going
to debug your app deployed on Heroku when strange things start to happen?

~~~
bhaak
It's phrased oddly but what he's saying there in other words is: "I can't use
caching for writing to the database". This is of course true but surprising as
that has nothing to do with the problem.

Maybe that was just his usual approach, "if it's too slow, cache it", instead
of looking at _what_ makes it slow. As mentioned in the first paragraph, he
hadn't had to "tackle the real problems[tm]" so far, so he didn't have a lot
of practice yet.

I also agree that with 100+ requests per second the database certainly should
not be the problem, but you need to get the data to the database in the first
place (and that turned out to be the issue).

~~~
smcl
I can see the logic - if you have a service that only takes GPS co-ordinates
and dumps them into a database, and it's underperforming, then my first
thought would also have been "why is my DB insert slow?" rather than "what
part of my service is slow?" (in fact I'd probably have ditched the DB insert
and written a console.log(), which likely would've turned up the problem).
These sorts of educated guesses serve me well most of the time and take
little time to verify - if the guess is wrong, we can dig in with whatever
tools are available, safe in the knowledge that not much time was wasted.

Speaking of tooling, is there much available for nodejs that would allow this
sort of investigation into performance, or debugging line by line?

------
strictfp
I got a 1000x performance degradation in Node from using a simple JSONPath
expression. That was an eye-opener!

------
0xmohit
Just curious: is the blog running on nodejs?

------
n3d1m
I think the page got slashdotted? There's a bit of irony: an article about
scaling wasn't scaling ;-)

