
Node.js in Flame Graphs - stoey
http://techblog.netflix.com/2014/11/nodejs-in-flames.html
======
ChuckMcM
The money quote:

 _" We made incorrect assumptions about the Express.js API without digging
further into its code base. As a result, our misuse of the Express.js API was
the ultimate root cause of our performance issue."_

This situation is my biggest challenge with software these days. The advice to
"just use FooMumbleAPI!" is rampant, and yet the quality of the implemented
APIs and the amount of review they have had varies all over the map.
Consequently, any decision to use such an API seems to require that one first
read and review the entire implementation of the API; otherwise you get the
experience that Netflix had. That is made worse by good APIs, where you spend
all that time reviewing them only to note that they are well written, but
where each new version, which could have not-so-clued-in people committing
changes, might need another review. So you can't just leave it there. And when
you find the 'bad' ones, you can send a note to the project (which can respond
with anything from "great, thanks for the review!" to "if you don't like it,
why not send us a pull request with what you think is a better version").

What this means in practice is that companies that use open source
extensively in their operation become slower and slower to innovate, as they
are carrying the weight of a thousand different systems of checks on code
quality and robustness, while people using closed source start delivering
faster and faster because they effectively partition the review/quality
question off to the person selling them the software and focus on their
product innovation.

There was an interesting, if unwitting, simulation of this going on inside
Google when I left, where people could check-in changes to the code base that
would have huge impacts across the company causing other projects to slow to a
halt (in terms of their own goals) while they ported to the new way of doing
things. In this future world, changes like the recently hotly debated systemd
change will incur costs while the users of the systems stop to re-implement
in the new context, and there isn't anything to prevent them from paying this
cost again and again. A particularly Machiavellian proprietary source vendor
might fund programmers to create disruptive changes expressly to inflict such
costs on their non-customers.

I know, too tin hat, but it is what I see coming.

~~~
akkartik
You're assuming that your closed source vendors are perfectly aligned with
you. In practice they almost inevitably seem to cause capture
([https://en.wikipedia.org/wiki/Regulatory_capture](https://en.wikipedia.org/wiki/Regulatory_capture)).

Open/closed is a red herring here. Projects slowing down as they succeed seems
to be a universal phenomenon, from startups to civilizations. Specialization
leads to capture. I think almost exclusively about how to fix this:
[http://akkartik.name/about](http://akkartik.name/about) (you've seen and
liked this), [http://www.ribbonfarm.com/2014/04/09/the-legibility-
tradeoff](http://www.ribbonfarm.com/2014/04/09/the-legibility-tradeoff)

Disclosure: google employee

~~~
nostrademons
Yep, closed source doesn't solve the problem either. If you believe that just
because you're paying money for someone to take responsibility for a problem,
they will actually solve the problem in a way that's amenable to you...well,
there are numerous closed-source software vendors looking to sell you
something.

In practice, the way to avoid this is to keep the software as simple as
possible. Try to adjust to your user's most pressing current needs, not every
need they might conceivably have. Killing features and deleting code is as
important as launching features and writing code; make sure that your
incentive systems reward this. Very often, third-party code gets pulled in to
scratch one particular itch; if it's no longer itching, rip the code out. If
it is still itching and you've built significant parts of your system around
it, you may want to think about replacing the innards with a home-grown
system.

~~~
barrkel
When a software provider I use starts ripping out features I relied upon, I
start looking for an alternate provider, one that isn't so eager to kill
features. And in particular, I try not to learn or rely on any new features,
if there is past behaviour of feature removal by the provider.

It's better to be careful - very careful - about what you add, and to have a
story for migration, than to remove features.

~~~
nostrademons
This depends on industry, of course - in consumer web it's much better to risk
pissing off a few customers but make the majority of them happy than to keep
all your existing customers but risk losing out on a new innovation that gives
a competitor a toe-hold. Enterprise SaaS probably has different trade-offs,
and software infrastructure probably different still.

This paradox, BTW, could be thought of as the full-employment theorem for
entrepreneurs. As long as it is rational for a business to avoid change for
fear of having to remove or support it later, then there will exist changes
that a company with no customers and no codebase could implement that no
incumbent would dare. Some of these are bound to be useful to some segment of
the market, and that's why you get continued disruption in technology markets.

------
thedufer
> It’s unclear why Express.js chose not to use a constant time data structure
> like a map to store its handlers.

It's actually quite clear: most routes are defined by a regex rather than a
string, so there is no built-in structure (if there's a way at all) to do O(1)
lookups in the routing table. A router that only allowed string route
definitions would be faster but far less useful.

I can't explain away the recursion, though. That seems wholly unnecessary.

Edit: Actually, I figured that out, too. You can put middleware in a router so
it only runs on certain URL patterns. The only difference between a normal
route handler and a middleware function is that a middleware function uses the
third argument (an optional callback) and calls it when done to allow the
route matcher to continue through the routes array. This can be asynchronous
(thus the callback), so the router has to recurse through the routes array
instead of looping.

~~~
andrewvc
A lot of people here are right: the right way is with an NFA. I just want to
add that the solution is not even hard; you can do it with string
concatenation and capture groups using regexps. Regexps are NFAs, and are
highly optimized C code in just about every JS engine.

If I have the routes /foo/bar and /foo/bar/(\d+) I can generate the regexp
((^\/foo\/bar$)|(^\/foo\/bar\/\d+$))

I'm not at all surprised; the quality of software in node is pretty low, and
I've seen numerous issues in node libs that are just as boneheaded. I swear,
the fact that the express devs overlooked a key optimization is crazy. Rails,
by way of example, uses the Journey engine to solve this problem
([https://github.com/rails/journey](https://github.com/rails/journey))

~~~
thedufer
> I can generate the regexp ((^\/foo\/bar$)|(^\/foo\/bar\/\d+$))

And how would you know which one got matched? The regex match isn't going to
tell you that. Also, it needs to recognize if multiple were matched, which is
definitely not going to be done by the built-in regex matcher.

It's certainly possible, but pretending it's trivial isn't helping, either.

~~~
blinks
For more concrete syntax, consider the following Python:

    
    
      >>> import re
      >>> route = re.compile(r'(?P<fb>^/foo/bar$)|(?P<fbd>^/foo/bar/\d+$)')
      >>> route.match('/foo/bar').groupdict()
      {'fb': '/foo/bar', 'fbd': None}
      >>> route.match('/foo/bar/1').groupdict()
      {'fb': None, 'fbd': '/foo/bar/1'}
    

If the fb group is set, act on the first route. If the fbd group is set, act
on the second.

~~~
rudolf0
I know very little of NFAs/DFAs/FSMs, or even string parsing in general, but a
year ago I built a URL matching engine using exactly this method in Python, in
combination with Google's RE2 library
([https://code.google.com/p/re2/](https://code.google.com/p/re2/)). It was far
faster than anything else I experimented with, and RE2 also improved the speed
dramatically by eliminating backtracking.

Nice to know that what I made is considered the best solution algorithmically.

------
rwaldin
I'm surprised nobody has mentioned that Express has a built-in mechanism for
sublinear matching against the entire list of application routes. All you have
to do is nest Routers
([http://expressjs.com/4x/api.html#router](http://expressjs.com/4x/api.html#router))
based on URL path segments, and you will reduce the overall complexity of
matching a particular route from O(n) to near O(log n).

------
remon
I wonder what the thought process was behind moving their web service stack
(partially?) to node.js in the first place. For a company with the scale and
resources of Netflix it's not exactly an obvious choice.

~~~
tjholowaychuk
I share this thought, I'm not trolling, I really believe node is a bad
solution for something like Netflix.

Node has its perks but for a money making machine that relies solely on being
available and providing a good customer experience, not so much.

I can't imagine the ops nightmares at that size: one buggy code path and the
entire cluster could be down. These are the issues that drove me away from
Node to Go; in my opinion Node has way too many issues to run in money-making
scenarios.

~~~
nmjohn
> in my opinion Node has way too many issues to run in money-making scenarios.

You can say that about a lot of languages other than node. How many
billion-dollar companies have their software written in PHP, a language that
many people would agree has far more glaring issues than node?

My point is that I believe your comments miss the larger picture. There is
far more involved in deciding which language to build a product with than
"which language is best."

I don't disagree with you about Go being a great language - but for most
companies, it is not even remotely practical to use. Hiring talented Go
developers is hard because there are so few. Their current employees may or
may not know Go, what do you do about that? Etc.

~~~
tjholowaychuk
For sure, but jumping into Node from a clean slate is probably not the best
idea, legacy is different.

If anything, the comment on hiring highlights the use of Node being
problematic. It's difficult to write robust systems in Node even as a
seasoned Node developer.

------
elwell
TIL, SVGs can display labels on element hover:
[http://cdn.nflximg.com/ffe/siteui/blog/yunong/200mins.svg](http://cdn.nflximg.com/ffe/siteui/blog/yunong/200mins.svg)

Nice, contained way to show data like this.

~~~
TazeTSchnitzel
SVGs can contain ECMAScript, video, canvases, animation, all sorts of things.
They're the replacement for Flash that nobody seems to use.

~~~
escape_goat
Well, there was a good reason for that, for a long time: IE didn't even begin
to support .svg until version 9, which means that it isn't a realistic option
for deployment for anyone who needs to support pre-evergreen browsers.

I investigated them a year or two ago, because I found that AngularJS could
work quite nicely within an .svg document, which opened up some exciting
possibilities. My recollection is that at the time there were some critical
cross-browser problems with font rendering that made the topic very kludgy
and complicated. I expect that matters have improved since, but I do not know
to what extent.

~~~
Andir
And IE still doesn't support declarative animation (SMIL) in SVGs, nor do
they plan to. Someone there decided that script-based animation is more
robust, so they just bypassed that capability. I think it's a terrible idea,
because you forgo having tightly integrated SVG animation modules that can
just be dropped in as needed.

------
vkjv
> ...as well as increasing the Node.js heap size to 32Gb.

> ...also saw that the process’s heap size stayed fairly constant at around
> 1.2 Gb.

This is because 1.2 GB is the max allowed heap size in v8. Increasing beyond
this value has no effect.

> ...It’s unclear why Express.js chose not to use a constant time data
> structure like a map to store its handlers.

It is non-trivial (not possible?) to do this in O(1) for routes that use
matching / wildcards, etc. This optimization would only be possible for
simple routes.

~~~
tedchs
That seems like a pretty low size to me... how are people getting around this
when they need to handle >1.2GB of data on Node?

~~~
jonny_eh
Native code modules I assume.

------
tjholowaychuk
Sounds like a documentation issue, or a lack of a staging environment. I've
written and maintained countless large Express applications, and routing was
never even remotely a bottleneck; hence the simple & flexible linear lookup. I
believe we had an issue or two open for quite a while in case anyone wanted
to report real use cases that performed poorly.

Possibly worth mentioning, but there's really nothing stopping people from
adding dtrace support to Express; it could easily be done with middleware.
Switching frameworks seems a little heavy-handed for something that could
have been a 20-minute npm module.

------
_Marak_
I read:

"This turned out to be caused by a periodic (10/hour) function in our code. The
main purpose of this was to refresh our route handlers from an external
source. This was implemented by deleting old handlers and adding new ones to
the array"

 _refresh our route handlers from an external source_

This is not something that should be done in a live process. If you are
updating the state of the node, you should be creating a new node and killing
the old one.

Aside from hitting a somewhat obvious behavior by messing with the state of
express in a running process, once you have introduced the idea of
programmatically putting state into your running node, you have seriously
impeded the ability to create a stateless, fault-tolerant distributed system.

~~~
emeraldd
When I concluded what they had to be doing and then read the actual
confirmation of it, I was somewhat shocked. Why on Earth would you want to
programmatically recreate the routes in an express app?!?!? It would be
really interesting to see a write-up on what they're doing and why they think
this kind of behavior is needed in the first place ....

------
TheLoneWolfling
> benchmarking revealed merely iterating through each of these handler
> instances cost about 1 ms of CPU time

1ms / entry? What is it doing that it's spending 3 million cycles on a single
path check?

~~~
jdmichal
Running (uncompiled?) regular expressions, it seems.

~~~
mikeryan
I was a bit unclear on the parent's post, but I don't think this time was
spent on a route lookup. If I'm reading the thread and post correctly, the
static file handler was getting inserted multiple times. This handler will
generally match on any route, but then does something like "if the file
exists, return the static file; if not, look for the next handler". In this
case, the "if the file exists" part was the "path check" that's taking 1 ms,
and it was happening multiple times.

I could be wrong, but it seems like the design of the route lookup mechanism
(the global array) was actually a bit of a red herring; the real issue was
the ability to attach multiple instances of the same handler to the same
route.

 _Something was adding the same Express.js provided static route handler 10
times an hour. Further benchmarking revealed merely iterating through each of
these handler instances cost about 1 ms of CPU time._

~~~
TheLoneWolfling
A simple "if file exists" check shouldn't take 1ms on average.

OSes cache directory entries for a reason.

I mean, even Python manages 40,000 checks / second:

    
    
      >>> timeit.timeit("os.path.exists(data)", setup="import os; import random; import string; data = os.path.join(r'C:\Windows\System32', ''.join(random.choice(string.ascii_letters) for _ in range(10)))", number=40000)
      0.9998181355403517

------
clebio
> I can’t imagine how we would have solved this problem without being able to
> sample Node.js stacks and visualize them with flame graphs.

This has me scratching my head. The diagrams are pretty, maybe, but I can't
read the process calls from them (the words are truncated because the graphs
are too narrow). And I can't see, visually, which calls are repeated. They're
stacked, not grouped, and the color palette is quite narrow (color brewer
might help here?).

At least, I _can_ imagine how you could characterize this problem without
novel eye-candy. Use histograms. Count repeated calls to each method and sort
descending. Sampling is only necessary if you've got -- really, truly, got --
big data (which Netflix probably does), but I don't think the author means
'sample' in a statistical sense. It sounds more like 'instrumentation',
decorating the function calls to produce additional debugging information.
Either way, once you have that, there are various common ways to isolate
performance bottlenecks. Few of which probably require visual graphs.

There are also various lesser inefficiencies in the flame graphs: is it
useful (non-obvious) that every call is a child of `node`, `node::Start`,
`uv_run`, etc.? Vertical real estate might be put to better use with a
log scale? Etcetera, etc.

~~~
donavanm
> The diagrams are pretty, maybe, but I can't read the process calls from them
> (the words are truncated because the graphs are too narrow).

Flame graphs provide SVGs by default. You should be able to zoom if your
browser supports it. The current version also supports "zooming" in to any
frame in the stack, resetting that frame as the base of the display. Also,
WRT the base frames of 'node' et al., it's because flame graphs are a
general-purpose tool for stack visualization; the base might be 'main' for a
C program, or the scheduler when looking at a whole system.

> They're stacked, not grouped, and the color palette is quite narrow (color
> brewer might help here?).

Colors by default have no meaning and the palette is configurable. The current
lib can also assign colors by instruction count/ipc and width by call count,
if you have access to that.

> Sampling is only necessary if you've got -- really, truly, got -- big data
> (which Netflix probably does), but I don't think the author means 'sample'
> in a statistical sense.

It is sampling. Flame graphs are typically used with something like
perf/dtrace/oprofile, which dumps stacks at a couple hundred to a few
thousand hertz. Actual call tracing is (typically) not feasible for
large/prod stacks.

------
drderidder

      > our misuse of the Express.js API was the 
      > ultimate root cause of our performance issue
    

That's unfortunate. Restify is a nice framework too, but mistakes can be
made with any of them. Strongloop has a post comparing Express, Restify, hapi
and LoopBack for building REST APIs, for anyone interested.
[http://strongloop.com/strongblog/compare-express-restify-
hap...](http://strongloop.com/strongblog/compare-express-restify-hapi-
loopback/)

------
wpietri
From the article:

> What did we learn from this harrowing experience? First, we need to fully
> understand our dependencies before putting them into production.

Is that the lesson to learn? That scares me, because a) it's impossible, and
b) it lengthens the feedback loop, decreasing systemic ability to learn.

The lesson I'd learn from that would be something like "Roll new code out
gradually and heavily monitor changes in the performance envelope."

Basically, I think the approach of trying to reduce mean time between failure
is self-limiting, because failure is how you learn. I think the right way
forward for software is to focus on reducing incident impact and mean time to
recovery.

~~~
akkartik
Without over-training on this one incident, and without guidance on how to get
from here to there (I'm still working on that):

1. Don't get suckered by interfaces, share code. If you create code for
others to share ("libraries"), stop trying to hide its workings.

2. You don't have to learn how everything works before you do anything. But
you should expect to learn about internals proportional to the time you spend
on a subsystem. Current software is too "lumpy" -- it requires days or months
of effort before yielding large rewards. The first hour of investigation
should yield an hour's reward.

3. "Production" is not a real construct. There will always be things that
break so gradually that you won't notice until they've gone through all your
processes. Give up on up-front prevention, focus instead on practicing online
forensics. And that starts with building up experience on your dependencies.

More elaboration:
[http://akkartik.name/post/libraries2](http://akkartik.name/post/libraries2)

My attempt at a solution:
[http://akkartik.name/about](http://akkartik.name/about)

My motto: reward curiosity.

------
ecaron
My biggest takeaway from this article is that Netflix is moving from Express
to Restify, and I look forward to watching the massive uptick this has on
[https://github.com/mcavage/node-
restify/graphs/contributors](https://github.com/mcavage/node-
restify/graphs/contributors)

~~~
sadkingbilly
Yes, but their original bug was from dynamically loading routes from an
external source. I don't see how Express is to blame for this. Moving to
Restify is not a solution, and they state that they have different reasons
for moving (support for bunyan logging? But Express already supports this
too).

~~~
ecnahc515
The bug was related to dynamically loading routes, but the true cause was
that Express allowed duplicate handlers. Loading routes dynamically wasn't
the problem in itself; the problem was that, when doing so, Express let them
duplicate routes.

~~~
sisk
That's a feature.

From the API docs:

> Multiple callbacks may be given; all are treated equally, and behave just
> like middleware. The only exception is that these callbacks may invoke
> next('route') to bypass the remaining route callback(s). This mechanism can
> be used to perform pre-conditions on a route, then pass control to
> subsequent routes if there's no reason to proceed with the current route.

------
forrestthewoods
If I had to pick one line to highlight (not to criticize, but as a wise
lesson worth sharing) it would be this one:

"First, we need to fully understand our dependencies before putting them into
production."

~~~
gdulli
In my experience developers constantly overestimate the gain of using a new
dependency and underestimate the amount of effort it will take to sufficiently
understand it. (Or fail to make the effort, not understanding the risks.)

This is why developers without significant experience should not be making
decisions about the tech stack.

------
Fishrock123
I would like to mention that Netflix could have consulted the express
maintainers (us) but didn't.

Source: myself -
[https://github.com/strongloop/express/pull/2237#issuecomment...](https://github.com/strongloop/express/pull/2237#issuecomment-59681175)

------
augustl
A surprising number of path recognizers are O(n). Paths/routes are a great
fit for radix trees, since there are typically repetitions, like /projects,
/projects/1, and /projects/1/todos. The performance is O(log n).

I built one for Java: [https://github.com/augustl/path-travel-
agent](https://github.com/augustl/path-travel-agent)

~~~
kyllo
How does your radix trie implementation handle variables in the URL paths, in
a nutshell?

~~~
augustl
This is one of the reasons why I call it a "bastardized radix tree" in the
README :)

The routes are stored as "nodes". There's a root node. It has a hash map of
child nodes, by name. It also has a list of "parameterized" nodes. When a
node gets a path segment, it will first look in its hash map. If nothing is
there, it'll try the parameterized nodes in sequence. Typically there's just
one parameterized node.

For the following paths:

    
    
      /projects
      /projects/new
      /projects/special
      /projects/:project-id
    

The root node will have a single item in its hash map, "projects", and no
items in its parameterized list.

The node for "projects" will have two items in its hash map, "new" and
"special". It will have a single item in its parameterized list, for
:project-id.

I updated the README just now with a slightly more detailed explanation :)

~~~
kyllo
Very cool, thanks for taking the time to explain! I recently learned how tries
work so it was cool to see a real-world implementation like yours.

------
degobah
tl;dr:

* Netflix had a bug in their code.

* But Express.js should throw an error when multiple route handlers are given identical paths.

* Also, Express.js should use a different data structure to store route handlers. EDIT: HN commenters disagree.

* node.js CPU Flame Graphs ([http://www.brendangregg.com/blog/2014-09-17/node-flame-graph...](http://www.brendangregg.com/blog/2014-09-17/node-flame-graphs-on-linux.html)) are awesome!

------
bcoates
It's not just the extra lookups -- static in express is deceptively dog-slow.
For every request it processes, it stats every filename that might satisfy the
URL. This results in an enormous amount of useless syscall/IO overhead. This
bit me pretty hard on a high-throughput webservice endpoint with an unnoticed
extra static middleware. I wound up catching it with the excellent NodeTime
service.

Now that I look at it, there's a TOCTOU bug on the fstat/open callback, too:
[https://github.com/tj/send/blob/master/index.js#L570-L605](https://github.com/tj/send/blob/master/index.js#L570-L605)

This should be doing open-then-fstat, not stat-then-open.

------
jaytaylor
I am upset that the title has been changed from "Node.js in Flames". Which is
not only the real title of the article, but also a reasonable description of
what they've been facing with Node.

#moderationfail

~~~
dang
I can see why you would feel that way, but the title, while clever and (I'll
take your word for it) fitting, was arguably misleading and unarguably baity.
The HN guidelines call for changing such titles, so the moderators were just
doing their job. There likely would have been more complaints about the title
if we hadn't changed it.

~~~
jaytaylor
Understood. And thank you for following up, I really do appreciate it.

------
ajsharma
This is the first I've heard of restify, but it seems like a useful framework
for the main focus of most Node developers I know, which is to replace an API
rather than a web application.

~~~
ecaron
You're going to love restify. We use it at TrackIf and can't imagine our API
running on anything else. Couple it with Swagger and you won't be looking back
:-)

------
codelucas
> This turned out to be caused by a periodic (10/hour) function in our code. The
> main purpose of this was to refresh our route handlers from an external
> source. This was implemented by deleting old handlers and adding new ones to
> the array. Unfortunately, it was also inadvertently adding a static route
> handler with the same path each time it ran.

I don't understand the need to refresh route handlers. Could someone explain
why they needed to do this, and also why from an external source?

~~~
mjr578
We refresh periodically as we dynamically deploy new UI code, which can be
accessed at new routes. (/home and /homeV2 for example) This allows us to not
have to restart our servers or push out new server code just to serve a new UI
at a different (or the same) route.

------
exratione
The express router array is pretty easy to abuse, it's true. For example, as
something you probably shouldn't ever do:

[https://www.exratione.com/2013/03/nodejs-abusing-
express-3-t...](https://www.exratione.com/2013/03/nodejs-abusing-express-3-to-
enable-late-addition-of-middleware/)

I guess the Netflix situation is one of those that doesn't occur in most
common usage; certainly dynamically updating the routes in live processes
versus just redeploying the process containers hadn't occurred to me as a way
to go.

------
hardwaresofton
Responses are already firing in:
[https://news.ycombinator.com/item?id=8632220](https://news.ycombinator.com/item?id=8632220)

------
pm90
I love these kinds of investigations into problems in production. I mean, you
really have to admire their determination in getting to the root of the
problem.

In some ways, these engineers are not that different from academic
researchers, in that they are devising experiments, verifying techniques, all
in the pursuit of the question: why?

------
hit8run
I would have written my APIs in golang and not nodejs. Go is way faster in
my experience, and it feels leaner to create something because a web service
can be productively done out of the box. Node apps tend to depend on
thousands of 3rd-party dependencies, which makes the whole thing feel fragile
to me.

------
MichaelGG
Would someone explain what I'm missing about the flame graphs? Why are they
indispensable here? In a normal profiler, you'd just expand the hot path and
see what had the most samples. Apart from making recursion very explicit, what
special aspect do flame graphs expose?

------
BradRuderman
Why are they loading in routes from an external source? Is that normal? I
have never seen that before.

~~~
mjr578
We like the option of dynamically loading new routes, that point to new
endpoints. We also have the ability to release new versions of our UI without
redeploying (or restarting) our servers.

~~~
BradRuderman
Ok you add a new route but how do you reference what code should be executed
when that route is hit?

~~~
mjr578
We have something that loads up, via requires, the action (or route) that
should be run when a URL is encountered.

~~~
BradRuderman
Second dmak, would love to understand more. I can follow dynamically loading
routes, but can't follow how that would be implemented end to end. Some
things I would be interested in:

* Where do they keep the code that gets executed for new routes? Is that deployed dynamically as well?

* If you are changing routes dynamically, how do you test in non-prod? Are you constantly syncing non-prod with prod?

* How do you control what you deploy dynamically vs. what you migrate through the environments?

------
bentcorner
Interesting article. I have a lot of experience dealing with ETLs in WPA on
the Windows side - it's an awesome tool that gives you similar insights. I
haven't used it for looking at javascript stacks before though, so I don't
know if it'll do that.

------
sysk
> We also saw that the process’s heap size stayed fairly constant at around
> 1.2 Gb.

> Something was adding the same Express.js provided static route handler 10
> times an hour.

Why didn't it increase the heap size? Maybe it was too small to be noticeable?

------
pcl
_Second, given a performance problem, observability is of the utmost
importance_

I couldn't agree with this more. Understanding where time is being spent and
where pools etc. are being consumed is critical in these sorts of exercises.

------
dmitrygr
So the lesson is to actually know the code you deploy to prod? Is that not
obvious?

~~~
SixSigma
Your webserver, your DNS resolver, your database, your operating system, the
compilers they used... how about the BIOS, or the northbridge?

------
drinchev
The Express project already has a similar issue open about recursive route matching.

[https://github.com/strongloop/express/issues/2412](https://github.com/strongloop/express/issues/2412)

------
debacle
Doesn't this seem like a bug in the express router? All of the additional
routes in the array are dead (can't be routed to).

~~~
dugmartin
No, because the route list is also used for middleware. The recursive search
is because middleware routes have a third next parameter that allows the
search to run asynchronously.

------
Pharohbot
I wonder how Netflix would perform with using Dart with the DartVM. I reckon
it would be faster than Node based on benchmarks I've seen. Chrome DartVM
support is right around the corner ;)

------
revelation
Crazy talk. In 1 ms, I can perspective-transform a moderately big image;
Node.js can't iterate through a list.

We really need a 60 fps equivalent for web stuff. You have 16 ms, that's it.

~~~
nostrademons
FWIW, a lot of the stuff the Chrome+Polymer team is working on explicitly has
a 60fps goal and 16ms frame budgets. I remember in my last project at Google,
I spent a lot of time working with the Chrome team to get various parts of
websearch rendering under the 16ms budget. (I couldn't manage it given the
amount of legacy code we had to work with, but various other people have
continued the work since I left, so hopefully they've had more luck.)

------
coldcode
I must admit I could enjoy just doing this type of analysis all day long.
Yet I hate non-computing puzzles.

------
qodeninja
Wow. I love that Netflix is using Node, and I'm even more curious that they
would use express.

------
notastartup
This is why you stick to tried-and-true methods, folks. This is such a
typical node.js fanboy mentality: "reinventing the wheel is justified because
asynchronous", or "I want this trendy way to do things just because everyone
else is jumping on the bandwagon".

Give me flask + uwsgi + nginx any day.

~~~
CmonDev
At the end of the day, Node.js is just a Reactor Pattern implementation in
the form of a bunch of scripts.

------
talkingtab
An unfortunate title. Ha ha, "flames"; ha ha, "Node.js". But the article is
really about Express. Not so "ha ha".

~~~
snlacks
They renamed it "in flame graphs."

For the audience, this was originally titled: "Node.JS in Flames"

------
general_failure
A very good reason to go with Express is TJ. He was the initial author of
Express, and he is quite brilliant when it comes to code quality. Of course,
TJ is no longer part of the community, but his legacy lives on :-)

------
gadders
OFFTOPIC: "Today, I want to share some recent learnings from performance
tuning this new application stack."

The word you want is "lessons".

~~~
quarterto
"learnings" is a perfectly cromulent word:
[https://books.google.com/ngrams/graph?content=learnings&year...](https://books.google.com/ngrams/graph?content=learnings&year_start=1700&year_end=2000&corpus=15&smoothing=3&share=&direct_url=t1%3B%2Clearnings%3B%2Cc0)

~~~
gadders
Nope. You see an option for a plural here? [http://www.merriam-
webster.com/dictionary/learning](http://www.merriam-
webster.com/dictionary/learning)

~~~
M2Ys4U
Dictionaries are _descriptive_, not _prescriptive_. They can only tell you
if something _is_ a word, not whether something _isn't_ a word.

~~~
gadders
Well certainly there is no law against using made-up words. If that's what you
want to do, have at it.

At least it will help people get their buzzword bingo cards filled up sooner.

