

Response to Netflix's “Node.js in Flames” Blog Post - joshuacc
https://gist.github.com/hueniverse/a3109f716bf25718ba0e

======
hardwaresofton
I want to point out that in the thread here on HN
([https://news.ycombinator.com/item?id=8631022](https://news.ycombinator.com/item?id=8631022))
about that Netflix article, the top comment debunks the myth that the list-of-
handlers was the issue slowing them down.

While creating a grammar from the combination of the regular expressions
(which is why you can't just have a hash) is definitely a solution (quite
possibly THE solution), that's definitely not an MVP-type of solution, and is
pretty non-trivial.

The actual problem was them constantly adding handlers by accident.

One of the reasons I love and use HN so much (I give almost no other site so
much pull on my personal opinion) is that smart (or very careful and well-read
dumb people) leave well-thought-out comments like the one I mentioned.

~~~
kevingadd
This is all true.

However, I can't take a stack seriously as an enterprise or production ready
piece of software when it does something like iterate a list _recursively_ in
performance sensitive code. It's just bizarre. JS doesn't have tail call
optimization and for loops are _just fine_. Being able to blow out the JS
stack by having 'too many handlers' is terrifying to me, especially because JS
runtimes have catastrophically poor support for handling and debugging stack
overflows.

The use of a single array for handlers that can contain multiple items also
feels like an architectural red flag to me, but I can understand how that
would happen since JS's standard library is so anemic and lacks useful data
structures. What you really want is a set (or ordered set, if handler order is
super important - it probably is here)

I saw a number of entries in those flame charts that appeared to be V8
compilation internals, which is suspicious to me. But those are probably just
debugger/symbol resolution issues, I hope, and not actual JIT recompilation
constantly happening during route dispatch? (If v8 were compiling code
constantly during dispatch, that would certainly explain multiple-ms
pauses...)

~~~
_greim_
> JS doesn't have tail call optimization and for loops are just fine.

Not personally familiar with Express's internals, but if this is async
iteration nothing's pushing onto the call stack in the first place. Unless
you're using generators, but in that case you can just use a for loop.

~~~
kevingadd
The deeply nested stack in the flame chart suggests that it is recursing, and
the netflix article said it's recursing...

------
cleverjake
Worth noting that the person defending express is Eran Hammer - the of one of
it's biggest competitors, hapi (hapijs.com)

------
ericHosick
> The criticism about allowing routes to repeat in the express array shows
> that even after doing all this work, the Netflix team still doesn’t
> understand how express works.

Wouldn't a better solution, in this context, be to aggregate values under the
same key?

------
coldcode
It does go to show you that you need to understand the limitations of a hammer
before starting to whack things with it. This is true of any framework or
library you use. Even more necessary if you are processing billions of things
with it.

~~~
freshflowers
It's not so black and white. In many cases it's faster and cheaper (and more
educational) to just start whacking and find out what the limitations are.

Of course when you're operating on the scale of Netflix then, yeah, sure.

------
jongleberry
this is why the express team actively rejects any features or support for
dynamically changing middleware and routes during runtime.

~~~
_RPM
_this_ is ambiguous. Can you explain what the context of "this" is?

~~~
fraserharris
From the original Netflix post:

"This turned out be caused by a periodic (10/hour) function in our code. The
main purpose of this was to refresh our route handlers from an external
source. This was implemented by deleting old handlers and adding new ones to
the array. Unfortunately, it was also inadvertently adding a static route
handler with the same path each time it ran."

------
cjensen
Is there a reason why they don't have a map for static strings, and use an
array for regexs?

~~~
trevordixon
Probably makes it hard to do things in the right order. You'd have a separate
array just to keep track of order that you'd be iterating every time.

And express allows for more than one handler even for a static route.

    
    
        app.get('/', function(req, res, next) { next(); });
        app.get('/', function(req, res) { res.send(); });

~~~
danudey
This could be solved by having (internally) a map of { path: [handler 1,
handler 2]}. That makes static path matching constant time, and static paths
with multiple handlers iterate over the requests appropriately.

This can be a little confusing, especially given that you'd be specifying
static and non-static paths separately, but it could be argued that that is
less confusing than some methods. For example, nginx's method allows you to
define paths in any order, but checks routes matching regular expressions
first _in config order_ , then picks a static route by _most specific match_ ;
IOW the order of your routes in nginx doesn't matter until you have regular
expressions, in which case it is surprisingly critical some of the time and
irrelevant the rest of the time.

Breaking it up into 'regular expression routes (checked first)' and 'static
routes (checked last)' could simplify things conceptually, but doubtlessly
raises more usability issues as well.

