

Node.js Has 1 Gb Memory Limit - andreyvit
http://code.google.com/p/v8/issues/detail?id=847

======
mjijackson
It would be more correct to say that V8 has a 1G memory limit. But node.js
actually gives you these nice things called Buffers which, according to the
node.js manual, are "similar to an array of integers but corresponds to a raw
memory allocation outside the V8 heap" (see
<http://nodejs.org/api/buffers.html>).

Buffers are used all over the place in node, which should mitigate the degree
to which this is actually a problem. Even if this bug report stays a wontfix,
it won't affect my decision to use node because of Buffer support.
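
A quick, hedged illustration of that point (hypothetical sizes; run it and compare heapUsed to rss):

    // Allocate 512MB outside the V8 heap; only the small Buffer wrapper
    // object itself lives inside the heap.
    const big = Buffer.alloc(512 * 1024 * 1024);
    big.fill(1);                          // write every byte so the pages are committed
    console.log(process.memoryUsage());
    // heapUsed stays small, while rss grows by roughly 512MB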

~~~
andreyvit
Sure, Node is a nice platform; I did not imply otherwise.

For a certain category of apps it helps to know the limits beforehand, though
(I've described my use case in another comment). I think this particular limit
deserves to be more widely known.

------
glenjamin
It's probably worth mentioning that the GC cycles start to take upwards of one
second at around 500MB of RAM usage with lots of small objects.

This limit doesn't stop you using NodeJS, but it's definitely something
newcomers should be made aware of before they start writing database servers
or making heavy use of a naive in-memory cache through an object literal.

Hopefully when the new GC gets rolled out we'll really be able to let loose
with the RAM usage.
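
For newcomers, a hedged sketch of the kind of naive object-literal cache being warned about (hypothetical shape; the point is huge numbers of tiny objects pinned in the V8 heap):

    // Every entry is a small heap object that is never evicted, so old space
    // fills up with millions of tiny objects and GC pauses grow with it.
    const cache = {};

    function remember(key, value) {
      cache[key] = { value: value, storedAt: Date.now() };
    }

    for (let i = 0; i < 1e6; i++) {
      remember('user:' + i, { name: 'user' + i, visits: i % 100 });
    }
    console.log(process.memoryUsage().heapUsed / 1e6, 'MB of V8 heap used');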

~~~
Uchikoma
Funny how all VMs go through this. Remember when JVM GC pauses killed websites,
and how over the years this was fixed by better, non-pausing GC strategies?

~~~
monopede
That's very easy to explain. A low-pause GC requires a concurrent garbage
collector. Writing a correct concurrent garbage collector is VERY tricky, so
you usually start with something much simpler. Concurrent GCs typically also
add overhead to the mutator (the user program), so it tends to be a trade-off
of high throughput vs. short maximum pause times. As an example, Java's G1
collector ("Garbage First") took a team of experts about 5 years to get
correct.

OTOH, writing a standard (single-threaded) Cheney-style copying GC can be done
in a few days. Actually, the more time-consuming part for me was to get all
the pointer information to the GC.
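
To make the comparison concrete, here is a toy sketch of a Cheney-style semispace collector (the heap is simulated as an array of records, references look like { ref: index }, and every name is made up for illustration):

    // Toy semispace copying collector: evacuate the roots, then chase the
    // scan pointer through to-space until it catches the allocation pointer.
    function collect(roots, fromSpace) {
      const toSpace = [];
      const forwarding = new Map();        // from-space index -> to-space index

      function evacuate(index) {
        if (forwarding.has(index)) return forwarding.get(index);
        const newIndex = toSpace.length;
        toSpace.push({ fields: fromSpace[index].fields.slice() });
        forwarding.set(index, newIndex);
        return newIndex;
      }

      const newRoots = roots.map(r => evacuate(r));

      // The Cheney scan: copying reachable children grows to-space, and the
      // loop bound is re-read each iteration, so it runs until scan == free.
      for (let scan = 0; scan < toSpace.length; scan++) {
        toSpace[scan].fields = toSpace[scan].fields.map(f =>
          (f && typeof f === 'object') ? { ref: evacuate(f.ref) } : f);
      }
      return { roots: newRoots, heap: toSpace };
    }

    // Example: object 0 references object 2; object 1 is garbage.
    const heap = [
      { fields: [{ ref: 2 }, 'root-data'] },
      { fields: ['unreachable'] },
      { fields: [42] },
    ];
    console.log(JSON.stringify(collect([0], heap)));
    // only objects 0 and 2 survive, with the reference rewritten to { ref: 1 }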

------
eric-hu
Relevant snippets:

"diego.ca...@gmail.com, Sep 19, 2010

... In my case, I've been forced to suddenly stop my work ... with node.js
because it cannot handle more than ~140K websockets concurrent connections
..."

"erik.corry, Nov 2, 2010

The limit for 64 bit V8 is around 1.9Gbytes now. Start V8 with the --max-old-
space-size=1900 flag."
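
For anyone who wants to try that flag with node rather than bare V8, it is passed straight through on the command line (app.js is just a placeholder for your own script; the value is in megabytes):

    node --max-old-space-size=1900 app.js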

~~~
ww520
I don't understand why he has to stop the project just because ONE node.js
process cannot handle more than 140K concurrent connections. If the goal is to
handle millions of concurrent connections, spreading the connection load out
to multiple servers is a must.

I'm doing something similar with node.js now to handle millions of concurrent
connections. I use multiple node.js servers to handle it. Not only does it help
even out the load, it also has a nice failover property: one node going down
doesn't mean the whole app is down, since clients of the downed node migrate
their connections to the rest of the nodes.
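
A rough sketch of the client-side migration being described, using Node's core net module and a hypothetical list of gateway hosts (not code from the actual system):

    const net = require('net');

    const servers = [                      // hypothetical gateway fleet
      { host: 'gw1.example.com', port: 9000 },
      { host: 'gw2.example.com', port: 9000 },
      { host: 'gw3.example.com', port: 9000 },
    ];
    let current = 0;

    function connect() {
      const { host, port } = servers[current % servers.length];
      const socket = net.connect(port, host, () => {
        console.log('connected to', host);
      });
      socket.on('error', () => {});        // swallow errors; 'close' fires next
      socket.on('close', () => {
        current += 1;                      // migrate to the next node
        setTimeout(connect, 1000);         // small back-off before reconnecting
      });
    }

    connect();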

~~~
nivertech
If you have 140M active users, handling a fleet of 1000 servers is not an easy
operation, considering you need to manage inter-cluster communication (e.g.
when the sender and receiver sockets land on different servers).

I've built an Erlang/OTP clustered websockets server which can handle 3M
connections per node (given you have enough RAM). That way you handle all your
users with "only" 47 servers.

There is a big operational and scaling difference between a cluster of 1000
servers and a cluster of 47 servers.

~~~
jat850
I'm a bit curious about this. If you have 140M concurrent users (since this
was the original complaint) and you are NOT prepared for or capable of
servicing/monitoring/maintaining 1000 servers, that seems like a fatal flaw in
your server management and analytics processes.

Certainly 47 servers vs. 1000 is far nicer. But at the 140M-concurrent-user
level (there must only be a few handfuls of sites with these types of
concerns), not having a team prepared to oversee 1000 servers seems like
folly.

~~~
nivertech
The problem is not only operational, but also a technical one:

Nodes in the cluster need to communicate with each other and with other
systems, like databases, message queues, monitoring servers, etc.

You can aggregate data per node, so the fewer servers you have in the
front-end cluster, the less load on the back-end servers.

There is also a financial problem: many organizations can afford 50 servers,
but not many can afford 1000.

~~~
axiak
There's no reason why you can't have many node processes per server. This also
raises the fault tolerance per server.
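
For completeness, a minimal sketch of that using Node's core cluster module (one worker per CPU, respawned on crash; each worker gets its own V8 heap, so the per-process limit applies per worker rather than per machine):

    const cluster = require('cluster');
    const http = require('http');
    const os = require('os');

    if (cluster.isMaster) {
      for (let i = 0; i < os.cpus().length; i++) cluster.fork();
      cluster.on('exit', () => cluster.fork());   // respawn dead workers
    } else {
      // Workers share the listening port; each has its own heap and GC.
      http.createServer((req, res) => res.end('ok\n')).listen(8000);
    }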

~~~
nivertech
Facebook chat has only 38 gateway servers. I guess that's only possible
because not every active Facebook user uses Chat.

~~~
flexd
Perhaps they should consider adding more; it's permanently screwed up in some
way, with messages not being sent or the other person not receiving them.

------
moomin
There's a general point to be made here: this is the sort of thing that
happens when you have "closed development open source". Admittedly, there seem
to be very few people in the world capable of delivering a world-class GC.

------
andreyvit
To add my own perspective: I was running some analytics batch job on Node and
hit this limit, had to add multi-stage processing to accommodate it. Using
--max-old-space-size=1900 did help a bit.

In case you're wondering, here's a test case:
<https://gist.github.com/1148761>

I was told by one of the V8 developers that the new GC is pretty usable now,
so worth giving it a shot if you need more than 2 Gb.

------
eob
What rude and demanding bug reports.

~~~
jrockway
Welcome to Web 2.0, where people that think they're programmers try to
interact with people that _are_ programmers. The results are often ...
depressing.

~~~
irahul
A minor re-formatting.

> Welcome to Web 2.0, where people _who_ think they're programmers try to
> interact with people _who_ are programmers. The results are often ...
> depressing.

I was confused by your initial framing of the sentence, though my not being a
native speaker may have played a part.

~~~
tuukkah
Hope this helps: "[T]his distinction applies only to _which_ and _who_. The
alternative _that_ is found with both human and non-human antecedents."
[http://en.wikipedia.org/wiki/English_relative_clauses#Human_...](http://en.wikipedia.org/wiki/English_relative_clauses#Human_or_non-human)

~~~
irahul
That's news to me. Language follows speakers, especially influential speakers,
and the list of speakers using _that_ for human antecedents includes
Shakespeare and Mark Twain, so I guess it's all right now, even if it was an
error at some point (grammar isn't static; it has been changing ever since it
came into being).

Now I am wondering why I thought _that_ isn't to be used for human
antecedents. Was it an error earlier, or did it change recently? If it was an
error earlier, Shakespeare using it doesn't fit.

~~~
ellyagg
You probably thought that because it's uncommon in good writing and you're
taught not to do it. It's probably against the style guidelines of major
newspapers, for example, and it sounds inelegant to educated native writers.

------
vr
I guess the point of posting it here is to gather folks with pitchforks and
torches demanding immediate attention to the issue. For a few months the V8
team has been working hard on improving the GC
([http://code.google.com/p/v8/source/browse/branches/experimen...](http://code.google.com/p/v8/source/browse/branches/experimental/gc)),
which was the major limiting factor here. I don't have an ETA for when it's
going to be merged, but the new GC branch is in pretty good shape.

------
0x12
This all boils down to using the right tool for the right task and knowing how
to organize your code so you don't run into limits in the platform components
that you use.

In this case, the obvious solutions are to either use 'buffers' (think of them
as extended memory from the old days) or to use multiple instances on the same
machine or spread over several machines.

If you write your code in such a way that you end up handling all your users
in a single process, then you will sooner or later run into some limitation.

------
chapel
This has been known for a long time, and reading the OP's comments, it looks
like it will soon no longer be a problem. It hasn't been a serious issue as
long as you know about it going in.

------
russellallen
They're running up against a V8 limit.

~~~
andreyvit
Yep, but the reason we care about it is exactly because it affects Node.

~~~
russellallen
The reason you care about it is because it affects Node. I'm more interested
in V8 than Node to be honest.

~~~
joezydeco
No doubt. This comment really struck me as off-tone:

 _"Sorry, Google, but your open-source project is important to much more of
the world than just >your browser< now; and while this may not be an issue for
the zones of impact >you< care about (said browser.), it’s a >huge< issue for
much of the area where V8 is important in general in the modern (post-Node)
world."_

You know Google is biting their tongue at what they really want to say to
complainers like this. You really want it working for Node? Roll up your
sleeves and get to refactoring. Hell, they even explained how to fix it.

~~~
jrockway
The moral of the story is to never be nice to anyone. Nobody complains about
the memory limits of IE's javascript engine as they pertain to server-side
applications.

------
wavephorm
If there's anything I've learned in my software career, it's that problems can
be fixed. It sounds like this problem can be, or already has been, fixed.
NodeJS is a wonderful platform to work with, nonetheless.

------
maxogden
nice FUD headline

~~~
andreyvit
Why? It's been a real bummer for me to find that out (the hard way).

~~~
irahul
_Why? It's been a real bummer for me to find that out (the hard way)._

Is knowing about the limit hurting you, or have you actually faced this
constraint in a project? As mentioned elsewhere, node has buffers which are
allocated outside of the V8 heap, and the user data will largely be unaffected
by the memory limit if you are using buffers.

~~~
andreyvit
I've tried to explain it in another comment:

> To add my own perspective: I was running some analytics batch job on Node
> and hit this limit, had to add multi-stage processing to accommodate it.
> Using --max-old-space-size=1900 did help a bit.

So yeah, it did affect me, although not on a typical web project. It wouldn't
be a problem at all if I knew about this limit from the start. Thus this is a
warning for others.

(BTW another limit I was told about is that objects can't have more than a
million keys. Thankfully I did not hit that one.)

~~~
peterhunt
Well, if your data set is big enough to hit that limit then you'll probably
need to horizontally scale it out anyway...

Or rewrite in C

------
jrockway
"C Has 16 Exabyte Memory Limit"

