Hacker News new | comments | show | ask | jobs | submit login
A surprising JavaScript memory leak found at Meteor (davidglasser.net)
179 points by glasser 1429 days ago | hide | past | web | 104 comments | favorite



Honestly, I'm not really sure I'd call this a "memory leak" -- I just always assumed that all the variables present in a closure are maintained, if there's an existing reference to any function defined within. I'm not surprised at all, this is what I would expect.

I'm actually suprised/impressed that Chrome has an optimization to detect which closure variables are actually used in a function and garbage-collect the rest.

But it's certainly something important to be aware of. I think the main point is, why on earth would you be constantly re-defining a function "logIt" within a function that is repeatedly called? That's just bad programming -- whenever you define functions within functions in JavaScript, you really need to know exactly what you're doing.


> why on earth would you be constantly re-defining a function "logIt" within a function that is repeatedly called? That's just bad programming -- whenever you define functions within functions in JavaScript, you really need to know exactly what you're doing.

It's just a reduction to demonstrate the problem. I'm sure the code that actually triggered it was much more complex and had valid reasons for doing whatever it was doing.


I'm sure it was more complex, of course.

But what I mean is -- whenever you define a function within another function, that should never be a "default" way of writing JavaScript, because it creates closures, and you're just asking for "memory leaks".

Because closures use memory, you need to make sure there's a valid reason for creating the closure, and above all that if you're creating multiple closures by calling the parent function repeatedly, that there's a really really good reason for it, and that the closures are able to be garbage-collected later on.


Hmmm, idiomatic Node.js code creates closures everywhere. (And you may be right that this is just asking for memory leaks.)


Let me just clarify -- creating large closures to wrap libraries in -- that kind of thing is good. The (function(){...})(); pattern is excellent for not polluting the global variable namespace. And single library objects are great too -- they're generally only created once, so there's no memory leaks.

And closures for things like callbacks for AJAX etc. -- these are generally fine, because once the AJAX returns and the callback function finishes executing, then once the AJAX object gets garbage-collected, the callback will too, and its closure.

But if you're leaving "infinite-lived" references around to functions within closures, repeatedly, whether it's via an infinitely long setInterval, or whether you're just storing function references in some array you build up and never remove from -- that's just bad programming. Of course it's a "memory leak", but you, the programmer, intentionally asked for it. It's not the language's fault -- that's how the language is supposed to work.


> And closures for things like callbacks for AJAX etc. -- these are generally fine, because once the AJAX returns and the callback function finishes executing, then once the AJAX object gets garbage-collected, the callback will too, and its closure.

I'm not sure that this is as fine as you think it is. Those callbacks may invoke other callbacks, which may access any variable that's in its lexical scope, and as the Meteor bug shows, you may be inadvertently holding on to not the callback itself, but the environment it lives in.

I think this may be exacerbated in promise-oriented Node.js code since it is so easy to take a promise and stash it in a list or object (which we do in my current project[0], and indeed this code has been the source of memory leaks in the past and may still be leaking).

[0] https://github.com/rstudio/shiny-server/blob/master/lib/sche...


Yeah, you're right. They're not getting that closures can outlive the function that created it. They literally wrote an infinite loop here.


I, for one, am not enamored with JavaScript's cuteness. I'm all prototype all the way -- boring stuff that the closure compiler can understand.


> I just always assumed that all the variables present in a closure are maintained, if there's an existing reference to any function defined within.

In the logical sense each function has its own closure. There is no way functions should affect which[edit for clarity] variables are closed on by each other. So you shouldn't expect this.

>I'm actually suprised/impressed that Chrome has an optimization to detect which closure variables are actually used in a function and garbage-collect the rest.

In the logical sense functions close over their variables, not all variables that happen to be in scope. So that optimization is needed.


If I'm understanding what you're saying, that's just not correct. Try and run this code:

    var a = function() {
      var z = "a";
      window.b = function() { z = "b"; }
      window.c = function() { return z; }
    };
    a();
    window.c(); // returns "a"
    window.b();
    window.c(); // returns "b"
As you can clearly see, the single closure containing the variable "z" is shared between functions b() and c().

So each function does not have its own closure -- closures work "upwards", referencing everything in every scope above them.


You have two closures that share a variable. Think about nesting it deeper.

  var a = function() {
    var z = "foo";
    window.getz = function() { return z; };
    
    var b = function() {
      var y = "bar";
      window.n = function() { z = y; };
    };
    
    var c = function() {
      var y = "baz";
      window.m = function() { z = y; };
    };
    b();
    c();
  };
  a();
Now function window.n and window.m have completely different closures that happen to share the upvalue 'z', but do not share the upvalue 'y'.


I don't see how that's refuting my point -- of course they don't share "y" because they're two completely different y's in two completely different local scopes.

I had understood your comment to mean that closures don't share variables, and I was just explaining that they certainly can. Maybe I'd misinterpreted what you were trying to say...


I wasn't trying to say that at all.

I was saying that two functions side by side have distinct closures, no matter how many variables they share.

The reason I said that was to point out that variables closed over by one function shouldn't affect which variables are closed over by nearby functions.

Actually I think I see how to make the wording clearer in the original. Brb.


My $0.02 regarding terminology that may help this discussion: what you call a scope is, in Lisp terms, a (lexical) environment: something that, given a name, can give you a value or a "I don't know an object with that name" error, and that you can give a (name,value) pair (a binding) to create (languages vary a bit in the way they handle the case where the name already is known)

In many languages, environments are nested. For example, in the example:

  var a = function() {
    var y = "bar";
    var z = "foo";
    window.getz = function() { return z; };
    
    var b = function() {
      window.bb = function() { z = y; };
    };
    
    var c = function() {
      window.cc = function() { z = y; };
    };

    var d = function() {
      var q = 42;
      window.dd = function() { z = q; };
    };

    b();
    c();
    d();
  };
  a()
Calling b, c, and d creates three environments. The environment created by calling d adds a binding of q to 42 to its environment; neither the one created by calling b nor the one created by calling c add bindings of its own, but conceptually, these two have their own environment. All three build on the environment of function a; that environment contains bindings for y and z

Normally, when leaving a block, the environment created when entering it can be discarded. Here, however, the environments still can be reached after execution of the block has ended; each of those environments becomes (part of) its own closure when the function returns. The closures created by calling c and by calling d are functionally identical because they both build on the same 'parent' environment and have no bindings of their own, but conceptually, each has its own closure.

More, better info at http://c2.com/cgi/wiki?LexicalClosure and, of course, in SICP, section 3.2 "The Environment Model of Evaluation" (although I remember seeing clearer pictures for that that involve making closures. Maybe they are in the videos?)


> In the logical sense each function has its own closure. There is no way functions should affect the variables closed on by each other. So you shouldn't expect this.

What? The whole point of a closure is that it is shared.


The point of closures is to share variables, but each closure can participate in different shares.


Part of the reason people are misunderstanding you is that terminology got abused a little bit upthread.

"I just always assumed that all the variables present in a closure are maintained" should really be (something like) "I just always assumed that all the variables present in a scope that produced a closure are maintained". You're right that a closure only needs to contain the variables that are closed over, and that some of those variables are scoped such that they can appear in multiple closures.


> You're right that a closure only needs to contain the variables that are closed over.

You would think so, but this assumption fails in the presence of eval.


Sure, but that's the only exception, right? If there's no eval call in the function that requires the closure it's easily determined what variables don't need to be included, right?


In JS, functions can refer to variables defined in outer scopes, and can refer to variables defined in outer functions even after those functions have returned. JS functions are first-class objects.

They also store any variables they may refer to that are defined in their enclosing scopes, including the parameters and variables of outer functions.


It's a common enough optimization for JS engines in some form.

In particular, if the closure only reads the variable and nothing can write it, it's way cheaper in terms of performance to just create a separate lexical environment for the closure that contains the things it reads so that you don't have to keep walking up the scope chain on bareword lookups in the closure.

The fact that things then end up not referenced and can be gced is just a side-effect of the primary goal of the optimization: faster access to variables in a closure.


Works as per spec, won't fix. Closing.

The definition of "memory leak" is smeared all across the road.


TL;DR -- all closures created within a function are retained even if only one closure is ever referenced. This is really interesting, and potentially devastating for certain coding styles.

I ran the code on node and the RSS does appear to be increasing without bound. Even running node with --expose-gc and forcing a global.gc() inside logIt() causes the same unbounded memory growth.

Increasing the size of the array by a factor of 10 causes RSS usage to jump up by a factor of 10 every second, so we know that the memory usage isn't caused by creating a new long-lived closure (i.e., the logIt() function) every second.

In fact, removing the call to doSomethingWithStr() doesn't change the unbounded memory growth.

Here's a shorter snippet that demonstrates the leak more dramatically (about 10 MB/sec RSS growth):

  var run = function () {
    var str = new Array(10000000).join('*');
    var fn  = function() {
      str;
    }
    var fn = function() { }
    setInterval(fn, 100);
  };
  setInterval(run, 1000);
Tried it out on node v0.8.18


To be clear: when you say "removing the call to doSomethingWithStr() doesn't change the unbounded memory growth" you do literally mean "removing the call" and not "removing the function definition", right? That matches what I see.


Correct. I'm actually a little surprised that an identity statement like "str;" isn't optimized out.

Edit: removing the function definition eliminates the problem, so I'd say that your diagnosis of the problem is spot on.


>> TL;DR -- all closures created within a function are retained even if only one closure is ever referenced. This is really interesting, and potentially devastating for certain coding styles. <<

This is not devastating, it is how closures and scope chain work in JavaScript. See my detailed explanation below. The crux of the matter is that the closure still references the outer function's scope chain, even after the outer function has returned. And the outer functions's scope chain is referenced by the closure until the closure is destroyed or returns.


I'm a very lazy (oops) language implementer. Every time but one when I've implemented closures, I took the quick approach of just copying the entire frame to the heap.

The other time, I was able to do more data flow analysis, but it also resulted in a bunch of annoying fiddly bugs and took more maintenance.

I'm not suggesting the v8 team took the 'easy way' out, but doing the deep introspection is hard, and in an environment such as a browser, I can see trading a pathological case such as this for what must be 1,000,000,000 inappropriate aggressive gc bugs.

(See the recent discussion over RubyMotion for examples of the opposite problem: http://news.ycombinator.com/item?id=5949072)


Note that the TL;DR may well be V8-specific; other engines optimize closures differently and may not have the same action-at-a-distance interaction between closures.

But the real killer is that you can't tell which closures will keep what alive unless you assume that every closure keeps everything it closes over (even if it doesn't use it) alive. Nothing in the ES spec requires them not to...


I'm very confused by this whole thread.

Is it that, only one closure is created for variables defined in the function run, and variables needed in any inner-function are added to run's closed context? That seems like a bug fixable without impacting the language.


What is "RSS"?


After a nontrivial degree of Googling,

http://en.wikipedia.org/wiki/Resident_set_size


It was an honest question. Thanks for the downvote moron, very helpful.


OP here. Several people have pointed out that of course I should expect a leak, since each `logIt` object is leaked. That's absolutely true; my point was just that we don't want `str` to leak.

But the original bug that led to this discovery involved a data structure that shouldn't have leaked at all. I've updated the post to show it; duplicated here since GitHub Pages seems to cache posts pretty aggressively.

    var theThing = null;
    var replaceThing = function () {
      var originalThing = theThing;
      // Define a closure that references originalThing but doesn't ever actually
      // get called. But because this closure exists, originalThing will be in the
      // lexical environment for all closures defined in replaceThing, instead of
      // being optimized out of it. If you remove this function, there is no leak.
      var unused = function () {
        if (originalThing)
          console.log("hi");
      };
      theThing = {
        longStr: new Array(1000000).join('*'),
        // While originalThing is theoretically accessible by this function, it
        // obviously doesn't use it. But because originalThing is part of the
        // lexical environment, someMethod will hold a reference to originalThing,
        // and so even though we are replacing theThing with something that has no
        // effective way to reference the old value of theThing, the old value
        // will never get cleaned up!
        someMethod: function () {}
      };
      // If you add `originalThing = null` here, there is no leak.
    };
    setInterval(replaceThing, 1000);


An excellent catch. The parse tree for the closure should be able to ascertain the reachability of the variables in scope so that fences around particular sets of variables can be established. To do that would require something a bit more sophisticated than a single taint id though. You would really want a taint 'flavor' such that for any closure and in variable in scope of that closure you could defined f(closure, var) which would return turn if that variable cannot be collected. I can't think off the top of my head how you would recycle identifiers without risking temporal tainting (where a new closure in scope with the same id came at a later time and double tainted the variable leaving it essentially uncollectable or collected early. Hmm, or maybe not since their address on the heap will be different you're probably ok with that.

FWIW that is certainly the twisty bits of garbage collected languages.


> The parse tree for the closure should be able to ascertain the reachability of the variables in scope so that fences around particular sets of variables can be established.

Sadly it's not quite that simple because of the behavior of `eval` (strictly speaking, direct `eval`): it can leak bindings into the surrounding scope.


The eval case is already being handled, where the entire local scope automatically captured if there is an eval.

It seems like ChuckMcM is describing is reference counting which variables are needed in the closure. His suggestion would fix this memory issue, but probably have a big performance penalty.


Right, but according to sections 10.4.2 and 15.1.2.1.1 of the ECMAScript standard, you only get lexical bindings inside your `eval` if it's literally a call to a thing called `eval`. (Which I believe is what you meant by "direct `eval`".)

So you should be able to statically determine if this is the case, and it's not here.


Nice! You've explained why this 'memory leak' could be by design - blame eval.

In fact, Chrome GC-ing unreferenced variables could break eval if they didn't cover that case.


Only direct `eval` though. As long as you don't see any calls to symbols named `eval`, you can implement this optimization.

See http://stackoverflow.com/questions/8694300/how-eval-in-javas...


Yeah, you could test this in the first example by replacing the string 'interval' with something like "alert(str.length)", and then redefining console.log=eval.

e: nm, I missed pcwalton's point -- when you call eval indirectly it isn't accessing the same scope. So redefining console.log couldn't access local variables like str. (I don't think?)


Correct. Not only is eval not a good idea in general, but it's a very weird beast in javascript. See http://perfectionkills.com/global-eval-what-are-the-options/ for a very complete exposition of its weirdness.


ewww, yuck! Absolutely right about that. So you need to plug the tainting into the code generation perhaps? Suggests that both a reachability test (can the code ever be reached) combined with a taint test (code that can be reached and has this variable in scope). Then there is code that could exist if instantiated but doesn't yet.


First, it is not a bug in JavaScript at all.

Here is a technical explanation of why there is a memory leak and how to fix the problem.

The scope chain of Closures (in JavaScript) contains the outer function(s) activation object. The activation object of a function contains all the variables that the function has access to, and it is part of a function’s scope chain.

This means the inner function (the closure) has access (a reference) to the outer function’s scope chain, including the global object. And even after the outer function has returned, the closure still has access to the outer function’s variables.

Therefore, the activation object of the outer function cannot be destroyed (for garbage collection) after it has returned, because the closure still references its variables.

When the outer function returns, its own scope chain for execution (its execution object) is destroyed, but its activation object is still referenced by the closure, so its variables will not be destroyed until the closure is destroyed.

The execution context of a function is associated with the functions’ activation object, but while the execution object is used for its own execution and is destroyed when it returns, its activation object is referenced by closures—its inner functions. 


Now, as to the specific example in question: The reason the str variable is never destroyed is because it is referenced buy the logIt function because the logIt function's execution object references the entire scope chain of the run function, and the logIt function is never destroyed, so the str variable remains in memory.

As the original author (OP) suggested, be sure to dereference any local variable in the outer function that the closure is using, once the closure is done with it or once the outer function is done with it.

Also, simply setting the logIt function to null (when it completes execution—returns) will allow the str variable and the entire scope of chain of both the logIt and the containing run function to be destroyed and ready for garbage collection.


For a detailed explanation of closures in JavaScript, see "Understand JavaScript Closures With Ease": http://javascriptissexy.com/understand-javascript-closures-w...


You seem to be describing behavior that the article did not. Specifically, the retention of 'str' because of the 'logIt' function.

If you read more closely, you'll notice that logIt was not the cause of 'str's retention, but instead, the doSomethingWithStr function was.

When logIt is present by itself, str is not retained. Only when doSomethingWithStr is added alongside logIt is str retained.


nknighthb, I am just wondering. Did you downvote me because you think one part of my explanation was not specific enough?

First, you are correct that I specifically mentioned the logIf function when I discuss the specific example.But that does not take away from my thorough explanation of the main reason for the problem. In fact, everything I said about the logIt function applies to the doSomethingWithStr function, since they are both closures, so my explanation stands as is.

If you read my explanation again, you will see that I clearly explained that closures still have access to the outer function's scope, so both the logIt and the doSomethingWithStr functions have access to the outer function's scope chain even after the outer function or any of the other closures returns.

It is not until both all closures are destroyed or returns that the outer function's scope activation object is destroyed.


If your explanation were all there was to it, then 'str' would not be destroyed when logIt alone were present. But it is.

logIt alone -> str is garbage collected

logIt + doSomethingWithStr -> str is not garbage collected

Your explanation could explain the latter behavior, but it does not explain the former.

Edit: To put it another way, you are addressing only one of the scenarios presented in the article, and you are assuming that scenario results in a behavior the article specifically says does not occur in empirical testing. This is a key point in the article, and your "explanations" are ignoring it.


Are you saying my explanation does not explain the following? "logIt alone -> str is garbage collected"

I don't see why you don't understand why I am saying. My first post explains in detail how closures work and why the issue is not a JS bug, though it is indeed a memory leak. All I can say is that I have explained the overall inner workings of closures in JS, and my explanation covers all the scenarios outlined in the blog post.

This is the crux of the matter of the entire blog post:

"...str is only referenced in the main body of run, and in doSomethingWithStr. doSomethingWithStr itself gets cleaned up once run ends… the only thing from run that escapes is the second closure, logIt. And logIt doesn’t refer to str at all! So even though there’s no way for any code to ever refer to str again, it never gets garbage collected!"

And I explained why that problem is observed and why closures work the way they do.

PS. I will humbly agree to disagree: I think that I have explained all the scenarios outlined in the blog post and you disagree with me. This is not a problem; there is nothing wrong with disagreeing.


Try answering this question specifically: Why is str garbage collected when logIt still exists?


Good question: I take it you are referring to the code below. The reason str is garbage collected is because of the modern implementation of the JS compiler in Chrome. Such modern JS compilers look specifically for such unused variables in closures, even when the variable is in the execution context of the closure.

To clarify, it is Chrome's decision to garbage collect the str variable in that context that resulted in str being garbage collected. It is not a result of a JavaScript bug.

Moreover, different browsers have different implementations for how they garbage collect variables in closures' scope chain. Heck, browsers do their own thing when it comes to many aspects of JavaScript.

On second thought (from our previous back and forth), you are correct that I did not explicitly discuss this specific issue in my earlier post. So you win :) and I have up-voted your two comments accordingly.

Here is the code I think you are referring to: var run = function () { var str = new Array(1000000).join('*'); var logIt = function () { console.log('interval'); }; setInterval(logIt, 100); }; setInterval(run, 1000);


Yeah, that's exactly what I mean.

The confusing part of your posts has been that you seem to have been portraying the retention of 'str' as a necessary behavior. Nobody disagrees that it's a permitted implementation, only that it's the one correct or best implementation, and it's made worse by the inconsistency between the two cases, which itself invites confusion.


The C# compiler does the same thing:

http://blogs.msdn.com/b/ericlippert/archive/2007/06/06/fyi-c...

This is an old article, but I believe it still applies to the latest iteration of the language.


Does indeed still apply, and is actually formalized into the language spec itself [1] in section 7.15.5.1 Captured outer variables. So it's actually a feature (really) in the case of C#, not a bug.

[1] http://www.microsoft.com/en-us/download/details.aspx?id=7029

"the local variable x is captured by the anonymous function, and the lifetime of x is extended at least until the delegate returned from F becomes eligible for garbage collection (which doesn’t happen until the very end of the program)."


I don't see how that part of standard mandates that all anomymous functions share the same closure. The example only has a single lambda.


I don't know how to look at memory usage in other modern JS interpreters like those used in FireFox and Safari. Anyone able to check to see if they have this problem? (And if they manage to avoid the memory leaks in the first two code samples?)


In Firefox you could always do it manually with about:memory -- 1MB a second should be easy enough to notice. Not sure they've added a way to get a pretty graph yet, though. Probably Firebug has a way?

I wonder if using the "use strict" directive would let the browser optimize this more easily? Probably not, but Javascript does have a lot of crazy corner cases, and one of them might be preventing (or just making it harder to prove safe) the optimization.

e: Here's an example of one of those corner cases:

    console.log = eval
Now whether those functions reference a particular variable depends on the string they're printing! And this swap out of log() could be done at any time.

e2: Ah, as glasser and pcwalton point out, an indirect reference to eval doesn't work the same way as a direct call. TIL!


No, actually, you literally have to call eval as a function called eval if you want to get the local lexicals. See sections 10.4.2 and 15.1.2.1.1 of the ECMAScript standard, or try it out:

   > (function () { var x = 5; eval("console.log(x)"); })()
   5
   > (function () { var x = 5; var e = eval; e("console.log(x)"); })()
   ReferenceError: x is not defined


I haven't checked the first two samples but:

* Safari 5.1 exhibits no significant heap growth, as far as the timeline shows anyway

* Using top to observe private memory, Firefox does seem experience significant heap growth on the provided program


It's more of a surprising non-leak caused by clever GC that is insufficiently clever to handle pathological code. Interesting explanation, but yet another reason to be very careful of setInterval.


Not only setInterval though. This sort of leak can also show up if the closure is a DOM event handler or is otherwise exposed.


It can show up anywhere — and in fact event handlers are a major source of leaks in general — but the point is that the leak doesn't occur as frequently as you might expect not that it occurs.


That this kind of issue exists in javascript should come as a surprise to no one - there are probably a huge number of these kind of issues just waiting to be found. It's actually the key difference between a 'designed' language and an 'evolved' language.

EDIT: I find the down-votes very amusing. Has javascript now developed a cult? Read up on the history of javascript if my message above comes as a shock to you.


I think the down-votes are because you're attacking JavaScript without basis, and not constructively. This "memory leak" has nothing to do with the distinction between languages you make. And indeed, this kind of behavior in closures is by design (as I understand it), and not by evolution at all.


Fair point, but I disagree with you - fulled specced out designed languages generally specify how exactly the implementation should handle these cases. How references should be retained, when references fall out of scope, etc.

eg:

http://clang.llvm.org/docs/Block-ABI-Apple.html

http://download.oracle.com/otndocs/jcp/lambda-0_5_1-edr2-spe...

http://www.microsoft.com/en-us/download/details.aspx?id=7029

etc etc. I also don't believe I'm attacking javascript here, merely repeating what people have always known about evolved languages. Evolved languages have a lot of pros (mainly speed of new features and practical application), and they have cons too. What I'm saying is very simple: this type of bug in javascript should not be a surprise - it should be expected. There will be more.

EDIT: Laughably enough, while the ObjC and Java specifications both specifically address this issue, the C# specification actually formalizes this exact bug into the language specification. So I concede your point, C# - a designed language - simply designed the exact same bug into the language itself. See section 7.15.5.1 Captured outer variables.

Feel free to downvote this post - I now concede that designed languages are just as likely to design stupid bugs into the language as evolved languages are to create them by mistake. ;)


This is going to come across as incredibly condescending but I honestly don't mean it as such.

When you have some more experience and a greater breath of knowledge programming you will see this kind of thing happens all the time. Whether by design, by accident, by omission, by not quite being explicit, by implicitness, it happens in every spec, in every language and in every protocol.

You can find something like this in them all.


It's a bug in a complex element (GC). They happen. I wouldn't attribute it to a specific language, since even this post only mentions the latest Chrome.

There's a small reproducer example, so there will be a new version issued shortly. Not really something to get excited about in my opinion.

(Yes, I know this behaviour could be actually called ok and allowed, but you can do analysis on that code that will show the variable is not used. Even if not now, it can be fixed at some point in the future.)

((Downvoted as language bashing, rather than because I like JS - I hate it with passion for other reasons. "there are probably a huge number of these kind of issues just waiting to be found" is just a silly claim unless you're actually working on finding them or a true claim for any kind of general purpose software.))


I believe you are correct, my misconception regarding the disparities between designed and evolved languages that was nicely cleaned up for me by reading the specification for the designed C# language. Down-vote for my stupid comment appreciated, it got me to actually read the thing.


I hate JavaScript myself, but you're blaming it for what is actually an implementation problem, not a property of the language itself. It's also a problem I could see occurring in the implementation of any other language that supports closures.


This is a V8 problem, not a JavaScript problem. Any implementation of any garbage-collected language with higher-order functions could have this issue.


Actually, in this case it's not the design of JavaScript that's the problem. It's V8's implementation of closures. The two functions are sharing the same environment object, but there's nothing in JavaScript the language that dictates this implementation.


I wonder if this will temper the rise of the server-side JS frameworks?


Not likely - these kind of bugs are generally fixed when found, or remain hidden simply because that part of the language is very rarely used. You'll likely see this bug in Chrome fixed within a few months.

I was commenting more on the practical effects of having an evolved language - this is a known and likely outcome in an evolved language. Designed languages generally spend a lot of time trying to avoid these kind of issues - it makes sense that not spending that time and adding in features as they are thought up means issues like this will creep in somewhere.


The only language features at play here are closures and garbage collection. Pick your favorite "designed" language: if it has those two features, an implementation of that language could have this bug.


This problem is quite minor and obscure compared to many of the other inherent flaws (type coercion and the broken comparison operators, the lack of proper namespacing, broken scoping, semicolon insertion, hoisting, prototype-based OO, and so on) we see all throughout JavaScript.

If those problems weren't enough to convince certain developers that JavaScript is a bad idea, and unsuitable for use, then it's likely that nothing will. This is especially true for server-side usage of JavaScript, where there are so many far better alternatives.


I don't know how you can safely declare what's used vs unused in a language where properties can be dynamically accessed.

It's easy to check for dict.someProperty, but dict[propertyName] where propertyName can be the string "someProperty" is a much more complicated problem. Now you have to build a tree of everything that changes propertyName and make sure you know it can never be "someProperty".


You can't dynamically access scope objects in standard javascript[0] except by using `eval`, and `eval` already triggers special handling and deoptimizations in pretty much all browsers.

[0] outside of the global scope through `window` but all variables are clearly local here so that doesn't apply.


Doh! Good point.


Nice find. I've run into some inexplicable garbage collection issues where canvas elements in an early version of flot ended up never getting released. I could figure out no way to clear these, but I remember the flot code made heavy use of closures. I should go revisit the code and see if this might explain it. I ended up simply hanging on to the elements and reusing them, had to patch flot to achieve this.


The fix in https://github.com/meteor/meteor/commit/49e9813 seems to be just limiting the effect of the problem, not removing it.

The variable is still being held in memory just with the value null instead of whatever it was before.

Or does setting it to null trigger some special GC algorithm that detects that it isn't being used anymore?


There's nothing particularly special about null - it's really just that it's being set equal to any constant (or object that already exists for the life of the program). The object that was allocated and previously held in renderer will be GC'd (nothing special about that - it doesn't have any references anymore), and setting to null doesn't create a new allocation. You still have "renderer" as an entry in the dictionary of variables for the closure (which takes up a small amount of memory, presumably), but it's no longer preventing an object from being deallocated.


This article taught me how to find a major leak in my Dart2js code so thanks John McCutchan & Loreena Lee http://www.html5rocks.com/en/tutorials/memory/effectivemanag...


Looks like Go optimizes this fully, as pointed out by https://twitter.com/nynexrepublic/status/350717895971586049

If you run these on your own machine and peek at the RSIZE, it stays constant (well, the first grows slowly due to `logIt`).

http://play.golang.org/p/A5Pz-3kthP http://play.golang.org/p/RnXr_jB5Qh


>You could imagine a more clever implementation of lexical environments that avoids this problem. Each closure could have a dictionary containing only the variables which it actually reads and writes; the values in that dictionary would themselves be mutable cells that could be shared among the lexical environments of multiple closures.

That is basically how Lua implements closures.

https://bugzilla.mozilla.org/show_bug.cgi?id=542074


Kudos to the excellently talented Meteor team. A great product from a genuine customer (of a free product), who feels great that there is such attention to detail.


If we implement the suggested fix, and then someone evals code in that scope, they can get the surprising result that variables which look like they should be in scope actually aren't. Careful analysis of this will have many edge cases to worry about.

Implementing the simplest thing that is correct according to the spec sounds like a good idea to me.


Yes, but you can statically determine whether or not eval is called.

A world-class JavaScript environment like V8 is far past "the simplest thing that is correct according to the spec".

In fact, the JavaScript spec doesn't actually define anything about garbage collection! You can create a compliant ECMAScript-262 runtime that literally never collects any garbage ever. That's certainly the simplest correct thing. But it's a bad idea!

V8 already goes to the trouble of figuring out whether or not a given variable needs to be stored in the lexical environment or not. Specifically: variables that are not used in ANY closures and where there is no eval in sight can be stored outside of the lexical environment. This is great, and better than many similar programming language environments offer.

It just would be even better if they went one step farther.


V8 attempts to be fast. There is a trade-off between sophisticated analysis during compilation, and page performance.

I'm not as convinced as you that they haven't found the right balance.


This problem has been encountered and studied before: see Appel and Zhao's paper, Efficient and Safe-for-Space Closure Conversion.


Cool! A major reason I made this post was because I assumed that this must be something people had seen before, but I had trouble finding any references to it in the JavaScript context. I'll add a link to http://flint.cs.yale.edu/flint/publications/escc.pdf to the blog post!


The example code looks really silly. As others said, it's just bad programming.


I'm surprised this comes as a surprise to Meteor developers. References retained won't be garbage collected... regardless of whether the lexical scope is "really small" or not, lol.


The surprise is that a non-referenced value isn't collected, not that a referred-to value isn't.


That is surprising? A setInterval without a matching clearInterval is always a memory leak.


The surprise isn't that the memory associated with `logIt` is leaked.

The surprise is that `logIt` holds on to the giant `str` object in its environment, despite the fact that there is literally no way for the `str` variable to ever be used again.

V8 is smart enough to not make `logIt` hold on to `str` if there are no closures at all which refer to `str`. It's the introduction of the unrelated `doSomethingWithStr` closure that forces `str` into the lexical environment.


BTW, the original bug in the Meteor codebase wasn't a setInterval.

It was that code to replace a certain type of object (a Spark renderer) with a new instance of that object accidentally ended up with a reference to the preceding renderer in a closure assigned somewhere on it, even though that particular closure didn't actually use that reference.

So instead of replacing renderer #N with renderer #(N+1) and GCing #N, we ended up with a stream of renderers which never could be GCed... even though the reference keeping the old ones alive was literally impossible to ever use.


No it's not.

  setInteval(function() { console.log('foo'); }, 1000)
Is not a memory leak.

But I agree that it's not surprising at all. If you can access a variable from somewhere it will obviously not be garbage collected.


The point is that you can't access str from elsewhere, which is why it is surprising.


But you're still clearly retaining a reference to the lexical frame, so I'd think the "zeroth-order" assumption would be that you'd be keeping everything in that frame.

It's nice that JavaScript will do enough tree-shaking that it will eliminate the single-closure case, but that's an optimization, not something you should be counting on.

Similarly, I bet if you had a call to "eval" in there, you'd keep the entire frame, but I don't understand the exact semantics of JavaScript's eval enough to really say.

I suppose this boils down to whether your mental model is more "frame-like" or more "upvalue-like". If you think of keeping around the containing frame, it's not at all surprising that you retain the variable. If you think of the closure as a discrete set of upvalues, then it's a surprise.


> I'd think the "zeroth-order" assumption would be that you'd be keeping everything in that frame.

Why would that be a reasonable assumption? If a variable cannot ever be referenced, it should be garbage collected. In the example, "str" cannot ever be referenced. This is why we use garbage collected languages.

> but that's an optimization, not something you should be counting on.

It's not an optimization, it's correct GC behavior, and it's absolutely something you should be counting on if you're using a GC language.


There is no way to be sure it can't be ever referenced:

  function frame() {
    var str = "blah";
    return function() {
      eval("console.log(str)");
    }
  }
Edit: updated to use eval instead of setTimeout because setTimeout(string) do not share the scope.


eval already triggers a number of deoptimizations and specific behaviors because eval. It's not used in the example and thus entirely irrelevant.


I suppose I don't find it surprising, because I've used a lot of R, and R can't even optimize the simple case. There, it's for good reason, since you can extract the environment of any function, so once a closure escapes, you have a handle onto its entire containing frame.

I think I've used scheme implementations that have the same "feature", with less of an excuse.

It's a similar "failure of GC" if you create an object with two fields, like "obj={a: giant_value, b: 5}" then had your closure return obj.b. I bet the "a" slot would not be GC'd, even though there's no way to reach it.

If your mental model is that you're holding onto the lexical frame, it's a pretty much the same behavior.


> like "obj={a: giant_value, b: 5}" then had your closure return obj.b. I bet the "a" slot would not be GC'd, even though there's no way to reach it.

It absolutely should be GC'd. To demonstrate (on node again), this uses about 10MB of RSS, because obj.a cannot be GC'd.

  j = function() {
    var obj = {
      a:  new Array(1000000).join('*'),
      b:  5
    };
    return obj;
  }();
  setTimeout(function() {}, 30000);
If you change "return obj" to "return obj.b", RSS is about 9kb.

Edit: I think your understanding of GC might be coming from R's generational GC, which is a heuristic and quite different from V8's incremental GC: http://blog.chromium.org/2011/11/game-changer-for-interactiv...


V8's GC is both generational and incremental.

Generational can be actually viewed as a variation of incremental GC.


That allocates a closure. Now if you want it to sit around forever, then hey it is working as intended. Otherwise, it does "leak" some memory that will never get reclaimed by the GC.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact

Search: