
Creating a memory leak with Java - am_sandeepa
http://stackoverflow.com/q/6470651/7338196
======
TimJYoung
So, a question: is it better or worse that the usage of GCs has changed the
mindset of developers to stop worrying about allocation/freeing of resources?

Given that developers still need to worry about freeing/releasing native
resources, I think it might have inadvertently created a false sense of
security and, in the process, resulted in developers that don't even possess
this type of thinking anymore. We ran into this with the .NET Data Provider
for one of our database engines. A good portion of .NET developers that used
it were surprised to find out that they needed to call the Dispose method on
certain database objects in order to free/release their native
handles/resources. This was followed by a period of confusion about what to
dispose of, and what not to dispose of (like this:
[http://stackoverflow.com/questions/2024021/does-the-
dispose-...](http://stackoverflow.com/questions/2024021/does-the-dispose-
method-do-anything-at-all)).
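
For readers on the Java side of this thread, the closest analogue to calling Dispose is closing an `AutoCloseable`, ideally through try-with-resources. A minimal sketch, assuming a hypothetical `NativeHandle` class that stands in for a driver object owning a native resource (the counter exists only to make the effect observable):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an object that owns a native handle.
final class NativeHandle implements AutoCloseable {
    // Number of handles currently open; illustrative bookkeeping only.
    static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed;

    NativeHandle() {
        OPEN.incrementAndGet();
    }

    String query() {
        if (closed) throw new IllegalStateException("handle already closed");
        return "row";
    }

    @Override
    public void close() {
        // Idempotent: closing twice releases the handle only once.
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}

final class HandleDemo {
    // The GC does not call close(); try-with-resources does, even on exceptions.
    static String readOnce() {
        try (NativeHandle h = new NativeHandle()) {
            return h.query();
        }
    }
}
```

Calling `HandleDemo.readOnce()` leaves the open-handle count at zero, whereas a bare `new NativeHandle()` with no matching `close()` keeps it at one no matter how often the collector runs.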

~~~
jerf
"resulted in developers that don't even possess this type of thinking
anymore."

The good old days weren't. What it resulted in on average was developers who
just bashed on their manual allocation until it "seemed to work", not glorious
hand-constructed shimmering perfect crystals.

At least now the code that "seems to work" _usually_ does, and is memory-safe
now, so yes, it's an improvement. If you still want to be careful and do it by
hand, the languages for that are still there, and there are even better
options arising.

~~~
mschaef
> At least now the code that "seems to work" usually does,

Some of that may come down to better hardware. It's possible to be incredibly
naive with today's hardware and still get results that would've taken high
level engineering 20 years ago.

Edit: This is down-modded, but it's fundamentally good news. Increases in our
capacity to do productive work with these machines come from both software and
hardware improvements. Not sure what else you'd expect.

~~~
sgift
> Edit: This is down-modded, but it's fundamentally good news. Increases in
> our capacity to do productive work with these machines come from both
> software and hardware improvements. Not sure what else you'd expect.

The downvotes are probably from the realization that this is good and sad news
at the same time. On the one side, it is good news because many software
projects wouldn't have been realized if you couldn't be that naive, on the
other side who knows what would be possible with a bit more "discipline" (for
lack of a better word)

~~~
kpil
I think you could leak a little 20 years ago too, or use unnecessary amounts
of memory - in most cases.

It's just a few applications and parts of applications that really are
resource constrained and critical enough to really matter...

~~~
mcguire
Certainly. All those Unix-style filter programs had little need to be too
careful since they didn't run for long.

On the other hand, anyone know if the leak in the X server ever got fixed?
IIRC, it was small enough that it was only a problem running the server for
months on a resource constrained machine.

------
MartinBeckCop
I'm a hiring manager and when I see a resume with a few years of Java I ask
the same question.

It's all about objects being "unintentionally reachable" which generally all
boils down to putting something in a collection and never taking it out, or
completely forgetting about object lifecycle. Just because you have a runtime
and a GC doesn't mean you can forget any kind of teardown. (If you register an
event listener and never unregister, you are ignoring lifecycle.)

People who have a few years of Java on their resume should have some idea of
when an object is eligible for collection by the GC and the common programmer
errors shown by other posts in this thread that can cause leaks.
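
To make the listener case concrete, here is a rough sketch. `EventSource` and `Screen` are made-up names for illustration, not classes from any framework. The long-lived source holds a strong reference to every registered handler, so the short-lived component stays reachable until it explicitly unregisters:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical long-lived event source; it strongly references every listener.
final class EventSource {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void register(Consumer<String> l) { listeners.add(l); }
    void unregister(Consumer<String> l) { listeners.remove(l); }
    int listenerCount() { return listeners.size(); }

    void fire(String event) {
        // Iterate over a copy so listeners may unregister while handling an event.
        for (Consumer<String> l : new ArrayList<>(listeners)) l.accept(event);
    }
}

// Hypothetical short-lived component with an explicit teardown step.
final class Screen implements AutoCloseable {
    private final EventSource source;
    // Stored in a field so the same instance can be passed to unregister().
    private final Consumer<String> handler = this::onEvent;
    int received;

    Screen(EventSource source) {
        this.source = source;
        source.register(handler);
    }

    private void onEvent(String event) { received++; }

    @Override
    public void close() {
        // Without this call the source keeps the Screen reachable indefinitely.
        source.unregister(handler);
    }
}
```

The handler is kept in a field on purpose: a fresh `this::onEvent` expression at unregister time would be a different object, and `remove` would silently do nothing.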

~~~
daemin
That's the classic example of a memory leak in a GC language. Where you add
objects to some list or map so that they can be looked up quicker than being
created again. Some people call it a cache. What people forget though is some
way of removing the items from the cache, and so it leaks these objects for
the lifetime of the application.

Not really as sophisticated as other ways of leaking memory, but it can also
be easily done in all other languages.
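
One common way to give such a cache a removal policy in Java is a size-bounded LRU map. This is only a sketch under the assumption that a fixed entry limit is acceptable; `BoundedCache` is a name invented here, built on the standard `LinkedHashMap` eviction hook:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A size-bounded LRU cache built on LinkedHashMap's access-order mode.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = order entries by access, least recent first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Consulted after each insertion; returning true evicts the eldest entry.
        return size() > maxEntries;
    }
}
```

With a limit of 2, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, so the map can never grow past its limit.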

------
msluyter
Seems like there's some debate over what "memory leak" really means. On the
one hand, you can just inadvertently grow some resource -- say, by appending
to a list indefinitely -- until your server crashes. But in this case, if the
list goes out of scope it should be garbage collected, so such memory is in
practice reclaimable.

The more interesting case is allocation of memory which is _not_ in practice
reclaimable (as described in the first answer.)

The former seems pretty trivial to understand, but if the original
interviewers meant the latter I doubt I would have been able to come up with a
good example off the top of my head.

~~~
tannhaeuser
Memory leaks on the server side are also a consequence of long-running single-
process server software. What was traditionally run as a CGI (eg. Apache
creating a fresh process for each request) or was started from inetd is now
frequently implemented in a multi-threaded or evented fashion, so that using
the O/S as garbage collector for memory and temp files doesn't work anymore.
Not only does this put additional strain on getting memory management 100%
right, it also creates memory fragmentation when done naively. While the
consensus and narrative seems to be that evented I/O is more performant
because the O/S doesn't have to schedule loads that have "nothing to do anyway
most of the time" so to speak, I've yet to see a benchmark comparing
traditional process-per-request vs. multithreaded and/or evented approaches;
it's not that user-space memory management doesn't introduce additional
overhead.

My suspicion is that in many cases memory management strategies of desktop
software have been used for server-side software without evaluation, or out of
a habit because Java introduced GC into server software. But OTOH even
OpenBSD's "new" (3-5 years old) httpd integrates traditional CGIs via the
so-called "slowcgi" bridge, and uses evented I/O and asynchronous APIs natively
so maybe I'm wrong on this one. If even OpenBSD developers themselves don't
use ASLR and other techniques of OpenBSD to ensure non-deterministic memory
allocation and go to great length to get equivalent protection within single-
process request processing containers, they sure must be on to something.

~~~
algesten
> Memory leaks on the server side are also a consequence of long-running
> single-process server software.

On a tangent to this. I noticed that one benefit my company reaped from
embracing a micro service architecture was that we get away with _a lot_ of
shit code.

Suddenly a NodeJS process can have glaring memory leaks and work fine for
weeks because whenever it crashes, it restarts in 0.5 seconds and no one even
bothered to investigate.

Compare that with the good-ol-JBoss monolith with a 3-4 minute startup time.

On the one hand, crap code can go unnoticed for weeks, which feels wrong and
probably will bite us somewhere in the end, on the other hand, we can focus on
function not code quality.

~~~
koolba
Plus if you build for this from the get go and assign a max life to your
processes with auto restart, it becomes a native part of their lifecycle.

With multiple processes listening to the same socket it's easy to create a
continuously-on service that under the covers respawns with no apparent loss
of service.

------
tyingq
The most common ones I've seen are unclosed sockets or other connections due
to bad logic in try/catch/finally blocks.
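
A sketch of the kind of finally-block mistake I mean, using a hypothetical `Tracked` resource that records when it is closed (the names and the deliberate failure are illustrative assumptions). In the hand-written version a failing `close()` skips the next one; try-with-resources still closes the remaining resource:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource that records when it is closed.
final class Tracked implements AutoCloseable {
    static final List<String> CLOSED = new ArrayList<>();
    private final String name;
    private final boolean failOnClose;

    Tracked(String name, boolean failOnClose) {
        this.name = name;
        this.failOnClose = failOnClose;
    }

    @Override
    public void close() {
        CLOSED.add(name);
        if (failOnClose) throw new IllegalStateException("close failed: " + name);
    }
}

final class CloseDemo {
    // Buggy hand-written cleanup: if the first close() throws, the second never runs.
    static void manual() {
        Tracked socket = new Tracked("socket", false);
        Tracked stream = new Tracked("stream", true);
        try {
            // ... use the resources ...
        } finally {
            stream.close();
            socket.close(); // skipped when stream.close() throws
        }
    }

    // try-with-resources closes in reverse order and still closes the rest on failure.
    static void managed() {
        try (Tracked socket = new Tracked("socket", false);
             Tracked stream = new Tracked("stream", true)) {
            // ... use the resources ...
        }
    }
}
```

Both methods end up throwing the `IllegalStateException`, but only `managed()` gets as far as closing the socket.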

~~~
djsumdog
This was my first major memory leak; specifically not closing all my JDBC
result sets. Fun fact, if you get table metadata, that creates a result set.
:-P

------
mschaef
Easy to overthink this: create a memory leak by retaining un-needed
references.

I forget where I read it, but I once saw GC characterized as changing memory
management from managing the nodes of a graph to managing the edges. It's more
automatic and safer, but there is still a finite heap and there are still
pitfalls.

~~~
rbanffy
Just create a Vector and keep adding things to it, indefinitely.

I once encountered that in a production environment (circa 2000). The server
kept stats while it was running, appending them to a vector on every iteration
of its main loop. On the development environment, on x86 processors, it took
days to exhaust the memory and the issue was never spotted. On the production
environment, on an 8-way ridiculously fast SPARC machine, it took a couple
hours to completely lock up the machine.
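
In hindsight, a fix along these lines would have been enough. This is a sketch rather than what the server actually did; `RecentStats` and its fixed capacity are assumptions for illustration. The idea is to keep only the most recent samples so the collection cannot grow with uptime:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Keeps only the most recent `capacity` samples instead of growing forever.
final class RecentStats {
    private final Deque<Long> samples = new ArrayDeque<>();
    private final int capacity;

    RecentStats(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    void record(long value) {
        if (samples.size() == capacity) samples.removeFirst(); // drop the oldest sample
        samples.addLast(value);
    }

    int size() { return samples.size(); }

    double average() {
        if (samples.isEmpty()) return 0.0;
        long sum = 0;
        for (long v : samples) sum += v;
        return (double) sum / samples.size();
    }
}
```

However fast the main loop runs, memory use is then bounded by the capacity rather than by how long the process has been up.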

~~~
sorokod
Vectors haven't been used that much this century.

What you describe is not a leak. A leak would be when you lose all references
to your Vector and memory is not reclaimed by the GC.

~~~
daemin
I'd argue that a memory leak is when you've allocated memory that you are not
using any more but it has not been freed for some reason. If this happens
because the reference has been truly lost, or if there's something else (like
a helpful cache) that is keeping it alive, it doesn't matter. It is still
leaked and will only be cleaned up when the program exits.

This applies to resources in general, if they are only reclaimed by the OS
when the program exits, then it is a leak.

~~~
krapp
> if they are only reclaimed by the OS when the program exits, then it is a
> leak.

If this is true, then memory leaks have become a paradigm. I've seen it argued
that you should never bother freeing memory yourself, and just let the OS take
care of it. It always bothered me.

~~~
daemin
I think the statement refers to when the program is already shutting down. In
that case you don't need to worry about freeing OS managed resources as the OS
will do that for you when your program exits. Which kind of makes sense but it
isn't easy to do without some global flag that you can check and a non-normal
destruction path for your objects/data.

------
nathan_f77
That's funny that this would come up on HN right now.

I've just spent the last day or two struggling with a potential GC bug on
Android. It might be somewhere in JavaScriptCore, Android, or Genymotion, but
I really have no idea. The last few hours have consisted of repeatedly
clicking through my app, and trying to figure out why it crashes every 5
minutes or so. In this case it's not a memory leak, it's memory that is
getting wiped for no good reason.

I've also posted about it on StackOverflow [1].

So far I've really enjoyed working with React Native. iOS has been almost
perfect, but all my problems seem to be happening on Android. There was one
case where my components were just disappearing randomly during animations
(even without native drivers). The only workaround was to rotate the component
"keys" so that they were regularly destroyed and recreated. The native
animation drivers for Android are also incredibly unstable and buggy [2]. Ah
well, I would still recommend React Native, but only if you're already
comfortable with Cocoa, ObjC, and Java. You're going to face a ton of
roadblocks if you can't get your hands dirty with native code from time to
time.

[1] [http://stackoverflow.com/questions/43470160/in-a-react-
nativ...](http://stackoverflow.com/questions/43470160/in-a-react-native-
javascript-app-why-would-the-android-gc-behavior-change-if-i)

[2] [https://github.com/facebook/react-
native/issues/13530](https://github.com/facebook/react-native/issues/13530)

[3] [https://github.com/react-native-community/react-native-
blur](https://github.com/react-native-community/react-native-blur)

~~~
BoorishBears
Because I've worked so much with native Android bugginess (versions of Android
that crash if you try and pause an HTML5 video, crashes after X number of
MediaPlayer loops where X is a large number around 5000, etc.), it can be hard
for me to use React Native because there's so many workarounds you're hoping
the people behind components are implementing unless you want to start
augmenting them yourself

~~~
nathan_f77
That's actually one of the reasons why I like React Native, because I'm hoping
that it can be like "jQuery" for Android apps (in terms of jQuery's cross-
browser compatibility). I'm assuming that the core RN code already contains a
lot of workarounds for different Android versions, so we can just use one
consistent API via JavaScript.

But you're right that it can get very tricky when you want to use a lot of
third-party libraries. For example, I thought I would be able to just drop in
'react-native-blur' [1] and everything would work out of the box. I didn't
realize that the Android component has been broken for a very long time, so I
had to dive into the code and fix it. And then I ended up basically rewriting
the whole library for iOS and Android, including the example apps, the README,
and even the GIF in the README. But the nice thing is that other people can
use it now, and they won't have to go through all of that. The other nice
thing is that I learned a ton about developing native modules for React
Native, so I don't regret it at all.

Just think that you're going to have to implement those workarounds anyway, so
it makes sense to contribute them to an open source library that everyone can
use. For your video examples, it would be awesome if you could take a look at
react-native-video [2], and see if they already include those workarounds. (I
would personally appreciate that a lot, because I'm working on a cross-
platform RN app that plays a lot of videos.)

[1] [https://github.com/react-native-community/react-native-
blur](https://github.com/react-native-community/react-native-blur)

[2] [https://github.com/react-native-community/react-native-
video](https://github.com/react-native-community/react-native-video)

~~~
BoorishBears
The fear for me is running into a situation where I unexpectedly have to come
up with the workaround. With native I generally know where the bugginess is
and plan for it, there's a small layer of indirection when you use RN. I say
small because to be fair, in a world without deadlines I'd just vet every
library I want to use upfront. But sometimes things slip, and finding out the
component you used is failing on <insert specific android version or Samsung
model> right before release is a little scary (but again, even with
native that's a risk).

------
AldousHaxley
I get the value of garbage collectors, but honestly until Go I wasn't too fond
of them, as I'd find myself spending about as much time thinking about memory
management in Java as in C. At least with C you don't have the same lack of
determinism as you do with traditional GC languages.

I find recent language trends interesting. Moving away from VMs to native
binaries, away from Java-style GCs and embracing ARC and smart pointers. I
have a strong bias toward minimalism, so I like the trend. If it serves to
remind people that CPU time and memory, while incredibly cheap, aren't free,
then all the better.

------
mabbo
Threads. They solve so many problems, and they also let you really mess
yourself up.

Every service request would new some worker object which would in turn spin up
a FixedThreadPoolExecutorService (or something like that) to handle some
concurrent work it needed before returning.

But I failed to properly close or dispose or whatever it was with the executor
service. So the threads? They stuck around, doing nothing. And every service
call that came in, another K threads were created.

Thank god we had a good metrics system tied into the JVM stats or we might not
have even noticed, except that processes seemed to die from time to time.
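
For anyone wondering what the fix looks like, here is roughly the shape of it. Treat it as a sketch: `RequestHandler`, the pool size of 4, and the sum-of-squares work are placeholders I made up. The point is one pool shared across requests, shut down once, instead of a new pool per request:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class RequestHandler {
    // One pool for the whole service, created once and reused by every request.
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    // Hypothetical request: sums the squares of the inputs using the shared pool.
    int handle(List<Integer> inputs) {
        List<Future<Integer>> futures = new ArrayList<>();
        for (int n : inputs) {
            futures.add(pool.submit(() -> n * n));
        }
        int total = 0;
        try {
            for (Future<Integer> f : futures) total += f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
        return total;
    }

    // Call once at service shutdown; idle pool threads otherwise never exit.
    void shutdown() { pool.shutdown(); }

    boolean isShutdown() { return pool.isShutdown(); }
}
```

If a per-request pool really is needed, the equivalent discipline is calling `shutdown()` in a `finally` block before the request returns.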

~~~
mschaef
> Every service request would new some worker object which would in turn spin
> up a FixedThreadPoolExecutorService

I've recently reviewed (and failed) code that did exactly this. Maybe it's a
more common misunderstanding than I'd have thought.

------
mmimica
People deal with memory leaks in Java on a daily basis, e.g.:

[https://medium.com/@milan.mimica/everybody-leaks-f210631f13e...](https://medium.com/@milan.mimica/everybody-leaks-f210631f13ef)

[http://www.evanjones.ca/java-bytebuffer-leak.html](http://www.evanjones.ca/java-bytebuffer-leak.html)

[http://www.evanjones.ca/java-native-leak-bug.html](http://www.evanjones.ca/java-native-leak-bug.html)

------
wruza
Almost the same method for e.g. Lua:

    
    
      debug.getregistry()[{}] = big_value
      debug = nil
    

These aren't really "true" leaks, because from the GC's point of view the
referencing object in both cases (ClassLoader, Registry, etc.) is just a
hidden gc-root that cannot be reset from regular code. It is the same way you
cannot un-leak this C++ pseudocode:

    
    
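      #include <cstdlib>
    
      struct App { int run() { return 0; } };  // stand-in for the real application
    
      int main()
      {
        // Never freed: `p` stays live in main's frame until the process exits.
        int* p = static_cast<int*>(std::malloc(1'000'000 * sizeof(int)));
    
        App app;
        return app.run();
      }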
    

... because the active stack frame is also a gc-root. From a practical point of
view though, this definitely leaks, since the leaked data cannot be [re]used
anywhere.

~~~
wyldfire
I don't see how this example relates. Your C++ example is a real puzzler. Even
if `p` were in use during run(), it could still be (and should still be!)
free()d before returning from main(). It's a bit tricky but there's
potentially LOTS of code that will execute after "`return app.run()`" but
before the process exits.

~~~
bpicolo
Seems like something that the optimizer would get rid of.

~~~
wyldfire
No, the static destructors for example cannot be optimized away because their
effects are intentional and required. In C++, close braces have potentially
limitless implicit code that gets executed.

------
haddr
I'm amazed that nobody mentioned in comments that this memory leak is not a
"typical" memory leak, but rather quite an obscure one.

It happened to me once (through some 3rd party library), and before that I had
no idea that PermGen leaks even exist...

I'm not sure if you can spot this memleak by only monitoring the heap space.
The good news is that you can observe it in VisualVM when you watch the
"classes" chart (loaded classes count, bottom left) and "PermGen" tab (top
right, but you have to click the PermGen tab).
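
If attaching VisualVM isn't convenient, the loaded-class count is also exposed in-process through the standard `java.lang.management` API. A small sketch; `ClassCountProbe` is a made-up helper name, and how you log or graph the numbers is up to you:

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

final class ClassCountProbe {
    // Snapshot of the JVM's class-loading counters, formatted for a log line.
    static String snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return "loaded=" + bean.getLoadedClassCount()
                + " totalLoaded=" + bean.getTotalLoadedClassCount()
                + " unloaded=" + bean.getUnloadedClassCount();
    }

    // Number of classes currently loaded in this JVM.
    static int loadedNow() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }
}
```

A loaded count that keeps climbing across redeploys while the unloaded count stays flat would be consistent with the kind of classloader leak discussed here, though it is a hint rather than proof.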

------
mcculley
There are a lot of answers here and on Stack Overflow that are a lot more
complicated or esoteric than they need to be. The simplest example in my mind
is a vector of references (e.g., java.util.Vector, java.util.ArrayList). In
the implementation of removing a tail element or truncating the vector, one
must set the removed references to null, otherwise the objects they point to
cannot be collected and yet they are unavailable to users of the vector.

~~~
jdmichal
I'm not quite sure where you're getting that interpretation... Of course,
_any_ removed element must have its reference replaced with `null`, because
otherwise that's a live reference. This is true for _any_ object, not just
collections. Your program not being able to reach it is a function of the
interface, not the object.

I think that a separation needs to be made between _purposefully_ and
_unpurposefully_ unreachable or retained references. For instance, it's
perfectly valid for a method to return one of two internal references based on
a boolean flag. So both those objects are still reachable, even if only one
reference is "purposefully" reachable at once. That isn't a memory leak; it's
just programming logic.

~~~
mcculley
One can implement a class with an interface like java.util.List and have
remove() just decrement an internal counter when the item to be removed is at
the end of the list. This would be a correctly functioning implementation. One
must go further and set the reference to null to prevent a leak.
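
A minimal sketch of what I mean, assuming a hypothetical array-backed `TailList` (not the JDK's implementation) that only supports appending and removing the last element:

```java
import java.util.Arrays;

// Hypothetical array-backed list that only supports add and removeLast.
final class TailList<E> {
    private Object[] items = new Object[8];
    private int size;

    void add(E e) {
        if (size == items.length) items = Arrays.copyOf(items, size * 2);
        items[size++] = e;
    }

    @SuppressWarnings("unchecked")
    E removeLast() {
        if (size == 0) throw new IllegalStateException("list is empty");
        E removed = (E) items[--size];
        items[size] = null; // without this, the array keeps the object reachable
        return removed;
    }

    int size() { return size; }

    // Exposed only so the example can show what the backing array still holds.
    Object rawSlot(int index) { return items[index]; }
}
```

Delete the `items[size] = null` line and every caller still sees correct behaviour through `add`, `removeLast`, and `size`, yet the removed object stays strongly referenced by the backing array.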

~~~
jdmichal
My point is that it's only a "leak" because the List interface makes it clear
that the program no longer intends to reach that object. So this would be
"unpurposefully reachable" -- the program did not intend for the object to be
reachable, but it is. This has nothing to do with the List interface, and
everything to do with how an interface is used and what intent that usage
proves.

~~~
mcculley
I think we are arguing over the definition of "leak". When you have one
developer producing a data structure used by another developer, this kind of
leak is an easy mistake to make. I think in the sense of answering the Stack
Overflow question, that makes it a more useful answer than the examples about
ClassLoaders and non-heap resources.

------
aliakhtar
You have to work pretty hard at it, the sample code they provided is pretty
long:
[https://gist.github.com/dpryden/b2bb29ee2d146901b4ae](https://gist.github.com/dpryden/b2bb29ee2d146901b4ae)

And it extends ClassLoader which you wouldn't normally ever do.

------
ruleabidinguser
Would you really expect a programmer you're hiring to know this? Is this
useful knowledge?

~~~
Spivak
Getting the answer right isn't really important, but the candidate's answer to
this question potentially reveals a lot of information.

* Most likely they've never had to think about this before. How do they approach unfamiliar territory?

* Do they have a deep enough knowledge about Java that they could at least make a conjecture about how such a thing could be created?

* Do they understand that Java has a GC and how it works?

* Are they the rigorous definition type and declare that such a thing is impossible without a bug?

* Will they try to interpret the interviewer's meaning and talk about unintended long-lived references or ways to accidentally consume a lot of memory?

Overall I like questions like this that require some creativity to answer.

~~~
MartinBeckCop
Yes! I am a hiring manager and this is exactly why I ask this question.

It's a very telling question if they claim Java experience and can't say
anything coherent about leaks or "unintentionally reachable" objects.

It means both that they haven't had to attack memory pressure issues, haven't
used heap dump tools, etc., and haven't had the intellectual curiosity to know
how the platform works.

~~~
stlHusker
Bull.

The question as-is ("How do you create a memory leak in Java?") is poorly
constructed and will, a significant portion of the time, lead down the wrong
path when interviewing. Both you and the interviewee have to make assumptions.

If you want to know if they have had to deal with optimizing memory usage,
simply ask. What was the problem? How did you detect it? How did you solve it?
No ambiguity, no assumptions.

"It's a very telling question if they claim Java experience and can't say
anything coherent about leaks or "unintentionally reachable" objects."

You didn't answer the root of the OP's question -- why is that knowledge
important to the application you are building or the work you are doing?
Sorry, you failed his interview...

------
smsm42
My experience always has been if you run any sizeable Java app for a while,
you've created it :) Now, "how you find and eliminate a memory leak in Java"
is a million-dollar question. In some cases, it may be literally.

------
lightlyused
If you have a leak that you are having trouble finding, it pays to search the
issue queue of any library you are using to see if they have caught something.

------
louithethrid
Distributed systems holding references to objects on other distributed
systems. Lose the connection, the reference persists, and the memory leaks.

------
rahilb
would sun.misc.Unsafe.getUnsafe.allocateMemory(Long.MaxValue) work?

Is an intentional memory leak still a _leak_?

------
gildas
What about using JNI for this?

------
OhHeyItsE
import org.apache.commons.logging.Log;

