A 12-year-old bug in JDK, still out there leaking memory

brown9-2 · on Dec 17, 2012

The odd behavior of java.net.HttpURLConnection should be a lesson to language/framework designers everywhere: don't attempt to implement caching/object reuse/connection keep-alives deep in underlying API code with no way for the end user of the library to control the behavior.

java.net.URL and java.net.HttpURLConnection will attempt to use Keep-Alive and keep connections open for you (which should be a great performance boost if you are making many connections to the same hosts over and over again), but they do so silently and without any obvious documentation on how you can control that behavior in your own code. This leads to lots of headaches if you as a developer actually want to be able to control the connection behavior. For instance, there are quiet and undocumented bugs with HttpsURLConnection not being able to reuse the connection unless you consume all of the response stream (even if you call connection.disconnect() and stream.close() explicitly).

There is also no obvious way to monitor the sun.* code that does the actual keep-alive work: to see how many connections are open, to verify whether it is able to use keep-alive or has to keep opening new connections, etc. (without resorting to tcpdump).

The classloader leak is the root problem, but the trigger in this case is the fact that the framework code is trying to do a whole lot of extra, undocumented, silent work for you. The ability to disable keep-alives in the code that caused the initial report could have been a viable workaround.
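
A minimal sketch of the two workarounds touched on above: draining the response body before closing it so the connection can be reused, or switching keep-alive off with the `http.keepAlive` system property. The class and method names are hypothetical, and the timing of when the property is read is an assumption about the Sun/Oracle HTTP handler, not something the bug report spells out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper; the names are illustrative, not part of any JDK API.
class KeepAliveSketch {

    // Switch off the JDK's built-in HTTP keep-alive.
    // Assumption: the HTTP handler reads this property when its classes
    // initialize, so set it before the first connection is opened.
    static void disableKeepAlive() {
        System.setProperty("http.keepAlive", "false");
    }

    // Read the stream to the end and close it, returning the number of
    // bytes discarded. A fully consumed body is what lets the handler
    // hand the underlying connection back for reuse.
    static long drainAndClose(InputStream in) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try (InputStream body = in) {
            int n;
            while ((n = body.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

In application code this would look something like `KeepAliveSketch.drainAndClose(connection.getInputStream())`. Passing `-Dhttp.keepAlive=false` on the command line is the equivalent of `disableKeepAlive()` without touching code.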

thebluesky · on Dec 17, 2012

As indicated in the article it's not still "out to get you" since it was fixed in March 2011.

From the bug report: Submit Date: 2010-12-22, Resolved Date: 2011-03-10. That's less than 3 months.

sgt · on Dec 17, 2012

I'm using 1.6 for development purposes (I will change to JDK 7 very shortly, but it hasn't been possible until now for other reasons), and I am literally plagued by the PermGen problem every single day, perhaps up to 3 times an hour. Increasing PermGen space is not a solution, as it only means that - yes, it'll take longer for PermGen space to run out, but the Application Server (e.g. Glassfish) will become incredibly slow before it breaks.
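
A rough sketch for watching the class-metadata pool grow across redeploys, using the standard `java.lang.management` API. The class name and the pool-name matching are assumptions: HotSpot names these pools something like "PS Perm Gen" or "CMS Perm Gen" on Java 7 and earlier and "Metaspace" on Java 8 and later, and other JVMs may use different names entirely.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic; class and method names are illustrative.
class ClassMetadataUsage {

    // Returns one line per memory pool whose name suggests it holds class
    // metadata. The list is empty if the JVM uses other pool names.
    static List<String> report() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (name.contains("Perm") || name.contains("Metaspace")) {
                MemoryUsage usage = pool.getUsage();
                if (usage == null) {
                    continue; // pool is no longer valid
                }
                // getMax() is -1 when the pool has no configured limit.
                lines.add(name + ": used=" + usage.getUsed() + " max=" + usage.getMax());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```

To be useful inside an application server, `report()` would have to be called from code running in that server (for example, logged after each redeploy). If the "used" figure climbs with every redeploy and never drops, that points to a classloader leak rather than an undersized pool.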

brown9-2 · on Dec 18, 2012

Are you doing lots of redeploys? Can you alter your workflow to just restart the server process instead?

sgt · on Dec 19, 2012

Yes, doing a lot of redeploys. Restarting the server process takes a bit too long. E.g. if I have 50 EJBs and 20-30 JAX-RS type web services, it takes the application about 15-20 seconds to load each time. Sometimes even longer. So incremental redeploys are really great, as they usually take 2-3 seconds instead.

ivom2gi · on Dec 17, 2012

According to this report, http://zeroturnaround.com/labs/devprod-report-redux-java-ver..., Java 7 currently has at most 20% of market share. This means that most of the apps out there are still affected ...

jtymes · on Dec 17, 2012

>Luckily the new patches to Java 7 no longer have this problem. But as the different statistics show – vast majority of the applications out there have not migrated to Java 7 as we speak. So most often than not, your application at hand has got the very same problem waiting to surface.

It's a tad misleading, but it's "still out there waiting to bite you" because of Java 6 still being widely used.

thebluesky · on Dec 17, 2012

JDK 6 is EOL 2 months from now: https://blogs.oracle.com/henrik/entry/java_6_eol_h_h

It's also pretty rare for folks to use that HTTP client. Normally an Apache Commons or Netty HTTP client is used instead.

smackfu · on Dec 17, 2012

We'll see if that EOL actually happens. It's pretty irresponsible to discontinue security updates for a version that is used by half of your customers (even if you don't want them to use it anymore.) Very similar to Microsoft and XP support.

SeanLuke · on Dec 17, 2012

Frustrating but, I am sad to say, not unexpected. I myself have submitted major bugs in core Java code (java.util.Random, java.util.ArrayList, etc.) which have gone ignored for a decade and in fact are still there. If they can't fix core code, they're hardly going to fix HttpURLConnection.

michaelt · on Dec 17, 2012

What sort of bugs did you find in ArrayList?

SeanLuke · on Dec 17, 2012

For ArrayList, it was a huge speed regression. get(), set(), and add() were not inlineable for ten years due to an error fixable with a single line of code. The problematic code is still there even now, but it's moot because sometime between Java 5 and 6 the Hotspot team special-cased it.

Over the years I've found that while the Hotspot team seemed pretty sharp, the monkeys writing the library code were incompetent in implementation and negligent in repair even for core stuff. I eventually gave up submitting error reports. Don't even get me started on java.util.Random. :-(

brown9-2 · on Dec 18, 2012

Do you have links available for your bug reports?

SeanLuke · on Dec 18, 2012

Not since Sun->Oracle totally changed the bug reporting system, sorry. It was long ago.

In the ArrayList case, IIRC (it was years ago) the issue was the now-notorious RangeCheck function. For example, get(index) does this:

    RangeCheck(index);
    return elementData[index];

get() could be inlined, but because RangeCheck threw an exception if the index was >= size, RangeCheck was not inlined. But if you did this:

    if (index >= size)
        RangeCheck(index);
    return elementData[index];

... you could bypass it and calls to get() would involve inlined code unless the exception was triggered. Same went for set and add. Recent versions of HotSpot have mooted this now.
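
A self-contained sketch of the two shapes described above, using a simplified stand-in class rather than the real JDK source. It only shows that both variants behave the same from the caller's side. Whether either one gets inlined depends on the HotSpot version, and as noted, recent versions make the difference moot.

```java
// Simplified stand-in for ArrayList; hypothetical, not the JDK source.
class TinyList {
    private final Object[] elementData;
    private final int size;

    TinyList(Object... items) {
        this.elementData = items.clone();
        this.size = items.length;
    }

    // Bounds check that throws on a bad index, like the old RangeCheck.
    private void rangeCheck(int index) {
        if (index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
    }

    // Original shape: the throwing helper is called on every access.
    Object getUnguarded(int index) {
        rangeCheck(index);
        return elementData[index];
    }

    // Guarded shape: the helper is only called on the failure path, so the
    // common case is a plain comparison followed by an array load.
    Object getGuarded(int index) {
        if (index >= size) {
            rangeCheck(index);
        }
        return elementData[index];
    }
}
```

Both methods return the same element for a valid index and throw `IndexOutOfBoundsException` for an index at or past `size`. A negative index is caught by the array access itself in either variant.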

martinced · on Dec 17, 2012

The PermGen is specific to the Sun/Oracle JVM, and these classloader leak issues have been plaguing Java for years and years.

Back in the day (circa 2006 or so), when similar issues were very common but the knowledge wasn't widely available, we'd fix the issue by changing one of the components (Sun JVM, Tomcat, I don't remember which). For example, switching to a non-Sun JVM or switching from Tomcat to Caucho/Resin, etc.

Or simply starting the Sun JVM with a bigger PermGen.

It's kinda sad to see how many PermGen issues there have been throughout the years.

brown9-2 · on Dec 17, 2012

I think this kind of behavior has led to a lot of teams discovering that it's just easier to stop and start new JVM/container instances rather than hoping that the "web application reload" feature of the container will actually work without errors.

advisory5739f2 · on Dec 17, 2012

I've never had redeploys work well. Restarting the JVM/container is the way to go.

TomRK1089 · on Dec 17, 2012

Seeing as you're not supposed to use sun packages, if this one bites you it's your own damn fault. Quite the linkbait title.

brown9-2 · on Dec 17, 2012

If you look at the bug report cited in the article, use of sun.* classes is not required. Using java.net.URL or java.net.HttpURLConnection causes the behavior - because the implementation of these classes uses sun.* classes.
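
To make the mechanism concrete, here is a hedged sketch. It rests on an assumption about the bug that this thread does not spell out: that a background keep-alive thread started inside the sun.* implementation inherits the context class loader of whichever thread first triggers it, which pins a web application's classloader. The class and method names are hypothetical. The sketch only shows the general Java behavior (a new thread inherits its creator's context class loader) and one way to shield a call from it.

```java
// Hypothetical helper; illustrates context class loader inheritance only.
class ContextLoaderSketch {

    // Runs the task with the system class loader as the thread context
    // loader, then restores the previous one. Any thread created while the
    // task runs inherits the system loader instead of the caller's loader.
    static void runWithSystemLoader(Runnable task) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(ClassLoader.getSystemClassLoader());
        try {
            task.run();
        } finally {
            current.setContextClassLoader(previous);
        }
    }

    // Returns the context loader a thread created right now would inherit.
    // The thread is never started; it is only used to observe inheritance.
    static ClassLoader loaderInheritedByNewThread() {
        return new Thread().getContextClassLoader();
    }
}
```

Under that assumption, wrapping the first HTTP request made by a web application in `runWithSystemLoader` would keep the background thread from holding on to the application's classloader. Whether that is sufficient on a given JDK 6 build is something to verify rather than take from this sketch.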

TomRK1089 · on Dec 17, 2012

Hm, my mistake. I did not follow the referenced link. The article states "...last in this hierarchy is sun.net.www.http.HttpClient" which led me to assume you had to explicitly go mucking around with something access-restricted.