

Java IAQ - Mitt
http://norvig.com/java-iaq.html

======
krrose27
Goes directly to "What other operations are surprisingly slow?" and attempts
to write a micro benchmark with the same results.

Fails....

Most of the general stuff is accurate and dandy but I don't believe you should
listen to many (most likely any) of the speed related statements as this
article appears to have been written in 1998.

Topic should be "Java IAQ circa 1998".

------
nailer
I have an IAQ of my own:

Why is it that Java VMs seem to have such different interactive latency when
compared to other memory-managed language VMs? Is the performance of, e.g.,
Eclipse or Android or the Java browser plugin something to do with the GC
kicking in too aggressively?

Over the last decade I've heard different explanations re: SWT/swing/AWT
toolkits, and server-side optimizations being default in desktop VMs, but I'm
fairly sure both issues should be resolved now. Nevertheless, my general
experience seems to be if it's Java based, it seems to handle touch/mouse
events noticeably worse than otherwise.

Maybe .net and Python and the rest are doing more of their UI work in C?

~~~
lars
It's been a while since I've done work in Java, but I'm pretty sure the GC is
not to blame for this. Java uses a concurrent mark and sweep GC. The
_concurrent_ part means it does not pause the world while it's collecting. So
as mbell says, it has to do with the UI frameworks.

But I don't think it has to do with whether or not the UI code is written in C
either. In the UI frameworks I've used for both Python and Ruby, you could
write slow interpreted code that would run on the UI thread. The key thing was
you tended not to do it, because the frameworks didn't encourage you to do so.
The problem is that Java frameworks make it easy to, for example, do IO on the
UI thread, whereas the other frameworks make it hard. Threading is relatively
hard to get right in any of these languages, so the choices made by the
framework authors really matter.
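
A minimal sketch of keeping IO off the UI thread. To stay toolkit-independent, a single-threaded executor named "ui" stands in for the UI event thread (an assumption for illustration; a real Swing app would use the event dispatch thread), and `slowRead` is a hypothetical stand-in for blocking IO.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class UiThreadSketch {
    // Hypothetical stand-in for blocking IO (disk, network, ...).
    static String slowRead() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "file contents";
    }

    // Does the slow read on `worker`, then applies the result on `ui`.
    // Returns "<thread that read> -> <thread that updated>".
    static String loadInBackground(ExecutorService ui, ExecutorService worker) {
        return CompletableFuture
                .supplyAsync(() -> {
                    slowRead();
                    return Thread.currentThread().getName();
                }, worker)
                .thenApplyAsync(
                        reader -> reader + " -> " + Thread.currentThread().getName(), ui)
                .join();
    }

    public static void main(String[] args) {
        // The "ui" executor is only a stand-in for a real UI event thread.
        ExecutorService ui = Executors.newSingleThreadExecutor(r -> new Thread(r, "ui"));
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
        try {
            System.out.println(loadInBackground(ui, worker)); // prints "worker -> ui"
        } finally {
            ui.shutdown();
            worker.shutdown();
        }
    }
}
```

The "ui" thread stays free to handle events during the 200 ms read. Calling `slowRead()` directly on it would block every event queued behind the call.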

Another thing contributing to Java being perceived as being slow is the long
boot time of the JVM. (It used to be slow at least, don't know if it still
is.)

~~~
mbell
> It's been a while since I've done work in Java, but I'm pretty sure the GC
> is not to blame for this. Java uses a concurrent mark and sweep GC.

Part of the problem is that the default GC is not the ConcMarkSweep collector
in most installations but rather it's the parallel "stop the world" GC that is
optimized for throughput, not latency. For web applications (or anything with
latency issues) step one is almost always to modify your JVM opts to enable
the CMS collector.

Unfortunately even the ConcMarkSweep collector can have issues. It's not that
hard to create a workload that it can't handle, resulting in your app slowly
eating through all available heap space because the collector can't keep up,
ultimately forcing a full collection which on large heaps can be devastating,
think 60 to 120 second pauses.

The G1 collector, which is no longer considered experimental in Java 7, is
'better' and you can specify a maximum pause time, but it can still run into
situations where it has to force a full GC and there are situations where the
CMS collector is still better.
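
For reference, the HotSpot flags involved are `-XX:+UseConcMarkSweepGC` for CMS (removed in later JDKs) and `-XX:+UseG1GC` with `-XX:MaxGCPauseMillis=<ms>` for G1. A small sketch, using the standard `java.lang.management` API, for checking which collectors a JVM actually ended up with. The reported names vary by JVM version and flags, so no output is shown here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class WhichGc {
    // Names of the collectors the running JVM has registered.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // One line per collector: name, collections so far, accumulated time.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running it once with default options and once with, say, `java -XX:+UseG1GC WhichGc` shows whether the flag changed anything on a given install.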

------
smackfu
If you are interested in this kind of stuff, Effective Java is a must-read
book. Long, good explanations of how to do stuff right.

Also, the 2nd edition covers up through Java 1.6.

~~~
stiff
I second this recommendation, actually even if you do not program in Java nor
on the JVM. Like no other book it taught me a particular kind of foresight -
to analyse the consequences of each small decision being made during
implementing some new thing and to design things in such a way that they are
least likely to cause problems now or in the future, to me or to other
programmers working on the project - my code is much more solid and less
error-prone ever since and it was definitely a big AHA moment for me. The
author, Joshua Bloch, designed some of the standard Java APIs and I actually
can't think of a better way to learn about those things than implementing an
API used by other people, which explains why he has so many interesting things
to say that are seldom found in other places. Here is also a very nice talk by
the author:

<http://www.infoq.com/presentations/effective-api-design>

------
js2
FWIW, last changed July 13, 1998 according to the source.

------
peeters
This appears to have been written before IDEs were invented. The author
suggests that it's acceptable to extend a class just to gain unqualified
access to its static methods to save a few characters of typing.

~~~
tptacek
Did you read "Scheme In One Class" at the link he attached to the paragraph
you're sniping at?

~~~
peeters
Yes I have, and in that he reiterates that it's just to achieve unqualified
access (i.e. save a few keystrokes, which is not an issue with any moderately
good IDE):

> I came up with the idea of putting the utility methods in their own class,
> SchemeUtils, and then having the five other top-level classes extend
> SchemeUtils. That way, I can use the unqualified name, and I get the
> modularity I want.

It's not a huge deal in his project since he obviously wasn't using OOP
anyhow, but it really is an abuse of inheritance with no benefit.
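
A small sketch of the two styles being argued about. `SchemeUtils` is the name from the quote; the `quote` helper and the other class names are hypothetical, made up for illustration. (A static import, available since Java 5, is the third option, but it needs a named package, so it is left out of this single-file example.)

```java
final class UnqualifiedAccessDemo {
    public static void main(String[] args) {
        System.out.println(PrinterByInheritance.show("car"));
        System.out.println(PrinterByQualifiedCall.show("car"));
        // Side effect of the inheritance style: the helper becomes
        // reachable through the subclass as well.
        System.out.println(PrinterByInheritance.quote("cdr"));
    }
}

// Hypothetical utility class; only the name comes from the quoted passage.
class SchemeUtils {
    static String quote(String s) {
        return "\"" + s + "\"";
    }
}

// Style from the quote: extend the utility class purely to drop the qualifier.
final class PrinterByInheritance extends SchemeUtils {
    static String show(String s) {
        return quote(s); // unqualified, found through inheritance
    }
}

// Alternative: no inheritance, one extra qualifier per call.
final class PrinterByQualifiedCall {
    static String show(String s) {
        return SchemeUtils.quote(s);
    }
}
```

Both `show` methods behave identically. The difference is that the first version spends the class's single superclass slot on a utility class and leaks the helpers into its own namespace.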

~~~
Drbble
What IDE will hide the qualifying class names of method calls? I was searching
for that yesterday, but the best I could do is syntax highlighting to make
classes display in a very faint color, but Eclipse can't distinguish type
declarations from method qualifiers from class declarations, nor will it fold
away the class names I want to hide, so it's not ideal.

Coding is 10% writing and 90% reading, so an IDE that only helps with writing
code is approximately worthless.

------
typicalrunt
_Java compilers are very poor at lifting constant expressions out of loops.
The C/Java for loop is a bad abstraction, because it encourages re-computation
of the end value in the most typical case. So for(int i=0; i <str.length();
i++) is three times slower than int len = str.length(); for(int i=0; i<len;
i++)_

Does anyone know if this is still valid advice? I recall reading somewhere
that the JIT has optimized around this problem.
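
For concreteness, the two loop shapes from the quote written out as complete methods. This is only a sketch of the forms being compared, with hypothetical method names; it shows that they compute the same thing and makes no claim about their relative speed.

```java
final class LoopBoundDemo {
    // Form from the quote: str.length() appears in the loop condition.
    static int countSpacesCallingLength(String str) {
        int count = 0;
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) == ' ') {
                count++;
            }
        }
        return count;
    }

    // Hand-hoisted form: the length is read once before the loop.
    static int countSpacesHoisted(String str) {
        int count = 0;
        int len = str.length();
        for (int i = 0; i < len; i++) {
            if (str.charAt(i) == ' ') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "a b c d";
        System.out.println(countSpacesCallingLength(s)); // prints 3
        System.out.println(countSpacesHoisted(s));       // prints 3
    }
}
```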

~~~
Someone
For this example, I would expect that any decent Java compiler optimizes this
away nowadays. That does introduce a dependency between compiler and standard
library, but the benefit is large enough, given that this is idiomatic code,
and that Java has it easy here because Java strings are immutable.

For general collections, things are more difficult, but I still expect java to
optimize things in many cases. I haven't checked, though.

A C/C++ compiler has a tougher job. It can only move strlen calls out of a
loop if it can determine that the strlen you call came from the correct
<header> file and that there is no defined behavior under which either the
pointer or its contents change inside the loop.

If it cannot do either, it must 'call' strlen each time through the loop (That
'call' can be inlined, especially if the compiler knows it is the standard
library's strlen)

~~~
mbell
The thing with Java is that the compiler (that is, javac) does very little. It
will replace constant variable accesses with the value (things that are
defined as static final.....) but that is about it. This is why it's very easy
to decompile Java class files back to source, complete with method and
variable names.

All the real optimization is done by the JIT, which usually doesn't 'kick in'
unless a segment of code gets called many times. In client mode I think the
default in the HotSpot VM is 1500, while in server mode it is higher (10000).
I generally manually set it to something around 200 for server apps.

As a result of this, the JIT may optimize this loop, it may not, it may
optimize in the middle of a test; it all depends on how often the code is
called. You can easily get different benchmark results by changing the
-XX:CompileThreshold JVM parameter and will see variability in performance
unless you 'warm up' the JIT forcing it to optimize.

------
typicalrunt
_But note the warning from Sun: "So when should you use static import? Very
sparingly!"_

I'm so glad there is an authoritative answer on this subject. IMHO, I hate the
use of static imports because it becomes difficult to know where a particular
method comes from... whether it's an instance method from the class itself or
one that is brought in through a static import.

And then there's the possibility of clobbering the namespace by including a
local variable name and a statically imported variable.
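
A minimal sketch of that clobbering, using `Math.PI` from the standard library as the statically imported name. The class and method names are hypothetical.

```java
import static java.lang.Math.PI;

final class StaticImportShadowDemo {
    // Uses the statically imported constant.
    static double areaImported(double r) {
        return PI * r * r;
    }

    // A local variable with the same name silently shadows the import;
    // the compiler gives no error for this.
    static double areaShadowed(double r) {
        double PI = 3.0;
        return PI * r * r;
    }

    public static void main(String[] args) {
        System.out.println(areaImported(2.0)); // uses Math.PI
        System.out.println(areaShadowed(2.0)); // prints 12.0
    }
}
```

Nothing at the call site says which `PI` is in play, which is exactly the readability cost described above.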

~~~
peeters
I don't like them in general either, but there's one place I use them
extensively: JUnit tests (statically import Mockito, Hamcrest, Assert, etc).

It mostly has to do with how much the library tries to be a DSL, and how much
readability will suffer if I don't use a static import. IMO, this

    assertThat(6, isGreaterThan(5));

reads a heckuva lot better than

    Assert.assertThat(6, Matchers.isGreaterThan(5));

Of course the combination of poor Java type inference and Generics-heavy
libraries make it so that often I have to do the latter to make use of angle
braces, but that's another story.
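
My reading of the "angle braces" remark, sketched with only the standard library: explicit type arguments on a generic method call are only legal after a qualifier, so a statically imported call cannot take them. The class and method names are hypothetical.

```java
import static java.util.Collections.emptyList;

import java.util.Collections;
import java.util.List;

final class TypeWitnessDemo {
    static int count(List<String> names) {
        return names.size();
    }

    // Explicit type argument: needs the qualifier, because
    // `<String>emptyList()` on its own is not valid syntax.
    static int viaQualifiedCall() {
        return count(Collections.<String>emptyList());
    }

    // Statically imported call: relies on inference. This compiles on
    // Java 8 and later; older compilers inferred List<Object> here and
    // rejected the call.
    static int viaStaticImport() {
        return count(emptyList());
    }

    public static void main(String[] args) {
        System.out.println(viaQualifiedCall()); // prints 0
        System.out.println(viaStaticImport());  // prints 0
    }
}
```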

~~~
ZitchDog
It really should be Assert.that(6, Is.greaterThan(7)), shouldn't it? Best of
both worlds!

------
KonradKlause
The page links also to a C IAQ: <http://www.seebs.net/faqs/c-iaq.html>

I've never seen that many wrong C statements. :-)

------
lucian1900
A little outdated, the one about HashMaps and equals is moot with generics.

Still quite interesting.

~~~
swah
It's not like Java is changing that much :)

Is this still true these days: "creating a new object is a fairly expensive
operation" ?

~~~
oconnor0
Creating a new object is never free, but it used to be far more expensive. To
my knowledge the original Java GCs weren't compacting & had to find the best
place in the heap to stick a new object. This required a linear search of the
available free blocks. This is obviously much slower than incrementing a
pointer at the edge of your used space. Unfortunately my search-fu has failed
me & I can't find corroboration of my claim.
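
A toy sketch of the difference between the two allocation strategies described above. These are hypothetical classes over plain integer offsets, not a model of any real JVM heap, and the free-list version uses first-fit rather than best-fit to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

final class AllocatorSketch {
    // Bump-pointer allocation: free space is one contiguous region,
    // so allocating is a bounds check plus an addition.
    static final class BumpArena {
        private final int capacity;
        private int top = 0;

        BumpArena(int capacity) {
            this.capacity = capacity;
        }

        // Returns the start offset of the new block, or -1 if full.
        int allocate(int size) {
            if (top + size > capacity) {
                return -1;
            }
            int start = top;
            top += size;
            return start;
        }
    }

    // Free-list allocation: scan the free blocks until one is big enough.
    static final class FreeListArena {
        private final List<int[]> free = new ArrayList<>(); // {start, length}
        int lastScanned = 0; // blocks inspected by the most recent allocate()

        FreeListArena(int[][] holes) {
            for (int[] hole : holes) {
                free.add(hole.clone());
            }
        }

        // First-fit: carve the request out of the first block that fits.
        // Emptied blocks are left in the list to keep the toy simple.
        int allocate(int size) {
            lastScanned = 0;
            for (int[] block : free) {
                lastScanned++;
                if (block[1] >= size) {
                    int start = block[0];
                    block[0] += size;
                    block[1] -= size;
                    return start;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        BumpArena bump = new BumpArena(1024);
        System.out.println(bump.allocate(16)); // prints 0
        System.out.println(bump.allocate(16)); // prints 16

        FreeListArena list = new FreeListArena(new int[][] {{0, 8}, {32, 8}, {64, 32}});
        System.out.println(list.allocate(16)); // prints 64
        System.out.println(list.lastScanned);  // prints 3
    }
}
```

The free-list cost grows with fragmentation, while the bump pointer stays constant. A compacting collector is what keeps the free space contiguous enough for the second style to work.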

My impression of Java these days is to start with the "right" thing (allocate
as needed, etc) and not to overly concern yourself with performance until it
becomes an issue. There are certainly situations where allocating lots of
objects (say billions or, maybe, millions) causes performance issues, but
creating object pools to preemptively avoid that is a lot of extra work that
may not be worth it.

