
Benchmarking hash table implementations: C++, Java, LuaJIT - hwh
http://lonewolfer.wordpress.com/2014/03/13/benchmarking-hash-table-implementations-in-different-languages/
======
hwh
Mike Pall (LuaJIT author and maintainer) makes the point that the case
presented isn't close to real applications:
[http://www.freelists.org/post/luajit/LuaJIT-benchmark,6](http://www.freelists.org/post/luajit/LuaJIT-benchmark,6)
First and foremost, according to Pall, "it's just measuring the overhead of
branch mispredictions and cache misses for an atypical use case of hash
tables."

~~~
arh68
If I could rerun this afternoon, what test case should I use? _What is_ a good
hash test? Mike says

 _Typical hash table accesses use a) string keys with b) a high temporal
correlation and c) they usually have a high hit-rate or a high miss-rate, but
it's rarely 50/50._

Surely there is a simple, simple routine that does all this. I want to see
this benchmark done right. If it's so simple, why hasn't anyone written it
down? The author updated:

 _If I have some extra time I'll test some variables later on: I will test
different hit rates and key types. If you have a reference to a source where
statistics around hash table use cases are investigated, I'll use that as a
basis._
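As a sketch of what such a routine might look like, here is a hypothetical workload generator in Java that follows Pall's three criteria: string keys, temporal correlation (queries favor a drifting window of recently used keys), and a skewed hit rate. The 90% hit rate, window size, and drift period are illustrative choices, not numbers from the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Random;

public class SkewedWorkload {
    // Build a query stream with temporal correlation and a skewed hit rate,
    // per Mike Pall's description of typical hash table use.
    public static List<String> queries(List<String> keys, int n, double hitRate, long seed) {
        Random rnd = new Random(seed);
        List<String> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            if (rnd.nextDouble() < hitRate) {
                // Temporal correlation: strongly prefer recently used keys by
                // sampling from a small window that drifts over time.
                int window = Math.max(1, keys.size() / 16);
                int base = (i / 64) % keys.size();          // window position drifts
                out.add(keys.get((base + rnd.nextInt(window)) % keys.size()));
            } else {
                out.add("miss-" + rnd.nextInt(1_000_000));  // key not in the table
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        HashMap<String, Integer> map = new HashMap<>();
        for (int i = 0; i < 10_000; i++) { keys.add("key-" + i); map.put(keys.get(i), i); }

        int hits = 0;
        for (String q : queries(keys, 100_000, 0.9, 42)) {
            if (map.containsKey(q)) hits++;
        }
        System.out.println("observed hit rate ~ " + (hits / 100_000.0));
    }
}
```

The lookup loop then runs against this stream instead of uniformly random keys, so the branch predictor and cache see something closer to the access pattern Pall describes.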

------
arielweisberg
When microbenchmarking Java it is a good idea to use a framework to make sure
you get an accurate result despite the compiler trying to eliminate your code.
In their case it looks like they are getting real numbers since Java and C are
the same order of magnitude, but blog posts like this are where people learn
benchmark hygiene.
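The hazard the framework guards against can be illustrated with a hand-rolled loop: if nothing consumes the result of the timed work, the JIT is free to eliminate it. A minimal sketch of the idea behind JMH's Blackhole.consume(), not a replacement for it:

```java
import java.util.HashMap;

public class DeadCodeDemo {
    // Naive timing loop: nothing consumes the lookups, so the JIT may remove
    // them entirely and the "benchmark" measures an empty loop.
    static long naive(HashMap<Integer, Integer> map, int iters) {
        long t0 = System.nanoTime();
        for (int i = 0; i < iters; i++) {
            map.get(i % 1000);   // result discarded -> candidate for elimination
        }
        return System.nanoTime() - t0;
    }

    // Guarded loop: fold every result into a sink that is published afterward,
    // the same idea JMH's Blackhole.consume() implements more robustly.
    static long guarded(HashMap<Integer, Integer> map, int iters, long[] sinkOut) {
        long sink = 0, t0 = System.nanoTime();
        for (int i = 0; i < iters; i++) {
            Integer v = map.get(i % 1000);
            if (v != null) sink += v;    // result stays live, work can't be removed
        }
        sinkOut[0] = sink;               // publish the sink so it is observable
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        HashMap<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < 1000; i++) map.put(i, i);
        long[] sink = new long[1];
        System.out.println("naive ns:   " + naive(map, 1_000_000));
        System.out.println("guarded ns: " + guarded(map, 1_000_000, sink)
                + " sink=" + sink[0]);
    }
}
```

JMH additionally handles warmup, forking, and statistics, which a hand-rolled harness like this does not.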

I recommend JMH which I believe is what was developed @ Oracle to handle this
task.

I used it to microbenchmark various Java maps for small map sizes and showed
some of the results. [https://groups.google.com/forum/#!topic/mechanical-
sympathy/...](https://groups.google.com/forum/#!topic/mechanical-
sympathy/dxbTH30_ok0)

I also found the discussion of various map implementation tradeoffs and
caveats useful. That Google group is a great resource.

One thing you have to keep in mind when benchmarking maps is locality and
cache affinity and how things will work outside the benchmark when things
don't fit in cache or have been evicted by your application.

In my case map lookups showed up in profiles relating to various lookup tables
that define logic and policy. There can be a surprising number of them spread
throughout a codebase, as well as some that are hit hard in tight loops.

------
hamidpalo
The biggest issue with this benchmark is the unrealistic data model and the
access pattern.

From my admittedly limited knowledge a better benchmark would involve:

\- String keys, because who uses int keys? It would be important to create a
new set of query keys for retrieving values, containing a mix of newly created
strings as well as older ones, to account for any hashcode memoization.

\- Non-primitive values. This would address the boxing overhead someone
mentioned and would be much closer to real-life usage.

\- Many more runs. Since data access patterns have performance implications
due to spatial locality, a larger number of runs would help even those out.
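On the hashcode-memoization point: String caches its hash after the first computation, so a query set that reuses the original key objects measures something different from one built out of fresh, equal strings. A small sketch of how a mixed query set could be built (the helper name and 50/50 split are illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

public class StringKeyQueries {
    // String.hashCode() is memoized after the first call, so querying with the
    // same String objects is cheaper than with freshly built equal ones.
    // Mixing both, as the comment suggests, keeps the benchmark honest.
    public static List<String> mixedQueries(List<String> original, double freshFraction) {
        List<String> out = new ArrayList<>(original.size());
        for (int i = 0; i < original.size(); i++) {
            String k = original.get(i);
            if (i % 100 < freshFraction * 100) {
                out.add(new String(k));   // equal but distinct object: hash recomputed
            } else {
                out.add(k);               // same object: cached hash reused
            }
        }
        return out;
    }

    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) { keys.add("key-" + i); map.put(keys.get(i), i); }

        int found = 0;
        for (String q : mixedQueries(keys, 0.5)) {
            if (map.get(q) != null) found++;   // equal strings hit regardless of identity
        }
        System.out.println("found " + found + " of " + keys.size());
    }
}
```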

Another question is, why? The ultimate choice of Java vs C++ won't depend on
how fast the standard hash tables are.

------
vsbuffalo
A really nice comparison of just C/C++ hash libraries is
[http://attractivechaos.wordpress.com/2008/08/28/comparison-o...](http://attractivechaos.wordpress.com/2008/08/28/comparison-
of-hash-table-libraries/)

Good discussion too. khash.h uses macros for C templating, but is quite fast.

------
pron
The Java benchmark probably shows mostly boxing overhead, which probably isn't
the bottleneck in most real applications. It is, therefore, more interesting
to test with strings.
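The boxing cost is easy to underestimate because small ints look free: by default Integer.valueOf() caches only -128..127 (the upper bound is tunable with a JVM flag), while every larger key boxed for a get() or put() is a fresh allocation. A quick demonstration:

```java
import java.util.HashMap;

public class BoxingDemo {
    public static void main(String[] args) {
        HashMap<Integer, Integer> map = new HashMap<>();
        map.put(100, 1);      // autoboxes via Integer.valueOf(100): cached object
        map.put(100_000, 2);  // autoboxes: fresh Integer allocation

        // Small ints come from the shared cache, so boxing is allocation-free...
        Integer a = 100, b = 100;
        System.out.println(a == b);           // true: same cached object

        // ...but every large int key boxed for a lookup is a new object.
        Integer c = 100_000, d = 100_000;
        System.out.println(c == d);           // false: distinct boxes
        System.out.println(c.equals(d));      // true: equal values, so get() works
        System.out.println(map.get(100_000)); // 2
    }
}
```

In a loop of millions of lookups with int keys, those per-call allocations (and the extra pointer chase into the box) are what the Java line on the graph is largely measuring.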

As another comment suggests, large maps are likely to be accessed
concurrently, and so a comparison of concurrent hash maps would also be
interesting.

~~~
blibble
he's also looking up the key twice (containsKey() and then get()), against
once in C++

~~~
rjzzleep
that's because in java you can't distinguish between no mapping found and an
explicit null value with the default hashmap implementation. in this
"benchmark" it would probably not matter though.

> A return value of null does not necessarily indicate that the map contains
> no mapping for the key; it's also possible that the map explicitly maps the
> key to null. The containsKey operation may be used to distinguish these two
> cases.

[http://docs.oracle.com/javase/6/docs/api/java/util/HashMap.h...](http://docs.oracle.com/javase/6/docs/api/java/util/HashMap.html#get\(java.lang.Object\))

~~~
loeg
Assuming you care about the distinction between null and no such key (for many
applications you can avoid storing null, IMO), you can actually perform the
containsKey() _after_ the get(), only for keys where get() returns null. That
way you see the perf overhead of only a single lookup if most values are
non-null.
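A sketch of that ordering, with a hypothetical helper: get() first, and fall back to containsKey() only when get() returns null, so the common non-null case pays for a single probe.

```java
import java.util.HashMap;
import java.util.Map;

public class SingleLookup {
    // One probe in the common case: containsKey() runs only when get()
    // returns null, i.e. for misses or explicitly stored null values.
    public static boolean lookup(Map<String, String> map, String key, String[] valueOut) {
        String v = map.get(key);
        if (v != null) {            // common case: present, non-null -> one probe
            valueOut[0] = v;
            return true;
        }
        if (map.containsKey(key)) { // rare case: key maps to null -> second probe
            valueOut[0] = null;
            return true;
        }
        return false;               // genuinely absent
    }

    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        m.put("a", "1");
        m.put("b", null);
        String[] out = new String[1];
        System.out.println(lookup(m, "a", out) + " " + out[0]); // true 1
        System.out.println(lookup(m, "b", out) + " " + out[0]); // true null
        System.out.println(lookup(m, "c", out));                // false
    }
}
```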

------
rjzzleep
i feel like there's something missing here. shouldn't there also be a graph on
how much memory usage there is in the 3 implementations?

~~~
lazyjones
That and probably at least 5-6 other popular languages to make it
worthwhile...

~~~
RyanZAG
Plus you're just testing the standard library implementations. If the hash
lookup was the major focus for your program, you would be using a specialized
hash map implementation.

eg, [http://trove.starlight-systems.com/](http://trove.starlight-systems.com/)
or [http://acs.lbl.gov/software/colt/](http://acs.lbl.gov/software/colt/) in
the case of the Java version.

EDIT: Here is the kind of performance differences you get between hash
implementations to demonstrate why you wouldn't just use the standard
implementations if it's a performance critical part:
[http://b010.blogspot.com/2009/05/speed-comparison-
of-1-javas...](http://b010.blogspot.com/2009/05/speed-comparison-of-1-javas-
built-in.html)

It shows the Colt int map running 3.5x faster, with 65% of the RAM usage of
the standard HashMap.

~~~
JoshTriplett
> Plus you're just testing the standard library implementations. If the hash
> lookup was the major focus for your program, you would be using a
> specialized hash map implementation.

If there's a better general-purpose hash table, the standard library hash
should be using it.

(Special-purpose, sure, use a custom data structure.)

~~~
RyanZAG
In the case of Java, there is a lot of inefficiency in the standard hash table
in the form of Entry<K, V> objects, which make it a bit nicer to use. There is
also a lot of extra state stored for some of the extra features offered.
Everything has a cost. If you just need a straight Int->Int or Int->Object
map, there are a lot of ways to make it far more efficient.
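As a toy illustration of that kind of specialization (not Trove's or Colt's actual design), an open-addressed int->int map can keep keys and values in two flat int arrays, with no Entry objects and no boxing; the reserved-sentinel-key trick below is a common trade-off in primitive collections.

```java
import java.util.Arrays;

public class IntIntMap {
    private static final int EMPTY = Integer.MIN_VALUE; // sentinel: this key is unusable
    private int[] keys, vals;
    private int size;

    public IntIntMap() {
        keys = new int[16];                 // capacity kept a power of two for masking
        vals = new int[16];
        Arrays.fill(keys, EMPTY);
    }

    private static int mix(int k) { return k * 0x9E3779B9; } // cheap hash scramble

    public void put(int key, int value) {
        if (key == EMPTY) throw new IllegalArgumentException("reserved key");
        if (size * 2 >= keys.length) grow(); // keep load factor at or below 1/2
        int mask = keys.length - 1;
        int i = mix(key) & mask;
        while (keys[i] != EMPTY && keys[i] != key) i = (i + 1) & mask; // linear probe
        if (keys[i] == EMPTY) { keys[i] = key; size++; }
        vals[i] = value;
    }

    /** Returns the value for key, or dflt when the key is absent. */
    public int getOrDefault(int key, int dflt) {
        int mask = keys.length - 1;
        int i = mix(key) & mask;
        while (keys[i] != EMPTY) {
            if (keys[i] == key) return vals[i];
            i = (i + 1) & mask;             // probe until an empty slot proves absence
        }
        return dflt;
    }

    public int size() { return size; }

    private void grow() {                   // double capacity and rehash everything
        int[] ok = keys, ov = vals;
        keys = new int[ok.length * 2];
        vals = new int[ok.length * 2];
        Arrays.fill(keys, EMPTY);
        size = 0;
        for (int i = 0; i < ok.length; i++)
            if (ok[i] != EMPTY) put(ok[i], ov[i]);
    }

    public static void main(String[] args) {
        IntIntMap m = new IntIntMap();
        for (int i = 0; i < 100_000; i++) m.put(i, i * 3);
        System.out.println(m.getOrDefault(42, -1));  // present key
        System.out.println(m.getOrDefault(-1, -1));  // absent key -> default
        System.out.println(m.size());
    }
}
```

Everything stays in two contiguous arrays, so a lookup touches at most a couple of cache lines instead of chasing an Entry pointer per collision, which is where much of the gap in that comparison comes from.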

The standard library doesn't contain these, because in 99% of apps, the actual
performance of the hash table is a very minor part and a small bit of overhead
for added convenience is worth it. However in a program where the hashmap is
the key point, you'd obviously choose to optimize it to fit your goals.

------
lonewolfer
khash is in part 2: [http://lonewolfer.wordpress.com/2014/03/17/benchmarking-
hash...](http://lonewolfer.wordpress.com/2014/03/17/benchmarking-hash-table-
implementations-part-2/)

~~~
srean
Now, you are never going to satisfy everyone with your choice of contenders
:) If you allow C libraries, it would be interesting to see where Judy falls
in that spectrum; it has quite a reputation for being fast, although the code
itself is quite impenetrable.

~~~
lonewolfer
Very true, that I will not. I might test Judy, though I suspect it won't be
fast enough to qualify.

------
taeric
I'm curious what the takeaway from this is. Specifically, I'm curious how well
these results translate to the use of a hash table.

If your application is even 80% giant hash table lookup, I would presume it is
doing so in a highly multithreaded environment. Caching systems, maybe?

Otherwise, I would think the times where this benchmark matters are few and
far between. If you are looking up so many items in a hash table that it shows
up in a profile, you are probably best off moving to a sorted list and just
iterating through the items as you need them. Right?

------
dognotdog
Weird how the higher-order (imo wrong) interpolation on the graphs immediately
gives the whole article a suspicious flair.

------
thrownaway2424
Since the two non-C implementations use a JIT, it would be interesting to see
results of an FDO build of the C program.

