
A new fast hash table in response to Google’s new fast hash table - chewxy
https://probablydance.com/2018/05/28/a-new-fast-hash-table-in-response-to-googles-new-fast-hash-table/
======
antirez
Does this implementation stop for rehashing? We should stop taking seriously
hash tables that stop for rehashing, because the latency means they can be
used only for a subset of problems. This is one of the main reasons I'm now in
love with radix trees and am trying to get rid of hash tables at every level
in my code.
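(For context: Redis avoids the pause with incremental rehashing, keeping both the old and new bucket arrays alive and migrating a few buckets per operation so no single insert pays for a full rehash. Below is a toy sketch of that idea; names, sizes, and the migration step are invented for illustration, and this is not Redis code.)

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <utility>
#include <vector>

// Toy incremental-rehashing map (int -> int, duplicates not handled).
// Two bucket arrays coexist during a rehash; each operation migrates a
// couple of old buckets, bounding per-operation latency.
class IncrementalMap {
    using Bucket = std::list<std::pair<int, int>>;
    std::vector<Bucket> old_, new_;
    std::size_t migrated_ = 0;  // next bucket of old_ to move
    std::size_t size_ = 0;

    static std::size_t slot(int key, std::size_t n) {
        return std::hash<int>{}(key) % n;
    }

    bool rehashing() const { return !new_.empty(); }

    // Move at most `step` old buckets into the new table.
    void migrate_some(std::size_t step = 2) {
        while (rehashing() && step--) {
            if (migrated_ == old_.size()) {  // migration done: swap tables
                old_ = std::move(new_);
                new_.clear();
                migrated_ = 0;
                return;
            }
            for (auto& kv : old_[migrated_])
                new_[slot(kv.first, new_.size())].push_back(std::move(kv));
            old_[migrated_++].clear();
        }
    }

public:
    IncrementalMap() : old_(8) {}

    void insert(int key, int value) {
        migrate_some();
        if (!rehashing() && size_ + 1 > old_.size())  // start a lazy grow
            new_.assign(old_.size() * 2, Bucket{});
        auto& table = rehashing() ? new_ : old_;
        table[slot(key, table.size())].emplace_back(key, value);
        ++size_;
    }

    const int* find(int key) {
        migrate_some();
        for (auto* t : {&old_, &new_}) {
            if (t->empty()) continue;
            for (auto& kv : (*t)[slot(key, t->size())])
                if (kv.first == key) return &kv.second;
        }
        return nullptr;
    }

    std::size_t size() const { return size_; }
};
```

The trade-off is exactly the one discussed downthread: lookups during a rehash probe both tables, so you pay a small constant cost on every operation instead of one large pause.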

~~~
MaxBarraclough
Are radix trees _always_ better? I doubt it.

~~~
thecompilr
Depending on implementation and usage, radix trees can get extremely
segmented, slowing down significantly over time.

~~~
ztorkelson
Would you mind elaborating? I'm not quite sure what you mean by segmented
here.

~~~
MaxBarraclough
I'm not thecompilr, but if I understand them correctly:

In the worst case, the cache behaviour of a radix tree could be awful, if each
node lives in a different cache-line, no?

I imagine this problem can be ameliorated with a custom allocator, though.

Been a while since I've studied hash-maps in depth, but with a good hash
function, that wouldn't be a big worry, would it?

~~~
ztorkelson
Gotcha. Yes, while there are implementation techniques that can help (e.g.
arena allocation, prefix compression, adaptive node sizes), the data structure
is fundamentally a tree, and so one can expect a higher number of random
memory accesses (when compared with suitably optimized hash tables, like those
discussed in the article).

In some scenarios, radix trees can overcome that with better space efficiency
(i.e. implicit key storage), but I think it's fair to say that there are many
scenarios (if not most!) where hash tables outperform radix trees.

Of course, that all hinges on how you measure performance. As Salvatore
pointed out, if you're looking to minimize variance, hash tables might not be
suitable due to their growth characteristics. Radix trees also give you
efficient sorting and range scans out of the box, and are a viable immutable
persistent data structure as well. If none of those are meaningful to your use
cases, then hash tables could certainly be a better fit.

~~~
MaxBarraclough
Arena-allocation? As in (roughly speaking) a _Stack<Node>_ data structure
whose lifetime is bound to the lifetime of the tree, right?

How could we support deletion of nodes?

~~~
stochastic_monk
This is getting deep into some wizardry, but if you used a custom pool-like
allocator you could group contents close together in cache, and, if
necessary/appropriate, even rearrange the elements in the container to be more
cache-friendly for your purposes.
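A minimal sketch of such a pool, with deletion (the question above) handled by an intrusive free list, so freed slots are reused by later allocations. Names are invented for illustration; this is not taken from any real radix-tree implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A node as a radix tree might store it; next_free is only meaningful
// while the node sits on the pool's free list.
struct Node {
    int key = 0;
    Node* next_free = nullptr;
};

// Fixed-size chunk pool: nodes live in contiguous chunks (good locality),
// and free() is O(1) via an intrusive free list. The outer vector may
// reallocate, but inner vectors' element storage stays put, so Node*
// pointers remain stable.
class NodePool {
    static constexpr std::size_t kChunk = 1024;
    std::vector<std::vector<Node>> chunks_;
    std::size_t used_in_last_ = kChunk;  // forces a chunk on first alloc
    Node* free_list_ = nullptr;

public:
    Node* alloc() {
        if (free_list_) {               // reuse a freed slot first
            Node* n = free_list_;
            free_list_ = n->next_free;
            *n = Node{};                // reset to a fresh node
            return n;
        }
        if (used_in_last_ == kChunk) {  // start a new contiguous chunk
            chunks_.emplace_back(kChunk);
            used_in_last_ = 0;
        }
        return &chunks_.back()[used_in_last_++];
    }

    void free(Node* n) {                // O(1) delete: push onto free list
        n->next_free = free_list_;
        free_list_ = n;
    }
};
```

Deletion here never returns memory to the OS; the pool's footprint is its high-water mark, which is exactly the compaction trade-off raised in the reply below this subthread.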

~~~
thecompilr
You could, but the premise of using a tree was to avoid unpredictable
rehashing latency; if you start compacting the tree every now and then, you
basically pay the same price.

------
switch_kickflip
I am essentially a hash table fetishist, and I'm a little bummed out that this
doesn't go into more detail about how the damn thing works. How do you
implement a chaining hash table that uses a single flat array?
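One classical way to chain inside a single flat array is coalesced hashing: every slot stores the element plus a "next" index, and overflow elements are placed in free slots of the same array, linked by index instead of pointer. Here is a hedged sketch of that general idea; it is not the article's actual layout, which adds compact per-slot metadata (the "jump bits" mentioned below).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Fixed-capacity set of ints using coalesced hashing: collisions are
// chained through free slots of the same flat array via int32 indices.
class CoalescedSet {
    static constexpr int32_t kEmpty = -2;  // slot unused
    static constexpr int32_t kEnd = -1;    // slot used, end of chain
    struct Slot { int key; int32_t next = kEmpty; };
    std::vector<Slot> slots_;
    int32_t free_hint_;  // scan for free slots from the top down

public:
    explicit CoalescedSet(std::size_t capacity)
        : slots_(capacity), free_hint_(int32_t(capacity) - 1) {}

    bool insert(int key) {
        int32_t i = int32_t(std::hash<int>{}(key) % slots_.size());
        if (slots_[i].next == kEmpty) {        // home slot is free
            slots_[i] = {key, kEnd};
            return true;
        }
        for (;; i = slots_[i].next) {          // walk the existing chain
            if (slots_[i].key == key) return false;  // already present
            if (slots_[i].next == kEnd) break;       // i is the tail
        }
        while (free_hint_ >= 0 && slots_[free_hint_].next != kEmpty)
            --free_hint_;                      // find any free slot
        if (free_hint_ < 0) return false;      // table full
        slots_[free_hint_] = {key, kEnd};
        slots_[i].next = free_hint_;           // link into the chain
        return true;
    }

    bool contains(int key) const {
        int32_t i = int32_t(std::hash<int>{}(key) % slots_.size());
        if (slots_[i].next == kEmpty) return false;
        for (;; i = slots_[i].next) {
            if (slots_[i].key == key) return true;
            if (slots_[i].next == kEnd) return false;
        }
    }
};
```

Chains from different home buckets can merge when an overflow element lands in another key's home slot (hence "coalesced"), which is why such tables are typically kept below full load.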

~~~
wichert
You can look at the source in his github repository:
[https://github.com/skarupke/flat_hash_map](https://github.com/skarupke/flat_hash_map)

~~~
titzer
I looked through the code and o_O there is a lot of C++ boilerplate. The core
part of it is not documented well, and there is this mysterious jump bits
array... if I had more time, I could figure it out. But that's the point of a
writeup: explaining things with a diagram, etc.

~~~
jcelerier
> I looked through the code and o_O there is a lot of C++ boilerplate.

This boilerplate allows me to choose between std::unordered_map and
flat_hash_map across my whole codebase just by toggling a compile-time
switch. It ensures that the data structure has the same API as all other C++
containers.

~~~
titzer
I understand what it is for. YMMV, but for me C++ is pretty much a constant
mental stack overflow when reading guts like this. It seems like C++
libraries are getting more and more configurable in a way that hurts
readability and understandability, and library designers don't seem to
recognize there is a fundamental tradeoff here.

~~~
kwillets
I actually have separate implementations of an algorithm with and without
templates; the template one is for benchmarking with different combinations of
options; the non-template one is for readability, and it's only about 60
lines.

------
zawerf
Sort of off topic: In theory hash tables are supposed to be O(1) (under the
RAM model). But looking at these graphs, they seem to grow linearly after they
stop fitting in the cache which is especially apparent on the unsuccessful
lookup graph. Since it's on a log-scale graph, O(log(n)) would probably fit
better.

Just thought it's an interesting divergence between what's taught in school
and what happens in practice. I wonder if there's something more predictive
than big O for this type of analysis.

~~~
rb808
Interestingly, in Java 8 they changed HashMap buckets to hold a binary tree
instead of a linked list, changing worst-case bucket lookups from O(n) to
O(log n).

[https://stackoverflow.com/questions/24673567/change-to-hashmap-hash-function-in-java-8](https://stackoverflow.com/questions/24673567/change-to-hashmap-hash-function-in-java-8)

~~~
utopcell
To be more accurate, binary trees are constructed only after a bucket ends up
containing a large number of elements, which should only happen rarely unless
a bad hash function is chosen.
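The mechanism, sketched in C++ terms: a bucket starts as a linked list and is converted to a balanced tree once its chain grows past a threshold, bounding worst-case per-bucket lookups at O(log n). std::map (a red-black tree) stands in for the JDK's tree nodes; the names here are illustrative, though the JDK's TREEIFY_THRESHOLD is indeed 8.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <utility>
#include <variant>

// One hash-table bucket that "treeifies" itself when its chain gets long,
// in the spirit of Java 8's HashMap buckets.
struct Bucket {
    static constexpr std::size_t kTreeifyThreshold = 8;
    std::variant<std::list<std::pair<int, int>>, std::map<int, int>> repr;

    void put(int key, int value) {
        if (auto* lst = std::get_if<0>(&repr)) {
            for (auto& kv : *lst)
                if (kv.first == key) { kv.second = value; return; }
            lst->emplace_back(key, value);
            if (lst->size() > kTreeifyThreshold) {  // chain too long: treeify
                std::map<int, int> tree(lst->begin(), lst->end());
                repr = std::move(tree);
            }
        } else {
            std::get<1>(repr)[key] = value;  // already a tree
        }
    }

    const int* get(int key) const {
        if (auto* lst = std::get_if<0>(&repr)) {
            for (const auto& kv : *lst)
                if (kv.first == key) return &kv.second;
            return nullptr;
        }
        const auto& tree = std::get<1>(repr);
        auto it = tree.find(key);
        return it == tree.end() ? nullptr : &it->second;
    }
};
```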

------
tniemi
I personally prefer Judy arrays:
[http://judy.sourceforge.net/](http://judy.sourceforge.net/)

~~~
MaxBarraclough
From
[http://judy.sourceforge.net/doc/10minutes.htm](http://judy.sourceforge.net/doc/10minutes.htm)
:

> From a speed point of view Judy is chiefly a 256-ary digital tree or trie
> (per D. Knuth Volume 3 definitions). A degree of 256-ary is a somewhat
> "magic" N-ary for a variety of reasons -- but mostly because a byte (the
> least addressable memory unit) is 8 bits. Also a higher degree means reduced
> cache-line fills per access. You see the theme here -- avoid cache-line
> fills like the plague.

Sounds like a neat data-structure made with cache behaviour in mind. I'm not
getting much out of the creator's attempts to explain it, but there's a
Wikipedia article:
[https://en.wikipedia.org/wiki/Judy_array](https://en.wikipedia.org/wiki/Judy_array)

------
ebikelaw
Where's the source of this newer Google table? I can't find it or the
announcement of it.

~~~
utopcell
It is not open-sourced yet, but a presentation was given at CppCon 2017.

~~~
ebikelaw
But ... how does it appear in the benchmarks of this article?

~~~
Arcuru
The author claims to have written an implementation of it themselves.

From the article:

> This last one is my implementation of Google’s new hash table. Google hasn’t
> open-sourced their hash table yet, so I had to implement their hash table
> myself. I’m 95% sure that I got their performance right.

------
ilija139
I wonder how it compares to Google's "Learned Index Structures"
[https://arxiv.org/abs/1712.01208](https://arxiv.org/abs/1712.01208)

~~~
stochastic_monk
I don’t think there’s much to write home about RE: Learned Index Structures
since classical structures can soundly outperform at least the trumpeted
Google “success story” [1]. It’s hype, not substance.

[1]: [https://dawn.cs.stanford.edu/2018/01/11/index-baselines/](https://dawn.cs.stanford.edu/2018/01/11/index-baselines/)

~~~
jacksmith21006
Completely disagree. It is NOT simply about the results but about the
direction and a new approach.

Ultimately it also comes down to the power required to get some task done.

Also how can a paper submitted to NIPS be hype?

~~~
stochastic_monk
As an idea and an application of machine learning, it’s important and worth
exploring. Claiming it’s better is false. That’s a distinction worth making.

------
masklinn
Would be interesting to have comparisons with Rust's std hashmap and indexmap
(formerly ordermap) as well.

------
program_whiz
Props, this is an amazing contribution. So much respect for someone who can
really dig in, figure this stuff out, and make a better data-structure for
nothing more than the love of doing so. Everyone making their living as a
programmer/tech person owes it to people like this who make the basic tools.

~~~
utopcell
To be fair, the max load factor for this hash map is set to 0.5, which is
quite wasteful in terms of memory. If you allow for 50% load in
dense_hash_map (and most hash maps, for that matter), they become much faster
themselves.
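std::unordered_map exposes the analogous knob directly, which makes the memory cost concrete: the container keeps load_factor() at or below max_load_factor(), so halving the cap roughly doubles the bucket array needed for the same element count. A small helper to see it (hypothetical function name):

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Returns the bucket count std::unordered_map ends up with after inserting
// n elements under a given max load factor cap.
std::size_t buckets_for(std::size_t n, float max_load) {
    std::unordered_map<int, int> m;
    m.max_load_factor(max_load);  // the table rehashes to honor this bound
    for (std::size_t i = 0; i < n; ++i)
        m[int(i)] = int(i);
    return m.bucket_count();
}
```

With 100k elements, a 0.5 cap forces at least 200k buckets, versus at least 100k for a cap of 1.0.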

~~~
SolarNet
The hash map the author wrote uses 0.95; it's the Google ones that have a 0.5
load factor.

~~~
utopcell
Is this true? From the article: "ska::flat_hash_map has a max_load_factor of
0.5". ska::flat_hash_map is the author's implementation (but maybe not the
most recent iteration?)

~~~
jpetso
Right, ska::bytell_hash_map is the most recent iteration and
ska::flat_hash_map was the previous one.

------
utopcell
I like this work. We should be seeing more exploratory posts like this one.

I find it interesting that both Google's and skarupke@'s hash maps have
exactly the same name: flat_hash_map. Maybe Boost-inspired?

------
jbapple
This post has some overconfident and under-researched claims:

> This came about because last year I wrote the fastest hash table (I still
> make that claim)

I'm extremely dubious of claims of the "fastest $DATA_STRUCTURE" without
qualifiers about workload, hardware, and time/space trade-off. In particular,
the last graph of this post shows some workloads (400k int keys, int[16]
values, unsuccessful lookups) where the author's implementation of Swiss Table
(as Google called their table) is 3x-10x slower than "bytell_hash_map", the
author's own design.

> google_flat16_hash_map. This last one is my implementation of Google’s new
> hash table. Google hasn’t open-sourced their hash table yet, so I had to
> implement their hash table myself. I’m 95% sure that I got their performance
> right.

The talk announcing Swiss Table clearly didn't have enough detail to fully
replicate their results. This 95% claim is rather astonishing in light of
that.

~~~
coconutrandom
meta: classic HN, the top post is a Negative Nancy regardless of how
impressive the new thing is. Is there a name for this phenomenon yet?

~~~
vanderZwan
I know the type of post you're talking about - dismissive easy critiques like
"correlation does not..." - but GP's remark is not quite like that:

- it states the article has _some_ overconfident claims.

- it gives detailed arguments for why

> Is there a name for this phenomenon yet?

_"bachelors' wives and maidens' children are well taught"_?

