
Show HN: A concurrent thread-safe hash map implemented in C++ - kshk123
https://github.com/kshk123/hashMap
======
yosefo
I feel that using a linked list as the underlying structure combined with
fine-grained locking is the wrong direction to go. In what scenario would I
use this collection over an unordered_map and some locking scheme? A linked
list is a very (very, very) poor performer, and I would only use one to
implement a collection synchronized with CAS operations instead of locking.
Since your implementation relies on linked lists and fine-grained locking, I
have a hard time imagining a scenario where I would gain utility. I would
start by wrapping std unordered_map and shared_mutex, testing and measuring
the performance as a baseline, and only then create an alternative that could
compete. If you want to make it interesting, at least make the
synchronization lock-free and gain some advantage from using a linked list.

~~~
kshk123
Thanks for your feedback. The linked lists can definitely be replaced, and I
am working on finding a suitable data structure for the individual buckets.
However, an "unordered_map" with one global mutex will not achieve what I am
trying to do, i.e., allow multiple threads to write into the same hash map
simultaneously, unless two (or more) of them collide on the same bucket at
the same time.

~~~
yosefo
It's true that a single global mutex will not provide that locking
granularity, but unordered_map does provide a bucket API that lets you find a
bucket by key. This means that for a pair p, you can find the relevant bucket
at index i, lock mutex i, and insert into the map. The only problem you need
to deal with is rehashing, which can be solved by wrapping the hash function
with a modulo operation and playing around with the max_load_factor. (I could
write a short sample later; it would be a very thin wrapper.)
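In the absence of that sample, here is a minimal sketch in the spirit of the
suggestion, with one deviation: rather than unordered_map's bucket API (where
one thread's insert can trigger a rehash underneath another thread's read),
it uses lock striping with one small map per stripe. All names are
illustrative:

```cpp
// Lock-striping sketch (assumption: shard-per-lock rather than
// unordered_map's bucket API, to avoid concurrent-rehash issues).
#include <array>
#include <functional>
#include <mutex>
#include <unordered_map>

template <class K, class V, std::size_t N = 16>
class StripedMap {
public:
    void insert(const K& key, const V& value) {
        Shard& s = shard(key);
        std::lock_guard<std::mutex> g(s.mtx);  // lock only this stripe
        s.map[key] = value;
    }

    bool find(const K& key, V& out) const {
        const Shard& s = shard(key);
        std::lock_guard<std::mutex> g(s.mtx);
        auto it = s.map.find(key);
        if (it == s.map.end()) return false;
        out = it->second;
        return true;
    }

private:
    struct Shard {
        mutable std::mutex mtx;
        std::unordered_map<K, V> map;
    };
    std::array<Shard, N> shards_;

    // Route a key to its stripe by hash; writers to different stripes
    // never contend.
    Shard& shard(const K& key) {
        return shards_[std::hash<K>{}(key) % N];
    }
    const Shard& shard(const K& key) const {
        return shards_[std::hash<K>{}(key) % N];
    }
};
```

This keeps the "writers only collide on the same bucket" property the author
wants, while each shard's rehashing stays safely behind its own mutex.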

------
quinnftw
A few issues with this implementation:

Is it really necessary to allocate all of the hash buckets up front? Why not
initialize them to null and then allocate them as you need them?

Why not use some kind of managed container like std vector to store the hash
buckets? This removes the need to do explicit memory management.

Why not use std forward_list instead of rolling your own linked list? Again,
this will reduce the amount of code and save you from having to do manual
memory management.
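A rough sketch of what those points could look like together, with locking
left out for brevity (all names hypothetical): the buckets live in a
std::vector of std::forward_list, so no manual new/delete is needed.

```cpp
// Bucket table sketch using std containers only (hypothetical names;
// no locking shown -- per-bucket mutexes could be added alongside).
#include <forward_list>
#include <functional>
#include <utility>
#include <vector>

template <class K, class V>
class BucketTable {
public:
    explicit BucketTable(std::size_t n = 16) : buckets_(n) {}

    void insert(const K& key, const V& value) {
        auto& b = bucket(key);
        for (auto& kv : b) {
            if (kv.first == key) { kv.second = value; return; }  // update
        }
        b.emplace_front(key, value);  // new entry; list manages memory
    }

    const V* find(const K& key) const {
        for (const auto& kv : bucket(key)) {
            if (kv.first == key) return &kv.second;
        }
        return nullptr;
    }

private:
    using Bucket = std::forward_list<std::pair<K, V>>;
    std::vector<Bucket> buckets_;

    Bucket& bucket(const K& key) {
        return buckets_[std::hash<K>{}(key) % buckets_.size()];
    }
    const Bucket& bucket(const K& key) const {
        return buckets_[std::hash<K>{}(key) % buckets_.size()];
    }
};
```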

~~~
kshk123
Thank you very much for your review comments. Please find my answers below.
All the hash buckets are allocated up front so that there is no need for a
global lock. If the buckets were allocated lazily, a hashmap-level lock would
be needed, which would hamper the concurrency of the map. Yes, using the std
containers would make life easier. I was trying to have my own data
structures in place, but there is no particular reason for it; I might move
to the std containers in a later version.

------
brudgers
Curious if there are any tests.

~~~
kshk123
Yes, HastTest.cpp in the "src" directory has a number of test cases.

------
aerioux
This doesn't work for collisions.

~~~
kshk123
Can you please elaborate? If there is a bug, I would definitely like to know
about it and fix it.

