Parallel Hashmap (github.com/greg7mdp)
3 points by cr4zy on April 23, 2022 | 2 comments



With getpy[0], a Python wrapper, you can get over 200x faster map reads in parallel by passing a whole array of keys in one call instead of looping over them:

    In [1]: import numpy as np
       ...: import getpy as gp
    
    In [2]: key_type = np.dtype('u8')
       ...: value_type = np.dtype('u8')
    
    In [3]: keys = np.random.randint(1, 1000, size=10**2, dtype=key_type)
       ...: values = np.random.randint(1, 1000, size=10**2, dtype=value_type)
       ...: 
       ...: gp_dict = gp.Dict(key_type, value_type, default_value=42)
       ...: gp_dict[keys] = values
       ...: 
       ...: random_keys = np.random.randint(1, 1000, size=500, dtype=key_type)
    
    In [4]: %timeit random_values = gp_dict[random_keys]
    2.19 µs ± 11.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
    
    In [7]: %timeit [gp_dict[k] for k in random_keys]
    491 µs ± 3.51 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
[0] https://github.com/atom-moyer/getpy
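
For anyone without getpy installed, here is a rough sketch of the same batched-versus-looped idea using only NumPy. This is a stand-in, not how getpy or parallel-hashmap works internally: it assumes the keys are unique and sorted and uses `np.searchsorted` rather than a hash table, and `batch_lookup` is a hypothetical helper name. The point is only that one vectorized call can return the same answers as a per-key Python loop, including the default for missing keys.

```python
import numpy as np

def batch_lookup(sorted_keys, sorted_values, queries, default_value):
    """Look up every query in one vectorized pass (NumPy stand-in, not getpy)."""
    # Position where each query would be inserted into the sorted key array.
    idx = np.searchsorted(sorted_keys, queries)
    # Clamp so that queries larger than every key still index safely.
    idx = np.minimum(idx, len(sorted_keys) - 1)
    # A query is present only if the key at that position matches it exactly.
    found = sorted_keys[idx] == queries
    return np.where(found, sorted_values[idx], default_value)

rng = np.random.default_rng(0)
# np.unique both deduplicates and sorts the keys.
keys = np.unique(rng.integers(1, 1000, size=100, dtype=np.uint64))
values = rng.integers(1, 1000, size=keys.size, dtype=np.uint64)
queries = rng.integers(1, 1000, size=500, dtype=np.uint64)

# One batched call for all 500 queries.
batched = batch_lookup(keys, values, queries, default_value=42)

# The per-key equivalent with a plain Python dict, for comparison.
py_dict = dict(zip(keys.tolist(), values.tolist()))
looped = np.array([py_dict.get(int(q), 42) for q in queries], dtype=np.uint64)

assert np.array_equal(batched, looped)
```

Timing the two paths with `%timeit` should show the same kind of gap as above, though the exact ratio will depend on the machine and array sizes.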


Still the state of the art, beating folly and the Swiss tables on parallel benchmarks.

Even on single-threaded workloads, it's about 10x faster than std::unordered_map. My smhasher has the benchmark tables.



