I saw Microsoft's Garnet getting to the limits of (CPU) concurrent hashmap performance and thought it would be interesting to see if I could beat it with a GPU. Specifically, the GET throughput at 256M keys.
Of course, a few lines of hacky code doesn't get one very much, but it's enough to connect a to remote hashmap and throw integers around.
Redis got 1.5-1.7 Mop/s.
Maybe this would be particularly useful as an embedded KV store in another larger application where perhaps the limitations of the VRAM size are less important. My guess is the main beneficiaries of that would be ML training, but as they're already using the VRAM it might not work out there.