Show HN: Beating Microsoft's Garnet KV Store in less than 300 lines of code

danpalmer · 2024-03-23T18:22:45 1711218165

Nice demo! I wonder whether GPU acceleration is actually beneficial for real-world workloads, i.e. once you add network overhead, real-world data sizes, the more complex data structures that Redis/Garnet offer, etc.

Maybe this would be particularly useful as an embedded KV store in another larger application where perhaps the limitations of the VRAM size are less important. My guess is the main beneficiaries of that would be ML training, but as they're already using the VRAM it might not work out there.

winwang · 2024-03-24T02:05:20 1711245920

While network overhead always exists, the goal would be to take advantage of the seemingly 10x performance headroom on a GPU as compared to a CPU. Not to mention, GPUs are getting more and more HBM capacity.

Of course, you're right to wonder about how GPUs behave with more complex structures -- I'm not sure. Research papers seem to get pretty good results for stuff like skip lists and B+ trees? The general idea, though, is that GPU compute + bandwidth-optimized memory is better/more efficient for high-throughput compute, if you can stomach a couple tens of microseconds of added latency. Coincidentally, network latencies force you to stomach that anyway.
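
To make the latency-for-throughput trade concrete, here is a rough back-of-the-envelope sketch. The function names and every number passed in are hypothetical placeholders chosen for illustration, not measurements from the demo or from Garnet; the model also assumes only one batch is in flight at a time.

```python
def batched_kv_estimate(batch_size, kernel_latency_us, network_rtt_us):
    """Estimate throughput and end-to-end latency for a batched GPU lookup.

    Simplifying assumption: one batch in flight at a time, so throughput is
    batch_size operations per kernel launch.
    """
    throughput_ops_per_s = batch_size * 1e6 / kernel_latency_us
    end_to_end_us = network_rtt_us + kernel_latency_us
    return throughput_ops_per_s, end_to_end_us


def gpu_share_of_latency(kernel_latency_us, network_rtt_us):
    """Fraction of the end-to-end latency attributable to the GPU step."""
    return kernel_latency_us / (network_rtt_us + kernel_latency_us)


if __name__ == "__main__":
    # Hypothetical inputs: 1000 requests per batch, a 20 us GPU round trip,
    # and a 100 us network round trip.
    ops, latency = batched_kv_estimate(1000, 20.0, 100.0)
    share = gpu_share_of_latency(20.0, 100.0)
    print(f"{ops:.0f} ops/s, {latency:.0f} us end to end, GPU share {share:.0%}")
```

With inputs like these, the GPU step is a minority of the latency a client sees, which is the sense in which the network already forces you to stomach that cost. Plug in your own measured numbers before drawing any conclusion.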