Hacker News

You're thinking about transactional databases, and you're right. Transactional databases will probably not benefit hugely from a GPU. That's not to say it's impossible, but it's probably not worth the effort.

However, there are so many types of databases around. Lambda architectures are all the rage now - you keep one database for your transactions, and another for analytics. Analytics is huge, in the multi-billions of dollars every year, and it has become one of the most important parts of steering a business and deciding on new strategy. Larger businesses don't just 'go for it' anymore; they analyze, and inspect, and dig deep into their historical data to find out if something is worth doing.

GPUs tend to lend themselves well to analytics, contrary to transactions. Specifically, columnar databases. When the columns are all of the same data type, and the data locality is high, GPUs perform /very/ well.

Regarding your sorting point: you may not really want to sort everything, you got that bit right. But what if you want to perform a `JOIN` on a bunch of data?

It makes more sense to sort it first, because the JOIN would be much faster - matching keys would be much easier.

Now, if you perform a really fast SORT on a GPU, you save precious processing time.
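To make the sort-then-join point concrete, here is a minimal sort-merge join sketch in plain Python. Nothing here runs on a GPU; the two `sorted` calls are the step you would hand to a fast GPU sort, and the function name and sample rows are made up for illustration.

```python
def sort_merge_join(left, right):
    """Inner-join two lists of (key, value) rows by sorting both on key first."""
    # The two sorts are the expensive part a GPU sort would take over.
    left = sorted(left, key=lambda r: r[0])
    right = sorted(right, key=lambda r: r[0])

    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of rows on the right that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == lk:
                for k in range(j, j_end):
                    out.append((lk, left[i][1], right[k][1]))
                i += 1
            j = j_end
    return out


# Hypothetical sample data.
orders = [(3, "pen"), (1, "ink"), (3, "pad")]
users = [(1, "ann"), (2, "bob"), (3, "cy")]
joined = sort_merge_join(orders, users)
```

Once both sides are sorted, matching keys is a single linear pass over each input, which is why the sort dominates the cost.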




Doesn't the overhead of moving things back and forth between GPU memory and main memory wipe out most potential gains, though?

If you're running analytical workloads on big data sets, you're typically I/O bound to start with. It seems like managing moving little pieces of it back and forth to the GPU to compute is going to be a big PITA, add lots of little latencies, and gain you absolutely nothing. What am I missing there?


1. Not everything needs to be pushed up to the GPU. Some things are better left in RAM.

2. What if you only push indexes or similar up to the GPU, like an AB-tree index? You're keeping all of the 'heavy' stuff down, and only uploading a representation of it, to be later replaced with the actual data.

3. Think compression/decompression done on the GPU directly.
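A rough sketch of points 2 and 3, in plain Python as a stand-in (nothing here actually touches a GPU). The row layout, the field names and the choice of run-length encoding are all assumptions for illustration, not what any particular GPU database does.

```python
# Hypothetical table: small keys, heavy payloads.
rows = [
    {"id": 42, "payload": "x" * 1000},
    {"id": 7, "payload": "y" * 1000},
    {"id": 19, "payload": "z" * 1000},
]

# Point 2: ship only a compact (key, row position) representation.
compact = [(row["id"], pos) for pos, row in enumerate(rows)]
# This sort is the step that would run on the device.
compact_sorted = sorted(compact)
# Back on the host, swap the row positions for the actual data.
ordered_rows = [rows[pos] for _, pos in compact_sorted]


# Point 3: a simple compression scheme (run-length encoding) so that
# fewer bytes cross the bus; decoding is the part you'd do on the GPU.
def rle_encode(column):
    runs = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def rle_decode(runs):
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out
```

The payloads never leave main memory in this sketch; only the keys and positions would be transferred.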


At Blazing we also build a GPU database and have always loved what this project (Alenka) is doing. First of all, when you talk about being I/O bound, which I/O do you mean? From disk? From RAM? There are many ways of getting around some of these I/O bottlenecks, like sending compressed data or processing while transferring. Your assumption that these workloads are typically I/O bound is correct, but then again GPU databases aren't always going after the most "typical" workloads. If you are doing large amounts of transformations or complicated joins, then you can also benefit hugely from the use of a GPU. Ever try to join several tables together across multiple columns? If you do, then you should probably use a hash join, and if you are using a hash join you better believe you are going to want to be doing computationally intensive things like sorting and hash generation. Have you tried any GPU databases to see if this concern is valid? GPU databases can take advantage of things like very expensive cascading compression that many normal databases can't.
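For readers who haven't written one, a hash join across multiple columns looks roughly like this. It is a plain-Python sketch, not Blazing's or Alenka's implementation; the function name, column names and sample rows are hypothetical. The hashing of the composite key is the per-row work that parallelizes well.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key_cols):
    """Inner-join two lists of dict rows on several columns at once."""
    # Build phase: hash every row of the (ideally smaller) table
    # on its composite key.
    table = defaultdict(list)
    for row in build_rows:
        table[tuple(row[c] for c in key_cols)].append(row)

    # Probe phase: hash each row of the other table and look it up.
    out = []
    for row in probe_rows:
        key = tuple(row[c] for c in key_cols)
        for match in table.get(key, []):
            out.append({**match, **row})
    return out


# Hypothetical sample data joined on (region, year).
sales = [
    {"region": "eu", "year": 2016, "sales": 10},
    {"region": "us", "year": 2016, "sales": 20},
]
costs = [
    {"region": "eu", "year": 2016, "cost": 4},
    {"region": "eu", "year": 2017, "cost": 5},
]
result = hash_join(sales, costs, ["region", "year"])
```

Only the `("eu", 2016)` rows match on both columns, so `result` holds a single merged row.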


Different approaches and solutions for transactional processing and analytics, decision-making informed by data analysis, all that's been around for decades along with its own silly jargon - OLAP, OLTP, data mining...


> Transactional databases will probably not benefit hugely from a GPU.

What about parallel queries over a Restriction-Union normalized data model?

I think the benefits would be similar in nature to columnar stores as you note:

> GPUs tend to lend themselves well to analytics, contrary to transactions. Specifically, columnar databases. When the columns are all of the same data type, and the data locality is high, GPUs perform /very/ well.


Maybe, but that's a very specific situation. You don't typically create a full RDBMS for a specific data model.



