
Postgres on the GPU - Rickasaurus
http://wiki.postgresql.org/wiki/PGStrom
======
cs702
If I understood correctly, this module lets you export data to, and then
access (read-only), foreign tables architected specifically for faster copying
of data to/from the GPU, allowing you to speed up queries that benefit from
GPU computation by 10-20x. Nice.

On the surface, this seems very similar to
<https://news.ycombinator.com/item?id=5592886> except it's nicely integrated
with Postgres. Or am I missing something?

tmostak?

~~~
masklinn
> this module allows you to export data to, and then access (read-only),
> foreign tables architected specifically for faster copying of data to/from
> GPU

That seems to be pretty much it, although the read-only part is a limitation
of Postgres rather than a feature of the module.

Furthermore, it seems to dispatch queries intelligently, so you can perform
all queries against the FDW table with (I hope) minimal overhead: if the
qualifier can't be compiled to a kernel, the FDW will run it as a normal
on-CPU qualifier. That's a thoughtful touch.
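That fallback can be sketched roughly like this (a minimal illustration only; the function names `compile_to_kernel` and the scan shape are hypothetical, not PG-Strom's actual API):

```python
# Sketch of the dispatch logic described above: try to compile a qualifier
# to a GPU kernel, and fall back to row-by-row CPU evaluation if that fails.

def scan(rows, qualifier, compile_to_kernel):
    kernel = compile_to_kernel(qualifier)   # returns None if not GPU-compilable
    if kernel is not None:
        return kernel(rows)                 # bulk evaluation on the GPU
    return [r for r in rows if qualifier(r)]  # normal on-CPU qualifier

# A qualifier the "compiler" rejects, forcing the CPU path:
result = scan([1, 2, 3, 4, 5], lambda r: r % 2 == 0, lambda q: None)
# result == [2, 4]
```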

------
rektide
PgOpenCL takes a different approach- instead of a Foreign Data Wrapper (FDW)
that exposes tables, it's a language for writing postgres functions in, where
the language is just opencl. <http://www.slideshare.net/3dmashup/pgopencl>

Alas, Tim's been talking about this for two years now, and as far as we know
he's the only one who's ever seen the code.

~~~
Rickasaurus
Very cool. This should be upvoted more.

------
hippich
NVidia's CUDA only (for now?)

Can anyone explain why open-source projects embrace CUDA over OpenCL? As I
understand it, OpenCL is a more generic API which could potentially be used
with both CPUs and GPUs.

~~~
guard-of-terra
Another question, is CUDA actually fast enough?

Because in Bitcoin mining, it's always ATI/AMD and OpenCL, since ATI cards
are like ten times faster than Nvidia cards. This is due to architecture
differences.

Does it not affect this postgres table-scanning task? I wonder if they did any
benchmarks.

~~~
duskwuff
AMD's advantage in Bitcoin mining was purely due to an architectural quirk:
their shader cores supported bitwise rotation, but Nvidia's didn't. Bitwise
rotation is a rare instruction outside of certain crypto algorithms (like
SHA256!), so this really means very little for general-purpose performance.

<http://www.extremetech.com/computing/153467-amd-destroys-nvidia-bitcoin-mining>
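For context, rotation shows up constantly in SHA-256's compression function. A minimal sketch (the rotation counts in `big_sigma0` come from the SHA-256 specification; the Python model is purely illustrative):

```python
# On hardware without a rotate instruction, each 32-bit rotate-right costs
# two shifts plus an OR; where a rotate exists it is a single instruction.

MASK = 0xFFFFFFFF  # keep results in 32 bits

def rotr(x, n):
    # The shift-based fallback for rotate-right.
    return ((x >> n) | (x << (32 - n))) & MASK

def big_sigma0(x):
    # SHA-256's Sigma0 function: three rotations per call, and the
    # compression function evaluates several such functions per round.
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
```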

~~~
mrb
_"AMD's advantage in Bitcoin mining was purely due to an architectural quirk"_

False. I authored a Bitcoin miner utilizing this quirk (bit_align). I was also
the first to leverage another instruction exclusive to AMD (bfi_int):
<https://bitcointalk.org/?topic=2949>. bit_align "only" gave AMD a 1.7x
advantage over Nvidia. The biggest perf gains (2x-3x!) came from the fact that
AMD GPUs have more execution units:
<https://en.bitcoin.it/wiki/Why_a_GPU_mines_faster_than_a_CPU#Why_are_AMD_GPUs_faster_than_Nvidia_GPUs.3F>
(I also authored this section of the wiki.)
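For readers unfamiliar with bit_align: it is a funnel-shift style instruction that extracts a 32-bit window from a 64-bit concatenation of two words, so passing the same word twice yields a rotate in one instruction. A minimal software model (illustrative only; the operand order here is an assumption, not AMD's documented encoding):

```python
MASK = 0xFFFFFFFF  # keep results in 32 bits

def bit_align(hi, lo, n):
    # Software model of the funnel shift: take the low 32 bits of the
    # 64-bit value (hi:lo) shifted right by n. One instruction in hardware.
    return (((hi << 32) | lo) >> n) & MASK

def rotr_generic(x, n):
    # The two-shifts-plus-an-OR fallback on hardware without a rotate.
    return ((x >> n) | (x << (32 - n))) & MASK

# bit_align(x, x, n) == rotr_generic(x, n) for 32-bit x and 0 < n < 32.
```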

------
rarrrrrr
Brilliant!

Perhaps it's just my scars showing, but I'm concerned about database system
stability with active GPU hardware added to the box.

I probably wouldn't add the GPU hardware to a master but rather do the queries
that would benefit from it on a streaming replica, where an occasional kernel
panic won't be so severe.

Regardless, looking forward to trying it.

~~~
jonknee
It's read only...

~~~
oakwhiz
I think that he is referring to the tendency of GPU drivers to crash. Even if
the DB is read-only from the GPUs' standpoint, if the DB goes down because of
faulty drivers it's still a problem.

~~~
NuZZ
On the flip side, if this is sufficiently adopted, it could give driver
developers motivation, and thus lead to improved drivers. Perhaps this and the
Linux gaming movement could mean some symbiosis for driver development.

I'm ignoring Windows, as I guess I don't really take Windows servers too
seriously.

------
metrix
Postgres on the GPU just screams HSA <http://hsafoundation.com> and, more
importantly, AMD and their upcoming Kaveri processor, which will significantly
reduce the latency of performing GPU calculations since the CPU and GPU will
share the same cache.

------
DigitalTurk
Really, really cool!

I've toyed with the idea of doing pattern matching (and graph rewriting) on
the GPU before but this looks like it's much more advanced than I thought was
feasible.

I'm surprised they went with CUDA instead of OpenCL though. CUDA is
proprietary Nvidia technology and does not work on non-Nvidia devices.

------
perlgeek
All the examples seem to use numbers (integers and floats). It would be
interesting to see if it can work efficiently with variable-width strings,
which are the main workload I encounter. But even if not, I can see the value
for working with lots of numeric data.

~~~
hippich
GPUs are good at calculating stuff (Nvidia's in particular at calculating
floats). I don't see a reason to do text processing on a GPU.

~~~
qb45
GPUs have been used for fuzzy string matching [1] and worked well thanks to
their huge memory bandwidth.

[1] <http://en.wikipedia.org/wiki/Smith%E2%80%93Waterman_algorithm>
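For reference, a minimal Smith-Waterman sketch (the algorithm linked above). The scoring parameters are arbitrary choices; real GPU implementations exploit the fact that the cells of each anti-diagonal of the score matrix are independent and can be filled in parallel:

```python
# Smith-Waterman local alignment: fill a score matrix with a simple
# recurrence and track the best cell seen anywhere in the matrix.

def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]  # score matrix, first row/col stay 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            # Local alignment: scores never go below zero.
            h[i][j] = max(0, diag, h[i-1][j] + gap, h[i][j-1] + gap)
            best = max(best, h[i][j])
    return best  # score of the best local alignment
```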

------
jaakl
The query sample is interesting: it seems to find objects (locations) near a
given point. But there is a CPU-only solution with 1000x faster response
times: index the data properly with PostGIS. So where before you threw more
CPU/RAM at a problem you didn't know how to make faster in a smart way, now
you throw more GPU at it. Still, there can certainly be cases where a
GPU-based solution is a better alternative than traditional existing solutions
like specialized indexes.

------
spitfire
I notice the GPU load time is about 53ms. This is with a discrete graphics
card, so I wonder how an integrated APU will affect it. I can imagine the
overhead there being virtually nil.

The long term trend is for the GPU to merge with the CPU, so I think we'll see
more of this in the future.

------
agnsaft
This is cool. However, Postgres could probably achieve higher performance
without the GPU as well if it added CPU concurrency for certain types of
operations (e.g. aggregation, sorting). That would be a killer feature.

------
kyhoolee
Interesting. I'm also currently working with GPUs.

------
DiabloD3
I don't understand why people keep writing new code for hardware that most
people don't own.

It's written in CUDA, not OpenCL. Stop that.

~~~
ketralnis
Yeah, why do people keep solving the problems they have rather than the
problems you wish they had? Bastards.

