

Vitesse Data: Postgres + LLVM - yangyang
http://vitessedata.com/

======
vtuulos
We have been running a similar setup (Postgres -> Foreign Data Wrappers ->
LLVM) at AdRoll for over a year. We keep 100+ TB of raw data in memory,
compressed.

We managed to build our solution mostly in Python(!) using Numba for JIT and a
number of compression tricks. More about it here:

[http://tuulos.github.io/pydata-2014/#/](http://tuulos.github.io/pydata-2014/#/)

[https://www.youtube.com/watch?v=xnfnv6WT1Ng](https://www.youtube.com/watch?v=xnfnv6WT1Ng)
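
To make the JIT part concrete, here's a minimal sketch of the general pattern
(not the actual AdRoll code; the column, sizes, and function names are made
up): keep a column compressed in RAM and let a Numba-compiled kernel do the
scan.

    # Minimal sketch, not the AdRoll code: one zlib-compressed in-memory column,
    # decompressed on demand and scanned by a Numba-JIT'd (LLVM-backed) kernel.
    import zlib
    import numpy as np
    from numba import njit

    @njit(cache=True)
    def count_over_threshold(values, threshold):
        # Tight loop that Numba compiles to native code via LLVM.
        n = 0
        for v in values:
            if v > threshold:
                n += 1
        return n

    # Keep the raw column compressed in memory...
    column = np.random.randint(0, 1000, size=1_000_000).astype(np.int64)
    blob = zlib.compress(column.tobytes())

    # ...and decompress + scan only when a query needs it.
    decoded = np.frombuffer(zlib.decompress(blob), dtype=np.int64)
    print(count_over_threshold(decoded, 900))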

~~~
BenoitP
I saw an instance of your talk at another venue. At the time I was learning
about bitmap indexes (because we needed OLAP capabilities on another database
engine that isn't Postgres). Your talk reminded me strongly of them, but in
the end I wasn't able to investigate the difference further.

How does your technology differ from bitmap indexes? Have you solved the
performance problem of updating random rows, for example?

~~~
cbsmith
Combining OLAP with OLTP (large aggregate queries + lots of real-time
updates) is the holy grail that Cassandra, for example, has addressed rather
nicely.

 _Compressed_ bitmap indexes are awesome. As with most indexes, the
random-row-update problem is best addressed with a log-structured merge tree,
amortizing your index updates: just keep an in-memory buffer of recently
updated rows.
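
Roughly (a toy illustration of the buffering idea, not any particular engine;
all names here are made up):

    # Toy sketch: random updates go to a small in-memory buffer and are merged
    # into the big sorted structure in batches, amortizing the expensive rewrites.
    import bisect

    class BufferedIndex:
        def __init__(self, flush_threshold=1000):
            self.base = []           # big sorted (key, value) list, stand-in for the real index
            self.buffer = {}         # recent random updates, cheap to absorb
            self.flush_threshold = flush_threshold

        def put(self, key, value):
            self.buffer[key] = value
            if len(self.buffer) >= self.flush_threshold:
                self.flush()

        def get(self, key):
            if key in self.buffer:   # recent writes win
                return self.buffer[key]
            i = bisect.bisect_left(self.base, (key,))
            if i < len(self.base) and self.base[i][0] == key:
                return self.base[i][1]
            return None

        def flush(self):
            # One batched merge instead of many random writes into the big structure.
            merged = dict(self.base)
            merged.update(self.buffer)
            self.base = sorted(merged.items())
            self.buffer.clear()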

If you are doing mostly sums & counts type work and can deal with some level
of inaccuracy, you can consider HyperLogLog...
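
A toy version to show the shape of it (no bias corrections, so not something
you'd ship):

    # Toy HyperLogLog: hash each item, low bits pick a register, the register keeps
    # the max "leading zeros + 1" seen; a harmonic mean over registers estimates cardinality.
    import hashlib

    class ToyHLL:
        def __init__(self, p=12):
            self.p = p
            self.m = 1 << p
            self.registers = [0] * self.m

        def add(self, item):
            h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
            idx = h & (self.m - 1)                         # low p bits pick a register
            rest = h >> self.p
            rank = (64 - self.p) - rest.bit_length() + 1   # leading zeros + 1 in the remaining bits
            self.registers[idx] = max(self.registers[idx], rank)

        def count(self):
            alpha = 0.7213 / (1 + 1.079 / self.m)
            return int(alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers))

    hll = ToyHLL()
    for i in range(100_000):
        hll.add(i)
    print(hll.count())   # roughly 100,000, give or take a couple of percent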

------
hendzen
And what is old is new again. For reference, the System R database used this
technique of compiling SQL queries to machine language back in the 1970s:

"The approach of compiling SQL statements into machine code was one of the
most successful parts of the System R project. We were able to generate a
machine-language subroutine to execute any SQL statement of arbitrary
complexity..."

See: [http://www.cs.berkeley.edu/~brewer/cs262/SystemR-comments.pdf](http://www.cs.berkeley.edu/~brewer/cs262/SystemR-comments.pdf)

------
florianfunke
If you're interested in query compilation, check out the HyPer DBMS prototype
at [1]. It compiles SQL (SQL-92 and more) into very fast machine code via LLVM
(faster than Vitesse). It's not open source, but you can download a binary [4] or try
it out online using the "WebInterface" link and read about the query compiler
here [2]. It also has lots of other goodies, like SIMD-powered CSV parsing
(about 4x faster than Postgres) [3]. Rumor has it that the prototype will be
commercialized soon ;)

EDIT: It also has NUMA-aware multi-threaded query execution.

[1] [http://www.hyper-db.com](http://www.hyper-db.com)

[2]
[http://www.vldb.org/pvldb/vol4/p539-neumann.pdf](http://www.vldb.org/pvldb/vol4/p539-neumann.pdf)

[3]
[http://www.vldb.org/pvldb/vol6/p1702-muehlbauer.pdf](http://www.vldb.org/pvldb/vol6/p1702-muehlbauer.pdf)

[4] [http://databasearchitects.blogspot.de/2014/05/trying-out-hyper.html](http://databasearchitects.blogspot.de/2014/05/trying-out-hyper.html)
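
(The SIMD-powered CSV parsing in [3] essentially finds delimiters and newlines
in bulk instead of byte-by-byte; a loose, vectorized approximation of that
idea in NumPy, not real SIMD intrinsics and not HyPer's code:)

    # Loose approximation: scan a whole chunk for record/field boundaries in bulk.
    import numpy as np

    chunk = b"1,alice,10\n2,bob,20\n3,carol,30\n"
    buf = np.frombuffer(chunk, dtype=np.uint8)

    newlines = np.flatnonzero(buf == ord("\n"))   # record boundaries, one vectorized pass
    commas = np.flatnonzero(buf == ord(","))      # field boundaries, likewise
    print(newlines, commas)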

~~~
sbrakerm
Looks like the Vitesse product was inspired by this research, even though
HyPer looks a lot more mature (despite still being a research project).
Looking forward to seeing the product.

------
willvarfar
Interesting and upvoted, even if it's just a sales pitch.

But a keep-the-sauce-secret approach won't win over techies! We want to
know how it works, why it works, what it can do, and what it can't!

Looking forward to the white papers and talks :)

PS the name is very close to Google's Vitess?

~~~
justinsb
To be fair, they've made it very clear how it works:

1) They have optimized the (existing?) CSV import code to use SSE
instructions for faster parsing.

2) For a sufficiently complicated query, they will compile it to native code
using LLVM. They presumably precompile most of Postgres (or at least the
execution parts) to LLVM IR, and then convert enough of the execution plan
into code so that the LLVM optimizer can optimize it (inlining, branch
prediction, dead code elimination, etc.). If they are able to persuade LLVM to
optimize the per-row decoding, I think that could be a huge win.
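
As an aside, the mechanism in 2) is easy to play with from Python via llvmlite
(the LLVM binding underneath Numba). A minimal sketch that JIT-compiles a
trivial one-column predicate, purely to show the idea, not how Vitesse does it:

    # Build LLVM IR for `i64 pred(i64 x) { return x > 100; }`, JIT it, and call it.
    import ctypes
    import llvmlite.ir as ir
    import llvmlite.binding as llvm

    llvm.initialize()
    llvm.initialize_native_target()
    llvm.initialize_native_asmprinter()

    i64 = ir.IntType(64)
    module = ir.Module(name="query")
    module.triple = llvm.get_default_triple()
    fn = ir.Function(module, ir.FunctionType(i64, [i64]), name="pred")
    builder = ir.IRBuilder(fn.append_basic_block("entry"))
    cond = builder.icmp_signed(">", fn.args[0], ir.Constant(i64, 100))
    builder.ret(builder.zext(cond, i64))          # return 0 or 1

    target_machine = llvm.Target.from_default_triple().create_target_machine()
    engine = llvm.create_mcjit_compiler(llvm.parse_assembly(str(module)), target_machine)
    engine.finalize_object()

    pred = ctypes.CFUNCTYPE(ctypes.c_int64, ctypes.c_int64)(engine.get_function_address("pred"))
    print(pred(42), pred(4242))   # 0 1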

It's a great accomplishment; I do wish they had contributed it to Postgres,
but I can't blame them for not doing so (they need to eat!).

~~~
ris
"I do wish they had contributed it to Postgres, but I can't blame them for not
doing so (they need to eat!)."

Stop this. The two are not mutually exclusive. Postgres developers are not
starving.

------
mfenniak
There are a few posts on the pgsql-hackers mailing list that have some more
detailed information, for anyone interested in reading more about the approach
taken: [http://www.postgresql.org/message-id/5CFE0CA1-E5CC-4CD1-9D0B-8D72143D81C2@vitessedata.com](http://www.postgresql.org/message-id/5CFE0CA1-E5CC-4CD1-9D0B-8D72143D81C2@vitessedata.com)

I think this is a fascinating approach to improving query speed. It certainly
won't be applicable to every use-case, but it seems like there's a lot of
value in it.

------
Thaxll
In my experience, I/O matters more for a DB than CPU. I've never seen the
24 cores of our servers struggle; the RAID 10, however...

~~~
justinsb
It really depends on your workload. If your working set fits into memory and
is read-mostly, your I/O is basically irrelevant. Even if you're not in
memory, if you are doing mostly sequential reads (like the OLAP queries they
are benchmarking), each drive can do ~200 MB/second, so it is not too difficult
to become CPU bound if you are doing any significant processing on each row
(particularly if your row size is small!).
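
The back-of-the-envelope version of that, with made-up but plausible numbers:

    # Rough arithmetic: when does a sequential scan become CPU bound?
    drive_bytes_per_s = 200 * 1_000_000        # one disk doing ~200 MB/s sequential reads
    row_bytes = 100                            # smallish rows
    rows_per_s_from_disk = drive_bytes_per_s / row_bytes      # ~2M rows/s off the disk

    cpu_ns_per_row = 1000                      # ~1 microsecond of decode + predicate + aggregate per row
    rows_per_s_per_core = 1_000_000_000 / cpu_ns_per_row      # ~1M rows/s on one core

    print(rows_per_s_from_disk, rows_per_s_per_core)   # the disk outruns a single core here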

For OLTP queries, particularly with working set > RAM or lots of writes, your
disk I/O is probably the bottleneck (your IOPS, actually, which is why SSDs
can be so valuable). Pretty sure they're not targeting that use case, though!

~~~
swasheck
In my experience (which isn't the sum total of all human experience), our OLAP
queries bottleneck at the fiber interconnect between the storage controller
and the OS. CPU still isn't the problem for our OLAP systems.

Your point that great strides have been made in storage, such that we could be
back to CPU as the bottleneck, is well taken though.

------
twic
There was a vaguely similar thing a while ago called PGStrom, which compiled
qualifiers into CUDA code that ran on the GPU:

[https://wiki.postgresql.org/wiki/PGStrom](https://wiki.postgresql.org/wiki/PGStrom)

Someone got similar performance just using multithreaded native code on the
CPU:

[http://blog.notapaper.de/article5.html](http://blog.notapaper.de/article5.html)

------
uberneo
Shard-Query does the same work of utilising all the available CPUs, but it's
for MySQL: [https://github.com/greenlion/swanhart-tools/tree/master/shard-query](https://github.com/greenlion/swanhart-tools/tree/master/shard-query)

------
yawniek
Isn't the transfer of data between memory and CPU usually the bottleneck, and
not the CPU itself?

e.g.
[http://www.pytables.org/docs/CISE-12-2-ScientificPro.pdf](http://www.pytables.org/docs/CISE-12-2-ScientificPro.pdf)

~~~
spacemanmatt
The most common bottleneck in my work experience has been disk I/O, when the
data set would not fit into memory. And every professional data set I've
worked with exceeds 1 TB. Perhaps there are other bottlenecks, but disk seeks
and (at one place) their iSCSI-over-gigabit-Ethernet nonsense dominated the
performance challenges.

------
peatmoss
I wonder how well this approach would extend to Postgres add-ons like PostGIS
and pgRouting?

~~~
cordite
As someone who has used PostGIS to map every city in the US to every county in
the US over the last two centuries for genealogical purposes, it would
probably make things nicer. (Some of the results are already in the Library of
Congress.)

------
bfrog
I would love to see how it's done one day, though I get the need to make back
the investment in time!

------
jeffdavis
Menu button doesn't seem to work in Firefox on Android.

