
How to Architect a Query Compiler, Revisited [pdf] - lainon
https://www.cs.purdue.edu/homes/rompf/papers/tahboub-sigmod18.pdf
======
rwhaling
It's always nice to see the papers that come out of Tiark Rompf's group. For
background, here are some other great ones:

Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation
and Compiled DSLs
[https://infoscience.epfl.ch/record/150347/files/gpce63-rompf...](https://infoscience.epfl.ch/record/150347/files/gpce63-rompf.pdf)

Collapsing Towers of Interpreters:
[https://www.cs.purdue.edu/homes/rompf/papers/amin-
popl18.pdf](https://www.cs.purdue.edu/homes/rompf/papers/amin-popl18.pdf)

LMS-Verify: Abstraction Without Regret for Verified Systems Programming:
[http://lampwww.epfl.ch/~amin/pub/lms-
verify.pdf](http://lampwww.epfl.ch/~amin/pub/lms-verify.pdf)

There's a ton more. Nada Amin's talks on the last two papers are excellent
also:

[https://www.youtube.com/watch?v=QuJ-
cEvH_oI](https://www.youtube.com/watch?v=QuJ-cEvH_oI)
[https://www.youtube.com/watch?v=Ywy_eSzCLi8](https://www.youtube.com/watch?v=Ywy_eSzCLi8)

Oh, and I forgot my favorite: Functional Pearl: A SQL to C Compiler in 500
Lines of Code [https://www.cs.purdue.edu/homes/rompf/papers/rompf-
icfp15.pd...](https://www.cs.purdue.edu/homes/rompf/papers/rompf-icfp15.pdf)

Some really mind-blowing stuff.

~~~
zokier
Worth pointing out that "Lightweight Modular Staging: A Pragmatic Approach to
Runtime Code Generation and Compiled DSLs" is also co-authored by Martin
Odersky of Scala fame. Definitely looks like good stuff.

~~~
tiarkrompf
No surprise here, as Martin was my PhD advisor :)

------
tiarkrompf
Hi all,

Author here, great to see our paper on the front page today. Happy to answer
any questions!

Since Spark was already mentioned in the comments, you might be interesting in
the parallel/follow-on work we did applying similar techniques to accelerate
Spark:

"Flare: Optimizing Apache Spark for Scale-Up Architectures and Medium-Size
Data" (to appear at OSDI, preprint here:
[https://www.cs.purdue.edu/homes/rompf/papers/essertel-
osdi18...](https://www.cs.purdue.edu/homes/rompf/papers/essertel-osdi18.pdf))

If you're running Spark in production and are interested in trying it out,
please leave your contact data here:
[https://flaredata.github.io](https://flaredata.github.io). We're a small team
and have a long waitlist, but should have some open slots in our private beta
program this Fall.

Cheers, \- Tiark

------
chrisaycock
I'm glad to see the paper specifically contrast iterator approaches ( _pull_
model) versus a "data-centric" approach ( _push_ model). This was indeed the
issue I've seen with Apache Spark's RDD.

Imagine a common pipeline:

    
    
      - input data
      - process data
      - output data
    

Using iterators, the output logic loops over an abstract next() interface
provided by the process logic, which in turn loops over another abstract
next() interface provided by the input logic. Given a long enough chain of
dependent processes, the CPU's cache will be obliterated.

If we were to write code in a regular programming language by hand, we would
never use such an inverted model. Instead, we would directly input the data,
process it, and then output it. That's exactly what the paper proposes; the
authors supply a callback function at the major stages of query evaluation.
They use this model in the context of writing a compiler to speed query
execution.

It's worth noting that the Spark maintainers also realized that iterators have
bad performance and that newer versions of Spark follow the compiler approach:

[https://databricks.com/blog/2016/05/23/apache-spark-as-a-
com...](https://databricks.com/blog/2016/05/23/apache-spark-as-a-compiler-
joining-a-billion-rows-per-second-on-a-laptop.html)

~~~
atombender
As someone working on exactly such a query engine, this is very interesting,
thanks.

It's worth noting that Rust uses iterators extensively, yet is able to get
around much of the issues with the pull model by being able to optimize and
inline iteration loops.

I believe its starting point for getting good performance here is by making
iterators — i.e., the std::iter::Iterator trait — a first-class trait. For
example, "for v in iter" knows about how to loop over iterators, and "for v in
values" sees if "values" implements the IntoIter trait to turn it into an
iterator. Rust also avoids vtable lookup when the value is not a polymorphic
boxed trait. Together, this means it can aggressively inline and unroll
iteration loops. I'm sure they use other tricks, as well.

