
Snel: SQL Native Execution for LLVM - wslh
https://arxiv.org/abs/2002.09449
======
erezsh
Hmm, interesting concept, and an article is nice, but... why not release the
code?

~~~
binarycrusader
_Snel is still under development, it won’t be publicly available yet, but we
hope to improve code base, add support to multi-tenant over a distributed
network and more cool features, so it will be open-sourced in the future._

[https://medium.com/grandata-engineering/introducing-snel-a-c...](https://medium.com/grandata-engineering/introducing-snel-a-columnar-just-in-time-query-compiler-for-fast-on-line-analytics-9cf561f82526)

~~~
simonw
That's from October 2017 - would be great to see an update.

------
vbsteven
I don’t know if it is intended, but “snel” is the Dutch translation of “fast”.

~~~
_Microft
It is.

 _We called this new engine “SNEL” as an acronym for “SQL Native Execution for
LLVM”. Well, that’s the excuse, actually we chose Snel because that means
“fast” in Dutch and the name seemed right to us._ from
[https://arxiv.org/pdf/2002.09449.pdf#page=6](https://arxiv.org/pdf/2002.09449.pdf#page=6)

------
cheez
Ooh, this seems really cool. I wonder if you can take SQLite bytecode, convert
it to LLVM IR, and execute it natively. Edit: oh, that's what they do, haha.

------
hans_castorp
Sounds pretty much like what Postgres has been doing since v11.

~~~
bloomer
Actually, this is quite a bit different. This is more of a column-store version
of SQLite, basically an embedded OLAP database, which is pretty cool. I’m not
aware of other column stores in this niche; most are distributed systems meant
for big data and so are much more complicated to set up and manage.

~~~
phunge
Have a look at DuckDB; it's another interesting tool that's columnar and
embedded.

It would be amazing if the world of open-source column stores matured a little
bit relative to where we are today...

------
csours
I understand SQL, and I kind of understand LLVM, but I don't understand why
SQL on LLVM?

~~~
cryptonector
Because an RDBMS need not be I/O-bound. It might be compute bound (e.g., if
the dataset fits in memory), so optimizing the compute-side can help. RDBMSes
generally compile queries into a "plan" that is then interpreted to execute it
(SQLite3 compiles queries into bytecode, while PostgreSQL compiles them into
AST-like tree structures). JITting certain portions of a query plan can help
it go faster.
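
As a rough illustration of the difference, here is a toy sketch in Python. It
is not how SQLite3, PostgreSQL or Snel implement anything; the plan format and
the names `PLAN`, `interpret`, `to_source` and `compile_plan` are made up for
the example, and Python's own `compile` stands in for a real JIT backend such
as LLVM. The interpreter re-walks the tree for every row, while the compiled
version walks it once and then runs a single generated function per row.

```python
import operator

# Hypothetical toy plan for the predicate: (price * qty) > 100
PLAN = (">", ("*", ("col", "price"), ("col", "qty")), ("const", 100))

OPS = {">": operator.gt, "*": operator.mul, "+": operator.add}


def interpret(node, row):
    """Walk the tree for every row: one dispatch per node, per row."""
    kind = node[0]
    if kind == "col":
        return row[node[1]]
    if kind == "const":
        return node[1]
    return OPS[kind](interpret(node[1], row), interpret(node[2], row))


def to_source(node):
    """Turn the tree into a Python expression string, once per query."""
    kind = node[0]
    if kind == "col":
        return f"row[{node[1]!r}]"
    if kind == "const":
        return repr(node[1])
    return f"({to_source(node[1])} {kind} {to_source(node[2])})"


def compile_plan(node):
    """Compile the generated source into a callable; the tree walk happens once."""
    code = compile(f"lambda row: {to_source(node)}", "<plan>", "eval")
    return eval(code)


rows = [{"price": 30, "qty": 4}, {"price": 5, "qty": 3}]
predicate = compile_plan(PLAN)

interpreted = [interpret(PLAN, r) for r in rows]  # tree walk per row
compiled = [predicate(r) for r in rows]           # one call per row
```

Both lists hold the same answers; only the per-row work differs, which is the
part a JIT is meant to shrink.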

~~~
anarazel
In fact, I'd say it's extremely common for database workloads to be compute
bound on modern hardware.

For one, storage bandwidth has increased massively (my laptop's NVMe drives
can do 2 x 3.2GB/s reads, and you can get quite a few into even small servers),
while memory latency and CPU throughput have not improved to the same degree.

To benefit from the increases in CPU throughput, one needs to take advantage
of superscalar execution, which isn't possible to any significant degree with
interpreted execution.

> SQLite3 compiles queries into bytecode, while PostgreSQL compiles them into
> AST-like tree structures

FWIW, expressions are compiled into bytecode in PostgreSQL as well. While
there'd be plenty of benefit in doing that for query trees as well, there is
not quite as much of a raw execution speed reason for it as there is for
expressions (as the individual "steps" are much coarser, so the tree walk
overhead is proportionally smaller).
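
To make "expressions compiled into bytecode" concrete, a minimal sketch in
Python of the general technique: flatten an expression tree into a linear list
of steps and run it with a small stack machine. This is only an illustration;
the opcodes and the names `flatten` and `run` are invented for the example and
are not PostgreSQL's actual step format.

```python
import operator

BINARY = {"+": operator.add, "*": operator.mul, ">": operator.gt}


def flatten(node, program=None):
    """Post-order walk: emit both operands first, then the operator."""
    if program is None:
        program = []
    kind = node[0]
    if kind in ("col", "const"):
        program.append((kind, node[1]))
    else:
        flatten(node[1], program)
        flatten(node[2], program)
        program.append(("op", kind))
    return program


def run(program, row):
    """Execute the linear program with a value stack; no recursion per row."""
    stack = []
    for opcode, arg in program:
        if opcode == "col":
            stack.append(row[arg])
        elif opcode == "const":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY[arg](left, right))
    return stack.pop()


# Hypothetical expression: (a + 1) > b
expr = (">", ("+", ("col", "a"), ("const", 1)), ("col", "b"))
program = flatten(expr)
result = run(program, {"a": 2, "b": 2})
```

Each step here does very little work, so dispatch overhead dominates; a plan
node such as a scan or a join does far more work per call, which is why the
tree walk matters less at that level.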

------
julienfr112
Is it open source?

