Hacker News new | comments | show | ask | jobs | submit login
Writing a simple SQL interpreter in Julia (nhdaly.github.io)
122 points by nathandaly 61 days ago | hide | past | web | favorite | 24 comments



Great blog post! Nathan also gave an excellent talk at JuliaCon this year about building and distributing Apple App Store apps in pure Julia:

https://www.youtube.com/watch?v=kSp6d3qSb3I


Haha thanks Stefan! :)


Is there a tl;dw synopsis? I thought static compilation was a ways off - does this involve bundling all of Julia w/llvm and all?


Like Stefan said, static compilation has been possible for years, just not very user friendly. We've been working on making compilation easier here: https://github.com/JuliaLang/PackageCompiler.jl

And we have a package for building complete, standalone, desktop applications out of the compiled code, here: https://github.com/NHDaly/ApplicationBuilder.jl

> does this involve bundling all of Julia w/llvm and all?

Yeah, more or less. It involves bundling the julia runtime shared library (no different than a C++ program needs access to the C++ runtime library, except that usually comes pre-installed), and all its dependencies, which yes, includes llvm and all.

A far-off goal is to be able to shave off the pieces of the runtime you don't need, such as LLVM if everything is precompiled, FFTW if your program isn't doing heavy linear algebra, etc.


Static compilation has already been possible for several years, it's just not very user-friendly.


That was a great read! I always love seeing people writing DSLs in Julia, even if it’s just for fun.

Was there anything that you found to be surprisingly challenging, or was it mostly straightforward?


Thanks!

I think most of the challenge was self-inflicted.. I just sort of programmed as I went, with no planning at all, and so i wound up reimplementing the same concept a few different ways, which made smooshing it all together into the mega SELECT function difficult.

I also was surprised by some of the aspects of writing the macros. It wasn't really clear to me when pieces of the expression get passed in as separate arguments, vs when they come in as a nested structure.

That is, does `@SELECT foo @FROM bar @WHERE a>2` parse as a call to `SELECT(:foo, :@FROM, :bar, :@WHERE, :(a>2))`, or as `SELECT(:foo, :(@FROM(:bar, :@WHERE(:(a>2))))`? I think it was the second one, but I still don't think I really have a good grasp on the pattern.


Neat, but..

> Cause it turns out, hey, a join is really simple! All it's doing is basically computing the Cartesian Product of your two tables, and then filtering out rows based on how you specified the query. No big deal.

Was this a simplification or that's how you think JOINs work? In either case, it's an extremely misleading statement.


Thanks, no, i think this is a good criticism.

I think what I meant was something more like "You can think of it as basically just the Cartesian Product of your two tables, and then filtering out rows based on how you specified the query." I'll make that change now.

I recognize that you would want to do the filtering first, of course, and only materialize the final table at the very end after everything else finishes.

Other than that, though, am I missing anything else? It's quite possible due to, as I said, how new I am to relational databases.


Fun project! I think Julia could be a great language for implementing things like using Bayesian statistical methods for estimating efficient query plans. :-)

Do you use the new iterators to implement the joins? If so, it seems reasonable that you could implement the join/filter as logical separate components but have the filtering happen in-line with the joins during iteration. Then you'd get the best of both worlds. It'd be interesting to see how well the Julia jit can inline that type of code!

Also, if you're just getting into relational databases, definitely take some time to read up on the relational calculus, and relational algebra.


Other than some implementation tricks (optimizing broadcast joins, etc.), how is he wrong? That’s basically the definition of a JOIN.


The "implementation tricks" are essential optimizations.


For most usable systems, sure. For a functionally correct but slow implementation, nope.


Technically JOIN without a where could mean CROSS JOIN?

https://en.m.wikipedia.org/wiki/Join_(SQL)#Cross_join


oh metalevel eDSL, I was expecting standalone sql parsing :)


Ah yeah, I should've been more explicit in the title.

I think parsing wouldn't be hard either (since that's essentially what the macros are doing), I just didn't do it.

In fact, I think probably all the macro parsing code could be reused for a parser!


warning you're one step away from prolog


Tomorrow: Prolog-like logic programming DSL in Julia.


What I really want to see is an APL implementation as a Julia DSL.



XD lol


Very cool work!

Do you think it's likely someone will eventually write a full database system in Julia?

It'd be kind of crazy do do it I'm such a high level language, but I wonder... It seems like Julia is pretty fast and capable.


It's already been made! https://github.com/JuliaComputing/JuliaDB.jl

(though it appears to be in need of an update to 1.0 still which is in progress: https://github.com/JuliaComputing/IndexedTables.jl/pull/182)


Also, http://relational.ai/ — high performance relational database with integrated AI written in Julia, which gives them JIT compilation “for free”: just generate Julia code for a query and run it. This lets them run circles around other databases in terms of query speed without even breaking a sweat (in terms of implementation effort; the CPUs are probably quite hot).




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: