ZetaSQL – A SQL Analyzer Framework from Google (github.com)
211 points by obahareth 26 days ago | 27 comments

It is somewhat similar to https://calcite.apache.org/ which I like quite a lot. The Calcite documentation is top-notch, as opposed to a bunch of Markdown files in a directory that provide no coherent flow of information.

There will hopefully be some integration with Calcite in the future. The ZetaSQL team has been in discussion about this on our mailing list :)

    We will also be releasing more documentation over time,
    particularly related to developing engines with this framework.
Looks like they intend to resolve that.

The intention is always there

Direct link to the documentation one-pager: https://github.com/google/zetasql/blob/master/docs/one-pager...

> 22365 lines (18045 sloc) 599 KB

That is a truly enormous "one-pager". I thought my scroll bar was bugged!

This is an in-joke at Google.

To elaborate: The spirit of a one-pager isn't to cram everything onto "one page", but rather to have a "single document" that contains all the context/content you need instead of having hyperlinks to large swaths of other documents (doing a depth-first search in documentation at Google can be a huge time-sink).

Source: Another Googler that wastes time on memegen

company whose verb has come to mean "to search" has employees who dislike search

Or maybe they just value an easy-to-use CTRL+F search

I might be a bit slow but what is this exactly? Is it something I could add to a data layer and get SQL semantics from?

SQL parser. Input: text in a standardized SQL dialect, output: a data structure describing what the user wants done.
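
To make the "text in, data structure out" idea concrete, here is a minimal sketch in Python. It is not ZetaSQL's API: `SelectQuery` and `parse_select` are hypothetical names, and the regex only handles a tiny `SELECT ... FROM ... [WHERE col = value]` subset that a real analyzer would replace with a full grammar, name resolution, and type checking.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SelectQuery:
    """Hypothetical structured form of a very small SELECT statement."""
    columns: Tuple[str, ...]
    table: str
    where: Optional[Tuple[str, str]] = None  # (column, literal), if present


# Toy grammar (an assumption for illustration, not any real SQL dialect):
#   SELECT <cols> FROM <table> [WHERE <col> = <value>] [;]
_PATTERN = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<col>\w+)\s*=\s*(?P<val>[^\s;]+))?\s*;?\s*$",
    re.IGNORECASE,
)


def parse_select(sql: str) -> SelectQuery:
    """Turn query text into a SelectQuery, or raise ValueError."""
    match = _PATTERN.match(sql)
    if match is None:
        raise ValueError(f"unsupported query: {sql!r}")
    columns = tuple(c.strip() for c in match.group("cols").split(","))
    where = None
    if match.group("col"):
        where = (match.group("col"), match.group("val"))
    return SelectQuery(columns, match.group("table"), where)


if __name__ == "__main__":
    print(parse_select("SELECT id, name FROM users WHERE id = 7;"))
```

An engine or tool would then work on the `SelectQuery` value rather than on the raw string, which is the part an analyzer framework takes off your hands.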

That is really cool. I’ve been looking for this

No answer to this question. Also interested.

I have to presume this is in production with the BigQuery web UI. It has brilliant validation and completion, arguably better than Looker. I’d let that framework into my hypothetical SQL UI any old time it likes.

This is neat, although the docs don't say which (if any) SQL standard it follows.

Not super sure if it follows any of them, but it might be ANSI 2011, since Cloud Spanner SQL is described as "ANSI 2011 with extensions", and I would expect that to be a strict subset of ZetaSQL.

For some more context, ZetaSQL is the thing we've standardized our internal systems on, aka F1, Dremel/BigQuery, Spanner (and there are a few others). The F1 Query paper makes a light reference to this in Section 7.2: http://www.vldb.org/pvldb/vol11/p1835-samwel.pdf

For me, the most valuable feature of it has been that protos are an explicitly defined data type, since we use protos for pretty much everything. Docs: https://github.com/google/zetasql/blob/master/docs/one-pager...

EDIT: BigQuery SQL is described as "compliant with the SQL 2011 standard and has extensions", and I would also expect that to be a strict subset of ZetaSQL.

Google has its own internal SQL dialect. It's the same one that is used in Cloud Spanner. I think it's mostly standard-compliant, but less feature-complete than the standard SQL dialects.

I’m pretty sure it wouldn’t follow any; I can’t imagine any significant benefit to following one (if you’re not chasing strict compatibility, general SQL should take you more than far enough on the beginner-friendly front), and there is significant benefit in doing a custom SQL.

A benefit of following an established standard is that one can use this not just to write a new engine but also to build internal tooling (static analysis, linters, IDE tools etc.) for an existing one (e.g. MySQL).

What significant benefit?

Things like late row lookups in MySQL that are not an issue in Postgres: https://explainextended.com/2011/02/11/late-row-lookups-inno...

Aren't both executions (MySQL's and PostgreSQL's) acceptable from an SQL point of view?

I don't think the SQL standard mandates how the DBMS executes the query. In fact, I seem to remember that was one of the purposes of having a high-level language.
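
A rough way to see this point is to model the "late row lookup" case from the linked article in plain Python. This is a simplified sketch, not how MySQL or PostgreSQL actually execute anything: `make_table`, `early_lookup`, and `late_lookup` are hypothetical names, the table is a dict keyed by `id`, and "cost" is just the number of full rows fetched for something like `ORDER BY id LIMIT 10 OFFSET 900`.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def make_table(n: int) -> Dict[int, Row]:
    """Build a toy table keyed by primary key (an assumption, not a real engine)."""
    return {i: {"id": i, "payload": f"row-{i}"} for i in range(1, n + 1)}


def early_lookup(table: Dict[int, Row], offset: int, limit: int) -> Tuple[List[Row], int]:
    """Fetch every row up to offset+limit in key order, then discard the first `offset`."""
    fetched = 0
    rows: List[Row] = []
    for key in sorted(table)[: offset + limit]:
        rows.append(table[key])  # full row fetched even if it is discarded later
        fetched += 1
    return rows[offset:], fetched


def late_lookup(table: Dict[int, Row], offset: int, limit: int) -> Tuple[List[Row], int]:
    """Skip over keys first, then fetch only the rows that are actually returned."""
    keys = sorted(table)[offset: offset + limit]
    rows = [table[key] for key in keys]
    return rows, len(rows)


if __name__ == "__main__":
    table = make_table(1000)
    early_rows, early_cost = early_lookup(table, offset=900, limit=10)
    late_rows, late_cost = late_lookup(table, offset=900, limit=10)
    print(early_rows == late_rows, early_cost, late_cost)
```

Both functions return the same rows, so either is a valid answer to the query; they differ only in how much work they do, which is exactly the part the language leaves to the engine.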

Seeing as this introduces a language, a couple of examples on the landing page wouldn't be a bad idea.

Does it support Druid as an engine? Is it possible to connect it with Metatron Discovery?


Wasn't this internally called "GoogleSQL"? Sorry, it was 2 years back; all I remember is switching from one syntax to another, with certain things being easier in the previous syntax but not well defined, and vice versa...

Still using Dremel SQL occasionally, and I tend to agree that certain things are easier in Dremel SQL. But it's a canonical example of continuously applied convenience patches making things worse: while one thing becomes easier to write, expressing something else becomes impossible. Examples are complicated protos with multiple layers of repeated/map/struct fields...

With Google SQL, it's really verbose, but a conceptually coherent design just gives you fewer surprises.

I'd say I like the latter more.
