Usql – Universal command-line interface for SQL databases (github.com)
160 points by pjf 3 months ago | 51 comments

An unfortunate name, given that MS has had a language called U-SQL for their Data Lake platform for a number of years now. At a quick glance, the title of this post made me think they were bringing that functionality to their SQL Database.


So this looks like a nice client which can be used with different databases, but do I still have to write database specific SQL?

The first example I can think of is the way you set the size of the result set in the different databases (TOP vs. LIMIT vs. ROWNUM)[1]. I mean, having one client to rule them all is good, but in my experience the harder problem is learning all the different dialects depending on which database you use.

[1]: https://www.w3schools.com/sql/sql_top.asp
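To illustrate the divergence, here is the same "first ten rows" query in the three dialects mentioned (the `orders` table is hypothetical):

```sql
-- SQL Server / MS Access
SELECT TOP 10 * FROM orders;

-- MySQL / PostgreSQL / SQLite
SELECT * FROM orders LIMIT 10;

-- Oracle (before 12c added FETCH FIRST)
SELECT * FROM orders WHERE ROWNUM <= 10;
```

None of these parse on all three platforms, which is exactly the dialect problem a shared client can't solve on its own.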

I have a story to tell about this. I worked on a project for a company once that mapped a single custom SQL dialect to nearly every other dialect. We wrote a translator / mapper that converted the AST of that single dialect of SQL to each platform-specific dialect as closely as we could. That included as many function mappings, data type mappings, and SQL query expressions as we could manage. Then we had custom drivers that parsed that SQL.

One of the most difficult projects I've done, but it worked reasonably well for most basic functions. We struggled with platform-specific features, of course, but even figured out workarounds for them. Naturally, if a database didn't support a base level of the SQL standard (say, SQL-92) - for example, Cassandra - it was impossible for the translator to take a complex expression and handle it on its own; there was no target syntax to translate to.

Anyway, point being: been there, tried this, and it is possible. An open source implementation of what I did would have been awesome; alas, it was for a proprietary project.

And ultimately, if you want one specific syntax, unless you find creative solutions around missing features between platforms (possible, but a much different problem that goes beyond syntax), it is never quite as powerful as each platform's native dialect.

I still think every day about starting some kind of open source SQL gateway client project to normalize it and sit in front of databases. It's a big project though, and like I said, I question the value sometimes.

Normalizing the syntax for databases that support at least SQL:1999 (or even the SQL:2003 standard) would not be too bad, though.

This is something we have to blame the vendors for.

There is an international SQL standard from ISO, it's just not commonly followed.

However, sometimes the standard isn't very useful in itself. For this example, FETCH FIRST x ROWS ONLY is the syntax mandated by the standard. Although some databases have since come to accept it, LIMIT might have been a better choice, as it is supported by more databases.
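The two forms side by side, against a hypothetical `orders` table:

```sql
-- SQL standard (SQL:2008) row-limiting syntax
SELECT * FROM orders
ORDER BY created_at
FETCH FIRST 10 ROWS ONLY;

-- Non-standard, but supported by more databases
SELECT * FROM orders
ORDER BY created_at
LIMIT 10;
```

So even following the letter of the standard doesn't buy portability if the installed base speaks something else.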


Edit: P.S.: Please don't use w3schools.com as a SQL reference. It's utterly outdated, prefers vendor syntax over the standard, and is sometimes just straight-up wrong.

I'll disagree with you that we have vendors to blame. Vendors are trying to give users new features and support existing applications. The way new features get added to ANSI SQL is primarily that features are adopted by multiple vendors, and then one vendor's implementation (usually Oracle's) is selected as the standard.

You also have to keep in mind that ANSI SQL is a pure relational language. It's artificial in many senses. There are essentially no details of implementation in ANSI SQL. These implementation details include everything from a modular database engine like MySQL's, multiple index algorithms or clustering options like PostgreSQL's, deep features like external procedural language extensions, and so on. None of these types of features are considered by the standards body; it's entirely left up to the vendors to create them. That said, a lot of features like these end up becoming standardized precisely because they become popular ways to solve problems. Statements like BACKUP DATABASE and RESTORE DATABASE are not a consideration of standard SQL because that is considered an implementation detail. Point-in-time recovery and log file handling are all different.

Other features like string aggregation, JSON and XML support, etc. all began as vendor extensions before they were standardized. Yes, sometimes the standard will create a new extension that no vendor supports or only minimally supports like the new MATCH_RECOGNIZE() expression. But for the most part it's the standard that lags behind and takes cues from the vendors.
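String aggregation is a good concrete case of the vendors-first pattern: each platform shipped its own function long before LISTAGG was standardized in SQL:2016. A sketch with a hypothetical `emp(dept, name)` table:

```sql
-- MySQL
SELECT dept, GROUP_CONCAT(name SEPARATOR ', ')
FROM emp GROUP BY dept;

-- PostgreSQL (9.0+) / SQL Server (2017+)
SELECT dept, STRING_AGG(name, ', ')
FROM emp GROUP BY dept;

-- Oracle / DB2, and now SQL:2016
SELECT dept, LISTAGG(name, ', ') WITHIN GROUP (ORDER BY name)
FROM emp GROUP BY dept;
```

Three functions, one feature; the standard picked one of the existing spellings rather than inventing a fourth.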

Even then, the results of standard SQL are sometimes not adopted by vendors, because nobody needs the features the standards body invents. Indeed, the standards body has sometimes standardized what would later be identified as bad practices. Elements like NATURAL JOIN or KEY JOIN (I think that's in the standard) are considered bad design, since they result in ambiguous column joining when you start adding multiple tables or modifying the schema.
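The NATURAL JOIN hazard in a nutshell: the join columns are implicit, so a schema change can silently alter the join. Hypothetical `emp` / `dept` tables:

```sql
-- Today dept_id is the only shared column name, so this works:
SELECT * FROM emp NATURAL JOIN dept;

-- Later someone adds a last_modified column to BOTH tables.
-- The same query now also joins on last_modified and quietly
-- returns fewer (or zero) rows. An explicit join is immune:
SELECT *
FROM emp
    JOIN dept ON emp.dept_id = dept.dept_id;
```

The failure is silent, which is what makes it a design problem rather than just a style preference.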

This is basically the exact same issue as XHTML vs HTML5 and browser vendors. It's not quite as bad since there's no W3C vs WHATWG political garbage (I can't imagine how bad it'd be if ANSI and ISO didn't cooperate) but it's the same kind of thing. Vendors are in competition and want to provide features that developers need. They're not going to wait for the ISO joint technical committee to decide what to do.

There are a few areas where I think ANSI SQL falls down compared to some of the vendor syntax. E.g., I really hate the way table joins are done in ANSI SQL. I get the logic behind the syntax, but the PL/SQL syntax for table joins gives me far less mental gymnastics. In fact, it is probably the only thing about PL/SQL that I actually like.

Most DBAs I know consider any comma-join syntax difficult to maintain and debug, because it's often hard to tell the difference between join conditions and filter conditions. It's also very easy to mistakenly create a CROSS JOIN with comma-join syntax.
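The accidental cross join is easy to reproduce: drop one predicate from the WHERE clause and nothing complains. Hypothetical `orders` / `customers` tables:

```sql
-- Intended inner join, written comma-style:
SELECT * FROM orders o, customers c
WHERE o.customer_id = c.id AND c.region = 'EU';

-- Forget the join condition and it still parses: every order
-- paired with every EU customer (a cross join), no warning.
SELECT * FROM orders o, customers c
WHERE c.region = 'EU';

-- With explicit JOIN ... ON, most dialects reject a missing ON
-- clause outright, so the mistake can't slip through silently.
SELECT * FROM orders o
    JOIN customers c ON o.customer_id = c.id
WHERE c.region = 'EU';
```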

The real pain comes when you try comparing the old vendor-specific OUTER JOIN syntax for Oracle and SQL Server. I guarantee you'll love JOIN ... ON ... over comma joins.

Oracle:

   SELECT * 
   FROM T1, T2
   WHERE T1.PK1 = T2.FK1(+)
SQL Server:

   SELECT * 
   FROM T1, T2
   WHERE T1.PK1 *= T2.FK1
Note that the outer indicator goes on the opposite side. Oh, and, of course, if you flip the order of the fields around, you've got to remember to flip the (+) operator. Which of these three are identical:

   SELECT * 
   FROM T1, T2
   WHERE T1.PK1(+) = T2.FK1

   SELECT * 
   FROM T1, T2
   WHERE T2.FK1(+) = T1.PK1

   SELECT * 
   FROM T1, T2
   WHERE T2.FK1 = T1.PK1(+)
Now imagine you're joining 5 tables and want to reuse the JOIN syntax.

Yeah. Fuck that. I'll take this any day:

   SELECT * 
   FROM T1
       LEFT JOIN T2 ON T1.PK1 = T2.FK1

I have genuinely written a lot of complex SQL for both MySQL and Oracle, and honestly I do prefer Oracle's syntax for joins. But I did spend several years writing PL/SQL before learning ANSI SQL, so I guess it might just be a question of what you're used to?

I would just be aware that the prevailing opinion is that the ANSI syntax is considered better because it's considered clearer, more maintainable, and more functional. I was perfectly fine with comma joins until I had to interact with both SQL Server and Oracle. Then I found all kinds of hidden benefits. Like if I want to query the same set of tables, I can just copy the whole FROM clause and reuse it whole-hog. The logical separation is just nice, too. My tables know how they relate to each other. They know what type of join is being done. I don't have to tell the fields how they relate to each other. Even if I get my join conditions wrong, the relationships are correct.
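The "copy the whole FROM clause" point, sketched against a hypothetical schema: the relationships live in FROM and the filters live in WHERE, so the join block can be reused verbatim across queries.

```sql
-- The FROM block encodes how the tables relate and how they join:
SELECT c.name, SUM(o.total)
FROM customers c
    JOIN orders o       ON o.customer_id = c.id
    LEFT JOIN refunds r ON r.order_id    = o.id
-- ...while WHERE holds only the per-query filters, so a new
-- report over the same tables just swaps this part out:
WHERE o.created_at >= DATE '2018-01-01'
GROUP BY c.name;
```

With comma joins, the join predicates and the filters are interleaved in one WHERE clause, so nothing can be copied wholesale.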

Even Oracle's own documentation[0] tells you to avoid the (+) syntax:

> Oracle recommends that you use the FROM clause OUTER JOIN syntax rather than the Oracle join operator. Outer join queries that use the Oracle join operator (+) are subject to [several restrictions], which do not apply to the FROM clause OUTER JOIN syntax[.]

The doc itself lists all the issues, about ten of them. None of these are opinions, either; they're actual technical limitations of (+), though they may not be ones you encounter. The only advantage I know of for the (+) syntax involves some uncommon issues with materialized view optimizations.

[0]: https://docs.oracle.com/cd/B19306_01/server.102/b14200/queri...

Maybe on what you got started on. I started with T-SQL, then PL/SQL and now back to T-SQL, so you can guess which I prefer.

Me too. Joins FTW.

Depends which databases you use. This is super useful for Postgres/Mysql/SQLite, which aren't completely compatible, but have a large amount of crossover (they all use LIMIT for example).

I was confused at first, since it uses the same name as Microsoft's Azure Data Lake language (U-SQL).

Hint: it's not related!

I'm really waiting for "Todo -> General -> 13. proper PAGER support". The sqlite3 CLI never had support, if I remember correctly, so piping to less was always necessary. A pager is key to getting comfortable on the CLI, IMHO.

Yes, that's a good feature to have. IIRC, IPython does not have it either. For example, if you do a dir(object_name) for some object (a module, class, function, etc.) that has too many attributes, they scroll off the screen, more so since IPython prints the attributes one per line. The CPython shell is better for this, since it prints multiple attributes per line. For IPython, I have to do this:

    for at in dir(object_name):
        print(at, end=' ')
and that will still scroll off if it is too long (although in that case it would scroll off in plain Python too). So a built-in pager would be useful for many such apps, since it allows you to stay in the app; if you pipe the app's output to less, you cannot interact with the app via the keyboard until you quit less.

%page dir(object), %pinfo or %help object are your friends.

Great, will check those out, thanks.

It's missing completion for now, unless I'm mistaken?

Definitely a project to follow, though.

For me it's also a deal breaker, but I will keep an eye on it.

Nice. IBM has for some time provided an Oracle-style SQL*Plus command-line utility for DB2 with almost complete compatibility. I wonder if that could become a common syntax for such DB-specific utilities, or if the usql folks have insight to share towards another proposal.

I already use pgcli, written in Python, in conjunction with pspg to tabulate results. I can't imagine what Usql would offer above what is already available through pgcli. Could anyone comment?

The creator of dbcli answered this question in the other usql thread [0]:

> usql is a great tool if you're familiar with Postgres' psql client and wish you could use it for other databases like MySQL, Cassandra etc.

> dbcli tools are designed to preserve the usage semantics of the existing tools but improve on them by providing auto-completion. For instance, you can use `\d` in pgcli and `SHOW TABLES` in mycli. This was a conscious decision to make pgcli and mycli drop-in replacements for psql and mysql. I was also working under the assumption that people rarely use multiple databases: you're either a Postgres shop or a MySQL shop. If you have a mix of both, there is a good chance that not a single person is interacting with both of them on a daily basis; you have different teams using different databases. But my reasoning there could be flawed.

> There is nothing stopping someone from adding an adapter to the usql tool to make it behave like MySQL (because they like the mysql client better) based on a command line argument, for instance.

[0] https://news.ycombinator.com/item?id=17303625

Multiple database vendor support, not only Postgres.

Looks good. I miss being able to pipe SQL to the client via stdin. At least I didn't see it in the README.

I didn't run it, but in the code it appears to take commands from stdin, but if they're coming from a TTY (interactive), they are processed through readline. The "-f -" trick didn't seem to be supported by the code.

Yes, it reads from STDIN. If it doesn't, it's a bug, and please file an issue on GitHub.

Thanks! Didn't try it, just couldn't find it in the examples.

Looks nice. It would be incredible if it supported autocompletion of table names and queries.

SQL Server's own tools are so good one doesn't really need this.

If you're developing against MS SQL server from linux or mac, you absolutely need something like this.

Unless you are using MySQL or PostgreSQL. ;)

SQL Developer for Oracle is also quite good.

Has anyone noticed that it is always mentioned that a tool is written in Go, even if this implementation detail has no relevance for the user?

Nobody would write "Universal C++ CLI ...".

People are fanatical about using tools written in a language they like. If people think that a program written in C will be faster than one written in Python, then to them that is a feature, whether or not the belief is valid.

However, modern languages come with very different deployment trade-offs, and these are often easiest to convey just by stating the language.

Golang is normally fully statically compiled, so getting this tool onto a host requires nothing more than copying the completed binary. It often requires nothing at all from the system it is running on. This is an implied feature of the language.

C/C++ code can be that simple, but static binaries are not always possible, and then you need the libraries on the remote machine.

There is a good chance Python will be on a remote machine, but deployment could be complicated. PostgreSQL support requires the client libraries to be on that machine, for example.

And if the tool were written in Node, a ton of dependencies and work would be required for a normal user to use it.

I agree that the language should not be a feature, but it is. Containers/snap/flatpak sort of help with this, but they are currently even more work for the user.

It does matter to the user.

A Java based tool requires installation of a JVM and likely has a non-trivial startup time. A Python / Ruby tool requires a runtime and probably some kind of mucking with virtualenv or rvm to work properly. A Node.js tool would require (haha!) a Node runtime and crossing your fingers that the NPM dependencies for what you're going to use to connect to your database haven't been compromised since the last build.

A Go based tool would be statically compiled, start up quick, and not have any external dependencies.

+1 for mentioning startup time. The default Java HDFS client takes 5+ seconds just to start up. The one written in Go [1] is super fast and supports shell completion.

[1] https://github.com/colinmarc/hdfs

Not if you use GraalVM :D

Or if you bundle your JVM... you can do it with the regular Java stack.

It's a cycle. We're on the tail end of it now for Go; there have been a couple of occasions where someone posted their project on HN and I was surprised to notice it was Go without having been advertised as such. Rust is a bit behind Go, but I expect that to happen to me soon now. (ripgrep was close, but IIRC I was introduced via a blog post about how it works full of Rust source. Getting there, though.) Crystal, Nim, and a few others seem to be trying to initiate that cycle now.

Without disagreeing with your broader point, I'm happy to say that ripgrep was introduced using a title that never mentioned "Rust" (that was very much intentional on my part), and there isn't any Rust source code in the intro blog post.[1] To be fair, I did just re-skim it to make sure. There are plenty of references to Rust, but Rust had a very minor role in the blog post itself. It, of course, has a very major role in the viability of continued maintenance and feature development of ripgrep though. :-)

There will hopefully be a ripgrep-related blog post in the near future that will feature Rust quite a bit, but that will be about a library, not a tool.

[1] - https://blog.burntsushi.net/ripgrep/

Fair enough. Skimming over your post, I don't feel too bad about remembering it as having significant code, since it is pretty detailed with a lot of discussion (it's a great post!), but your correction is valid.

I do encourage people in the Rust community to have the confidence to just be in Rust and let people find it out, like you did. There's good reason in the early days to wear the language on your sleeve, but there's also good reason as time wears on to project quiet confidence. Rust is, in my opinion, quite solidly there, at least relative to HN.

It's a benefit to me over it being a Ruby/Node/Python CLI. Means it's likely to be easy to install, and performant.

I've always seen the mention of Go as shorthand that the software in question will be relatively easy to install, being a self-contained binary and all.

Node.js and Rust come to mind as languages that get included the same way.

For me "in go" is useful information since I infer "easy to deploy" from it

I think it’s appropriate for a HN thread title. A lot of us are here because we are into coding and may want to see OSS projects in the languages we use. For me at least, the rest of the world can be all about the use cases and end products, but HN should be allowed to get geekier and worry about the language it was coded in.

If you look at the GitHub README, the Golang attributions are not quite as prominent.

I've never seen a modern C++ CLI compiled without dependencies.

It gets points quicker from the enthusiasts. Same effect as when I found a bug in the std lib a few years back, mentioned it round these parts, and got voted down. People.

We've updated the title from “Usql: Universal Golang CLI for SQL Databases” to the original from the linked page.

Same with Rust, various JS frameworks...
