I really like SQL, and think its one of the most important languages a dev needs...

snthpy · on Dec 30, 2022

> Eg SELECT !username from users could expand to all columns without username.

This should be landing in PRQL soon. See ongoing discussion here: [an exclude clause, that would select all columns except whats specified #172](https://github.com/PRQL/prql/issues/172)

The syntax we seem to be converging to is like you suggested:

    select ![foo,bar]

would select all columns except "foo" and "bar".

The aim is to support this at least for SQL dialects that support it like DuckDB, BigQuery, ... For other dialects we might just error out initially.

There is a goal of supporting working connected to a live database in which case the source table schema could be read and the appropriate SELECT clause generated. This would not be robust to schema changes such as future column additions though. At the moment the SQL is generated in an offline setting though so this is not possible.

phplovesong · on Dec 30, 2022

Thats nice to know!

If you are a PRQL dev i have a suggestion for you/your team.

Make PRQL language agnostic. In a real life scenario many people are writing SQL not in the app layer, but more ad-hoc for stats, reporting, analysis, examining the data etc.

What i mean by this is:

If i write an app that has embedded SQL i dont really care to much for the possible verbosity. This code will be used more than once, so im OK with taking the extra effort. Also all languages has an query-builder / ORM so the benefit of something like PRQL is possibly not big enough to merit it as an additional dependency.

My suggestion:

Make PRQL a cli tool that can be used by allowing users to connect to a database in a similar fashion as something like usql (https://github.com/xo/usql), I would find the most power in a tool like PRQL is this setting: Ad hoc SQL queries.

This would open the editor integration. With a tool like this i could open vim, and write PQRL in a window, and pipe it via something like Slime to a open PQRL session.

This would be pure money!

aerzen · on Dec 30, 2022

So something like a language server, but connected to a database and able to execute queries? That's an interesting idea.

While working on the compiler, I try to separate different components so things like this are possible. Currently, we are focusing on the language itself and also put some though into intermediate representation of the query (RQ - relational query). We are conservative about expanding the scope, because building database connectors and execution engines would take too much focus off the language and the compiler.

But I'd lie if I said that I don't want to dig into it.

phplovesong · on Dec 30, 2022

Well an LSP server would be a welcome addition.

But before that i would put some effort into the actual "repl/session" -like tool. Pipe a PRQL query to a PRQL session, and then the PRQL translates that to SQL, and returns the underlying response from the database.

If done right, this would open the possibility to build all sorts of cool editor plugins, that could directly interface with the PRQL session.

IMHO this is where a tool like PRQL would shine, as it would make writing SQL more enjoyable and way faster, but at the same time being both language and sql-dialect agnostic.

nonesuchluck · on Dec 30, 2022

Seems a natural fit for a notebook UI. If a PRQL cell doesn't start with "from," just continue adding filters to the pipeline above. Would let you progressively build pipelines by adding filters and derivations, while previewing the data each step along the way. Split a cell to debug a pipeline at any point.

snthpy · on Dec 30, 2022

Thanks for the suggestion. I don't think I knew about usql. I completely agree with you and have been working on a cli tool called `prql-query` or `pq` at the command line:

https://github.com/prql/prql-query/

Unfortunately I haven't had much time to spend on it of late but hope to push some updates soon.

zestyping · on Dec 30, 2022

In BigQuery you can specify columns to include as well as exclude. For example:

    select s.*, t.* except (s.foo, s.bar, t.blah)
    from s, t
    where ...

I've found this extremely handy for avoiding duplicate column names when joining tables. For example, trying to join two tables that both have an "id" column can be really tedious without this syntax.

How would your syntax achieve that?

Tostino · on Dec 30, 2022

The column list for a table is available in the information schema, just build the query with the expanded column list, minus the ones passed in to ignore on the DBs that don't have native support. Using the Information schema makes it DB agnostic for "free"

jj_ · on Dec 30, 2022

In BigQuery you can write:

  SELECT * EXCEPT (username) FROM users

https://cloud.google.com/bigquery/docs/reference/standard-sq...

mmsimanga · on Dec 30, 2022

DuckDb has the feature to exclude columns in the SELECT statement [0]

[0]https://duckdb.org/2022/05/04/friendlier-sql.html

snthpy · on Dec 30, 2022

> As an example why would WHERE clauses need to have a AND instead of multiple WHERE clauses.

PRQL does this:

    from employees
    filter country_code == "USA"   # Each line transforms the previous result.
    filter age > 50

compiles to:

    SELECT
      *
    FROM
      employees
    WHERE
      country_code = 'USA'
      AND age > 50
    
    -- Generated by PRQL compiler version 0.3.1 (https://prql-lang.org)

Sammyadems1 · on Jan 2, 2023

What about an OR in the where? or Parens with multibuls ANDs and ORs

albertopv · on Dec 30, 2022

SQL was meant to be like readable english, so two or more WHERE didn't made sense. Also select all columns but one may not be future proof is column are added or removed, you really don't know what columns that query is going to read, useful for oneshot query, but not much more.

wodenokoto · on Dec 30, 2022

This is a tension between application-sql and exploration-sql.

When building a data application, say a reporting tool, you want the code to be explicit. It is much easier to read if all columns are written out.

If you are ad-hoc exploring data, this is a pain.

phplovesong · on Dec 30, 2022

Meh, thats not a valid argument. What happens if i `select id, name from users` and later someone drops the name column. Similarly if i `select * from users` and use name the same problem/bug arises. Dropping a column without the application code refactor is obviously a developer issue, not a SQL one.

ilitirit · on Dec 30, 2022

You're right that it's a "dev/dba problem", but the reason for not using `select !<column>` is exactly the same as not using `select *`. If you specify exactly what you want, there's less of a chance of your ORM-based app breaking because it received an extra or too few columns. I'm sure most people who work with SQL DBs have been bitten this before.

Also, while it could also be useful, over the years I've started moving away from things like using `!` to negate things. It makes it slightly harder to read and it ruins the symmetry of my code. And lord knows SQL doesn't need any more of that. It's probably the only language I use often where I have to place my commas at the start of the line; where I need to add a first filter as `1 = 1` etc just to make it easier to comment out things quickly.

What I personally want most is the ability to move the FROM clause to the top (this is a no-brainer) and better aliasing rules. The alias I used in my SELECT should be accessible everywhere in the statement.

albertopv · on Dec 30, 2022

Select * is something no one should be doing in real code.

Select [list of columns] should the way to query and doing so you know exactly what columns are read. That's one of the reasons I dislike ORM like hibernate.

I advocate the least surprises principle, select * or select !column are doors to surprises. What if someone add a geometry column with tons of data (real example happened to me)? You are going to read tons of data that probably you don't need.

phplovesong · on Dec 30, 2022

The query itself does not matter. If i delete/rename a column the app WILL break no matter how much i would follow best practices.

I dont see `select !name from users` any worse than `select name from users`. In both scenarios the app will break if the name column is renamed/deleted.

beefield · on Dec 30, 2022

To me the biggest thing against select * is readability. Basically you are obfuscating what you are doing and getting from the database. In trivial queries it does not matter, but when you read more complex procedures, it starts to be a bit annoying.

rsaarsoo · on Dec 31, 2022

I think there are pros and cons to both approaches:

- Explicitly spelling out the column names has the advantage that you can search your code base for that name and see where it's used.

- When declaring a view which filters the rows in a table, using SELECT * would simplify maintenance, as you wouldn't need to update the view when new columns get added to the table.

diogofranco · on Dec 30, 2022

But delete/rename columns are not the only operations, nor the most common. Adding columns is the most common, and there only one of those options breaks

thesz · on Dec 30, 2022

>SQL was meant to be like readable english

DROP TABLE IF EXISTS and CREATE TABLE IF NOT EXISTS are counterexamples to that statement. Notice that suffix operator IS NULL is present in expression part, but these statements do not use that kind of grammar.

albertopv · on Dec 30, 2022

I was talking about original idea, current Sql evolved and in many ways is a monster, but multiple WHERE instead of AND operator make little sense to me

whateveracct · on Dec 30, 2022

> As an example why would WHERE clauses need to have a AND instead of multiple WHERE clauses.

Haskell's esqueleto SQL eDSL actually enables just this

BUT acting like this is a big deal is Wadler's Law for sure.

phplovesong · on Dec 30, 2022

Sure, multiple ORMs and query builders allow this, but when writing pure SQL thats dynamic its annoying to have to check for some condition before knowing "Should i use WHERE or AND next". This is traditionally solved with an hard coded WHERE 1 = 1 in the beginning of the query. Still its a wart, even tho a small one.

Point being, there are many other small things like this that (can) make SQL a burden, eg the select all but <n+1> columns.

michaelcampbell · on Jan 1, 2023

> why would WHERE clauses need to have a AND instead of multiple WHERE clauses

Semi related. Python's for comprehensions allow multiple if's.

    some_list = [ x for x in things 
                  if x % 2 == 0
                  if my_function(x) > 0
                  if some_other_global_predicate()
                  ...
                ]

And this is shown in the python docs.

I've had PR's rejected for using this since they wanted "and". Drove me up a wall as this IMO reads as easily if not more easily than the "and" version, and (again, IMO) is more for humans where the "&&" is more for the computer.

aabbcc1241 · on Jan 2, 2023

I made a DSL that compiles to SQL.

It allows multiple where statements, and you can write where, having, group by, limit and orders in any other.

You can also write select statement in nested syntax similar to graphql, and have it auto generate join table statements.

You can use it as a npm package or try it on this playground https://better-sql.surge.sh

tome · on Dec 30, 2022

> When working with big tables, its super annoying to select all but one column

What's the use case for that? In fact, I can hardly understand the benefit of projecting at all, except at the very end. Why would you want to do it? Is there a performance benefit?

chaps · on Dec 30, 2022

I would love this addition in SQL. My work is largely exploratory with SQL. 99% of my queries will never get saved anywhere.

At the end of the day it's just syntactic sugar, but sometimes when writing reports you sometimes don't want to include a single column like a sequential key, because it's not relevant information to the end user or exploration.

Another reason is that some text columns are long enough that just absolutely destroy readability in things like psql. Often I just want to get a feel for what's in the data and removing those annoying text fields to make psql readable is a pain. You have to run a `select * from table limit 0`, copy that into a text file, then `s/ +\| +/, /g` the columns list, then remove the annoying field... then throw it back into your query. Takes about 45 seconds to do all that and that adds up!

tome · on Jan 3, 2023

Right, I understand why you'd want to project at the very end of the query, i.e. suppress some columns from being shown. Other than that, I don't see the point, unless there's a performance impact. Keeping columns around doesn't actually change the result of the query in any way.

chaps · on Jan 8, 2023

You've definitely missed the point because you're concerned about the "very end of the query". I'm describing both a reporting and more importantly as far as I'm concerned an explorative process that's disrupted by the silly need to list out all columns. Performance is irrelevant.

tome · on Jan 8, 2023

Yes, I know I've missed the point. That's why I asked my question in the first place! My question was:

> What's the use case for that? In fact, I can hardly understand the benefit of projecting at all, except at the very end. Why would you want to do it? Is there a performance benefit?

and I meant it literally. It's not an oblique, rhetorical way of saying "you don't need to do this". It was a genuine question!

You've answered that it's convenient when doing exploratory work. Thank you. That helps me understand better.