Hacker News new | past | comments | ask | show | jobs | submit login
How about trailing commas in SQL? (eisentraut.org)
227 points by ingve 41 days ago | hide | past | favorite | 268 comments



I feel the problem. When coding (not only in SQL) you often have to add something to the end of a list, and it is annoying that the end of the list is always special. You can't just copy some line and move it there. Also when moving things around you always have to take extra care at the end.

So, my solution for this was always

  SELECT a
       , b
       , c 
  FROM ...
instead of:

  SELECT a,
         b,
         c,  -- people want to allow trailing comma here
  FROM ...
Leading commas also are also very regular and visible. A missing trailing comma is a lot harder to see.

Before people start to mess with the standard, I'd suggest to maybe just change your coding style - and make the start of the list special - instead of the end.

I'd suggest going for a lot less trailing commas - instead of allowing more.


> it is annoying that the end of the list is always special. You can't just copy some line and move it there. Also when moving things around you always have to take extra care at the end.

You have simply moved the "special" entry to the beginning rather than the end.

Side remark: I've noticed that when it comes to code formatting and grammar it's almost like there are broadly two camps. There are some instances of code formatting that involve something "clever". Your example of leading commas above for example. Another example is code ligatures. It's as if there's a dividing line of taste here where one either feels delight at the clever twist, or the total opposite, rarely anything in between. I happen to dislike these kinds of things (and particularly loathe code ligatures) but it is often hard to justify exactly why beyond taste.

Code ligature thing has something to do with just seeing the characters that are actually there rather than a smokescreen, which IMO impedes editability because I can't place the cursor half-way through a ligature and so on. But it's more than that -- you could fix those functional issues and I'll still dislike them.


> I can't place the cursor half-way through a ligature and so on.

That sounds like a bug in the editor. Ligatures are incredibly common font features even outside of programming, even sequences like fi are often ligatures.


How could this possibly be done for ligatures which don't maintain their horizontal position though? `!=` will usually be shown as an equals sign with a line through it right down the middle.


For code editing, usually ligatures are spaced relative to their expanded variant.

E.g., ≠ is as wide as two characters.


Adding something to the end is the most common thing to do. And changing the start is extremely rare - it's anyway special because you usually put it in the line with the SELECT.


> you usually put it in the line with the SELECT.

No, you usually don't. That would "bury" the first field so you don't immediately see it if you quickly glance at the code. I'll admit I was a bit surprised when I saw a fully formatted SQL query but it does look much better:

   SELECT
      a,
      b,
      c,
      d
   FROM
      Customers
   ORDER BY
      b,
      c DESC

Edit: I've just seen other comments here suggesting you return an extra "pad" value at the start so you can use leading commas without "losing" the first value visually. I hardly know where to start with that. It transmits more data on the wire and, to my eyes, looks horrific. That level of hackery is proof, as if it were needed, of how badly needed trailing commas are.


Any new style will look bad, simply because it is new to us. But you quickly get used to it.

After that, it is about minimizing errors. Leading commas minimize errors, and is a style that is portable across databases.


> Any new style will look bad, simply because it is new to us.

I agree that new styles and syntaxes can take some getting used to. When I went from Pascal to C, braces seemed awful but I now think punctuation is much clearer than words for delimiting blocks.

But the discussion here is about whether to introduce a new clearer style. Taken on this context, your comment amounts to defending any existing syntax, no matter how horrific, because you'll get used to it eventually.


> But the discussion here is about whether to introduce a new clearer style. Taken on this context, your comment amounts to defending any existing syntax, no matter how horrific, because you'll get used to it eventually.

No, I'm not defending any existing syntax.

As I said, it is about minimizing errors. You can get used to anything. But some styles and syntaxes are more error-prone than others. Leading commas in SQL is less error-prone, and therefore is preferred. No matter what your initial aesthetic impression. Allowing extra trailing commas is not portable, and therefore will increase errors if you come to rely on it then move between databases. So if the feature is provided, I will still be slow to adopt it into my style.


I always do this. There's usually one field that you can put at the start that never changes. But the field at the end will keep changing as you add more fields to the SELECT list.

   SELECT
      a
      ,b
      ,c
      ,d
   FROM
      Customers
   ORDER BY
      b
      ,c DESC


it is a trade-off

reordering, insertions and deletions much easier within the IDE as you’re manipulating lines as a whole. focus maintained while you are in the zone or tired is the biggest gain

at the cost of slightly uglier code and a miniscule… truly miniscule wire overhead.

other decisions will have much larger wire impact, say choosing a column wider than necessary


Hard disagree.


I'm curious why you think so? I agree totally with the parent comment; when I iterate on a SQL query the most common place to make changes is near the end of the SELECT block, adding and removing and refining column expressions. I do the exact same "comma first" trick to make my life easier.


I only write SQL for complex grouping operations, otherwise the Django ORM is superior. And in those situations, the ordering is really not very relevant, and thus changes can occur anywhere.


> Code ligature thing has something to do with just seeing the characters that are actually there rather than a smokescreen, which IMO impedes editability because I can't place the cursor half-way through a ligature and so on

Why wouldn't that be possible? (The cursor thing)

It's still two characters as far as your editor is concerned.


I didn't say it's not possible, but certainly when I've tried ligatures in the past my editor treated it like a single character until I pressed backspace after the character (or delete before) or whatever. Anyhow there's all sorts of weird functional artefacts that just feel dodgy to me and they basically all arise from this world where multiple runes are being treated as a single one. Here are two more examples:

1. If I `/=` in vim (find equals signs), should the "!=" ligature be highlighted? If it is, then it is misrepresenting the state: if I then press delete, the highlighted ligature won't actually be removed, only part of it will be.

2. In javascript the `===` operator is less visually distinct from `==` since if there are no comparisons nearby I have to judge from the length somehow and it takes longer, adding some additional comprehension fatigue.

Anyway like I said, you can probably try to fix all of these with weird special case handling but it's just fighting the underlying assumption my editor makes (and my brain has learned to understand): a source file is a list of runes. That's how I like to think about it.

Anyone is free to think otherwise. They're just in that other camp and that's OK. But I don't like them personally.


Among the editors which handle well ligatures is Geany.

The cursor moves always over a character, regardless whether it is part of a ligature or not. You can select and delete or replace only a part of a ligature, exactly in the same way as when using a font without ligatures.

The only thing that ligatures change is the visual appearance of the text, which is in my opinion an extremely useful workaround for the inertia of the legacy programming languages, which continue to use the restricted ASCII character set, which has never included all the characters that would have been needed in a programming language.

Now with ligatures, we are finally able to write an ALGOL 60 program that looks like it was intended to look, 65 years ago, instead of being forced to use awkward combinations of ASCII characters, like later programmers and designers of programming languages have resigned themselves to do, because of the character set limitations that the American manufacturers of computing equipment have been unwilling to remove (causing problems not only to programmers, but also to the users of non-English languages).

ASCII has never been intended to include the characters required for writing mathematical expressions. ASCII has been intended only to include the set of characters that were in use in the commercial correspondence written in English, which had been previously available in the better typewriting machines. Because of this, ASCII has always been bad as a character set used for a programming language.


A good compromise is an option from the Kitty terminal[1]: if your cursor is on top of the ligature, show the individual characters.

[1] `disable_ligatures cursor`


I'm not sure if this is an example of code ligatures, but I used to work with a guy who had configured his editor so that "=>" was replaced with some kind of special arrow character, and it used to drive me nuts. When I read code, I want to know what's actually there, and not have to ask "what does that arrow actually mean?"


But you have demonstrated that you know what the arrow means. Do you have your editor configured to display special symbols for all whitespace characters? Including newlines? Are you sure that you aren't just annoyed by looking at stuff you're not used to?


when chrome started hiding the https prefix from urls it still showed the same information as https was the only hidden prefix, yet a lot of people were similarly upset.

Code ligatures make text harder to read because they are not text, expecially when many of those ligatures are identical to actually existing unicode characters making the gliph more ambiguous than they laready were.

IMHO symbol should reflex their use: the dozens of different arrows used in math are meant to be handwritten or at least to be seen as a complex gliph.

=> and -> in almost every programming language is a = or a - followed by a >. We could very easily make a language where ⇒ is an operator (https://isabelle.in.tum.de/ does for example IIRC) but most languages use simple ascii characters for grammar for good reasons.

IMHO code ligatures are worse than a cursive coding font.


> Do you have your editor configured to display special symbols for all whitespace characters? Including newlines?

I actually have my editor configured to display special symbols for all control characters except U+000A NEW LINE (that includes U+0009 HORIZONTAL TABULATION as well, this is actually a control character, not a whitespace) and all whitespace characters except U+0020 SPACE.


> You have simply moved the "special" entry to the beginning rather than the end.

Yes, but

1. Terms are more commonly added to the end than the beginning.

2. The beginning is often already special, e.g. starting with `SELECT`.

3. The commas are visually aligned/consistent with the indentation.

4. Because of #3, it's far easier to spot a missing comma.

I worked for years in a SQL-heavy role, and this style was the preference there, for these reasons.


That is the haskell workaround, and it also sucks, because it still requires a special non-uniform first value.

I do not want to write either of your snippets, I want to write

  SELECT
      a,
      b,
      c,
  FROM
Because now selected values are uniform and I can move them around or add new ones with minimal changes no matter their position in the sequence.

It’s also completely wonky in many contexts e.g. CREATE TABLE. Trailing commas always works.

> I'd suggest to maybe just change your coding style - and make the start of the list special - instead of the end.

Supporting trailing commas means neither is special.


Spot on.

Also, to generalize a bit: if a human is expected to read it, the way humans write should be able to be parsed by it. That's subjective, to a point, but it's an easy rule-of-thumb to remember. Trailing commas are so common that people have built workarounds for them. Therefore, they can be understood as "the way humans write". If you're writing a language that you still want to be readable by humans, you really should account for that. And, no shade for it not already being done. I'm just reiterating that there should be NO pushback to allowing trailing commas. It's a completely "common-sense" proposal.


> It's a completely "common-sense" proposal.

The SQL standards committee is having none of it. I can tell you that just from this one sentence.

And, more seriously, there isn't really such a thing as a common-sense proposal with SQL. The grammar is so warped after all these years that there isn't a path to consistency and the broken attempt at English syntax has rendered it nearly incomprehensible for both human and machine parsing. Any change to anything could have bizarre flow on effects.

I'd love to see trailing commas added to SELECT though. Given the mess it isn't possible to make the situation worse and the end of the list being special can be infuriating.


I don't want to write commas at all. I want to write

    SELECT
        a
        b
        c
    FROM d
Or even

    SELECT a b c FROM d
Because now selected values are uniform and there's no superfluous punctuation to worry about.


This would work (w/ a context free grammar) if aliases required the 'AS' keyword.

Ambigious:

   SELECT a aliasA b c aliasC FROM d aliasD e
Unambigious:

   SELECT a as aliasA b c as aliasC FROM d as aliasD e
Alternately, a schema-aware parser could determine if 'aliasA' was an alias or a column reference.

FWIW, personally, I'd rather go the full-Python, using newlines as delimiters:

   SELECT 
      a aliasA 
      b 
      c aliasC 
   FROM 
      d aliasD
      e
(With tabs for nesting.)


> FWIW, personally, I'd rather go the full-Python, using newlines as delimiters:

I love Python - mostly, but significant whitespace is its worst curse, and I'd love new languages to move away from the idea. I enjoy Rust because I can write out garbage faster than I can think of it, and the auto-formatter will make it look neat.

Also, have you experienced the unspeakable horror that is templating YAML files? Of course in SQL prepared statements is the safe&sane way to go, but there are cases where you still have to generate SQL as text, such as when building an ORM.


Aside: I avoid ORM hell by using SQL to generate source code.


I'm slowly shifting in that direction as well, but the libraries built around this style are far less mature / mainstream. Django is the "boring" solution and gets hella work done.


Actually it's still ambiguous, because you forgot about sub-expressions. How would you parse this:

   SELECT a as aliasA b select q from foo as aliasQ c as aliasC FROM d as aliasD e
Even if you can figure it out (and add a 3rd level of sub-expression before deciding if you can), it's completely unreadable. You need either commas or parenthesis.


We can simply employ ACI (automatic comma insertion). Every newline after Select implies a comma, unless there’s a clearly unfinished column definition. And if you want to list columns on a single line, you have to type commas explicitly.

  Select Foo Foo, Bar +
    1 Baz
    Baz Bar
    Quux Frob
  from
    t Table


You can get either commas or required `as` and parenthesis around expressions.

IMO, required `as` and parenthesis are better. But it's not a clear thing where everybody will agree.


Ideally there'd be a syntax that looked like a function `SELECT(a, b, c, d)` with a totally distinct variant for specifying types `TSELECT(a, int, b, null, c, text, d, timezonetz)`.

The big problem here is the seemed-good-in-the-70s syntax that died with SQL. In the best of all possible worlds they could have just used Lisp as the foundation then let people macro/transpile their own syntaxes. A subtle flaw in SQL is being just hard enough to correctly emit that most people don't build new languages on top of it.


Oh, just add optional commas at the end of your language's parameter lists...

I have really no idea what's "ideal". Haskell's "white space means application" is way nicer than most function syntaxes, end of line statement separation work very well if you don't need any long statement, the SQL complex syntax is very convenient... and yet if you mix any of those things, you'll get a truckload of disastrous issues.

But one thing that is clear is that allowing for trailing commas in SQL is a gain.


There is PRQL and a few others. There is also JOOQ which does transpilation to and between various SQL dialects (sadly open core with a small core). Also, some databases like ClickHouse are implementing alternative languages like PRQL and Kusto natively.


I'd still want to have leading commas.

If the items in the list are long IF(...), maybe uses several lines and maybe contain SELECTs it's hard to see a missing trailing comma. At the start they're all lined up well, and it's very hard to get them wrong.


> I'd still want to have leading commas.

Real happy for you. Trailing commas support don’t prevent you from doing that.


I often do this with boolean WHERE filters when I'm doing interactive exploration on some data:

    SELECT ...
    WHERE
      foo = 1
      AND bar = 2
I want to comment out a line (i.e. "--foo = 1"), but would break the syntax.

The solution is to start with "WHERE true":

    SELECT ...
    WHERE true
      --AND foo = 1
      AND bar = 2
Now you can comment/comment anything.

(Putting the AND at end of each line has the same problem, of course, and requires putting a "true" at the bottom instead.)


I do the same, though my muscle memory is `1=1` instead of `true`.

Of course then you get editors/linters/coworkers that always point out that the 'true' is unnecessary. This also doesn't work with ORs (just swap to false), but in practice it seems it is always ANDs that is being used.


Huh, I've always used "WHERE 1 = 1 AND ...". Using `true` looks more clean.


Not all sql variants support "true". e.g. SQL Server doesn't.


In XTDB you can now use (trailing-friendly) commas in this scenario too, instead of ANDs: https://github.com/xtdb/xtdb/pull/3985


God, everyone's going to hate me for this. (I will have earned it, I think.)

    SELECT ...
     WHERE foo = 1
       AND bar = 2
Each keyword gets a new line, the middle gutter between keyword and expressions stays in the same place, and things get really, really fugly if I need a subselect or whatever. Any given line can be commented out. (And no, none of that leading comma bullshit, somehow that looks nasty to me.) Downvote this into oblivion to protect the junior developers from being infected by whatever degeneracy has ahold of me.


You may have missed what I was getting at.

In order to quickly comment out the "foo = 1" here, you cannot simply comment out the whole line because it would become syntactically invalid:

     SELECT ...
     --WHERE foo = 1
       AND bar = 2
I have a single keyboard shortcut to comment/uncomment a line because I like to work briskly with as little unnecessary typing.

It has nothing to do with indentation or being "fugly". I'm talking about interactive exploration of prototyping when the final SQL isn't set in stone.


I slowly come to creating vim scripts for all that. I already use `alt-,` for triggering the final comma on a line. Maybe with modern LLMs I just need to prompt a vimscript that detects where the line is (SQL condition, array item, json, etc) and use `alt-,` to do the right thing. Or something like "fixing" the whole block with `g,ap`. Because all this is irritating and no one does anything with it for decades.


Since we're already here, we could think about trailing AND, actually. Look:

  SELECT a, sum(b),
   WHERE foo = 1
     AND GROUP BY a
Sounds pretty SQL to me.


This is cursed, but also entirely consistent with the trailing comma proposal.

It'd sure look funny if all keywords that connected clauses together were trailable, though.

    SELECT a, sum(b),
    FROM stuff
    WHERE foo = 1
       OR GROUP BY a
Or just as horrifying:

    SELECT a, b, c,
    FROM foo
    UNION ALL
    SELECT a, b, c,
    FROM bar
    UNION ALL


Those aren't horrifying to me, only problem is that I want the keywords to right-align with each other.


IMHO this is one of the ugliest formattings. Whenever I see that i try to revert and avoid at all costs. I know it's a personal flavor, yet. I might be too opinionated..


SQL is also case insensitive for most clauses!

`SeLeCt ... fRoM ... wHeRe ...` IS VALID! (And you should use a linter/formatter to avoid these categories of style war)


I agree, I'd say I usually don't care about aesthetics, but that somehow looks so wrong, I am bothered by it.


I too prefer leading commas, especially useful when you're prototyping a query. I also picked up the ORM trick of starting your WHERE clause with 1=1 so that every meaningful filter can start with AND ... I'm not as consistent with this one, but it's handy too.

I catch (friendly) flak for my zealot SQL formatting (capitalization, indenting) and know it doesn't impact the execution, but there's something about working in logic / set theory that matches with strict presentation; maybe helps you think with a more rigid mindset?


ClickHouse has support for trailing commas for several years.

I recommend looking at ClickHouse (https://github.com/ClickHouse/ClickHouse/) as an example of a modern SQL database that emphasizes developer experience, performance, and quality-of-life improvements.

I'm the author of ClickHouse, and I'm happy to see that its innovation has been inspired and adopted in other database management systems.


I do this as well.

on top i often do a pad entry so that the elements are all on their own line

  SELECT 1 as pad
    , a
    , b
then i can reorder lines trivially without stopping to re-comma the endpoints or maintain which special entry is on the line of the SELECT token

what would be helpful is both LEADING and trailing commas

so I am suggesting:

  SELECT
    , a
    , b
would be permissible too. The parsing step is no longer the difficult portion.

Developer ease leads to less mistakes is my conjecture.


Why not something like

  SELECT
      a,
      b,
      c,
  1 as pad FROM
Then?


cool, that would work too. my preference is leading separators so the separators are all in a visual column. being in a visual column allows the eye to discount the separators easily.

typically names are different lengths and the commas are hard/harder to spot

Your suggestion:

  SELECT
     first_column,
     second_column_wider_a_lot,
     (third + fourth_combined_expression),
  1 as pad FROM
vs my current preference:

  SELECT 1 as pad
     , first_column
     , second_column_wider_a_lot
     , (third + fourth_combined_expression)
  FROM


Another trick is, if you're programmatically building a SQL statement - adding "WHERE 1=1" makes things easier ... like so:

  SELECT *
  FROM table
  WHERE 1=1
That way, if you want to filter down the result, everything programmatically appended just needs an "AND ..." at the start, like:

  SELECT *
  FROM table
  WHERE 1=1
  AND age > 21
  AND xyz = 'abc' ...
Because without "WHERE 1=1", you'd had to handle the first condition different than all subsequent conditions.


I prefer “1=0 OR 1=1”, because when you delete all conditions you can keep 1=0 out of selection and it decays into a no-op rather than destroying a table:

  DELETE FROM table
  WHERE 1=0[ OR 1=1
    AND age > 21
    AND xyz = 'abc']
  ;
Brackets designate selection bounds before text deletion. The above just safely does nothing after you hit DEL.

Without that you’d have to delete whole [WHERE…], which leaves a very dangerous statement in the code.


Similarly, you could select NULL as your leading column, and prepend commas by that means.

That method does impact the result set, and using it for CTAS or bulk insert would require more care in column selection.


You can also just do "where true", easier to type.


Not in e.g. MSSQL


You've moved the problem from the last to the first element though. Surely people would prefer to be able to do this

  SELECT
      a,
      b,
      c,
  FROM ...

?


The first element is modified less often than the last element. Often it's just an "id" column or something. Comma-first is a net win.


as mentioned elsewhere, i personally introduce a pad element to get fire and forget consistency

  SELECT 1 as pad
    , a
    , b


That's something where "what it makes the computer do" overrides "how nice it looks in text form" to me.


Recently I've been formatting like this but with tabs so the first column is aligned with subsequent columns:

    SELECT
        a
    ,   b
    ,   c
    FROM ...


Mode Analytics published some data years ago showing that SQL programmers who preferred leading commas had a lower rate of errors than programmers who used trailing commas. [0]

0 - https://mode.com/blog/should-sql-queries-use-trailing-or-lea...


I do this but I'm skeptical of the causation. I think it might be a symptom of people who are generally more careful with syntax because the formatting means more to them, so they spend time reading the query and moving bits around, which is how they find little typo bugs.


> Before people start to mess with the standard

No, it's still a really good reason to mess with the standard.

While the standard doesn't evolve into something slightly modern, yes, that workaround is better than the way people usually write SQL. But its existence isn't a reason not to fix the fundamental problem.


SELECT a

       , b

       , c
 
  FROM ...
is the same as:

SELECT a, b, c FROM ...


I updated my comment... On the first try the code-formatter here played some tricks with me.

The line before the Code has to be empty to get correct formatting.


I've done this for arrays in JSON files, so that git merge will merge two changes that append to the list without a conflict.

I think the right answer is to fix the merge algorithm to handle some common cases where an inserted line logically includes the delimiter at the end of the previous line.


The problem with that is that it's invalid json. Some things might tolerate it though.


the simplest solution is to have commas be "separating whitespace" so that "," === ",,,,,,,,,,,," === " ,,,,, , , ,, , ," so your example can become

  SELECT
       , a
       , b
       , c 
  FROM ...
and

  SELECT a,
         b,
         c,
  FROM ...

or

  SELECT
         a,
         b,
         c,
  FROM ...

the only limitation is that sometimes more parenthesis are needed and that SELECT col_name new_col_name from table_name needs to be rewritten as SELECT col_name as new_col_name from table_name. good tradeoffs IMO


Alternately:

  SELECT a, b, c, NULL FROM ...

  SELECT a, b, c, true FROM ...

  SELECT a, b, c, 'ignore me' FROM ...
FWIW, my SQL grammar ignores trailing commas. IIRC, H2 Database does too.


This is how SQL Management Studio writes the queries it writes for you (like Select 1000 records) so when you comment out a line or lines, it doesn't cause any issues due to a misplaced comma.


This is way less readable to me, because now my brain has to parse "a comma, actual thing" instead of a list of things.


I have never been a fan of leading commas. How often are we hastily moving column expressions around?

I also have a somewhat controversial style preference in regard to capitalization. Because SQL is case-insensitive, I will always type everything in lowercase and let syntax highlighting do its thing. I hate mixed capitalization. Not only does it feel like keywords are yelling at me, but also the moment someone else gets involved there will be inconsistent casing. Do you capitalize just the keywords or do you include functions? How about operators (e.g. “in” or “like”). More often than not I see individuals are inconsistent with their own queries. So I say to hell with it all and just keep it lowercased


You are a monster. Being against trailing commas is like being against happiness or cute children or Cheetohs.

You do not need to see the missing comma. That is what compilers are for. Also, literally the only reason anyone has a missing comma is because they reordered the terms and forgot to put the comma onto the one that was forced to not have one because of this monstrous failure.

Some things are nothing but good. You are on the wrong side of history.


One thing I like about the nix language is it uses semicolons to separate elements of the map: while I use trailing commas, they always look dangling to me whereas semicolons look fine without another expression following.


That's as bad as using regular expressions: now you have TWO problems.

Why do you seem to think you've cleverly solved the problem, when you've just moved the problem somewhere else just as bad, by blithely messing with the standard formatting conventions universally used by most human written languages and programming languages in the world?

Programming languages borrow commas from human written languages, and no human written languages have leading commas. And moving the problem to the beginning on the list because you sometimes add things to the end ignores the fact that that also causes problems with READING the code as well as writing it, and you READ code much more often than you WRITE it.

That's not a solution at all, and there's nothing clever about it. You've just pushed the problem to the beginning of the list, and now your code is also butt-ugly with totally non-standard formatting, which sane people don't recognize and editors and IDEs and linters don't support.

I agree with cnity that it's too clever by half, and I'm in the camp (along with Guido van Rossum and his point that "Language Design Is Not Just Solving Puzzles" [1] [2]) that's not impressed by showboating displays of pointlessly creative cleverness and Rube-Goldbergeesque hacks that makes things even worse.

Of course it's not as bad as tezza's clever by a third suggestion to add a confusing throw-away verbose noisy arbitrarily named pad element at the beginning, that actually forces the SQL database to do more work and send more useless data. Now you have three or more problems. The last thing we need is MORE CODE and network traffic contributing to complexity and climate change. I would fire and forget any developer who tried to pull that stunt.

[1] https://www.artima.com/weblogs/viewpost.jsp?thread=147358

All Things Pythonic: Language Design Is Not Just Solving Puzzles. By Guido van van Rossum, February 10, 2006.

Summary: An incident on python-dev today made me appreciate (again) that there's more to language design than puzzle-solving. A ramble on the nature of Pythonicity, culminating in a comparison of language design to user interface design.

[...] The unspoken, right brain constraint here is that the complexity introduced by a solution to a design problem must be somehow proportional to the problem's importance. In my mind, the inability of lambda to contain a print statement or a while-loop etc. is only a minor flaw; after all instead of a lambda you can just use a named function nested in the current scope.

[...] And there's the rub: there's no way to make a Rube Goldberg language feature appear simple. Features of a programming language, whether syntactic or semantic, are all part of the language's user interface. And a user interface can handle only so much complexity or it becomes unusable.

[2] http://lambda-the-ultimate.org/node/1298

The discussion is about multi-statement lambdas, but I don't want to discuss this specific issue. What's more interesting is the discussion of language as a user interface (an interface to what, you might ask), the underlying assumption that languages have character (e.g., Pythonicity), and the integrated view of semantics and syntax of language constructs when thinking about language usability.


What your mind rejects, mine finds freeing. What's idiomatic and natural depends on personal experience and evolves, as does the culture around you. There was a time in my life when I may have rejected leading commas as well, but at some point I came around to them and have never looked back. It works for me. I legitimately find it easier on my mind, and it has caused me far fewer annoyances than a comma-less last column has. I have colleagues that use it as well out of their own preference. I would suggest that insisting on a universal orthodoxy of stylistic preferences is much more oppressing to the spirit than occasionally needing to adapt the mind to the reading of something formatted in an unfamiliar style.


Leading commas are the way!


People seem to think nothing of putting all other syntax on a new line (+, -, ||, &&, AND, OR are all fine) but as soon as you put a comma on a new line everyone loses their minds.


> I would fire and forget any developer who tried to pull that stunt.

A tad bit harsh there?

it is a trade off.

clarity and ease during design time versus slightly uglier but still consistent code as a work around. miniscule energy overhead


Firing and forgetting was your suggestion!

>tezza 4 hours ago | root | parent | next [–]

>as mentioned elsewhere, i personally introduce a pad element to get fire and forget consistency

Adding an extra pad entry is a cure much worse than the disease, and I'd expect it should be objectively obvious to anyone that you're introducing more complexity and noise than you're removing, so it's not just a matter of "style" when you're pointlessly increasing the amount of work, memory, and network traffic per row. But sadly some people are just blind to or careless about that kind of complexity and waste.

You might at least have the courtesy of writing a comment explaining "Ignore this extra unused pad argument because I'm just adding it to make the following commas line up." But that would make it even more painfully obvious that your solution was much worse than the problem you're trying to solve. You seem to have forgotten that other people have to read your code. Maybe just don't leave dumpster fires burning in your code that you want to forget in the first place.

As Guido so wisely puts it: "the complexity introduced by a solution to a design problem must be somehow proportional to the problem's importance".


>That's not a solution at all, and there's nothing clever about it.

It is a solution, and I'm not motivated by trying to be "clever". It just makes writing and reading the query easier for me.

>You've just pushed the problem to the beginning of the list

The beginning of the list is modified less often than the end. The two cases aren't symmetric.

>and now your code is also butt-ugly with totally non-standard formatting

"Ugly" is subjective. Personally I like how the commas line up vertically, so I can tell at a glance that I didn't miss one out. SQL doesn't have a standard formatting in any case. It's whitespace insensitive and I've seen people write it in all kinds of weird ways.

>which sane people don't recognize and editors and IDEs and linters don't support.

A difference in code-formatting taste is not "insanity". And it does not interfere with tooling at all.

>showboating displays of pointlessly creative cleverness

Where are you getting all of this from? You seem to be imagining a "type of guy" in your head, so that you can be mad at him.

I am reminded of Sayre's law: In any dispute the intensity of feeling is inversely proportional to the value of the issues at stake.


Programming languages borrow commas from human written languages, and no human written languages have leading commas

How is this any different from leading periods, which seems to have become the standard across several of the most popular programming languages?

    myobject
        .somefunc()
        .otherfunc();
It's not subjectively pleasant in my opinion, but I think it's hard to argue that it doesn't improve maintainability and (apparently) readability for the masses.


I don't always read code(*), but when I do...

   o 
  ムワ
The padding and separators are the least issues I wish I only had there.

We don't need philosophy or morals (or jobs where that's important), we just need the way to edit these damn lines without anything screaming "syntax error!" or parasitic "++--" diffs every time you make a change. “When art critics get together they talk about Form and Structure and Meaning. When artists get together they talk about where you can buy cheap turpentine.”

-

(*) That classic claim didn't turn out true for me after so many years, that I suspect developers simply avoid admitting that it's not true for them either.


Please god. I don't believe in you but maybe someone else does and will think me a kindred spirit worthy of mercy. Let me have trailing commas. There is no reason not to. It's backward compatible, easy to implement and would make the world so so so much better.


they are _not_ backwards compatible. That's a big part of the problem. A trailing comma is a syntax error for an SQL engine without support for it


I mean, yes, technically, but is anyone's code actually breaking because of this?

Who is writing SQL queries expecting them to fail because of this reason?

By your definition, "backwards compatible" doesn't mean anything. Literally everything will be a breaking change if you define "backwards compatible" like this.


I think the term you are looking for is "forwards compatible"! Old SQL queries will still run fine on engines that support the new syntax (they're forwards compatible.) New SQL queries with trailing commas will NOT run fine on engines that don't support trailing commas; this is not a backwards-compatible change. And that's fine.


It's forwards compatible from the perspective of the query - queries in the new style will continue to work on newer db versions.

It's backwards compatible from the perspective of the database - the db will continue to support queries from older versions.

I believe you're speaking from different perspectives.


> New SQL queries with trailing commas will NOT run fine on engines that don't support trailing commas; this is not a backwards-compatible change.

But they never ran fine on engines that didn't support trailing commas in the first place :/

What you're calling "forwards compatible" is what I call "backwards compatible". Frankly, I suspect most people expect "backwards compatible" stuff to work like this.

Is this distinction useful in any way?


I think the confusion may be whether you're talking about the queries or the engine. I think this change to the engine/parser would be backwards compatible because old queries will still work on the new engine. A change to the queries in a codebase to include trailing commas would not be backwards compatible because it won't work on older parsers. It seems clear to me that the change discussed here is the engine, hence it should properly be characterized as "backwards compatible".


No. It is not. Whomever brought it up was not helping.


I think the question is whether you are disagreeing for pedantic reasons or are actually trying to contribute.

If you are trying to contribute, you will use the phrase backward compatible as we always do: my old code will still run after the change.

If you are trying to be pedantic, you will choose a new way of looking at it: My new code will not work if you go backward in time.

I don't care. I will swear to never write a dangling comma for any engine that doesn't support it. Also, I will swear to never use an engine that doesn't support it if once that does exists... Just as I did with Javascript.



Don't let the perfect be the enemy of the good. I'm sure that 99% of the people who are requesting trailing comma support care about select lists. And CREATE TABLE for the rest. Yes, the SQL standard has a huge surface area of custom syntax (rather then most programming languages which have a smaller number of composable atoms), which makes adding "consistent" syntax changes a challenge, but it doesn't mean you need to boil the ocean. Just get trailing comma select lists into a version I'm likely to use in the next decade and I will love you forever!


Most of that additional syntax surface area would also benefit from trailing commas. For example, `SELECT * EXCLUDE (a, b, ...)`, or even `FROM a, b, ...`.


OP is massively overthinking it. Add them to CREATE TABLE and SELECT queries would remove 99.9% of annoyances.


I get where they're coming from though. Having trailing commas work in some common cases but most other cases where there are commas would be weird and definitely result in confusion. More confusion that knowing that trailing commas aren't a thing in SQL, currently.


OP is one of 7 PostgreSQL core team members. That's kind of his job to massively overthink this :)


I feel like this ship has sailed. SQL has been around for more than 50 years and everyone who needs to generate it has already put that extra `if` statement in to suppress trailing commas.

What annoys me far more often is the lack of support for trailing commas in JSON.


> lack of support for trailing commas in JSON

This, 100%! And the lack of comments.


JSON5 allows comments, it's been around since 2012. That said, JSON is not meant for humans / manual editing, and deciding to use it for configuration files was a mistake.


Thinking that a new, least sucking data format won’t be used for configuration was a bigger mistake. Like, yeah, I will exchange all my data in JSON now, but store configs in a good old XMLNS XSLT DTMF?


Store configs in the program if you can. Store in INI, TOML, or something similarly simple for humans if you must. Never use anything that requires matching syntax as the default (closing tags and matched brackets being the two main ones).

JSON is already strictly worse than XML though as it doesn’t support comments and multi-line strings in a sane way.


Thanks, but I tend to ignore this. Every time you, as a direct end-user, provide feedback on format or software, someone appears with a whole philosophy around why you shouldn't be like that and what you should do instead. And sometimes there's no "why" part even.

But I really don't need philosophy. I know what I want, and I want json with $subj, //comments and optional key quoting. Feels like some people just love making inconvenient things standard and/or teaching others how to live. (I mean the idea "json is not for X" here, not your comment)


even if it wasn't meant for humans and manual editing, it works reasonably well for these usecases


I'd say it even works better than many common configuration filetypes such as YAML or INI.


How am I supposed to make changes to this configuration file?


Funny I’m the exact opposite: I essentially never write JSON by hand, or add json content to repositories, so could not care less about the lack of trailing commas, I do semi routinely write or review SQL.


I agree. And JSON basically isn't meant to be hand-written, just easy to glance over, if you need to or do basic tests with. It's a serialization format. It's not a config format or anything like that. The idea of wanting to use it for that (when the config needs to be hand-written) should be a red flag for anyone. Why would you want to hand-write something where a key is denoted with double quotes?

And just to be clear: YAML is also not a config format and wasn't meant to be. YAML is for metadata style stuff that is supposed to be close to humans, heck the whole yes/no and "not so strict" typing parts.

If you want a config format you got many options: There is toml/ini and friends on one side and UCL/HCL/... on the other. Or if you want to go really simple, do something like Postgres, Tor, etc. do and just use space separated strings.


I've been bitten by trailing commas in my python+sql code repeatedly, it's an easy mistake to make when python itself has trailing commas (which I make deliberate use of).


I feel the same way about SQL too, it's set and difficult to shift. But I also look at PRQL longingly - https://prql-lang.org/


Have you seen the Google BQ pipe syntax? https://cloud.google.com/bigquery/docs/reference/standard-sq...

Feels like it does 75% of what PRQL does while still staying somewhat backwards compatible. Already works in BQ if you opt in.


Douglas Crockford made a commitment that JSON would never, ever get better. He promised us that it would always suck. He is as good as his word.

It has the small benefit that you never, ever have to worry about running into an old version of a JSON parser that bounces your fancy trailing commas.

Worth it? Not to me but Douglas Crockford does not care what I think. Or you either, apparently.


What extra "if"?

  ', '.join(fields)
;)


The closing tags should be optional too.

[{},{},,,,{},,

Should be fine. Now you can push things to the eof.


You probably want jsonlines. Being able to open a file in append mode without needing to parse the whole thing is great.

https://jsonlines.org/


I get why one wants strict parsing rules but i don't see the point of closing tags. It seems nice enough to have [say] logs in json. Why would one need to add closing tags before consuming the json elsewhere? Parsing the entire thing is a huge waste of time if it is large.

I can use csv or html ofc.


To this still empty thread: how about we just stop arguing and add these damn commas everywhere? How about having human- and devenv-oriented languages finally, after how many decades?


In my opinion: because the Robustness principle [1] is a non-insignificant cause of bugs and SQL injection is a major player in the vulnerability game.

The first case that jumps to me: if you write "a,b,c," you can't know whether you forgot a parameter, you passed an empty string where there shouldn't be one, or you intentionally left a trailing comma. And SQL already is human-oriented - compare your typical SQL to a data operation in K [2]. It just so happens that some things are hard and off-by-one errors are famously near the top of the list [3].

Whenever a database complains that your SQL query contains a trailing comma it's a sign that you may not have paid enough attention to your arguments (or even worse, that you're building queries by hand instead of using prepared statements). From where I stand, not allowing trailing commas is a feature, not a bug, and I would therefore object against them.

[1] "be conservative in what you do, be liberal in what you accept from others" - https://en.wikipedia.org/wiki/Robustness_principle#Criticism

[2] https://news.ycombinator.com/item?id=42999650

[3] http://www.randomprogramming.com/2014/09/quote-of-the-week-h...


This doesn't seem like it's much of an argument. Injecting `d, g` on the `b` place in your example will work fine. Trailing comma or no.


If you write "a,b,c" you can't know whether you forgot ",d" either, because if you wanted to type ",d" but got distracted and forgot it, the habit of leaving no trailing comma will lead to just "a,b,c" without ",d". And you can't know whether you forgot or mistyped a whole query either. What is this argument even, seriously? It's so weak that it is on par with suddenly having a stroke while programming. Which guardrails should we invent for that case?

SQL is fundamentally unsafe "forgot the item"-wise. It doesn't allow dynamic column selection, there's no type safety, nothing. If you want guarantees, maybe make a proper language with guarantees rather than justifying random side-effects of a comma being a syntax error sometimes.


How about we just do nothing and people write syntactically-correct SQL instead of demanding everyone and everything else change? I'm sorry, is this really the most important impediment to software development right now? Some guy's beef with the SQL parser?


You really want us to focus on more important impediments (whatever that means)? Just be glad we only want commas and add them already. If that trivial why even argue.


I anxiously await your pull request.



Let me start with FROM while we are at it.


I always start with "SELECT COUNT() FROM ..." for this reason, and then go back and replace the COUNT() when I get a handle on the tables/filters/groups involved..


I love it. I can see no downsides (as long as the updated implementations are well tested).

I reject any argument about how mature SQL is at this point. Just bump the version number (SQL 25) if necessary and make it so.


Doesn't feel like a breaking change to me.


IIRC BigQuery started allowing trailing commas in queries around 2019.

It all came as a very big surprise, out of thin air.


That's correct, and I absolutely love it. If I move to a new role where BigQuery is not the data warehouse of choice, I think this will be the thing I'll miss most.


Snowflake also allows trailing commas :)


It's nice when bigquery and python together for data analysis that little things like that are congruent


I have to add to my huge amusement at the particulars of this thread and deep appreciation for your nerdiness. As you can see from my other post here, I consider this question to be no question at all. Obviously all SQL engines should change. Equally obviously, they never will.

That all of you can take something so futile so seriously, spend time writing about your workarounds (which, of course, we all have in spades), is simply charming.

Thanks for you being you.


There is no correct answer for this; it’s a highly personal preference.

A lot of people prefer purity of grammar; these are the kind of folks who demand coding style guidelines and will be very unhappy if you violate their preference. They’re not wrong, but they have a very strong preference.

A lot of people don’t care so much about particular style decisions, but they want readable code that is easy to understand, maintain and modify. I think a lot of the trailing-comma-preferring people fall into this category. But again, it’s a preference and isn’t right or wrong.

There are probably also people who don’t care and compete in obfuscated code contests or try to minimize line counts by putting as many statements on each line as possible or otherwise writing genius-but-unreadable code by taking advantage of language and syntax idiosyncrasies. This is a preference too (although I consider it antisocial).

My preference is, if it’s unambiguous, allow trailing commas wherever there are lists, because it makes cut and paste operations much easier.

But at the end of the day that’s just my preference and not superior to anyone else.


Not worth the trouble. This is such a cosmetic change to appease a specific type of developer but the effort to implement that across all the DB engines of note would be monumental.


It's not just cosmetic, though, as now in order to add a new entry at the end you need to change two lines of code, which means your git blame is no longer accurate.


Literally unusable.


Or one database can have it as an option and that "specific type of developer" can just use that?


I don't think this is something that warrants such a change, but if I had a choice I would prefer to be able to completely avoid using comma, maybe just by using an extra pair of parenthesis. I this would break in more than one way, but one can dream.

   SELECT (a b c) from sometable where id in (10 20 30)
This would be quite useful when doing exploratory work and you want also to copy/paste values from somewhere else.

But to be fair the main issue in this case is handling of the WHERE clause, because (un)commenting parts is never straightforward

    select foo from bar 
    where
      x < y   -- cannot just comment this
      and (
        z = 10 -- neither I can comment just this
        or baz is not null
      )
Yes, one could put AND and OR on their own like, but similarly one could put a comma on a line of its own...


You want to replace comma with space or newline, and use parentheses to disambiguate subexpressions.

I'm not convinced this is better than what you are replacing.


I'd be happy with an query editor that silently removes the comma. I write way more ad hoc SQL than what I deploy, and this would be a great tool for faster exploration of data (ditto for "FROM table SELECT columnA, ....")


OMG please please please! Regarding the author's specific questions, I would suggest not letting perfect be the enemy of good.


That's not that easy to enforce in a system like Postgres. If you ship something you have to stay with it for many years. Thus, all possible angles have to be taken into consideration.


Clojure got it right, commas are whitespace. Put them where ever you want, or don't. Problems 100% solved, and parsing is easier. Everything else seems super dumb once you see it.


Without talking about SQL it would be nice for git diff. I only see benefits of supporting them, hopefully this can be integrated into the SQL standards


And git blame, and it would reduce merge conflicts.

Please add this.


> this might be the most requested feature in SQL

Don’t forget to take into account the people not requesting this feature.

A lot of them don’t think this is worth the changeover.


> A lot of them don’t think this is worth the changeover.

I think nobody wants to make trailing commas mandatory to use


Of course not. I don't think that has anything to do with it, though.


What changeover is there to be had if using trailing commas is not mandatory?


I'm a big fan of adding `WHERE 1=1` to the first line of a where clause. Then all remaining lines are prefixed with `AND` or `OR` which is nice for readability and indentation, and makes it easy to comment out a line while iterating.


> We support most cases, except the ones that are too complicated to implement or cause grammar conflicts.

It would be really helpful if the author had provided at least a couple of these.

I can't think of any obvious examples that would be complicated/conflicting, so it's not even clear if this is real complexity or not. I mean, it might be, but let's at least demonstrate that concretely?

SQL grammar is pretty limited. Surely it can't take more than half an hour or so to check if any grammatical ambiguities could be introduced -- or at least a quick first pass? This whole post is postulating about theoreticals when it could just answer some of them.


Yeah, this seems like the best option. I don't get the author's confusion

> Option 2 is weird, how do you determine the cutoff?

The cutoff seems obvious. If it is currently always a syntax error allow a comma, if it can be valid syntax preserve the existing behaviour.

I also can't think of any other cases right now, but if needed there can be a few places that don't allow trailing commas. As long as SELECT and CREATE TABLE support it 99% of the benefit will be there.


I guess trailing commas would make constructing the clauses programatically a little easier, if they are being appended one-by-one.

On the downside, does this make it easier to write code that is vulnerable to command injection, because it is easier to append a bunch of statements passed as input?

For the WHERE clause, if i happen to be programatically appending conditions, i'll start with `WHERE true` so i don't need to consider if i'm appending the first condition (no AND/OR/NOT etc.) or a later condition (required AND/OR/NOT etc.). A decent query planner should ignore the fixed value.


Programmatically it is easy to not add trailing comma (just use something like `join(',', array)`) but it would save time during manual editing of a long query - I often add a new line then run query to discover that I forgot to add , on a previous line. When each line including the last one ends with , I can just add a new line (also ending with ,) and save time.


Prql has trailing commas https://prql-lang.org/ if you're ok with running that transformation.


I was thinking about it again recently, but can't tell if it's worth. Have you used it in real life projects? How was it?


It's nice for more complex queries. Not a big difference for simple ones. I don't have a more nuanced take here - it works fine.


Unfortunately it’s written in Rust, so it can only be used with Rust projects.


If Rust can only be used with Rust, then it's strange that the project lists a number of other languages in the docs: https://prql-lang.org/book/project/bindings/index.html


Alright, it supports four languages, but that’s still not much.


The limitation is clearly not Rust. Any language that can bind to C libraries can bind to the functions PRQL exposes... the authors just haven't chosen to implement convenient SDKs for many languages. They also list 8 languages, not 4, just that they've had time to polish the libraries for 4 languages, apparently.

PRQL appears to be a rather small project... not some major corporate effort.


Yes, that's precisely it. PRQL is a completely volunteer driven project by folks who had enough of the thousands of paper cuts from SQL and felt that we deserved something better after 50 years. Throw away the SQL syntax and keep what people usually like about SQL - declarative, relational operators - plus add functions and composition.

The main limitation is developer time. There is so much that could be done with PRQL! Without corporate sponsor, parent company, or more contributors, velocity is unfortunately limited. If you'd like to see that change, please reach out!


I wouldn’t mind writing the C bindings myself, but the docs say that not even C is officially supported.


Hi, PRQL contributor here.

C bindings would be great! Do you want to open an issue in the repo?


Sorry, by my comment I meant that if PRQL exposed a C API (as the parent commenter claims it does), I’d write bindings for the languages I use that can consume C libraries. Unfortunately I’m not proficient enough with Rust to create the C API myself.


Right, that's not my forte as well. Seems like quite a key enabler though so let me see where we are on that.

There was also a helpful comment on a HN thread a few weeks back about how to make the API better to develop against. I've been meaning to get back to that but been constrained myself.


I suspect there is a communications breakdown happening here. I'll try to clarify what I was saying, since I think I did a poor job.

In Rust, when you define a `#[no_mangle] pub unsafe extern "C"` function, and then compile as a shared object / dll, that function will be exposed in an ABI-compatible way the same as any C function would be. It's just a matter of defining the proper header file so that you can use it from a C program, or from any other programming language that can bind to C.

Writing a header file manually is boring and error-prone, so people will often autogenerate the header file for the exposed functions using a tool like cbindgen: https://github.com/mozilla/cbindgen

We can see that PRQL is using cbindgen here to automatically create a C header: https://github.com/PRQL/prql/tree/main/prqlc/bindings/prqlc-...

The public API that PRQL wants to expose is defined in Rust here: https://github.com/PRQL/prql/blob/main/prqlc/bindings/prqlc-...

The generated C header file is here: https://github.com/PRQL/prql/blob/main/prqlc/bindings/prqlc-...

And that C header file -- combined with the compiled library -- should be all that is needed.

I suspect that the PRQL maintainer is saying that they want to offer a more idiomatic binding for C. The raw API that is exposed may not be the most user-friendly API, especially since they don't seem to have much familiarity with what is considered "idiomatic" in C, so they haven't been ready to commit to that API being considered "stable" yet. Based on my own poking around in their existing bindings... that C binding appears to be the API that they are using internally in those other language bindings already. (I'm also not sure how else they would be creating most of those bindings, if they weren't using that C binding... apart from some special cases, like how there is a convenient alternative for exposing bindings to Python from Rust, for example.)

We can see in the dotnet bindings, for example: https://github.com/PRQL/prql/blob/main/prqlc/bindings/dotnet...

C# does not allow directly using a C header file, so it requires manually re-defining the same set of extern function signatures, but it appears to be the same.

I'm not an expert on PRQL by any means, and it's been a few years since I really used Rust, but I'm just piecing together what I can see here.

This article I found could also be helpful: https://www.greyblake.com/blog/exposing-rust-library-to-c/

Rust code normally does not adhere to a C-compatible ABI, but the purpose of these "extern" functions is to do exactly that when you're trying to expose code that can be called by standard conventions... since the industry has largely settled on C-style functions and structs, for better or worse, with all of the limitations that imposes.


I know this is a bit offtopic but a lot of people have worse experience with using (otherwise excellent) PInvoke in .NET than strictly necessary.

> C# does not allow directly using a C header file

https://github.com/dotnet/ClangSharp?tab=readme-ov-file#gene...

there are bespoke libraries which build on top of it like CsWin32 where you specify the methods/modules to import and get nice and, often, memory-safe types and members in C#.

I think it should be possible to enhance this even further like having '// pinvkgen: #include <some_dependency.h>' at the top of any particular C# file and getting what you need in a C-like way. There are some problems with this inline approach but it could work.

The main point is there are quite a few community packages which simplify authoring bindings (there are more, like https://github.com/Cysharp/csbindgen/ for Rust or https://github.com/royalapplications/beyondnet for Swift). It really shouldn't be a necessity to write bindings by hand for large dependencies as it's both error prone and a solved problem.


What if they simplified it so commas were ignored and thus optional. Then they are purely for readability, allowing for whichever approach a project preferred as their standard (no commas, traditional SQL, with leading commas, with trailing conmas).

I could have missed something but I don't see anything that would be impossible to parse without commas so long as there's at least one character of whitespace separation.


The keyword AS would then have to be obligatory, to avoid ambiguity.


I suspect that it's perfectly valid to name a column "from" or any other SQL keyword so allowing a trailing comma would make the grammar ambiguous.

Personally I agree with the sentiment, I find it annoying to have to juggle the commas on the last column name but I think there is likely a valid reason to do with making lexing easier to reason about.


Most SQL implementations do seem to allow you to name a column or a table with a keyword, but to refer to it you may need to put it in quotes or backticks.

I'm not sure if this is a solved problem at the level of the ANSI SQL spec or if every vendor does their own thing, but there's definitely plenty of precedent that ambiguous grammar is allowed and can be resolved.


I know that MySQL uses backticks, Postgres/Sqlite uses double quotes, and MS SQL Server uses square brackets, when using keywords for a column or table.


The SQL standard uses double quotes.


The `KW item[,item]+ KW` grammar just sucks in general. Natural-ish grammars are really stupid and shortsighted.


Snowflake and DuckDB are two flavors of SQL that allow things like trailing commas. My personal favorite feature is `GROUP BY ALL`.

    select 
      c1, 
      c2, 
      c3, 
      ...,
      c50,
      sum(c51),
    from 
      table
    group by all


I really like the trailing comma, where it's supported, for that it can drive the language formatter - e.g. with trailing comma typically each item on it's own line, otherwise not. I think Dart is doing this, and other languages.


One of my favorite features of BigQuery SQL is that it accepts the trailing comma.


And by the way allow from…select Instead of select…from

Its the same as having the imports first


I wonder why this is not a thing yet. Makes a lot of sense to me as a heavy SQL user.


When do we get an object shell (powershell, there I said it) instead of old school strings? Been using bash for over 20y, but MS had to make it popular.. I wouldn't mind at least a proof of concept for SQL


What would it offer that you can't already get from DBATools, LINQ, and ORMs more generally?


I probably hand-write and generate more SQL than most. I don't find this to be anything to be concerned about--there's so much change coming soon.

If I had to choose I'd opt for better editors.


Reminds me of my favorite refrain that people love complaining about all of the parentheses in lisp; but completely ignore that it largely eschews all other punctuation.


Should probably write a compiler that compiles "enhanced SQL" down to compatible SQL. Make it accept trailing commas, queries starting with FROM etc.


This trick works for SELECT and is way less ugly than putting a leading comma. Just SELECT a constant:

‘SELECT a, b, c, 0 FROM t1 ‘


DuckDB has this feature:

duckdb> select 1,2,;

returns 1 and 2

https://shell.duckdb.org


As mentioned in TFA.


indeed


I just use an array and implode it into a string. Then can both add and remove things.


At some point we should just commit the AST to git and render it however it is preferred in IDE.


last time I was on a project that involved a lot of data analysis/writing SQL, I was gonna write a plugin for "sloppy SQL" which would let me use trailing commas and some other qol features, but that project ended before I got around to it.


No doubt the most errors in my SQL syntax when writing migrations is the extra comma.


I would unconsciously delete the trailing comma for a couple weeks/months.


BigQuery supports trailing commas :)


Why even have commas at all? This is a non-issue for the likes of Tcl and Lisp and your average shell-scripting language and what have you; whitespace is already a good enough delimiter.


Genuinely, I'd love to see this approach be taken: allow for a set of characters to be used as list delimiters. I personally like the set to be comma, semicolon, and newline, but of course this set would need to be varied depending on other syntax (e.g. in SQL, we wouldn't want semicolon to be used for this).

Having newline be a valid list separator is particularly nice because it solves the "trailing comma" and "comma-first" style workarounds in a visually elegant way. The newline already provides a visual separator; we can already tell that we're at the end of most lists by way of having another keyword appear next without needing to rely on a lack of commas, for example:

    select
        id
        name
        email
    from users


Because there's existing implementations that would interpret that as aliasing column names.

``` SELECT Field AS Renamed, OtherField AS AlsoRenamed ```

and

``` SELECT Field Renamed, OtherField AlsoRenamed ```

are semantically equivalent.


Making AS mandatory for naming columns would be an improvement in and of itself.


Yes, but it's also a breaking change to a very large legacy codebase. I can't see it ever happening, sadly.


Because in SQL whitespace is already used for all manner of special syntax.

That said, generating SQL from s-expressions can be a very pleasant experience.


Maybe some ideas are just not good ideas.


I feel like if you actually start writing the amount of queries where this would be useful you stop caring about tiny details like that.


After writing code for 20+ years these sorts of concerns are just tiresome. The effort expended debating and enforcing optimizations like this completely wipes out any and all benefits they bring, which is incredibly minimal to begin with.

Just read the code, write your change, move on. If you find yourself missing small details like this you need to just slow down a tiny bit. Its not hard. By all means format your code cleanly, just don't spend cycles writing blog posts about it.


Not sure why you're being downvoted; this is the only mature, reasonable answer. This is a frivolous complaint, that would upend the SQL grammar, because someone blogged about being mildly inconvenienced. I _wish_ I had tiny, nitpicky problems like this, but my problems are actually substantial.


Why do we need commas?


I've never cared for commas. I like cooking my family and my dog.


Trailing commas are stupid no matter what language they are in.


I have the complete opposite view, it is stupid to not have them.

Whenever I edit JSON and SQL there is a lot of fiddling with commas when rearranging lines or adding new lines to the end of the list. In other languages I use (Go and TypeScript) there is no such fiddling.


Because…


I nominate this post for worst idea of the century.


Trailing commas are an abomination


This seems so minor it's hardly worth discussing. I in the 'who cares' camp.


People who write code.


People who endlessly complain about syntax aren't usually writing any meaningful code, I've found.


Have written a lot of code over ~35 years. Not a fan of trailing commas in any language I've used.


That seems odd. The number of times you have had to correct for commas is 100% over 0%

Now extrapolate to the whole industry. There's a cost, regardless of how someone might want to value it.


I find them aesthetically displeasing and almost never get annoyed by having to twiddle a comma here or there. If I were to stack rank things that hinder my efficiency, they'd be down towards the bottom.

I mean sure, if people spend most of their day just mindlessly copy pasting crap then perhaps I guess. But compared to the amount of time I find I spend thinking about the actual problem at hand, the single keystroke to edit a comma is a rounding error.


Lmao yeah, web pages take 10+ seconds to load megabytes of JavaScript that do nothing but waste time and energy, but the SQL commas are the real culprits!

What a joke.


a joke worth making a burner over?


The difference between you and a single celled archea is literally only many such small "who cares" changes.


Yes but this one isn't it.


Count how many people think it is. Accept that it is for many people.


I want leading _and_ trailing commas.

Frequently I will be working on a query and have something like

    SELECT
       a
      , b
      , c
    from foo
and then I want to comment out a column as I am working and exploring the data. But I cant comment out column a without also removing the comma on the next line.


Wouldn’t just trailing solve your issue?

    SELECT
    a,
    b,
    c,
    From Foo


in 20 years of writing SQL, preceding commas is so much better IMO. Its so easy to miss a comma at the end of a long expression. preceding commas means you can never forget them.

Then if I can have an extra leading comma, I can reorder, comment out or remove, or add a column at any point in the list without having to think about the rest of the projection. Also diffs are cleaner, it only highlights the rows that have changed, not the row above it/below it that had its comma added or removed. This happens a ton when im iterating on a query.


It's only easy to miss a comma at the end of a long expression because you need to calculate whether it should be there in the first place. If commas were always required unconditionally, it wouldn't be a problem.


its still a problem if you forget to write it, and with a trailing comma its often that you forget it because theyre hard to see


Moved the problem to column C


But if this change is adopted, then commenting out column C would still compile the query. There isn’t a need for leading?


I think the right answer is that people who care about these things should use a client that rewrites queries via an LLM and clearly show the changes it makes (not sure if this exists yet but I'm sure it will). This also would handle a few other problems I've run into such as having multiple WHERE clauses or doing a WHERE prior to a JOIN and it could also allow using plain English to describe what you want to query.

I've found LLMs do a really good job of writing/cleaning up SQL.


Ugh. Why? To make copy/paste programming easier? To make query generation easier? When you're writing code to generate queries, it's worth doing it right. Just about every programming language has an easy way to take an array of strings and add a separator between each element. Like PHP's implode: implode(', ', ['foo', 'bar', 'baz']) == 'foo, bar, baz'.

Every time I see a trailing comma, I think "is this a bug? did they forget an element?".

Edit: as a reply to the comments below: that's a lot of hassle to save you hitting backspace once. There's premature optimization, and then there's single keystroke optimization.


To make writing and maintaining sql easier. To make sql diffs better. To make sql more consistent with other languages.

Every time this trips me up, it’s hand written SQL, because literally every other language I routinely use supports trailing commas.

The additional complexity during codegen is barely existent. If you’re using an orm or code generator this will be an issue once at most (if and when you write your own).


> To make writing and maintaining sql easier.

From my experience there are a lot of programmers that simply don't write SQL at all. They use an ORM or some query building library in their preferred programming language. So they're the ones that don't understand the problem when handwriting SQL, because they simply don't do it. Even the parent comment here is talking about PHP array syntax instead of SQL.

I write all of my queries directly in SQL, and I run into the trailing comma issue somewhat frequently. It's a minor annoyance, but I've just been dealing with it forever and accepted it.


Everyone knows the hardest part about writing SQL is having to get Claude to tell you where the commas should go.


You can already do this without changing the SQL language spec. If you really care about the things you say, follow the other comments and make the first element special which is changed much less frequently.


So you can’t do it, you can do a different thing which (as I already answered in details elsewhere) still sucks but differently.

Great contribution, thanks for nothing.


Curious to see your Postgres PR with this change in it. Is it done yet?


From my own experience:

1. It makes git diff less noisy when I need to add new element at the end since you don't need to add an extra comma at the end of the now-second-to-last element.

2. It makes reordering the last element a lot more pleasant during development since you don't need to add and remove comma when you are moving the line.

And the same applies (or I wish applies) to SQL. Time and time again I wish trailing comma is a thing in SQL when I'm constructing and testing even slightly more complex queries.


During prototyping, I'm frequently adding and removing entries from lists, or change their order. It's really annoying when you have to update the commas afterwards. Much nicer dev experience when ever entry ends with a comma, and you can juggle them around at will.


That’s why I use leading commas.

Unless I change the first column, no problem.


Yea, and if you do trailing commas, "unless I change the last column, no problem". This is exactly the problem.

Supporting leading or trailing redundant comma are both good options. Not supporting either isn't.


I rarely change the first column in a query but often the end especially while debugging. So a leading comma isn't the same problem to me.


When you start working with sql you despise leading commas.

When you stop working with any other code than SQL you look back on your younger self and think ‘I was soooo young’ when typing those lovely leading commas.


That sounds made up on the spot. I wrote leading commas in SQL for decades, no such thought crossed my mind. I still use and suggest them to people. They look especially good when aligning selected columns.


I hated them! I couldn’t stand the look.

Learned to love the functionality of these though and the looks grew on me.


I mainly work with SQL and like them.


It's so you can do this in code:

  my_things = [
    "thing_1",
    "thing_2",
    "thing_3",
  ]
And not have to juggle commas when you add a new end thing.


That’s why I use leading commas.


The first line cannot have a leading comma resulting in inconsistency then too.


That's for where the language parser allows you to have have leading commas i.e line continuations without a trailing comma.

And some, amusingly, don't.


You’re clearly missing the entire point. The mismatched number of commas in either orientation requires you to alter unrelated fields when making modifications.


I rarely change the beginning of the query more likely the end.


It's not about typing; code is never really about keystrokes. It's about easily moving lines around (up/down, copy paste, etc), diff noise (only one line changed instead of two), etc. See also https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe... it's always been allowed in JS, even though if I recall correctly Internet Explorer didn't allow it.

Anyway, SQL is a different beast entirely, this is specifically about hand-written SQL which needs to be optimised for readability and comprehension, both in the query itself and the diffs changing them. Spreading columns and arguments across multiple lines is common if not mandatory for writing maintainable SQL.


> It's not about typing > It's about easily moving lines around

Those are the same thing, just different keystrokes.

> diff noise

This is a non-issue, or rather a solved issue. Any half competent diff tool does word diffs instead of just line diffs. Changing the syntax of many (every?) programming language to promote "better diffs" seems over the top. And what's the result? "a,b" -> "a,b,c" vs "a,b," -> "a,b,c," both have equally "noisy" diffs.

> Spreading columns and arguments across multiple lines is common if not mandatory for writing maintainable SQL.

I agree. But I disagree that a trailing comma makes it more readable or maintainable in any way. I can't look at a trailing comma without wondering what's missing.


That's what I was thinking. Sometimes it's called join, like Enum.join/2 in Elixir.

Either you're cuddling the strings with your hands and it'll take time and care anyway, or you're doing programmatic massage, in which case most DB-driver capable languages has a bunch of functions that solve issues like these.


Might not be so bad if the syntax error for it were better than

  Syntax error at or near ')'




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: