I feel the problem. When coding (not only in SQL) you often have to add something to the end of a list, and it is annoying that the end of the list is always special. You can't just copy some line and move it there. Also when moving things around you always have to take extra care at the end.
So, my solution for this was always
SELECT a
, b
, c
FROM ...
instead of:
SELECT a,
b,
c, -- people want to allow trailing comma here
FROM ...
Leading commas also are also very regular and visible. A missing trailing comma is a lot harder to see.
Before people start to mess with the standard, I'd suggest to maybe just change your coding style - and make the start of the list special - instead of the end.
I'd suggest going for a lot less trailing commas - instead of allowing more.
> it is annoying that the end of the list is always special. You can't just copy some line and move it there. Also when moving things around you always have to take extra care at the end.
You have simply moved the "special" entry to the beginning rather than the end.
Side remark: I've noticed that when it comes to code formatting and grammar it's almost like there are broadly two camps. There are some instances of code formatting that involve something "clever". Your example of leading commas above for example. Another example is code ligatures. It's as if there's a dividing line of taste here where one either feels delight at the clever twist, or the total opposite, rarely anything in between. I happen to dislike these kinds of things (and particularly loathe code ligatures) but it is often hard to justify exactly why beyond taste.
Code ligature thing has something to do with just seeing the characters that are actually there rather than a smokescreen, which IMO impedes editability because I can't place the cursor half-way through a ligature and so on. But it's more than that -- you could fix those functional issues and I'll still dislike them.
> I can't place the cursor half-way through a ligature and so on.
That sounds like a bug in the editor. Ligatures are incredibly common font features even outside of programming, even sequences like fi are often ligatures.
How could this possibly be done for ligatures which don't maintain their horizontal position though? `!=` will usually be shown as an equals sign with a line through it right down the middle.
Adding something to the end is the most common thing to do.
And changing the start is extremely rare - it's anyway special because you usually put it in the line with the SELECT.
No, you usually don't. That would "bury" the first field so you don't immediately see it if you quickly glance at the code. I'll admit I was a bit surprised when I saw a fully formatted SQL query but it does look much better:
SELECT
a,
b,
c,
d
FROM
Customers
ORDER BY
b,
c DESC
Edit: I've just seen other comments here suggesting you return an extra "pad" value at the start so you can use leading commas without "losing" the first value visually. I hardly know where to start with that. It transmits more data on the wire and, to my eyes, looks horrific. That level of hackery is proof, as if it were needed, of how badly needed trailing commas are.
> Any new style will look bad, simply because it is new to us.
I agree that new styles and syntaxes can take some getting used to. When I went from Pascal to C, braces seemed awful but I now think punctuation is much clearer than words for delimiting blocks.
But the discussion here is about whether to introduce a new clearer style. Taken on this context, your comment amounts to defending any existing syntax, no matter how horrific, because you'll get used to it eventually.
> But the discussion here is about whether to introduce a new clearer style. Taken on this context, your comment amounts to defending any existing syntax, no matter how horrific, because you'll get used to it eventually.
No, I'm not defending any existing syntax.
As I said, it is about minimizing errors. You can get used to anything. But some styles and syntaxes are more error-prone than others. Leading commas in SQL is less error-prone, and therefore is preferred. No matter what your initial aesthetic impression. Allowing extra trailing commas is not portable, and therefore will increase errors if you come to rely on it then move between databases. So if the feature is provided, I will still be slow to adopt it into my style.
I always do this. There's usually one field that you can put at the start that never changes. But the field at the end will keep changing as you add more fields to the SELECT list.
SELECT
a
,b
,c
,d
FROM
Customers
ORDER BY
b
,c DESC
reordering, insertions and deletions much easier within the IDE as you’re manipulating lines as a whole. focus maintained while you are in the zone or tired is the biggest gain
at the cost of slightly uglier code and a miniscule… truly miniscule wire overhead.
other decisions will have much larger wire impact, say choosing a column wider than necessary
I'm curious why you think so? I agree totally with the parent comment; when I iterate on a SQL query the most common place to make changes is near the end of the SELECT block, adding and removing and refining column expressions. I do the exact same "comma first" trick to make my life easier.
I only write SQL for complex grouping operations, otherwise the Django ORM is superior. And in those situations, the ordering is really not very relevant, and thus changes can occur anywhere.
> Code ligature thing has something to do with just seeing the characters that are actually there rather than a smokescreen, which IMO impedes editability because I can't place the cursor half-way through a ligature and so on
Why wouldn't that be possible? (The cursor thing)
It's still two characters as far as your editor is concerned.
I didn't say it's not possible, but certainly when I've tried ligatures in the past my editor treated it like a single character until I pressed backspace after the character (or delete before) or whatever. Anyhow there's all sorts of weird functional artefacts that just feel dodgy to me and they basically all arise from this world where multiple runes are being treated as a single one. Here are two more examples:
1. If I `/=` in vim (find equals signs), should the "!=" ligature be highlighted? If it is, then it is misrepresenting the state: if I then press delete, the highlighted ligature won't actually be removed, only part of it will be.
2. In javascript the `===` operator is less visually distinct from `==` since if there are no comparisons nearby I have to judge from the length somehow and it takes longer, adding some additional comprehension fatigue.
Anyway like I said, you can probably try to fix all of these with weird special case handling but it's just fighting the underlying assumption my editor makes (and my brain has learned to understand): a source file is a list of runes. That's how I like to think about it.
Anyone is free to think otherwise. They're just in that other camp and that's OK. But I don't like them personally.
Among the editors which handle well ligatures is Geany.
The cursor moves always over a character, regardless whether it is part of a ligature or not. You can select and delete or replace only a part of a ligature, exactly in the same way as when using a font without ligatures.
The only thing that ligatures change is the visual appearance of the text, which is in my opinion an extremely useful workaround for the inertia of the legacy programming languages, which continue to use the restricted ASCII character set, which has never included all the characters that would have been needed in a programming language.
Now with ligatures, we are finally able to write an ALGOL 60 program that looks like it was intended to look, 65 years ago, instead of being forced to use awkward combinations of ASCII characters, like later programmers and designers of programming languages have resigned themselves to do, because of the character set limitations that the American manufacturers of computing equipment have been unwilling to remove (causing problems not only to programmers, but also to the users of non-English languages).
ASCII has never been intended to include the characters required for writing mathematical expressions. ASCII has been intended only to include the set of characters that were in use in the commercial correspondence written in English, which had been previously available in the better typewriting machines. Because of this, ASCII has always been bad as a character set used for a programming language.
I'm not sure if this is an example of code ligatures, but I used to work with a guy who had configured his editor so that "=>" was replaced with some kind of special arrow character, and it used to drive me nuts. When I read code, I want to know what's actually there, and not have to ask "what does that arrow actually mean?"
But you have demonstrated that you know what the arrow means. Do you have your editor configured to display special symbols for all whitespace characters? Including newlines? Are you sure that you aren't just annoyed by looking at stuff you're not used to?
when chrome started hiding the https prefix from urls it still showed the same information as https was the only hidden prefix, yet a lot of people were similarly upset.
Code ligatures make text harder to read because they are not text, expecially when many of those ligatures are identical to actually existing unicode characters making the gliph more ambiguous than they laready were.
IMHO symbol should reflex their use: the dozens of different arrows used in math are meant to be handwritten or at least to be seen as a complex gliph.
=> and -> in almost every programming language is a = or a - followed by a >. We could very easily make a language where ⇒ is an operator (https://isabelle.in.tum.de/ does for example IIRC) but most languages use simple ascii characters for grammar for good reasons.
IMHO code ligatures are worse than a cursive coding font.
> Do you have your editor configured to display special symbols for all whitespace characters? Including newlines?
I actually have my editor configured to display special symbols for all control characters except U+000A NEW LINE (that includes U+0009 HORIZONTAL TABULATION as well, this is actually a control character, not a whitespace) and all whitespace characters except U+0020 SPACE.
Also, to generalize a bit: if a human is expected to read it, the way humans write should be able to be parsed by it. That's subjective, to a point, but it's an easy rule-of-thumb to remember. Trailing commas are so common that people have built workarounds for them. Therefore, they can be understood as "the way humans write". If you're writing a language that you still want to be readable by humans, you really should account for that. And, no shade for it not already being done. I'm just reiterating that there should be NO pushback to allowing trailing commas. It's a completely "common-sense" proposal.
The SQL standards committee is having none of it. I can tell you that just from this one sentence.
And, more seriously, there isn't really such a thing as a common-sense proposal with SQL. The grammar is so warped after all these years that there isn't a path to consistency and the broken attempt at English syntax has rendered it nearly incomprehensible for both human and machine parsing. Any change to anything could have bizarre flow on effects.
I'd love to see trailing commas added to SELECT though. Given the mess it isn't possible to make the situation worse and the end of the list being special can be infuriating.
> FWIW, personally, I'd rather go the full-Python, using newlines as delimiters:
I love Python - mostly, but significant whitespace is its worst curse, and I'd love new languages to move away from the idea. I enjoy Rust because I can write out garbage faster than I can think of it, and the auto-formatter will make it look neat.
Also, have you experienced the unspeakable horror that is templating YAML files? Of course in SQL prepared statements is the safe&sane way to go, but there are cases where you still have to generate SQL as text, such as when building an ORM.
I'm slowly shifting in that direction as well, but the libraries built around this style are far less mature / mainstream. Django is the "boring" solution and gets hella work done.
Actually it's still ambiguous, because you forgot about sub-expressions. How would you parse this:
SELECT a as aliasA b select q from foo as aliasQ c as aliasC FROM d as aliasD e
Even if you can figure it out (and add a 3rd level of sub-expression before deciding if you can), it's completely unreadable. You need either commas or parenthesis.
We can simply employ ACI (automatic comma insertion). Every newline after Select implies a comma, unless there’s a clearly unfinished column definition. And if you want to list columns on a single line, you have to type commas explicitly.
Select Foo Foo, Bar +
1 Baz
Baz Bar
Quux Frob
from
t Table
Ideally there'd be a syntax that looked like a function `SELECT(a, b, c, d)` with a totally distinct variant for specifying types `TSELECT(a, int, b, null, c, text, d, timezonetz)`.
The big problem here is the seemed-good-in-the-70s syntax that died with SQL. In the best of all possible worlds they could have just used Lisp as the foundation then let people macro/transpile their own syntaxes. A subtle flaw in SQL is being just hard enough to correctly emit that most people don't build new languages on top of it.
Oh, just add optional commas at the end of your language's parameter lists...
I have really no idea what's "ideal". Haskell's "white space means application" is way nicer than most function syntaxes, end of line statement separation work very well if you don't need any long statement, the SQL complex syntax is very convenient... and yet if you mix any of those things, you'll get a truckload of disastrous issues.
But one thing that is clear is that allowing for trailing commas in SQL is a gain.
There is PRQL and a few others.
There is also JOOQ which does transpilation to and between various SQL dialects (sadly open core with a small core).
Also, some databases like ClickHouse are implementing alternative languages like PRQL and Kusto natively.
If the items in the list are long IF(...), maybe uses several lines and maybe contain SELECTs it's hard to see a missing trailing comma.
At the start they're all lined up well, and it's very hard to get them wrong.
I do the same, though my muscle memory is `1=1` instead of `true`.
Of course then you get editors/linters/coworkers that always point out that the 'true' is unnecessary. This also doesn't work with ORs (just swap to false), but in practice it seems it is always ANDs that is being used.
God, everyone's going to hate me for this. (I will have earned it, I think.)
SELECT ...
WHERE foo = 1
AND bar = 2
Each keyword gets a new line, the middle gutter between keyword and expressions stays in the same place, and things get really, really fugly if I need a subselect or whatever. Any given line can be commented out. (And no, none of that leading comma bullshit, somehow that looks nasty to me.) Downvote this into oblivion to protect the junior developers from being infected by whatever degeneracy has ahold of me.
In order to quickly comment out the "foo = 1" here, you cannot simply comment out the whole line because it would become syntactically invalid:
SELECT ...
--WHERE foo = 1
AND bar = 2
I have a single keyboard shortcut to comment/uncomment a line because I like to work briskly with as little unnecessary typing.
It has nothing to do with indentation or being "fugly". I'm talking about interactive exploration of prototyping when the final SQL isn't set in stone.
I slowly come to creating vim scripts for all that. I already use `alt-,` for triggering the final comma on a line. Maybe with modern LLMs I just need to prompt a vimscript that detects where the line is (SQL condition, array item, json, etc) and use `alt-,` to do the right thing. Or something like "fixing" the whole block with `g,ap`. Because all this is irritating and no one does anything with it for decades.
IMHO this is one of the ugliest formattings. Whenever I see that i try to revert and avoid at all costs. I know it's a personal flavor, yet. I might be too opinionated..
I too prefer leading commas, especially useful when you're prototyping a query.
I also picked up the ORM trick of starting your WHERE clause with 1=1 so that every meaningful filter can start with AND ...
I'm not as consistent with this one, but it's handy too.
I catch (friendly) flak for my zealot SQL formatting (capitalization, indenting) and know it doesn't impact the execution, but there's something about working in logic / set theory that matches with strict presentation; maybe helps you think with a more rigid mindset?
ClickHouse has support for trailing commas for several years.
I recommend looking at ClickHouse (https://github.com/ClickHouse/ClickHouse/) as an example of a modern SQL database that emphasizes developer experience, performance, and quality-of-life improvements.
I'm the author of ClickHouse, and I'm happy to see that its innovation has been inspired and adopted in other database management systems.
cool, that would work too. my preference is leading separators so the separators are all in a visual column. being in a visual column allows the eye to discount the separators easily.
typically names are different lengths and the commas are hard/harder to spot
Your suggestion:
SELECT
first_column,
second_column_wider_a_lot,
(third + fourth_combined_expression),
1 as pad FROM
vs my current preference:
SELECT 1 as pad
, first_column
, second_column_wider_a_lot
, (third + fourth_combined_expression)
FROM
I prefer “1=0 OR 1=1”, because when you delete all conditions you can keep 1=0 out of selection and it decays into a no-op rather than destroying a table:
DELETE FROM table
WHERE 1=0[ OR 1=1
AND age > 21
AND xyz = 'abc']
;
Brackets designate selection bounds before text deletion. The above just safely does nothing after you hit DEL.
Without that you’d have to delete whole [WHERE…], which leaves a very dangerous statement in the code.
Mode Analytics published some data years ago showing that SQL programmers who preferred leading commas had a lower rate of errors than programmers who used trailing commas. [0]
I do this but I'm skeptical of the causation. I think it might be a symptom of people who are generally more careful with syntax because the formatting means more to them, so they spend time reading the query and moving bits around, which is how they find little typo bugs.
No, it's still a really good reason to mess with the standard.
While the standard doesn't evolve into something slightly modern, yes, that workaround is better than the way people usually write SQL. But its existence isn't a reason not to fix the fundamental problem.
I've done this for arrays in JSON files, so that git merge will merge two changes that append to the list without a conflict.
I think the right answer is to fix the merge algorithm to handle some common cases where an inserted line logically includes the delimiter at the end of the previous line.
the simplest solution is to have commas be "separating whitespace" so that "," === ",,,,,,,,,,,," === " ,,,,, , , ,, , ,"
so your example can become
SELECT
, a
, b
, c
FROM ...
and
SELECT a,
b,
c,
FROM ...
or
SELECT
a,
b,
c,
FROM ...
the only limitation is that sometimes more parenthesis are needed and that SELECT col_name new_col_name from table_name needs to be rewritten as SELECT col_name as new_col_name from table_name. good tradeoffs IMO
This is how SQL Management Studio writes the queries it writes for you (like Select 1000 records) so when you comment out a line or lines, it doesn't cause any issues due to a misplaced comma.
I have never been a fan of leading commas. How often are we hastily moving column expressions around?
I also have a somewhat controversial style preference in regard to capitalization. Because SQL is case-insensitive, I will always type everything in lowercase and let syntax highlighting do its thing. I hate mixed capitalization. Not only does it feel like keywords are yelling at me, but also the moment someone else gets involved there will be inconsistent casing. Do you capitalize just the keywords or do you include functions? How about operators (e.g. “in” or “like”). More often than not I see individuals are inconsistent with their own queries. So I say to hell with it all and just keep it lowercased
You are a monster. Being against trailing commas is like being against happiness or cute children or Cheetohs.
You do not need to see the missing comma. That is what compilers are for. Also, literally the only reason anyone has a missing comma is because they reordered the terms and forgot to put the comma onto the one that was forced to not have one because of this monstrous failure.
Some things are nothing but good. You are on the wrong side of history.
One thing I like about the nix language is it uses semicolons to separate elements of the map: while I use trailing commas, they always look dangling to me whereas semicolons look fine without another expression following.
That's as bad as using regular expressions: now you have TWO problems.
Why do you seem to think you've cleverly solved the problem, when you've just moved the problem somewhere else just as bad, by blithely messing with the standard formatting conventions universally used by most human written languages and programming languages in the world?
Programming languages borrow commas from human written languages, and no human written languages have leading commas. And moving the problem to the beginning on the list because you sometimes add things to the end ignores the fact that that also causes problems with READING the code as well as writing it, and you READ code much more often than you WRITE it.
That's not a solution at all, and there's nothing clever about it. You've just pushed the problem to the beginning of the list, and now your code is also butt-ugly with totally non-standard formatting, which sane people don't recognize and editors and IDEs and linters don't support.
I agree with cnity that it's too clever by half, and I'm in the camp (along with Guido van Rossum and his point that "Language Design Is Not Just Solving Puzzles" [1] [2]) that's not impressed by showboating displays of pointlessly creative cleverness and Rube-Goldbergeesque hacks that makes things even worse.
Of course it's not as bad as tezza's clever by a third suggestion to add a confusing throw-away verbose noisy arbitrarily named pad element at the beginning, that actually forces the SQL database to do more work and send more useless data. Now you have three or more problems. The last thing we need is MORE CODE and network traffic contributing to complexity and climate change. I would fire and forget any developer who tried to pull that stunt.
All Things Pythonic: Language Design Is Not Just Solving Puzzles. By Guido van van Rossum, February 10, 2006.
Summary: An incident on python-dev today made me appreciate (again) that there's more to language design than puzzle-solving. A ramble on the nature of Pythonicity, culminating in a comparison of language design to user interface design.
[...] The unspoken, right brain constraint here is that the complexity introduced by a solution to a design problem must be somehow proportional to the problem's importance. In my mind, the inability of lambda to contain a print statement or a while-loop etc. is only a minor flaw; after all instead of a lambda you can just use a named function nested in the current scope.
[...] And there's the rub: there's no way to make a Rube Goldberg language feature appear simple. Features of a programming language, whether syntactic or semantic, are all part of the language's user interface. And a user interface can handle only so much complexity or it becomes unusable.
The discussion is about multi-statement lambdas, but I don't want to discuss this specific issue. What's more interesting is the discussion of language as a user interface (an interface to what, you might ask), the underlying assumption that languages have character (e.g., Pythonicity), and the integrated view of semantics and syntax of language constructs when thinking about language usability.
What your mind rejects, mine finds freeing. What's idiomatic and natural depends on personal experience and evolves, as does the culture around you. There was a time in my life when I may have rejected leading commas as well, but at some point I came around to them and have never looked back. It works for me. I legitimately find it easier on my mind, and it has caused me far fewer annoyances than a comma-less last column has. I have colleagues that use it as well out of their own preference. I would suggest that insisting on a universal orthodoxy of stylistic preferences is much more oppressing to the spirit than occasionally needing to adapt the mind to the reading of something formatted in an unfamiliar style.
People seem to think nothing of putting all other syntax on a new line (+, -, ||, &&, AND, OR are all fine) but as soon as you put a comma on a new line everyone loses their minds.
>as mentioned elsewhere, i personally introduce a pad element to get fire and forget consistency
Adding an extra pad entry is a cure much worse than the disease, and I'd expect it should be objectively obvious to anyone that you're introducing more complexity and noise than you're removing, so it's not just a matter of "style" when you're pointlessly increasing the amount of work, memory, and network traffic per row. But sadly some people are just blind to or careless about that kind of complexity and waste.
You might at least have the courtesy of writing a comment explaining "Ignore this extra unused pad argument because I'm just adding it to make the following commas line up." But that would make it even more painfully obvious that your solution was much worse than the problem you're trying to solve. You seem to have forgotten that other people have to read your code. Maybe just don't leave dumpster fires burning in your code that you want to forget in the first place.
As Guido so wisely puts it: "the complexity introduced by a solution to a design problem must be somehow proportional to the problem's importance".
>That's not a solution at all, and there's nothing clever about it.
It is a solution, and I'm not motivated by trying to be "clever". It just makes writing and reading the query easier for me.
>You've just pushed the problem to the beginning of the list
The beginning of the list is modified less often than the end. The two cases aren't symmetric.
>and now your code is also butt-ugly with totally non-standard formatting
"Ugly" is subjective. Personally I like how the commas line up vertically, so I can tell at a glance that I didn't miss one out. SQL doesn't have a standard formatting in any case. It's whitespace insensitive and I've seen people write it in all kinds of weird ways.
>which sane people don't recognize and editors and IDEs and linters don't support.
A difference in code-formatting taste is not "insanity". And it does not interfere with tooling at all.
>showboating displays of pointlessly creative cleverness
Where are you getting all of this from? You seem to be imagining a "type of guy" in your head, so that you can be mad at him.
I am reminded of Sayre's law: In any dispute the intensity of feeling is inversely proportional to the value of the issues at stake.
Programming languages borrow commas from human written languages, and no human written languages have leading commas
How is this any different from leading periods, which seems to have become the standard across several of the most popular programming languages?
myobject
.somefunc()
.otherfunc();
It's not subjectively pleasant in my opinion, but I think it's hard to argue that it doesn't improve maintainability and (apparently) readability for the masses.
The padding and separators are the least issues I wish I only had there.
We don't need philosophy or morals (or jobs where that's important), we just need the way to edit these damn lines without anything screaming "syntax error!" or parasitic "++--" diffs every time you make a change. “When art critics get together they talk about Form and Structure and Meaning. When artists get together they talk about where you can buy cheap turpentine.”
-
(*) That classic claim didn't turn out true for me after so many years, that I suspect developers simply avoid admitting that it's not true for them either.
So, my solution for this was always
instead of: Leading commas also are also very regular and visible. A missing trailing comma is a lot harder to see.Before people start to mess with the standard, I'd suggest to maybe just change your coding style - and make the start of the list special - instead of the end.
I'd suggest going for a lot less trailing commas - instead of allowing more.