Are there others out there like me? I assume I'm in the minority, but I don't understand why. People stopped uppercasing HTML tags, why can't we do the same with SQL?
In most other languages you can read the statements fairly linearly, one statement leads to the next, and to the next, etc. At least within functions/methods and then you usually have indentation to help with understanding boundaries. Each "logical line" may be spread out over one or even several on-screen lines, but it's fairly easy to see the whole logical line and understand it at a glance.
Not always so with SQL.
If I get a complex SQL query to analyze, the first thing I do is not look for the SELECT clause (I'll usually have some sense of that before I start), but rather the FROM clause including it's JOINs and such. When keywords are uppercase, scanning the page for where in the statement to start my reading is much easier. Next I want the WHERE clause and its predicates... and finally I want to see what's being selected. So I'm not reading things nearly as linearly to understand a single logical statement like I would to understand say, a well written function. Also, single queries can be very long, so being able to jump around them where there are not going to necessarily be other cues can be important.
I could be wrong, but the density of user defined names (schemas, tables, columns, etc) also is higher than in other languages, so having the keywords in uppercase makes those more spread-out statement boundaries and fewer numbers of keywords in uppercase again helps me scan the statements for those major parts as I need them.
If you're writing single table queries, no sub-queries, maybe a dozen of columns or so, and only a relatively small number of WHERE clause predicates, then it might not matter much. For better or worse, that's not the world I usually find myself in :-)
But developers with good tooling to deal with more static views of the code aren't the only readers of SQL, and perhaps not the most common.
With my DBA hat on, trying to figure out what's going on reading through logs and such, I don't always have that kind of tooling available. In this context the casing can help as well.... and this may form some of my bias; my first professional experiences on the application side of the world were in very much DBA roles dealing with other people's applications running on my servers. Insofar as access tools preserve developer formatting, having good "color-blind" formats made reading complex queries easier.
I wonder if programming languages should have a way of specifying that a string constant contains embedded code or a template, so text editors can display it accordingly.
I deal with a lot of SQL embedded in code, and prefer uppercase keywords as it seems to help me visually parse the overall file.
To throw out a tossed-off example:
P.full_name, A.full_address, AT.type as address_type_name
People as P
Addresses as A on P.address_id=A.id
P.deleted is null
Plus, it's more comfortable to type as well.
Just use syntax highlighting, which can make it unambiguous what's a keyword vs func vs name.
But seriously upper case helps my eyes focus on the sql commands so that I have some sort of hand hold on the wall of text.
You can currently already do this by using the CockroachDB binary (https://www.cockroachlabs.com/docs/releases/v2.1.0-beta.2018...) but understandably people want a more minimal solution.
1. It lowercases table and field names, rendering the formatted statement useless for a project with not exclusively lowercased names. Could you please fix this?
2. I'd appreciate accepting common placeholder "?" as a valid part of the SQL statement so that we can format queries with placeholders. Would that be possible?
sqlfmt is not super useful for non-postgres grammars.
Still how about not lowercasing identifiers? This is actually a blocker, I'm sure you see it:)
Any chance of allowing keywords that are specific to a SQL implementation? (e.g. PostgreSQL)
If I enter "CREATE MATERIALIZED VIEW ... WITH DATA;" in for example, it would tell me "materialized" and "with data" is wrong.
But sqlfmt doesn't (currently) have a goal to format all SQL, just cockroach SQL.
I have to say, I do most of my development work in the database, and I'm not thrilled with any of the automatic formatters thus far. The ones I've seen can do a good job in many cases, but if you start getting very advanced at all, the weaknesses of an automatic formatting approach start to become more evident. It may just be that anything other than relatively simple SQL, PL/SQL are not well suited for automatic formatting.
I'm still working through a good set of formatting rules, and I always find I end up making exceptions for readability.
SELECT 1, 2
(addition) More than anything that's wrong with all of these formatters for SQL is that there real consensus for what good formatting is. There are historical norms for sure, and that's likely a starting point, but reconciliation between the communities around each database vendor (and sometimes each company developing with that database) make that a difficult challenge. So this new tool will become the CockroachDB formatter, pgFormatter is as far as I know the most popular PostgreSQL formatter, other databases your mileage will vary. Each not really an SQL formatter, but an SQL formatter for (stick affinity here). For CockroachDB, I'm not sure how much payoff there is for moving away from the more popular tool for Postgres if they're stressing their similarity to PostgreSQL SQL.
Sure, once you've gone and created a brand new tool you've gone past the point of a PR unless you were starting with a fork. My original question was, why jump to a new tool out of the gate when the differences seem fairly minor. Yes, I'm sure that pgFormatter probably didn't start with the prettier paper... but so what? If you aren't coming up with something really different, I'm not sure that matters.
Actually, thinking about it. What I need more than an SQL formatter is something closer to a linter quite frankly. In complex queries it's easy to miss something... I know, I know... I'm probably talking about something more like an IDE since many of the errors are likely to be logical rather than syntactic, but still... that would be a better problem to solve than a formatter right now.
Is it me or is "opinionated" software popping up all over the place? It looks to me like the author considers it a selling point, too.
Can someone explain to me why I would want an "opinionated" software over any regular old software? Especially over a configurable software?
I'm not a native speaker, but I never encountered "opinionated" being used with positive connotation. It has so far always been in the "well, uh, he's very opinionated!" sense of apologizing for that old grumpy person who gets on everybody's nerves.
Perhaps it resonates more positively now to people who have had to work with "flexible" software such as Spring in the old days.
There were 100 ways to skin a cat with old Spring, and each way was a tortuous devil's brew of XML. Googling for any solution was a nightmare, exacerbated by hundreds of wannabe experts polluting the interwebs with their own strange takes.
Spring Boot was introduced in response as an opinionated approach. You can still skin the cat in 100 different ways, but at least you start with sensible defaults that someone far more knowledgeable than yourself had deemed to be good.
In programming, there's the idea that "The code shouldn't merely work, it should clearly work". If your configuration system becomes so complex that the same principle applies there, that's a bad sign.
One thing I learned somewhat recently as well is that within google3, these automatic formatters come in super handy when doing large-scale changes, e.g. when a library is deprecated and the maintainers are moving all the clients to the new one.
Having a sensible set of defaults can be helpful as well.
That said, sometimes "opinionated" can also just mean "I haven't had the have time to make things configurable yet".
The best of both worlds is "opinionated defaults but configurable". I use an excellent free SQL formatter called ApexSQL Refactor for SQL Server (Windows), and it came with decent defaults, but is also extremely tweakable.
It's become popular because people became sick of tools with bad defaults and a million options that could maybe make it better, but nobody uses.
Specific (or opinionated) design, interfaces, or formatting feels like an early declaration that the people in charge want to deliberately restrict an aspect of the software, in order to keep things consistent and reduce cognitive load during use or development. Constraints are freedom, and all of that.
To me, "opinionated" is shorthand for "not a lot of options", so it is less extensible, but will probably also work better and right out of the box.
good examples are micro frameworks that wrap very large complex APIs and simply don't pass through whole swaths of complexity.
I work on a project that's grown in complexity and team size. Developers have come and gone. Sometimes a bit of code that's grown over a few years needs its style to evolve. This kind of formatter clearly realizes this.
What I tell developers that complain about this or that style is that it's more important to have a consistent code style throughout the lifetime of a project, then to adopt whatever bracing style someone happens to like at the time.
I'd really like all of my code to be automatically formatted with a tool like this, as part of the build process. This would help with situations where code evolves, as it would reformat it when needed.
I like https://www.sqlinform.com/ - keep a copy of notepad around just so I can use it. Wish there was a VSCode extension. The free version was good enough for my needs, but it's so handy I upgraded to the pro version just to show appreciation. The best part is the ability to really customize the layout and save the settings. It's always step #1 when someone sends me a massive blob of badly formatted SQL.
Web site is kind of horrible.
Prettier actually has a beta Postgres plugin, but progress on it seems to have stalled. Hooking it up with sqlfmt would be neat - then prettier could format SQL inside other languages too, like code fences in Markdown or template literals in JS.
I use plv8 very often, and this would be perfect.
It does lead to longer queries and poor use of screen real estate, but it is very readable.
Funny because I was reading other people's comments and thinking "wow, that person is VERY opinionated on query formatting" (but now here I am).
There are opinions, and there are crimes against humanity (although that is a matter of opinion).
It would be great if the parser supported question marks as placeholders. The pg_stat_statements table in Postgres saves denormalized queries, replacing the literal values with '?' placeholders (e.g., `SELECT * FROM table_name WHERE id = ?`).
- jetbrains plugin
- vim plugin
For example, if A is
CREATE TABLE person (
id serial PRIMARY KEY,
CREATE TABLE person (
id serial PRIMARY KEY,
ALTER TABLE person ADD COLUMN age int
The style is similar to https://www.sqlstyle.guide/, which is excellent.
(this uses the actual Postgres parser wrapped into a library)
both of which seem to give nicer output and handle more diverse sql without erroring.
also not really sure why it considers itself "opinionated"
It's 2018, we aren't using teletypes with bad shift keys and we all have syntax coloring IDEs, we don't need to scream out all our KEYWORDS like we were still writing COBOL or FORTRAN in the 1960s.
It's become my biggest pet peeve with most SQL formatters. I think sentence cased SQL is far more readable.
Select * from sometable where SomeColumn > 3
Personally I like it because it clearly separates keywords from objects (tables, columns, databases, etc).
This is very much like the tabs vs. spaces debate.
Certainly there are still far too many places people are writing/reading/interacting with SQL without the benefit of syntax highlighting, but we have the technology to solve that (and in many cases need to solve that; a lot of the places that SQL is embedded and the outer language has no idea about are opportunities for SQL injection attacks).
> This is very much like the tabs vs. spaces debate.
It's sadly worse than that. It's something like 60 years of momentum versus user/developer experience and readability concerns.
Despite my under-appreciated jokes that most SQL databases probably don't need to be shouted out to do their jobs (Oracle maybe being an exception. ;), it actually is a serious readability issue.
We have decades of research showing that people read/scan by word shape and that lower-case and mixed-case is far faster and easier to scan/read than all caps. 'from' has a more interesting and relatively more unique shape than 'FROM', making it easier to spot. (Especially in a monospace language where the outermost shape of all four letter all caps words is essentially the same width and height rectangle.) (If you try a syntax highlighting scheme that does something more interesting than bolding keywords, italicizing them with a good italic monospace font, the lower-case word 'from' can be almost beautiful and very easy to find. A nice benefit to syntax highlighters is that sort of variation is a personal option between you and your editors of choice.)
Typesetters have known for a long time that all caps is slower to read, and should be used sparingly because it exhausts readers.
Sentence-casing SQL statements benefits from the readability of all the English we read everywhere else.
I make jokes about shouting in our database queries, because it's sadly easier than changing more than a half-century of momentum in the status quo, but I do think that all-caps keywords in SQL statements is a huge papercut in dev productivity/ergonomics that we have the technology to fix but apparently not the interest/willpower.
Select * From sometable Where SomeColumn > 3
(HEY '(DONT FORGET LISP!))
OMG will someone please tell college professors about Clean Code? Perhaps it takes years of writing deeplier nested code before you even know what Uncle Bob is talking about.