SQL style guide

Stasis5001 · on Oct 9, 2016

Weird, I really dislike a lot of the suggestions. In particular, an example:

  SELECT first_name AS fn
    FROM staff AS s1
    JOIN students AS s2
      ON s2.mentor_id = s1.staff_num;

We already agreed that staff is a good name for a table, so why are we renaming it to s1? There are all sorts of subtle bugs that arise when s2.mentor_id = s1.staff_num is wrong, and the variable names provide no help here that we're doing things right.

How about:

  SELECT first_name
    FROM staff
    JOIN students
      ON student.staff_id = staff.id;

If you're reading some 50+ line SQL query with a bug, things like this are easy to read and be confident in. Honestly, I think AS should basically never be used for tables.

Likewise, the guide gives the example of naming join tables 'services' instead of 'car_mechanics'. But if I see a table 'car', a table 'mechanic', and a table 'car_mechanic', I instantly know how to join against this in any direction. The hard part of understanding a schema is how it fits together, not what it represents, and when you have hundreds of tables and need to remember that join table's name, it's really, really nice if it's trivial to derive.

Lastly, the guide's recommended indentation for FROM vs JOIN statements seems off to me. Consider:

  FROM riders AS r
         INNER JOIN bikes AS b
         ON r.bike_vin_num = b.vin_num
            AND b.engines > 2

Well, the thing is, the riders table and bikes table have the exact same priority in the query, but this spacing really emphasizes the riders table and it's not clear why.

dpcx · on Oct 9, 2016

I only slightly disagree with a couple of your points.

1) They aliased it to show how it should look. This is a contrived example, and you normally wouldn't alias these.

2) Aliasing tables is essential when you have join tables that end up being over 30 characters long, or you need to self join for something. Short of using CTEs (which aren't available everywhere), I don't know of a way to self join without using aliases.

rtkwe · on Oct 10, 2016

> Aliasing tables is essential when you have join tables that end up being over 30 characters long, or you need to self join for something. Short of using CTEs (which aren't available everywhere), I don't know of a way to self join without using aliases.

This is 80-90% of my queries because we do a lot of cross schema joining in my work. Also there's a lot of tables replicated from the mainframe so the table names are T_<4 Letters that mean something to someone>_<Some more seemingly random letters>_<Something vaguely like a word if you're lucky> so their names aren't useful at all. Add in that a lot of queries are reporting queries so we're pulling in 20-30+ columns from these long named tables from different schemas and not aliasing them would add a a small novel worth of text to the query.

quizotic · on Oct 10, 2016

> Honestly, I think AS should basically never be used for tables.

I believe AS is required for self-joins. A common expository query is to find all employees who earn more than their managers, which can be expressed via "select name from staff as employee join staff as manager on (employee.manager_id = manager.id) where employee.salary > manager.salary".

Also, SQL allows a sub-select to be listed in the FROM clause. But in such cases, the sub-select must be given a name (tables have a name, but sub-selects don't).

As an aside, the keyword "AS" is allowed by not required.

Stasis5001 · on Oct 10, 2016

Sure, I forgot the self-join case. In my experience these are a minority of joins, but definitely. In postgres at least, a lot of self-joins can be solved with a DISTINCT ON instead.

Sub-selection with a FROM clause, yeah, those definitely need names. But those don't have perfectly good names that everybody already knows, whereas tables do. If you're maintaining somebody's code and they declare

  foo_bar = 10

I think it'd be pretty strange to ever consider writing

  fb = foo_bar

just because you find foo_bar too long to write or something.

thomasthomas · on Oct 13, 2016

why is "tb_" bad for tables? for some reason this has become standard practice for me. at least on SQL Server it can help distinguish tables from views in complex queries or stored procedures. No idea where i learned this habit.

ZenoArrow · on Oct 9, 2016

Aliasing tables is very useful. Aside from giving shorter names and helping you to be explicit when selecting fields, they also end up being useful for working with IntelliSense (or the equivalent in the tool you're using). Plus, you can make them memorable enough (e.g. in the first example you shared, could use 'S' for staff and 'ST' for students) for the query you're working on and shortening the query code makes it more readable. For all the reasons above, aliases are one of the first things I'll add when writing a new SQL query.

As for the indentation after FROM, I prefer to put an indent here too. Whilst you're right that INNER JOIN would have the same priority, LEFT JOIN wouldn't, and I wouldn't want to use different indenting styles for different types of joins.

That said, I don't agree with everything in this style guide, especially...

* Do... "Try to use only standard SQL functions instead of vendor specific functions for reasons of portability."

* Avoid... "Descriptive prefixes or Hungarian notation such as sp_ or tbl."

With the standard functions, it strikes me that's optimising the wrong thing. How many times are you likely to switch database engines? Such actions are very rare, especially for teams with good grounding in RDBMS'. What's more important to me is readability of the code, and there's plenty of useful non-standard SQL for improving readability. If we want both we should be pushing for updated standards to be applied across the board.

As for the second point, it's useful to know whether you're looking at a table/view/function/stored procedure/trigger from the name alone. Whilst the tbl prefix seems superfluous, other database objects should have a descriptive prefix in the name IMO.

marvy · on Oct 10, 2016

> S for staff and ST for students

This isn't memorable: staff and students both start with st!

ZenoArrow · on Oct 10, 2016

I'm aware of that. However, it only has to be memorable for the period of time you're working with a query. If you open up an old view or stored procedure to modify it, you would read the table aliases first. If someone wasn't able to retain the meaning of 'S' and 'ST' for the length of time needed to modify a query then perhaps working with code isn't for them.

Lastly, if someone did have poor memory they could lengthen the aliases to STA and STU.

marvy · on Oct 11, 2016

> sta and stu

Better :) But I prefer to just give my tables short names to begin with. How about staff and pupils? That's so short that there is little temptation to abbreviate.

jaziek · on Oct 10, 2016

> How many times are you likely to switch database engines?

Not often, but you may wish to support multiple engines at once, and it makes your code a lot easier to manage if you don't have to have a separate set of queries for each engine.

dragonwriter · on Oct 10, 2016

I really agree that the recommendation on aliasing is just plain wrong. I prefer using singular names relating to the role-in-the-query, e.g.:

  SELECT manager.last_name AS manager_name, 
         employee.last_name AS employee_name
    FROM staff as employee
    JOIN staff as manager
      ON employee.manager_id = manager.staff_id

This is probably most important (as in the above example) with self-joins, but I think its a good practice more generally, and makes queries more self-documenting as to intent.

alien3d · on Oct 10, 2016

SELECT staff_first_name FROM staff JOIN students USING(staffId)

I always put table_name_field. The most reason is for my auto generation code.

walshemj · on Oct 9, 2016

its not renaming its an alias if you have a complex sql statement having aliases just makes it easy to get you head around the code.

coldtea · on Oct 10, 2016

>its not renaming its an alias

That's what the parent meant with renaming too -- not that there's an actual table RENAME on the database.

An alias is a rename in the sense that it gives a different name for the same referenced table during the query.

NickNameNick · on Oct 10, 2016

Only if they're meaningful.

SQL isn't BASIC, and we aren't limited to x (1,2,4, etc) character variable names. Why not just use the existing meaningful name? They aren't that hard to type, and you might even be able to use autocompletion, depending on environment.

I've spent far too much time re-writing queries to remove obfuscating aliases to accept a blanket statement that they make it easier to get your head around the code.

justinlardinois · on Oct 10, 2016

In my line of work I often end up working with queries where you'll have a handful of JOINs and need to SELECT a dozen or more columns. In those cases writing out staff.foo or students.bar gets old really fast.

I agree that s1 and s2 are bad though, as I said in another comment.

coldtea · on Oct 10, 2016

>In my line of work I often end up working with queries where you'll have a handful of JOINs and need to SELECT a dozen or more columns. In those cases writing out staff.foo or students.bar gets old really fast.

It's better to have it "get old" though, than to introduce subtle bugs because of similar, short, aliases.

walshemj · on Oct 10, 2016

Longer Names have there own issues and remember not every one can recall long strings. dyslexics and other neurotypical people often have difficulty with this

mattmanser · on Oct 9, 2016

Try to use only standard SQL?

Stopped reading right there.

There is no way you can port SQL Server to MySQL to Oracle to PLSQL without rewriting virtually every single query. They're all too different.

So why bother? Write idiomatic code that other programmers used to that engine will understand.

Not only that but different engines like different things. That blazingly fast nested subquery in MS SQL will become a massive performance problem in MySQL.

There is no such thing as portable SQL, don't pretend there is.

Not only that, often the standardised way is worse than the vendor specific way (e.g. INFORMATION_SCHEMAS vs built in sp_ from MS).

wvenable · on Oct 9, 2016

I completely agree with this. One should never plan to change database engines -- it just happens so rarely and it's such a big deal that planning for it is a waste of time.

It would be like coding in Java in such a way to make porting to C# easier in the future (or vice-versa).

berntb · on Oct 10, 2016

So it isn't true anymore that quickly growing companies often need to go from one of the open source DBs to Oracle, after a decade or so?

(OK OK, that is a good problem to have.)

Right now I miss Oracle. (Never thought I'd say that. :-) )

wvenable · on Oct 10, 2016

It's highly unlikely you could ever use generic enough SQL to go from open source DB or Oracle. Best case you wouldn't be getting the best performance out of your open source database and then you wouldn't be getting the best performance out of Oracle.

It's better to build for the platform you have than build for a mythical common platform that contains the minimum features of syntax of all of other DBMS (which probably doesn't exist). If you change DB platforms, you're just going to have to rebuild.

berntb · on Oct 10, 2016

OK, your point is not that DB changes won't happen -- but that there is no use to prepare and think about it beforehand. Sounds reasonable to me.

Can_Not · on Oct 16, 2016

Kind of like premature optimization, "premature compatibility" can be a performance/labour-hours killer.

mobiuscog · on Oct 10, 2016

Supporting SQL Server, MySQL, Oracle, Postgres, DB2 and some others with standard SQL on a large Enterprise codebase here.

Works just fine.

Could we override anything that needs better performance with custom queries ? Sure, and we occasionally do, but for 95% of use-cases, it's irrelevant.

Of course, we'd love to just focus on one database, but don't get confused in this day of cloud-everything that 'suitable' performance for very large companies isn't possible.

mattmanser · on Oct 10, 2016

It's a different matter when you're approaching from a "support everything" perspective.

And to do that, I assume you're handcrafting all your queries, have to teach every developer how to do that in a supported cross-engine way and are severely limited in the set of functionality that you can actually use.

A simple example is that both My SQL and SQL Server's tools love adding ` an [ to their auto-generated scripts, something you simply could not tolerate.

But if I want to add a particular constraint to SQL, I can do it using the GUI in management studio, click the script option, and viola, a portable script without any effort on my part to remember the arguments, syntax, etc. of a fairly obscure command that you might write a couple of times a year.

Treffynnon · on Oct 10, 2016

This seems to be a simple misunderstanding.

Try - in this context means where it doesn't make sense (for performance, readability, etc reasons) then don't follow the guide. Follow it where possible and be mindful of when you deviate that you're adding to tech debt.

mmphosis · on Oct 9, 2016

Or avoid expensive, uncooperative, proprietary SQL vendors. I'll keep pretending and stick with sqlite and PLSQL.

6nf · on Oct 9, 2016

I don't know if you've ever worked with MS SQL Server in production but for us it's been the most amazing piece of technology in our stack for the past 15 years.

SQL Server is in the top 3 best things to be published by Microsoft ever. (The other 2 are probably Excel and Flight Simulator)

systems · on Oct 9, 2016

and SQL Server BI stack, is very solid and cost effective SSRS is really good SSIS is very acceptable (I mainly use to call SQL Tasks) and SSAS is really second to none

Pxtl · on Oct 10, 2016

SSRS still doesn't have support for msbuild, SSDT is a goddamned dumpster fire, and SSIS jobs have that hilariously over designed graphical programming language and are do frustratingly brittle that they are functionally single-use. And SSMS has the worst use of tabbed files I've ever seen.

There's a link to like about mssql, but a lot to hate too.

alistairbayley · on Oct 10, 2016

What's so amazing about it, when compared to the other SQL DBMS's? (e.g. postgresql, oracle, sybase, db2, etc)

Especially as it's just a re-badged Sybase...

6nf · on Oct 10, 2016

You still think of MSSQL as 're-badged Sybase'?

You know that was literally DECADES ago right?

Pxtl · on Oct 10, 2016

I never get the fussing with aligning. I get the importance of clean indenting, but so much SQL I see looks like the writer spent a long time tweaking the alignment with spaces to make sure the data types lined up perfectly or the ON clauses or what have you.... And then had do do it all over again the moment they changed something.

Just indent when scope changes, and don't align things. Alignment is a time-sink.

terminalcommand · on Oct 10, 2016

I always think of the DB, as the last castle of a program. When everything else fails, the DB should function. Programs are maintained way more than they're written.

No matter how hard I try to keep my code-base clean, it gets cluttered.

That's why I strive to create a powerful base structure and make them bulletproof.

Whatever may come (depression, personal problems, fatigue, stress, deadlines, etc.) this skeleton keeps the app alive.

In my experience code that is written now, will stay and that complete rewrite I've been dreaming about will never happen :D.

So IMHO it's better to design the core funcionality exhaustively and try to create a failproof code-base.

My intellectual capacity fluctuates and I'm never as productive as I want to be. I need a safety rope for my dumber version, who has forgotten about the code and has no idea where to start :)

Indenting, and aligning helps a lot, when I try to load the code back into my memory.

Numberwang · on Oct 10, 2016

"I need a safety rope for my dumber version"

I like this. 14:00 Numberwang appreciates the effort.

greggyb · on Oct 10, 2016

Or use a code formatter?

dotancohen · on Oct 9, 2016

Though mostly good advice, I definitely disagree with this:

  DONT:  
  Plurals—use the more natural collective term where possible instead. For
  example staff instead of employees or people instead of individuals.

I like naming my tables as plurals so that foreign keys to the table rows can have a name that relates to the table name. For instance, having column `Orders.employee` as an FK to an `Employees` record makes much more sense than having column `Orders.employee` as an FK to a `Staff` record.

exceptione · on Oct 9, 2016

The debate about singular/plural naming for relations comes up frequently. In my opinion, it stems from the fact that English has only 2 forms for all the possible grammatical cases. It has only a different form for the singular and plural cases, and thats what people concentrate on. I propose to take a step back.

There are quite some languages that have different forms for the nominative, accussative, dative and genetive case. Lets take Latin and the concept of an unpaid developer (servus).

Servus is nominative singular. Lets make it plural: servi.

  Select from servi...

oh wait, we have a from so we need an ablative plural here. So we get:

  select from servis...

What about updating? Then we need a plural accusative form:

  update servos...

But are we updating the whole collection or just a single record? Then we should probably talk about servum. Et cetera ad infinitum.

This might be a little overstretched.. Why don't we just use the normal form (the style guide talks about natural) of a noun? Good question, that is the singular nominative form. So, why make a special case for plural and ignore al other grammatical concerns?

throwaway745234 · on Oct 9, 2016

Ian Mackinnon, is that you? (http://stackoverflow.com/a/3894235/1163893)

exceptione · on Oct 9, 2016

Nope, i am not. But good catch, since his comment on SO made me originally see the light. :)

dragonwriter · on Oct 10, 2016

Honestly, if Latin-speakers had invented high-level programming languages (and, especially, languages intended for the kind of use and audience that motivated the design of SQL), they'd probably just include noun declensions as part of the declaration, and then decline the noun properly for the way they were used, rather than settling on single form.

Yes, that's not something we'd do with Latin declensions in a language designed by English speakers, but that isn't really an argument against naming collections with plural nouns in such a language, just because doing what might be natural in a language designed by Latin speakers to resemble Latin doesn't make sense to us.

thijsvandien · on Oct 10, 2016

It makes me curious if you apply the same rule to variable names in (other) programming, when they contain multiple items of a kind (e.g. arrays, lists, sets and so on).

  FrequentCustomerArray

rather than

  FrequentCustomers

? If not, why is that different?

fanf2 · on Oct 11, 2016

It isn't different. There's a conflict between using the collection as a whole

    map(foo, FrequentCustomers)

and using a member

    foo(FrequentCustomer[0])

Perl has plural / singular inflection :-)

    map foo($_), @FrequentCustomer;

    foo($FrequentCustomer[0]);

vbezhenar · on Oct 9, 2016

I always trying to use singular form. I didn't give it much thought, I just don't like repetition, and all those "-s" are exactly repetition (not even saying about non-trivial things like "man-men", etc). The only place where it bit me every time is table for users. "user" apparently is a reserved word and I can't use it as a table name (without quoting). Otherwise I liked my approach.

talaketu · on Oct 9, 2016

I prefer singular; using a plural or collective casts the table as a container of instances, rather than as a relation between attributes.

Object oriented design principles do not effectively translate to relational design

johnnyb9 · on Oct 9, 2016

Not to mention with an ORM tool you'll have an Employee object, and probably call your collection employees.

Which is better?

foreach (var employee in staff) {

}

or

foreach (var employee in employees) {

}

fny · on Oct 9, 2016

Honestly, they both look fine, but they do impart different meanings. "Staff" implies a set of employees dedicated to something like a company or project whereas "employees" could refer to multiple employees in any context.

If you're walking on to a new project, these subtleties make all the difference in understanding what's going on without having to rely on documentation.

vizeroth · on Oct 9, 2016

In my case, we have anywhere from 3 to 20+ types of employees, and "staff" is the word commonly used to refer to one of those classifications. Certain information requires different levels of protection for some of those classifications. While people can transition between classifications, it's not a very common occurrence, and it only makes handling their information more difficult (and the old classification may apply to the information for a specific period of time after the change, or indefinitely for some types of data).

I basically end up with a "people" table which holds a very small amount of common data and is used in a lot of relationships.

Honestly, the distinctions require a great deal of documentation to fully understand. However, making the data less normalized and a little harder to comprehend also protects new people from making honest mistakes which could lead to legal issues.

stymaar · on Oct 9, 2016

> and probably call your collection employees.

Really ? IMHO this is a bad practice : it makes the code les readable and breaks the auto-completion ...

johnnyb9 · on Oct 9, 2016

Doesn't break autocomplete in Visual Studio 2015...

Kiro · on Oct 9, 2016

What would you call it?

cema · on Oct 10, 2016

Better is, imho,

  foreach (var e in employees) {
    // ...
  }

because the iteration variable is essentially a placeholder.

no_protocol · on Oct 9, 2016

I have often seen commas placed at the start of each line when columns are being listed. I'm a little surprised this guide doesn't follow that convention. It would fit nicely with the established pattern of lining everything up:

  SELECT first_name
       , last_name
       , email_address
    FROM users

Can anyone speculate why it's not done this way in the guide? I think the reasoning behind it was to make later modification easier because adding or removing the last field required modifying two lines.

Treffynnon · on Oct 10, 2016

This is a pretty commonly asked question. The answer is two fold:

1. putting the comma before doesn't actually make commenting out columns easier - it just moves the problem to the other end of the list. Just try commenting out `first_name` in your example and you'll see what I mean.

2. because normal convention would be for the comma to come immediately after an item in written English.

r00fus · on Oct 10, 2016

Re; #1, I also use comma-first indenting and I actually put each field in it's own line specifically for ease in commenting (first_name would have its own line)

Treffynnon · on Oct 10, 2016

This doesn't solve the problem at hand either unfortunately. You'll still have a hanging comma in front of `last_name`:

  SELECT 
      -- first_name
       , last_name
       , email_address
    FROM users

When parsed it will look like this query error to the system:

  SELECT
       , last_name
       , email_address
    FROM users

delgaudm · on Oct 10, 2016

I'm surprised too. I'm unaware of any disadvantages to doing it this way. I have always preferred this format myself. It makes things easier to comment out, it makes it so the clause never accidentally ends in a trailing comma before the next clause, and it looks nice.

ryeguy · on Oct 10, 2016

I just think it looks visually ugly. I've heard others think that too, so it's not just me. I don't feel like the problem it solves is really that big of a deal. I can always just clone the second line up since order normally doesn't matter.

ZenoArrow · on Oct 9, 2016

Yeah, basically it's a good practice to start field name rows with commas as it makes it quicker to comment them out as you're tweaking the SQL code (though there's no harm in putting multiple fields on a single row, especially if they come from related tables).

restalis · on Oct 10, 2016

Besides the first reason pointed out by Treffynnon, it's worth remembering that the code we write is primarily for our fellow humans and that a piece of code will most likely be read more times than written so we better strive to optimize it for legibility.

lazzlazzlazz · on Oct 12, 2016

I'm a little surprised that anybody follows the "commas at the beginning of the line" convention - but to each their own.

barrkel · on Oct 9, 2016

If you're using Rails, your life will be much simpler if you do use an 'id' column, use plurals for table names, and name your join tables 'bars_foos', i.e. using the plural on both sides and in alphabetical order.

Don't really agree on indenting joins that are at the same level as FROM. There's not much semantic difference between the table in the FROM clause and one in a JOIN clause. Subqueries should be definitely be indented when nested.

Don't really agree with the leading whitespace alignment either. Alignment across lines is generally troublesome when combined with team development and version control: it leads to merge conflicts. If you start using 'GROUP BY' that forces you to introduce whitespace in your 'SELECT' line to bring everything into sync, then chances of a conflict increase quite a bit.

Don't 100% agree on using IN either, in particular if you're using a tuple membership check. That is, `JOIN x ON (x.a, x.b) IN ((1, 2), (2, 3), (4, 5))` certainly looks neater than the equivalent mix of ORs wrapped around ANDs, but it usually performs quite a bit worse because not many people write their SQL that way and the DB is not normally tuned for it.

A much bigger concern from my perspective (reviewing PRs from other developers) is writing SQL that will perform well for the database and data sizes. There are massive perf pitfalls around NOT EXISTS vs NOT IN, and it's easy to get horrific performance out of MySQL especially if using a subquery in the WHERE clause.

Structuring things as a join gets much more predictable performance, and structuring a set of joins as nested subqueries instead of joins at the same level (or using extra parentheses in Postgres) gives the SQL author more control over what will be done. Don't trust the optimizer to know best. It often doesn't and if it chooses wrong, it can hose a release. Measure, as always.

Stratoscope · on Oct 9, 2016

> Don't really agree with the leading whitespace alignment either. Alignment across lines is generally troublesome when combined with team development and version control: it leads to merge conflicts.

I'm definitely with you there. I've written about the disadvantages of this kind of column alignment several times on HN, so won't repeat them here. (Well, if anyone asks, I'll dig up a specific comment or two.)

I have a suggestion that can help anyone cure the temptation to column align all the things: Try coding in a proportional font for a while.

Contrary to popular wisdom, proportional fonts work fine for most kinds of programming. The only thing you can't do in them is column alignment, and in most cases this limitation is a good thing.

Instead of column alignment, you'll be forced to use indentation, and you may discover that indentation alone gives you enough ways to make your code structure clear without arbitrary alignment.

In my case, it happened the other way around. I got tired of the problems that column alignment caused, so I stopped using it and switched to indentation only. Some time later I was curious and tried a proportional font, and the code was still just as readable as before. So now I use proportional fonts all the time, and I'm no longer tempted to use column alignment.

wvenable · on Oct 10, 2016

My SQL style guide is quite a bit different from this:

* Use consistent and descriptive identifiers and names.

* Make judicious use of white space and indentation to make code easier to read.

* Try and use the same column/table naming convention as your programming language. If you language uses camelCase, use that. If it all possible, names should be the same everywhere to prevent confusion.

* Table names should always be in the singular: "Person" instead of "People", "Invoice" instead if "Invoices". This removes all pluralizing ambiguity and makes sense when mapping to class names, etc.

* Concatenate two table names together to create the name of a relationship table (especially in the case of many-to-many) unless a more appropriate name makes sense.

* Use TableNameId as the primary key column name.

---

* Do not use descriptive prefixes or Hungarian notation such as sp_ or tbl.

* Do not plurals -- don't even use collective names

* Avoid quoted identifiers

* Avoid underscores -- except for use a poor-mans namespacing

* Avoid simply using id as the primary identifier for the table.

* Avoid aliasing as much as possible unless joining the same table multiple times. In that case, use a descriptive name. Never use short one or two character aliases.

* Avoid putting keywords in all caps -- use an editor with syntax highlighting instead.

cema · on Oct 9, 2016

An interesting read! I do follow certain rules when writing SQL, and I agree that having them and following them is a good idea. Plenty of what is in the article looks like good advice.

However, these rules do not always appear to be consistent with how the code in (most) other programming languages is written. Consider indentation, for example. The usual approach is to line up those elements of the code which correspond to the same logical level of the flow, whether it is C++, Javascript, or Lisp. So rather than (quoting from the article)

  SELECT file_hash
    FROM file_system
   WHERE file_name = '.vimrc'

I would rather see either

  SELECT file_hash
  FROM file_system
  WHERE file_name = '.vimrc'

or

  SELECT file_hash
      FROM file_system
      WHERE file_name = '.vimrc'

(The difference here is the same as between indented and non-indented braces in C.) I think this is especially useful when we have multiple joins or inner queries.

Many suggested rules are indeed consistent with what I have seen to be accepted as best practices. Such as naming a table in singular (more precisely, giving it the name of what a single row corresponds to), or avoiding the infamous Hungarian notation (pretty much an accepted best practice in most languages that I have seen).

Overall, I think, a good first step towards building best practices.

dahdum · on Oct 9, 2016

It's good advice, but the formatting shown is not conducive to quick editing imo.

Here's your huckleberry:

  SELECT
  	fs.id,
  	fs.file_hash 
  FROM
  	file_system fs,
  	other_table ot 
  WHERE
  	file_name = '.vimrc' AND
  	fs.id = ot.file_system_id

cema · on Oct 10, 2016

Yes, this is better. BTW, I saw people writing things like

  fs.id
  ,fs.file_hash

and

  file_name = '.vimrc'
  AND fs.id = ot.file_system_id

which has subtle advantages (such as easier editing).

Also, I cannot bring myself to writing SQL keywords in all caps. It feels ancient. Just a matter of taste.

Dowwie · on Oct 9, 2016

"Where possible avoid simply using id as the primary identifier for the table."

Has anyone had trouble by using surrogate primary keys? I've found the opposite of what the author said could be more true: composite keys should be avoided instead.

andreareina · on Oct 9, 2016

I think the author was talking about (not) using `id` as the column name, e.g. `user.user_id` instead of `user.id`

webbore · on Oct 9, 2016

ON second reading I think you are correct. I agree on this point then, as it helps when joining through a denormalized table that references multiple PK ids, ex: I think: user_id = role_id is more readable than the likely alternative: u.id = r.id

NickNameNick · on Oct 10, 2016

it also lets you use the join foo using(foo_id) syntax, which I quite like

echelon · on Oct 9, 2016

Generated unique tokens are better than auto increment IDs. They can be created in a distributed multi-master setup, and they don't expose how many of a given record type you have to 3rd parties when exposing primary keys or feeds via APIs.

dpcx · on Oct 9, 2016

But also because they won't join incorrectly. That was the biggest win in explaining it to other devs.

calpaterson · on Oct 9, 2016

I've had lots of problems with surrogate primary keys where people don't put enough constraints on the natural key and then duplicates (sometimes slightly different) get in.

Unless you're using innodb, primary key is not special, it's just another index. If you decide to use surrogate keys you also need to enforce the natural key too (if possible).

There are also some nice performance advantages you can get from natural keys with covering composite indices in the case where you only want the key. But in general a table can have multiple keys in all but the highest levels of normalisation.

Dowwie · on Oct 10, 2016

That's a good point about manually defining constraints on a table. A composite key forces a developer to think about uniqueness whereas the surrogate key makes that thought process optional.

jawns · on Oct 9, 2016

I disagree with this style point, and I think the author implicitly acknowledges that `id` should be the standard identifier, because further down he points out that the `_id` suffix should be used for columns that make reference to that identifier.

jbverschoor · on Oct 9, 2016

No he doesn't. In an example he actually uses the following:

ON s2.mentor_id = s1.staff_num

staff_num is the primary key

To me this isan anti pattern because: - a primary should (can) never change; - composite primary keys are only useful (save space) in very large datasets

It also conflicts with his pattern: uniform suffix _id

forkandwait · on Oct 9, 2016

I am working with a 20 table db (not including lookups and session management tables) based on surrogate keys and it sucks ass for discoverability. It also subverts data checking based on SQL REFERENCES. I think natural / composite keys should used to design the db and establish relationships, surrogate keys added only if necessary, and then auto-generated, just like one waits to denormalize until there is a real need.

I suspect the love of surrogate keys is just more MySQL bullshit pretending that classic SQL design is wrong merely because it is not implemented in that particular broken database.

Dowwie · on Oct 10, 2016

I'm not using mysql. Would you mind expanding what you mean by the struggles you've experienced during discoverability?

dahdum · on Oct 9, 2016

I disagree with this too, using table.id is easier to type and consistent when creating joins. It also encourages (forces) to reference the fields with tables prefix.

ryeguy · on Oct 10, 2016

One advantage of using user_id on both the pk of users as well as the fk side is you can do:

    INNER JOIN comments USING (user_id)

instead of:

    INNER JOIN comments ON comments.user_id = user.user_id

philliphaydon · on Oct 10, 2016

Which is fine if you are aware of the differences between the two. Since they act differently.

ryeguy · on Oct 10, 2016

I don't think so, what differences are there? Could this be a database specific thing? In PG it seems equivalent.

philliphaydon · on Oct 11, 2016

In MySQL you can end up with different results.

comboy · on Oct 9, 2016

Maybe what author means is that from the db design standpoint, composite keys are used much less often than they could be. Very often unique table row can be defined by some other column or a pair, on which you will probably put an index anyway.

But that's just playing devil's advocate. In general I think ids are fine because ORMs play nice with them and FKs are simple.

Treffynnon · on Oct 10, 2016

Basically this is the answer - besides that bit about acquiescing to the demands of an ORM.

bigger_cheese · on Oct 10, 2016

Naming conventions are pretty controversial I think.

The databases I deal with are mostly for storing time series data taken from measurement instruments (i.e Load Cells, Thermocouples, Flow transmitters, Pressure gauges etc). We seem to use prefixes to distinguish columns i.e.

MS_ (Mass in Tonnes, MS_KG_ etc. Is used If not stored as Tonnes) VL_ (volume, in cubic meters) RT_FL_ (is Flow rate in normal cubic meters per hour) TP_ (is temperature in degrees Celsius) PR_ (is gauge pressure in KPa)

I've seem some databases which didn't use a prefix convention and in my experience it was a real nightmare trying to keep track of units etc.

There is an ISO standard for naming conventions ISO:11179 - My work's databases are (apparently) configured based on this but I'm sure the DBA's extended it in a bunch of ways.

Pxtl · on Oct 10, 2016

I tend to prefer English suffixes and camelcase rather than Hungarian prefixes and underscores. Like, for a steel coil, InnerDiameterMM, MassTonnes, GaugeMM, etc. I prefer to leave abbreviations for cases where it's obvious even to non-experts, like mm for millimeters.

bigger_cheese · on Oct 10, 2016

My only issue with that is you'd wind up with some very long column names.

Using prefixes you can encode a lot of information in a relatively short ID. Which may or may not be useful. As you mentioned though the downside is it requires some domain knowledge to decode.

A lot of our columns are in a form like this:

TP_TSC_HEX1_XXXXX

(where XXXX is some ID usually from scada system)

I can see from the name this is a temperature corresponding to thermocouple id: XXXX in TSC Heat exchanger 1.

Edit: I should also mention some programming languages (not sql) have a language feature for specifying named ranges (I think it is called lazy evaluation) which tends to favor prefixed names over suffixed - I.e x1,x2,...,xn to iterate over the prefixed range. We use SAS heavily at my work which supports using this syntax, which probably influenced the design of our DB tables.

mmgutz · on Oct 9, 2016

Still dont get why sql uses all caps for keywords. It's one of the few languages that has it as a best practice. I can read C/JavaScript/go just fine without all caps.

I do it anyway to conform.

VLM · on Oct 9, 2016

You can ignore the boilerplate if its in CAPS. Quite a few DB engines enforce as a syntax error a SELECT that isn't in the precise order of FROM WHERE GROUP HAVING ORDER LIMIT (plus some others depending on your DB). You and I understand what we mean if we write

LIMIT 10

SELECT *

FROM wtf

but many DB engines will be unhappy because LIMIT has to be after FROM, etc.

Once you're used to the strict order rules, you can skip reading anything in CAPS as long as the bug you're fixing isn't that they're myspeled or order out of.

barrkel · on Oct 9, 2016

Personally I don't care either way, but I like it to be consistent for any given sequence SQL in a file (SQL is usually embedded in some other language's quoting mechanism).

If everything is lowercase, you can tune in to spotting the keywords. If it's mixed, it's just confusing.

There's an argument to be made that because SQL is so often quoted in a different language, highlighting isn't often applied.

wvenable · on Oct 9, 2016

I used to use all caps for keywords before the invention of syntax highlighting. Now my keywords are highlighted in another color just as they are in every other language so I no longer all-caps them.

collyw · on Oct 9, 2016

There are relatively few keywords in SQL, and I think it makes them distinct from user named parts of the query.

userbinator · on Oct 10, 2016

I have a feeling it may be related to increasing its COBOL-ness. Both are languages that were designed to be "human-readable".

jackmaney · on Oct 9, 2016

Agreed. Every time I write SQL with keywords in all caps, I feel like I'm screaming at the query parser.

VLM · on Oct 9, 2016

Encoding. Needs a standard for encoding. "We are all UTF-8 here" As a parody of a bad design: this table is UTF-8 except for that column which is UTF-16 LE and that column of names is CJK and the street addresses column which is 7bit ASCII.

If you embed encoding into a design where the data changes like putting png files into a column named icon_gif, you are deserving of physical punishment. A better design is column binary_blob_28 should really have a sister column named binary_blob_28_mimetype. And don't get cute and use silly names for mimetype, if IANA's never heard of it you probably shouldn't be using it as a name.

Oh and while you're all unicoding for a good time, standardize that everything you write is run thru a normalizer as appropriate, or at least "you tried".

Another fun area not covered (or I missed it) is nothing is more hilarious than a timestamp without associated timezone. So is 1846 a good Central time to eat dinner, or a good UTC time to eat lunch in the midwest USA, or ... some DB support a surprising amount of TZ conversion although TZ always turns into a surprising amount of pain. "We're storing every time as UTC" isn't the worst idea ever.

Where possible ALWAYS use "id" as the name of the table primary identifier and its always a bigint... why burn infinite limited brain cell cycles trying to figure out the prikey of an arbitrary table "hmm well table employee uses MD5(social_security_or_equiv) which is a 16 byte char as its primary key but table source_code uses git hashes now is it the full 40 bytes or is 7 byte prefix good enough (usually is...). If you have to look up every foreign key or even worse, guess, or worse yet, guess and get it wrong, thats the last time your prikey won't be a bigint named id. Oh and as a corollary all your foreign keys will always be named "something_id" type bigint where "something" is the exact table name, not just close.

For the largest subset of equipment you'll interconnect with, figure out the best hardware supported hash and use it for non-security (de-duplication, for example) purposes. That may very well be md5, or maybe the lower 4 to 6 bytes of md5, or whatever.

Speaking of security the guide didn't go into it, so unless you're conferred with someone who knows the problem domain plus a little about security, assume your DB will downloaded next week and published on wikileaks ... don't do something stupid with sensitive data. If theres no reason to store sensitive data, then don't store it at all to begin with!

Animats · on Oct 9, 2016

Further suggestions:

- If a field should never be NULL, put NOT NULL in the field definition. If a field should be unique, put UNIQUE in the field definition. This improves both correctness and index performance.

- A useful format for CREATE TABLE is the one you get back from SHOW CREATE TABLE.

jawns · on Oct 9, 2016

The suggestion to right-align the root keywords (e.g. SELECT, FROM, WHERE) and left-align everything else does make the SQL look good.

But I can't think of a single editor that makes this type of alignment formatting easy to do. Can you?

rspeele · on Oct 9, 2016

emacs C-u M-x align-regexp can do this, if you give it a regexp matching the keywords you wish to right-align and enter -1 when prompted for "group to modifiy". It would still be a hassle though.

dghf · on Oct 9, 2016

Can anyone explain what this means?

> 1. The key should be unique to some degree.

If it's talking about primary keys (which it seems to be), it should be unique, full stop.

If it's not, where does uniqueness come in? Is it actually talking about indices, which, if I understand correctly, should have a wide spread of values (without any of them necessarily being unique) if they're to be useful?

carterehsmith · on Oct 10, 2016

It is just the terminology.

Technically, a key is any column (or a set of columns) that you use to find records. It does not need to be unique. But it tends to be more useful if it is more selective, I guess that's why they say 'to some degree'.

So now, if a key is also unique, it is called a 'candidate key'.

In general, you can have more than one candidate key in a table. For example, a natural key (such as order#) and a surrogate key (e.g. 'id').

Among candidate keys, you pick one, and call it 'primary key'.

dghf · on Oct 10, 2016

> Technically, a key is any column (or a set of columns) that you use to find records.

Isn't that an index rather than a key?

I thought 'key' meant:

- a superkey (any set of columns the values of which must be unique in any given row; in a properly relational model, any relation will have at least one, the trivial superkey, the set of all attributes in the relation)

- a candidate key (a minimal superkey, i.e. one without redundant columns -- if you remove any column from the key, it will cease to be unique)

- a primary key (the candidate key considered to be and designated as the most important)

- a foreign key.

carterehsmith · on Oct 10, 2016

I agree, the definition I provided (not mine) sounds more like an 'index' than a 'key'.

I am not familiar with that usage, I am used to 'key' being used as a logical concept, a constraint, while 'index' is a physical concept, some structure that is there 'just' to make shit go faster.

Anyways, some sources don't even define the word 'key', but use it to mean 'unique key'.

And some other sources (like the one by OP) use it to mean something else (totally not unique), as they recommend that it be 'unique to some degree'

And then some others use the definition I provided above. Is that the one meant by OP? I don't know.

I any case, I guess OP meant something like "indexes should be highly selective'. Which BTW is a performance tuning advice, I am not sure that that belongs in 'SQL Style Guide'.

Treffynnon · on Oct 10, 2016

Thinking multi-column indexes here.

dghf · on Oct 10, 2016

Yes, but it's in the 'Constraints and keys' section. It even says 'Constraints, and their subset, keys'.

As far as I'm aware, an index is not a constraint, except for a unique index, which implies a unique constraint.

Having said that, I can't think of an explanation more likely than yours.

Treffynnon · on Oct 11, 2016

Well that's good seeing as though I wrote the guide in the first place.

An illustration in bad pseudocode:

    PRIMARY KEY (account_number, transaction_date)

dghf · on Oct 11, 2016

> Well that's good seeing as though I wrote the guide in the first place.

Apologies, I didn't realise. That was crass of me.

But could you explain what you mean by unique in some degree? Surely in the example you give the key is just unique, with no further qualification?

Treffynnon · on Oct 14, 2016

OK, so I think this might be a language/understanding difference here - perhaps what I have written isn't clear enough - not sure. The statement to some degree, in my opinion, doesn't imply that the key isn't actually unique. It still means unique.

What I meant is that if you don't have a unique column to make, say, a primary key with then you need to make it unique by adding (an)other column(s) to it in the key definition. Each additional column being a degree of difference/separation in my mind as I was writing it.

I hope that explanation makes sense and doesn't confuse the issue further.

I'd like to get your feedback on what you think it could say to make this clearer - I want to convey:

a. it must be unique.

b. you can have multi-column keys (some users - especially those tied to an ORM - don't actually know that you can do this).

cyberferret · on Oct 9, 2016

Glad to see I conform to the guide in most places, but one thing I have been doing for nearly all 35 years of my programming career is to prefix my SQL views with `vw` so that I can tell immediately in my queries when I am referencing a view that may be made up of other queries.

beart · on Oct 14, 2016

With this style, you lose the ability to transparently switch out a view for a table.

geordee · on Oct 9, 2016

Now we need a formatter/linter too

antoineaugusti · on Oct 9, 2016

I'd love this tool. If you know a great one, please share it

ZenoArrow · on Oct 9, 2016

If you're using MS SQL Server, this tool is popular...

http://www.red-gate.com/products/sql-development/sql-prompt/

... and this tool appears to have similar refactoring capabilities...

http://www.ubitsoft.com/products/sqlenlight/index.php

jalfresi · on Oct 9, 2016

Very much this! I'm getting very used to tools as an implementation of policy! Feel very spoiled!

Numberwang · on Oct 9, 2016

poorsql.com

insulanian · on Oct 10, 2016

The part about indentation is a bad practice to follow. Indenting the code that way is not maintainable. I witnessed that multiple times in my career and adopted three simple rules:

    1. Use only one space character between any two words.
    2. Use only one tab (4 spaces) per indentation level.
    3. Use no more than two blank lines to vertically separate parts of the code.

These three rules are simple, easy to remember and I'm applying them to any language I work with.

johnnyb9 · on Oct 9, 2016

This is great, agree with many of the suggestions.

The worst SQL for my eyes is the stuff that has multiple columns, etc. on the same line. Makes diffing stuff 2x as hard and you can't parse as much information. Though maybe I'm just used to that at this point.

buro9 · on Oct 10, 2016

Let's just agree that we will all disagree.

Then let's agree just to accept that there is no ideal, but that there is benefit in using the same formatting, and that perhaps we should create a sqlfmt utility like gofmt and all just use that.

greggyb · on Oct 10, 2016

Ah, but what RDBMS should that sqlfmt utility target? There are large differences.

Here's one for TSQL. I find it very useful when spelunking through customers' almost invariably awful SQL.

http://poorsql.com/

lisper · on Oct 9, 2016

Arrgghhh... yet another set of rules with no rationale or justification.

Treffynnon · on Oct 10, 2016

Have you read Celko's book? If not I suggest you do. As the guide states at the very top that is where you'll find the detail you're after.

mrgalaxy · on Oct 9, 2016

I have only done a little bit of work with SQL. Could someone please go into a bit more detail on a couple of these points for me?

> Plurals—use the more natural collective term where possible instead. For example staff instead of employees or people instead of individuals.

> Where possible avoid simply using id as the primary identifier for the table.

VLM · on Oct 9, 2016

The first one is just poorly stated but is correct in spirit. Use the most inclusive and flexible name allowed by the business logic without being overly generic. Some would say thats the natural term, I guess. The first example of using "employee" as a term invokes murphy's law that the company will hire their first contractor or intern the first week the software ships, resulting in much confusion, wait you've got a column for employees wheres the column for contractors? The second example guarantees that someone in production will confuse the individual serving ice cream cup production table with your HR list of employees if you use the overly generic word "individual". Or maybe those individuals are sales prospects, not employees.

The second one is simply wrong because it burns brain cells when you come back to debug or extend something a year later and nobody memorizes the prikey of table production_quality_results, is it the serial number of the mfgrd object or the timestamp of the QAQC inspection or the serial number of the inspection activity or ... and when you look at column names in table 131 is drivers_license_id a foreign key to a row in the drivers_license table or just a raw store of data, like this is just where you store it in the system? This is especially hilarious if your FK and data source are similar bigint type, like a bigint to connect a FK to a prikey or is your component serial number literally a bigint itself, assuming the prikey is the data itself without dereferencing it will be a hilarious bug, "it seems serial number 10 doesn't exist in our assembly line" "Whoops thats actually row 10 of the production table, serial number whatevs".

Treffynnon · on Oct 10, 2016

I am totally lost when it comes to your comments on the second item. It seems you're trying to say that surrogate keys are easier to find or perhaps guess.

The indexes set out against a table can generally be accessed with a query making it very easy to work out what existing index suits your use case best.

iblaine · on Oct 10, 2016

A guide is better than no guide. I like the part about adding suffix strings to variables. Two things I would changes

* Add commas before the variable * Do not use single letters for aliases. Some versions of hive will throw exceptions. Plus seeing tables a, b, c, d joined together is annoying to read.

jbverschoor · on Oct 9, 2016

I really like his indentation. I'm gonna use it. Most of it has been my style for a long time.

webbore · on Oct 9, 2016

If you're writing a lot of ad-hoc queries, I prefer to have keywords on their own line and also put commas at the start of the following line vs end of the previous. It lets you comment stuff out easier as you experiment. In a production query the only hard rule should be, "Please be consistent!"

Treffynnon · on Oct 10, 2016

Putting the comma before doesn't actually make commenting out columns easier - it just moves the problem to the other end of the list. Just try commenting out the first column in your SQL and you'll see what I mean.

based2 · on Oct 10, 2016

https://www.reddit.com/r/SQL/comments/3efmn0/sql_style_guide...

mjw_byrne · on Oct 9, 2016

Trouble with this kind of thing is that if you make it general enough to apply to most situations you can't really say anything (too many exceptions) and if you get opinionated then your guide doesn't apply to a wide class of use cases.

Stuff:

1. Advice like "Use consistent and descriptive identifiers and names" - is general programming advice, not SQL style advice, and is so bland as to go without saying (would anyone seriously recommend being inconsistent or non-descriptive?)

2. Preferring standard SQL to vendor-specific SQL just isn't practical for nontrivial use cases.

3. "Use /.../-style comments where possible" - why? This makes commenting out large blocks of code really fiddly. Using double-dash comments, even for multiple-line comments, leaves /.../ usable for knocking out big blocks of code.

4. The advice on table naming - if a table contains one row per xyz, then calling the table xyzs and calling its id xyz_id is sound practice, arguably even if this does violence to the natural language in which xyz is a word. This reduces cognitive load on devs and makes the naming more predictable, especially bearing in mind that not everyone speaks the same first language.

5. No idea why "cars_mechanics" is worse than "services". "services" could mean practically anything.

6. There's a lot more to be said about table aliasing. Personally I prefer a, b, c... and then in derived tables, different groups of letters such as x, y... or p, q... . Stuff which requires an alias which will never be used (e.g. a derived table which is simply selected from) is aliased by _.

7. The guide says Hungarian prefixes are to be avoided, but a _num suffix denotes a number? Doesn't seem consistent.

8. All the advice on portability is unnecessarily opinionated. Portability is a concern that varies from completely unimportant to completely necessary, depending on what you're trying to achieve.

9. Indentation and layout - there isn't a "one size fits all" rule - it just comes down to judgement and taste. The given examples also look weird:

  SELECT r.last_name
    FROM riders AS r
         INNER JOIN bikes AS b
         ON r.bike_vin_num = b.vin_num
            AND b.engines > 2

(a) Tables "riders" and "bikes" are siblings in this join so why are they indented to different levels? (b) The AND isn't symmetric with respect to its operands.

IMHO this is better:

  SELECT
    r.last_name
  FROM
    riders AS r
  INNER JOIN
    bikes AS b
  ON
    r.bike_vin_num = b.vin_num
    AND
    b.engines > 2

Arguably, "b.engines > 2" should be in a WHERE clause instead of the ON clause, because it doesn't refer to table r. The query result is the same and the query planner probably produces an identical plan (depends on engine, of course), so this is genuinely just a style/semantics point.

10. "Avoid UNION" - why??

11. Avoid vendor-specific data types - this just isn't practical real-world advice. Sometimes you commit to a certain DB because you need to use its specific feature set. That's OK.

12. "Prefer NUMERIC and DECIMAL to REAL and FLOAT because rounding errors are annoying". This is completely determined by your specific requirements! Essentially this advice is saying "always prefer arbitrary-precision arithmetic". 32-bit and 64-bit floats are much faster and more compact than arbitrary-precision and are accurate enough for a very wide class of use cases. IMO this advice should be turned on its head: "use floating-point unless you know you need higher precision and you understand the performance/storage implications" - which would be the default approach in most programming scenarios.

13. "Avoid table partitioning" - this is a design/performance concern, not a style concern, and it depends entirely on use case.

14. "Avoid EAV, use a different product" - not real-world advice. Most devs don't have the luxury of just introducing different products when they run into an awkward scenario, and even if they did, introducing a whole new DB engine carries major overhead. Sometimes you just suck it up and have a small EAV wart in your design.

Treffynnon · on Oct 10, 2016

I think a lot of this comes down to what appears to be a simple misunderstanding.

Avoid - in this context means that where it does not make sense (for performance, readability, etc reasons) then don't follow the guide. Follow it where possible and be mindful of when you deviate that you're adding to tech debt.

justinlardinois · on Oct 9, 2016

> If there is already a correlation with the same name then append a number.

And then applies it for this query:

  SELECT first_name AS fn
    FROM staff AS s1
    JOIN students AS s2
      ON s2.mentor_id = s1.staff_num;

I think that's terrible. s1 and s2 sound to me like they're instances of the same table that you had to JOIN with itself for some reason.

developer2 · on Oct 10, 2016

Agreed, there is a lot I disagree with from the article, but this is the only one that my mind lashed out at as simply being WRONG. The only time I have seen numbers used on aliases in 10+ years is on self-joins. Even on self-joins I prefer to see something in the alias that identifies why that particular join is part of the query, but the reality is that such a descriptive name would often be so long that you're more likely to see a numeric suffix.

HalfwayToDice · on Oct 9, 2016

Regarding lower-case/hungarian notation etc, the membership table in my database is:

  tblMember
  ---------
  MemberID
  MemberUsername
  MemberDateJoined

etc

Is this very non-standard formatting for SQL?

matt4077 · on Oct 10, 2016

Yeah, "tbl" is redundant and so is the prefix "Member". In queries, you can use Member.id / member.name etc. for the same effect. I'd also drop the "date" because the it's repeating the type of data (probably inaccurately b/c it's most often a datetime). Sometimes I use pre- or postfix for disambiguation as in "isRetired" vs. "RetiredOn".

The primary column is always id, something like "memberID" would be some id created by an external system.

wvenable · on Oct 10, 2016

Why prefix your tables with tbl? Do you prefix your functions with "fun" and your variables with "var" and your classes with "cls"? It's completely redundant.

carterehsmith · on Oct 10, 2016

Some people do it to avoid clashes with reserved words.

Example: in some DBMSs, you cannot name your table 'user' because it is a reserved word.

"user" (double quotes) works, but there may be gotchas as sometimes double quotes suddenly make identifiers case-sensitive... but only if the filesystem is case-sensitive... things like that.

So you can either deal with the gotchas or say fuck it, put a prefix "tbl" and call it a day.

wvenable · on Oct 10, 2016

I'd much rather just name that table AppUser than prefix every single other table with "tbl". That's just taking one minor annoyance and multiplying it a thousand times.

ZenoArrow · on Oct 9, 2016

There's nothing wrong with CamelCase if it's readable for you. Readability is the only reason some people prefer using underscores between words.

Numberwang · on Oct 9, 2016

Anyone use these fellas [] ?

ZenoArrow · on Oct 9, 2016

Yes, though generally only if a field or table name happens to be a SQL keyword.

philliphaydon · on Oct 10, 2016

In SQL server? In PostgreSQL you would wrap with double quotes. Which kinda sucks when your language uses double quotes for strings and you have to escape.

ZenoArrow · on Oct 11, 2016

Yes, in SQL Server.