Misleading title. While #4 and #3 look like good ways to avoid relying on a horrible MySQL-only misfeature (silently picking arbitrary values from rows you've grouped over), almost all the other behavior (maybe other than LIMIT in #5) is correct according to the ANSI standard and you'll find other databases do the same.
That's funny. I read the first one, then all I did was look at the photos. I understand that I'm just me but this article was more set up as a number of photos with some unrelated SQL tips around it. I think the poster over did it. I think the Social Media 101 is more referring to how Joel used to add a bit of flair to his articles.
As another comment (http://news.ycombinator.com/item?id=1880981) pointed out, this is part of the ANSI standard for SQL, and not exactly due to MySQL's handling of NULLs strangely (there, also fixed the apostrophes for you).
Most of the things listed there are not MySQL specific. Here's one that I'm pretty sure is:
SELECT * FROM example WHERE value = 'á';
where "á" uses UTF-8 encoding.
By default this will return rows that match both "á" AND "a", which may or may not be what you want.
If you want to make MySQL distinguish between an "a" with an accent and one without, then you need to specify a collation either in the table creation/alter statement, or in your query directly:
SELECT * example WHERE value = 'á' COLLATE utf8_bin;
All major engines have collations and collation-specific behavior (which can be confusing). However, this one I saved for the next post (with another set of completely unrelated pictures).
How can this ever be expected to work, unless, of course your column is literally '1, 2, 3'? Seems strange to blame MySQL for it. ("MySQL is buggy because it cannot infer meaning from my arbitrary string literals!")
None actually "blames" MySQL for that. The article just describes what comes to the beginning developer's mind first, and tells the correct way to do that.
A lot of these come down to NULL makes things not work the way you might expect. Maybe C. J. Date is rightbwhen he says there should be no NULL in the relational model.
Or possibly the NULL should behave more like a modern NULL and have a different truth table for it. I don't think it is coincidence that no modern NULL value in a programming language acts like the SQL NULL.
People will usually reply to my criticisms about SQL NULL by pointing out that it is logically consistent, which is true, but it's not the only logically consistent NULL we can define. I know the theory behind the SQL NULL quite well, well enough to observe just how rarely it matches SQL practice.
(I personally find the SQL null semantics incredibly clear and understandable, and find the weird hacks many languages take to have null mean something really confusing.)
It's possible to make a relational schema without nulls. Most people avoid it though because it adds extra tables and makes queries more complex, in some situations. In others it can make them simpler.
Is that what he meant? Because without having read any of his work, and relying only on my practice with relational DBs, I would say it would be nice if columns were not NULLable, thereby removing ambiguity from non-inner joins.
Yes, he meant no NULL anywhere. The relational model as originally formulated by E.F. Codd did not have NULL. It was only much later that some tried to extend it by adding NULL (or even two different kinds of NULL).
In Date's book "Database in Depth" he gives some wonderful examples of how NULL in SQL breaks things. One I recall involved a couple simple tables and a simple question about the data. There were something like half a dozen or so reasonable ways one might write a query to answer the question--and there were many different answers depending on the exact query.
"Database in Depth" has been updated and refocused more on SQL, and is now called "SQL and Relational Theory". He sums up in the latter book the problem with NULL thusly:
To sum up: If you have any nulls in your database, then
you’re getting wrong answers to certain of your queries.
What’s more, you have no way of knowing, in general,
just which queries you’re getting wrong answers to and
which not; all results become suspect. You can never
trust the answers you get from a database with nulls.
In my opinion, this state of affairs is a complete
showstopper.
That depends on how and when you're going to use it. It's fine with small data sets, or if you're going to run the query and cache its output periodically. With MySQL you can also speed it up quite a bit by selecting only an indexed column rather than `SELECT *`.
But yeah, if it's in a query that you're going to run on every single hit to the index page of a busy site, then you're going to have problems.
All "wrong" queries are wrong in PostgreSQL as well (and #3 won't even parse). The workarounds for #4, #3 and #2 are different for PostgreSQL, the others will work.