

SQL Databases Are An Overapplied Solution - fogus
http://adam.blog.heroku.com/past/2009/7/8/sql_databases_are_an_overapplied_solution_and_what_to_use_instead/

======
mdasen
Relational database alternatives aren't as widely applicable as their advocates
make them sound. The article provides some examples. The first is a social
networking profile
with a list of interests. "Awesome! I can just store things in a JSON-like
list and not have another table or anything!" Oh, what if I want to find all
the users who have 'ruby' as an interest? Oh, I can only lookup by key? And
yes, there are ways around that. You could create a table of interests and
each interest would be a key in that table and it would have a list of people
with that interest. So, you have 'Joe' => 'interests': 'ruby', 'blokus';
'Amanda' => 'interests': 'ruby', 'rugby'. And then you have an interests table
with 'ruby': ['Joe', 'Amanda']. And when you want to search by interest, you
select the document from the interests table and get the list of people and
then request the documents for those people. But is that more efficient? No.
It both requires more code and takes longer to execute. An RDBMS is able to
optimize a join on that data in a way that you can't.
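To make the trade-off concrete, here's a rough sketch of the manual inverted index described above. Plain Python dicts stand in for a key-value store, and all the names and data are invented:

```python
# Plain dicts standing in for a key-value store; names are invented.
users = {
    "Joe":    {"interests": ["ruby", "blokus"]},
    "Amanda": {"interests": ["ruby", "rugby"]},
}

# To support lookup-by-interest, the application must maintain a second
# "table" by hand -- and keep it in sync on every write to `users`.
interests = {}
for name, doc in users.items():
    for interest in doc["interests"]:
        interests.setdefault(interest, []).append(name)

# "Find everyone interested in ruby" is now two round trips: fetch the
# interest document, then fetch each user document it points to.
ruby_fans = [users[name] for name in interests["ruby"]]
```

All the bookkeeping the RDBMS would do for you (and optimize) is now application code you must write and keep consistent yourself.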

Similarly, with the e-commerce orders: what if you want to retrieve by the
product ordered? It's not unreasonable to assume a situation where certain
products are fulfilled by warehouse A and others by warehouse B. Well, you're
in the same boat again doing a less efficient thing.

Plus, what is hard to scale when it comes to a database? The article is
passing off SELECT statements as hard to scale. Now, I'm not saying that
they're a piece of cake. You can get into a lot of trouble. However, reads are
relatively easy to scale since you can just add more boxes. Writes are hard to
scale because, unless you shard and do other hard things, you only get the
power of one box for writes (since the writes have to be done on every box
while a read only has to occur on one box).

So, even if you're using a document based store, you eventually have to shard.
Now, when you never relate data, sharding can be a lot easier since it can be
done based on a hash function. Systems like memcached do this automatically.
So when you say get(1, 42, 64, 128) it will be able to hash those ids and send
each request to the proper server for that item. But that means that you lose
out on a lot of ease. And most of these alternative stores don't do that for
you (and it's why memcached is such a useful tool alongside an RDBMS).
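The routing idea is simple enough to sketch. This is an illustrative Python approximation, not memcached's actual client code; real clients use consistent hashing rather than the naive modulo shown here, and all names are invented:

```python
# Sketch of client-side key routing across cache servers. Real memcached
# clients use consistent hashing; simple modulo is shown only to
# illustrate the idea. Server names are made up.
import hashlib

SERVERS = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]

def server_for(key):
    # Hash the key and map it to one of the servers.
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

def multi_get(*ids):
    # Group the requested ids by the server responsible for each, so
    # get(1, 42, 64, 128) becomes one batched request per server.
    batches = {}
    for item_id in ids:
        batches.setdefault(server_for(item_id), []).append(item_id)
    return batches

batches = multi_get(1, 42, 64, 128)
```

Note that nothing here can answer "which items have property X" — the hash only routes lookups by key, which is exactly the limitation discussed above.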

And SQL databases do scale a lot more than Heroku (in these two articles)
seems to let on. Wikipedia, Facebook, Craigslist, and Flickr are all backed by
MySQL. Now, not MySQL alone. Memcached is a big part of it for all/most of
them (I don't remember what each site uses exactly). There's a reason why many
of the largest sites use a SQL database and it's not because they're unaware
of other storage engines.

It seems like Heroku might be getting a lot of complaints that the service
isn't magically scaling. Computers aren't magic and document based databases
aren't any more magic. CouchDB uses B-Tree indexes just like relational
databases. The difference isn't so much that these data stores offer better
performance for some lookups. It's more that they only offer the lookups that
can have good performance.

I feel like I should offer some free advice to Heroku: your SQL databases
would scale better if your dedicated SQL boxes were Amazon's high-RAM boxes
rather than the high-CPU ones they opted for. RAM means more for database
performance than CPU. Oh, also, offer some consulting for clients on their
database woes. A lot of the time, people are doing lookups that should be
using an index, but they haven't created it and so the database has to do a
full table scan rather than an index lookup. And there's a big difference
there. A full table scan of 1 million records will take 50,000 times longer
than an index scan. Yeah, indexes are good. One of the reasons that CouchDB
"scales so well" is that you can _only_ do queries on things you've made
indexes for.
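If you want to check for yourself whether a query hits an index, most databases will show you their query plan. A small sketch using Python's built-in sqlite3 module (the table and index names are made up):

```python
# Use SQLite's EXPLAIN QUERY PLAN to see whether a query will do a full
# table scan or an index search. Schema and names are illustrative.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

# Without an index on email, the plan is a full table scan.
plan = db.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.c",)
).fetchone()
print(plan[-1])  # the detail column describes a scan of `users`

# With the index, the same query becomes an index search.
db.execute("CREATE INDEX idx_users_email ON users (email)")
plan = db.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.c",)
).fetchone()
print(plan[-1])  # the detail column now names idx_users_email
```

The exact wording of the plan varies by SQLite version, but the scan-vs-index distinction is always visible.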

I'm not saying that non RDBMS don't have their place. They do. However, we
keep seeing articles posted about RDBMS not scaling and it doesn't seem like
they quite know the purpose of scale. Scale doesn't have to be infinite.
Nothing will do that. The question is: will it scale enough to handle the
traffic? And SQL databases will unless you're really, really big. Do you
expect your site to become a top 500 site on the internet? Heck, WordPress.com
at #20 is doing fine running a not-very-efficient SQL-backed blog. Now,
I'm sure they implement caching and such, but it's still SQL backed.

Basically: learn a lot about indexes. If you become one of the top sites on
the internet, hire someone who can help you. In the meantime make your product
and don't worry too much about the FUD.

~~~
patio11
_Oh, also, offer some consulting for clients on their database woes._

We do this at the day job. It runs about $X00 an hour, with a minimum of Y
hours, if our customers need it. Heroku, on the other hand, has a lot of
customers who are wondering how far their $50 a month is going to get them.
For these customers, many of whom are Rails types who don't quite grok
indexes, it might be a better solution to say "Um, look, rather than us
teaching you a core engineering skill that you're manifestly unwilling to pay
for, how about we suggest a technology stack that makes this skill
unnecessary".

Previously, one of the Rails hosts (Dreamhost or Heroku, can't remember)
released stats saying something like 97% of customers create no indexes. I
totally understand how that can happen, too -- you expect ActiveRecord to be
magic, and with what it does it is very powerful magic, but it is not magic
that totally obviates your need to think about database design. (Edited to
add: My business runs on Rails, I consider myself to have low to intermediate
SQL ability, and if you contact my day job to get consulting on your database
woes you won't get handed off to me anytime soon.)

~~~
cscotta
If there are Rails developers who work with applications of any size at all
and are _not_ familiar with indexes, the problem isn't scaling - it's lack of
knowledge of one's application's stack. The mentality that one's app should
magically scale without any idea of what's under the hood is toxic.

Anyway, it doesn't get much simpler than: add_index :users, :account_id

~~~
patio11
Right, but it gets _much more complicated_.

Near trivial example: what index or indices do you need to support the
business requirement "I want to know how long users stay active after they
sign up, and I want you to be able to slice that data by signup date and by
whether they're paying customers or not."

So programmer Bob goes off and does this.

"Oh, Bob, the screen only lets me slice the data by signup date and by
customer type, but I want to slice by both at once."

So Bob makes a one line tweak to his controller (to use both conditions,
instead of one or the other)... and BOOM, down goes the poor server.
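To make that concrete, here's a hypothetical sketch of Bob's situation using Python's sqlite3 module. Every name here is invented; the point is only that the combined slice generally wants a composite index, with the equality column ahead of the range column:

```python
# Hypothetical reporting schema (all names invented). The single-column
# indexes cover the two original one-dimension slices; the combined
# slice wants a composite index.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        signup_date TEXT,
        paying INTEGER,
        days_active INTEGER
    )
""")

# What Bob created for the two original screens:
db.execute("CREATE INDEX idx_signup ON users (signup_date)")
db.execute("CREATE INDEX idx_paying ON users (paying)")

# The "one line tweak": both conditions at once. Without an index on
# (paying, signup_date), the database can use at most one of the
# single-column indexes and must filter the remaining rows one by one.
db.execute("CREATE INDEX idx_paying_signup ON users (paying, signup_date)")
rows = db.execute(
    "SELECT avg(days_active) FROM users"
    " WHERE paying = ? AND signup_date >= ?",
    (1, "2009-01-01"),
).fetchall()
```

Column order matters: equality first (`paying`), then the range (`signup_date`), so the index can seek to the paying customers and scan the date range within them.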

~~~
mattculbreth
Well this is a requirement in the Business Intelligence domain, so you should
create a reporting database (probably a star schema) and put an analytics
package on top of it. You'll get easy sub-second queries.

------
pj
These heroku guys are selling something. They're on a bandwagon, constantly
bashing SQL databases. I'd say SQL Databases are great for 90% of information
based systems.

Here's the kind of problem you're going to run into: I know a company that
wanted a custom database, so they got a designer-turned-programmer who built a
system for them in Perl using flat files. His architecture was to put one data
field per line in a file and then put the one-to-many records (comments about
the master record) in the file starting at line x. Well, now they have fields
for the master item all the way to line x minus 2, and they know they want
about 5 more fields.

If the developer who built the system had started on an SQL database, then the
solution to the problem of needing more fields would be really simple.

Now this problem is just one issue. Add to this the problem of needing to
query the master records on particular fields. The programmer has to open up
every single file and go to the particular line where the field exists and see
what the value is. Of course as the file based database grows, it's really
starting to impact performance.

The big problem here is that the original programmer didn't know SQL or how to
manage a SQL database so he downloaded some Perl code (note, I absolutely LOVE
perl, so don't be hating) that managed a dataset in perl and just went to
town.

This is not the way to build a robust information based system.

SQL databases are a tool and files are a tool and CouchDB is a tool and a lot
of people are going to read this biased stuff about noSQL and think, "hey,
let's use couchdb, et al" and they only have 5 users on the system and, you
know, maybe a million records and Access, MySQL, SQL Server Express, or any
number of other free sql based systems would handle a problem like that just
fine.

Instead, they're going to go write a ton of code to fit a key/value store to
the problem and it's going to be a nightmare to support as the system grows
and they need to implement security and manage more fields.

But the pro-noSQL people don't talk about _all_ those issues that are going to
arise later and focus _solely_ on the scalability issue when in the vast
majority of cases scalability is really the last thing you need to think
about.

~~~
sho
While I agree 100% with your point that people need to think carefully about
their actual needs and not just follow the hype of the moment, comparing
"NoSQL" databases to someone writing flat files in Perl is a bit harsh! Using
SQL these days, with modern ORMs, is if anything quite a lot easier than
manipulating flat files. And that's the problem! SQL is the new "flat file",
the hammer every programmer knows, so you know what every problem starts to
look like.

~~~
pj
Yes, that was a bit harsh, but the point I'm trying to make is that the noSQL
proponents aren't really talking about PRO-couchDB, they just keep hammering
on how "bad" SQL is, when it's really frickin' awesome.

~~~
sho
It's unfortunate the conversation seems to have taken that turn, yes. I don't
like the term "NoSQL", which is why I put it in quotes. SQL is a fantastic
tool, so is Couch et al .. I don't understand why people need to "declare" for
one side or the other. Ah well, human nature I suppose.

~~~
pj
yep, that declaration is crazy. They're all good tools. Use the right tool for
the job. I think part of the issue really, unfortunately, is that SQL is
tough. It's not taught in colleges very much. I learned it on the job. I never
learned anything about SQL in college and it's not something you can just jump
into, but a key/value pair is easy. Anyone who has taken a data structures
class is going to understand it. It's basic programming.

You gotta kinda wrap your mind around SQL though and if you don't have a
support group helping you do that, you're probably going to struggle with it.
So I can understand the desire to find a simpler alternative to solve
information based systems, something between file based Perl and SQL.

~~~
timwiseman
I suppose much of it depends on your background, but in both colleges I
attended (one for undergrad, other for master's) there were classes offered on
Database theory that did cover at least the basics of actual SQL use. I know
that's a minuscule sample size, but I can say I would be surprised by any CS
department that didn't at least touch on it in electives at a minimum.

I also remember when I first started with SQL that I could read and understand
(most of) it quite easily. Writing it was of course another story, especially
if optimization mattered, but it wasn't hard to read and not too hard to learn
at least the basics. At this point the market is filled with books like "SQL
for Dummies" and "The Manga Guide to Databases" that provide very gentle
introductions for complete beginners as well as numerous classes at practical
trade schools.

Mastering SQL and RDBMS is definitely a work for years and creates dedicated
specialists just like any deep field, but picking up the basics is relatively
easy compared to many other tasks in this field.

------
dasil003
I really can't stand the hype around these document DBs anymore. Sure, SQL
databases are overapplied. I realize many people are incompetent with them, and
that they often don't give appropriate thought to alternatives. The title is
fair, but then it veers off very quickly into hyperbole and misrepresentation.

To state that the applicability of relational databases is narrow is
egregious. Any one of the listed reasons could justify using a relational
database; it doesn't take a significant portion of them, as the author seems
to imply. There are other reasons too... like how error-prone it is to
write raw map-reduce code to generate a query that would be trivial, fast and
guaranteed correct in any relational DB. And I'm thoroughly sick of the
boogeyman of scalability as well. You can go up to the scale of 99.99% of web
applications with relatively little hassle.

The article acts as if data integrity is usually not important, but for the
vast majority of applications that are out there in the wild and older than a
6-month startup the data is far more valuable than the code. For a huge subset
of all data SQL databases represent it the most accurately with adequate
performance.

I don't care if the pendulum has swung too far towards SQL--this type of
article is intellectually dishonest and won't help anybody make better
decisions, it's just more hype.

------
ams6110
I'm not convinced by this. SQL databases have proven over the years (decades)
to be a very good choice for storing data. The author ticks off a litany of
other products (document databases, cloud storage, memory object caches,
message queues...) to consider instead.

Now you have a bunch more stuff in the mix. Stuff to install, stuff to
configure, secure, periodically patch or update, backup, etc.

SQL databases don't scale? Maybe not in the extreme, but for the majority
(vast majority) of web applications they will do just fine if they are used
intelligently.

~~~
brown9-2
I don't think that the author's point has to do with performance, but that a
SQL database isn't always the right tool for the job and shouldn't be the
default answer all of the time.

~~~
patio11
Right. I love SQL, don't get me wrong, but there are a couple things my site
does which SQL is just a suboptimal solution for:

1) Action caching. I stick it in memcached. This gets me free expiry-by-time,
free blow-away-cache-when-deploying-new-version, and saves me having to create
and maintain a table specific to caching in the database. (I really, really
hate changing my schema, since I invariably manage to screw something up when
migrating. Solution: don't, absent a clear need to.)

2) Caching expensive queries. Again, memcached. Much of my analytics data is
publicly accessible, changes infrequently (how often am I going to add a new
sale to March 2007, honestly?), and lets users calculate very expensive
summary statistics. This screams "cache me!" rather than letting somebody burn
my server to the ground just by opening up /stats and clicking like mad.

3) Short-term scratchpad. I have a couple of places where I need to remember
little bits of information for a few minutes.

For example, a lot of customers have issues with entering registration keys,
so I try to make this automatic for as many as possible. One way to do this is
to have the application generate a random ID for itself, and submit this
random ID to my web page when requesting the sales page. The sales page then
passes this to my offsite payment processor, who will in turn pass it back
with successful transactions. Meanwhile, the application polls the server
periodically to see whether their ID has been associated with a purchase
recently. Sure, I could stick that in the Customer table, but it feels
unclean: logically, it isn't a business requirement of customers, it is just a
little hack I use to make their lives easier, and it has no meaning outside of
an interval which lasts, on average, about 30 seconds.
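The semantics patio11 is leaning on here are basically set-with-expiry, which memcached gives you natively. A toy Python sketch of the idea (names invented; no claim this is how his system actually works):

```python
# Minimal sketch of a short-term scratchpad: a store with per-key
# expiry, as memcached provides natively. Purely illustrative.
import time

class Scratchpad:
    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.time() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            del self._data[key]  # lazily expire, like memcached
            return None
        return value

pad = Scratchpad()
# The app polls for its random ID; nothing touches the Customer table,
# and the entry simply evaporates after the TTL.
pad.set("app-id-8f3a", {"purchased": True}, ttl_seconds=300)
```

The appeal over a database column is exactly what he describes: no schema change, no migration risk, and the data's short lifetime is enforced by the store itself.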

~~~
silentbicycle
You can use memcached and an RDBMS _together_, though. While RDBMSs will use
memory to cache query results (with varying degrees of effectiveness), it'd be
unreasonable for them to try to know about opportunities for caching e.g.
generated HTML. That's really not their concern. (You may wish to look into
views, though.)

The database is for searching, indexing, and ensuring that transactions are
atomic. Caching is a different layer.

------
OperaLover
Does the OP believe we really need to have "database wars" to match the tired
old "OS wars" and "browser wars"?

Careful analyses are useful (and I've seen some well-reasoned debates in HN)
but what strikes me is that clear-cut cases for using one tool vs another are
rare, for web-dev back ends at least. Witness: the large-scale sites that use
various _imperfect_ db solutions. Yes, they sometimes involve some "hacks" to
the canned db solutions, but who's found the perfect swiss-army knife that
needs no modification for any project?

What _does_ strike me as significant is the mastery level of the db developer
(whatever the db on which they have the greatest mastery/comfort) ...because
projects sometimes "go south" or end up getting late-stage additions of
unforeseen complexity.

Conclusion: I'm a db agnostic. Or maybe "poly-theist." There are several good
competitors. None is perfect. Several are powerful, especially in the hands of
an insightful and experienced db hacker. ( IMHO )

~~~
HoneyAndSilicon
On the upside: even if the "wars" are a tad artificial... it _does_ provide db
noobs (like me) a chance to see the issues (at least when processed in the
analytical way they are here at HN).

------
mattmcknight
His example raises a question...

    'items': [
      {'product_code':'AX5718','qty':'1','price_per_unit':'5.00'},
      {'product_code':'BB9388','qty':'3','price_per_unit':'2.00'},
    ],

This is starting to look like an N-queries problem if I want to print out
a receipt that includes the name of the product. I'll have to grab all of the
product codes out of the order and iterate over the list to separately query
for each of the n products to get its name. A join would be faster. And what
happens when a product changes its code? Where's my surrogate primary key?
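The N-queries shape of that lookup is easy to see in a sketch. Plain Python dicts stand in for a document store here, and the product data is invented:

```python
# Fake document store: orders and products live in separate "tables".
products = {
    "AX5718": {"name": "Widget"},
    "BB9388": {"name": "Gadget"},
}

order = {"items": [
    {"product_code": "AX5718", "qty": 1, "price_per_unit": "5.00"},
    {"product_code": "BB9388", "qty": 3, "price_per_unit": "2.00"},
]}

# Printing the receipt means one extra lookup per line item -- N queries,
# each a network round trip against a remote store. A relational JOIN
# would return the same data in a single query.
receipt = [
    (products[item["product_code"]]["name"], item["qty"])
    for item in order["items"]
]
```

And because the product code is embedded in every order document, renaming a code means rewriting every order that references it, which is the surrogate-key point above.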

I also don't see where he gets the infrequently-written part of this: "If we
restrict our use of SQL database to only what they are good at - small,
infrequently-written, long-lived records with complex data relations - and
start storing all our other data in more appropriate places, we won’t need to
push our SQL databases to such high levels of performance or scale." A lot of
frequently updated data is better off in a relational, or at least normalized,
store. If the price of a stock is changing all of the time, do I want to keep
updating it in everyone's portfolio document? If you denormalize, you turn a
single update into a multiple document update, which turns into locking a
bunch of documents or records which might be waiting for another multiple
document update.

Now, if all of your writes are inserts, it's a little easier to live in a
denormalized world.

------
mtts
The guy quotes "Data integrity is 100% paramount, trumping all other concerns,
such as performance and scalability." as somewhat of a disadvantage of SQL
databases.

Then he goes on to mention that e-commerce orders are examples of things where
100% data integrity isn't needed.

I find that shocking. An ACID-compliant DB engine is essential backup for when
customers manage to break your application in ways you never even imagined.

~~~
rleisti
CouchDB is ACID compliant

------
wglb
I am tempted to say "real men don't use databases", but that is a little
dramatic. Databases in the sense of RDBMS/SQL databases. It seems that HN
doesn't, Google doesn't, Yahoo doesn't, ITA does (from what I understand). (Or
at least they are using a lot of Oracle--maybe that is stuff that is much more
than just the RDBMS.) Is your problem as big or as hard as the problems that
these outfits solve? Isn't it an interesting matter for contemplation why
these smart folks aren't using one?

While the article has an agenda, it seems useful to understand why a database
might be needed and what the useful alternatives might be.

~~~
nettdata
A lot of databases "suck" because they have to be jacks of all trades, in that
they have to be able to do a ton of very general things.

That's why a crap-load of web sites and apps can all use the same database
software.

You will ALWAYS have much better performance if you write your own "database"
that specifically does what it is you need doing, as in the case of Google,
etc., that you mention.

And don't forget, Google didn't just write their own database, they wrote
their own file system, etc., and even designed their own hardware (batteries
on board to eliminate large UPS's, etc).

But they can afford to spend the money to hire the really smart people to
build and maintain that stuff for them, and they have the scaling requirements
that justify it.

Most other companies either can't justify it, or don't want to... they'd
rather rely on the vendor to maintain the DB code, do security
testing/updates, etc.

Really, I think a lot of the complaints about SQL (especially in the case of
these recent articles) come from the fact that it's hard to design and
implement a system that works the way they want it to, and there's no tool
that just plugs into what they have that does what they need.

I see their issues as the exception rather than the rule, and not really
applicable to most situations.

$0.02

------
Tichy
Interesting read, but I am doubtful about some aspects. For example "SQL Blobs
are the wrong choice for binary data" - wouldn't that depend entirely on how
the SQL Blobs are implemented? Sure, maybe apache can serve faster from the
file system than from an SQL Blob (if you read the SQL Blob in the middle
layer and stream it). But that is just one possible implementation. In theory,
the web server could be part of the database and stream the SQL Blobs
directly. SQL Blobs could be implemented as files in the file system. And so
on...

------
dmfdmf
What does HN run on?

~~~
silentbicycle
The dataset is stored in memory on a server written in Arc, in turn running on
top of MzScheme. (The source comes bundled with Arc.) It occasionally writes
changes to disk, and lazily reloads them when restarted.

The data set is pretty small, though - the id for this comment is just under
700,000, for instance. While it's nice to see that one server running an
interpreter can hold up to so much traffic (and this is much more traffic than
many people will _ever_ get!), it's not really representative of the cases
where worrying about scaling actually becomes necessary.

~~~
Tichy
While it is cool, I wonder if that really was less work than simply storing
things in a db. Presumably with a good ORM mapper, caching the data in memory
transparently should only be a small configuration setting.

It just seems to me there is so much to worry about when working with the FS
directly. Like what happens when the app crashes in the middle of a write?
Maybe it is not an issue anymore with modern, journaling file systems, but
there are probably some issues left?

~~~
gnaritas
> While it is cool, I wonder if that really was less work than simply storing
> things in a db.

Yes, it's less work. The introduction of a database nearly always more than
doubles the complexity of a program and the amount of code written. There's a
reason developers like in-memory prototypes and persistent hash tables: they
require less code than any other solution and don't force you to tear apart
your model, stuff it into tables, and put it back together again on every
page view.

~~~
silentbicycle
Not if you factor in the amount of code necessary for ACID transactions, etc.
If you make an informed decision that you don't need that level of
consistency, sure, but it's probably reasonable to be overly cautious with
data by default.

(If you're going to the database and back for every page view, you could
probably be using caching more effectively, e.g. memcached.)

~~~
gnaritas
ACID transactions have nothing to do with the relational model; non-relational
databases can be and are ACID compliant as well. Look at any object database
and you'll generally find MVCC and ACID compliance.

The issue is whether your data fits well into tables and rows, or not, and
whether it's worth cramming it into tables and rows and continually
reconstructing it to get the benefits relational db's generally provide over
the alternatives... which are enforced data integrity, language neutrality,
and a generic fixed structure capable of allowing flexible querying. To get
those, you trade speed and ease of development.

~~~
silentbicycle
I think we fundamentally agree, but we've been a bit loose with terminology
along the way, and have been arguing with misreadings of each others' points.

Have a nice weekend.

------
lsb
SQLite version 1 was implemented on top of a gdbm backend. They ran SQL
queries on top of a key-value store. I wonder if Tokyo Tyrant, which has the
goal of being a modern implementation of dbm, would make a good backend.

~~~
strlen
It would be interesting to see a SQL parser (a front-end, basically) on top of
a distributed, HA key-value store. It could even be a pluggable MySQL engine
(there already is a Memcached engine).

Of course transactions and certain types of constraints would become
impossible (as key-value stores do not provide for atomicity), but that
_could_ be an acceptable trade off.

~~~
dasil003
If you're going to use SQL why would you use such a limited backend with a
parity mismatch and a much greater set of limitations in place of many
heavily-optimized-over-decades backends?

~~~
strlen
To gain the scalability/performance of a k/v store while having the capability
to do complex queries (e.g. joins, range queries)

~~~
dasil003
I believe you would be much better off with a DSL that mapped to a set of fast
operations on a k/v store. SQL databases are heavily heavily optimized, and
far more performant than k/v stores for most non-trivial queries. k/v stores
don't somehow give magical performance. What they do is limit you to a small
set of primitives that helps with scalability and maps more naturally for
applications that are document-oriented and don't slice and dice data as much.

------
DanielBMarkham
"SQL databases don't scale, but that’s ok. "

I wish I could have read that sentence first, instead of last.

Relational databases are just persisted forms of cross-linked lists. As if
they're going anywhere anytime soon.

------
Devilboy
I don't agree with these two points:

| The database does not need round-the-clock availability (middle-of-the-night
maintenance windows are no problem).

Lots of SQL databases are part of 24/7 critical infrastructure.

| Queries do not need to be particularly fast, they just need to return
correct results eventually.

Queries can be very fast if you predefine the correct indexes to suit the
particular query. Only ad-hoc queries that you didn't plan for will be slow
(sometimes). As it would be on any type of database.

~~~
gnaritas
> Only ad-hoc queries that you didn't plan for will be slow

Not true. If that were so, denormalization wouldn't be a standard technique to
speed up queries that join many tables and have to be really fast. The fact is
SQL databases force you to reconstruct your document from many tables in a
query and that is costly, even with indexes.

It's simply much faster not to break the data up in the first place to fit it
into a tables-and-rows data structure if speed is your primary concern.
Storing a complex document as-is will result in much faster retrieval.

RDBMSs are a tool, a damn fine tool, but like any tool, they aren't the
answer to every problem, and far too many people pull out the RDBMS as the
answer to everything.

~~~
mattmcknight
Except you just showed that normalization has performance tradeoffs, not that
relational databases aren't a good solution. They offer you the flexibility to
normalize or denormalize as the conditions demand.

Of course they aren't the answer to every problem, and I think unstructured or
semi-structured data (such as HTML pages or text data with entity tags) is a
good case for storing things in a more document oriented system.

~~~
gnaritas
Normalization was just an example, the point was that joins are expensive.
Even denormalized, you can't do much without joins for any complex data
structure.

Relational databases are always a good solution; however, they are not
always the best solution for the problem at hand. The real problem is that
once someone has their head wired to an RDBMS, it's often impossible to get
them to even consider another solution. It becomes their golden hammer.

The article's main point, that "SQL Databases Are An Overapplied Solution", is
absolutely correct.

~~~
wvenable
"the point was that joins are expensive."

That is a blatant myth. Joins are not slow.

~~~
gnaritas
They are when compared to no joins. Rebuilding a data structure is more
expensive than not taking it apart in the first place.

~~~
silentbicycle
Yes, but normalization is done to lean on the RDBMS's constraints to ensure
the data is internally consistent, not for performance-related reasons.
Similarly, keeping backups may waste storage space, but when you need them,
you _really_ need them, and it's too late to change your mind.

~~~
gnaritas
Exactly my point: all those nice things RDBMSs give you come at the cost of
performance trade-offs. Relational databases are never the fastest option;
they are the safe, conservative option.

The bill of goods that RDBMSs sell is that data integrity trumps all else. The
reality is that it doesn't. Data can be cleaned and integrity maintained by
background processes if the performance you get is more important to you than
absolute up to the second data integrity. There are lots of cases where
available data is far more important than up to the second accurate data.

Eventual consistency is more than good enough for many many things common web
applications do. In many cases it's perfectly fine that the web app is looking
at 4 or 5 hour old data, it just doesn't matter, what matters is that it can
look at it in milliseconds without having to rebuild it on every query.

~~~
silentbicycle
While I ultimately agree with you, I think that the safe conservative option
should be the default. By the time people's systems are large enough for the
performance difference to actually matter (if ever), they will have had plenty
of time to get sufficiently informed about the trade-offs involved. The people
who are upset because they get crappy performance from MySQL _when they don't
even know about indexing tables_ are not qualified to decide whether risking
data corruption is worth a performance boost.

In other words, it's premature optimization.

~~~
gnaritas
And I think the simplest solution requiring the least amount of code and
effort should be the default. It's not just about performance, it's about
implementation effort. The safe conservative option is quite often a lot of
extra work.

Most people aren't building banks, for most apps data corruption is not the
big concern RDBMS fans make it out to be.

------
alphazero
"SQL Databases"? SQL is a language.

"Queries [in a relational database] do not need to be particularly fast, they
just need to return correct results eventually."

I stopped reading at that point. The article is symptomatic of everything that
is wrong with this profession.

~~~
silentbicycle
When people call them "SQL Databases", it's often a sign that they don't
understand the relational model. Of course, it's much harder to make a RDBMS
perform well if they don't, just as code written without an understanding of
algorithms is likely to be unnecessarily slow.

~~~
alphazero
Agreed. (Prepare to be downvoted for making a rational statement contrary to
the prevailing fads.)

~~~
silentbicycle
FYI: This isn't Reddit, and any meta-commentary about voting here is, at best,
ignored as noise that detracts from conversation.

ObRelational: For people who want to learn about the relational model, two of
the better presentations I've read are:

1. _The Definitive Guide to SQLite_ by Mike Owens -- short, non-academic,
theoretical enough to be useful. Focuses on SQLite, which is nice, because
that way you don't take MySQL's flaws for granted.

2. _An Introduction to Database Systems_ by C. J. Date -- academically
thorough; it's a textbook, so the previous edition is pennies on the dollar.
Mostly covers the relational model itself, rather than SQL, which is somewhat
like comparing the pure untyped lambda calculus to Scheme. Once you understand
the big ideas, though, everything else is just syntax and performance tuning
(which are often implementation-specific anyway).

