
Why Programmers Don't Like Relational Databases - mqt
http://typicalprogrammer.com/databases/programmers-vs-rdbms/
======
edw519
Now this is just stupid.

I've built a career moving data out of flat files into RDBMS in order to get
systems working properly (and quickly). I couldn't believe the hoops people
jumped through just because of their dislike (or ignorance) of relational
databases.

This database model is perfect for many applications, and can easily be mixed
with other data storage technologies (if you know what you're doing.)

Don't like the capabilities of your RDBMS? Then add what you need to your app.
Indexing, transactions, stored procedures, audit, security, whatever. The code
doesn't care where it's at. It still runs.

Not pretty enough for you? Then become a mathematician or fashion designer.
Otherwise, shut up and code.

~~~
gregjor
edw519: I'm not sure you got the point of my article. I am a programmer who
happens to like relational databases. I have enough experience with RDBMSs
(and the same half-baked alternatives you've run into) to at least have an
opinion. My article was not criticizing RDBMSs or SQL, but rather giving my
opinions on why so many programmers don't like or don't understand relational
databases.

My anecdotal experience is that a lot of programmers bristle when they get
into a technical area that actually has a mathematical basis or a demonstrably
right way to solve a problem. Programmers are used to problems that have lots
of possible solutions, none of them provably right or wrong. I've seen that
attitude in the realms of databases, computer graphics, compilers, and
algorithms.

A few years ago I worked with a programmer who actually set out to prove to me
that bubble sort was more efficient than whatever Oracle was doing to sort
rows. He was retrieving a lot of rows, unsorted, then sorting them with his
own code. Even pulling out Knuth and trying to explain big-O notation to him
had no effect -- that was just so much academic bullshit and opinion that
didn't apply, according to him. Of course Oracle was able to sort the rows
orders of magnitude faster than his PL/SQL bubble sort, and the _coup de
grâce_ was delivered by a DBA who took the inefficient cursors out of the
solution.
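
To make the complexity argument concrete, here is a minimal sketch in Python. It is not the original PL/SQL: the standard-library `sqlite3` module stands in for Oracle, and the table name `rows_to_sort` and the helper `bubble_sort` are hypothetical names chosen for illustration. It only shows the two approaches side by side; actual timings depend on the engine and the data.

```python
import random
import sqlite3


def bubble_sort(values):
    """Classic O(n^2) bubble sort; returns a new sorted list."""
    items = list(values)
    n = len(items)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:  # no swaps in this pass, so the list is sorted
            break
    return items


random.seed(0)
data = [random.randint(0, 10_000) for _ in range(500)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rows_to_sort (v INTEGER)")
conn.executemany("INSERT INTO rows_to_sort (v) VALUES (?)", [(v,) for v in data])

# Approach 1: fetch the rows unsorted, then sort them in application code.
unsorted_rows = [r[0] for r in conn.execute("SELECT v FROM rows_to_sort")]
sorted_by_hand = bubble_sort(unsorted_rows)

# Approach 2: ask the database for sorted rows and let it pick the algorithm.
sorted_by_db = [r[0] for r in conn.execute("SELECT v FROM rows_to_sort ORDER BY v")]
```

Both lists come out identical. The difference is that the hand-written version does a number of comparisons that grows quadratically with the row count, while `ORDER BY` leaves the engine free to use a better sort or an existing index.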

That kind of thing is played out every day between programmers, and when they
run into something like relational theory their first instinct is all too
often to dismiss it as old, inapplicable, ugly (see the other comments in this
forum), rigid, etc. Those are all code for "I don't understand it and I can't
be bothered to learn it." When a DBA who is charged with maintaining data
integrity interferes with whatever bad idea the programmer has, he or she is
subjected to _ad hominem_ attacks. Some DBAs no doubt deserve contempt
but in my experience a lot of programmers can and should be learning from DBAs
rather than sneering at them.

In the comments in this forum you'll see the "Relational databases can't work
with tree-structured data" canard hauled out, as if this exact problem hadn't
been addressed numerous times in the database literature (Joe Celko, Fabian
Pascal, Chris Date, etc.). That's ignorance, and the older I get the more I
suspect it's more willful than I'd like to think.
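
As one illustration of tree-structured data in a relational schema, here is a small sketch using an adjacency list plus a recursive common table expression. This is just one of several techniques in the literature (nested sets is another), and it is not taken from any of the authors named above. The `category` table and its rows are invented for the example, and SQLite via Python's `sqlite3` is assumed as the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES category(id),"
    " name TEXT NOT NULL)"
)
# Each row points at its parent; the root has no parent.
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [
        (1, None, "root"),
        (2, 1, "databases"),
        (3, 1, "languages"),
        (4, 2, "relational"),
        (5, 2, "object"),
    ],
)

# Walk the subtree under a given node, tracking depth as we descend.
subtree_sql = """
WITH RECURSIVE subtree(id, name, depth) AS (
    SELECT id, name, 0 FROM category WHERE id = ?
    UNION ALL
    SELECT c.id, c.name, s.depth + 1
    FROM category AS c
    JOIN subtree AS s ON c.parent_id = s.id
)
SELECT name, depth FROM subtree ORDER BY depth, name
"""
subtree = conn.execute(subtree_sql, (2,)).fetchall()
```

Starting from node 2, `subtree` holds "databases" at depth 0 followed by its two children at depth 1, all from a single query.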

~~~
edw519
I'm not sure why you're not sure that I got the point of your article. I agree
with pretty much all you have to say here.

What I think is "stupid" is not your article, but the fact that some
programmers don't like relational databases (for whatever reason). I'll try to
pick my words and make my point more carefully in the future.

I understand much of the theory behind these technologies, but, as a
practitioner, I'm not in a position to base my actions solely upon theory. I
HAVE to deliver. I wish I had a nickel for every benchmark I've conducted over
the years. You may have noticed that I tend to avoid theoretical debate here;
I'm more interested in what works.

And relational databases work very well, very often. Whether you like them or
not.

------
geebee
I hope I don't get negged as a starry eyed Rails enthusiast, but here goes...

Up until recently, I generally heard programmers complain constantly about the
limitations of SQL and RDBMS's, and I heard a lot about object databases. The
idea was that programmers would define objects, and the database would provide
persistence for the objects. It was (and still is) a nice idea.

Rails took a completely different approach, embracing the RDBMS as the model
itself. Rather than creating objects and pushing the design out to the RDBMS,
you define the relational structure and allow Rails to extract an object
model. If you are doing basic CRUD operations, you may not have to think about
SQL again, even in a reasonably complicated app, and the syntax is very clean.
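
The idea of deriving the object model from the relational structure can be sketched roughly like this. It is Python rather than Ruby, it is not how Rails is implemented, and `model_for`, `find_all`, and the `posts` table are hypothetical names. It assumes SQLite, whose `PRAGMA table_info` lists a table's columns.

```python
import sqlite3
from collections import namedtuple

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute("INSERT INTO posts (title, body) VALUES (?, ?)", ("Hello", "First post"))


def model_for(connection, table):
    """Build a record type whose fields mirror the table's columns."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    # The table name is interpolated directly, so pass trusted names only.
    columns = [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]
    return namedtuple(table.capitalize(), columns)


def find_all(connection, table):
    """Return every row of the table as an instance of the derived type."""
    model = model_for(connection, table)
    return [model(*row) for row in connection.execute(f"SELECT * FROM {table}")]


posts = find_all(conn, "posts")
first_title = posts[0].title  # attribute access, no class written by hand
```

The schema is the single source of truth here: add a column to the table and the derived type picks it up on the next call.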

I just love it. And Ruby is like a beautiful flo... oh yeah, didn't want to
get negged, better shut up now.

~~~
ardit33
duh. Hibernate has been doing this for ages, before Rails even existed. And it
is a mess. You are trying to shoehorn a solution (ORM) so the database will
fit the way you would like that data back.

How about actually having the database start evolving just like programming
languages have?

~~~
jimbokun
"How about actually having the database start evolving just like programming
languages have?"

What's stopping you from solving this problem? Sounds like a great
entrepreneurial opportunity. Maybe you could submit a YC application.

~~~
ardit33
Haha. Maybe next year. It has taken decades to reach the RDBMS that we have
today. They have improved in efficiency over time, but the innovation has stopped.

Any new system (OOD) will take years before it reaches the maturity level to be
used in commercial solutions. Kinda hard for a cash strapped startup.

Also, there have been attempts to build OODs but they have failed, mainly b/c
they were way ahead of their time.

What I would like to see is a distributed Object Oriented Database system, that
can scale at will.

~~~
jimbokun
I think Ellison and the Oracle guys consulted for years in order to eat while
working on their relational technology until they thought they were close
enough to go for it and drop the consulting and bet everything on the database
as a product.

Maybe you could take a similar approach?

------
breck
Isn't it because programmers are bad with all types of relationships?

~~~
edw519
LOL. That's not what my probation officer said.

------
mojuba
An intriguing topic, nice try, but the author failed to explain the reasons
why programmers hate databases. It's definitely not because we don't know the
relational theory (I beg your pardon, but screw relational theory, it's as
ugly as DBs themselves).

One problem I can think of is that I'm forced to mix two languages in one
story (a program, that is). It's like writing an essay in English over-
peppered with a lot of Latin phrases and sayings. In web development it gets
even worse: as if my favorite programming language X plus SQL wasn't enough, I
mix up HTML, CSS and JavaScript - that's at least 5 languages in total!

Another problem is that from a programmer's perspective databases are, put
simply, arrays possibly with pointers. The only reason we can't declare these
structures as arrays/lists with pointers is that data can grow infinitely big
and my favorite language X can't handle that. In other words, using databases
is a classical example of premature optimization that comes at a price of
overcomplicated code. Yes, databases are plain arrays and even the best
relational advocate can't prove the opposite (or I'd be glad to hear arguments
actually).

Upd: persistence and a possibility to share data between processes is another
reason we use databases, but I'm sure there are more elegant ways of doing
this.

~~~
nostrademons
I don't understand the resistance to using multiple languages in one project.
Every time you use an API, you're effectively using a new language. The syntax
may be familiar, but the vocabulary is completely different. That's the
_point_ - through libraries and DSLs, we extend the capabilities of the base
language so it's more suitable to the problem domain.

Take a look at the following SQLAlchemy code:

    
    
      select([users.c.first_name + ' ' + users.c.last_name,
              users.c.email, games.c.title],
           from_obj=[users.join(games,
                     users.c.user_id == games.c.creator_id)],
           whereclause=and_(
                games.c.creation_date >= datetime.datetime.now() - datetime.timedelta(3),
                or_(users.c.email.like('%@aol.com'),
                    users.c.email.like('%@hotmail.com'))))
    

Can you really say that's easier to read than this?

    
    
      SELECT users.first_name + ' ' + users.last_name,
             users.email, games.title
      FROM users
      INNER JOIN games ON (users.user_id = games.creator_id)
      WHERE games.creation_date >= DATE_SUB(NOW(), INTERVAL 3 DAY)
      AND (users.email LIKE '%@aol.com' OR users.email LIKE '%@hotmail.com')
    

SQLAlchemy is a really cool library - it basically lets you write all your SQL
queries in 100% Pure Python. But I doubt I'd use it on my own projects,
because I've learned to be suspicious of libraries whose only purpose is to
make some other tool look like a language I already know. Invariably there are
reasons why the original language looked the way it did, and the
reimplementation as a library becomes incredibly clunky as things get more
complicated. I find I'm better off learning the original language instead.

~~~
olavk
My theory is that there are two kinds of people. Those that like everything to
be in the same language and environment, and those that like to combine lots
of different languages where each language is optimized for a specific task.

~~~
davidw
It's _difficult_ to be good at many languages. Knowing many is not too hard,
but being really quick with them is. IMO, at least. And the more you add, the
messier things get. Google, who seems to know a thing or two about
programming, only allows four: Java, JavaScript, Python and C++.

~~~
nostrademons
In production code (i.e. deployed to the publicly-facing website). I've heard
that for internal-facing apps, 20% projects, and non-Google-branded stuff, you
can use any language you like. Orkut was initially written in .NET, for
example.

------
sanj
I've thought about this a bit, having had to write various ORM layers over the
years.

My conclusion is that OOP is conceptually built around single "objects" where
RDBMSs are built around sets of "objects". The difference in nominal units is
some of the basis of the impedance mismatch.

Think about what it takes to do something interesting with a collection of
objects in C++/Java/whatever, such as pick out the subset that have a cost>n:
it is a loop over the entire set. Ruby makes it prettier and cleaner, but
you're still just looping. Unless you create a _custom_ collection class that
internally stores a hash organized by costs. But the creation of the new class
is extra overhead because the nominal unit is the single object.
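
A rough sketch of that contrast, in Python with the standard-library `sqlite3` module. The `Item` class, the `items` table, the sample rows, and the threshold `n = 10` are all made up for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    cost: int


items = [Item("pen", 2), Item("book", 15), Item("laptop", 900), Item("mug", 8)]
n = 10

# Object view: the unit is one Item, so selecting a subset means visiting each.
expensive_objects = [item.name for item in items if item.cost > n]

# Set view: the unit is the whole table, and the subset is stated declaratively.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, cost INTEGER)")
# An index on cost plays the role of the custom collection class.
conn.execute("CREATE INDEX items_cost ON items (cost)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(i.name, i.cost) for i in items])
expensive_rows = [
    r[0]
    for r in conn.execute("SELECT name FROM items WHERE cost > ? ORDER BY name", (n,))
]
```

On the SQL side, adding the index is one statement and the query text does not change; on the object side, the equivalent is a new collection class that every caller has to know about.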

The counterexample is just as instructive. Retrieving the newest 10 entries
in a blog uses the LIMIT syntax in SQL, which is painfully nonstandard:
<http://troels.arvin.dk/db/rdbms/#select-limit>

Things that are simple in one space are hard in the other.

In my own work I've resolved some of this by attempting to avoid abusing the
"standard" ORM tools that are out there that allow designing queries in your
language of choice (Rails for Ruby, Hibernate for Java). Instead, I attempt to
figure out exactly what I need out of a query: how to structure the SQL to give
me back the exact set of data I care about. No subsequent looping in Ruby/Java
allowed.

That forces a line in the system between the DB and the application code where
each is able to use its nominal units in as native a way as possible.

~~~
neilc
You don't need to step outside the SQL standard to retrieve the newest 10
entries in a blog -- SQL window functions (RANK(), ROW_NUMBER()) can do this,
and they're part of the spec. You could also use a cursor with ORDER BY.

~~~
sanj
I don't believe that row_number() is standard, and I thought it couldn't be
relied on for being monotonically increasing with inserts. It will also fail
if you need to use a date column that may not be ordered by time.

And, for me at least, cursors are definitely in the range of "advanced-ish"
SQL.

My point isn't that it can't be done -- empirically, many blogs do it -- but
that it points to part of the impedance mismatch.

~~~
neilc
> I don't believe that row_number() is standard

Sure it is: see Section 6.10 of Part 2 of SQL:2003.

> I thought it couldn't be relied on for being monotonically increasing with
> inserts. It will also fail if you need to use a date column that may not be
> ordered by time.

I think you are confused: ROW_NUMBER() is a _window function_, and as such is
applied to an arbitrary query expression:

    
    
     SELECT x, y, row_number() over (order by z) from t
    

And I would beg to differ that cursors fall into "advanced-ish" SQL :)

I don't see how it is an "impedance mismatch": transforming sets of data is
the fundamental goal of SQL, and this seems to fall squarely within that
domain.
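
For the blog case specifically, a sketch of the window-function approach might look like the following. It runs against SQLite through Python's `sqlite3` module, which is an assumption (window functions need SQLite 3.25 or later); the `entries` table and its 25 sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT, posted_at TEXT)"
)
# 25 sample entries, one per day; insertion order is irrelevant to the query.
conn.executemany(
    "INSERT INTO entries (title, posted_at) VALUES (?, ?)",
    [(f"post {i}", f"2008-01-{i:02d}") for i in range(1, 26)],
)

# Number the rows by date, newest first, then keep the first ten.
newest_sql = """
SELECT title, posted_at
FROM (
    SELECT title, posted_at,
           ROW_NUMBER() OVER (ORDER BY posted_at DESC) AS rn
    FROM entries
) AS ranked
WHERE rn <= 10
ORDER BY rn
"""
newest = conn.execute(newest_sql).fetchall()
```

The numbering comes from the `ORDER BY` inside `OVER (...)`, not from insertion order, so it works for any column you can sort on. In SQLite the same rows also come back from `ORDER BY posted_at DESC LIMIT 10`, which is the vendor-specific form discussed upthread.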

------
henning
I like databases because unless you have an unusually nice job, SQL is the
closest you'll get to getting paid to use a declarative programming language.

