
Why Normalization Failed to Become the Ultimate Guide for Database Designers? - blasdel
http://lambda-the-ultimate.org/node/3762
======
henrikschroder
Reaching the perfect database schema is only possible if you have static
requirements. Out here in the real world, things change, and the more
normalized your database model is, the harder it is to change that to reflect
your new business needs.

Another opposing force is real-world performance. In most cases we're fine
with a non-perfect model, we're fine with duplicated data, and we're fine with
inconsistent data, because the problems they cause are much easier to deal
with than performance bottlenecks. Having a slow frontpage because getting
your inventory list requires a five-way join is gonna put you out of business
really, really fast.

~~~
olavk
In my experience it is easier to change a normalized scheme then a non-
normalized. The redundancy and integrity issues of non-normalized schemas are
really difficult to deal with when the schema has to evolve due to changing
business needs. I have certainly never observed that un-normalized data should
make it _easier_ to evolve a schema - can you provide an example of when this
would be the case?

Regarding performance optimization, I think it is valuable to distinguish
between the logical model and the physical model. Any kind of caching is a
kind of redundancy, but that is not a problem if the database engine
guarantees the integrity of the logical model. Relational database engines
provides ways to optimize the physical storage and querying of data without
compromising the logical model. For example materialized views can provide the
same kind of optimization as a denormalized table in the 5-way join case - but
without the integrity issues.

------
wingo
Whenever Z-Bo writes I feel like there is a thought there, screaming to get
out, but muffled by a blanket of words. I wish he would be more concise.

~~~
lucifer
I got the sense that he didn't grasp the point of Kent's paper (c.f. his
remark regarding "N+1 schema"). I personally agree with Kent's view regarding
the necessity for yet another (but final) abstraction _layer_ to decouple
semantic and logical abstractions. At some point in the future DSLs and
Schemas will meet in a happy union to achieve this.

~~~
mschy
To quote one of the first engineers I ever worked with: "all problems can be
solved with an additional layer of abstraction. _"

_ except performance problems, of course.

~~~
BigZaphod
"except performance problems, of course."

Well, in some ways, even those can be solved with abstraction. The key is
where that abstraction layer is placed. :) The assumption is you add them "on
top" of what came before; but if you go in the other direction you may be able
to improve performance by implementing, say, a smarter CPU with higher level
abstractions built into the silicon. Things like that.

~~~
olavk
High level languages is also an example. C is really an abstraction layer upon
machine code, but in most cases compilers produce faster code than what you
would do if you wrote assembler. So C is an abstraction layer that helps you
improve performance.

------
DanielBMarkham
I didn't see a lot of insight here, but I upvoted it anyway for at least
dealing with some of the trickier issues of programming.

If you think of programming in mathematical terms, it's almost like you're the
polar opposite of the pragmatic practitioner. This whole idea of a perfect
form is inane. Relational set theory gives us conceptual ways of talking about
constructs. It's a tool, not a way of life. Somebody build a bad relational
model? Well -- what do you mean by "bad"? Did the resulting code achieve the
goals set at the beginning?

And when I got to the point in the abstract where it said no books had shown
how to use it well, I was ready to bail out. I'm not sure what books the
author has been reading. The ones I've seen have practical real-world examples
of normalization all through them.

------
jganetsk
I wonder if it's possible to have some sort of controlled de-normalization.
Like, the ability to create a read-only column whose values mimic that of a
column in some other table. A "view column" if you will.

I would definitely opt for that as a "real-world" practice.

------
scotty79
Joins don't scale.

------
bhiggins
There need not be any relationship between the schema, which is purely
logical, and how it is implemented. In practice, there is, but why this is
still the case is baffling.

------
Kilimanjaro
Let me guess twitter's bigtable:

    
    
        - id, nick, mail, pwd, time, mytwithere
        - id, nick, mail, pwd, time, anothertwithere
        - id, nick, mail, pwd, time, mythirdtwithere
        - id, nick, mail, pwd, time, mylasttwithere
    

hmm, no, that won't work

normalization still wins

