

3 Rules of Database App Aging - Harkins
http://push.cx/2009/rules-of-database-app-aging

======
chime
Let me add a few more from my 16+ years of DB coding:

1\. Many field types will change over time. Booleans will become multiple
choice. Character fields will become memos/text. Date fields will start to
have year, month, or day parts optional.

2\. All tables with tax-logic will be compounded by code hacks. So just a
table with county tax rates and order amount ranges will not be sufficient to
calculate the applicable tax because you will need to keep adding special
conditions of tax to apply when pencils are sold on 3rd week of every month
when school is in session.

3\. Management will never understand how it could be that complex to "just
change the database" to enable payroll calculation for hourly employees who
work overtime just because the database already handles the fixed payroll for
salaried employees. In other words, databases with existing data, logic,
triggers, and reports will always be difficult to change.

4\. The biggest rule is that business logic never ever translates to
structured database tables without lots of conditions in code and strict
limitations in scope of application. I'm not talking about pretty to-do lists,
address books, or blog engines. I'm talking custom ERP apps, MRP modules,
production forecast algos and my favorite, scheduling for production and
employees.

I guess the take-home message in all of this is that don't get into this field
if you don't like challenges of this sort. In addition to coding up DB apps, I
also make a lot of non-database apps that keep me sane and balance things out.
However, in terms of sheer complexity and project planning, good DB apps are
pretty high on my list of "exciting" things.

~~~
biohacker42
Interesting, if these DBs ever growing in complexity are just a representation
of an organization's inner workings...

I wonder if you could use an organization's DB to predict when they'll be too
big to function.

~~~
DLWormwood
> Interesting, if these DBs ever growing in complexity are just a
> representation of an organization's inner workings...

Isn't there a named law based on this?

EDIT: Yup. Found it.

<http://en.wikipedia.org/wiki/Conway%27s_Law>

------
tricky
The idea of putting any logic in the database never did make much sense to me
for these exact reasons. The beauty of software is it is soft (i.e. easily
changed) and us hackers have learned how to manage change with tools and
whatnot. Databases are robust and solid because they have to be. A lot of the
time they're the app's foundation. You go chipping away at the foundation and
you end up with a trailer home.

I spend 100% of my time writing "database apps." I'm not saying I'm right, but
I use the database to store data, and my app to maintain consistency and
provide logic. You can't see the future, so why pretend? I mean, who would
have thought a single person can be considered either male, female, or both
depending on the context and date... That happened to one of my apps... No one
would have had the forethought to handle that case during the initial design.

~~~
tt
> The idea of putting any logic in the database never did make much sense to
> me for these exact reasons.

I'm a big proponent of enforcing business rules right into the data model.
Constraints such as not null, reference constraints, or even conditional
unique constraints (yes, there's such a thing!) should be enabled wherever
possible.

Just remember, your app will not be the final app that makes use of the data.
Your app comes and goes, along with many other apps written by many others.
But the data lives "forever."

~~~
voidpointer
In my opinion the notion of implementing business logic outside of the
database and acknowledging the longevity of the data are not necessarily in
conflict. You need to have a well defined model and you can choose whether the
database is a full implementation of that model or just a representation of
the state of that model.

I don't find databases well suited for expressing complex behavior -
programming languages are far better at that.

------
Blaksheep
I've been at my company for just shy of 4 years and I had no official training
or schooling in database design or programming for that matter. When I started
we were roughly 3 years out of a migration from MS Access to SQL Server - all
done without rewrites or restructuring, but rather a simple recreation of the
existing schema.

4 years later, the 4 "programmers" that were there are now long gone and we've
cycled another 5 or 6 since I've been there and have managed to keep another
10 on staff. Not one a DBA.

So you start with a horrible database design in MS Access. Couple that with
the migration to SQL. Add in 8 years of "maintenance" by nearly 2 dozen
programmers who were forced to do web design, web development, application
programming, hardware maintenance and everything else under sun because
management doesn't know the difference. Finally, you mix that up with a sales
driven CEO & COO and no real CIO, which means that deadlines and client
satisfaction are king and properly designed architectures take a back seat to
"just add a column so we can get it done".

What does leave you? A 12 Million dollar INC 500 company, backed by a dozen
programmers with an average age in the mid 20's and average salaries just
BELOW the average Level 1 programmers salary in our area. It also leaves you
with a 9 hour day with your Management Studio open writing alters, updates,
deletes and selects just to handle the relentless tickets that come in from
the rest of the support staff. And it also leaves you with an 86 GB database
consisting of 5000 tables, 5700 views, 3500 stored procedures with little to
no data warehousing and enough transactional processes to choke an 8 Core, 24
GB Dell Server on a daily basis.....

Did I mention we don't have load balancing or proper offsite backups? We've
been teetering on the brink of absolute chaos and collapse for so long that
routine 30,000 row updates I won't even blink at. And there's no end in sight.

This article is great, but it doesn't even scratch the surface. ;)

------
andersnawroth
I really find it refreshing that more people start questioning "relational
model" databases. The discussion here inspired me to write a little on what
the problem actually is with relationships in RDBMS, read it here:

[http://blog.nawroth.com/2009/01/aging-databases-and-
relation...](http://blog.nawroth.com/2009/01/aging-databases-and-
relationships.html)

~~~
peterneubauer1
Good post Anders!

------
voidpointer
These are great examples towards the fact that the relational model isn't
always the best way to go. Especially with regards to an evolving model. In
object oriented models, polymorphism can be employed to extend an existing
model with rather minimal impact. Introducing, for instance, a subtype for an
address in a relational model isn't easily done. I'd rather model in OO land
and use the DB as dumb object store. I haven't implemented a serious project
based on an real OO database. Any experiences you guys would want to share
with regards to model flexibility with an OODB?

~~~
litewulf
Single table inheritance?

~~~
seiji
people from the wild mysql badlands may not understand your point. here you
go: [http://www.postgresql.org/docs/current/static/ddl-
inherit.ht...](http://www.postgresql.org/docs/current/static/ddl-inherit.html)

~~~
litewulf
I actually use MySQL ;) Its pretty web two point oh-y.

Honestly, when I say something like "single table inheritance" I wish people
would use say a search engine and type it in and see what comes out. Its a
pretty unique phrase that gets you what you want... don't enable people who
expect everything to be explained to them! (Not uninformed-ist mind you.)

------
tom_rath
No way. If you discover your database model is incorrect, fix it. Data kludges
and hacks are not professional solutions.

The individual who wrote this is dead wrong and either headed for a lifetime
of maintenance pain or a well-deserved reputation of being "that guy who
caused all the problems" after he's moved on to cause pain elsewhere.

Nonsense like this is why corporate DBAs are such hard-asses when it comes to
changing the schema.

~~~
Harkins
Hi, I'm the individual who wrote this. So far I have never earned a reputation
for causing problems, but I have earned one for fixing them. (Yes, I am in
contact with former coworkers.)

I think we may be in violent agreement: I also think kludges and hacks are
unprofessional. In the post I'm writing about my experience with the way that
models are exposed as imperfect over time and they're fixed to deal with it.
As much as I gripe about exceptions and outliers, capturing them is a big part
of design.

~~~
tom_rath
...Except the 'fixes' you've given aren't fixes: They're patchwork on-the-spot
kludges which let you ignore the problem until it crops up again. A few years
of that will give you spaghetti code that is a nightmare to maintain.

How about that user who has two locations? Sure, you overlooked that in the
initial design, but a permanent fix needs to be made to ensure the system will
accommodate the next one which arrives.

A profile for someone located outside the US? Pretty likely you'll see that
happen again if the system grows. Why not account for it correctly rather than
rely upon duplicate data or another such hack?

Your urge to 'fudge and hack' is natural but the professional solution is to
resolve the root cause by correcting the data model, not fiddling with the
data to 'just make it work'.

After the initial glory of creating Software v1.0, much of the remaining work
involves fixing the system to better reflect reality. The reason mature
software takes 'exceptions and outliers' in stride is that it's been coded to
do so after a great deal of trial and error in on-going analysis and
discovery.

Ignoring those exceptions and working around them rather than fixing them
permanently leads to madness.

~~~
Harkins
The correct fix for the guy with two locations was to remove the location_id
column and create a join table from People to Locations. The correct fix for
the guy outside was to alter the Location model to accomodate that, which
included making US postal codes optional.

 _Your urge to 'fudge and hack' is natural but the professional solution is to
resolve the root cause by correcting the data model, not fiddling with the
data to 'just make it work'._

We agree again, that's why I wrote that they're "like freezing a bottle of
water, it always grows and breaks worse in the end."

~~~
tom_rath
That sounds reasonable.

The impression I gained from your blog post was that of willy-nilly
denormalization and data duplication to accommodate edge-cases (which often
become anything but) as they're discovered to make problems quickly go away.
After working on a few development, refactoring and data conversion projects,
I was astounded to discover how often those sorts of kludges crop up and how
costly they become.

Perhaps you could write a sequel pointing out the pitfalls of what appear to
be easy database fixes to design errors but turn out in the long-run to be
anything but.

~~~
Harkins
Sorry I left you with that impression, I certainly didn't mean to. I only
mentioned those as great ways to shoot yourself in the foot when you don't
have the patience/skill to do the Right Thing.

Sounds like you're better prepared to write that followup than me, but I'll
ponder it. Maybe "Database Anti-patterns". Actually, just typing the name
makes me think it likely someone's done this sort of thing, but I don't have
time to check.

------
ruby_roo
I am curious about the implications of Rails migrations here. It seems that
migrations give you a pretty good level of softness in your data models when
you need it. Of course, I think a lot of Rails apps do end up with much of
their logic in the M part of MVC. And understandably, some folks are not very
comfortable with that.

Can anyone speak from experience on this? Talking about mature Rails apps now.
Are Rails migrations a help or hindrance in terms of 'app aging'?

~~~
nikz
In a number of the apps we've worked on, we've found that migrations are no
longer useful in order to build a database from scratch (say, you're getting a
new employee started and are setting up his development rig).

Mainly because the actual Model code changes so much in between someone
running the "rails" command and the "mature app" stage - the validation logic
and callbacks being the main culprit.

Eventually, it's easier to bootstrap the schema from schema.rb (rake
db:schema:load IIRC) than it is to maintain migrations from a zero state.

Then you've just got to make sure you keep up with the deltas :)

------
Hexstream
And an equation to sum it up:

DatabaseSchema + RealWorld = Entropy

------
edw519
_The modern database paradigm is defined by relations, so of course that’s
what falls apart as soon as you get an app into production._

Only poorly designed data base schemas "fall apart".

If you find yourself making all fields optional or needing only many-to-many
relationships, that's a sure sign that you already did something wrong when
you designed the data base.

Surely you must have known about the guy from Guam or the one living in
Virginia but commuting to D.C. when you analyzed your app, right? And even if
you didn't, you did design a data base structure flexible enough to accomodate
the natural evoluation of the app, right?

Please don't confuse the limitations of 3rd and 4th normal form with the
shortcomings of data bases built with them.

~~~
Harkins
For what it's worth, the "what falls apart" line was intended to be tongue-in-
cheek.

 _Surely you must have known about the guy from Guam or the one living in
Virginia but commuting to D.C. when you analyzed your app, right?_

Nope. For most of the apps I've worked on recently, we're pulling incomplete
datasets that are augmented from other, slightly-incompatible sources and
manual entry and must be designed and built to meet the deadline for a
newspaper story. So I do the best I can to capture the data and then change
the schema as these uncaptured/new requirements make themselves known.

~~~
edw519
_must be designed and built to meet the deadline_

 _I do the best I can_

Then we agree.

The analysis for the app was incomplete (for whatever reason). I'm not looking
to place blame, I'm just saying that a well designed data base by definition
must handle the wear and tear of the app.

Some of the worst crap I ever had to maintain was originally considered a
success because it met its original deadline. Sigh.

~~~
inerte
> I'm just saying that a well designed data base by definition must handle the
> wear and tear of the app

Impossible. This is pure theory and has no ground on reality. Name me one
database that was designed to handle Sarbanes-Oxley needs in the last decade.

Business requirements change, and so does the app and the database.

Tell me how you would deal with this situation in your current database
designs: A few years ago, a born male person here in Brazil got his sex
legally changed to female (also had a sex-change operation). So, s/he's a
female for most of legal concerns, but s/he's got the genome presented on
males, which means:

1) He needs to be a "female" on your employees records; 2) She needs to be a
"male" on your company health insurance records;

Sorry if I sounded abnoxius, I can't argue softly in English :p

~~~
Harkins
Something a little like this happens on our elections app: politicians have a
name and party, but sometimes run in an election under a different name (eg.
"Robert" runs as "Bob") and party (eg. Joe Lieberman lost his primary but won
in the general as an independent). And sometimes they're doing this in
multiple elections at once.

