

Marriage and Object Databases (response to the similarly-named article from earlier today) - jrockway
http://blog.jrock.us/articles/Marriage%20and%20Object%20Databases.pod

======
giardini
From the article:

"Polymorphism is where OO excels and where the relational model fails..."

But nothing can be done with OO structure that cannot be just as easily
modeled in a relational form. One only need think in Prolog to find the
appropriate model. And that's the beauty of relational databases: all
constraints are in the form of relations (i.e., not some as relational
constraints and others as inheritance constraints).

Various bugs (there are probably more) in the implementation: The article's
database allows two entries of the form: (marriage_id1, Tom, Mary) and
(marriage_id2, Mary, Tom). to coexist in the database,

The article's database also allows such transitivities as (marriage_id1, Tom,
Mary), (marriage_id2, Mary, Bill), (marriage_id3, Bill, Tom). Relational or
OOP, a constraint to eliminate these cases requires a search of the database.
It is insufficient to require only that one cannot marry oneself.

The article's database also fails to handle the not uncommon case where two
persons marry, divorce and then remarry (often consecutive times).

About performance: There's never been a question about performance: object
databases can perform faster than relational databases, usually by an order of
magnitude. Most OO database use primarily pointers while relational databases
use primarily indexes. Computers are fast enough today that in most instances
we don't notice the difference. But where speed is absolutely necessary OOP
databases (and similarly structured databases such as CODASYL, ISAM, etc.) are
used.

Some reasons relational databases are preferred: \- ease of maintenance, \-
the relational model allows a standardized query language, \- constraints are
handled within the database engine proper.

~~~
jrockway
Thanks for your comments.

 _The article's database allows two entries of the form: (marriage_id1, Tom,
Mary) and (marriage_id2, Mary, Tom). to coexist in the database,_

This mostly stems from using a set based on eq. If it used a custom equality
function, this wouldn't problem. (The code implementing this involves a long
trip into the weeds, and I wrote the article for people that have never heard
of object databases... and for people that aren't necessarily comfortable with
CL.)

 _The article's database also fails to handle the not uncommon case where two
persons marry, divorce and then remarry (often consecutive times)._

I would just add a fresh marriage object to the set. The reason we "change-
class" on the marriage when divorcing is because a divorce cannot exist
without a marriage. A remarriage is no different from a fresh marriage... in
my mind anyway.

If you want to query on remarriages, though, then you should probably make a
type for this. A quick "defclass" and you are set.

 _Some reasons relational databases are preferred: - ease of maintenance_

Not sure there is a big difference here. There is some care and feeding that
BDB requires, but nothing extreme. (You can also use a SQL-based database as
the backend for Elephant.)

 _the relational model allows a standardized query language_

There is no reason you couldn't declare classes so that information necessary
for querying is present, and then add a query langauge on top of that.
AllegroCache does exactly this.

 _constraints are handled within the database engine proper_

I'm not sure why you see a difference between (network + relational database
engine) and (rpc frontend + object database). I mean sure, you can touch the
files on disk and ignore the RPC interface, but I can rm -rf /var/lib/mysql
too. I don't see a difference in safety, other than the many years of
production testing that the SQL databases have.

This is an implementation detail, though, and not a fundamental difference.

------
neilc
_I don't know what would be going through the mind of someone that thinks that
Enums should be modeled by using separate tables (i.e. one table for males,
one for females; modeling the enum Gender = Male | Female), so I am not even
going to talk about those. I am going to jump right to a model that won't make
people cry._

I didn't see anywhere in the original article that models "gender = male |
female" as _two_ additional tables. There is a schema variant that uses a
single additional table "genders", with foreign keys pointing into that table
from the "humans" table. That seems like quite a plausible choice for modeling
an enum in a normalized fashion, and it provides a clean way to handle when
the set of enum choices changes (that is, new choices can be added trivially,
and data integrity is automatically checked when options are removed). It's
not the most efficient or concise representation, of course, but those are not
the only criteria when choosing a data model.

~~~
whatusername
he was talking about the start of the OP: There was a "Male" table and a
"Female".

It was options 1 through to 6 I believe.

------
Hexstream
I really don't get why he declares that a &rest variable is of type list... Of
course it is.

~~~
jrockway
Indeed, but delete-duplicates wants to know that it's a list and not a vector.

~~~
Hexstream
But the compiler already knows that &rest variables are lists...

~~~
jrockway
SBCL doesn't. Try it and see.

------
bhiggins
you can model type inheritance relationally. it's completely orthogonal.

