
A Tiny Intro to Database Systems - sandcrain
http://blog.dancrisan.com/a-tiny-intro-to-database-systems
======
crdb
As a non-CS grad coming fresh to databases, I found both the
entity-relationship and the object-oriented models confusing. Then I read
Date's [1] and Codd's [2] books and papers on the relational model, the one
from the 1970s that is basically set and type theory applied to data, and
found that to be a lot clearer and a more powerful abstraction for dealing
with your data model.

For example, your Relational Model introduction has a discussion of various
data types. But arguably, whether your integer is implemented as BIGINT or
TINYINT is an _implementation_ decision which should be separate from the
_model_ discussion (dixit Date). In other words, that attribute has a type of
integer; how that integer is stored is a separate issue, and your RDBMS ought
to abstract it away (Postgres, I think, is pretty good at this, and MySQL
quite annoying). The beauty of the latest RDBMS developments, particularly in
the Postgres world, is that the implementation has gotten so good that you
don't really need to worry about it like you did just a decade ago, at least
in 95% of use cases.

Again, as someone who is not a "full time developer", I am amazed by the
number of "experienced" developers who are not aware of the relational model
and who do not know what a foreign key is or why referential integrity might
be important.
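
To make the referential integrity point concrete, here is a minimal sketch
using Python's built-in sqlite3 module. The guest and booking tables and their
columns are hypothetical names invented for illustration; the only assumption
about SQLite itself is that foreign key enforcement has to be switched on with
a pragma.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: every booking must point at an existing guest.
conn.execute("CREATE TABLE guest (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE booking ("
    " id INTEGER PRIMARY KEY,"
    " guest_id INTEGER NOT NULL REFERENCES guest(id),"
    " night TEXT NOT NULL)"
)
conn.execute("INSERT INTO guest (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO booking (guest_id, night) VALUES (1, '2015-03-01')")


def try_orphan_booking(connection):
    """Return True if the database rejects a booking for a missing guest."""
    try:
        connection.execute(
            "INSERT INTO booking (guest_id, night) VALUES (99, '2015-03-02')"
        )
    except sqlite3.IntegrityError:
        return True
    return False


rejected = try_orphan_booking(conn)
booking_count = conn.execute("SELECT COUNT(*) FROM booking").fetchone()[0]
```

The database itself refuses the orphan row, so no application code has to
remember to check that guest 99 exists.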

I think one can teach SQL (and the relational model) to a non-developer in
about 2 hours, because it is so declarative and intuitive. One day I'll go
write that tutorial, as many clients need it sorely...

[1] e.g. [http://www.amazon.com/SQL-Relational-Theory-Write-
Accurate/d...](http://www.amazon.com/SQL-Relational-Theory-Write-
Accurate/dp/1449316409)

[2] e.g. [http://www.amazon.com/The-Relational-Model-Database-
Manageme...](http://www.amazon.com/The-Relational-Model-Database-
Management/dp/0201141922) or the original paper:
[http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf](http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf)

edit to add: on the E/RM vs the RM: [http://www.dbdebunk.com/2013/09/entity-
relatonship-model-not...](http://www.dbdebunk.com/2013/09/entity-relatonship-
model-not-data-model.html)

~~~
edejong
To really teach the relational model would take quite a bit more time. I would
discuss database normalization (3NF/4NF/BCNF), query optimization, indexes,
foreign key constraints, and bridge set theory and the relational model.
Optional parts would be triggers and other kinds of constraints.

Understanding the query planner is essential to making good schemas, and it
requires insight into the underlying data structures (B-trees) and join
methods.
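
A rough sketch of how an index changes what the planner does, using SQLite's
EXPLAIN QUERY PLAN through Python's sqlite3 module. The booking table and the
idx_booking_guest index are hypothetical, and the exact plan wording differs
between SQLite versions, so the code only collects the plan text rather than
assuming a particular string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE booking (id INTEGER PRIMARY KEY, guest_id INTEGER, night TEXT)"
)
conn.executemany(
    "INSERT INTO booking (guest_id, night) VALUES (?, ?)",
    [(i % 100, "2015-03-01") for i in range(1000)],
)


def plan(connection, sql):
    """Return the planner's description of each step for the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable detail text.
    return [row[-1] for row in rows]


query = "SELECT * FROM booking WHERE guest_id = 42"
plan_before = plan(conn, query)  # no usable index: the table is scanned

conn.execute("CREATE INDEX idx_booking_guest ON booking (guest_id)")
plan_after = plan(conn, query)   # the planner can now search the index
```

Printing plan_before and plan_after side by side is a cheap way for a student
to see that the same declarative query can be executed in very different ways
depending on the schema.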

Then, for the student to get used to this way of thinking, I'd have them
implement a simple project, e.g. a hotel-booking system.

------
zAy0LfpBZLC8mAC
This is highly confusing. For one, it's pretty out of date (tape as "tertiary
storage"? 512 byte page size? the whole topic of concurrency control without
one mention of MVCC?), but also, the actual explanation seems to mix up quite
a few things.

For example: a "Write-Read Conflict" is just that, a conflict, which a
database in some appropriate isolation mode would handle by appropriate
locking in order to avoid "reading uncommitted data" (if the transactions were
to happen concurrently--per the definition given, operations don't need to
happen in concurrent transactions for them to be in conflict). Or, an actual
DBMS with MVCC with SSI, like a modern Postgres, would simply execute the read
on its snapshot and thus force the reading transaction to precede the writing
transaction in any equivalent serialized order (even though the read happened
after the write in realtime), and only abort the transaction if that could
lead to contradictions (cycles) in the dependency graph.
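
For anyone who wants to see the "cycles in the dependency graph" idea in code,
here is a minimal sketch of the textbook precedence-graph test for conflict
serializability. It is not how Postgres implements SSI; the schedule format
and the function names are assumptions made up for this illustration.

```python
from collections import defaultdict


def precedence_graph(schedule):
    """Build edges Ti -> Tj for each pair of conflicting operations.

    schedule is a list of (transaction, action, item) tuples in execution
    order, where action is "R" or "W". Two operations conflict when they come
    from different transactions, touch the same item, and at least one writes.
    """
    edges = defaultdict(set)
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (a1, a2):
                edges[t1].add(t2)
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node already on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in edges.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(edges))


# T1 writes A, then T2 reads A: a write-read conflict, but still serializable.
ok_schedule = [("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "B")]
# Adding a read of B by T1 after T2's write creates T2 -> T1, closing a cycle.
bad_schedule = ok_schedule + [("T1", "R", "B")]

ok_is_serializable = not has_cycle(precedence_graph(ok_schedule))
bad_is_serializable = not has_cycle(precedence_graph(bad_schedule))
```

The first schedule contains a conflict yet has an equivalent serial order
(T1 before T2); only the second one, where the edges form a cycle, has none.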

------
blueatlas
Well done. But I have to note, the chapter "Schema Refinement - Functional
Dependencies" is an example of what drives many students out of CS. Even so,
this is one of the better introductions to functional dependencies that I've
read.
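
For readers who find the notation heavy, the core computation behind
functional dependencies (the attribute closure) fits in a few lines. This is
a minimal sketch; the booking attributes and the two dependencies are
hypothetical and not taken from the article.

```python
def closure(attributes, dependencies):
    """Return every attribute determined by `attributes`.

    dependencies is a list of (left, right) pairs of sets, read as
    left -> right.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in dependencies:
            # Apply a dependency once its whole left side is already known.
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result


# Hypothetical relation: booking(room, night, guest, guest_email)
fds = [
    ({"room", "night"}, {"guest"}),
    ({"guest"}, {"guest_email"}),
]

key_closure = closure({"room", "night"}, fds)
guest_closure = closure({"guest"}, fds)
```

Here {room, night} determines every attribute, so it is a candidate key, while
guest alone only determines guest_email, which is the kind of dependency
normalization is meant to expose.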

~~~
justin66
It's a topic that you'll experience halfway through a graduate-level textbook
on databases like Elmasri's Fundamentals of Database Systems. No gentle way to
do it, and a student has probably been driven away or not well before they
read about it.

~~~
jhalstead
We covered functional dependencies (in the same level of detail that's
provided in the OP's link) in the first half of my undergraduate databases
course at UIUC last year. It was painful to say the least.

~~~
justin66
I know what you mean (I think this concept put me to sleep in class) but on
the other hand, it doesn't really have to be painful. I think with these sorts
of concepts, teaching technique is terribly important.

blueatlas cited the bit on functional dependencies as "an example of what
drives many students out of CS" but I'd say that more than anything it's an
example of a tricky topic that ought to be presented by a highly engaging,
smart instructor instead of a boring one who may or may not understand the
material very well. (in retrospect, my feelings and his aren't mutually
exclusive)

------
snissn
This is great, I really enjoyed reading your explanation of B+ trees! Could
you add in a small section regarding how to delete a node?

------
aesthetics1
There are a _lot_ of typos! Run through with a spell check.

------
GizaDog
A needed read. Thanks!

------
sqyttles
Just curious: why Python 2.7 over Python 3.x?

------
threeseed
Unless I am missing something, this is just for relational databases.

The term database can encompass pretty much everything from CSV files to
in-memory distributed grids.

