
Models for hierarchical data (2010) - ValentineC
http://www.slideshare.net/billkarwin/models-for-hierarchical-data
======
maxdemarzi
If you are dealing with lots of hierarchical data, then I recommend you take a
look at Neo4j (or other graph databases) since they are much better suited for
these kind of queries as well as updates at any level and multi parent
hierarchies.

See [https://maxdemarzi.com/2016/10/27/catalogs-and-
hierarchies/](https://maxdemarzi.com/2016/10/27/catalogs-and-hierarchies/) for
a quick example. Read until the end 'cause the kicker is being able to go UP a
hierarchy in no time at all.

See [http://www.slideshare.net/StampedeCon/managing-genetic-
ances...](http://www.slideshare.net/StampedeCon/managing-genetic-ancestry-at-
scale-with-neo4j-and-kafka-stampedecon-2015) for a real world use case with
Neo4j vs Oracle RAC comparisons.

~~~
jinjin2
Real object databases like Realm also makes it trivially easy to build these
kinds of hierarchical models, and I personally find it a bit easier to think
about as objects with references to other objects as opposed to the more
generic graph model of graph databases like Neo4j

------
andrewgrossi
I found this resource very helpful when building a solution for handling
multiple sub-tasks in a todo application (started with adjacency and ended
with closure based on recommendations). I'm assuming this issue is limiting
some of the existing software to prevent 1:n functionality. I also discovered
postgres has built functionality to support tree structure so might want to
investigate if you are exploring options
[https://www.postgresql.org/docs/9.2/static/ltree.html](https://www.postgresql.org/docs/9.2/static/ltree.html).

------
acjohnson55
I'll save you some time: you probably just want to use a closure table.

I did a bunch of research on this while working on a comments system. You pay
a price in table space if your hierarchies tend to get very deep. But for most
applications I can think of, it's the way to go, which you'll see if you page
through to Slide 69.

~~~
TeMPOraL
Well, I haven't heard of closure tables before. I only used nested sets. From
this presentation I learned not just the name, but how they look and how to
make a basic version (I'm sure there are some practical optimizations - O(n^2)
of the basic version feels like it could hurt).

~~~
acjohnson55
Here's another great resource on the options:
[http://stackoverflow.com/questions/4048151/what-are-the-
opti...](http://stackoverflow.com/questions/4048151/what-are-the-options-for-
storing-hierarchical-data-in-a-relational-database)

Yeah, O(n^2) could hurt. I guess you have to have some idea of what average
and max depths are to figure out how many rows you're talking about. But the
other advantages are pretty compelling.

------
snaily
The closure table is a neat solution, even though the adjacency list is by far
the most natural to me. As the author notes, breadcrumbs and subtrees can be
generated in the database layer by recursive CTEs, and support has become more
common since 2010. SQLite has added WITH RECURSIVE, for instance.

~~~
combatentropy
Yes, the first way, the adjacency list, has the simplest table layout and
still lets you set foreign keys. I strongly recommend PostgreSQL's WITH
RECURSIVE queries to go along with it
([https://www.postgresql.org/docs/current/static/queries-
with....](https://www.postgresql.org/docs/current/static/queries-with.html)).
Or now that I know SQLite has it, I recommend either one over MySQL. Writing
such queries is mindbending for a while. But it's worth it to be able to take
care of it all in the database. Before, I would write a function in PHP that
did a database query, then called itself to do another database query, and so
on until it was done.

------
netghost
Closure trees work quite well.

I've had some good success with materialized path solutions in Postgres since
its support for arrays makes them indexable and easy to query.

I'd suggest picking whichever is easiest to maintain, and works well with the
shape and structure of your data. If you use Rails and Postgres I have a few
articles including this one that might help illustrate how it can all play
together with an ORM:
[http://www.monkeyandcrow.com/blog/hierarchies_with_rails/](http://www.monkeyandcrow.com/blog/hierarchies_with_rails/)

------
sitkack
There is an excellent book on this subject, "Joe Celko's Trees and Hierarchies
in SQL for Smarties".

------
siddboots
In practice, I've found that the adjacency list works well enough in most
cases. One good data warehousing habit is to generate the closure table as a
materialised view of the original adjacency list table, so you effectively
have both interfaces available for the same data.

------
smoyer
Modified pre-ordered lists have been used since the '60s to describe the
hierarchy of the telephone system. It's expensive to insert or change a node
but very efficient when read. Since it was used for diagnostics and root-
cause-analysis, reading was far more common than writing.

