Hacker News new | past | comments | ask | show | jobs | submit login

I think in the OP's case, they are staring at a dizzying number of tables for Drupal and wondering if the schema is "good."

It's easy to create a clean schema for a simple problem.

Drupal (and many other mature CMSes) are the result of a decade or more accumulation of features. Much like perl, "it's worse ... because people wanted it worse"[1].

1: https://groups.google.com/forum/m/#!msg/comp.lang.perl.misc/...

Ah, I didn't realize people were taking the linked schema as an example of something with a lot of tables. To me, the linked schema is by no means a bad design—nor a design with a particularly large number of tables!

The single problem this schema has, is nothing to do with the schema itself, but rather the way it's visualized (and possibly the way it's manipulated at the RDBMS level.) That problem is that these tables are not namespaced.

Clearly, there are multiple components/microservices here. There is one set of tables for a AAA system with generic object ACLs; another set for a generic object annotations system; another set for a half-baked JIRA-like "database in the database" to handle custom CMS object-types; another set for an object store; another for a custom reverse-indexing search-engine implementation; etc.

Each of these components has a sensible number of tables, and is in a practically-useful normalization form. You just can't see where each component starts and ends in this diagram, so it looks like one massive web of tables.

If you draw in the component boundaries, then this schema is quite elegant, both in concept and in practice. When working with the code of any given component, you'll only be dealing with a mental model consisting of the tables related to that component, and maybe with the foreign-key references from the "generic" components that can make assertions about anything (like the tags/annotations, or the ACLs)—but you can usually forget about those and just CASCADE updates/deletes down to them.

There's no feature-itis here. "Drupal" is just the name of a system containing seven or eight distinct services. Sort of like "Kubernetes" is.

The only "problem", if you want to call it that, is that because Drupal is one codebase and one process, it doesn't strictly need to take advantage of the isolation features RDBMSes provide to allow distinct applications/microservices to not get in one-another's way, e.g. SCHEMA objects. Unlike something built as a bunch of standalone processes, there's no force pushing Drupal to componentize (and therefore, naturally namespace) these tables. They'd have to make an explicit choice to do so.

Applications are open for YC Summer 2020

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact