Pure curiosity: Do you know of any other OSS projects that went down this route?
Wordpress uses the EAV db pattern too (have a look at the wp_options and wp_meta tables).
Of course, Magento and Wordpress use it in fundamentally different ways - so this project I've got to build a combined search across a site with both Magento (EAV for products) and Wordpress (non-EAV for posts/pages) in use, is in that "it'll sit on the backburner until it becomes critical or somebody else solves it for me first" state…
The other three most common ways I've seen to store differently structured data in a consistent way are:
* Edge table
* Binary table
* Universal table
There are as many tables as there are types of data, + 1 table for the definition of all things stored.
The binary approach groups values of the same name into a table of that name. So if you have a property called "username" you'd have a table called "username", and this approach usually still includes the edge approach such that the values are stored in tables with ID:Value structure by type of value. Effectively partitioning the value table by name of the property being stored.
There are as many tables as there are types of data, + the count of unique property names.
The universal table approach uses a single table to store all data, effectively containing all of the edges in a graph. It's a conceptual full outer join of all binary tables. Imagine 100's of columns, and for each row the vast majority of those are NULLs.
The triplestore can be thought of as a condensed version (not type-safe) of the universal table. More space efficient, but usually at the cost of the datatype safety offered by the database.
You can combine these methods to store any type of semi-structured data in a traditional RDBMS and these approaches are used a lot if you know where to look for evidence of them.
SharePoint uses a version of the universal table. Oracle (for the ORA_XML storage type) uses a combination of the above under the hood to map XML into traditional database tables.
They all have pros and cons relating to query speed (for certain types of query), storage efficiency, indexing, types, etc. And as always you need to know why you're choosing something.
SharePoint also partitions their universal table dynamically. But they start with saying that each "SiteCollection" has it's own database/tablespace. As SiteCollections can contain a single site, this is basically equivalent to Reddit creating new tables per subreddit.
Elsewhere (up or down on this page) I saw that the SQLAlchemy author advises not to do this kind of thing. I simply say don't do if you don't know strongly why you should do it. Don't cargo-cult... go and read up on the problem space and only implement one of these solutions when you know the effects of doing so.
This is one of those times when analysis paralysis can be your friend and prevent you from building a mess you didn't need. If anyone starts with one of these designs and you haven't got a decade of experience to know why... you are probably doing it wrong.