Hacker News new | past | comments | ask | show | jobs | submit login

I think this is a reason NoSQL became so widespread. If we stored every piece of data relationally, databases would have an insane number of tables, resulting in huge numbers of joins, and coupled with the difficulty in running migrations, would become unmaintainable.

Every data model is shoehorned into the relational model via normalization which involves a ton of judgement calls about how the data will be used at the time the schema is defined. And then this becomes extremely difficult to change.

It’s a big trade-off between schema flexibility and performance.

I’ve always felt we would be better off querying at a higher level (say the logical db schema) and then the physical db schema is managed for us behind the scenes. But then there is also a further level up at arguably the purest representation of data as triplets, which is why Datalog is called out so much.




> I’ve always felt we would be better off querying at a higher level (say the logical db schema) and then the physical db schema is managed for us behind the scenes

But that's *exactly* what an RDBMS does: it handles the low-level record data-structures on-disk and presents to you a relatively cushy set of tables that you can design how you like them.

...otherwise please clarify what you mean by "logical db schema" vs. "physical db schema"?

Your talk of there being a "physical" design harkens back to those myriad "4GL" database platforms but those are all completely obsolete (despite what Progress's sales people will tell you). Y'see, database application developers haven't needed to concern themselves with physical records on-disk since dBase stopped being cool around 1990 (and the poor sods still using 4GL systems through the 1990s and even into the 2000s failed to see the industry shifts towards open databases (hence the "Open" in ODBC) so I don't feel too sympathetic for their cause.

-----------

> I think this is a reason NoSQL became so widespread.

I attribute it to too many CS/SE grads who didn't take database-theory electives and so never got to properly grok SQL (never mind how SQL is horribly, horribly mis-designed!) but they got hired by cool startups but for 90% of their use-cases the notion of a database behaving like memcachd is very, very enticing.

You may have noticed that the hype around "traditional" NoSQL engines (like Redis, MongoDB, CouchDB, etc) started receding probably about 3-4 years ago (and PostgreSQL is cool again!) after all those companies that built their entire business around a NoSQL database suddenly find themselves being mature businesses that now need to hire normal business-people who need to get that business data into Excel, or an OLAP warehouse, or otherwise get real-time views of data in a tabular form - those are all things that will make upper-management decide to hire people to transition the company off NoSQL and back to RDBMS.

This isn't anything new: a couple of decades ago there was (and still is) a popular in-process key-value (i.e. NoSQL) database called Berkeley DB[1] that had plenty of use-cases and applications as a more flexible alternative to some stock dBase library - or as the years went by: Microsoft JET (aka Access MDB). Though in this case, the funny thing is that dBase is also "NoSQL" because application programmers have to query data by manually iterating over fixed-size records at known offsets in a one-file-per-table system - it's really primitive.

[1]Now also owned by Oracle, natch: https://en.wikipedia.org/wiki/Berkeley_DB


> ...otherwise please clarify what you mean by "logical db schema" vs. "physical db schema"?

Logical schema is an ER diagram, physical schema is how that is translated to a DB-specific DDL, not to be confused with the physical data storage layer.

Think about all the different ways you can model a list in SQL. E.g. jsonb, array, hstore, 1-M FKs, M-M join table, and all the different levels of normalization.

> I attribute it to too many CS/SE grads who didn't take database-theory electives

Completely agree! But it also makes you realize how the relational model is a poor fit for a lot of data models too - which again many are unaware of. It's not perfect, but its a nice sweet spot for now.

I think we are at the other end of another pendulum swing of databases though. But it's hard to innovate when the prerequisite for success of any new database remains as SQL compatibility.


I’ve got an idea: “BQL+BDL” (“Better Query Language” + “Better Data Definition Language”) which would basically be a superset of SQL, but better. Like TypeScript is to JavaScript. And implement it as a schema-aware translation layer between the user and RDBMS-specific SQL dialects.

(Yes, this reeks of https://xkcd.com/927/ - but the aim isn’t to abstract-away different RDBMS’s dialect’s features, but instead to massively improve QoL)

Ping me if you’re interested! My email address is “[my-first-name]@[my-first-name].me” (and my first name is “Dai”)


EdgeDB and EdgeQL are doing something in this direction. SQL with GraphQL for joining solves some problems. Built on top of PSQL.

I think PSQL can only take you so far though, I think we will move to something new eventually.

For most apps you want to watch a query for changes. SQL is a pain for that. If you can get inside the query tree though, you can do some caching and materialization to help. And I want to run my db in my browser too.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: