
We Can Do Better Than SQL - kristianp
https://edgedb.com/blog/we-can-do-better-than-sql/
======
tqi
It's pretty arrogant to complain about the syntax being inconsistent across
versions and databases and then present your own weird offshoot, as if every
other version wasn't introduced for the exact same reason with the exact same
lofty delusions of grandeur...

SQL is messy because describing the underlying data relationships are messy.
The orthogonality example is a great illustration of this. What exactly should
the result be if there are multiple dept heads? Should the result rows be
duplicated? It's not clear how edgeQl would handle either (their edgQl
orthogonality examples were constructed to only have result per sub query),
but it seems like they would be kept together as sets. In that case, the
result set is no longer a table, it's a dataframe, which is a useful data
structure but is also not what relational databases do.

~~~
quickthrower2
I was hoping for something more left field myself. If SQL is based on tables,
what about a QL based on relations only. Columns of data that are related, the
"TABLE" implementation detail doesn't need to factor into it.

~~~
iddan
This is what we are trying to do with Cayley
[https://github.com/cayleygraph/cayley](https://github.com/cayleygraph/cayley)

~~~
woile
The website needs a link to the documentation, I just couldn't find it :/

------
bjo590
Am I the only full stack dev that likes SQL?

SQL is an incredibly expressive and flexible way to read, store, and update
data. It's ubiquitous, so the SQL skills I learned six jobs and three
industries ago are still relevant and useful to me today. Relational Databases
and SQL are heavy lifters that I often relay upon to build projects and get
things done.

~~~
mathgladiator
No, I'm with you and prefer SQL for many tasks.

SQL got a bad rap in many ways due to security issues, databases in general,
and "web-scale".

SQL as a language within other languages is a nightmare from a security
standpoint, and if language integrated query was more common across languages
earlier on then this wouldn't have been an issue.

Databases generally depend on normalization, but normalization comes with
interesting scaling problems and how do you replicate normalized schemas. Thus
denormalization became a thing, and then the emergence of NoSQL and document
stores started to infect everywhere. The JOIN was a killer too, and then the
discipline required to do sharding made it annoying to manage, so easier to
manage solutions became a thing.

I'm looking at databases in a different light these days with more
appreciation, but now the hot new thing is GraphQL makes things...
interesting. I don't view GraphQL as a server-side solution, but a client
solution to overcome the limits of HTTP/1.1. However GraphQL clients are
exceptionally complicated, and I'm not sure they are worth it. The only
problem is that to overcome them requires engineers "to know how to do
things", but that is a hostile stance. People want to go fast and make
progress, and GraphQL enables that.

~~~
wmichelin
GraphQL and SQL are not mutually exclusive at all.

GraphQL is, in no way, faster than any REST alternative in terms of
implementation speed. If anything, it is slower, as you need to be extremely
methodical with your API changes, as (same with REST I suppose) deprecating
fields / entities, for mobile clients specifically, is a PITA unless your
clients have really nicely built out forced upgrades.

What GraphQL _does_ give you, is type safety and extreme client flexibility .
It is a better solution than REST in almost every scenario, other than the
initial learning curve, which takes a couple weeks and then you know it
forever.

Would I recommend some startup write a GraphQL API for their MVP? No, just get
something working. Are you at a more medium-sized company looking to build out
your much more permanent API? Then yes, you should probably strongly consider
GraphQL.

~~~
Draiken
Too bad nobody thinks whether or not they need the flexibility in the first
place. MVPs with single clients using GraphQL for flexibility that is not
needed or used are common.

GraphQL is a hammer, and now every project is a nail.

Anecdotal, of course, but the first thing all FE developers I've ever worked
with do when they start/join a project is add/suggest GraphQL/Apollo.

Nobody considers how awful things look on the back-end when you need to cache
or make magic happen to avoid thousands of N+1 queries.

I believe unfortunately it has become the only way many front-end developers
learn to interact with any back-end, and now everyone's forced to use it
regardless of its drawbacks.

Same with React. Facebook managed to get free training for all of their future
hires. I hate the company but that was a genius move.

------
ckocagil
I'm not impressed for two reasons:

1\. Anyone striving to build a better SQL should make a comprehensive list of
common (but difficult!) database tasks for OLTP and OLAP workloads. This will
expose the weakness of their language. SQL has had 50 years and myriads of
improvements to cover all these common cases. This is not a fair fight, so
come prepared.

2\. It's not enough to be just "better than SQL" to replace it. SQL has such a
huge momentum that a new language needs to be absolutely better _and_ it
should have many features that SQL cannot possibly have. My nice-to-have list
would contain predictable performance, lock ordering, ownership relations (for
easy data cleanup), and a standard low level language which the query
optimizer would output.

~~~
GeneralMayhem
Not disagreeing with your point (if purist language nerds had their way over
practicality, we'd all be writing Haskell and Prolog) but most of your nice-
to-haves strike me as properties of the database engine, not SQL itself.

Predictable performance - this will always be not only implementation-
dependent but data-dependent as well. In order to know whether a join will be
efficient or not, you need to know things like relative sizes of the tables,
which is not necessarily a language problem. I have worked with SQL
implementations that had extensions to let the user annotate joins with
relative sizes on each side, but I don't think that's quite what you mean.

Lock ordering - again, good databases should have defined semantics (Postgres,
for instance, does take locks in order when using ORDER BY), but I'll grant
that this one could be stronger. That said, I think this is pretty niche. How
often are you doing large multi-row transactions where lock order is a serious
problem? If I have enough volume that deadlock is likely, I probably have
enough volume that I want to be breaking up the process into a sharded or two-
phase commit anyway.

Ownership relations - I think this is a DDL problem rather than a SQL problem.

Low-level language - I don't think you'll get a portable low-level language
here (at least not for any definition of "low level" that's much lower than
the SQL AST) because, again, the basics are implementation-dependent. What
kind of scan is the base atom of a query? Well, it depends - is your database
distributed? sharded? row-store-based? column-store-based? I do wish more open
source database drivers would let you play with the AST in memory (Postgres
has ways to print it out, but I don't think there's a good API). That would
tend to solve the most significant problem raised in the article
(composability) - plugging together SQL clauses automatically is hard, but
plugging together subtrees can be much easier.

~~~
AmericanChopper
> Predictable performance - this will always be not only implementation-
> dependent but data-dependent as well

This immediately jumped out at me from the parent comment. It would be
entirely possible to implement a query language where you specify a plan for
your query. But then you’d immediately lose the “better than SQL” competition,
because your complexity and maintainability problems would skyrocket.

I’ve had to deal with this problem as an Oracle DBA, and it’s a complete
nightmare. It starts with a statistics refresh ruining a couple of execution
plans, so you start specifying them manually with the plan manager. Then it
gets worse over time, because stats refreshes become a big risk and you don’t
want to do them anymore. Eventually you get to the point where you pretty much
only run verified plans. Then your verified plans slowly degrade overtime,
because the underlying cardinality of every table is constantly changing.
You’ve replaced the query optimiser with yourself, which is not only tedious
work, but it’s simply not possible to do the job as well as any mainstream DB
engine could.

~~~
CaptainZapp
I wish that I could upvote your comment more than once, because this rings so
true.

There certainly are (rare) situations, where you need to provide hints in one
form or another, but it's really a bloody nightmare to maintain and may
completely bork, when you - say - upgrade to a new version of the database
engine.

I work with relational databases since the early 90s and can give you a no-
bullshit money back guarantee that you (not you personally, obviously) are
_not_ smarter than the optimizer.

Usually there are weird data patterns involved if you absolutely must provide
hints. But basically:

 _Don 't do it!_

~~~
mnsc
I think one of the challenges with sql is that beginner developers can create
naive sql queries that "work" but are extremely complicated for the optimizer
to "get right". So in some cases (talking from own experience) the developer
can, with the use of hints, "be better" than the optimizer when the problem
all along was the overall structure of the query.

Edit: don't do it

~~~
AmericanChopper
The relational model can _usually_ save the day here, without a huge amount of
effort. SQL certainly has its share of anti-patterns and footguns. But most of
the awful SQL I’ve seen over my career hasn’t come from poor mastery of SQL,
it’s come from poorly normalized schemas. If you have a properly normalized
schema, then you can do a huge amount with very simple SQL. When it’s poorly
normalized, you end up with all sorts of strange and inefficient design
patterns in your SQL.

This could come across as me saying “well it’s easy if you do it right”, but
the thing is, normalizing a schema is incredibly simple. I would expect a
relatively inexperienced software engineer to be able to pick it up literally
just from reading the Wikipedia page. In my experience, the more common
underlying problem is that inexperienced engineers (even if they’re only
inexperienced in terms of SQL and RDBMS knowledge), don’t actually know what
normal forms are, or why they’re useful.

Data structures and concurrency control is just fundamentally useful computer
science, but for some reason it seems to be a topic a lot of people don’t pay
enough attention too. Maybe it’s just my personal pet peeve, but I’ve seen too
many projects start with “wow NoSQL is great”, and a few months later end up
with giant nested loops in their lookups, and some poorly built custom
implementation of MVCC in their business logic.

(NoSQL is great btw, just not for relational data)

~~~
Akronymus
>footguns

Ran into one recently. Where a table was joined either to one or the other
table, based on if a value was null in the first one.

This was fine, until we added a where clause to a, through multiple joins,
base table for both options. This tanked the performance >1000x.[1] If we just
returned the value it had basically no impact.

We tried solving it with using the result set as a base for a select where we
did the filtering. This also resulted in the slow performance. In the end I
solved it by wrapping the column in a function call, which solved it. And I
still don't know why.

My guess is that somehow without the function call, it optimizes it into one
query, which results in basically the original case, while a function forces
the evaluation of the subquery first.

[1]Sub 1sec to over 15 minutes

~~~
AmericanChopper
Imo, polymorphic associations are one of the key areas that the relational
model in general struggles. You can do them in most RDBMS, but they’re always
a bit janky. Even when you’re just modelling your schema, you really have to
think quite hard about it, and you’ll really struggle to preserve simplicity.

~~~
Akronymus
In this case it was a

left join sometable on sometable.someuid = isnull(someothertable.someuid,
somethirdtable.someuid)

I guess that is such an uncommon case that it tripped up the optimizer
completely.

Also: Thanks for writing "polymorphic associations". Not knowing that probably
is why I struggled to find any info on it.

Edit: Both tables were actually the same one, just retrieved via different
joins, so different data.[1]

[1]One was a company, the other was the company we need to send money to. This
is for when deal with a daughter company but pay the parent company directly,
for example.

~~~
AmericanChopper
> Thanks for writing "polymorphic associations". Not knowing that probably is
> why I struggled to find any info on it.

We might have had a similar experience with this. The first time I stumbled
across this problem though I was specifically trying to figure out “what is
the relational way to implement polymorphism”, so I pretty much lucked into
the a rather productive series of google searches.

~~~
Akronymus
It wasn't strictly polymorphism, but the term you wrote led me to a article[1]
where it mentioned "alternative parent". This alone instantly made the problem
more understandable for me.

[1][http://duhallowgreygeek.com/polymorphic-association-bad-
sql-...](http://duhallowgreygeek.com/polymorphic-association-bad-sql-smell/)

------
nealabq
QUEL (
[https://en.wikipedia.org/wiki/QUEL_query_languages](https://en.wikipedia.org/wiki/QUEL_query_languages)
), the original query-language for Ingres, was more orthogonal and consistent
than SQL. But IBM decided SQL was more business friendly. Who can argue with
that.

And before that there was ALPHA. From
[https://www.labouseur.com/courses/db/s2-Remembering-
Codd-2.p...](https://www.labouseur.com/courses/db/s2-Remembering-Codd-2.pdf) :

"Ted [Codd] also saw the potential of using predicate logic as a foundation
for a database language. He discussed this possibility briefly in his 1969 and
1970 papers, and then, using the predicate logic idea as a basis, went on to
describe in detail what was probably the very first relational language to be
defined, Data Sublanguage ALPHA, in “A Data Base Sublanguage Founded on the
Relational Calculus,” Proc. 1971 ACM SIGFIDET Workshop on Data Description,
Access and Control, San Diego, Calif. (November 1971). ALPHA as such was never
implemented, but it was extremely influential on certain other languages that
were, including in particular the Ingres language QUEL and (to a lesser
extent) SQL as well."

~~~
nealabq
Ted Codd designed the Relational Calculus as a clean relational-query
language. It looks mathematical (scary?) and a little like a set-
comprehension. But I think the big mistake is its use of non-ascii chars like
∃ ∈ ∀.

Here's an example from
[http://arwan.lecture.ub.ac.id/files/2013/10/4.-relationalcal...](http://arwan.lecture.ub.ac.id/files/2013/10/4.-relationalcalculus.pdf)
:

SQL:

    
    
      SELECT DISTINCT F.Name
      FROM FACULTY F
      WHERE NOT EXISTS
        (SELECT * FROM CLASS C
         WHERE F.Id=C.InstructorId AND C.Year=2002)
    

Relational Calculus:

    
    
      {F.Name | FACULTY( F) AND NOT
        (∃C ∈ CLASS( F.Id=C.InstructorId AND C.Year=2002))}

~~~
anyfoo
I only had a quick look, is it a computer-friendly offshoot of relational
algebra[1], the actual mathematical model for relational databases, as an
actual query language?

[1][https://en.wikipedia.org/wiki/Relational_algebra](https://en.wikipedia.org/wiki/Relational_algebra)

~~~
kd5bjo
Relational calculus is another mathematical model that is dual to relational
algebra: if you have one representation you can always get the other.
Relational algebra describes a fairly direct set of manipulations of database
rows that can be implemented efficiently. Relational calculus operations are
more abstract, but have useful identity transformations you can use to
optimize query plans.

So, when you ask a database system to perform a query, you ask it in
relational algebra terms because they’re easy to understand. It then
transforms your query into a relational calculus-like form to shuffle things
around, and then back to a different, but equivalent, algebraic form to
actually execute.

~~~
anyfoo
Thank you, that makes perfect sense.

------
tristanz
There's a lot of negativity here for understandable reasons given the success
of SQL empirically. But I'd encourage everybody to read the home page
[https://edgedb.com/](https://edgedb.com/). This project is not trying to
replace SQL as its primary goal, it's trying to build a data modeling and
query interface on top of Postgres that meshes well with modern applications
that have hierarchical data akin to what you'd model with GraphQL. Major kudos
for trying to rethink and improve upon things -- I'm sure it wouldn't be too
hard to stick an SQL backdoor in there for the laggards too ;).

~~~
loco5niner
> for the laggards too

Speaking of negativity ;)

------
gregjor
Would be nice, but bazillions of lines of SQL at the core of almost every
business system make this as likely as “We can do better than five fingers.”

The article does nicely illustrate many of the well-known shortcomings of SQL.
Chris Date and Hugh Darwen unsuccessfully tried to fix SQL with Tutorial D.
Never heard of it? Exactly.

~~~
paulhodge
It's 2020 not 1995. There are already lots of good, production-ready SQL
alternatives out there. There's full-fledged businesses that have zero lines
of SQL.

~~~
shrubble
I would like to hear of one such business. Can you give an example?

~~~
Scarbutt
Stripe's primary datastore is mongodb.

~~~
adrianN
But what does their BI tooling use?

------
exabrial
Honestly I think SQL is pretty easy. I love it. I can teach the basics to a
new person in minutes.

You know what we could do better at? Crappy explains from database engines.
Crappy rate limiting capabilities. Poor feedback on keep cache pipelines fed
during scans. Poor feedback on column size effects on reading stripes from
disk and size alignments between the filesystem and database.

~~~
simonbarker87
Your brain must work different to mine. SQL is by far the hardest tool I use.
I’ve used all the main languages from asm up to js for real work and nothing
breaks my brain like SQL.

I use it daily in a business that is heavy on SPs and while I get by and am
improving the jump from inner joins and selects to CTEs and the other wizardry
is massive.

I want to be better at SQL but so many problems I hit up against and think
“well that’s a 2 minute job in js/swift/php”

~~~
bjo590
> I want to be better at SQL but so many problems I hit up against and think
> “well that’s a 2 minute job in js/swift/php”

The thing about SQL is that it's the fastest way to read&write data in a
relational database. Maybe writing the code is faster in js/swift/php, but the
code will run faster in SQL. If you need to do something to 100M pieces of
data you can do a lot worse than SQL.

~~~
lmm
That's circular reasoning though. Why do you want your data to be in a
relational database? Particularly if you're not actually using its features (I
don't think I've ever seen a web application that actually got any value out
of database-level transactions, for example). A different kind of datastore
could offer you better performance _and_ easier querying.

~~~
Akronymus
I love SQL for making it possible to almost trivially enforce most business
rules. Such as: You can only use one of the in another table specified values
in this field.

~~~
lmm
I struggle to understand this viewpoint. In my experience business rules are
harder to express in SQL than in practically any first-class programming
language. "Is this value one of this list of values" \- whether that list is
hardcoded or dynamically obtained - is completely trivial.

(Of course if you apply some double standard where editing your "source code"
requires multiple approvals whereas changing your "database" may be done at
will in production then you'll find SQL logic easier to adjust, but that's a
reflection of your policies rather than any fundamental reality.)

~~~
PeterisP
You're arguing that business rules are _easier to express_ in other
programming language, but the point of the parent post (and SQL) is that
business rules are _easier to enforce_ in SQL.

It's trivial to write some code verifying "Is this value one of this list of
values", but writing such code not ensure that this constraint will actually
be met in 100% of your past and future data.

It's very difficult to guarantee that a business constraint expressed in your
client app is actually enforced - there can be different versions of your app
applying the constraint differently, there can be multiple apps accessing the
datastore, there could be manual interventions in various ways to the
datastore, the app could have code paths that may cause data insertion or
alterations without running that verification in certain conditions, etc; so
for all intents and purposes you _can 't_ really rely on that constraint.

~~~
lmm
> It's very difficult to guarantee that a business constraint expressed in
> your client app is actually enforced - there can be different versions of
> your app applying the constraint differently, there can be multiple apps
> accessing the datastore, there could be manual interventions in various ways
> to the datastore, the app could have code paths that may cause data
> insertion or alterations without running that verification in certain
> conditions, etc; so for all intents and purposes you can't really rely on
> that constraint.

If you have business logic in your datastore then that puts you in much the
same position though. If you have a trigger you might have data written before
it was introduced, or queries that ran with it disabled. If you're gradually
rolling out a new version then you might have some data that follows a
constraint and some that doesn't. And so on.

~~~
Akronymus
For triggers, you can run a procedure that executes it for all past data. Or
write a procedure that updates the data to a valid state.

When gradually rolling out, you can have adjusted stored procedures that deal
with the different versions, and turn off the old one when it is no longer in
use.

So, I fail to see the problems you mentioned.

~~~
lmm
And with good practices (mainly a decent type system that lets you distinguish
between checked and unchecked values) you'll have no problems with constraints
in the application layer either.

~~~
marcosdumay
As long as your application layer is a single homogeneous application running
the same code.

That is approximately never the case. Current widespread practices are moving
away from it, with multi-services, and old fashioned practices of mixing
customized and off the shelf tools basically forbid it.

~~~
lmm
Current practice is moving away from the idea of having your storage layer be
a single homogeneous datastore as well.

~~~
SahAssar
So where are you proposing the consensus be placed? A lot (I'd argue most)
applications need consensus for them to be stable and reliable in the long
term, so either you place it in the datastore or in the logic processing.
Where else would you place it?

~~~
lmm
I prefer to see the whole system as a succession of stream transformations
([https://www.confluent.io/blog/turning-the-database-inside-
ou...](https://www.confluent.io/blog/turning-the-database-inside-out-with-
apache-samza/)). If you view the sequence of input events as first-class and
the "current state of the world" as derived, then a lot of problems go away.
You need a datastore that can give you a consistent answer as to what order
events occurred in, but it's a lot easier to make appending to an append-only
list atomic than to make arbitrary state computations ACID.

For downstream derived computations, basically you either make the causal
relationship explicit, or accept that you have eventual consistency. The only
case where you can have inconsistency is where you have a "diamond" in your
computations (i.e. you compute B that's derived from A and C that's also
derived from A, and you compute D that's derived from B and C). So you figure
out the business implications and either accept it or eliminate the diamond
(by computing (A, B) from A and (B,C) from (A, B) instead of computing B and C
separately from A). You will also get inconsistency if you do some totally ad-
hoc query that's not part of your existing pipelines, but usually that's the
kind of reporting query that doesn't need to be 100% consistent; if you do
need a consistent version of that then the best approach is to take a regular
snapshot, which can be consistent.

Basically you have a lot more precise control, you have causal relationships
where you have explicit dependencies, so you have the level of consistency
that you need for all your operational stuff, but you don't have a globally
consistent realtime view of everything. It may take more work up front, but
IME the notion of that kind of global consistency is a lie in a distributed
world; even if you just have a basic webapp then you can't actually achieve
the kind of consistency that that model pretends you have, because what the
user is viewing in their browser (and then potentially making changes based
on) at any given time is not necessarily the same as what's in the central
database.

To get back to the question, this means that if inconsistency is due to an
actual bug, it's pretty easy to solve: fix the bug and then regenerate from
the original events. If you don't understand why data is inconsistent then you
can always look back to the source events and figure out what it should be.

------
grandinj
SQL was designed by COBOL / NATURAL people and was intended to be used by
"normal people". That was part of a larger movement at the time to make
programming more accessible. Unfortunately (a) that experiment mostly failed
(b) the resulting syntax does not compose well at all (c) the syntax does not
represent the underlying semantics and mathematical operations well at all.

The combined effect is a rather tortured language, as it has been extended
over the years.

However, replacing it is equally problematic because of the huge installed
base.

~~~
jacquesm
> SQL was designed by COBOL / NATURAL people and was intended to be used by
> "normal people".

That's not how I remember it. What I recall is that SEQUEL was the result of
looking at how databases and set theory could be connected.

[https://en.wikipedia.org/wiki/Edgar_F._Codd](https://en.wikipedia.org/wiki/Edgar_F._Codd)

That had nothing whatsoever to do with COBOL or NATURAL.

~~~
grandinj
There is a reason it was originally called (SEQUEL)

    
    
      Structured __English__ Query Language
    

see the original paper at

    
    
       https://web.archive.org/web/20070926212100/http://www.almaden.ibm.com/cs/people/chamberlin/sequel-1974.pdf

~~~
jacquesm
I didn't say anything about the name. But you made some pretty strong claims
about the provenance of SQL that have - as far as I know it - no bearing on
reality. The groups behind NATURAL and COBOL had nothing to do with SQL afaik.

------
seer
I for one would welcome a new alternative to sql. It might not be _this_
alternative, but why not try.

SQL is very hard to learn properly, with all of its gotchas and
inconsistencies. There are running jokes for noobs truncating their tables due
to forgetting a where clause. I’ve seen junior devs crying in tears and
throwing their mice just because they needed to debug / optimise a complex
query.

The mare existence of all the ORMs is a testament that people would opt to
write (or use) insanely complex pieces of software just so they don’t have to
deal with the lack of composition and ease of use.

All of those look to me as signs that something wrong with the core itself. We
could do better.

If we settled for good enough in all cases we wouldn’t have Go or Rust, React
or Postgres. In fact every software that we have is a product of someone
thinking “this is hard/wasteful/unexpressive/etc, lets write an alternative”,
SQL included.

This alternative looks quite promising. We can wait to see how they can handle
the edge cases, but the core looks a lot simpler to deal with than regular
SQL.

~~~
isatty
> SQL is very hard to learn properly, with all of its gotchas and
> inconsistencies.

I don't understand why though? I've been using SQL (Postgres for the most part
but with a smattering of MySQL thrown in) for around 8(?) years, which isn't
much in the grand scheme of things but I have not had anything that couldn't
be resolved. I've written small straightforward queries to over 200 loc and
never had a problem understanding it if you read it slowly/broke it down into
smaller queries.

In fact, ORMs have been a massive headache because I can think in SQL but not
in whatever the creator of the ORM was thinking in. Those giant queries that I
was talking about - there's no way to represent them in ORM form.

SQL works in a language agnostic way, you can explain analyze your SQL queries
and run it through whatever medium you prefer. Typical experience with an ORM
goes like so:

1\. Lets use an ORM because it'll be easier

2\. It's not actually easier and it's a complex mess now, but let's stick with
it anyway

3\. Figure out a way to log the queries that the ORM made up for you/printf it

4\. Run that through EXPLAIN ANALYZE

5\. Can't make the ORM do that, file a bug report that'll be buried

6\. Use native query while you wait

7\. Tech debt etc;

~~~
zvrba
> In fact, ORMs have been a massive headache because I can think in SQL but
> not in whatever the creator of the ORM was thinking in.

Yes, fully agree on ORMs. They DO have one nice feature though: simple CRUD
operations are way less verbose than constructing SQL statements.

What we lack is a better integration between the host language and the
databse. Constructing a prepared statement from a string, setting parameters,
executing, fetching rows from the result set and mapping back to fields... all
is a major, repetitive PITA.

And yes, I find it easy to think in SQL and often wonder WTH an ORM is going
to generate. Just recently I improved performance of an application by going
from ORM to SQL; first I reduced number of round-trips (ORM/efcore first wants
you to fetch an entity before you can update it), second, I batched updates
into a single session/transaction. Win! :) [Oh, and don't get me started
ranting about ORM and transactions.]

~~~
fogetti
Just to let you know, those things are not needed at all if you use a
different type of ORM, similar to ActiveRecord or Django ORM. It all depends
on what you choose. So probably the first mistake was that you didn't do the
homework of exploring ORMs maybe?

------
dman
One word of caution to those trying to improve on SQL - for the many users of
SQL, technology is a secondary aspect of their jobs. Hence any replacement
runs into the issue that many of its core users do not have the bandwidth to
spend significant effort on learning another querying language.

I will cheer everyone who tries to displace SQL, because I do think it needs
to be displaced but would also want to caution such people on the magnitude of
the task ahead of them.

~~~
bjo590
I often think that a great way to displace SQL would be to create a language
that compiles to SQL similar to how TypeScript compiles to Javascript.

~~~
_nhynes
You mean like an ORM?

~~~
chime
ORMs I've worked with usually expect you to write the complex queries/joins
yourself using a series of functions e.g.
orm.select(table1).join(table2).on(keys).where(cond).group(by).having(cond).order(by).
It's just SQL with extra steps.

An ORM that understands db schema, indices, and automatically decides where to
use outer join or subqueries based on what would be most efficient given the
specific schema would be fantastic. Basically an ORM that 95% of the cases
comes up with the most efficient way to look up the data.

~~~
scotty79
At minimum some ORMs let you reorder things which makes SQL 50% better.

Smarter ORMs do what you want. Linq is magical.

------
julochrobak
I am a big fan of a relational model. SQL itself is OK but far from great. So,
I wish you a lot of success!

I was part of a similar attempt - building a better "SQL" and relational DB.
This was roughly 8 years a go. You can have a look at our GitHub Projects or
look at some further links and may be you get inspired :)

* [http://bandilab.github.io/](http://bandilab.github.io/) \- introduction to the bandicoot project

* [https://www.infoq.com/presentations/Bandicoot/](https://www.infoq.com/presentations/Bandicoot/) \- presentation of the Bandicoot language on

* [https://github.com/ostap/comp](https://github.com/ostap/comp) \- another interesting attempt, a query language based on a list comprehension

~~~
mathgladiator
bandicoot looks interesting, and it feels spiritually related to a project I'm
working on. I'm designing a programming language for board games:
[http://www.adama-lang.org/](http://www.adama-lang.org/)

~~~
julochrobak
Very interesting. I am not very familiar with many board games, but i'll
definitely look at the language more!

------
jayd16
I'm not convinced by a lot of the comments here focusing on SQL as an
irreplaceable juggernaut. Currently I think Postgres is just fantastic but
recent DBs have shown that even SQL can bleed.

There's some neat stuff here and I hope the project well. I would love to see
object/hierarchical result set support grow. SQL ORMs feel so kludgy.

~~~
LoSboccacc
SQL orm feel kludgey because they are, entities on rdbms are normalized in
their fractional parts and interconnected by a multitude of relations so that
you have multiple entry point to obtain different views of the same data and
multiple ways to organise and prune results to find interconnection across
multiple entities

once you reduce all that to object traversal all your options are lost, your
only entry point is the entity and the only connections are direct paths

it's not the underlying query language, it's the flattening to object

~~~
willtim
> it's not the underlying query language,

SQL and it's implementations do not support nested relations. The parent is
suggesting that e.g. nested relations would enable better ORM solutions.

~~~
LoSboccacc
> nested relations

parent was right that hierarchical queries are not well supported, but that's
not the same as outright using nested relations; also if you're not going to
store data following normal forms for convenience no wonder sql is going to
have a hard time querying it.

at which point you're better off with an object storage and a searchable index
to the side anyway

------
barrkel
SQL has a lot of incidental, accidental complexity, no doubt.

Though when I reason about SQL, I think mostly in terms of functional
operators over streams of data: projection, filtering, flat-map, join,
fold/reduce. Obviously optimization means looking through streams and seeing
tables to find indexes etc., but once you get to the execution plan, you're
firmly in a concrete world of data flow and streams of tuples.

I didn't get on well with the example syntax in this write-up. It didn't mesh
better with my mental model of relational algebra either at the logical or
physical execution level - and the truth is you need a foot in both worlds to
write good scalable SQL today.

Aside from the complexities of dynamic construction, my biggest problem with
SQL is modal changes in query plans, owing to how declarative it is. It's a
two-edged sword: the smart planner is great, up until it's stupid. And it
usually turns stupid based on index statistics in production at random times.

------
Flimm
Yes, SQL has flaws, and the article forgot to mention one of them: you need to
build a string to build an SQL query, rather than a more structured object,
leading to flaws like SQL injection vulnerabilities, and difficulties
adjusting the query.

Let's say you're building a CRUD app with search and filtering capabilities.
Unless you are using an ORM (which has problems of its own), you might be
tempted to build the SQL query string like this:

    
    
      conditions = " AND ".join(filter_key + " = '" + filter_value + "'" for kilter_key, filter_value in filter.items())
      order_by = column_name + " DESC"
      query = "SELECT col1 FROM tablename WHERE " + conditions + " ORDER BY " + order_by
    
    

But this has multiple SQL injection vulnerabilities. Doing it correctly is not
just a matter of using SQL parameters, because column names need to be escaped
differently than string literals. Linters can't distinguish between correctly
escaped queries and incorrectly escape queries in non-trivial cases. Also, the
query will throw a syntax error if the number of filters is zero, since you
can't have an empty WHERE clause.

I don't think a new query language solves this problem.

~~~
maweki
This problem would arise with every DSL. And it has been solved by using
embedded DSLs.

jOOQ or SQLAlchemy look like SQL (and you don't even have to squint your eyes
very much) and solve the problems you mention.

~~~
Flimm
jOOQ and SQLAlchemy are libraries for Java and Python respectively. You can't
take your knowledge of those libraries and use them in a different programming
language. And you still need to know SQL well in order to do any debugging,
because the database is still receiving an SQL query string.

What I am wishing for is for the language to be more like JSON, something that
matches closely to commonly found structures in programming languanges (like
lists, objects, numbers, strings and booleans), and that the database can
support natively.

~~~
maweki
> jOOQ and SQLAlchemy are libraries

While it is true, they are specifically embedded DSLs, which is exactly what
you said you needed. Having a (query) language (a DSL) that is not composed by
string concatenation. Embedded DSL solve exactly that problem and they are
usually just translations from some DSL into concrete syntax in some
programming language, ideally supporting type correctness and preventing
syntax errors, etc..

And what you "wish for" is, again, just an embedded DSL, you only wish for you
constructors/functions to be native instead of some import/module.

------
cmrdporcupine
SQL's fundamental misstep was to try to present the relational model in a form
they thought was more comprehensible by a) using an "English-like" language
and b) renaming core relational concepts (relation -> table/view, tuple ->
column, etc.) It has only muddied the waters and led to all sorts of
misunderstandings and complaints ("but my data isn't tabular!")

The "English-like" syntax means that what is actually happening is obscured
(so many misunderstandings of what "selection" is, for example), and it means
that composing multiple operations gets very awkward and hard to read and in
fact many things that the relational algebra itself permits are not really
expressable.

And renaming core concepts means people means people get confused. They don't
understand what the "relation" in relational is, and think it's about
relationships. They think SQL is all about tables, when tables are just one
way of representing predicates. Etc. etc.

The relational model is a very elegant method for presenting facts about the
world and then the relational algebra is a nice functional programming style
system for slicing and dicing those facts into information in basically
arbitrary and recomposable ways.

SQL has obscured that. It's awful.

------
mr_toad
> The NoSQL movement was born, in part, out of the frustration with the
> perceived stagnation and inadequacy of SQL databases.

I would dispute this. The antecedents of NoSQL were the parallel programming
models of HPC. They weren’t specifically excluding SQL, and NoSQL was a term
that was invented after the fact.

~~~
chx
> The antecedents of NoSQL were the parallel programming models of HPC.

Can you elaborate on what you are thinking of? As a refresh, here's when and
how the (current usage at least) of NoSQL was introduced:
[https://subscription.packtpub.com/book/big_data_and_business...](https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781782175346/1/ch01lvl1sec09/introducing-
nosql-not-only-sql) in 2009.

> As Oskarsson had described, the meeting was about open source, distributed,
> non-relational databases, for anyone who had "… run into limitations with
> traditional relational databases…," with the aim of "… figuring out why
> these newfangled Dynamo clones and BigTables have become so popular lately."

I was using MongoDB at the time (we became one of their first paying customers
-- they didn't even want to take money for support at first!) and HPC wasn't
in the air. So please elaborate.

[http://2009.drupalcamp.at/sessions/chx-
session.html](http://2009.drupalcamp.at/sessions/chx-session.html) as far as I
can remember this was my first MongoDB talk. It's been a _long_ time ago.

~~~
mr_toad
I’m referring to Google’s 2004 MapReduce paper.

The functional style that MapReduce derives from had been used in parallel
computing models, e.g. the scatter/compute/gather model of MPI, and in turn
this was adopted by Hadoop, CouchDB, MongoDB and others.

------
cjf4
I'm not sold.

I acknowledge that these are real issues, and commend the authors for
attempting to address them. However, these issues rarely cause any real
friction for me - I generally find SQL among the most ergonomic languages I
use (regardless of dialect).

------
xupybd
Started reading believing the article would be utter nonsense but by the end
was convinced they might be onto something. My hat goes off to them if they
pull this off. It will take a lot to push SQL out of its stronghold.

------
pgt
Datalog is superior to SQL in every single way, except that very few people
know it:
[http://www.learndatalogtoday.org/](http://www.learndatalogtoday.org/)

------
jugg1es
SQL is definitely not perfect but it's supported everywhere and integrates
with everything, so you can augment it with other languages. That fact makes a
lot of the critiques sort of moot.

~~~
mapgrep
Yup. The fact that it has been around for 46 years and dominant in its problem
space for most of that time is pretty compelling too. In the 90s it was going
to be replaced by object databases, said the hype. In the aughts it was up
against Mongo and Couch and the like. The pretenders to throne keep coming and
going.

------
marmada
I see a lot of Stockholm syndrome in this thread / maybe low expectations.

Many people are saying SQL isn't that hard to learn but as someone who is new
to SQL, I disagree.

It takes a max of 15 minutes to understand basic JavaScript/Go/Python
primitives and write a program. SQL on the other hand seems much more complex.
I might as well be reading Haskell or Lisp. At least those languages are
consistent.

SQL does not feel like a language where I can learn a few primitives really
quickly and compose them together.

~~~
dsego
People who learn the nitty-gritty details and all the gotchas and tricks in
the book, appear to be experts and their job positions depend on it. It's all
a bit "ludditic", people confuse this familiarity with arcane commands and
proficiency with real expertise and deep knowledge. When you challenge that,
you challenge their existence. Knowing how null behaves (Null = null vs null
is null) is one of those shibboleths that differentiates an sql wizard from a
lowly commoner, like knowing all the nuances of prototype inheritance makes
one a true JS champion. These people charge a pretty penny to optimize badly
written sql queries and fix your joins. And you want to take that away from
them.

~~~
lurker458
At the other side of the fence we have JS/TS experts writing inscrutable one-
liners by chaining 15 lodash commands.

Having tried both SQL and the approach of using a programming language +
framework of the day, I prefer SQL for data manipulation. It's far easier to
troubleshoot, scale, hand-over or maintain in the long run.

------
duxup
As I read this I wondered:

"Has anyone fixed these problems elsewhere?"

Then:

>The NoSQL movement was born, in part, out of the frustration with the
perceived stagnation and inadequacy of SQL databases. Unfortunately, in the
pursuit of ditching SQL, the NoSQL approaches also abandoned the relational
model and other good parts of RDBMSes.

Yeah that's what I was thinking, they really don't fix the issues listed, just
have chosen to solve other problems, but not in a "going to fix SQL" kind of
way.

~~~
gen220
> The NoSQL movement was born, in part, out of the frustration with the
> perceived stagnation and inadequacy of SQL databases

I'm a little bit young, but isn't this a bit of a revisionist take, by the
author?

I thought that Amazon, Google, FB et al moved away from relational databases
because the sharding logic they needed to build on top of these databases was
approaching the complexity of a RDBMS. They didn't need strong consistency or
support for complex queries, on the kind of data they were storing at scale,
and so made compromises in those areas while engineering their purpose-built
alternatives (Dynamo, BigTable, Cassandra).

It's not that __SQL __didn 't work, but that the persistence layer was too
_strong_ and therefore too _slow_ for their very particular needs. It's like
comparing a minivan/suv (mysql/postgres) to a drag racer (nosql databases).
You don't want to drive your kids to soccer practice in a two-seater with no
airbags, and a 5* crash safety rating isn't as important to the pink-slippers
as horsepower and 0-60.

Or am I missing something?

~~~
jldugger
> I thought that Amazon, Google, FB et al moved away from relational databases
> because the sharding logic they needed to build

Note that these companies did not move away from relational DBs until long
after the "NoSQL is Web Scale" video. Yes, Google invented Big Table to help
power search (and others), but their revenue system, AdWords, didn't move off
MySQL until like 2015. And last I checked, Facebook is still a heavy user of
MySQL with sharding.

The original NoSQL software had two major value adds: you didn't have to learn
a new language, and were faster (typically via disabling fsync -- the DBA
equivalent of running with scissors). If you knew SQL or an ORM already, you
were really just hoping mongoDB was faster magically.

These days you can even just tune pgsql to support kv store formats:
[https://www.postgresql.org/docs/9.1/hstore.html](https://www.postgresql.org/docs/9.1/hstore.html).
Yes, you'll have to pay someone to know how to DBA pgsql, or pay AWS to pay
someone, but I'm comfortable paying that price.

~~~
uhoh-itsmaciek
>These days you can even just tune pgsql to support kv store formats

And you can turn off fsync! Though if you do, disable synchronous_commit
instead for most of the performance but none of the potential data corruption
(you're still risking data _loss_ , of course, just not corruption).

------
spo81rty
I love SQL. I HATE using Elasticsearch because it doesn't support SQL. If you
ever used Elasticsearch, you know exactly the nightmares I am talking about.

Products support SQL because everyone knows it and it works, regardless if it
isn't perfect. Trying to create a new version of SQL is ruining your
capability to have millions of trained users that already know how to use your
product.

~~~
oreoftw
Agree. Building elastic query AST is a PITA. Check out
[https://www.elastic.co/what-is/elasticsearch-
sql](https://www.elastic.co/what-is/elasticsearch-sql)

------
hota_mazi
The last technology that attempted this task was NoSQL, and we all know how
that ended.

First, people realized that schemas (just like static types) are
extraordinarily important for robust software.

Second, NoSQL lost to SQL over the long run in pretty much all dimensions:
query language, performance, scalability, concision, etc...

As a result, not only is NoSQL on the way out, but SQL databases have actually
become better at supporting NoSQL features than any NoSQL database.

SQL didn't just win, it absorbed its opponent and became even better as a
result. Never underestimate the versatility and adaptiveness of a technology.

~~~
aplummer
> The last technology that attempted this task was NoSQL, and we all know how
> that ended.

Do you mean billion dollar companies?

Don't get me wrong, wouldn't go near the popular NoSQL databases I've used in
the past again, but I sure wish I invented them.

------
cube2222
I do not really agree with the two cons the authors listed.

Null handling isn't intuitive in the beginning, but it makes it harder to let
missing data go unnoticed.

The expression / table thing can be solved like we solved it in OctoSQL[0],
and I think others have solved it in a similar way. Whenever you have more
than a single scalar value in expression position, just create a tuple, or
tuple of tuples out of it, which _does_ act like a single value.

[0]:
[https://github.com/cube2222/octosql](https://github.com/cube2222/octosql)

------
29athrowaway
If I could change SQL, I would flip the ordering in the syntax.

Instead of "SELECT ... FROM ... WHERE ..."

I would change it to: "FROM ... WHERE ... SELECT ..."

And you get the idea...

------
querylangs2020
Can you model this in EdgeQL?
[https://developer.mongodb.com/community/forums/t/is-this-
que...](https://developer.mongodb.com/community/forums/t/is-this-query-
possible-of-orders-in-april-2020-that-were-from-new-customers/4462/5)

Area of curiousity at the moment as I too agree that SQL is a poor fit, even
if the better DSL inputs eventually get reduced to SQL command text and
parameter arrays.

~~~
RedCrowbar
> Can you model this in EdgeQL?

Absolutely!

    
    
      WITH
        april := <datetime>'2020-04-01T00:00+00',
    
        NewCustomers := (
          SELECT Customer
          FILTER
            NOT EXISTS (
              SELECT .orders
              FILTER .date < april
            )
        ),
    
        AprilCustomers := (
          SELECT Customer
          FILTER
            datetime_truncate(.orders.date, 'months') = april
        ),
    
        NewAprilCustomers := (
          SELECT AprilCustomers
          FILTER AprilCustomers IN NewCustomers
        )
    
      SELECT
        (count(NewAprilCustomers) / count(AprilCustomers)) * 100;
    

This assumes the following schema:

    
    
      type Order_ {
         property date -> datetime;
      }
    
      type Customer {
         multi link orders -> Order_;
      }

~~~
querylangs2020
Thanks. I'm going to dig in a bit more. I've been sold by the above and the
homepage.. :)

------
systematical
There was a bullet in there regarding "poor system cohesion — SQL does not
integrate well enough with application languages and protocols." I was curious
to hear the authors thoughts on this. I feel ORMs have mostly solved this even
while introducing their own set of problems and learning curves. And when
you're done fighting your ORM on those bitchy <= 1% queries, well, you just
write SQL.

------
zomglings
Completely agree regarding the complexity of various flavors of SQL. The thing
is, this complexity came from legitimate need.

My team gets by with a very small subset of PostgreSQL functionality day to
day because most of the stuff we're doing with our database is just not that
complicated. Simple lookups, writes, joins when our applications interact with
the database. Simple joins, grouping, aggregation when we personally interact
with the database.

We are not confronted with the full complexity of PostgresQL every day. And
on, the flip side, the database itself gives us killer functionality in the
form of constraints and transactions. It offers a lot more, but this is all we
care about an overwhelming majority of the time.

I am curious how you see it. Is there a compelling reason for a team like mine
to leave their comfort zone to work with your new database? Does the new query
language really solve any problems for Joe Sixpack, developer?

------
yyyk
We could do a lot better than SQL. It's just that nearly every replacement
falls into one of two traps:

A) Not being able to express relational semantics.

In general, every programming language replacing its predecessor allowed to
express about the same semantics, even if it wasn't natural to the new
language.

One can write imperative Java/C++ even if the language doesn't like it, and
successful functional languages allow escape hatches for mutable objects.

The various NoSQL languages typically fail this hard.

B) Not offering enough of an improvement.

Minor improvement isn't worth the 'yet another query language' burden. EdgeQL
doesn't fall into A, but it may fall into B.

NULL is an annoyance, but not big enough to justify another language. Throwing
an exception instead is a very double-edged sword. The author needs to show
far more improvement to justify a new language (I'd have liked to see more
examples of composability for instance).

------
joliv
Previously
[https://news.ycombinator.com/item?id=19871051](https://news.ycombinator.com/item?id=19871051)
(428 comments)

~~~
jerrya
thanks!

------
the_cramer
I feel like SQL is in general "Misunderstood". How often did I see application
developers put the same principles and design patterns onto an sql database
like they do in js or c# or rust. They use functions to encapsulate
functionality, write sequential code, use cursors where not needed.

Then the language is blamed for not performing well or yielding different
results than expected.

Some points in this article are valid but I think the main issue is the
general notion sql would be like javascript or c#. It is not, it is very
different and needs a deep understanding, including the works of the
underlying dbms, to perform well.

I guess i'm just not a fan of throwing new languages and tools at problems we
identify, which seems to be a trend nowadays.

------
darksaints
Overall I like the ideas behind it and the problems with SQL that it solves,
at least on a superficially observational level. It could use some real brain
bending examples for those of us that deal with difficult data sets.

The tasks needed to actually take over where SQL has left off seem absolutely
monumental though. I can't help but think it will never get there, just like
every other attempted query language out there.

In my opinion, there are two things that can make SQL 100% better, which
wouldn't be a new language, but rather an update to the language standard:

1) algebraic data types, allowing us to get rid of terrible ternary null logic
and more closely model real world data domains.

2) a really well thought out date/time API, along the lines of JSR310.

------
norswap
The problem I see with this is that the "What's wrong with SQL" is an
excellent argument against SQL the _language_ , but not SQL the _engine_. The
optimal solution to that problem seem to be a syntactic skin over an existing
SQL engine.

------
KingOfCoders
No.

Having worked with MongoDB for 5+ years now back to PostgreSQL, I do like SQL
so much more than a custom query language. I would wish JSON would be better
integrated in SQL, it kind of feels like an addon not the core. Otherwise SQL
is a fine language.

Also I can leverage 20y of SQL.

~~~
kochandy
EdgeDB is built on postgresql and has excellent JSON integration.

[https://edgedb.com/docs/datamodel/scalars/json#type::std::js...](https://edgedb.com/docs/datamodel/scalars/json#type::std::json)

~~~
KingOfCoders
I wish instead of

SELECT to_json('{"hello": "world"}');

it would be

SELECT {"hello": "world"};

------
talaketu
> In EdgeQL every value is a set

This is trouble for the example for calculating the average number of reviews
across movies:

    
    
       SELECT math::mean(
           Movie {
              description,
              number_of_reviews := count(.reviews)
          }.number_of_reviews
       );
    

Never mind, they are not sets:

> Strictly speaking, EdgeQL sets are multisets, as they do not require the
> elements to be unique.

The relational model is firmly based on the idea of a relation as a "set of
tuples", and a major criticism of SQL has been that it views data as an
ordered sequence of tuples.

So I'm skeptical of the claim that EdgeQL is really based on the relational
model.

(Not clear whether multisets are ordered - wondering about window functions
etc...)

~~~
RedCrowbar
Both EdgeQL and SQL disregard the RM proscription about duplicate tuples for
practical reasons:

1\. Elimination of duplicates from every projection is prohibitively
expensive.

2\. Sometimes you actually _want_ duplicates to show up without injecting a
synthetic key into every projection.

3\. There's DISTINCT.

------
jerrya
I can see the improvement of their EdgeQL over SQL...

But (as I haven't read of their blog posts) I am a bit more reluctant about
the whole thing when they describe it as an ORM.

Can we leave the ORM and take the query language and implement this as a
Postgres extension?

~~~
bjt
Exactly my question. If improving on SQL also means that I have to throw out
all the maturity of Postgres, it's very unlikely to happen.

But if you can define a new query language that can be implemented by existing
relational DBs, you might actually have a shot at displacing SQL.

------
Wandfarbe
I'm not doing SQL every day but i have rarely any issues with it.

Do we really need to do better than SQL? I don't think so and i also don't
think that the chosen new syntax is better.

At the end of the day, most critical is not the language but understanding how
it works to optimize indezes etc. If you are only able to write simple SQL
because you are not good in SQL/Databases, you will not optimize your Database
independently from the language.

If you are good in SQL/Databases, you (or at least i) do not care about syntax
details; You just look it up, and get acquainted to your specific underlying
Database.

~~~
dsego
I'm not doing C every day but i have rarely any issues with it.

Do we really need to do better than C? I don't think so and i also don't think
that the chosen new syntax is better.

At the end of the day, most critical is not the language but understanding how
it works to optimize assembly etc. If you are only able to write simple C
because you are not good in C/algorithms, you will not optimize your
algorithms independently from the language.

If you are good in C/algorithms, you (or at least i) do not care about syntax
details; You just look it up, and get acquainted to your specific underlying
microarchitecture.

~~~
Wandfarbe
I do see my comment in context of using SQL for your Queries while your main
focus as a Developer (like for java etc.) is not writing SQLs as your main
priority.

Also it does state 'we can do better than sql' and i do have a certain amount
of practical experience to state my personal opinion that i do not think that
their approach is actually better than sql.

They did show quite avg examples; Examples which are leading me to assume
certain points like where they would like to replace sql.

------
Nihilartikel
I've grown pretty fond of the way Spark SQL queries can be represented with
DataFrame operations. There is a more or less 1:1 relationship with SQL,
except the commands can be programatically generated and composed. It sure
beats stitching together a SQL string when you have a bunch of query clauses
that might be optional, or need a generalized way to take 30 result columns,
and rename them with a prefix or something.

e.g. result = dframe.select(*[f.col(colname).alias(f"{colname}_old") for
colname in dframe.columns]).join(other_df, 'joincolumn', type='outer')

and so forth.

~~~
Icathian
I used to use SAS macros to conditionally generate monstrous SQL queries all
the time. It was a bit hacky, but man I could make a thousand lines of SAS do
just about whatever I wanted however I wanted. It really feels like a powerful
way to tackle messy real-life business logic.

------
gorgoiler
I often find SQL is extremely good for explaining what you are trying to do.

It is of course a declarative language, but more than that it does what a good
language should do: explain in both directions.

Languages need to tell the machine what to do, and to tell the person reading
the code what it was the original author was _supposed_ to be doing. Many bugs
happen when the two don’t match up, and the maintainer is often not the person
who wrote the original code.

Well written and handwritten SQL is some of the least commented code I’ve seen
_because it doesn’t need comments — it is self explanatory_.

~~~
disgruntledphd2
Mostly, unless the author was enamoured of subqueries, or believes that the
only acceptable name for a CTE is cte1, 2 etc.

But yeah, as long as you know what the underlying tables look like and what
they enforce, then SQL is really easy to maintain and understand.

------
closed
I feel like a lot of these concerns are resolved in tools like R's dplyr. You
use mostly the same R code, whether your data is a data.frame, or living in a
SQL source. dplyr generates the SQL query for you.

By reading the queries it generates it's quick to pick up how the SQL works!
Another big advantage is that you can always pull the data into R and have a
ton of general purpose tools available.

dplyr is aimed at data analysis though, so may be other use-cases for edge db?

------
thingsilearned
I've had similar thoughts to the title, but my ideas were more around
improving the ease of use/writing.

1\. A cleaner universal more natural syntax for analytics: I love writing
python as it is to me such a cleaner syntax than C or Java. We could do the
same for SQL and make something that feels more natural. Turning a common
query like

> SELECT count(*), TO_CHAR(created_at, 'YYYY-MM-DD') FROM Accounts GROUP BY
> TO_CHAR(created_at, 'YYYY-MM-DD') ORDER BY TO_CHAR(created_at, 'YYYY-MM-
> DD');

into something much more natural like

> count by Day(Accounts.created_at)

2\. A Visual SQL: for analytics it's so much faster to query and explore
visually. Building queries visually means you don't make common typo or syntax
or structure errors, joins happen smoothly, you can browse the data as you
build, you don't need to google for syntax (what's that date function again?),
and it works across dialects and databases. We've built and launched this a
few months ago at Chartio [https://chartio.com/blog/why-we-made-sql-visual-
and-how-we-f...](https://chartio.com/blog/why-we-made-sql-visual-and-how-we-
finally-did-it/)

------
moron4hire
I've always wondered if a newer language could be designed with ANSI SQL as a
transpilation target, or each of the vendor SQLs as targets. Optimization of
queries would be a huge problem, but it always seemed like the only way it
would be possible to break out of the SQL hegemony, i.e. first transpile, then
start developing native support in the open source databases, then pressure
the proprietary databases to adopt native support.

~~~
throw_m239339
That's what frameworks like Hibernate or Doctrine do, they have their own
object oriented query language that compiles to SQL.

~~~
moron4hire
Well, that's an object oriented design, and it's not really a full language in
that it has its own syntax. I'm talking about a relational language with its
own syntax.

~~~
throw_m239339
> and it's not really a full language

Both DQL and HQL are complete query languages.

------
romanovcode
Man, I'm so happy that DB hype is over and I can work with Postgres (which is
~35 years old) without anyone thinking it's the wrong choice.

------
fho
I might be ignorant here ... but why not have disc backed datastructures and
have some sort of ECS style interface that is like "normal" programming?

In pseudocode:

    
    
        newtype id = int
        names = dict<id,string>()
        balances = dict<id,int>()
        credit_scores = dict<id,float>()
        
        function broke_customers():
          return balances.filter(balance => balance < 0).keys()
        
        function exploitable_customers():
          bs = balances.filter(balance => balance < 0)
          cs = credit_score.filter(score => score > 100)
          return bs.intersect(cs).keys()
    

In the end, most queries are "just" set theory ... and having a very thin disc
io layer allows to use the host language to process queried data on the fly.

It's very basic ... and does not address performant views, clustering,
migration, etc ... but it's simple ... and does work well as demonstrated by
the ECS systems in game development (which are an application of the concept).

(Sidenote: that's some messed up pseudocode ... I've been working with C#,
Python and Haskell lately :-) )

------
FlyingSnake
The author seems to have overlooked the lindy effect behind SQL. SQL, like
other ubiquitous technologies like Javascript and Excel hits the sweet spot of
“satisficing” the user needs. Few things have the longevity of SQL in
programming, and the fact that it’s battle-tested will always be a significant
advantage over any new player.

------
slifin
I agree with the criticism of SQL here but to make yet another SQL but not SQL
is not the way forward imo

I hope the writer reads
[http://www.learndatalogtoday.org/](http://www.learndatalogtoday.org/)

In Clojure there are multiple databases that you can query by API, SQL and
datalog

~~~
millerm
If by datalog with Clojure, you mean Cognitech's Datomic, yeah it's pretty
friggin' cool. Really really liked it (well, the concept)... Until we started
working out licensing and realized we were going down a rabbit hole of unclear
future costs. They upped the cost on us in the middle of working on a project,
their tooling was awful, they deprecated the REST API (which was a selling
point for us they pushed) halfway through our efforts. We lost a lot of time,
money (we paid for consulting work which was useless), and our product was
delayed significantly. I know Stu will probably read this, but it's just the
way it is. We also threw away Clojure then and never looked back.

Doh, that's not what this discussion was about. It's an old wound that won't
heal. :-)

------
andrewstuart
I've come to love SQL databases, and even writing raw SQL.

All the modern SQL databases are unbelievably powerful.

------
S_A_P
I sometimes don’t love sql especially when debugging a huge script of dynamic
sql that was only partially scripted to generate the actual sql that is
executed and there are between 1500 and 5000 lines to sort through. But you
get used to it.

------
meowface
I think they make good points. This was written May 2019: how have they fared
since then?

~~~
1st1
We're approaching the 1.0 release; latest blog post:
[https://edgedb.com/blog/edgedb-1-0-alpha-4-barnard-s-
star](https://edgedb.com/blog/edgedb-1-0-alpha-4-barnard-s-star)

------
mathh
>>A language with good orthogonality is smaller, more consistent, and is
easier to learn due to there being few exceptions from the overall set of
rules (wikipedia btw)

table = (SELECT column | scalar expression FROM graph | table WHERE ..GROUP BY
... HAVING... ORDER BY...)

So in SQL, each scope is a table and that is the main primitive.

More metrics would be needed to criticize SQL orthogonality, instead of
providing only one example of subqueries as scalar expressions, when they more
generally produce tables.

Actually, SQL use the same query syntax for scalars and for tables and that
could be seen as good orthogonality.

------
hnarn
Very few people truly understand databases, and therefore very few people
truly understand SQL. I would be suspicious of any “SQL replacement” that
didn’t come from someone with many, many years in the field.

~~~
dsego
This comment is such an empty statement. The one plus one equals two kind. Are
you implying something? Do you know if the authors have many or few years in
the field? What does it mean to truly understand databases or SQL? To point
out the vacuousness, let me rephrase this statement to come up with an equally
profound message of my own:

Very few people truly understand CMS-es, and therefore very few people truly
understand Wordpress. I would be suspicious of any “Wordpress replacement”
that didn’t come from someone with many, many years in the field.

------
derryrover
Good idea actually! One day one of these efforts will succeed.

As a front-end programmer for 7 years I feel I have a fine understanding about
how a relational database and its queries can support my usecase. I understand
the basics well enough to advice the backenders. Anyway, SQL or any Object
oriented abstraction on top of it gives me migraine.

Let the critics criticize. Most people mistake pragmatism (SQL) for sound
solutions anyway. I do feel there is also a need for graphical editors. Yet it
is much better to build a graphical editor that compiles to something with
comprehensible syntax.

Good luck

------
haolez
I love SQL, but I liked the analysis in this article. However, the EdgeQL
language that was proposed looked horrible to me. It looks more like a general
purpose programming language than a query language.

~~~
Ziggy_Zaggy
+1 for SQL.

I never understood the need to rebuild a SQL-like solution bc SQL seems like
the right answer already?

Between inner joins and SPs, what else could you possibly need for data?

------
andy_ppp
I mean sure, probably possible, but I don’t really have a problem with SQL.
I’m pretty sure the query dialect the EdgeDB guys are talking about will die
with the company...

------
stillbourne
I feel like he is using the work Orthogonal incorrectly, I once had to
describe a complex relationship where there was a hierarchy of objects however
instead of being a strictly up down relationship there was an up, down and
right relationship. From the little chemistry I've learned I was familiar with
phenol and its bonds or para, ortho, and meta. I defined the up object as
para, the right object as ortho and the descending object as meta.

------
haspok
I think it's a mistake to design a language in 2020 with C-style curly bracket
syntax.

It just looks like line noise compared to SQL, or whitespace significant
languages.

------
linuxhansl
SQL is also an unfortunate mix of relational calculus and relational algebra.
The two can be shown to be equivalent.

While in calculus you declaratively describe the set of data you want and let
the system figure out how to get it, in the algebra you describe _how_ to
construct the set you want.

With that in mind SELECT ... WHERE ... is calculus, UNION and JOIN, etc. are
algebra.

I like SQL, though.

------
oxfordmale
EdgeSQL is no better than the SQL out there, and is arguably much worse as it
introduces yet another version of SQL. However, in the end EdgeSQL will only
succeed if it has a better query optimiser than Postgres, SQL Server and
Oracle. That requires a significant amount of investment and somehow I can't
see this happening.

~~~
viraptor
> However, in the end EdgeSQL will only succeed if it has a better query
> optimiser than Postgres, SQL Server and Oracle

EdgeQL is the query language, EdgeDB is the engine. The latter does query
optimisation. In theory EdgeQL could become popular with another engine.

------
kabacha
I'm a senior dev that up until recently managed to avoid having SQL in my
knowledge stack as we've always used no-sql databases.

I knew the basics but I took a weekend to catch up on some more advance use
cases and I can really resonate with this article. Unsurprisingly I came down
to a conclusion that SQL is just a bad language, no matter how you look at it.
It throws away every code flow standard in favor of their own nonsensical
flavor. Where normal synchronous programs go top-to-bottom SQL is a complete
spaghetti of flow and logic.

Just take a look at the most basic syntax: `SELECT person.name FROM person`
The variable is defined at the end of the program which is just absolutely
silly, what if the program is 100 lines long; do I need to start reading from
the bottom? SQL must be the reason mouse scroll wheels were invented.

As an alternative take a look at view based systems: `for person in people:
yield person.name` — isn't it infinitely more understandable and readable?

Unfortunately it seems SQL is here to stay as most people would rather work
with this mess rather than invest some time to adopt something better.

------
namelosw
Speaking of SQL alternatives, Datalog seems to be another language that
leverages relational model. I love how straightforward and expressive Datalog
is.

I wonder, why Datalog is not very popular for databases as a query language?
Is it because of performance optimization? Could anyone provide some insights?

------
gigatexal
Fads will come and go but SQL has staying power. It will be around for another
50 years me thinks.

------
roudaki
Stupid question, slightly off-topic but why every web app needs to be ready to
scale and serve a billion users? I don't mean this in the pessimistic way of
"be real you are not next Facebook". My question is what do you think all that
money is for, one you are asking for from VCs? Facebook was not ready for a
million users in the beginning. You need old stable and safe tech that will
not crash your demo or provide horrible user experience to important first-
generation users. Every time alpha breaks catastrophically is because somebody
ambitiously tried something not well documented or understood. People
understand MySQL. why would you add extra time and cost to your original first
budget? to prove what? I understand it makes job less "job" and more fun but
is it truly worth experimenting while you are fighting for survival? If I put
all the dead startups in my cv it would be 50 pages long

~~~
thanpolas
It doesn't. Who claimed that?

------
soobrosa
Still [https://erikbern.com/2018/08/30/i-dont-want-to-learn-your-
ga...](https://erikbern.com/2018/08/30/i-dont-want-to-learn-your-garbage-
query-language.html)

------
irjustin
Seems fitting, XKCD - Standards:
[https://xkcd.com/927/](https://xkcd.com/927/)

The post has weird self faults in its complaints - lack of consistency, and
poor system cohesion.

Lack of consistency isn't with the SQL standard, it's with the implementation
(this is recognize this later in the post too). Like browsers and the HTML
spec - everyone implements the standard SLIGHTLY but non-trivially
differently.

poor system cohesion - Being able to integrate/inter-op with every other
language literally means poor system cohesion. They're mutually exclusive
ideas. This is is why ORMs exist, so there is strong cohesion with a given
language.

I'm sad because there's so much work put in here by extremely smart people,
but this is a tool looking for a problem. SQL has its shortcomings for sure,
but simply replacing it with a slightly cleaner language is not the answer.
The benefits do not outweigh the huge amounts of drawbacks that would be
required to adopt something like this.

I don't know much about EdgeDB and hopefully there's a lot more benefit I'm
not aware of, but purely on the post itself, too many drawbacks.

------
mping
Judging by the comments here it seems lots of people suffer from Stockholm
syndrome - me included. I'm so used to SQL that it's hard to imagine something
better.

I wonder if we started from scratch today would we end up with something like
SQL.

------
apta
Good post.

> Swift, Rust, Kotlin, Go, just to name a few, are great examples in the
> advancement of engineer ergonomics and productivity.

golang is definitely not an advancement in engineering ergonomics. It can't be
grouped with the other mentioned languages.

------
reportgunner
He didn't even show an example of a join in his EdgySQL.

This is a marketing article.

------
okaleniuk
Ergonomics of a language I don't know is always worse than that of I do know.

As long as there's nothing valuable in the technology itself why would I
bother changing things?

~~~
throwaway_pdp09
> As long as there's nothing valuable in the technology itself

You're assuming there's nothing valuable. I'd say there is. Whether that's
enough to displace SQL I really doubt.

~~~
okaleniuk
Maybe there is. I didn't try it. But it's written rather explicitly "OK, so we
have highlighted the shortcomings of SQL. Why do they matter? It’s all about
ergonomics." That's what drove me off.

~~~
throwaway_pdp09
> "...It’s all about ergonomics." That's what drove me off.

I don't understand. Their critique seems eminently reasonable, what else
should it have been about that would have not driven you away? (thanks for the
answer, upvoted, it lets me try to understand your position).

~~~
okaleniuk
For instance, we choose Riak+LevelDB some 8 years ago as our data store. It
doesn't support SQL but it's fast and easily scalable. And free. I'm not sure
if we could achieve the same speed and fault tolerance with any SQL solution
available at that point. So the deal is, you give up your time invested in
learning SQL and get some advantages in return. This is reasonable.

But if it's only about language, no additional perks, then it just isn't worth
investing your time.

If continue the parallel with programming languages, there are tons of
C-killers nobody knows about. I think, roughly every third programmer in the
world tried to write more ergonomical C at some point in their live. And there
is Rust. Which is not about ergonomics really but about memory safety combined
with comparable performance. No more buffer overflows, no more Heartbleed.
People see what they are trading their time on Earth for and the language is
getting traction because of that.

This is, of course, only my personal opinion but languages are just thin
interfaces over technologies.

~~~
throwaway_pdp09
With respect, I think you're missing a lot. Sit with an expert SQL guy and
you'll learn a great deal. Maybe not enough to change your mind but certainly
enough to realise you're blocking out a really valuable technology.

> but languages are just thin interfaces over technologies

No. Not at all. SQL is derived from a mathematical basis. The principle came
first, then the language and tech together developed to fulfil it.

~~~
okaleniuk
But that doesn't contradict my point. The mathematical basis matters, sure,
but the language on top of it (exactly because it derives from the math
anyway) is mostly irrelevant.

I don't really care if I have to write SELECT or \forall or σ as long as they
all have the same meaning. But I'd prefer to use SELECT because I already
spent time learning how to work this way.

SQL as a language is fine. There is no real need to reinvent SQL unless you're
going to reinvent the math behind it too.

~~~
throwaway_pdp09
2 points then;

1\. Do you know SQL currently? Because your previous comments implied you
don't

2\. SQL is a _bad_ implementation of the underlying relational model. And it's
just bad in other ways. For one, it's bloody wordy:

    
    
      select * from people
    

but as SQL is an expression language, why not?

    
    
      people
    

From an answer of mine a few days ago (slightly modified here) to pick out
differences:

    
    
      (
      select *
      from t1
      except
      select *  from t2
      )
      union 
      (
      select *
      from t2
      except
      select *
      from t1
      )
    

I'd prefer to write that crap as

(t1 \ t2) + (t2 \ t1)

or similar. If you write a lot of SQL, that matters.

SQL aggregates don't nest easily. Transactions are screwed up
([https://news.ycombinator.com/item?id=23569513](https://news.ycombinator.com/item?id=23569513)).
SQL is poorly composable
([https://news.ycombinator.com/item?id=23550420#23561301](https://news.ycombinator.com/item?id=23550420#23561301)).
For better or worse it has nulls therefore tristate logic. This causes errors!
And more.

BTW. transactions don't work in the way you might think, atomically. Which is
why MS added [https://docs.microsoft.com/en-us/sql/t-sql/statements/set-
xa...](https://docs.microsoft.com/en-us/sql/t-sql/statements/set-xact-abort-
transact-sql?view=sql-server-ver15) and I'm sure Pgres has similar. From the
link:

"When SET XACT_ABORT is OFF, in some cases only the Transact-SQL statement
that raised the error is rolled back and the transaction continues processing"

That peculiarity is from the standard BTW.

SQL is problematic. It could have been much better.

------
tandav
In my experience Apache Spark and PySpark is the most convenient way to make
complex queries / to select, transform data

------
data4lyfe
10x better is me telling my database what metrics and values that I want. No-
code and probably not another language.

------
oarabbus_
Oh man, have I ever heard this one before. No, we can't do better than SQL.
SQL is powerful, and beautiful.

------
StreamBright
SQL would be the last thing that I give up for something else. It does a
pretty good job for what it is used.

------
ianamartin
I love critiques of SQL about implementations of NULL. "NULL is so special
that it's not equal to anything, not even itself!"

Like, duh. WTF should NULL be equal to?

Anytime I see people making this kind of argument about "doing better than
SQL" I can immediately tell they are pretty much fucked in the head.

Good luck, edgedb peeps. You haven't got a clue.

~~~
MaxBarraclough
> WTF should NULL be equal to?

In programming languages, NULL tends to be equal to itself. Why couldn't it be
the same way in SQL?

~~~
ianamartin
No. That is not even close to being true. You're confusing None with NULL,
like everyone else. Only a very few programming languages make this error.

NULL is undefined. It can't be equal or unequal to anything, including itself,
for reasons that should be obvious.

~~~
MaxBarraclough
> That is not even close to being true.

In C, _NULL_ is equal to itself. In C++, _nullptr_ is equal to itself. In
Java, _null_ is equal to itself. Same in C#. In JavaScript, _null_ is equal to
itself, and so is _undefined_.

> You're confusing None with NULL, like everyone else. Only a very few
> programming languages make this error.

Not so, as I've just shown.

SQL takes the philosophy that NULL isn't a value, but a marker for the absence
of a value, and gives it special treatment so that it is not treated as equal
to itself. Most programming languages do not take this approach, they instead
treat null as a special value, special in that dereferencing it is disallowed,
but it's still subject to the usual comparison rules (i.e. it's equal to
itself).

Your contrasting of _None_ again _NULL_ isn't meaningful. They're just words.
The semantics depend on the language.

> NULL is undefined

Depends on the language. In C it's defined as 0, roughly speaking. (Curiously
the bit-pattern used to represent NULL is not required to be zero. [0])

As a curious aside, in C, the special float value _NaN_ is not equal to
itself.

[0]
[https://stackoverflow.com/a/9894047/](https://stackoverflow.com/a/9894047/)

------
mrpeker
If SQL and GraphQL had a baby...

------
nickthemagicman
The claim can be made that: We can do better than ANYthing.

The counter is: Why hasn't anyone yet?

------
d--b
.Net Linq is a developer friendly alternative that has a way better syntax.

------
xiaodai
How do you even do Select col from table group by grp in EdgeQL?

------
SeanLuke
Wouldn't it be easier to just propose Datalog?

------
drej
Narrator: we cannot

------
dzonga
only thing, I would use other than SQL for querying would be the way pandas
handles queries or datalog. all this other stuff is doa

------
pmarreck
People have tried for decades. Go nuts.

------
xiaodai
SQL is the COBOL of relational data

------
kchoudhu
I do not know what the standard data query language of 2120 will look like,
but I do know it is going to be called SQL.

------
otterley
(2019)

------
mulmen
No, you can’t.

SQL is easy. Data is hard.

------
trapatsas
No, you can’t. Another competing standard against SQL will die trying to be
SQL.

------
monkeydata
sql is the language of the gods

------
hans_castorp
> There are many more cases like these, and there is no consistency in a
> single SQL implementation, let alone across all implementations

So you decided to create yet another incompatible "SQL"?

[https://xkcd.com/927/](https://xkcd.com/927/)

And quoting MySQL's broken handling of a division by zero as a reference.
Seriously?

------
dilandau
>sql alternative #1427904 that absolutely nobody will ever use

Love it or hate it, sql is a standard and there is a ton of knowledge
(stackoverflow answers, books, tutorials) and tooling (query builders, orms).
This is either hopelessly naive, or hopelessly arrogant.

------
tus88
I shall name it...NoSQL.

~~~
orogor
Data integrity, speed , replication Pick two. Nosql gave up integrity. If you
are fine with your bank account sometime missing your salary, great. Its not
quite the same as twitter sometime missing a comment.

------
chmod600
Why continue to use the keyword SELECT? It's just annoying.

~~~
hans_castorp
What's the alternative? FETCH?

~~~
chmod600
Why is anything needed at all? You don't need a keyword preceding an
expression in other languages.

------
CawCawCaw
Ah, so a new challenger arrives. Good luck overcoming lock-in.

------
jimnotgym
Please leave it alone. I'm sure we can do better than English as a
conversational language, knowing more about how people use it, but I don't
want to learn English 2.0, because I already know English.

------
peterwwillis
If you want people to take your replacement language seriously, lead with an
RFC that's been reviewed by industry peers. We all know the status quo sucks,
but it's the status quo for a reason.

------
YohAsakura
Some comments here, in my opinion, are missing the point. Saying "SQL is good
enough, no need for EdgeQL" is like saying "C/C++/Java/... is good enough, no
need for Rust/Kotlin/...".

EdgeQL is an explicit attempt to make not a new version of SQL, but a new
language. It's about evolution, which is, again, explicitly noted. Saying
"evolution is not needed" or "I don't want/need evolution" is strange to me.
At the same time, giving constructive feedback is useful, as always.

------
brainless
People have a love/hate relation with SQL. I think most software engineers are
OK with it, some even find it a curious case to tumble down the rabbit hole of
multiple JOINs or inner SELECTs.

But for most people out there who simply want to poke at their business data,
SQL, or this article's stated solution - EdgeQL are all pains in the wrong
places. If you are making the life of engineers a bit easier, then you have to
think of the millions of My/Pg SQL installations.

If you want to make the larger audiences' life easier (the business people who
need insights), then you need to think outside of SQL altogether.

------
jimbob45
I was looking around yesterday to find alternatives to SQL and found nothing.
Everything is in this ridiculous table-based model, the queries of which are
complicated and error-prone.

It’s heresy to criticize SQL these days or even suggest that DBs could be
easier and more robust. I envision a future DB language that offers perfect
ease and safety with queries the way Rust has shown us that the memory
unsafeness of C was voluntary all along.

~~~
ckocagil
Did you come across Datalog?

