
I Don't Want to Teach My Garbage DSL, Either - cf020031308
https://github.com/cf020031308/cf020031308.github.io/blob/master/blog/i-dont-want-to-teach-my-garbage-dsl-either.md
======
ken
The article this is in response to says:

> What’s worse than data silos? Data silos that invent their own query
> language.

and:

> I just want my SQL back. It’s a language everyone understands, it’s been
> around since the seventies, and it’s reasonably standardized.

Huh? SQL is just as DSL-y as anything else. It's so non-standardized that I
can't take any program written for any RDBMS and run it against any other
RDBMS, unless the author specifically extracted all the DSL into an interface
layer and ported it to that other RDBMS already. Even something as common as
"CREATE INDEX" has different _syntax_ on every database I've ever used.

It's even worse than HTML/JS/CSS 10 or 20 years ago, where at least it had a
chance of minimally working. And you can't seriously tell me that it's harder
to make SQL implementations (where you control the database) portable,
compared to, say, C++ compilers (where you don't control the hardware
architecture).

Yeah, data silos with their own query languages are bad. But RDBMSs are some
of the worst offenders, because they pretend this doesn't apply to them. It's
easy for newcomers to justify creating their own SQL-like languages, because
every existing database is already merely an SQL-like language.

All these database vendors need to get together and make some "SQL5" that
finally works consistently.

~~~
stcredzero
_Huh? SQL is just as DSL-y as anything else. It 's so non-standardized that I
can't take any program written for any RDBMS and run it against any other
RDBMS, unless the author specifically extracted all the DSL into an interface
layer and ported it to that other RDBMS already._

There are lots of weird parallels between SQL and Smalltalk. For example, both
SQL and Smalltalk are constructed with Douglas Hofstader's "strange loops."
All Smalltalk object instances have a class, and the classes themselves are
object instances. SQL tables are defined using metadata stored in SQL tables.
Another parallel, is that the language variants are very similar, but
ultimately incompatible to the point where translation is non-trivial. In
Smalltalk, this is due to a toothless language standard, resulting from
political maneuvering by various language vendors. Not sure what it is in the
case of SQL.

There is a problematic relationship between DSLs and libraries in languages of
sufficient power. In programming languages with a certain level of expressive
power, it's easy to write a DSL on top of a library, or on top of another DSL.

[https://xkcd.com/927/](https://xkcd.com/927/)

I used to joke about "lady/gentleman computer scientists." What's the
difference? A computer scientist knows how to implement another computer
language. A lady/gentleman computer scientist knows when they should know
better.

A DSL should be used to super-charge your project's very idiosyncratic special
context. The trick is knowing better when you shouldn't. Accessing a database
isn't your project's very idiosyncratic special context.

------
teddyh
If you’re developing a DSL which is just a query language, you are reinventing
the wheel, and you should ask yourself if any benefit of your language over
SQL is worth the effort of all your users to learn your new query language. It
may be worth it of your data can not be usefully be modeled by tables; e.g.
document query languages like XQuery and even simple XPath are useful and can
not be easily replaced with SQL.

~~~
rightbyte
Well are there any data on how much more fuzz there is to learn a domain
specific language compared to a domain specific library?

Arduino "whatever it is called but really c++"-language or Warcraft 3 scenario
language are kinda nice for what they are.

~~~
teddyh
Neither of those are query languages.

------
pdimitar
As a broader point, the general message of this post is not said often enough
and it should be.

People get attached to their creations and especially if the thing starts as a
hobby project, the author can get carried away pretty far. When they
eventually want to share their creation with the world they might get a really
cold shower.

IMO us the programmers started becoming less practical and more tinker-y.
Which is not a bad thing; it's healthy for the psyche. But when it comes to
pieces of tech that can end up getting used on vast scales, we should be more
responsible.

In this lane of thought, DSLs should be reserved to niche business domains and
not as the first tool you reach for when you can't quite describe a problem
with your programming language of choice.

------
deskamess
At the end of the day, most reports/dashboard UI's are a DSL over SQL. It may
not be language in the traditional sense - more of a Visual or Interactive
language. However, it is a language where the primitives (drop down, text
boxes, etc) are understood by many, even more so than the language it is
abstracting (SQL).

------
SenHeng
Unrelated to the article, just found it interesting how he’s using github to
literally auto generate his blog via their md-to-html thingy, and issues for
comments.

~~~
munmaek
It's not a bad idea, aside from putting your blog entirely on a third party
like github. Utterances is the alternative to disqus that uses github
issues.[0]

Personally I think the best solution is using a static site generator like
Hugo or Zola to control how you generate your content, and then host that
using Caddy + Docker. A database is overkill.

You can also use caddy's git plugin to kick off automatic builds build from
any git repo, not just a github one. It just needs to support webhooks, I
believe.[1]

[0]: [https://utteranc.es/](https://utteranc.es/)

[1]:
[https://caddyserver.com/docs/http.git](https://caddyserver.com/docs/http.git)

~~~
braythwayt
raganwald.com uses github’s md-to-html thingy, which is jekyl.

But there is nearly zero walled garden downside. I own the domain, I don’t use
github.io, so I don’t fear link rot if I move.

Because it’s git, I always have a copy of everything locally, I don’t depend
on github for storage.

Because it’s jekyll, I can generate my blog on my own system and upload it
somewhere else whenever I want.

I don’t support comments at all, but that’s a personal choice. I’m not in the
community business, I outsource comments to Hacker News. Which is also how
most of my readers want to discuss my writing.

~~~
Cybiote
Yeah, you're always going to be reliant on third parties in a complex world.
What is important is how beholden or locked in you are in any given
arrangement. If you are free to move from one contractor to another because
they are all interoperable, then you're independent in a more important sense.

------
ojglees
I don't understand why writing an ORM requires creating a new query language.
The point of an ORM, as I understood it, was to think about objects, not query
languages. When I wrote my garbage ORM
([https://hrorm.org](https://hrorm.org)) I just ignored everything that's hard
(you can write your own SQL for that) but the easy things (basic CRUD) don't
require a query language at all. They are just provided by an object-based
interface.

~~~
randomdata
The trouble with SQL is that its foundation is old and hasn't been given much
love with regards to modern language theory, and any attempts to talk about
how the language could be improved are shot down on the basis of confusing the
language with the application of the language.

The trouble with SQL is that it doesn't easily allow for basic building blocks
that ORMs benefit from, like composability. This leads many ORM authors to
build their own query language, which support features that SQL lacks or does
poorly, that compile to SQL in order to simplify the rest of the development
of the ORM.

This comes as a result of SQL not being a very good language (not to be
confused with the application of declarative querying of relational data,
which is beneficial and could benefit greatly from a good query language) by
modern standards.

~~~
linuxftw
SQL is a great language. It's for querying records from a database. Trying to
wrap stateful objects around SQL is where the industry went wrong, IMO. It
does make some code more portable across datastores, but for me the price is
too high. When you know how to write efficient SQL statements, ORM feels like
having a hand tied behind your back.

~~~
randomdata
_> It's for querying records from a database._

The application of the language does not make the language itself great. As we
have seen with DSLs that often come bundled with ORMs, there are other
languages to query databases with (even if they ultimately compile to SQL),
and I would argue that some of them do a lot better job than SQL does for
providing a comfortable and cohesive environment for developers to write
queries in.

In the imperative language space, we have one hundred and one different
languages all trying to make things slightly more comfortable to developers.
C, Go, and Rust can all be used to write the same kind of application, more or
less, but that does not mean all of those languages are equally great. The
same is true of declarative queries. Just because SQL is popular does not mean
it is great.

 _> When you know how to write efficient SQL statements, ORM feels like having
a hand tied behind your back._

ORMs and SQL are orthogonal concepts, really. There is no reason an ORM
couldn't require you to hand-roll every single SQL statement. An ORM's concern
is simply mapping the results of that query into the application's objects.
That some ORM implementations also include functionality to build queries for
you, often on top of the aforementioned DSLs, to make that mapping require
less effort on the developer is, I would argue, largely a result of SQL being
a bad language.

~~~
linuxftw
Working with relational databases might be difficult for those not acquainted
with them. However, I think the SQL language as an interface primarily for
relation databases is a great language.

I think, invariably when you use ORM, you end up having query-specific object
structures. For example [1].

You also might be passed objects with deferred fields [2]. This will be
completely opaque to someone consuming the resulting object. You'll eventually
run into this problem [3]. Solving the lazy load problem requires an
understanding of how SQL works in the first place. And if you look at the
solution in that example, it's an ORM-wrapped series of joins.

From 3: > For good measure, we add a raiseload to throw an exception if we try
to load anything that we didn’t load here.

Who wants to live in this world?

1:
[https://stackoverflow.com/a/45905714/5573538](https://stackoverflow.com/a/45905714/5573538)
2:
[https://docs.sqlalchemy.org/en/13/orm/loading_columns.html#c...](https://docs.sqlalchemy.org/en/13/orm/loading_columns.html#column-
deferral-api) 3: [https://engineering.shopspring.com/speed-up-with-eager-
loadi...](https://engineering.shopspring.com/speed-up-with-eager-loading-in-
sqlalchemy-87a176cfd7ad?gi=2e66321199fc)

~~~
randomdata
_> However, I think the SQL language as an interface primarily for relation
databases is a great language._

Which language(s) are you using as a point of comparison and why is SQL better
than those other languages? SQL is no doubt better than nothing, but that is
not in the spirit of our discussion.

------
neya
I agree to an extent. But, I disagree that learning ORMs are a waste of time.
For example, I'm on Phoenix, using Ecto. I find `Repo.all` much more
convenient that "SELECT * FROM .." etc.

I don't know if it's better or worse - the original blog post asking
governments to regulate data formats/code!

~~~
pdimitar
Well, Ecto is not an ORM. It's a Data Mapper[0]. Rails' ActiveRecord is ORM.

I agree that some DSLs make SQL better and easier to reason about -- and that
Ecto is one of them.

[0]
[https://en.wikipedia.org/wiki/Data_mapper_pattern](https://en.wikipedia.org/wiki/Data_mapper_pattern)

~~~
spdionis
Aren't data mappers generally referred to as ORMs?

~~~
pdimitar
Not to my knowledge. Data Mappers should be the "less batteries included"
piece of tech compared to the ORMs.

As a single non-representative example, ActiveRecord has `before_insert` hooks
you can simply add as methods to your model class. Ecto doesn't have those.

~~~
jrochkind1
That seems silly. They are a technology for mapping from a relational database
to a system based on objects. That's what an ORM is.

> Data Mapper: A layer of Mappers (473) that moves data between objects and a
> database...

([https://martinfowler.com/eaaCatalog/dataMapper.html](https://martinfowler.com/eaaCatalog/dataMapper.html))

Sounds like an ORM to me.

Some people try to distinguish between the "data mapper" pattern and the
"active record" pattern (it's not just the name of the Rails library, it's a
pattern... which the Rails library may or may not implement very well). Both
are ORMs, because both are ways of mapping from an rdbms to an object system.

(Neither of which actually has to do with query DSL. We could imagine just
taking the part of ActiveRecord that produces queries, but having it return
simple hash/string literals. It wouldn't really be an "ORM" (except in the
most technical sense that even hashes are objects in ruby), but it would still
have the parts you _don 't_ like. The nature of query building is actually not
related to 'data mapper' vs 'active record' \-- you could have an instance of
either in which you wrote raw SQL queries, or an instance of either which used
the _same_ non-SQL DSL)).

But even distinguishing between "data mapper" and "active record", in actual
practice, I don't think there are two completely separate, distinct, and
unified camps. I don't think these categories actually serve well to deliniate
the ORMs we've got. Instead, there are a just a whole bunch of approaches,
some more light weight than others, some more mature/reliable than others,
some 'leakier' than others, differing on all sorts of additional dimensions. I
agree that some ORMs are better than others -- and some may disagree on which
these are -- I don't think saying "data mapper" is actually useful for
understanding which these are.

------
athenot
A DSL _can_ be immensely useful in terms of useability. But it is not trivial
to be done right. You need to define a consistent grammar, make sure it's
intuitive to your users and free of all sorts of fun bugs that happen when odd
constructs are put together.

~~~
throwaway1492
I think you're describing what Fowler[1] calls an "external" dsl ie separate
lexer, parser, and code generator. The alternative being "internal" dsl,
something that uses a languages own constructs to create stuff; ie like how
Ruby is is used in several well known frameworks.

[1] [https://www.goodreads.com/book/show/8082269-domain-
specific-...](https://www.goodreads.com/book/show/8082269-domain-specific-
languages)

~~~
grumdan
I think this is still applicable advice for embedded DSLs too, since the
grammar of the DSL will be made up of the ways in which you can compose the
operations you provide, whether they are parsed by the host language's parser
or your own.

Overall, I agree that it's important to think about how to make sure that
operations compose in an unintuitive, expected way. I find this hard to figure
out without thinking of it in terms of a grammar and semantics, whether it's
embedded or not.

------
ukblewis
I don’t agree. Yes, of course everyone should consider if they really even
need their DSL. But sometimes as the author rightly notes, sometimes the
alternative to a DSL is a lower or significantly more complex existing
interface style (such as SQL).

------
jrochkind1
What is the difference between a "DSL" and an "API"?

We can't write software without making APIs...

------
adrianhel
Then again, REST/RPC are query languages in their own right.

How would one go about securing this thing anyways?

(I agree with the basic sentiment, but find it somewhat unrealistic)

~~~
adrianhel
Do anyone here expose real sql querying capabilities to your users?

~~~
tluyben2
In a dark past my company did: because these users were paying CMS users, they
would not try to hack things, but rather because of the power of SQL and their
lack of knowledge, they broke a lot which required backups to restore. We are
talking mid 90s and Perl CGI scripts.

~~~
adrianhel
That sounds kind of terrifying.

~~~
siscia
Read only access to a replica databases seems quite a sensible choice for me.

If along you expose not the the real tables but views I honestly don't see
what could go wrong.

~~~
asdfasgasdgasdg
The main thing I can think of is that migrations would become much more
painful, because your API is the entire capability surface of the relational
database you're using -- no more and no less. Hyrum's Law suggests that your
users will come to depend on every facet of it.

Then again, no doubt there are some cases where this is the best solution. But
it's worth being cautious before adopting an approach like this.

~~~
kerblang
The parent poster's grammar is kind of bad but I think their idea is to expose
only views, not actual tables, so there is some degree of dependency
immunization. Views are a pretty good poor person's substitute for an API
right & proper.

------
wuzhilang
传统的sql迟早被淘汰，我的见解就是这些了！没了

------
snicker7
If your API is so complicated that it becomes a DSL, then perhaps there should
be better communication between frontend and backend teams. Setting up a lean
API contract makes more sense than creating an internal DSL or exposing
internal data stores directly to users.

~~~
asdfasgasdgasdg
Some things are irreducibly complicated. You can't set up a lean API contract
for a complicated thing without giving up power. The example of SQL is
pertinent. Any attempt to set up a lean API contract for a relational database
is doomed to be weak and painful to use. Without knowing the specific problem
domain the blogger was dealing with, you can't know whether that was the
situation. But if we assume they are competent, then we can assume a lean API
was not possible.

~~~
jacob019
complexity != complicated

~~~
asdfasgasdgasdg
It literally is though.

com·plex·i·ty _n._ the state or quality of being intricate or complicated.

~~~
jacob019
They exist as distinct words for a reason. The difference is important. Search
"complexity vs complicated", I even found a few articles about it right here
on hacker news.

