Ask HN: I want to build my own query language

doctorzook · 2025-11-29T15:31:41 1764430301

Unless your data is really unusual, I’d generally recommend that you avoid writing your own query language and processor: it’s just damn hard to make it work well. Instead, look at how to put something like DuckDB in front of your data so people can just write SQL.

PaulHoule · 2025-11-29T15:48:14 1764431294

Or a step up from that: build a compiler that converts queries in a human-friendly or application-specific language to SQL or something similar.

benoau · 2025-11-29T15:49:44 1764431384

I'd stick with SQL; they can pull queries straight out of ChatGPT if they don't know it themselves.

If everyone lives within one database I'd throw up a per-customer read-only database in front of it for running their queries so they don't create performance issues.

bewal416 · 2025-11-29T15:52:08 1764431528

We do have a single-tenant DB. That’s one of my architecture challenges: how to handle permissions and trim the schema down to just the entities my users need.

benoau · 2025-11-29T16:04:28 1764432268

Possibly achieve that with some views, or whatever the equivalent is in your database, and database accounts that can only access those views.

Another option might be to let them ingest their data directly into the BI tools they already use, where they can do whatever they want. The cool thing about that is it can entrench you in their infrastructure, and it offloads a lot of the complexity you're dealing with.

bewal416 · 2025-11-29T21:22:00 1764451320

Okay- just spent the whole day tinkering with this:

1) I create a baseline set of views I want my customers to have.
2) For each new customer, I’ll run a script that creates a replica of those views, filtered by their customer ID.
3) I’ll allow my customers to write pure SQL, limiting them to SELECT queries and a couple of niche business rules, and masking any DB-level errors, because exposing those just feels wrong.

How does that approach sound?
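For the error-masking part of step 3, here is a minimal sketch of one way it could look. It uses Python's built-in sqlite3 module purely as a stand-in for the real database driver; `run_customer_query`, `GENERIC_ERROR` and the logger name are made-up names for illustration, not anything from an existing codebase. The idea is to log the real error on your side and hand the customer a generic message.

```python
import logging
import sqlite3

# Hypothetical names for illustration only.
logger = logging.getLogger("customer_queries")
GENERIC_ERROR = "Your query could not be run. Please check it and try again."


def run_customer_query(conn: sqlite3.Connection, sql: str) -> dict:
    """Run a customer query and hide database-level error details.

    The real error is logged server-side; the caller only ever sees
    a generic message, so table names and internals are not leaked.
    """
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        logger.warning("customer query failed: %s", exc)
        return {"ok": False, "error": GENERIC_ERROR}
    return {"ok": True, "rows": rows}
```

This only hides error text. It does not restrict what the query can touch, so it is not a substitute for permissions.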

benoau · 2025-11-29T22:34:15 1764455655

I think the main thing you're missing is creating an account in the DB that only has access to those views, so for each customer you'd do something like:

    CREATE USER customer_xyz WITH PASSWORD 'foo';

    CREATE VIEW customer_xyz_data AS SELECT * FROM data_stuff WHERE customer_id=x;

    GRANT SELECT ON customer_xyz_data TO customer_xyz;

So then two things are happening: the account only has SELECT on that view, so it can't write anything (it's the grant that enforces this, since some views are updatable), and it is categorically unable to touch anything outside of that view too. As long as you run their queries through that account, they will always be sandboxed.

You can enforce all of that yourself, but ultimately if they're using an account that can read/write other tables you will always have to be careful to sanitize their input: not just restricting it to SELECT, but limiting joins and nested queries too.
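The three statements above could be generated per customer by a small script. A rough sketch follows, assuming MySQL syntax (`IDENTIFIED BY` instead of `WITH PASSWORD`), a schema called `reporting`, a `customer_<id>_<base>` naming scheme and the function name `provisioning_statements`. All of those are assumptions made for illustration. It only builds the SQL strings; running them through a driver is left out.

```python
import re
import secrets

# Only plain lowercase identifiers are accepted, because identifiers
# cannot be passed as bound parameters and are interpolated directly.
_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def provisioning_statements(customer_id, base_views, schema="reporting"):
    """Build MySQL statements for one customer's account and views.

    `schema` and the naming scheme are hypothetical placeholders.
    """
    if isinstance(customer_id, bool) or not isinstance(customer_id, int):
        raise ValueError("customer_id must be an integer")
    if customer_id < 0:
        raise ValueError("customer_id must be non-negative")
    for name in [schema, *base_views]:
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    user = f"customer_{customer_id}"
    # token_urlsafe yields only letters, digits, '-' and '_',
    # so the password is safe inside single quotes.
    password = secrets.token_urlsafe(24)

    stmts = [f"CREATE USER '{user}'@'%' IDENTIFIED BY '{password}';"]
    for base in base_views:
        view = f"`{schema}`.`{user}_{base}`"
        stmts.append(
            f"CREATE VIEW {view} AS SELECT * FROM `{schema}`.`{base}` "
            f"WHERE customer_id = {customer_id};"
        )
        # The account gets SELECT on the view and nothing else.
        stmts.append(f"GRANT SELECT ON {view} TO '{user}'@'%';")
    return stmts
```

Calling `provisioning_statements(42, ["orders", "invoices"])` returns five statements: one `CREATE USER`, then a `CREATE VIEW` and a `GRANT SELECT` for each base view.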

bewal416 · 2025-11-29T23:38:06 1764459486

Gotcha. Yeah- I was thinking of working with my engineers to figure out a permissions layer, but I understand that enforcing it at the DB level would guarantee security.

Dumb question- is creating a set of Views for each customer even efficient for my MySQL database? I could realistically see us having ~12 customer-facing views- is having 12*N views a smart and scalable way to architect this?

benoau · 2025-11-30T00:14:57 1764461697

A view is just a query that pretends to be a table, so it will come down to the complexity of that query. Each time you query the view, the DB runs the user's query combined with the view's query, so performance comes down to whether your DB is optimized for basically "SELECT field1, field2, field3 FROM (SELECT * FROM data_stuff WHERE customer_id=x)". Whether you execute that as a view or as ad-hoc SQL doesn't make a difference in itself.
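To make the "view is just a stored query" point concrete, here is a small sketch using Python's built-in sqlite3 module as a stand-in. SQLite is not MySQL, so this shows only that the two forms return the same rows and says nothing about MySQL's query planner or performance. The table contents, the view name `customer_data` and the function name are invented for the demo.

```python
import sqlite3


def view_vs_subquery(customer_id: int = 1):
    """Return rows fetched through a view and through the inline subquery."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE data_stuff (customer_id INTEGER, field1 TEXT, field2 INTEGER);
        INSERT INTO data_stuff VALUES (1, 'a', 10), (1, 'b', 20), (2, 'c', 30);
        """
    )
    # CREATE VIEW cannot take bound parameters, so the id is forced to int
    # before being interpolated.
    conn.execute(
        "CREATE VIEW customer_data AS "
        f"SELECT * FROM data_stuff WHERE customer_id = {int(customer_id)}"
    )
    via_view = conn.execute(
        "SELECT field1, field2 FROM customer_data ORDER BY field1"
    ).fetchall()
    via_subquery = conn.execute(
        "SELECT field1, field2 FROM "
        "(SELECT * FROM data_stuff WHERE customer_id = ?) ORDER BY field1",
        (customer_id,),
    ).fetchall()
    conn.close()
    return via_view, via_subquery
```

With the sample rows above, both queries return the two rows belonging to customer 1 and nothing from customer 2.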

"Your side" of this can be optimized easily enough, but the user-submitted queries are likely to be inefficient or miss indexes, which is why one database per customer can be better since they each have their own resources.

You can also create the views and accounts as needed and destroy them when sessions end, rather than keeping them permanently: when the user signs in you create the view and account, and after the session ends or after some period of inactivity you remove them.
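The cleanup half of that could be sketched roughly as below, again assuming MySQL syntax, a `reporting` schema and a `customer_<id>_<base>` view naming scheme. Those names, the 30-minute idle limit and both function names are assumptions for illustration. The functions only decide when a session is stale and build the `DROP` statements; scheduling and execution are left out.

```python
import re
from datetime import datetime, timedelta

# Identifiers are interpolated into SQL, so only plain names are accepted.
_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def session_expired(last_active: datetime, now: datetime,
                    idle_limit: timedelta = timedelta(minutes=30)) -> bool:
    """True once the session has been idle for at least idle_limit."""
    return now - last_active >= idle_limit


def teardown_statements(customer_id, base_views, schema="reporting"):
    """Build MySQL statements that remove a customer's views and account."""
    if isinstance(customer_id, bool) or not isinstance(customer_id, int):
        raise ValueError("customer_id must be an integer")
    for name in [schema, *base_views]:
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    user = f"customer_{customer_id}"
    stmts = [
        f"DROP VIEW IF EXISTS `{schema}`.`{user}_{base}`;"
        for base in base_views
    ]
    # IF EXISTS keeps the cleanup safe to run more than once.
    stmts.append(f"DROP USER IF EXISTS '{user}'@'%';")
    return stmts
```

A background job could call `session_expired` for each open session and run the teardown statements for the ones that come back true.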

bewal416 · 2025-11-30T01:31:42 1764466302

Makes sense. The fact that my SQL Editor puts tables and views in the same section on its left sidebar was the main reason I did a double-take.

The idea of deleting and recreating views is an interesting one. I see that as a really cool approach- considering we can go without it as a v1 then include it as we scale.

Thank you for all your advice so far! This has been truly helpful.

benoau · 2025-11-30T15:06:30 1764515190

You're welcome!