Hacker News new | past | comments | ask | show | jobs | submit login
The curse and blessings of dynamic SQL (2004-2022) (sommarskog.se)
67 points by ingve on May 9, 2023 | hide | past | favorite | 29 comments



Historically at least a common use case for dynamic sql was to build up a predicate based on user-selected filters, where the user can select one or more conditions from a larger set. Handling this with a single static query requires the planner to deal gracefully with constructions like 'where @opt_param is null or some_col = @opt_param', particularly where the choice of index depends on which filters have been selected. Having not gone near such cases for a long time, I'd hope that the major dbs have become rather good at handling such cases these days.

Edit: the sanest solution, then and now, is probably to select between a number of alternative static queries based on what index values have been supplied, and to use the 'is null or' construction for other filters.


Our good old CRUD app does that. We dynamically generate a WHERE clause based on the column(s) the user wants to search for, set the parameter value(s) based on the search values and rerun the query.

This approach allows us to have a one-liner to enable searching in a grid, and scales well to grids with millions of rows and 50+ columns.

Of course the user might issue a search on a single non-indexed column, but this hasn't been an issue in practice. Either it happens very seldom, or they've already filtered on something indexed.

We also have use dynamic SQL for cases where we needed to select from different tables or views depending on some run-time condition, or similar. No direct user input there though so don't really need the safety that parameters bring.


Reading further I see the article goes into dynamic tables/views, and how it's normally a code smell. I agree it's not something you should do lightly, and you need to consider the cost vs the benefits.

I mean clearly we could just maintain multiple near-identical static queries and just pick the right one, rather than have a single dynamic query.

However, then you're stuck having to keep them in sync and you will eventually screw that up.

In our case, we'll resort to that if the alternatives are too painful, which so far has just been a few places amongst our several hundred queries.


increasingly, search is handled by something other than SQL, so this is less relevant than it used to be


At least in Italy you may be surprised how many people use a RDBMS for almost everything. Business logic with stored procedures, FTS, queues, OLAP, data lake... It's the tool the know and they don't want to learn something else.


If you could explain what a better tool is and why, I'd be extremely interested, TIA


Would you use postgres to queue a billion messages a day consumed by dozens of applications? Or a message broker like kafka? Would you store dozens of TB of data in postgres or in a dedicated data lake? Would you put your business logic in Oracle stored procedures, basically denying the possibility to change DBMS? Would you put your applications log and monitor data in a RDBMS?

There's no better tool generally speaking, there's the tool more suited for your use case, RDBMS are now very good, but not for everything.


If you're as big as Google maybe not. Most companies aren't receiving 5,000+ messages a second, and I don't understand what's so special about a data like. There's nothing wrong with storing terabytes in an RDBMS. Also nothing wrong in putting your app and monitor data in one either.


We in-line our sql code in our c# code and avoid dynamic sql in stored procedures at all costs. We write our queries to handle all the parameters the user can choose and rarely have problems with parameter sniffing which we know how to easily resolve. I did need to create a pivot query with a dynamic number of columns, so I use C# to generate the query and then pass the final result to the server using Dapper.


This is the way. I inherited a very long SP that built a SQL string within the procedure then `sp_executesql`'ed the resulting string. It was also at the very core of our product, which made it a living nightmare to support, because if you messed anything up all Hell would break loose.

I spent a good couple months translating it to C#, and making a few improvements while I was at it. Absolutely 100% worth it, those improvements made it possible to add features to make supporting the product much easier for frontline support to help people.

The only benefit is we could cowboy updates directly to prod instantly. If that made your heart skip a beat, congrats, you actually give a damn about change management!


> Some readers may ask: what is dynamic SQL? Dynamic SQL is when you write SQL code into a string variable and then execute the contents of that variable. This can be done in client code or in a stored procedure. With a strict definition, static SQL can only occur in stored procedures and SQL scripts, since SQL in client code is always embedded into string literals in the client-side language.

I still don't get it


To me "static" always meant that you only run SQL string literals. You can't build strings in arbitrary ways and execute them. Parameters can get subbed into some places, though. Like query("SELECT * FROM foo WHERE bar = $1", "baz"). The parameter you use doesn't affect the query plan, so some DBMSes will let you store this as a prepared statement, which has a slight performance advantage.

And dynamic would be like query("SELECT * FROM foo WHERE bar = $1 AND " + var ? "true" : "EXISTS(...)", "baz"). The value of `var` can change the entire query plan. Obviously this is a little more dangerous, but sometimes you need the flexibility.

I read the article and the comments here and have no idea if my definition matches theirs.


Para 3: "Some readers may ask: what is dynamic SQL? Dynamic SQL is when you write SQL code into a string variable and then execute the contents of that variable" so yes it does.


Never ask me to review or support any code that's using dynamic SQL written in MS T-Sql or Oracle. Not that dynamic SQL is inherently bad, but the procedural and string-management tools of those languages are appallingly awful and SQL is a bad tool for templating/smashing strings together.

Which, fundamentally, is what dynamic SQL is.

When I realized the article was going to be about that kind of dynamic SQL, I closed the tab.

Building SQL queries dynamically is fine.

Doing it within the SQL server is an abomination.


Coming from an Oracle background, both application and data warehousing, I wholeheartedly agree with the articles first paragraph saying that application based dynamic SQL is horrible, but DBA (and I'll include data warehousing) dynamic SQL is great, but it's essentially about how complex the thing is.

A DBA is likely doing things like looping over metadata tables and running

  'grant select on ' || some_table || ' to ' || some_user
While an app developer with free reign will end up doing something much more complex, and much harder to reason about/tune


It is also somewhat frustrating that bind variables cannot be used with DDL.

I am perfectly able to:

  SELECT ACCOUNT_STATUS FROM DBA_USERS WHERE USERNAME=:var;
But I cannot:

  ALTER USER :var ACCOUNT UNLOCK;
I really don't understand why that capability wasn't added a decade ago, as loud as the advice is to avoid hard parsing.


That, and the lack of functions with a varying number of parameters, and lack of type polymorphism, and of easy to construct local views, and the lack of sum types...

Dynamic SQL is used on practice to fix a lot of problems the SQL people have been refusing to touch.


Yeah. This is a definite problem when working with user generated databases (eg untrusted), that could have pretty much anything in them. :/


What about some_user has characters that need to be escaped?


The parent article touches on this point too - both Oracle and SQL Server (and any other major db) supports parameterising values in dynamic sql. some_user would become @some_user (SQL Server) or :some_user (Oracle).

(The characters never actually get escaped with parameterisation - they are not part of the query text when it is parsed so can’t affect it - hence parameterising a value in sql query replaces the need to escape it with something much more robust.)


So this is about creating SQL strings with SQL? Not interpolating SQL with ? placeholders in say Python.

The article could have made that much more clear.

It reminds me of the bash-completion project which is the default on Debian-based distros.

It tries to parse bash IN BASH. This is a very bad idea.

Oil Uses Its Parser For History And Completion - https://www.oilshell.org/blog/2020/01/history-and-completion...

I would also argue that parsing bash in C is not a great idea either!


This, if you want to go the dynamic SQL route, please do it using some decent glue language or templating system that is not SQL itself. Unless you hate yourself that is.


For analytics work tools like dbt are really gaining traction for this reason.

Most of the OLAP / warehousing stuff I used to lean on dynamic sql for can be done more cleanly, with the benefit of easy source control.

https://www.getdbt.com/


If anyone is using dynamic SQL in their application built using functions or stored procedures, I seriously doubt their competency.

Dynamic SQL exists just because it can. It had absolutely no use case outside of I want to slap together some SQL statements real quick and don't feel like using anything except the DB.


I'm using dynamic SQL right now because generating almost a thousand almost-identical statements by hand would be stupid. It is necessary sometimes.


I tend to insist that database tables are collections of global variables, and the worst kind of global variables because they're persistent (reboot won't fix them), they often affect multiple application instances at once, and application compilers can't help you verify them because they're usually in strings. Stored procedure languages have a little bit of an advantage here, but there are so many disadvantages that it's hard to recommend them.

Your best bet when writing SQL in strings is make it "greppable". Naming a table "user" is very convenient until someone asks, "What will be affected by this change to the user table?" and you're stuck with needles in a haystacks. Tossing in a tiny prefix e.g. "tblUser" will annoy some people but then your signal-to-noise is ideal.

But yeah, concatenating bits and pieces of text together into queries can rapidly escalate out of control without a conservative/judicious approach.


> I tend to insist that database tables are collections of global variables, and the worst kind of global variables because they're persistent (reboot won't fix them), they often affect multiple application instances at once, and application compilers can't help you verify them because they're usually in strings

This is so ridiculous it hurts

> Tossing in a tiny prefix e.g. "tblUser" will annoy some people but then your signal-to-noise is ideal.

just no.


Note the lack of supporting argument


If you can explain to me how these 'global variables' can be somehow made local or somehow more restrictive and still be useful, I'd be interested. If you can equally explain to me why you want the database to clear the variables on reboot (i.e. destroy all that useful information which databases are designed to preserve, the 'D' in ACID), please do. If you can explain also how they can not "often affect multiple application instances at once" and yet be useful to multiple application instances at once (this being a central part of database functionality) I'd be interested.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: