> SQL is surprisingly easy to write and expressive
Counter-example: using only majority-support ISO SQL string functions and operators (SUBSTRING, CHARINDEX, LIKE, NOT LIKE, LEN, etc - but not any regular-expressions), for a string column that contains URI query-string style name+value pairs, e.g. "foo=bar&baz=qux", try to extract out the value for the key "baz" - note that the source column value could also be "&baz=qux&foo=bar&" or "baz=qux&foo=bar&".
Doing that is trivial in most conventional programming languages: just use a couple of local-variables to store CHARINDEX values then pass those into SUBSTRING to get the desired "qux" value.
...but in SQL (at least to my knowledge) there's no concept of local-variables in a SELECT query, instead you need to pass values along via derived-tables or CTEs - or move the logic to a scalar UDF - or give up trying to be succint and do it with dozens of identical CHARINDEX call-sites and pray that the RDBMS is smart enough to memoize UDFs in a query step; none of those options are appealing even in their own right - so we're kinda stuck (and if you want to be succint you'll need not only UDF support, but also inlined and concurrent execution of UDFs otherwise your query performance will tank).
Instead, what if ISO SQL had the ability to define in-query "mini" pure-functions, which represent a (monadic?) sequence of operations, including mutable local variables, over its input parameters - this would solve the verbosity/DRY problem entirely - and without needing you to split a query's logic into two (i.e. the main SELECT query but also many CREATE FUNCTION statements). I note that the existing CTE syntax could be adapted for defining those functions, for example a query to extract querystring values could look like this:
```
WITH FUNCTION getQueryStringValue( input varchar, key varchar ) RETURNS varchar NULL AS (
SET key = CONCAT( key, '=' );
LET keyIdx = CHARINDEX( haystack: input, needle: key );
IF keyIdx < 0 THEN RETURN NULL;
LET valueIdx = keyIdx + LEN( key );
LET nextSeparatorIdx = CHARINDEX( haystack: input, needle: '&', startIndex: keyIdx );
IF nextSeparatorIdx < 0 THEN SET nextSeparatorIdx = LEN( input );
LET valueLen = nextSeparatorIdx - valueIdx;
LET value = SUBSTRING( text: input, startIndex: valueIdx, length: valueLen );
RETURN value;
),
cte AS (
SELECT
getQueryStringValue( src.QueryStringA, 'foo' ) AS fooValue1,
getQueryStringValue( src.QueryStringB, 'foo' ) AS fooValue2,
getQueryStringValue( src.QueryStringC, 'foo' ) AS fooValue3
FROM
src
)
SELECT
CONCAT( fooValue1, fooValue2, fooValue3 ) AS whatevs
FROM
cte
ORDER BY
someCol;
```
I don't disagree things like this are pain points, but one of the key things about unlocking the power of relational algebra is having normalized data, and putting multiple things (eg a bunch of key value pairs) in a single value sort of flies in the face of that. You can often get away with it, but it starts breaking down in more complex cases like this.
In the real world of course, sometimes it happens anyway. Maybe you were storing URIs and only later figured out you needed to extract keys from them. But thinking about this sort of thing beforehand as much as possible can save you a lot of heartache later on.
That said, this fails the ISO SQL requirement (I believe), but I'd use the table-valued function STRING_SPLIT in MS SQL, then I can just filter on LIKE 'baz=%", and SELECT the substring after baz=. Table-valued functions are great. The nice thing about this solution is it's very easy to generalize.
I believe MySQL doesn't have a similar function, but you could do something with SUBSTRING and LOCATE - search for the first occurrence of baz=, then you can use that to find the first location of & after baz=, which will allow you to select the substring between baz= and &.
Counter-example: using only majority-support ISO SQL string functions and operators (SUBSTRING, CHARINDEX, LIKE, NOT LIKE, LEN, etc - but not any regular-expressions), for a string column that contains URI query-string style name+value pairs, e.g. "foo=bar&baz=qux", try to extract out the value for the key "baz" - note that the source column value could also be "&baz=qux&foo=bar&" or "baz=qux&foo=bar&".
Doing that is trivial in most conventional programming languages: just use a couple of local-variables to store CHARINDEX values then pass those into SUBSTRING to get the desired "qux" value.
...but in SQL (at least to my knowledge) there's no concept of local-variables in a SELECT query, instead you need to pass values along via derived-tables or CTEs - or move the logic to a scalar UDF - or give up trying to be succint and do it with dozens of identical CHARINDEX call-sites and pray that the RDBMS is smart enough to memoize UDFs in a query step; none of those options are appealing even in their own right - so we're kinda stuck (and if you want to be succint you'll need not only UDF support, but also inlined and concurrent execution of UDFs otherwise your query performance will tank).
Instead, what if ISO SQL had the ability to define in-query "mini" pure-functions, which represent a (monadic?) sequence of operations, including mutable local variables, over its input parameters - this would solve the verbosity/DRY problem entirely - and without needing you to split a query's logic into two (i.e. the main SELECT query but also many CREATE FUNCTION statements). I note that the existing CTE syntax could be adapted for defining those functions, for example a query to extract querystring values could look like this:
``` WITH FUNCTION getQueryStringValue( input varchar, key varchar ) RETURNS varchar NULL AS (
), cte AS ( ) SELECT CONCAT( fooValue1, fooValue2, fooValue3 ) AS whatevs FROM cte ORDER BY someCol; ```Thoughts?