> One of the most common uses of null is to communicate missing values in the sense of "this entity has this attribute but we don't know what its value is"
The correct way to represent this, which has been widely used for several decades now, is sum types. If your value might be unknown, you need to explicitly represent this in the type of your value (e.g. Optional<Foo>) instead of relying on a hack that allows subverting the type system.
You're approaching this at too low a level. SQL is a 4th generation language, quite possibly the only successful 4GL. Unlike every other major language, it doesn't deal with "how" you get the answer. You see the "how" with commands like EXPLAIN. Instead it focuses on "what" you want, letting the engine figure out how to get there.
Optional<Foo> lends itself well to functional programming and prevents things like null pointer errors, but that's still at the 3GL level, the EXPLAIN level. SQL doesn't throw null pointer exceptions. Ever. The query syntax is correct or it isn't. The concerns of a programmatic language simply do not apply here, which is why Optional<Foo> is a poor fit in this context.
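For instance (Postgres-style syntax, with a made-up orders table), the "how" only shows up when you ask for it:

-- The query states only the "what": which rows and aggregates you want.
SELECT customer_id, SUM(total)
FROM orders
GROUP BY customer_id;

-- The "how" (scan strategy, join order, hash vs. sort) only appears on request:
EXPLAIN
SELECT customer_id, SUM(total)
FROM orders
GROUP BY customer_id;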
Null in SQL is not the null in JavaScript or C++. Same name, but not the same animal, and I think this is what trips folks up. In your mind, replace SQL's null with a new keyword, UNKNOWN, and it makes more logical sense.
SELECT ... FROM ...
WHERE x = UNKNOWN;
Even if x were UNKNOWN in some row, is it the same unknown? For example, suppose you see two girls but don't know either of their birthdays. Both birthdays are unknown.
WHERE Keasha.birthday = Cynthia.birthday
This predicate can never evaluate to true, no matter what, until you know both of their actual birthdays; in SQL it evaluates to UNKNOWN, so the row is filtered out just as if it were false. Just because something is unknown doesn't mean it's the same unknown.
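A quick sketch with a made-up table (names and columns are just for illustration; any engine with standard three-valued logic behaves this way):

CREATE TABLE girls (name TEXT, birthday DATE);               -- birthday is nullable
INSERT INTO girls VALUES ('Keasha', NULL), ('Cynthia', NULL);

-- Returns no rows: comparing two unknowns yields UNKNOWN, which WHERE
-- treats as "not true", even though both columns hold the "same" NULL.
SELECT k.name, c.name
FROM girls AS k, girls AS c
WHERE k.name = 'Keasha'
  AND c.name = 'Cynthia'
  AND k.birthday = c.birthday;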
It's set theory, not a programmatic series of steps as found in most languages. That's what SQL is: a DSL for set theory.
I believe it's advantageous to avoid nullable columns in relational schema design, but that's a separate issue from the concept of NULL in SQL as distinct from programmatic languages.
> I believe it's advantageous to avoid nullable columns in relational schema design, but that's a separate issue from the concept of NULL in SQL as distinct from programmatic languages.
Agreed, but they should be used where appropriate. My favorite example of a nullable column is an end date. Until an end date occurs, it's unknown - thus null. Representing the unknown end date with anything other than null is semantically incorrect, and in many cases can lead to errors.
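A minimal sketch of that pattern (table and column names invented here):

CREATE TABLE memberships (
    member_id  INTEGER NOT NULL,
    start_date DATE    NOT NULL,
    end_date   DATE                -- NULL: hasn't ended yet, so the end date is unknown
);

-- "Still active" means the end date is unknown, and that is tested explicitly:
SELECT member_id
FROM memberships
WHERE end_date IS NULL;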
You could argue any SQL value not marked as NOT NULL is an Optional<Foo>, with NULL being the None value. It's not really subverting the type system, it's just an unconventional choice.
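On that reading, COALESCE plays roughly the role of Optional's orElse/getOrElse (illustrative table and column names only):

-- "Unwrap" a possibly-NULL value with a default, much like opt.orElse("")
SELECT COALESCE(middle_name, '') AS middle_name
FROM customers;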
Yes, but I can't think of any language where, given Optional<Foo> x and y, x.equals(y) would return null instead of either true, false, or an exception if y is null.
That's what makes SQL null-values distinct - the fact that expressions that return booleans can return NULLs.
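For example (Postgres-style standalone SELECTs; results shown in comments):

SELECT 1 = 1;           -- true
SELECT 1 = NULL;        -- NULL, not false
SELECT NULL = NULL;     -- NULL, not true
SELECT NOT (1 = NULL);  -- still NULL: UNKNOWN propagates through boolean operators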
In the relational model used by SQL, types are atomic. Sum types are inherently non-atomic. I'm sure you could graft them onto the language, the same way Postgres can have JSON type fields and all that, but I don't think you'd be able to really integrate them into SQL without fundamentally changing what kind of language SQL is.