
Comparisons in C++20 - ingve
https://brevzin.github.io/c++/2019/07/28/comparisons-cpp20/
======
makecheck
One of the most frustrating bugs I encountered in a previous job was an
incorrect implementation of a “strict weak ordering” that would cause
containers and algorithms to just plain _misbehave_. The evil part was that
there was a whole range of misbehavior, including “generally working” for most
data, while crashing in other cases!

The reality is that most programmers are not experts, yet they work on C++
projects. People _will_ try “simple” things like hacking up a less-than
operator, and when it seems to “work”, they move on (leaving a code time bomb
behind).

The C++ compiler needs to provide robust compile-time and run-time checks to
tell programmers that things are wrong, like: “your operator does not meet the
requirements of the sort algorithm”.
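A run-time check along those lines is easy to sketch. Here is a hypothetical
Python helper (the name and the sampling approach are made up) that probes a
candidate less-than over some sample data:

```python
from itertools import product

def looks_like_strict_weak_order(lt, sample):
    """Spot-check `lt` for strict-weak-ordering axioms over `sample`.

    This is a run-time sanity check, not a proof: it only tests
    irreflexivity and asymmetry on the values it is given.
    """
    for a in sample:
        if lt(a, a):                  # irreflexivity: a < a must be False
            return False
    for a, b in product(sample, sample):
        if lt(a, b) and lt(b, a):     # asymmetry
            return False
    return True

data = [3, 1, 4, 1, 5, 9, 2, 6]
assert looks_like_strict_weak_order(lambda a, b: a < b, data)
# Using <= as a "less than" breaks irreflexivity -- the classic time bomb:
assert not looks_like_strict_weak_order(lambda a, b: a <= b, data)
```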

Contrast the latest 20-page description of std::whatever_the_heck to, say,
Python sorting, which offers "key=..." because _the overwhelmingly common case
is to simply state a single field on which to base object ordering_!!!
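For reference, the Python idiom being described, with a made-up Employee
record:

```python
from collections import namedtuple
from operator import attrgetter

Employee = namedtuple("Employee", ["name", "salary"])
staff = [Employee("bo", 50), Employee("al", 70), Employee("cy", 60)]

# The overwhelmingly common case: order on a single field.
by_salary = sorted(staff, key=attrgetter("salary"))
assert [e.name for e in by_salary] == ["bo", "cy", "al"]
```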

~~~
comex
Python is almost as bad as C++ when it comes to making custom classes
sortable. In Python 2 there was __cmp__, which was roughly the same as
C++'s new spaceship operator, but Python 3 inexplicably removed it in favor of
only having individual comparison operators like __eq__, __lt__, __gt__, etc.
Oh, and functools.total_ordering, which lets you "only" define __lt__ and
__eq__ and get the rest automatically implemented, but that's still two things
to define when __cmp__ was only one.
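The two-definition version with total_ordering looks like this (the Version
class is a made-up example):

```python
from functools import total_ordering

@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def __eq__(self, other):
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        return (self.major, self.minor) < (other.major, other.minor)

# __le__, __gt__ and __ge__ are filled in by the decorator:
assert Version(1, 2) <= Version(1, 2)
assert Version(2, 0) > Version(1, 9)
```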

~~~
masklinn
IIRC the reason is that people would regularly screw up __cmp__ (despite the
cmp built-in being available); it’s harder to fuck up simple boolean results
than the any-integer-coerced-to-3-state of cmp.

Python 3 also removed the “everything is comparable” misfeature of P2,
providing way lower incentive to implement ordering: if your type is not
orderable, a user knows to go with key functions.

~~~
ioquatix
I don't trust someone who can't implement `__cmp__` to implement `__lt__`
either.

~~~
masklinn
Experience says you're way wrong, simply because combining a sequence of sub-
lt calls is way simpler and shorter than combining a sequence of cmp:

    
    
        def __lt__(self, other):
            return self.a < other.a and self.b < other.b and self.c < other.c
    

meanwhile

    
    
        def __cmp__(self, other):
            v = cmp(self.a, other.a)
            if v != 0:
                return v
            v = cmp(self.b, other.b)
            if v != 0:
                return v
            v = cmp(self.c, other.c)
            if v != 0:
                return v
            return 0
    

Rust (for instance) makes the latter less offensive (and error-prone) by
providing built-in combinators: [https://doc.rust-
lang.org/std/cmp/enum.Ordering.html](https://doc.rust-
lang.org/std/cmp/enum.Ordering.html)

    
    
        impl Ord for Thing {
            fn cmp(&self, other: &Self) -> Ordering {
        self.a.cmp(&other.a)
                    .then(self.b.cmp(&other.b))
                    .then(self.c.cmp(&other.c))
            }
        }

~~~
BlackFingolfin
But that first comparison function is usually _not_ what you want, you want
lexicographic ordering. This very example is in fact discussed in the article,
as an example to how people often mess this up...
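A concrete counterexample (hypothetical three-field class) shows where the
field-wise "and" version and a lexicographic compare disagree:

```python
class P:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

    def __lt__(self, other):
        # the field-wise "and" version from above
        return self.a < other.a and self.b < other.b and self.c < other.c

    def lex_lt(self, other):
        # lexicographic ordering, which is what sorting actually needs
        return (self.a, self.b, self.c) < (other.a, other.b, other.c)

x, y = P(1, 0, 0), P(0, 1, 1)
assert not (x < y) and not (y < x)   # "and" version: x and y incomparable
assert y.lex_lt(x)                   # lexicographic: y orders before x
```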

~~~
masklinn
> But that first comparison function is usually not what you want

All those functions implement the same comparison, the first is simply not
including the similarly trivial implementation for eq:

    
    
        def __eq__(self, other):
            return self.a == other.a and self.b == other.b and self.c == other.c
    

> you want lexicographic ordering

Every word here makes sense but the objection makes none. What are you trying
to say exactly?

------
moomin
Interesting to compare to Haskell. Here you typically implement “compare”
which is equivalent to the spaceship operator and get < &c for free, but you
can implement “<=” and get compare for free because the defaulting mechanism
is general and not built into the compiler. Also “Ord” has to be compatible
with “Eq” (which you can also get for free) which is different from C++’s
approach. Moreover, the compiler can “derive” implementations for regular data
objects.

C# is slightly different: all of the algorithms take an IComparer, which
implements the spaceship operator in another class (you can instead implement
the related IComparable on your actual class and it all works via the magic of
reflection. This isn't too bad since the reflection only occurs in the
constructor.). Comparison operators are basically never used in the standard
library and people rarely implement them.

~~~
claudius
> Moreover, the compiler can “derive” implementations for regular data
> objects.

When the spaceship operator is defaulted, the same happens in C++ (by
comparing members). In this special case, the equality operator is also
effectively defaulted, so, as described in the article,

    
    
        struct A {
          …
          auto operator<=>(A const& rhs) const = default;
        };
    

gives you automatic member-wise comparison "derived" by the compiler.

~~~
moomin
Nice, I missed that.

------
ohazi
> Importantly, there is no language transformation which rewrites one kind of
> operator (i.e. Equality or Ordering) in terms of a different kind of
> operator. The columns are strictly separate.

Why do we need a primary == operator if we have strong_ordering::equal,
weak_ordering::equivalent, and partial_ordering::equivalent? Couldn't the
behavior of == and != be inferred from the <=> definition?

I guess I'm asking why a == b can't be rewritten as (a <=> b) == 0.

~~~
username90
> Why do we need a primary == operator if we have strong_ordering::equal,
> weak_ordering::equivalent

Weak orderings are defined as orderings that can't necessarily distinguish
between all elements, which is why it is called "equivalent" instead of
"equal".

~~~
ohazi
> Weak orderings are defined as orders that can't distinguish between all
> members

I get that. This wasn't my question.

I want to know why I _need_ to define operator== when reasonable behavior can
be inferred from whether or not operator<=> returns strong_ordering::equal (or
weak_ordering::equivalent -- the standards committee can decide if this is
reasonable -- I don't really care). If I want special behavior, then sure,
defining operator== might make sense, and then it should obviously take
precedence.

But if the whole point of this new three way compare is to reduce the
combinatorial explosion of things that you need to define, I don't understand
the need to split the universe into {==, !=} and {<=>, <, <=, >, >=}, and
never let them interact with each other.

The part I take issue with is the statement "The columns are strictly
separate."

Maybe I've missed something, but I don't see why it needs to be like this.
operator<=> seems to be a strict superset of operator==, because it returns
information about when equality holds, (or when equivalence holds). Shouldn't
that be sufficient? Also, what's going to break if you design a type where
a.operator==(b) doesn't return true when a.operator<=>(b) returns
strong_ordering::equal, or vice-versa?

Again, maybe I've missed something, but this seems like a mistake. They're
removing all of the footguns except this last one. Why?

~~~
jcelerier
> Again, maybe I've missed something, but this seems like a mistake. They're
> removing all of the footguns except this last one. Why?

originally they weren't separated, but the risk of bad performance due to
that was too high. The complete rationale is described here:
[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1185r2.html](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1185r2.html)
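The gist of the p1185 argument, sketched in Python with illustrative helper
names: for strings, equality can short-circuit on length, but a three-way
compare cannot.

```python
def three_way(a, b):
    """Lexicographic a <=> b, returning (sign, element comparisons made)."""
    steps = 0
    for x, y in zip(a, b):
        steps += 1
        if x != y:
            return (-1 if x < y else 1), steps
    return (len(a) > len(b)) - (len(a) < len(b)), steps

def fast_eq(a, b):
    """Equality can bail out on length without looking at any element."""
    if len(a) != len(b):
        return False, 0
    steps = 0
    for x, y in zip(a, b):
        steps += 1
        if x != y:
            return False, steps
    return True, steps

s, t = "a" * 1000, "a" * 1000 + "b"
sign, walked = three_way(s, t)
assert sign == -1 and walked == 1000   # had to scan the whole common prefix
eq, walked = fast_eq(s, t)
assert eq is False and walked == 0     # the length check short-circuits
```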

~~~
ohazi
Thank you! This answers my original question. I guess all things considered,
it's still not too bad. Two functions is better than 6+, and it looks like
they'll be a lot harder to get wrong.

~~~
jcelerier
> and it looks like they'll be a lot harder to get wrong.

especially considering you can ` = default;` them, which should be enough _a
lot_ of times.

------
iso-8859-1
It's funny how Concepts are not mentioned at all. Given that they are the C++
equivalent of type classes and a common example is "EqualityComparable", I
can't believe there isn't some interplay. Concepts are also in C++20.

------
debatem1
I've thought for years that C++ was a sprawling mess of a language with a good
and useful subset buried in there somewhere. I'm giving up with C++20. It's
time to acknowledge that the language is too big and its virtues too small.

~~~
Negitivefrags
This non-sequitur of a reply is getting pretty tired on C++ articles.

Perhaps you could explain how this feature is a sprawling mess exactly?

In my opinion, not only is the feature easy to understand for beginners, but
it also :

1) Simplifies code - less code to write means less code to have bugs

2) Prevents people making mistakes - incompatibly defined comparison operators
wont happen with <=>

3) Gives a speed boost - in some algorithms it's faster to do a<=>b rather
than a < b and b > a.

You could find a deep dive article just as long about the minute details of
comparison for practically any language.

~~~
amelius
Well, if you really want to be convinced how hairy C++ has become, just have a
look at the implementation of the Boost libraries.

~~~
cmrdporcupine
With every revision of the C++ standard Boost becomes both less hairy and less
necessary.

The direction of the language is towards greater cleanliness and consistency.

~~~
amelius
Yes, but it shows that the problem still exists.

------
mehrdadn
The terminology is a little confusing. By "strong" total ordering do they
really mean "strict" total ordering? If not, what's the difference?

~~~
giomasce
No. The terminology is different because it refers to a different concept. In
C++20 "strong" vs "weak" refers to substitutability. When two elements x and y
compare to strong_ordering::equal, then it is assumed that f(x) == f(y). This
is not assumed if they are just weak_ordering::equivalent. Read section "A new
ordering primitive: <=>" of the article for a longer explanation.

~~~
mehrdadn
I did read that section... that's why I said it's confusing...

It seems to me they're using different terminology for concepts that _aren't_
different in math? In math, "equality" (=) _already means_ the same thing as
substitutability. A non-substitutable equality is already called "equivalence"
(≡).

So in math we already have these:

[1a] Non-strict partial order: a binary ordering that allows incomparability
(≼, ≡, ≽, ?)

[1b] Strict partial order: like [1a], but irreflexive (≺, ≡, ≻, ?)

[2a] Non-strict weak order: like [1a], but with all elements comparable (≼, ≡,
≽)

[2b] Strict weak order: like [1b], but with all elements comparable (≺, ≡, ≻)

[3a] Non-strict total order: like [2a], where the equivalence is equality (≤,
=, ≥)

[3b] Strict total order: like [2b], where the equivalence is equality (<, =,
>)

If I didn't make a mistake above, then:

- C++'s "strong total ordering" seems to be what in math we call "strict
total ordering"

- Their "weak total ordering" seems to be what in math we call "strict weak
ordering"

- Partial order is the same thing for both

Am I wrong here? If yes, how? If not, why did they randomly invent their own
terminology? I haven't seen their definitions used elsewhere.

~~~
Sniffnoy
Huh, this is... not the terminology I am used to, as a mathematician?

My experience is that what C++ is calling a "weak order", and what you are
calling a "weak order", is what in math is called a "pre-order" or a "quasi-
order".

Meanwhile, the distinction you are making between strict and nonstrict is just
ignored (except by constructivists), since they're equivalent ways of talking
about the same thing. Indeed I'm not sure why you included both because, well,
they're just two ways of talking about the same thing (again, unless you're a
constructivist).

Really (assuming classical logic) there's just 4 possibilities here:

1. Total order (what they're calling a "strong ordering")

2. Total preorder (what they're calling a "weak ordering")

3. Partial order (which they are also calling a "partial ordering")

4. Partial preorder (which they don't account for)

~~~
mehrdadn
You're agreeing with me?

"Preorder" is just a (better-known?) synonym for "non-strict weak order" [1]
[2], so we don't disagree there.

"Strict" vs. "non-strict" I merely included because C++ comparisons return
strict orders. I thought it was worth including, but feel free to ignore it.

So just as you pointed out in your own list, and just as I've been saying,
"strong total order" isn't the terminology (every total order _is_ strong),
and neither is "weak total order" (it means "total preorder" which is... just
a total order). Like you said, they should say "total" order, "preorder" (or
"weak" order as I said), "partial" order. The "strong" and "weak total" stuff
is just something they seem to have invented in contradiction with the
established mathematical terminology for... no reason/gain?

[1]
[https://en.wikipedia.org/wiki/Weak_ordering#Total_preorders](https://en.wikipedia.org/wiki/Weak_ordering#Total_preorders)

[2]
[http://fitelson.org/roberts_measurement_theory.pdf#page=56](http://fitelson.org/roberts_measurement_theory.pdf#page=56)

~~~
Sniffnoy
Well, I disagree that the article is confusing on this point. They say quite
explicitly what they mean. I agree that the terminology is _annoying_ ,
because they should just match the existing math terminology, and it's also
annoying that they didn't account for partial preorders.

I'm not sure whether I agree that the terminology is confusing; on the whole I
think it isn't. The reason it's not confusing is that it's sufficiently
_different_ from usual math terminology so as not to interfere -- i.e., the
terminology doesn't actually _disagree_ at any point, it's not incompatible.
Like, concepts get reinvented all the time and you just kind of have to get
used to things having multiple terms, and be ready to translate unusual
terminology into standard terminology, even as of course you should do what
you can to reduce this happening. As long as you don't end up in a situation
where one word means two different things, there's not really confusion, just
different terminology.

And, as I said above, I definitely disagree that the _article_ is confusing on
this point, because they're very explicit about what they mean.

~~~
mehrdadn
I said the terminology is confusing, not the article...

> the terminology doesn't actually disagree at any point, it's not
> incompatible

It _does_ disagree though? I thought I just explained this... I'll do it
again. "Total orders" are by their very definition in math always 'strong' --
their entire _point_ is to use equality as the equivalence relation. That's
what distinguishes them from weak orders, which can have other equivalence
classes. So a "weak _total_ order" makes no sense -- if a weak order is total,
it's by definition the same as a 'strong' total order. That's in direct
contradiction with their terminology.

~~~
Sniffnoy
I mean, that's like saying "A ring by definition is always associative, so
saying 'a ring without associativity' is contradictory". Neither ordinary
language nor technical terminology work that way; not every phrase is
compositional in such a straightforward way, including in mathematics. I mean,
a "fake X" is not an X.

In mathematics the word "weak" plays a similar role; e.g. in differential
equations a weak solution need not be a solution; it's easy to find more
examples. Similarly when one talks about a "non-Y X" or a "X without Y" where
ordinarily Y is part of the definition of X. The result is _not_ an X, but
it's still (usually) clear what's meant, and these are still the terms that
get used. Anyway, the point is, new terms someone is defining have to be
considered on the whole.

An actual incompatibility, like I said, would be giving an existing phrase a
new meaning. E.g. if they had referred to total preorders as partial orders,
_that_ would be an incompatibility.

~~~
mehrdadn
> An actual incompatibility, like I said, would be giving an existing phrase a
> new meaning. E.g. if they had referred to total preorders as partial orders,
> that would be an incompatibility.

This _is_ what they're doing!! They're calling pre-orders "weak total orders".
That's a direct incompatibility... "weak total order" _already_ means "total
order", just like "total weak order" already means "total pre-order" which
already means "total order". "Weak" and "total" are both adjectives, and
they're interchangeable. If you asked anyone (who unlike you had already heard
of the term "weak order" before) that's exactly how they would interpret it.
Nothing in it would be left open for interpretation since everything is
defined.

Whereas in your solution example, unlike here, the phrase "weak solution"
_didn't_ already mean something else, and there was no existing term for the
concept either. And unlike in "ring without associativity", they use the word
"total" in "weak total order" to mean nothing. They could've taken it out and
"weak order" would've meant _exactly_ what they meant.

If you want to make a comparison, what they're doing would be like using "a
nonnegative positive solution" to mean "a nonnegative solution" (rather than
"a positive nonnegative solution"). Or using "a fractional real number" to
mean "a fractional complex number" rather than a "real fractional number"
(i.e. real fraction). Which would be completely nuts.

~~~
Sniffnoy
> This is what they're doing!! They're calling pre-orders "weak total orders".

No. Giving one thing two names is not an incompatibility. Giving two things
one name is an incompatibility.

> "weak total order" already means "total order"

Does it? I've never heard it called that. I think anyone on hearing the term
"weak total order" would reasonably infer it refers to some notion weaker
than a total order.

> "Weak" and "total" are both adjectives, and they're interchangeable.

No. This is completely wrong. Terminology is very frequently _not_
compositional in such a simple way.

I mean, this works perfectly fine with words that convey _additional_
conditions. It does not work with words that convey a _removal_ of conditions,
which is what the word "weak" does!

Like, in the phrase "weak order" (meaning total preorder), the word "weak" is
not imposing the conditions of reflexivity, transitivity, etc. It is starting
from a baseline of "order" meaning "total order", and then _weakening_ this to
a preorder. The word "weak" _only_ weakens!

Anyone, on seeing the phrase "weak total order", assuming they know the
general usage of the word "weak", can reasonably infer that it refers to
something _weaker_ than a total order. (Likely a preorder.)

Yes, this makes "weak total order" something of a ridiculous phrase, given
existing terminology. But again, remember that "weak order" is starting from a
baseline where "order" means "total order"; in a sense, "weak order" is really
something of an abbreviation for "weak total order".

(...of course, all of this highlights the odd way that terminology for orders
themselves work. After all, "partial" is also a weakener. I haven't studied
the history, but I'd bet you that "order" originally meant "total order" (it's
still often used that way), then "partial order" came later as a weakener,
then the meaning of "order" shifted as the study of partial orders became more
common, then "total order" was coined as a retronym.)

> If you want to make a comparison, what they're doing would be like using "a
> nonnegative positive solution" to mean "a nonnegative solution" (rather than
> "a positive nonnegative solution"). Or using "an fractional real number" to
> mean "a fractional complex number" rather than a "real fractional number"
> (i.e. real fraction). Which would be completely nuts.

All of these are examples where you're using two terms that both _impose_
conditions, rather than removing them. The word "weak" simply does not work
that way.

~~~
mehrdadn
> All of these are examples where you're using two terms that both impose
> conditions, rather than removing them.

All of these? Really? "Fractional" and "real" are 'imposing' conditions on
'number' rather than removing them? Like you said, I'd bet you that 'number'
first meant 'natural number' rather than 'complex number'.

And before you tell me how deletion is also commutative just like addition—I
could debate you the merits of that in natural language too, but that's beside
my point in the last paragraph.

> Does it? I've never heard it called that.

Yes? That's precisely why I just gave you 2 links to people calling it exactly
that above!!
[https://en.wikipedia.org/wiki/Weak_ordering#Total_preorders](https://en.wikipedia.org/wiki/Weak_ordering#Total_preorders)
[http://fitelson.org/roberts_measurement_theory.pdf#page=56](http://fitelson.org/roberts_measurement_theory.pdf#page=56)

And I already asked you how you think a mathematician who unlike you _knows
'weak' already has a definition in this context_ would interpret 'weak total
order' to mean? and you just ignored me there too. Just like you ignored my
links way above where I showed you this is known terminology you're merely not
aware of. I don't know why you're not cooperating but I'm tired of continuing.

------
notmuchserious
A brand new language without backward compatibility would be nice. C+++

~~~
de_watcher
The whole point is to have backwards compatibility. It's just fascinating how
they add or fix things without breaking the code.

Doing without backwards compatibility is easy: take Rust or D.

~~~
Crinus
> It's just fascinating how they add or fix things without breaking the code.

In the same way it is fascinating to watch a forest fire from afar.

~~~
de_watcher
This hostility is tiring.

The ability to avoid rewriting is a practical thing.

------
twoodfin
How do the literal 0 comparisons fit into the language? Are they special
purpose syntax or is it possible to write your own operator that only
typechecks if the RHS is a literal 0?

~~~
wyldfire
Are you confusing the implementation of these operator overloads with their
signature? The spaceship returns {neg, 0, pos} and that's why the
implementations are like that.

~~~
twoodfin
I’m referring to this bit of the article:

 _The values of these comparison categories can be compared against the
literal 0 (not any int, not an int whose value is 0... just the literal) using
any of the six comparison operators..._

~~~
wyldfire
Oh, this is particularly confusing, I agree.

IIUC it's referring to how you can safely evaluate the return value from a
spaceship: by comparing against literal zero (not how you would overload a
comparison with literal zero).

------
kensai
I really hope a modern language can take over many features of C++ without its
added complexity.

Potential candidates: Rust, Julia, Swift.

All three are modern languages with modern concepts. I have high hopes
especially for Julia.

~~~
nevi-me
Please excuse my lack of knowledge, but is Julia gen-purp enough to replace
C++? Can I write a hypothetical kernel with it, or a game?

~~~
cshenton
Game? Sure, if you’re willing to spend a non trivial amount of time
integrating with a graphics API, but that’s true of most non C++ languages.
Kernel? Probably not, AFAIK there’s no subset of Julia that can run without an
OS (like for rust and C++).

------
giomasce
In case someone rushed to cppreference.com to compare, I _think_ that
[https://en.cppreference.com/w/cpp/utility/compare/strong_ord...](https://en.cppreference.com/w/cpp/utility/compare/strong_ordering)
is slightly wrong, in that it should not list "equivalent" as an alternative.
I reported that (putative) mistake.

------
wyldfire
An early c++20 fan gives a tribute to his favorite new operator [1] (indeed so
early it was only drafts back then).

[1] [https://youtu.be/ZA6ehndc6co](https://youtu.be/ZA6ehndc6co)

------
Demiurge
This is interesting... As non-C++ developer, I've never thought I needed this.
Is this an answer to a question anyone is asking?

~~~
de_watcher
C++ just has this under-the-hood access to be able to define any comparison
operator to do arbitrary things.

C++20 is now more aware that these things are used for comparison, and it
helps to make definitions shorter.

Still handles non-totally-ordered cases fine.

------
inlined
The default operators are very cool, but I assume that many codebases would
ban them.

Imagine, for example, an engineer decided to reorder members in a struct to
make it pack better. Now the semantics of default <=> have changed!

Also, as a minor nit, the optional number sample has a possible bug. I would
assume that two null optionals would compare equivalently. Or is that part of
how <=> should work? Should NaN <=> NaN == partial_ordering::unordered?

~~~
maxwellburson
While I don't know about this specific case, I do know in some languages, like
JavaScript, NaN poisons any operation it is part of. So any operation on it is
NaN, and comparisons to it are false.

~~~
magicalhippo
This is default IEEE 754 behavior.

[https://en.wikipedia.org/wiki/NaN](https://en.wikipedia.org/wiki/NaN)
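Python floats follow the same IEEE 754 rules, so the unordered behavior is
easy to demonstrate:

```python
nan = float("nan")

# Every ordering comparison involving NaN is False, including with itself:
assert not (nan < 1.0) and not (nan > 1.0)
assert not (nan == nan) and (nan != nan)
# This is the case C++20 models as partial_ordering::unordered.
```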

------
forrestthewoods
God this is so confusing. Why the $%&* does it take a 17-page 27-minute blog
post to explain how to implement comparison operators?

Can someone please write the 3-minute blog post that is easy to understand and
follow?

If the answer is "it's too complicated to explain in 3 minutes" then I think
that is a telling sign.

~~~
gumby
Here you go: less than 3 minutes to type out!

"The TL;DR is that there's a new operator, <=>, that returns less than 0, 0,
or greater than zero, just like, say, strcmp does, but for all sorts of C++
objects. You can define operator<=> and operator== and your compiler will take
care of the various other comparators. Doing this speeds up some library
functions, doesn't force you to write as much code, and makes comparisons more
intuitive.

"There are also a bunch of corner case optimizations but if you don't care
about them you don't need to know"

This was written by someone who cares about the corner cases and underlying
theory. It's like FP: IEEE 754 is full of weirdo cases that some really smart
people spent a lot of time worrying about. All most users care about is that
"they have a fractional representation and that == doesn't reliably work
except against 0.0." Just because there's hundreds of pages in that spec
doesn't make it dumb either.
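For a Python analogue of that TL;DR, the usual replacement for the removed
cmp() built-in is the sign trick:

```python
def spaceship(a, b):
    """Sign-based three-way compare: -1, 0, or 1, just like strcmp."""
    return (a > b) - (a < b)

assert spaceship(1, 2) == -1
assert spaceship("same", "same") == 0
assert spaceship(3.5, 2.5) == 1
```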

~~~
furyofantares
I don't think this is right. In order to implement <=> for a class, you have
to understand strong_ordering, weak_ordering, and partial_ordering in order to
pick one, right?

~~~
slavik81
Generally, no. Most of the time you're going to write these functions as
defaulted with an auto return type:

    
    
        auto operator<=>(const T&) const = default;
        bool operator==(const T&) const = default;
    

The compiler will implement the function by comparing all the class member
variables and will pick the type of the ordering as appropriate. You only need
to look into further details about ordering types if you want to do something
fancy.

------
voldacar
A new string type! How exciting!

~~~
2bitencryption
the article doesn't describe a new string type anywhere, only a hypothetical
one for the purposes of demonstrating the new operator

