
Tagged unions, or ‘variants’, in C - Procedural
https://gist.github.com/procedural/2e3c212af4f61dde313af1123ddfbbc9
======
klodolph
No, no, no! This is terrible!

variant_cast is just garbage. It relies on struct layout, and if anything has
wider alignment than ptrdiff_t, it's broken. For example, a double in 32-bit
PowerPC has 64-bit alignment but ptrdiff_t is 32 bits, so the pointer is
wrong. Maybe on x64 you get away with it for most types (not all!), but can you
raise your right hand and solemnly swear that you'll never ever want to port
to a non-x64 system?
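
A minimal sketch of the difference, using made-up struct names (these are not from the gist). It compares where the compiler actually places the payload with where `tag address + sizeof(ptrdiff_t)` would point. On a target where double is 8-byte aligned and ptrdiff_t is 4 bytes, I'd expect the two offsets to be 8 and 4; on x64 both should come out as 8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload whose alignment may be wider than ptrdiff_t's. */
struct WidePayload {
  double d;
};

/* Same shape as the gist's variants: a tag followed by the payload. */
struct WideVariant {
  ptrdiff_t tag;
  struct WidePayload payload;
};

/* Where the compiler actually placed the payload (padding included). */
static size_t real_payload_offset(void) {
  return offsetof(struct WideVariant, payload);
}

/* Where a "skip sizeof(ptrdiff_t) bytes" cast assumes the payload lives. */
static size_t assumed_payload_offset(void) {
  return sizeof(ptrdiff_t);
}

/* Layout-independent way to reach the payload from a variant pointer. */
static struct WidePayload *payload_of(struct WideVariant *v) {
  void *p = (char *)v + offsetof(struct WideVariant, payload);
  assert(p == (void *)&v->payload); /* holds by definition of offsetof */
  return p;
}
```

The two offset functions only agree when the payload's alignment is no wider than the tag's, which is exactly the assumption being criticized.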

Additionally, all the explicit casting means that this has basically no type
safety at all. I can set a.tag = tag(SomeThree) even though SomeThree is not a
member of a's type, and then I will be able to successfully let b = as(&a,
SomeThree) and boom! I've just scribbled over memory without so much as a peep
from the compiler. What is the point of having tagged unions (a TYPE SAFETY FEATURE)
if your tagged union implementation is LESS type safe than untagged unions?

This last point is actually a bit subtle… accessing the wrong member of a
union is probably not what you want to do most of the time, but the language
in the C standard is quite clear (see DR #257, language was clarified in C99
but present in earlier versions) and accessing the wrong union member just
gives you whatever you stored in a union reinterpreted as whatever you read
out of it. But this tagged union implementation lets you access a union member
which doesn't even exist.
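
For illustration, a small sketch of what "reinterpreted" means for a plain untagged union. The union and function names are hypothetical, and the bit pattern mentioned below assumes a 32-bit IEEE-754 float, which is an assumption about the target rather than something C guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical untagged union, for illustration only. */
union float_or_bits {
  float f;
  uint32_t u;
};

/* Store through one member, read back through the other. */
static uint32_t float_bits(float value) {
  union float_or_bits x;
  assert(sizeof x.f == sizeof x.u); /* assumption: float is 32 bits wide */
  x.f = value;
  return x.u; /* the bytes of the float, reinterpreted as uint32_t */
}
```

Under that assumption, `float_bits(1.0f)` yields `0x3f800000`: both members really exist and overlap, so the read stays inside the union's own storage.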

Let's not try to implement new language features on top of C with macros. It
never ends well. Either figure out a way to live with a little boilerplate or
use a different language, those are the only sane options.

~~~
huhtenberg
I think it's an elegant piece of garbage actually.

All your nitpicks are fairly easy to address.

    
    
      +  union zee_tag {
      +      ptrdiff_t    tag;
      +      max_align_t  foo;
      +  };
      + 
         struct Some {
      -      ptrdiff_t tag;
      +      union zee_tag tag;
    

> _I can a.tag = tag(SomeThree)_

tag() just needs to take 'a' as another parameter and do a static_assert.

> _It relies on struct layout_

As in "ptrdiff_t needs to be the first member in a struct"? Oh, the horror.
It's C, a language that generally expects you to know what you are doing.

> _all the explicit casting means that this has basically no type safety at
> all_

Duh. See above :)

Generally speaking, you are missing a point here. This sort of contraption
comes in handy when the code needs to add a _bit_ more invariant checking, but
without going over the top. Compared with a plain union, this will help
catch misuse. Granted, it needs to be used correctly itself.

~~~
klodolph
> As in "ptrdiff_t needs to be the first member in a struct"? Oh, the horror.
> It's C, a language that generally expects you to know what you are doing.

Speaking of garbage, that's some really needless posturing. The comment was
about the position of the second element in the structure, I suppose that this
wasn't spelled out in the comment but everyone here knows that the first
member of a struct is at the beginning.

God forbid you would store an __m128i here. Although that's generally an awful
idea, it's _not_ unreasonable to want to force 16-byte alignment or the like.

> tag() just needs to take 'a' as another parameter and do a static_assert.

Show me. I'm honestly curious what it would look like.

> Generally speaking, you are missing a point here. This sort of contraption
> comes in handy when the code needs to add a _bit_ more invariant checking,
> but without going over the top. Compared with a plain union, this will
> help catch misuse. Granted, it needs to be used correctly itself.

Show me a version that adds invariant checking.

~~~
huhtenberg
> _The comment was about the position of the second element in the structure,
> I suppose that this wasn't spelled out in the comment but everyone here
> knows that the first member of a struct is at the beginning._

Wit beyond measure, fascinating.

    
    
         union zee_tag {
           ptrdiff_t    tag;
           max_align_t  foo;
         };
    
         static inline void * variant_cast(void * variant_ptr, ptrdiff_t desired_tag) {
           ptrdiff_t * variant_tag = (ptrdiff_t *)variant_ptr;
           assert(*variant_tag == desired_tag);
        -  return (void *)((char *)variant_ptr + sizeof(ptrdiff_t));
        +  return (void *)((char *)variant_ptr + sizeof(union zee_tag));
         }
    

Shall I sprinkle some comments or is it OK this way?

> _Show me._
    
    
        @@ -15,6 +15,9 @@
         #define is(x, T) ((x)->tag == tag(T))
         #define as(x, T) ((struct T *)variant_cast((void *)(x), tag(T)))
        
        +#define union_field(T) struct T _##T
        +#define tag_union(x, T) (x)->tag = ((x)->_##T, (ptrdiff_t)&Tag##T)
        +
         struct SomeOne {
           int x;
           int y;
        @@ -25,17 +28,24 @@
           float w;
         } TagSomeTwo;
        
        +struct SomeThree {
        +  float zz;
        +  float ww;
        +} TagSomeThree;
        +
         struct Some {
           ptrdiff_t tag;
           union {
        -    struct SomeOne _;
        -    struct SomeTwo __;
        +    union_field( SomeOne );
        +    union_field( SomeTwo );
           };
         };
        
         int main() {
           struct Some a = {};
        -  a.tag = tag(SomeTwo);
        +
        +  tag_union(&a, SomeTwo);
        +  tag_union(&a, SomeThree);
        
           printf("Is `a` tagged as SomeTwo: %d\n", is(&a, SomeTwo));
    

> _Show me a version that adds invariant checking._

An exercise left to the reader.

~~~
klodolph
> Wit beyond measure, fascinating.

I'm flattered that you want to talk about me, but I'd rather talk about the
code.

> An exercise left to the reader.

That's really the only interesting part, isn't it? All the other stuff is just
minor bugfixes / bikeshedding. I said "show me" the version of tag() that does
the static_assert, and "show me" a version that adds invariant checking (that
the cast is possible). Neither of these things has been shown. I'm not
claiming it's impossible, just that subjectively, this version is worse than
doing things the boring way.

~~~
huhtenberg
If you don't understand that tag_union(&a, SomeThree) throwing a compile time
error is functionally equivalent to static_assert(), then I see no point in
continuing this exchange. Ditto for the invariant checking - OP's original
code already covered that with its assert(). In fact, that assert() is what
this whole thing is about.
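
As a rough, self-contained sketch of that trick (a variation on the diff above, not the exact code): the left operand of the comma operator names the union member, so the macro only compiles when the member exists. The `as_` member prefix is my own choice here, and `make_some_two` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* One tag object per alternative; its address serves as the tag value. */
struct SomeOne { int x; int y; } TagSomeOne;
struct SomeTwo { float z; float w; } TagSomeTwo;
struct SomeThree { float zz; float ww; } TagSomeThree;

#define tag(T) ((ptrdiff_t)&Tag##T)
#define is(x, T) ((x)->tag == tag(T))

/* Member names are derived from the type name. */
#define union_field(T) struct T as_##T

/* The discarded left operand mentions the member as_<T>; if the variant
   has no such member, this fails to compile. */
#define tag_union(x, T) ((x)->tag = ((void)(x)->as_##T, tag(T)))

struct Some {
  ptrdiff_t tag;
  union { /* anonymous union, C11 */
    union_field(SomeOne);
    union_field(SomeTwo);
  };
};

static struct Some make_some_two(void) {
  struct Some a = {0};
  tag_union(&a, SomeTwo);        /* fine: SomeTwo is an alternative of Some */
  /* tag_union(&a, SomeThree);      rejected: Some has no member as_SomeThree */
  assert(is(&a, SomeTwo));
  return a;
}
```

The compiler's diagnostic is a "no member named ..." error rather than a custom message, which is the main practical difference from a real `_Static_assert`.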

All points in your opening "garbage" comment were trivial nitpicks, which
would've been fine if you hadn't decided to express them by trashing another
person's code in an overly confident manner. I don't appreciate that, just as
I don't appreciate you trying to wiggle out of this exchange on
technicalities.

~~~
klodolph
Yes, you're right, tag_union does throw a compile time error, I didn't read
that correctly. By your earlier comment I was primed for a `_Static_assert`,
and when I didn't see that, I thought it had been left out entirely rather
than reimplemented in a different manner.

On the subject of tone--I honestly believe that the original code has severe
subjective design flaws and objective implementation errors, and since the
post was gathering upvotes but no critique I was worried that people believed
that the code was correct. There are still some nitpicks about your new
version, but these are fairly minor in my view (the use of reserved
identifiers is pretty tedious and easily avoided). The remaining non-nitpick,
what I consider to be a show-stopper, is that you can't switch on the tag. 90%
of the time when I'm using a tagged union, I'm doing switch (evt->type). While
we can substitute a chain of if/else, I'm not fond of the ergonomics and I
would prefer a traditional enum. This is a subjective trade-off, however.
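
For comparison, a minimal sketch of the boring enum-based version. The event types and the `event_code` function are invented for illustration and don't come from the gist.

```c
#include <assert.h>

/* Hypothetical event kinds; the enum is the tag. */
enum event_type { EVENT_KEY, EVENT_MOUSE };

struct event {
  enum event_type type;
  union {
    struct { int keycode; } key;
    struct { int x, y; } mouse;
  } u;
};

/* Dispatch on the tag with an ordinary switch. */
static int event_code(const struct event *evt) {
  switch (evt->type) {
  case EVENT_KEY:
    return evt->u.key.keycode;
  case EVENT_MOUSE:
    return evt->u.mouse.x + evt->u.mouse.y;
  }
  assert(!"unknown event type"); /* only reached if the tag is corrupt */
  return -1;
}
```

Because the tag is an enum, GCC and Clang can warn (via `-Wswitch`, part of `-Wall`) when a switch without a default forgets an enumerator, which a tag built from object addresses can't offer.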

Your responses would have been fine if they were criticisms of my argument or
of the tone I took. Perhaps I crossed the line, and you called me on it? But
these comments are unacceptable ad hominems:

> Oh, the horror. It's C, a language that generally expects you to know what
> you are doing.

> Wit beyond measure, fascinating.

------
k__
Is there a reason, other than jargon, why this concept is called different
names in different languages?

Do variants work differently from tagged unions or type classes?

Some people told me that Haskell is superior in FP because of its type
classes. Later I read TypeScript has tagged unions, which looked like type
classes to me.

Then I looked into Reason and they wrote about variants and I also was
reminded of type classes.

~~~
greydius
Type classes are a completely different concept. Tagged unions are sum types.

~~~
k__
I see.

Care to explain what type classes are then? :)

~~~
comex
They’re known in other languages as interfaces, traits, or protocols: just a
set of methods which can be implemented for different types.

Importantly, it’s possible to implement type classes for existing types in a
separate declaration, as opposed to traditional OO languages where a list of
implemented interfaces has to be declared upfront as part of the class
declaration. Thus, for instance, a client of a library can declare its own
type class and implement it for the library’s types. (Many languages other
than Haskell support this too, though.)
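
To tie this back to the thread's C context, here is a loose analogy only: a "class" as a struct of function pointers, with instances written separately from the types they cover. All names are hypothetical, and unlike Haskell the instance is passed by hand instead of being picked by the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The "class": a set of methods, here just one. */
struct show_class {
  int (*show)(const void *self, char *buf, size_t n);
};

/* Instance for int, an existing type we did not define. */
static int show_int(const void *self, char *buf, size_t n) {
  return snprintf(buf, n, "%d", *(const int *)self);
}
static const struct show_class show_for_int = { show_int };

/* Stand-in for a type that comes from someone else's library. */
struct point { int x, y; };

/* Instance for struct point, added without touching its declaration. */
static int show_point(const void *self, char *buf, size_t n) {
  const struct point *p = self;
  return snprintf(buf, n, "(%d, %d)", p->x, p->y);
}
static const struct show_class show_for_point = { show_point };

/* Generic code: works for any type that has an instance. */
static int show_value(const struct show_class *cls, const void *value,
                      char *buf, size_t n) {
  assert(cls != NULL && cls->show != NULL);
  return cls->show(value, buf, n);
}
```

The instances live apart from `int` and `struct point`, which is the "separate declaration" property described above; a tagged union, by contrast, fixes its list of alternatives in one place.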

~~~
k__
So they're a bit like open classes in Ruby or implicit conversions in Scala?

~~~
jnbiche
If you're familiar with Scala, then type classes in Haskell are pretty close
to traits in Scala (but _not_ implicit conversions, which can make use of
traits, but are an orthogonal language feature). They're also close to Rust's
traits, if you're familiar with Rust.

~~~
k__
I see. Thanks.

------
valbaca
Any drawbacks of using these #defines? It looks awesome

    
    
        #define var __auto_type
        #define let __auto_type const

