Library users can force a compile error when new variants get added, using a lin...

WiSaGaN · 2025-03-03T03:34:29 1740972869

Does this require nightly? If so, #[warn(clippy::wildcard_enum_match_arm)] will do the same thing without needing nightly, though it comes from clippy rather than natively from rustc.
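
For reference, a minimal sketch of what the clippy route might look like on stable. The enum and function here are made up for illustration (they stand in for a library type), and the warning only shows up when running cargo clippy; plain rustc accepts the attribute silently.

```rust
// Hypothetical enum standing in for a type exported by a library.
#[derive(Debug, Clone, Copy)]
pub enum AnimalPen {
    Tigers,
    Elephants,
    Monkeys,
}

// With this attribute, `cargo clippy` warns about the `_` arm below,
// since it silently swallows `Monkeys` and anything added later.
#[warn(clippy::wildcard_enum_match_arm)]
pub fn food_for(pen: AnimalPen) -> &'static str {
    match pen {
        AnimalPen::Tigers => "raw meat",
        AnimalPen::Elephants => "peanuts",
        _ => "unknown",
    }
}
```

Using deny instead of warn would turn the lint into a hard error under clippy, which is closer to the "force a compile error" behaviour discussed above.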

codetrotter · 2025-03-03T03:56:09 1740974169

That's pretty neat. I still don't completely understand why #[non_exhaustive] is so desirable in the first place though.

Let's say I am using a crate called zoo-bar. Let's say this crate is not using non-exhaustive.

In my code where I use this crate I do:

  let my_workplace = zoo_bar::ZooBar::new();
  
  let mut animal_pens_iter = my_workplace.hungry_animals.iter();
  
  while let Some(ap) = animal_pens_iter.next() {
      match ap {
          zoo_bar::AnimalPen::Tigers => {
              me.go_feed_tigers(&mut raw_meat_that_tigers_like_stock).await?;
          }
          zoo_bar::AnimalPen::Elephants => {
              me.go_feed_elephants(&mut peanut_stock).await?;
          }
      }
  }

I update or upgrade the zoo-bar dependency and there's a new variant of the AnimalPen enum called Monkeys.

Great! I get a compile error and I update my code to feed the monkeys.

  diff --git a/src/main.rs b/src/main.rs
  index 202c10c..425d649 100644
  --- a/src/main.rs
  +++ b/src/main.rs
  @@ -10,5 +10,8 @@
             zoo_bar::AnimalPen::Elephants => {
                 me.go_feed_elephants(&mut peanut_stock).await?;
             }
  +          zoo_bar::AnimalPen::Monkeys => {
  +              me.go_feed_monkeys(&mut banana_stock).await?;
  +          }
         }
     }

Now let's say instead that the AnimalPen enum was marked non-exhaustive.

So I'm forced to have a default match arm. In this alternate universe I start off with:

  let my_workplace = zoo_bar::ZooBar::new();

  let mut animal_pens_iter = my_workplace.hungry_animals.iter();

  while let Some(ap) = animal_pens_iter.next() {
    match ap {
      zoo_bar::AnimalPen::Tigers => {
        me.go_feed_tigers(&mut raw_meat_that_tigers_like_stock).await?;
      }
      zoo_bar::AnimalPen::Elephants => {
        me.go_feed_elephants(&mut peanut_stock).await?;
      }
      _ => {
        eprintln!("Whoops! I sure hope someone notices this default match in the logs and goes and updates the code.");
      }
    }
  }

When the monkeys are added and I update or upgrade the dependency on zoo-bar, I don't notice the warning in the logs right away after we deploy to prod, because the logs contain far too much for anyone to read everything.

One week passes and then we have a monkey starving incident at work.

After careful review we realize that it was due to the default match arm: we forgot to update our program.

So we learn from the terrible catastrophe with the monkeys and I update my code using the attributes from your link.

  diff --git a/wp/src/main.rs b/wp/src/main.rs
  index e01fcd1..aab0112 100644
  --- a/wp/src/main.rs
  +++ b/wp/src/main.rs
  @@ -1,3 +1,5 @@
  +#![feature(non_exhaustive_omitted_patterns_lint)]
  +
   use std::error::Error;
   
   #[tokio::main]
  @@ -11,6 +13,7 @@ async fn main() -> anyhow::Result<()> {
     let mut animal_pens_iter = my_workplace.hungry_animals.iter();
   
     while let Some(ap) = animal_pens_iter.next() {
  +    #[warn(non_exhaustive_omitted_patterns)]
       match ap {
         zoo_bar::AnimalPen::Tigers => {
           me.go_feed_tigers(&mut raw_meat_that_tigers_like_stock).await?;
  @@ -18,8 +21,12 @@ async fn main() -> anyhow::Result<()> {
         zoo_bar::AnimalPen::Elephants => {
           me.go_feed_elephants(&mut peanut_stock).await?;
         }
  +      zoo_bar::AnimalPen::Monkeys => {
  +        // Our monkeys died before we started using proper attributes. If they are hungry it means they have turned into zombies :O
  +        me.alert_authorities_about_potential_outbreak_of_zombie_monkeys().await?;
  +      }
         _ => {
  -        eprintln!("Whoops! I sure hope someone notices this default match in the logs and goes and updates the code.");
  +        unreachable!("We have an attribute that is supposed to tell us if there were any unmatched new variants.");
         }
       }
     }

And next time we update or upgrade the crate version to latest, another new variant exists, but thanks to your tip we get a lint warning and we happily update our code so that we won't have more starving animals.

  diff --git a/wp/src/main.rs b/wp/src/main.rs
  index aab0112..4fc4041 100644
  --- a/wp/src/main.rs
  +++ b/wp/src/main.rs
  @@ -25,6 +25,9 @@ async fn main() -> anyhow::Result<()> {
           // Our monkeys died before we started using proper attributes. If they are hungry it means they have turned into zombies :O
           me.alert_authorities_about_potential_outbreak_of_zombie_monkeys().await?;
         }
  +      zoo_bar::AnimalPen::Capybaras => {
  +        me.go_feed_capybaras(&mut whatever_the_heck_capybaras_eat_stock).await?;
  +      }
         _ => {
           unreachable!("We have an attribute that is supposed to tell us if there were any unmatched new variants.");
         }

But what was the advantage of marking the enum as #[non_exhaustive] in the first place?

ffminus · 2025-03-03T05:22:50 1740979370

It lets you have a middle ground, with the decision of when breaking happens left up to library users. Without non_exhaustive, every consumer always gets the compile error from your first scenario. With non_exhaustive, individual zoos get to pick their own policy of when/if animals should starve.

Each option has its place; it depends on context. Does the creator of the type want/need strictness from all their consumers, or can this call be left up to each consumer to make? The lint puts strictness back on the table as an opt-in for individual users.

saagarjha · 2025-03-04T10:22:28 1741083748

Swift does this with @unknown default.

kelnos · 2025-03-03T05:34:27 1740980067

Consider a slightly different case. I run a service that exposes an API, and some fields in some response bodies are enums. I've published a Rust client for the API for my customers to use, and (among other things) it has something like this:

    #[derive(serde::Serialize, serde::Deserialize)]
    pub enum SomeEnum {
        AValue,
        BValue,
    }

My customers use that and all is well. But I want to add a new enum value, CValue. I can't require that all my customers update their version of my Rust client before I add it; that would be unreasonable.

So I add it, and what happens? Well, now whenever my customers make that API call, instead of getting some API object back, they get a deserialization error, because that enum's Deserialize impl doesn't know how to handle "CValue". Maybe some customer wasn't even using that field in the returned API object, but now I've broken their code.

Adding #[non_exhaustive] means I at least won't break my customers' code when I add a new enum value.
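
Worth noting: the attribute by itself only changes how downstream crates have to match on the enum. An old client still needs some catch-all when decoding, otherwise "CValue" keeps failing. A minimal std-only sketch of that idea follows; from_wire and the Unknown variant are hypothetical names, standing in for whatever the real Deserialize impl would do (serde users would probably look at something like #[serde(other)], subject to the restrictions in its docs).

```rust
// Hypothetical client-side enum. `Unknown` absorbs values the server
// started sending after this client version was published.
#[derive(Debug, Clone, PartialEq, Eq)]
#[non_exhaustive]
pub enum SomeEnum {
    AValue,
    BValue,
    Unknown(String),
}

impl SomeEnum {
    // Hand-rolled stand-in for a Deserialize impl: it never fails on
    // values it does not recognise, it just keeps the raw string around.
    pub fn from_wire(s: &str) -> Self {
        match s {
            "AValue" => SomeEnum::AValue,
            "BValue" => SomeEnum::BValue,
            other => SomeEnum::Unknown(other.to_owned()),
        }
    }
}
```

With this shape, a customer who never looks at the field is unaffected by CValue, and one who does can decide for themselves how to treat Unknown.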

sophacles · 2025-03-03T16:23:17 1741018997

It's really nice when doing networking protocols and other binary formats. Lots of things are defined as "This byte signifies X : 0 == Undefined, 1 == A, 2 == B, 3 == C, 4-127 == reserved for future use, 128-255 vendor specific options".

This allows you to do something like:

    #[derive(Clone, Copy)]
    #[repr(u8)]
    #[non_exhaustive]
    pub enum Foo {
        A = 1,
        B,
        C,
    }
    
    impl Foo {
        pub fn from_byte(val: u8) -> Self {
            unsafe { std::mem::transmute(val) }
        }
    
        pub fn from_byte_ref(val: &u8) -> &Self {
            unsafe { std::mem::transmute(val) }
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn conversion_copy() {
            let n: u8 = 1;
            let y = Foo::from_byte(n);
            assert!(matches!(y, Foo::A));
    
            let n: u8 = 4;
            let y = Foo::from_byte(n);
            assert!(!matches!(y, Foo::A) && !matches!(y, Foo::B) && !matches!(y, Foo::C));
            let n2 = y as u8;
            assert_eq!(n2, 4);
        }
    
        #[test]
        fn conversion_ref() {
            let n: u8 = 1;
            let y = Foo::from_byte_ref(&n);
            assert!(matches!(*y, Foo::A));
    
            let n: u8 = 4;
            let y = Foo::from_byte_ref(&n);
            assert!(!matches!(*y, Foo::A) && !matches!(*y, Foo::B) && !matches!(*y, Foo::C));
            let n2 = (*y) as u8;
            assert_eq!(n2, 4);
        }
    }

This gives you simple, fast parsing of types without needing a bunch of logic - particularly in the ref example. Someone else sent you data over the wire and is using a vendor defined value, or a newer version of the protocol that defines Foo::D? No big deal, you can ignore it or error, or whatever else is appropriate for your case.

If you want to define Reserved and Vendor as enum variants, you now need logic that runs on every conversion - and if you want to preserve the original value for error messages, logs, etc. - you can no longer rely on a one-byte #[repr(u8)] representation, so you take up more memory, have to do copies, etc.

    #[non_exhaustive]
    pub enum Foo {
        // No explicit discriminants here: they require a #[repr(inttype)]
        // once any variant carries data.
        Undefined,
        A,
        B,
        C,
        Reserved(u8),
        Vendor(u8),
    }
    
    impl Foo {
        pub fn from_byte(val: u8) -> Self {
            match val {
                0 => Foo::Undefined,
                1 => Foo::A,
                2 => Foo::B,
                3 => Foo::C,
                4..=127 => Foo::Reserved(val),
                128.. => Foo::Vendor(val),
            }
        }
    }

You also need logic to convert back to a u8 now too.
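
Roughly, the reverse direction would look something like the sketch below. to_byte is a name made up for this illustration, and the enum is repeated so the snippet stands on its own.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[non_exhaustive]
pub enum Foo {
    Undefined,
    A,
    B,
    C,
    Reserved(u8),
    Vendor(u8),
}

impl Foo {
    // Same mapping as in the comment above: every byte lands somewhere.
    pub fn from_byte(val: u8) -> Self {
        match val {
            0 => Foo::Undefined,
            1 => Foo::A,
            2 => Foo::B,
            3 => Foo::C,
            4..=127 => Foo::Reserved(val),
            128.. => Foo::Vendor(val),
        }
    }

    // Hypothetical inverse: named variants map to their wire value,
    // data-carrying variants hand back the byte they preserved.
    pub fn to_byte(self) -> u8 {
        match self {
            Foo::Undefined => 0,
            Foo::A => 1,
            Foo::B => 2,
            Foo::C => 3,
            Foo::Reserved(v) | Foo::Vendor(v) => v,
        }
    }
}
```

Every byte round-trips through from_byte and to_byte, but both directions are now a match rather than a free reinterpretation.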

It's not strictly necessary, but it certainly makes some things far more ergonomic.

sophacles · 2025-03-03T21:56:38 1741038998

Looking at my code that works on this stuff - the above is just wrong. I was looking at my failed experimental branch not the actual code that does this. The above is a fun way to introduce all sorts of UB.

Apologies for my pre-coffee brainfarts.

codetrotter · 2025-03-04T01:59:16 1741053556

How does the working code look?
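
The working code isn't shown in the thread. One sound way to get a similar effect, offered only as a sketch and not as sophacles's actual code, is a transparent newtype over u8 with associated constants: every byte is a valid value, so no transmute is needed. The describe function and the is_reserved/is_vendor helpers are hypothetical names for illustration.

```rust
// Every u8 is a valid Foo, so construction needs no `unsafe`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(transparent)]
pub struct Foo(pub u8);

impl Foo {
    pub const UNDEFINED: Foo = Foo(0);
    pub const A: Foo = Foo(1);
    pub const B: Foo = Foo(2);
    pub const C: Foo = Foo(3);

    pub fn from_byte(val: u8) -> Self {
        Foo(val)
    }

    // 4-127 are reserved for future protocol versions.
    pub fn is_reserved(self) -> bool {
        (4..=127).contains(&self.0)
    }

    // 128-255 are vendor specific.
    pub fn is_vendor(self) -> bool {
        self.0 >= 128
    }
}

// Associated constants can be used as match patterns because the type
// derives PartialEq and Eq. The fallback arms play the role that `_`
// plays for a #[non_exhaustive] enum.
pub fn describe(f: Foo) -> &'static str {
    match f {
        Foo::UNDEFINED => "undefined",
        Foo::A => "A",
        Foo::B => "B",
        Foo::C => "C",
        other if other.is_vendor() => "vendor",
        _ => "reserved",
    }
}
```

The original byte is preserved for logs and error messages via the public field, at the cost of losing compiler-checked exhaustiveness over the named values.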