1. Mismatched type definitions in different translation units caused an implicit 32 to 16 bit conversion.
2. An addition to the 16-bit value overflowed, causing it to become negative. (The “run for 10 days” thing helped the value get to the point of overflowing.)
3. The negative value led to an out-of-bounds write, corrupting a key bookkeeping data structure.
Rust would have failed to compile at step 1, and would have panicked at step 3 (and potentially at step 2, depending on compilation settings). I’m not sure what other safe languages are candidates for this domain, but I suspect they would catch these issues in a similar way.
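To make the three steps concrete, here is a minimal sketch of how each one looks in Rust. The function names and the `table` slice are hypothetical, not taken from the Office code. It uses the checked variants so the failures come back as values; plain `+` and plain indexing would panic instead, as described above.

```rust
use std::convert::TryFrom;

/// Step 1: narrowing has to be spelled out. `let n: i16 = wide;` is a
/// compile error (mismatched types); `try_from` rejects values that do not fit.
fn narrow(wide: i32) -> Option<i16> {
    i16::try_from(wide).ok()
}

/// Step 2: `checked_add` returns None on overflow in every build profile.
/// A plain `+` panics in debug builds and wraps in release builds unless
/// overflow checks are turned on.
fn bump(counter: i16, delta: i16) -> Option<i16> {
    counter.checked_add(delta)
}

/// Step 3: slices are indexed by usize, so a negative value must be
/// converted first, and an out-of-range index is caught by the bounds check.
/// Returns true only if the write actually happened.
fn store(table: &mut [u8], index: i16, value: u8) -> bool {
    let i = match usize::try_from(index) {
        Ok(i) => i,
        Err(_) => return false, // negative index
    };
    match table.get_mut(i) {
        Some(slot) => {
            *slot = value;
            true
        }
        None => false, // past the end of the table
    }
}
```

Writing `table[i] = value` instead of `get_mut` gives the panic behaviour: the process stops at the bad index rather than silently corrupting a neighbouring structure.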
That bug, unfortunately, was identified after release. The release-blocking heap corruption bug I mentioned was caused by a C++ object that was deleted while it was calling back to an event handler; it then wrote to its member variables after the handler returned. Ownership and lifetimes would have prevented this design error (which is, surprisingly, not that uncommon).
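A rough sketch of the same shape in Rust, with hypothetical names (`Widget`, `Action`, `deliver`) standing in for the real object and handler. The assumption here is that the handler reports what it wants as a return value, and the owner performs the destruction once the callback has finished.

```rust
/// What the handler asks the owner to do once the event has been delivered.
#[derive(Debug, PartialEq)]
enum Action {
    Keep,
    Close,
}

/// Hypothetical stand-in for the C++ object that fires events.
struct Widget {
    events_fired: u32,
}

impl Widget {
    /// Calls the handler, then writes to a member, as the C++ code did.
    /// `&mut self` is held for the whole call, so the handler cannot
    /// destroy the widget before the write below.
    fn fire<F: FnOnce() -> Action>(&mut self, handler: F) -> Action {
        let action = handler();
        self.events_fired += 1;
        action
    }
}

/// The owner drops the widget only after `fire` has returned.
fn deliver(slot: &mut Option<Widget>) -> Option<u32> {
    let (action, fired) = match slot.as_mut() {
        Some(widget) => {
            // A handler such as `|| { *slot = None; Action::Close }` is
            // rejected by the borrow checker: `slot` is already mutably
            // borrowed through `widget` for the duration of this call.
            let action = widget.fire(|| Action::Close);
            (action, widget.events_fired)
        }
        None => return None,
    };
    if action == Action::Close {
        *slot = None; // destruction happens here, after the last write
    }
    Some(fired)
}
```

The C++ design compiles either way; in Rust the version where the handler frees the object mid-call does not get past the compiler, which forces a design like the one above.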
> Mismatched type definitions in different translation units caused an implicit 32 to 16 bit conversion.
Rust would not have failed to compile at step one necessarily. Or were you assuming a full revamping of the build and dependency management systems in addition to the rewrite of the program itself?
If we need to clean up the build and dependency management system to start using Rust correctly, isn't that two major migrations?
> Rust would not have failed to compile at step one necessarily.
True. In this case there was old code and new code. I assumed that in a hypothetical hybrid Rust codebase, bindgen would be used to bring the types from the old code into Rust, so the compiler would flag the type mismatch, perhaps in an impl From.
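As a sketch of what that could look like: `LegacyEntry` below is a hand-written stand-in for the kind of struct bindgen might generate from the old header, and `Entry` is the new code's view of the same data. Both names and the `offset` field are made up for illustration, and this only helps if both definitions are actually visible to the Rust compiler.

```rust
use std::convert::TryFrom;

/// Stand-in for a binding generated from the old C++ header
/// (hypothetical name and field).
#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct LegacyEntry {
    pub offset: i32,
}

/// Stand-in for the new code's type, with the narrower field.
#[derive(Debug, PartialEq)]
pub struct Entry {
    pub offset: i16,
}

// An `impl From<LegacyEntry> for Entry` that writes `offset: old.offset`
// does not compile (expected `i16`, found `i32`), so the author is pushed
// toward a fallible conversion like this one.
impl TryFrom<LegacyEntry> for Entry {
    type Error = std::num::TryFromIntError;

    fn try_from(old: LegacyEntry) -> Result<Self, Self::Error> {
        Ok(Entry {
            offset: i16::try_from(old.offset)?,
        })
    }
}
```

An `as i16` cast would still compile and truncate silently, so the mismatch is surfaced rather than made impossible; the point is that the narrowing has to be written down by someone.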
> If we need to clean up the build and dependency management system to start using Rust correctly, isn't that two major migrations?
At the scale of Office, managing the build system is an evergreen project staffed by dozens of engineers. Adding support for Rust integration is a typical deliverable for such an org.
> My point was that data can be corrupted due to logic bugs as well
For sure. But in the 10+ years I worked on Office, I spent far more time per UB bug than per logic bug. A single UB bug could take weeks to fix. I don’t recall any logic bug taking more than a few days.
Developer productivity would be much improved if logic bugs were the only class of bugs.