
> Rust doesn't compile that way -- you can't compile individual modules at once, only the entire crate.

I think we have a terminology mixup. I was using 'module' in the Win32 LoadModule() sense: a shared, dynamically loaded library (i.e. a .DLL on Windows or a .so on Linux). I'm not sure how Rust crates (or other compilation units) map to those - my guess would be that a given crate will be compiled into (in win32 terms) a .exe .lib or .dll

I /think/ the Rust equivalent of the case I'm describing would be that you have a struct that's part of the public API of a crate, and it's being used across multiple crates in a large project where you don't want to fully recompile the world in order to test your changes.

> Of course you may be in a situation where you can't rely on the debuginfo (stripped binary or something?), in which case this will be annoying. But it's really a similar situation as you have with inlining when you don't have debuginfo.

In my C++ experience there are plenty of cases where it's really useful to be able to inspect raw memory (i.e. a hex dump, with no debugger or without enough context for the debugger to help you) and figure out what was going on. Obviously Rust is designed to dramatically reduce the frequency of that kind of debugging, but to me this still feels more like a simple-vs-easy trade off [1] than a strict win.

> The presence of ADTs in Rust mean that the layout of many types isn't immediately obvious without debuginfo anyway.

Pardon my Rust ignorance, but is this scenario significantly different from C++ templates? The layout of a (judiciously) templated C++ class may not be "immediately obvious" but in practice it's often still very straightforward to infer.

[1] https://www.infoq.com/presentations/Simple-Made-Easy



There are two specific cases here where the layout is not obvious.

The first is the null-pointer optimization (I think this is the official name, but I swear I question myself every time I mention it), in which we use the knowledge that an inner type contains a reference to avoid storing an enum discriminant. That is, Option<i32> will have an extra field up front saying whether it's None or Some, but Option<&i32> will just encode None as the null pointer, because references can't be null. This also optimizes something like Result<&i32, ()>. The net result is that a lot of stuff that looks expensive is basically free. There has been discussion of extending this to use multiple pointers so that we can hit more complicated enums like Option<Option<(&i32, &i32)>>, but so far this has not happened.
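To make that concrete, here's a minimal sketch that just asks the compiler for the sizes. The helper function names are made up for illustration; only std::mem::size_of is real API. The exact size of Option<i32> is up to the compiler, so the only thing I'd rely on there is that it's bigger than a bare i32.

```rust
use std::mem::size_of;

/// `i32` has no spare bit pattern, so `Option<i32>` needs room for a tag.
pub fn option_i32_size() -> usize {
    size_of::<Option<i32>>()
}

/// A reference can never be null, so `None` can reuse the null bit pattern.
pub fn option_ref_size() -> usize {
    size_of::<Option<&i32>>()
}

/// Size of a plain reference, for comparison.
pub fn plain_ref_size() -> usize {
    size_of::<&i32>()
}

/// `Err(())` carries no data, so it can also be encoded as the null pointer.
pub fn result_ref_unit_size() -> usize {
    size_of::<Result<&i32, ()>>()
}
```

Calling these from main and printing the results should show Option<&i32> and Result<&i32, ()> at exactly pointer size, while Option<i32> comes out larger than 4 bytes.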

The second is enums themselves. The algorithm that picks the discriminant's size is not obvious. If you want a discriminant of a specific size, you can pick it with a repr. But otherwise it's implementation-defined.
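A small sketch of picking the discriminant size with a repr. The enum and function names are hypothetical; the point is only that #[repr(u8)] and #[repr(u32)] pin the size, while the enum without a repr gets whatever the compiler chooses.

```rust
use std::mem::size_of;

/// Discriminant forced to one byte.
#[repr(u8)]
pub enum SmallTag { A, B, C }

/// Same variants, discriminant forced to four bytes.
#[repr(u32)]
pub enum WideTag { A, B, C }

/// No repr: the compiler is free to choose the discriminant size.
pub enum DefaultTag { A, B, C }

pub fn small_tag_size() -> usize {
    size_of::<SmallTag>()
}

pub fn wide_tag_size() -> usize {
    size_of::<WideTag>()
}

/// Deliberately not asserted on anywhere: this one is implementation-defined.
pub fn default_tag_size() -> usize {
    size_of::<DefaultTag>()
}

/// With an explicit repr, variants number from 0 in declaration order.
pub fn small_tag_b_value() -> u8 {
    SmallTag::B as u8
}
```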

There is also a third thing we have discussed doing but haven't done yet. If you have a bunch of enums nested inside each other, having multiple discriminants is a waste. There is no reason the compiler can't just collapse them down into one in many (but not all) cases.

For anyone who wants to know the specific algorithm for all of this, it's now all in one place: src/librustc/ty/layout.rs


> I'm not sure how Rust crates (or other compilation units) map to those - my guess would be that a given crate will be compiled into (in win32 terms) a .exe .lib or .dll

You're correct.

> but to me this still feels more like a simple-vs-easy trade off [1] than a strict win.

If you mean that the easy side is saving people from having to reorder fields themselves, it's more than that: generics plus C++-style monomorphisation/specialisation mean there are cases where it's impossible for the definition of the type to choose the right order. For instance: given struct CountedPair<A, B> { x: u32, a: A, b: B }, all three of CountedPair<u64, u64>, CountedPair<u64, u8> and CountedPair<u16, u8> need different orders.
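Here's a rough sketch of how you could see this for yourself. CountedPairC is a hypothetical #[repr(C)] twin I've added so there's a fixed declaration-order layout to compare against, and the numbers in the comments assume a typical 64-bit target where u64 is 8-byte aligned. The default-repr sizes are whatever the compiler picks, so treat them as observations rather than guarantees.

```rust
use std::mem::size_of;

/// Default repr: the compiler may reorder fields per instantiation.
pub struct CountedPair<A, B> {
    pub x: u32,
    pub a: A,
    pub b: B,
}

/// Hypothetical comparison type: same fields, but laid out in
/// declaration order with C-style padding.
#[repr(C)]
pub struct CountedPairC<A, B> {
    pub x: u32,
    pub a: A,
    pub b: B,
}

/// Size of the default-repr struct for a given pair of type parameters.
pub fn default_size<A, B>() -> usize {
    size_of::<CountedPair<A, B>>()
}

/// Size of the declaration-order struct for the same type parameters.
pub fn c_size<A, B>() -> usize {
    size_of::<CountedPairC<A, B>>()
}
```

With declaration order, CountedPairC<u64, u8> pads out to 24 bytes (4 + 4 padding + 8 + 1 + 7 padding), even though the fields only need 13 bytes, which an 8-byte-aligned struct can hold in 16 if the u64 goes first. Comparing default_size::<u64, u8>() against c_size::<u64, u8>() shows whether the compiler took that opportunity.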


> I think we have a terminology mixup.

Not really -- my core point was that C++ compilation units are usually smaller than Rust's.

Most C++ codebases I've dealt with are of the kind where there's a single stage in which all the cpp files get compiled one by one. Not a step-by-step process where one "module" gets compiled, followed by its dependents.

For these codebases, you have a huge win if you can touch a header file and only cause a small set of things to be recompiled. For Rust codebases, it's already a large compilation unit, so you're usually already paying that cost (and with incremental compilation the compiler can reduce that cost, but smartly, so you get a sweet spot where you're not compiling too much but are not missing anything either).

But yes, being able to skip compilation of downstream crates would be nice.

(You're right that a crate is compiled into a .exe or .so or whatever)

> Pardon my Rust ignorance, but is this scenario significantly different from C++ templates? The layout of a (judiciously) templated C++ class may not be "immediately obvious" but in practice it's often still very straightforward to infer.

ADTs are tagged unions. There's a tag, but it can sometimes be optimized out and hidden away elsewhere.

You can mentally unravel templates to figure them out. Enums are a whole new kind of layout that you need to understand.
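As a quick illustration of the "there's a tag" part, here's a sketch with a made-up Shape enum whose payloads are all f64. Every bit pattern of an f64 is a valid value, so there's nowhere to hide the tag and the enum has to be bigger than its largest variant. How much bigger, and where the tag sits, is up to the compiler.

```rust
use std::mem::size_of;

/// Hypothetical enum: neither payload has a spare bit pattern for the tag.
pub enum Shape {
    Circle(f64),
    Rect(f64, f64),
}

/// Size of the whole enum, tag included.
pub fn shape_size() -> usize {
    size_of::<Shape>()
}

/// Size of the largest payload on its own (two f64s).
pub fn largest_payload_size() -> usize {
    size_of::<(f64, f64)>()
}

/// `match` reads the tag to decide which variant is stored.
pub fn describe(shape: &Shape) -> &'static str {
    match shape {
        Shape::Circle(_) => "circle",
        Shape::Rect(_, _) => "rect",
    }
}
```

So when reading a hex dump of a Shape, you'd be looking for a tag somewhere alongside up to 16 bytes of payload, which is the extra piece you don't have to think about with a templated C++ struct.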



