0. The issue is real and interesting, especially (for me) its implications for reproducible builds.
1. The issue affects many other programming languages besides Rust.
2. The issue has not been ignored by the Rust team.
3. The headline is a clickbait cheap shot. Whether Rust is "most loved" on Stack Overflow has nothing to do with filepaths in binaries. The article author is just trying to inflame passions and get people outraged.
It's not a cheap shot at the authors; you're right that other languages have or had this issue. But the fact that it still exists years later makes it feel a bit ignored.
Note that since then, other people have opened other issues, which haven't been closed, and we continue to talk about it.
(Also, while I said in the thread and at the time "I am not aware of a flag or anything to turn this off, though I may be wrong.", at least today, that flag does exist.)
https://github.com/rust-lang/rust/issues/41555: "Tracking issue for Path Prefix Remapping", stabilized in 2018.
Solution to that and more problems: build in a clean chroot or container with the minimum of dependencies installed.
That's better anyway, not only because of leaked build directory paths, but also because some projects automatically pick up dependencies and enable the respective features if they find the libraries and header files on the build system. QEMU does this, for example; it is not the only one, but it is probably where one notices it most, as it has just so many damn features!
Naturally one can disable all unwanted features explicitly in the build system, but every release adds three new ones, and it's just better to have a clean environment anyway (it avoids "oops, I installed a newer GCC version to test something, and broke ABI by accident"), so that is a valid solution too in my book.
1. We care about reproducible builds.
2. There is a flag to re-map these paths, as is pretty standard among other compilers. https://reproducible-builds.org/docs/build-path/
3. Bugs happen sometimes, and if that flag isn't working, or if there are other issues related to builds being non-reproducible, we'd appreciate folks filing them.
As always, happy to talk about any of this. I'll be around.
Since this is Rust, I'm even more offended from a software quality perspective than the privacy perspective. It's just sloppy.
But then the problem with automatic path remapping is that it can break the "automatically navigate to the problematic piece of code by clicking on the path" functionality many people use.
Here's a leaked CIA Wiki/CMS thread from 2015 discussing some of the fuckups found by Kaspersky in NSA ("Equation Group") malware binaries: https://wikileaks.org/ciav7p1/cms/page_14588809.html
>This is PDB string, right? The PDB path should ALWAYS be stripped (I speak from experience. Ask me about Blackstone some time.). For Visual Studio user mode stuff, the /DEBUG linker switch should NOT be used. For drivers, it's a bit harder to avoid it, but a post-build step using binplace will strip the path information.
>For other strings generally, yeah, search the binary for them. Don't use internal tool names in your code.
Examples from Kaspersky's report (https://securelist.com/files/2015/02/Equation_group_question...):
>"Timeout waiting for the “canInstallNow” event from the
"Implant" is NSA lingo for "persistent malware". Probably best not to include routines for helpful error messages in your production-build nation-state malware code.
Conspicuous all-caps NSA codename.
"The codename GROK appears in several documents published by Der Spiegel, where “a keylogger” is mentioned. Our analysis indicates EQUATIONGROUP’s GROK plugin is indeed a keylogger on steroids that can perform many other functions."
Follows NSA's internal username / @nsa.gov pattern.
(The whole thread's an interesting read in general. You can even hover over all the myriad acronyms/initialisms to see what they mean. Also kind of amusing to see the unsurprising rivalry and passive-aggressive shade between CIA's and NSA's cyber operations teams.)
Especially in things like legal opinions, company media releases, ...
- If you ever build, pack or otherwise "prepare" any kind of software, always do so in a container "CI-style" (but not necessarily in a CI; there are many more reasons for this than just privacy).
- Always consider stripping debug symbols from binaries you will distribute, using a post-compilation step, and potentially keep the un-stripped binary for yourself for certain debug purposes.
- Generally consider using non-identifying user names on single-user desktop systems; there are tons of programs leaking your user name.
One step of making a build bit-for-bit reproducible is stripping out the leading parts of paths. I suppose you could achieve this with tooling that always builds in /opt and only links to /opt (or something), but the article says no such tooling exists.
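To make "stripping out the leading parts of paths" concrete, here is a rough sketch of what a prefix remap does to a single path. This only illustrates the idea behind flags like rustc's `--remap-path-prefix FROM=TO`; it is not how the compiler implements it. The helper name `remap_prefix` and all the paths are made up for the example.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: if `path` starts with `from`, swap that prefix for `to`.
/// Paths that do not live under `from` are returned unchanged.
fn remap_prefix(path: &Path, from: &Path, to: &Path) -> PathBuf {
    match path.strip_prefix(from) {
        // `rest` is the part of the path below the machine-specific prefix.
        Ok(rest) => to.join(rest),
        Err(_) => path.to_path_buf(),
    }
}

fn main() {
    // Made-up paths; a real build would use its own CARGO_HOME and project root.
    let from = Path::new("/home/alice/.cargo/registry/src");
    let to = Path::new("/cargo/registry/src");
    let leaked = Path::new("/home/alice/.cargo/registry/src/example-1.0.0/src/lib.rs");

    let remapped = remap_prefix(leaked, from, to);
    println!("{}", remapped.display());
}
```

Two machines that remap their own prefix to the same `to` value end up embedding the same string, which is the part that matters for reproducibility.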
Total anecdote and a bit of a strawman, but the only folks I've seen complain are random commenters that say something like "oh well it has privacy concerns so I can't use it." Which is a bit absurd.
If I remember correctly, Stuxnet was identified as being a US/Israeli joint cyberattack by, among other things, common home directory names in debug build strings.
>"If it helps, you including user ids like this violates GDPR... so this should be addressed by the rust team."
This isn't true: GDPR covers entities that collect, store, and use user information ("data controllers"). Rust's developers are not data controllers; they're compiler developers. Maybe you could argue that crates.io has inadvertently become a data controller because they store these packages, but that still seems like a stretch. Regardless, the inadvertent leak of user information is directly tied to the legitimate purpose of allowing the compiler's panic machinery to output a usable backtrace. So I don't think you can scream "GDPR" just to get your Github issue more attention.
crates.io distributes packages by source, so I'd imagine it doesn't have any info regarding the username of the system the package was uploaded from (although it does obviously have the username associated with the crates.io account, but there's no reason those have to be the same)
Why so? It's no different from any other service where people submit things. It shouldn't matter if it's a binary, text, or a video, and I've heard opinions that the contents users submit need to be monitored for personal information anyway.
However, can someone elaborate on the actual (practical) implications? I.e., problems that emerged from such an issue in the past and caused significant damage in some way.
It's not really much on its own, but it could definitely be used to make other security issues more effective.
perl -pi.bak -e 's|/home/kfairmasterz|xxxxxxxxxxxxxxxxxx|g' a.out
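The one-liner above works because the replacement is exactly as long as the string it overwrites, so nothing else in the file shifts. A minimal Rust sketch of the same idea follows. The function name `scrub_in_place` and the sample bytes are invented for illustration, and patching a real binary this way may still invalidate checksums or signatures.

```rust
/// Overwrite every occurrence of `needle` in `data` with `fill` bytes.
/// The buffer length never changes, so offsets elsewhere in the file stay valid.
/// Returns the number of occurrences that were overwritten.
fn scrub_in_place(data: &mut [u8], needle: &[u8], fill: u8) -> usize {
    if needle.is_empty() || data.len() < needle.len() {
        return 0;
    }
    let mut count = 0;
    let mut i = 0;
    while i + needle.len() <= data.len() {
        if &data[i..i + needle.len()] == needle {
            // Same-length overwrite: pad with `fill` instead of deleting bytes.
            data[i..i + needle.len()].fill(fill);
            count += 1;
            i += needle.len();
        } else {
            i += 1;
        }
    }
    count
}

fn main() {
    // Made-up bytes standing in for a compiled binary.
    let mut image = b"\x7fELF..../home/alice/proj/src/main.rs\0....".to_vec();
    let before = image.len();

    let hits = scrub_in_place(&mut image, b"/home/alice", b'x');

    assert_eq!(image.len(), before);
    println!("overwrote {} occurrence(s)", hits);
}
```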
To me this is not only a privacy concern but also an efficiency concern. Sure, on a desktop application, who cares? But Rust also targets embedded development, and on firmware where you have only a few kB of flash memory, wasting space on strings like paths from your build system doesn't make a lot of sense.
In embedded you can use various options to reduce binary size, including the ones that get rid of these paths. I commented on this a few days ago over on reddit: https://www.reddit.com/r/rust/comments/m0irjk/opinions_on_ru...
Pretty much every build system will by default leak paths and times and whatnot from the builder, and needs to be configured to not do that. AFAICT, the open issues are that some people want that to be the default and a few people are upset that changing the default requires an RFC.
I'm pretty sure they have 200000 other problems to care about too.
Why should they start with this?
This smells like bullshit to me, but I don't know enough about GDPR to say for sure. Can anyone comment?
edit: more broadly, can free software (ie: rustc), run on individual developer machines, actually run afoul of the GDPR somehow?
The difficulty of a workaround is irrelevant to the nature of the problem. Also, as the article mentions, this is undocumented behaviour, so unless rustc/cargo builds everything in containers by default, this workaround should not be required.
And while I'm not an expert in GDPR details it is still very problematic as the file path to the compiled project could contain an arbitrary selection of very personal information that would then be leaked.
Sure, you can build it in a container; you could also run a car assembly line in a cleanroom. It's just ridiculous to have that as a requirement to not secretly leak paths.
But in any case, it wouldn't be rustc distributing your built code.
It's equivalent to saying that your self-hosted Postfix servers are violating GDPR because they include information in the header about which devices your message passed through on your local network during delivery.
More precisely the debug information about dependencies can contain the absolute path to that dependency.
Note: The github.com-... is not leaking private information.
As you can see this might leak:
- your home directory and with this implicitly your user name.
- the CARGO_HOME path
- the dependency the binary uses, including its version (there are many other ways you can potentially find that out, but it is still a leak).
Especially in case of `path` and `git` dependencies this might leak more information.
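If you want to check what a given binary actually contains, the usual approach is to pull out the printable strings and look for home-directory prefixes. A small sketch of that check follows; the function name `find_embedded_paths`, the `/home/` marker, and the sample bytes are assumptions for the example, and a real check would read the compiled file from disk.

```rust
/// Collect runs of printable ASCII in `data` that contain `marker`.
/// This is a rough stand-in for piping `strings` output through a filter.
fn find_embedded_paths(data: &[u8], marker: &str) -> Vec<String> {
    let mut found = Vec::new();
    let mut run = String::new();
    // A trailing 0 byte acts as a sentinel so the last run is flushed too.
    for &b in data.iter().chain(std::iter::once(&0u8)) {
        if b.is_ascii_graphic() || b == b' ' {
            run.push(b as char);
        } else {
            if run.contains(marker) {
                found.push(run.clone());
            }
            run.clear();
        }
    }
    found
}

fn main() {
    // Made-up bytes standing in for a compiled binary; the path is invented.
    let sample = b"\0\x01panicked at \0/home/alice/.cargo/registry/src/example-1.0.0/src/lib.rs\0\xff";

    for s in find_embedded_paths(sample, "/home/") {
        println!("{}", s);
    }
}
```

Running something like this as a post-build step is one way to notice a leak before a release goes out, whichever fix you end up using.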
BUT for debug builds I would prefer it to be this way, as it lets me easily navigate to the file which caused the panic. As these are potentially files not relative to the project root, using relative paths wouldn't work!!
But then on release builds I would prefer them to be obfuscated by default (maybe with an option to disable it).
There are various ways to sometimes avoid having these paths included, e.g. by changing how panics work so that there is no need to include them. But they don't seem to always work and are not reliable.
Still, let's be honest: building any binaries you plan to distribute outside of CI, or at least a container, is generally a terrible idea, so in practice this shouldn't be too much of an issue. That is also why no one bothered fixing it in 4 years (Rust isn't sold by a company, so if anyone wants fixes they either need to do them themselves or hope someone else happens to fix them because they are interested in it being fixed).