So that function checked if the following C code compiled, and only in that situation enabled the landlock?
Except that lone period, hard to recognize because of its small size and proximity to the left edge of the diff, caused the C code to become always invalid, hence keeping the landlock always disabled?
That's both vilely impressive and impressively vile. I didn't even spot it on my first read-through.
It also shows how brittle these sorts of checks are in general. I really think that this auto-disable of features based on compilation was a big mistake. It is not only very annoying when packaging things (Why does feature X not work? Because you forgot to include libY in the build environment of course) but they also fail silently even if intended. While this `.` was almost certainly malicious it is easy to imagine an accidental break. Maybe a minor typo or `-Werror`is enabled any one of these functions was marked as must use or something similar.
It seems like it is infinitely better to have a default set of features and explicitly require the user to say `--enable-x --disable-y` as needed. Then error out if a required library isn't available. (Preferably saying which feature requires it.)
It's also iffy tbh. that the compilation check functionality:
- doesn't force users to differentiate between syntax errors and other errors (e.g. symbols not found). Partially you can't even make it work properly if you want due to C macros.
- it seems sensible, tbh. if "does compile" checks for compatibility seems sensible then there is so much wrong with the ecosystem in so many ways
It's standard autoconf stuff, BUT... the right thing to do for a security option should be to throw an error at the configure stage If the Option is requested but can't be supported, And not to silently turn it off.
That is because a human should have to Look at the build and manually make the decision to switch off a security feature from the build.
Disclosure: I've got zero C/C++ on my resume. I was asked to diagnose a kernel panic and backport kernel security patches once, but it was very uncomfortable. ("Hey, Terr knows the build system, that's close enough, right?")
That said, perhaps something like disabling the default -fextended-identifiers [0], and enabling the -Wbidi-chars [1] warning.
Not yet. I was working on the standardization for C23, but this was postponed to C26, and has not much support. MSVC and SDCC liked it, CLANG and GC not so much.
It seems like there should be a way to catch these types of “bugs” - some form of dynamic analysis tool that extracts the feature detection code snippets and tries to compile them; if they fail for something like a syntax error, flag it as a broken check.
Expanding macros on different OSes could complicate things though, and determining what flags to build the feature check code with — so perhaps filtering based on the type of error would be best done as part of the build system functionality for doing the feature checking.
I'd prefer if the ecosystem standardized on some dependency management primitives so critical projects aren't expected to invent buggy and insecure hacks ("does this strong parse?") in order to accurately add dependencies.
It would be interesting to see what the most common compile feature checks are for, and see what alternative ways could be used to make the same information available to a build system — it seems like any solution that requires libraries being updated to “export” information on the features they provide would have difficulties getting adoption (and not be backwards compatible with older versions of desired dependencies).
At least for newer C++ standards it seems like there is decent support for feature test macros, which could reduce the need for a feature check involving compiling a snippet of test code to decide if a feature is available: https://en.cppreference.com/w/cpp/feature_test
Handling the output from several of the most recent GCC and Clang versions would probably cover the majority of cases, and add in MSVC for Windows. If the output isn’t from a recognized compiler or doesn’t match expectations, the only option is falling back to current behavior. Not ideal, but better than the current status quo…
That would only work for projects that care only about current compilers, where C in general has more desire to support niche compilers.
A mitigation here would be to make result of autoconf only provide instructions for humans to change their build config, instead of doing it for them silently. The latter is an anti-pattern.
FWIW, the approach you propose is how the UI tests for rustc work, with checks for specific annotations on specific lines, but those have the benefit of being tied to a single implementation/version and modified in tandem with the app. Unless all compilers could be made to provide reasonable machine readable output for errors, doing that for this use case isn't workable.
Well - that would be exactly the point of the attacker. GCC errors out, and it does not matter whether this is because the intentionally typoed header does not exist or because non-English characters are not allowed. Errors encountered during compilation of test programs go to /dev/null anyway, as they are expected not to compile successfully on some systems - that's exactly the point of the test. So no, this option would not have helped.
Probably the code review tools should be hardened as well, to indicate if extended identifiers had been introduced to a line where there wasn't any. That would help catching the replacement of a 'c' character with a Russian one.
Btw, the -fno-extended-identifiers compiler parameter gives an error if UTF-8 identifiers are used in the code:
<source>:3:11: error: stray '\317' in program
float <U+03C9><U+2083> = 0.5f;
> Probably the code review tools should be hardened as well, to indicate if extended identifiers had been introduced to a line where there wasn't any.
Maybe in the future more languages/ tools will have the concept of per-project character sets, as opposed to trying to wrangle all possible Unicode ambiguity problems.
I suppose then the problem is how to permit exceptions when integrating with some library written in another (human) language.
Or we could just accept English as the lingua franca of computing and not try to support anything other than ASCII in source code (at least not outside string constants). That way not only do we eliminate a whole class of possible exploits but also widen the number of people who can understand the code and spot issues.
The -fno-extended-identifiers option seems to do something in this area, but I don't know if it is sufficient. But it may block some characters which are used in standard English (for some values of standard).
> But it may block some characters which are used in standard English
So what? Source code was fine with ASCII for a long time, this push for unicode symbols is a recent endeavor and IMO a huge mistake not just because of the security impliciations.
I mean, GCC erroring is how the exploit works here. cmake tries to compile the source code: if it works then the feature is available, if it fails then the feature is not available. Forcing the build to fail is what the attacker wants.
Another point where I appreciate the Rust compiler:
warning: the usage of Script Group `Cyrillic` in this crate consists solely of mixed script confusables
--> src/lib.rs:1:4
|
1 | fn SYS_landloсk_create_ruleset() {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: the usage includes 'с' (U+0441)
= note: please recheck to make sure their usages are indeed what you want
= note: `#[warn(mixed_script_confusables)]` on by default
The character was in a string, not directly in what was being compiled. The contents of the string failing to compile was the point, as landlock was then disabled.
From what I understand, this landlock disabling wasn't relevant to the sshd attack. It appears it was setting up for something else.
After the concept of such attacks had blow up idk. 1?2?3? years ago a lot of places now warn at least when mixing "languages"/ranges or sometimes on any usage of non us-ascii printable characters.
So stuff like that is increasingly more reliable caught.
Including by 3rd parties doing any ad-hoc scanning.
Through adding `compiles check` fail with syntax errors definitely should be added tot he list of auto scan checks, at least in C/C++ (it's not something you would do in most languages tbh. even in C it seems to be an antipatters).
Sure; but so long as the compiler error is silently discarded (as it is here), the configure script will assume landlock isn't available and never use it.
A misplaced punctuation has some plausible deniability. Like the author could say he was distracted and fat fingered nonsense, honest mistake.
A utf character from a language that has zero chance of being mapped to your programmer's keyboard in the middle of a code line, that would be obvious intentional tampering or at the very least raise some eyebrows.
> A utf character from a language that has zero chance of being mapped to your programmer's keyboard in the middle of a code line, that would be obvious intentional tampering or at the very least raise some eyebrows.
I accidentally get Cyrillic "с" in my code few times a year. It's on the same key as Latin "c", so if I switch keyboard layout, I don't notice until the compiler warns me. Easy to do if I switch between chats and coding. Now my habit is to always type a few characters in addition to "c" to check what layout I'm _really_ using.
Granted, it's easier with a one-letter variable called "c", but with longer names I can easily see myself not typing "c" the first time (e.g. not pressing hard enough), starting build, chatting in some windows, getting back, facepalming, "fixing" the error by adding "c" or "с" depending on my keyboard layout.
even worse, I have Punto switcher that automatically switches language when I start typing. With default config, it changes latin c to russian one because russian language includes word "c" while it's non-sense in English.
and since it's the only Cyrillic character that's placed on the same key as the same-looking english character, I don't even see the problem with my eyes when the autoreplacement fires
I mean, I agree that the punctuation mishap is a better cover story, but why would any particular language have “zero chance” of being mapped to the programmer’s keyboard?
> why would any particular language have “zero chance” of being mapped to the programmer’s keyboard?
You left out a very important qualifier: in the middle of a code line
The chances of someone accidentally fat-fingering a keyboard layout switch, typing one character, then switching back again without noticing while typing a line of code are very slight indeed.
Plus the fact that you'd need to speak a language written in that alphabet to have a plausible reason for having the keyboard layout enabled in the first place.
Yeah, if anything, python worsens the situation.
I had a friend DOS our server because he accidentally inserted a tab, causing the illusion that one statement was inside a block but was actually outside it.
He swore off python at that point.
I personally avoid the language, but I understand due to issues like that these days mixing tabs and spaces is an error (or is it just a warning?) by default. Regardless, still pretty silly to me to have whitespace play such a major significant role, besides the fact that I find it visually harder to read, like text without punctuation or capitalisation.
Mixing tabs and spaces usually throws a runtime exception. I'm not gonna make a value judgement about that, but your story doesn't make sense based on how I understand py3
Edit, sorry, shoulda read your whole commebt before replying
Yep. It was a few years ago while that was stilled allowed (as I'd noted ;) ) but regardless. Significant whitespace is just annoying.. There's a lot of things that render as whitespace, and source code one might be reviewing could be printed wrapped or copied and pasted in odd ways. Other languages are more robust to this.
I feel that the main point still stands, though. Saying that Python doesn't have a whitespace problem because you can send the code through a tool that detects whitespace-related problems still acknowledges the existence of said problem.
Well, in C the visual indentation and the meaning of the code (given by {}) can diverge. That's even worse, and happens in practice. Many style guidelines for C have evolved specifically to address that problem with 'the language itself'.
The compiler doesn't check your indentation at all. (OK, not true, these days you get warnings for misleading indentation.) But here's an example of misleading indentation in C:
if(some condition)
do something; {
do something else;
}
do another thing;
You can stretch 'some condition' out over multiple lines and have some more parents inside of it, to make it more confusing.
Not necessarily. In practice, in C-as-actually-used people (should) set up linters and formatters, so that you can rely on indentation.
When programming, this means that you can behave as if both curly braces and indentation are significant, and you get an error when they are out-of-sync.
I think its more of an tooling issue. If your editor / diff viewer cant differentiate between different whitespaces get a better one. Also if you want to ensure some function isnt in local scope etc just test it.
> There's a lot of things that render as whitespace
Like what? As you note, mixing tabs and spaces is now an error.
I've never understood the objection to semantic whitespace. Complaining about having to indent code correctly strikes me as being akin to complaining about having to pee in the toilet instead of the bathtub.
there is a huge difference between having a coding standard (handled easily by a linter on commit in most languages) and making a particular whitespace indentation a critical language feature.
Bonus, if there's a good reason to change the linter format, you can do so.
Rust handles this rather well I think.
Which non-printing characters are you talking about? Whitespace characters are very much printable.
Yes, I agree that Python should just forbid tabs. As a second best, you can tell your linter (and/or formatter) to forbid tabs (and any weird unicode whitespace). That's basically the equivalent of compiling your C with -Wall.
This was an issue in Python 2, where for some dumb reason it allowed mixing tabs and spaces and equated tab to 8 spaces (I think). Python 3 doesn't have that issue.
Of course, if one was coding an exploit, one could still use python2. It is still commonly available due to a long tail of legacy scripts and in some cases (like a script I use routinely but didn't write) the difficulty of porting it to python3 (I've asked over a dozen pythonistas over the years, they kept running into same problems)
Regarding your almost query. There was a debate over Ogham space mark in unicode.
It is considered whitespace though, with the rationale that it is sometimes visible, but sometimes invisible. Depending upon whether the text has a stem-line.
That doesn't make the set of non-whitespace delimited languages empty.
Perhaps there is one with an always-visible delimiter that didn't get the whitespace justification, but does at least give one human language delimited by a printable character (which happens to be whitespace).
That's reasonably sane of SQL. In Slang, you don't need to quote. (The syntax is still unambiguous. In principle, eg Python could do something similar, because they don't have any existing syntax where you just put two identifiers next to each other with only a space in between. But C could not, because variable declaration is just two identifiers, one for the type and one for the variable name, next to each other with a space in between.)
In Slang, are “x y” with different number of spaces in the middle different identifiers or different spellings of the same identifier? SQL standard says different identifiers
> eg Python could do something similar, because they don't have any existing syntax where you just put two identifiers next to each other with only a space in between
The interaction with keywords would cause some issues. For example, right now, “if” is not a valid identifier (keyword), but “if_” and “_if” are. However, with this proposal “x y” could be a valid identifier, but “x if” would introduce ambiguity
> But C could not, because variable declaration is just two identifiers, one for the type and one for the variable name, next to each other with a space in between
This is one aspect of C syntax I have never liked. I always wish it had been Pascal-style `x:int;` instead of `int x;`
> In Slang, are “x y” with different number of spaces in the middle different identifiers or different spellings of the same identifier? SQL standard says different identifiers
Sorry, I don't remember, and I can't seem to find it out online. I could ask my friends who still work there, if it's important. (For what it's worth, I never remember anyone complaining about mixing up a different number of spaces in their variable names. So either all number of spaces were treated the same, or perhaps multiple spaces in a row in an identifier were just banned (via linter or via language?).)
> The interaction with keywords would cause some issues. For example, right now, “if” is not a valid identifier (keyword), but “if_” and “_if” are. However, with this proposal “x y” could be a valid identifier, but “x if” would introduce ambiguity
Yes, you would need to sort out these details, if you wanted to add this 'feature' to Python.
> This is one aspect of C syntax I have never liked. I always wish it had been Pascal-style `x:int;` instead of `int x;`
I'm glad Rust made the same decision.
I do like using the 'space operator' to denote functions calls, at least for a language like Haskell or OCaml.
humans are weird creatures sometimes. there was this bad thing that happened that won't happen again now, but now I can't use the thing forever because Reasons.
In Python, you only need to indent the `def test_foo` by an additional whitespace, to make it a locally scoped function instead of a proper test function.
No, not if you have any sane linter or formatter involved. They wouldn't let you get away with indenting by a single space, but only by multiples of whatever you normally use in the rest of your program.
I mean, some CI system should be checking that "if_code_compiles()" blocks compile somewhere. It should be an error until the CI system has that header and can test both variants.
People are really quick to add optionality like this without understanding the maintenance cost. (Every boolean feature flag increases the number of variants you need to test by 2!) Either make a decision, or check that both sides work. Don't let people check in dead code.
AST diffs instead of textual diffs might have helped here (to spot the `.` making the code un-compilable).
Edit: oof, though the stray character in question is inside a perfectly legitimate C string, so to catch this, any such diffs would need to Matroyshka down and that seems unsolvable / intractable.
> Edit: oof, though the stray character in question is inside a perfectly legitimate C string, so to catch this, any such diffs would need to Matroyshka down and that seems unsolvable / intractable.
Not sure, you could also just forbid code that's too complex to analyse without going down a rabbit hole. Instead of trying to go down the rabbit hole.
In general, it's hard to analyse arbitrary code. But we don't have to allow arbitrarily complex code when doing code review.
I think main issue was that it was embedded in the file itself like that. Would have preferred to have it in a separate valid C file with syntax highlighting etc and being parsed from that file.
That’s what makes this so clever: these systems were born in the era where you couldn’t trust anything - compilers sometimes emitted buggy code, operating systems would claim to be Unix but had weird inconsistencies on everything from system calls to command line tool arguments, etc. - so they just try to compile a test script and if it fails assume that the feature isn’t supported. This is extremely easy to miss since the tool is working as it’s designed and since it’s running tons of stuff there’s a ton of noise to sort through to realize that something which failed was supposed to have worked.
You could just run tests for the feature detection on a known system or a dozen(VMs are cheap). The big problem is that most code is not tested at all or test errors are flat out ignored.
Nothing “just” about it. VMs weren’t cheap when the C world started using autoconf - that was a feature mostly known on IBM mainframes – and in any case that couldn’t do what you need. The goal is not to figure out what’s there on systems _you_ control, it’s to figure out what is actually available on the systems other users are building on. Think about just this library, for example, your dozen VMs wouldn’t even be enough to cover modern Linux much less other platforms but even if it was (drop most architectures and only support the latest stable versions for the sake of argument), you’d still need to build some reliable prober for a particular bug or odd configuration.
The problem here wasn’t the concept of feature detection but that it was sabotaged by someone trusted and easily missed in the noise of other work. What was needed was someone very carefully reviewing that commit or build output to notice that the lockdown feature was being disabled on systems where it was fully enabled, and we already know that maintainer time was in short supply. Any other approach would likely have failed because the attacker would have used the same plausible sounding language to explain why it needed to be a complicated dynamic check.
That's kind of the point. It's feature detection code. If the code cleanly compiles, the feature is assumed supported, otherwise, it's assumed not present/functional. This pretty common with autotools. The gotcha here is this innocuous period is not supposed to be syntactically valid. It's meant to be subtle and always disable the feature.
Shouldn't whatever is depending on xz supporting landlock be verifying that it's the case through blackbox tests or something? Otherwise a check like this even without the bug could end up disabling landlock if e.g. the compiler environment was such that a given header wasn't available...
Unicode lookalikes would be detected by IDEs and other tools.
There would be plausible deniability in a different situation, but this is the same author who implemented the backdoor and several similar changes that disable security features. I don't think the benefit of doubt is deserved here.
A lot of IDE configurations, such as VSCode on the default config, highlight Unicode characters that are invisible/look like others (eg. cyrillic characters or non-breaking space) or even all non-ASCII characters, regardless of whether they're in a string or not.
In either case, it would appear as added code, not different characters in the same code, so maybe about the same. If someone changed one character in an existing string though, I think it would be more likely caught visually by someone used to seeing accidental garbage show up
Ok there's a larger question here about the bazaar software development method.
How can we ensure say, that Microsoft doesn't pay someone to throw a wrench in libre office development or Adobe to sabotage Gimp?
There's lots of deception strategies for bad faith actors and given the paucity of people who actually do the work, it's really really hard to be picky.
Especially with the complexity of library dependencies. Defect-at-a-distance may actually be a common strategy and this is the first that was caught
>How can we ensure say, that Microsoft doesn't pay someone to throw a wrench in libre office development or Adobe to sabotage Gimp?
Microsoft and Adobe have reputations to uphold long into the future.
Is that infallible? Hell no it isn't, but consider that Jia Tan only needed to uphold his reputation insofar as getting his backdoor onto everyone's systems. Once that is done, his reputation or the lack thereof becomes a non-issue because his work is done. We're lucky we caught him just before the finish line.
The likelihood of demonstrably reputable parties like Microsoft and Adobe poisoning the well is practically nil because they don't have a finish line.
They can secure the dominance of their offering against the open source competition for well under 500k a year. It's a no brainer.
What this might look like would be say, poorly discernable icons, clumsy UI design, or an unstable API that makes plugins constantly break. Large volumes of documentation that are inadequate or inaccurate in critical places, etc.
If I was malicious I'd pay someone to write otherwise functional code and bug fixes but make really user hostile decisions whenever possible.
We should be diligent for this in a few key projects. Tech companies could easily be doing this already.
Well since you seemingly want to paint Microsoft and other such companies in a bad light, let me point out to you that it's actually Microsoft who brought awareness to this very problem: Andres Freund works at Microsoft.[1][2]
It is probably prudent for you (and other like-minded individuals) to be more careful who you think your enemies really are. Infighting against friends and allies, or even neutral parties, is exactly what your real enemies would want you to do.
It would be hard for them to not get caught with their hands in the cookie jar - corp employees have high turnover, and low loyalty past their employment.
Even paying a cutout consulting company to do it would be iffy since so many finance employees would be involved in paying them, it would leak sooner than later - or at least raise a lot of questions that would be hard to answer legitimately. Being a public company, also hard to avoid auditor scrutiny.
Even a private company would have an easier time.
Nation state actors though? No such issues - that’s literally why security clearances, compartmentalization, and ‘you never really leave’ are done the way they are.
And they don’t have to worry about market blowback unless they get really excessively heavy handed (like France did in the late 80’s-90’s). They are setup to do shady stuff, it’s their raison d'etre.
There are levels of skull duggery. Hiring someone to pretend to work for a competitor while secretly sabotaging them is a whole other level of skullduggery with a lot of liability attached. I don't think that would be worth it to them.
You say all this after recent documents were revealed about Facebook intercepting and analyzing Snapchat encrypted traffic via a man-in-the-middle conspiracy.
> Maybe I'm out of the loop but is intentionality settled here?
Yes. The exploit used a payload, that was stored in the tests directory as a binary compression test case, but which is very clearly a very intentional exploit payload.
All the sneaky stuff was about deploying that payload without it being obvious.
I'd never suspect this to be intentional if I'd spot it in a patch, even given the consequences in this particular case.
I have written and committed things into code instead of writing it into some other window several times. Without a linter I probably wouldn't spot an extra dot when reviewing my own change before sending it out.
I love this type of hindsight to 20-10 comment. "If I saw...". That is a BIG if. Plenty of smart people on that mailing also missed it. I missed it myself when I opened the HN lead link. Very subtle.
It may be hard for him to re-establish trust. Maintaining xz for more than a decade then doing this would be quite a "long con" but if HN threads are any indication, many will still be suspicious.
His commits on these links look legit to me. It's a sad situation for him if he wasn't involved.
Honestly, he should call it quits and just drop xz utils and move on with life.
He maintained it for 15 years and struggled to get anyone else to help until Jia showed up. Meanwhile the Linux ecosystem depended on it.
The Linux ecosystem will either figure shit out and maintain it or move into a different archive library.
I worry about his home/personal system, this guy seems to have befriended Lasse in order to get the position, is it possible Lasse himself has been compromised?
I'd be nuking everything, possibly even disposing of hardware and setting up a new verifiable online identity on a clean system/hardware.
It’s a giant block of new code, what is it supposed to do beyond tell you it’s a giant block of new code?
Note that this is C code inside if a cmake string, even if your diff can do highlighting the odds it would highlight that are low, and if it did highlighting is generally just lexing so there wouldn’t be much to show.
I don't love unified diffs as a rule either. They're very noisy to read in general.
A side by side in something like "meld" will highlight changes but also means you're reading the full context as it will exist in the code base.
My number one complaint about "code review" is the number of people who simply run their eyes over the diff (in private orgs) because management says "code review" and that's as far as they ever take that in terms of defining outcomes.
The function is “check_c_source_compiles”. The comment indicates that the intention is to confirm that the Landlock functionality can be compiled on the system, in which case it will be enabled.
The stray dot isn’t valid C, so it will never compile. By ensuring it can never compile, Landlock will never be enabled.
This is not something that a unit test can catch. First, this 100% coverage rule applies to the program/library code, and only to the subset that is not ifdeffed out (otherwise, you will not be able to have, e.g., Windows-specific code), and definitely not to the build system's code. Second, how would you test that landlock works, in a unit test, when this feature is optional and depends on the system headers being recent enough? You can't fail a unit test just because the software is being compiled on an old but still supported system, so it would be a "SKIPPED" result at best, which is not a failure and which is not normally caught.
The proper way would be to have a minimum glibc version (or whatever it depends on) where you expect landlock to be available and then shout loudly if it is not so that you can either fix the check or correct your expectations. This isn't just for malicious users, these checks can be brittle enough that a small change in the library or even compiler update can occasionally break something. Of course this is ideal and does not match common practice. I can't even claim of doing this consistently myself although I did start that practice before this mess.
I used to and it's actually how I got a large part of my computer skills, but unfortunately I got medicated for ADHD and became normal and can't do it anymore.
But I still think reading boring logs is a great way to understand a system. Turning on all the intermediates for a compiler (llvm ghc etc) is very educational too.
The actual compiler output from the autoconf feature test is one of the things I'd probably look at fairly early on if some feature is disabled when it shouldn't be, but I maybe have a bit more experience running into problems with this than younger folks.
It feels like this is a poor way to determine if a feature should be enabled. There are just too many reasons it could fail, and default off on compilation failure just seems like a glaring security bug. If this is a common practice, it seems like the community should rethink this choice.
I don't think it fails the build. It's part of a test, trying to compile a bit of code. If it compiles the test is true and a certain feature is enabled. If the compilation fails the the feature is disabled.
This is quite common in build systems. I had such a test produce incorrect results in the kernel build system recently. The problem is that the tests should really look carefully for an expected error message. If the compilation fails with the expected message the test result is false. If the compilation fails for some other reason the build should fail so the developer is forced to investigate.
Disclaimer: I have not studied the xz build system in detail, just what I saw from the diff plus a bit of extrapolation.
Problem is that every compiler/compiler version will have different messages for the same error. And xz is the kind of project that gets built on really really random homegrown compilers. Maybe we need an option to make the compilers output a standard JSON or whatever...
> Maybe we need an option to make the compilers output a standard JSON or whatever...
While I like the idea, I fear this would go the same route as UserAgent strings, with everybody giving carefully crafted lies. The main issue is that bugs are only known in retrospect.
As an example, suppose Compiler A and Compiler B both implement Feature X. Therefore, they both produce JSON output indicating that they support Feature X. A year later, developers for Compiler A find and fix a bug in its implementation of Feature X. Now, how do we handle that case?
* Update the version for Feature X as by Compiler A? Now Compiler B is publishing the same version as the buggy Compiler A did, and would get marked as not having a usable implementation.
* Update the spec, forcing a version update for Feature X in both Compiler A and Compiler B? Now a bugfix in one compiler requires code updates in others.
* In the build system, override the JSON output from Compiler A, such that Feature X is marked as unavailable for the buggy versions? Now the JSON is no longer trustable, as it requires maintaining a list of exceptions.
* Try to trigger the bug that Compiler A just fixed, so that the build system can recognize the buggy versions, regardless of the JSON output? Now we're back to the starting point, with features determined based on attempted compilation.
There I am, scanning carefully, and I see a period where one clearly should not be. "This wasn't so hard", I said to myself.
I poked my screen, and the period moved.
Curse you, monitor dust particle.
I can't believe that system security is dependent on such a loose chain of correctness. Any number of things could have stopped this.
+ # A compile check is done here because some systems have
+ # linux/landlock.h, but do not have the syscalls defined
+ # in order to actually use Linux Landlock.
Fix those headers on those systems to explicitly opt-out. What's the point of headers if they don't declare their capabilities?
Also why isn't there a single test after a binary blob (even when compiled from open source) is made to ensure security is in-tact?
I wouldn't even ship a website checkout without end-to-end tests for core capabilities. There must be a priority misalignment of adding features > stability.
Edit: I hope the 'fix' isn't to remove the '.'--I just saw the other post on HN that shows removing the '.'
You are quoting Jia Tan [1]. The malicious actor wrote that comment when deliberately breaking the check in the first place.
Fixing headers or extra tests would not have prevented this, as there is no indication the headers were broken in the first place, and extra tests could have been compromised (or ignored for release tarball) some other way.
The better fix would be to move to proper dependency management and depend on a version range of a dependency instead of hoping you can do a better job of modelling the same with strings hardcoded into a CMakeLists.txt.
Tangentially related, but I've had a case where GCC 9.4.0 shipped a broken arm_acle.h header [https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100985] - code which includes the header would always fail to compile.
Since users could be trying to compile it on GCC 9.4.0, and as the functionality dependent on the header isn't critical to the application, if the build detects the header is broken, it just disables the functionality and emits a warning.
Back to xz: If security isn't your top priority, it seems reasonable to just ignore the failure of an optional feature and just move along. Of course, in hindsight, it's easy to point fingers, but without that, it doesn't sound completely unreasonable to me.
... there's a huge bias in things that get tested (and how they get tested). easy to test things get tested. there's a lot more developer, committer, maintainer for web stuff. there's a cultural issue, it's hard to improve on old ossified processes/projects, etc.
Isn’t it a bit ironic with how much code everyone depends on that can freely be altered by some unknown party, while so much time goes into code reviews to verify internal changes at most companies.
you can have ten comments about the name of a variable, but no one bats an eye at a new npm package introduced. Also, devs that wrote code that Google depends on can't pass the leetcode gate check to get a job there.
The last sentence is an overreach to me, but I have experienced much of the same bike-shedding during code reviews. 95% of them are useless. Read that twice; I am not joking, sadly. I am not against code reviews, but in my experience(!), the reviewers are not incentivized to do a thorough job. Seriously, if you deliver a new feature vs do a deep, difficult code reviews, which one benefits you more? To repeat: I don't like it; I am only pointing out the perverse incentives to rush during code reviews.
One more thing that I don't see enough people talking about here: Writing good software is as much about human relationships (trust!) as it is about the computer code itself. When you have a deeply trusted, highly competent teammate, it is normal to put less effort into the review. And to be clear, I am talking about the 99% scenarios of writing the same old CRUD corporate code that no one gets excited about here. Please don't reply to this post with something like, "Well, I work on Dragon at SpaceX and all branches have 2x code coverage and the QA team has 600 years of combined experience... Yada..."
One 1 hour doing code review is not really stolen from doing feature work really is it? For the vast majority its stolen from playing video games or some other non-work.
That heavily depends on the individual developer and the organization in question.
In general, the most highly skilled developers who are most capable of doing a thorough code review are also the ones who are most likely to be genuinely over capacity as is.
I'm not sure about your experience, but companies which have strong code review practices also have strong controls on the third party code. In terms of review granularity, it makes more sense to be critical of maintenance/readability for code you actually own and maintain. Third party code has a lower bar and should, although I also believe it needs to be reviewed
Yeah I think this is the common case. I think we usually trust that dependency A took a look at their dependency B and C before releasing a new version of A. And even if properly reviewing our bump of A, how often do we check out changes in B and C
Edit: yes for FAANG-ish companies this is usually a bit different, for this reason. And licenses..
The problem niver would have been fixed in proprietary software. And it's unlikely the problem would have been considered anything more than a 0.5s startup delay in some situations if xz were proprietary; it would have been reported as a performance issue to the malicious maintainer, who would have treated it as such and improved the startup time.
I just mean fewer total packages and fewer maintainers. Linux libraries and packages don’t have the culture of making a package out of a single small function and importing it everywhere, which is part of the reason why NPM is a good case study in opportunities for supply chain attacks.
I guess you cannot really use it in a generic library like compression/decompression. The library has no clue what the program is supposed to do and what should be restricted.
For a program it might be clearer.
The sshd attack was using liblzma as a library. So disabling landlock seems unrelated? A sign that there is more bad code waiting to be detected / had been planned to be inserted???
It's pretty impossible to lock down sshd. The restrictions are inherited by all ancestors. What would the sysadmin say if they find themselves in a restricted shell?
Its weird. Like i would consider doing two unrelated backdoor-esque things in the same project really sloopy. Seems like it just significantly increases the risk of being discovered for minimal gain.
Its very confusing. Parts of this sega seem incredibly sophisticated while other parts seem kind of sloppy.
Dude is a developer just like the rest of us. We all try to write clever code from time to time, but at other times write sloppy crap. Especially if there's a boss insisting on a tight deadline.
You can use it and it's a good idea, although it's meant to stop exploits rather than deliberate backdoors.
The idea is that the library starts a separate thread to do any complex operation and then waits for it in the API call the user code made. The thread landlocks itself and then begins the operation. If something goes wrong the code is confined.
Of course in this case it doesn't work because the sandbox is controlled by the attacker, so they can just turn it off or make it weaker than it should be. But you can also do things other ways.
What was even the game here? Eventually even more backdoors, ones that would have more plausible deniability? Afaict neither the oss-fuzz nor this change would actually discover the found backdoor.
But why put your backdoor eggs into one basket (library)?
I wonder why it was useful to prevent Landlock from being enabled in xz. Perhaps a later stage was to inject malicious content into xz archives? But then why not just inject malicious activity in xz itself?
Because a deliberate vulnerability is much easier to hide than actual malicious content.
One could probably sneak a buffer overflow or use-after-free into a C project they maintain without being noticed. Actually shipping a trojan is much harder, as observed with the xz-to-sshd backdoor.
Ah, so the next stage would have been to add a "bug" in xz that would trigger during the supposedly sandboxed execution, when presented with certain input files. Clever.
This is surprisingly obvious. I mean it's a clever technique to disable the feature and a really plausible commit. But then why did they go with an obvious syntax error instead of just misspelling something. E.g. would you have spotted it if the mistake was `PR_SET_NO_NEW_PRIV`?
On an unrelated note, this malware team has assembled a great dataset for training AIs on identifying security problems. Every commit has some security problem, and the open source community will be going through and identifying them. (Thanks, maintainers, for the cleanup work; definitely not fun!)
Currently it doesn't work, but yeah, it'll be really cool when we have tech like that! (It'll still only be able to detect known vulnerabilities, but we don't often invent new ones.)
Like in this article, we have a patch that introduces a security flaw (disabling landlock). We later have a patch that fixes it, specifically. The job of the LLM is to reproduce the fixing patch given the problem patch. Or at the very least, explain that this patch results in landlock always being disabled. To be clear, this problem is much, much harder than the problems LLMs are solving now, requiring knowledge of autotools behavior that isn’t included in the context (identifying that a failed build disables the feature, and that this build always fails).
There was another example where this team submitted a patch that swapped safe_fprintf for fprintf while adding some additional behavior. It was later pointed out that this allows printing invisible characters to the stream, which allows hiding some of the files that are placed when decompressing.
This is why I really don't like configure style build systems that automatically enable / disable features. When I want something I explicitly opt-in for it. If there's good reason for feature to be default, then instead explicitly allow opt-out.
This is a huge pain when packaging things. You set up the package, add dependencies until it builds and think you are done. But they feature X is missing. What? Oh, it was silently disabled because libY isn't available. Ok, go back and add it. Then a user reports that feature Z isn't available...
Yeah, just have a default set of features and allow `--enable-a --disable-b` as needed.
The fact that it silently swallows bugs in the check is just another problem of it.
In my code I have a bunch of routines optimized for different platforms, e.g. using x86 AESNI instructions. Not all compilers support them, and they don't even make sense when compiling for a different CPU architecture.
It's much simpler to say "enable this if we can compile it" rather than detecting the compiler and target platform and throwing a mess of conditional compilation into an autoconf script.
I'd rather have people complaining about having to run configure a bunch of times to disable several features they don't have the libraries for than complaining that a feature doesn't work (because it ended up being disabled without them knowing).
Likewise, I'd rather distros figure out the hard way when a new release has a new feature and needs a new dependency rather than their users complain that a new feature is missing.
I’m not quite following what the diff here is suggesting - was this some cmake magic to detect if a feature was enabled, but the file had an intentional syntax error?
Autoconf and CMake both compile small test programs to verify that the feature in question actually works. The test programs almost never actually do anything; the just refer to the feature of function that they rely on. If there is a compile or linker error then the feature isn’t available. In this case the compiler always outputs an error because the test program doesn’t have valid C syntax.
Of course in practice you shouldn’t be writing each test program by hand. Autoconf has macros that generate them; you would only need to supply an identifier or at most a single line of code and the rest would be created correctly. I’m sure CMake is similar in that regard, so the first red flag is that they wrote a whole test program.
That's exactly right. It was checking if the c code compiled to detect the landlock feature, and there was a single period in the middle of the code that made it always fail to compile and thus silently leave the feature disabled.
Would it be reasonable to expect that this MR comes along with a test that shows that it does the thing it’s claiming to do? I’m not sure how that would work in this case.. have a test that is run on a system that is known to have landlock that does something to ensure that it’s enabled? Even that could be subverted, but it seems like demanding that kind of thing before merging “features” is a good step.
I like the idea of testing build-system behaviors like this, and I don’t think it’s ever really done in practice. Scriptable build systems, for lack of a better name for them, exist at a bad intersection of Turing complete, hard to test different cases, hard to reason about, hard to read the build script code, and most of us treating them as “ugh I hope all this stuff works” and if it does “thank god I get to ignore all this stuff for another 6 months”.
If you mean testing the "disable Landlock if the headers and syscalls are out of sync" functionality then I agree, workarounds for such corner cases are often not fully tested.
But it would have been enough here to have a test just to see that Landlock works in general. That test would have broken with this commit, because that's what the commit actually does - break all Landlock support.
Based on that it sounds like there wasn't a test for Landlock integration, if I've understood things correctly.
Create a tiny fake version of landlock with just the features you're testing for. Since it's only checking for 4 #defines in 3 header files, that's easy.
Then compile your test program against your fake header files (with -Imy-fake-includes). It should compile without errors even if landlock is missing from your actual system.
Then build your test program a second time, this time against the real system headers, to test whether landlock is supported on your system.
I'd say this MR is a bad approach in general. The headers say what interfaces are known, not what features are available. You should be able to compile with landlock support on a system which doesn't enable it. Same situation as seccomp and others. Your build machine doesn't have to match the capabilities of the target runner.
But yeah, to test it, you can have a mock version of landlock which responds with the error/success as you want, regardless of what the system would normally do. It relies on the test not being sabotaged too though...
Read the code of the check again. It mostly checks that the required SYS_* constants are defined to be able to use the syscalls. You can compile this on a system that does not have landlock enabled in the running kernel, but the libc (which imports the kernel system call interface) has to provide the syscall numbers.
This doesn't change my opinion in general - that version should be exposed through a library call and knowing about the specific syscalls shouldn't be needed in xv.
Should've said function call not library call. My bad. Basically if you already have the linux/landlock.h, that should provide everything you need to do without explicit references to SYS...
Now we are running in circles. As you see in the git commit, the compile check was added because the existance of linux/landlock.h alone was not enough to check that the feature can be used.
This header defines the data types for the Linux kernel interface, but not how the syscall landlock_create_ruleset(2) will be issued. That is provided by libc either as a separate wrapper function (does not exist in glibc) or the implementation needs to use syscall(SYS_landlock_create_ruleset, ...), with the constant also being provided by libc. That is how it works for all syscalls and you won't be able to change this.
The only source of the claim that the existence of.linux/landlock.h is insufficient is (AFAICT) the malicious git commit. Why trust the comment, written by the attacker, to explain away a malicious change?
I already explained above why the existence of linux/landlock.h is not sufficient. Why do you still question it? If you know a bit about system programming and how configure checks work, the change in itself is totally reasonable.
Is a compilation-test a legitimate/common/typical method to go about this?
Independently, of the breaking code, to me it seems accidental failing, or even accidentally not failing, would be in the nature of such an assessment... So, this commit seems to raise the question of "why?", even if you missed the dot, doesn't it? If a feature is formally available, but effectively broken somehow, wouldn't you want the compiler to complain, instead of the feature dropped silently? Is the reasoning in the code comment sound? Can you test, if syscalls are defined in another way?
> Is a compilation-test a legitimate/common/typical method to go about this?
Yes—in fact, compilation tests are often the only way you can tell if a feature actually works. It's extremely common for C build systems to detect and work around weird systems.
It’s by design. The job of autotools is to find “ground truth” about whatever environment you’re compiling against. It’s meant to discover if you can use a feature by actually seeing whether it works, not just by allow-listing a known set of compiler or library versions. This is because the whole point is to allow porting code to any environment where it’ll work, even on compilers you don’t know about. Think back to a time when there were several dozen Unix vendors, and just as many compilers. You don’t want your build script to report it can’t compile something just because it isn’t aware of your particular Unix vendor… you want it to only fail if the thing it’s trying to do actually doesn’t work. The only way to do this is by just testing if certain code compiles and produces the expected result.
Some of the checks are there to tell whether or not the compiler even supports your code. You may not be able to compile your code at all, and the job of the build system is sometimes to just emit useful errors to help the person building the code to understand that they need a compiler which supports [language feature X].
Again, this is intended to be portable software. It is designed to work on lots of OS’s, with lots of compilers, in a lot of future environments that don’t even exist yet.
If you have a security feature for example, which uses the pledge() syscall on OpenBSD, but you can only use that feature on OpenBSD systems, you have two choices:
- Conditionally compile it based on whether you’ve detected that this is an OpenBSD target at build time, or,
- Conditionally compile it based on whether some sample code that uses pledge() builds successfully.
You can’t defer this decision until runtime, because it would require linking to pledge() symbols even though they may not exist on this system, which would cause the executable to fail to link at runtime, unless you completely rearchitected to use a plugin model, which is overkill.
So given the above are your main two options, the latter is preferred mainly because it allows new systems to come in and be compatible with old ones (maybe someone adds pledge() support to Linux one day) without having to fudge the uname command or something. This was super important in the early Unix days… perhaps less so now, but it’s still a good way to write portable software that still can take advantage of platform-specific features.
> Again, this is intended to be portable software.
A scathing criticism of the OpenSSL library by the BSD team was that it was too portable in a (very real) sense that it wasn't even written in "C" any more, or targeting "libc" as the standard library. It would be more accurate to say that it was "Autotools/C" instead. By rewriting OpenSSL to target an actual full-featured libc, they found dozens of other bugs, including a bunch of memory issues other than the famous Heartbleed bug.
Platform standards like the C++ std library, libc, etc... are supposed to be the interface against which we write software. Giving that up and programming against megabytes of macros and Autotools scripts is basically saying that C isn't a standard at all, but Autotools is.
Then just admit it, and say that you're programming in the Autotools standard framework. Be honest about it, because you'll then see the world in a different way. For example, you'll suddenly understand why it's so hard to get away from Autotools. It's not because "stuff is broken", but because it's the programming language framework you and everyone else is using. It's like a C++ guy lamenting that he needs gcc everywhere and can't go back to a pure C compiler.
Job ads should be saying: "Autotools programmer with 5 years experience" instead of "C programmer". It would be more accurate.
PS: I judge languages by the weight of their build overhead in relation to useful code. I've seen C libraries with two functions that had on the order of 50kb macros to enable them to build and interface with other things.
C basically has no standard library. It's no surprise to anyone who has ever used it more than in passing that you depend on the chosen build system to replace that. Building portable C libraries is very different because of this from any other commonly used programming language - even C++.
It's ironic to me to say that C has no standard library while at the same time libc is one of the most important libraries that most programs on an installed system can't go without.
So it has one, but it's small. It has a few useful functions, for example system(const char*) which was used by the exploit.
Actually the interaction with libc is not like what you expect from a standard library in other languages. Even apart from being very small, it's not really a single library - you have glibc, musl lib c, the BSDs each have their own libc, Mac OS has its own, Windows has its own. And if you want a very portable C program, you can't assume your program will run with the libc on your system, you need to take into account differences between these. Also, writing C programs that don't use the standard library at all is not unheard of - even apart from C in-kernel or on bare metal. For example, on Windows, libc is just a wrapper over win32, and you can just use that directly and gain much more functionality if you are not going to be portable anyway.
Additionally, libc is typically more of a system component than a part of your program. You can't choose to distribute a libc you prefer with your program and use that, you have to link to the system libc on many OSs. Even on Linux where it's not strictly required, if you use a different libc than the distribution provided one, you can end up in all sorts of problems when you interact with other programs.
AzulJDK and Android JDK would be better examples, as Oracle JDK is just an Oracle-blessed build of OpenJDK.
The difference from libc though is that there is no problem in distributing a program with your preferred JDK, and multiple Java programs can live on the same system while each using its own JDK and even communicate risk free with each other.
Also, different JDKs are significantly more similar in the API they offer to Java programs than different libc are - at least for a common core of functionality.
Which validates my point: OpenSSL is not “C”, it’s not even “Perl”, it’s targeting a bespoke framework and build platform that just so happens to be written in Perl.
> If you have a security feature for example, which uses the pledge() syscall on OpenBSD, but you can only use that feature on OpenBSD systems, you have two choices:
Just in case, I want to note that pledge(2) and unveil(2) are also supported by SerenityOS, so checking only for an OpenBSD target is insufficent.
You'll never make it to runtime if you try to include headers that don't exist. You'll never make it to runtime if you try to link in libraries that don't exist.
Runtime syscall number detection is very common in practice, since the kernel returns ENOSYS to enable that exact ability for glibc and other shim libraries
That's syscall for _availability_ detection, not its number. There's no way to ask the kernel "what's the number of the syscall commonly known as read()".
It's not by design, unlike what siblings say. It's by accident (so, "legacy", as you put it).
The problem is that already in the 80s there was tons of variability from on Unix system (or version of it) to the next, but there was no standard way of representing what features/standards/APIs/libraries a system supported or had installed. When faced with such a mess people wrote code to detect features are present on the target host.
This then got made into tools with libraries of detection code. Think autoconf/autotools.
Now we also have pkgconfig, but it's too late and it was not enough anyways.
Some things you might only detect at run-time, provided that their ABIs are stable enough that you can copy their headers into your application.
It’s by “design”, in the sense that C and C++ provide no better way to really know for sure that the functions you want to call really exist. In more modern languages we rely on metadata and semver, but none of that exists for C and C++.
Very true, but don’t forget that Autoconf checks for “interesting” compiler choices as well as library and OS features. And then there is libtool, which abstracts out the differences between how compilers generate shared libraries so that you only have to understand one way of doing it and it will work on all of them.
- instruction set architecture
- OS versions
- ABIs
- libraries (whether they are installed)
- and what functionality they provide
- commands/executables
- anything you can write a macro to check
All stuff too disparate to reliably have the OS be able to answer every question you might have about it and the stuff installed on it. You can't wait for any such system to learn how to answer the questions you might have about it, so some things you can only detect, either at build configuration time, build time, or run time.
Yea, it’s not impossible to do, just not standardized. If you use a library and it provides useful version information, then definitely use it. It’s just that the language or the tooling doesn’t force libraries to have that kind of metadata. Compare that with Rust, where every library you use must come with a standardized manifest that includes a version number, and where they tell library authors up front that they are expected to follow the semver convention.
The fact is that things have good easier over the last 10 or 20 years. It used to be the case that any program targeting Unix had to either spend a lot of time and energy tracking the precise differences between dozens of different commercial Unices, or use autoconf. Autoconf was the project that combined all of that lore into a single place, so that most people didn’t have to know every single detail. But these days most of the Unices are dead and buried, and 99% of all new projects just target Linux. Kinda sucks if you prefer one of the BSDs, or OpenSolaris/Illumos/SmartOS, but it does mean that new Linux developers never have to jump through those hoops and simply never learn about autoconf. And while on the one hand that represents a loss of knowledge and competency for the community (making this type of supply–chain attack much easier), on the other hand autoconf is in practice an abomination. It is (or at least was) extremely useful, but it was implemented in M4 and reading the source code will literally damage your brain.
> It used to be the case that any program targeting Unix had to either spend a lot of time and energy tracking the precise differences between dozens of different commercial Unices, or use autoconf. Autoconf was the project that combined all of that lore into a single place, so that most people didn’t have to know every single detail.
As a data point, the place I worked for in the mid-90s had a single codebase with something over 600 permutations of supported OS and compiler when you included the different versions. One thing we take for granted now is how easy it is to get and install updates – back then you might find that, say, a common API was simply broken on one particular operating system version but your customers would have to wait for a patch to be released, sometimes purchased, put onto floppy disks or CD-ROM, and then manually installed in a way which had enough risk involved that people often put it off as long as they could. Some vendors also did individual patches which could be combined by the sysadmin so you had to test for that specific feature rather than just saying “is it greater than 1.2.3?”, and it wasn’t uncommon to find cases where they’d compiled a common library with some weird patches so you had to test whether the specific features you needed functioned.
Part of why Linux annihilated them was cost but much of it was having package managers designed by grownups - I remember as late as the mid-2000s bricking brand new Sun servers by running the new Solaris updater, which left them in an unbootable state.
Back in the day when people compiled source from tarballs on their personal machines, the autoconf script would query your system to see what functionality it supported. It did this by trying to compile small programs for each feature. If the compilation failed it assumes the feature is unavailable on your system and a flag is set/unset for the rest of the build.
Have you ever tried to manually build something from a release tarball by starting with ./configure? If so, have you observed how many times the compiler is invoked in this configure phase before you even run make?
That "check_c_source_compiles" function should first test if the provided code snipped is "valid C code in general" and only then check if it compiles in given system.
Optional features should not depend on feature detection, ever. Feature detection for a security feature should be suspicious, even if it works as intended.
Optional features should always be configured by whoever tries to compile the code. There can be defaults, but they shouldn't depend on the environment.
It's in an autoconf style compile test. The compile or not of the snippet is being tested, unfortunately. This just decides the configuration, doesn't stop the build.
Description of Linux's Landlock access control system if you are not familiar with it: https://docs.kernel.org/userspace-api/landlock.html
xz official (maybe...) incident response page: https://tukaani.org/xz-backdoor/