No POST after rm -rf / (archlinux.org)
422 points by wsx on Jan 31, 2016 | 169 comments

Reading poettering's response (https://github.com/systemd/systemd/issues/2402) I'm left feeling like he needs a thorough "Mauro, SHUT THE F* UP" response from Linus himself (https://lkml.org/lkml/2012/12/23/75)

It is not a systemd bug to mount efivars read/write. The efitools - efibootmgr et al - require write access to that table. By the spec, this should not brick computers.

The problem is not systemd, it's disastrous proprietary UEFI implementations that are shipping the most insecure and awful code in the world.

The problem is we cannot fix this for 9233. MSI will absolutely refuse to disclose the firmware to his laptop so that he can make it so his replacement does not also brick itself. People have been treating coreboot / libreboot like a joke for a decade, but this is exactly why those projects matter and why the continued erosion and trend towards firmware blobs and proprietary bootloaders cripples individuals' control of the hardware they supposedly own.

It's the John Deere tractor problem, but until enough people care - I mean, enthusiasts and techies already don't care, and we would need a popular general consumer movement to care to inspire real change - it will only get worse.

All the 802.11 AC wireless NICs in the Linux kernel use firmware blobs. As of Skylake, there is not a single GPU supported on x86 systems in Linux that does not use firmware blobs. Almost every Chromebook is shipping Coreboot with cancerous unauditable firmware blobs. Samsung SSDs have bricked themselves because of their proprietary firmware blobs. It's a constant endemic problem yet nobody cares to put their money where their mouth is.

Yes, firmware blobs suck, no argument there. But forget about that aspect for a moment, and consider the firmware a mostly unchangeable part of the hardware.

Hardware has bugs. A lot of hardware has had bugs for a long time. Linux has had tables of "quirks" for hardware pci ids / usb ids / etc. for a long time, for thousands of buggy hardware devices it needs work-arounds for. Some of those bugs are really in hardware, some are in the firmware loaded on the hardware, it doesn't really matter. This is a pervasive reality, and it can't just be demanded that the user get hardware which is not "shitty" by this metric ... it's all a trade-off.

And finally, I've used linux on bios systems and efi systems, and I've never needed efivars mounted, I've always set up the bootloader some other way (which was simpler for me to control and manage as I prefer). My personal biggest complaint about systemd is how it automatically mounts and starts and uses all kinds of stuff that I don't need. I prefer to set up what I need and want, and not have anything else cluttering up my system, just waiting to cause serious reliability or security problems, and getting in my way when I'm debugging something else.

So I'll be up-voting all stories about "systemd did something automatically and on some systems it was unfortunate" because yeah, UNFORTUNATE STUFF HAPPENS WHEN STUFF HAPPENS AUTOMATICALLY. This is why I left windows and OS X in the first place! So I had easy and convenient control over my computer! And now it takes extra effort to override and disable all the crap that systemd is doing automatically, and I resent it. (I actually already have a script on my systems that unmounts efivars and pstore and some other unneeded filesystems after boot.)
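A post-boot cleanup script of the kind mentioned above could look roughly like this. It is only a sketch: the set of filesystem types is an assumption (the comment names efivars and pstore but not the rest), and the function names are made up for illustration.

```python
import subprocess

# Filesystem types to drop after boot. This set is an assumption;
# adjust it to whatever your own system does not need.
UNWANTED_FSTYPES = {"efivarfs", "pstore"}

def find_mountpoints(mounts_text, fstypes=UNWANTED_FSTYPES):
    """Return mount points from /proc/mounts-style text whose type is in fstypes."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fstype, options, dump, pass
        if len(fields) >= 3 and fields[2] in fstypes:
            points.append(fields[1])
    return points

def unmount_all(mounts_path="/proc/mounts", run=subprocess.run):
    """Unmount every unwanted filesystem; needs root to actually succeed."""
    with open(mounts_path) as f:
        points = find_mountpoints(f.read())
    for point in points:
        # check=False: keep going even if one umount fails
        run(["umount", point], check=False)
    return points
```

Running `unmount_all()` as root late in boot would then unmount whatever matches; tools that need efivars would have to mount it again first.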

Systemd is exposing something much deeper and icky as a normal file system object. That is beyond wrong.

It is the linux kernel who does that. Systemd just mounts what the kernel exposes. You can't be all unixy and everything is a file, without file systems.

I think the blame is put on the wrong party here. Systemd mounts efifs so that only root can rw to it, root rm's the fs, hardware is affected. The places where there is a bug: hardware+firmware, plus the unix idea that everything must be a file. Systemd follows the specs and behaviors that are expected of it. If the EFI fs should not be a FS, complain to Linus. If the EFI should not brick itself after a rm /, complain to EFI developers.

Systemd already took the reasonable security precautions. Root can rm anything it wants on Unix systems. rm efi fs is dangerous, so only root can do it. If root does it then all bets are off. Root needs to be able to write to the fs, per api and other tools needs.

People are quick to blame Poettering, but it is Linus who is leading the project which has the design decisions that are causing problems.

In the end it sounds like 3 projects need to change their code to avoid an issue with a user mistake/bad firmware combo that only occurs on a blue moon Monday.

All in all the usual storm in a tea cup against systemd. In that case it is funny because the issue is that systemd is too unixy :) everything is a file (system).

Everything you say is true. It would still be better for systemd to mount that filesystem read-only, and for the few (root-only) utilities that need to write to it to remount it read-write, write to it, and remount read-only.
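The remount-write-remount dance suggested above might be wrapped like this. This is a minimal sketch, not how efibootmgr or any real tool does it: it assumes efivarfs is mounted at the usual /sys/firmware/efi/efivars, and `writable` is a hypothetical helper name.

```python
import subprocess
from contextlib import contextmanager

# Usual efivarfs mount point; an assumption, check your own system.
EFIVARS = "/sys/firmware/efi/efivars"

@contextmanager
def writable(mountpoint=EFIVARS, run=subprocess.run):
    """Remount read-write for the duration of the block, then back to read-only."""
    run(["mount", "-o", "remount,rw", mountpoint], check=True)
    try:
        yield mountpoint
    finally:
        # Runs even if the body raises, so the mount does not stay writable.
        run(["mount", "-o", "remount,ro", mountpoint], check=True)
```

A root-only tool would then do its variable writes inside `with writable(): ...` and the filesystem stays read-only the rest of the time.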

Another reasonable option would be for a distribution to include SELinux policies that allow only the blessed tools (grub-install, systemctl, etc) to write to that filesystem. It would be a big change, though, because most distributions leave root and normal user logins unconstrained.

Except you only say this in hindsight. When developers were adding the efi mounts the only sane and reasonable assumption was "the bios is providing us write access to a table in its internal memory, this must be configuration data unrelated to the actual boot process and only needed for communication of data between bios and os".

In hindsight you can say "we should have mounted efi ro and made our efi tools much more complex so they can remount it rw write to it and then remount it ro again" but that is asking the systemd and efitool developers to have been prophets that motherboard firmware makers would be so disastrously inept and stupid they would put essential parts of their boot process in a shared variable table with the OS.

It's like saying you cannot ever introduce a memory clear into a GPU shader because you might wipe the firmware and the GPU could explode when its fan shuts off. You have to have the reasonable expectation the API you are working with is conforming, and when it is not you need to raise hell about it, and in the short term introduce the work around until the upstream problem is resolved.

The issue here though is that the upstream problem is proprietary farmed out awful firmware that restricts your freedom and bricks your hardware because nobody can audit or fix its terrible design.

Oh, no, I'm not saying what they did wasn't reasonable at the time, nor am I saying that the blame doesn't lie with the firmware vendors. Just that now, we know how bad the problem is, and a few possible workarounds.

Not everyone who runs linux wants to learn what is necessary to spend hours doing whatever the opposite of "automatic" is.

I feel for those people, I do. But in this market, you can't get something which does the right thing, without hours doing the opposite of "automatic". Such an offering does not exist. You can live with a system that has various issues due to malware, or major updates breaking things you depended on, or get someone who did spend many many hours learning to spend a couple hours fixing your problems.

OS X isn't too bad of an approximation though, if you don't sign into iCloud, turn off spotlight integration with internet services (see the recent story about people's macs crashing due to a malformed response from the server!) and don't link your iPhone to a mac (or don't have one). Finally, don't update to the next release until just before it's replaced by another, e.g. don't upgrade from 10.10 to 10.11 until just before 10.12 is released (but do install security updates of course).

But then again ... you'd have to be a bit of an expert to know whether you could trust my advice, and if not, who to trust, especially since you'll definitely get contradicting advice from similarly-knowledgable-sounding people ... so you're ultimately on your own.

The road to compile your own kernel and build your own distribution is well documented and there are even several books on the topic. Compiling your own system from scratch without the aid of a distribution makes it easier to appreciate the automatic defaults that usually get installed.

I personally do not want to be forced to know all the hazardous settings that exist in the kernel and base system. I want sane defaults, defaults that are set automatically when I install my Debian packages. If I disagree with an automatic default, my options are to either utilize the control (freedoms) that FOSS software provides, or issue a bug report and try to get the community to agree with my views.

If "rm -rf /" is a valid use case/concern, send up a bug report. It's better than trying to remove automatic defaults from distributions.

> I personally do not want to know all the hazardous settings the kernel has, where even a single mistake could spell the ruin of the whole machine.

That's kind of an ironic argument to make, given that you are using it to argue in favour of a design where a single mistake can actually brick your machine to the point that fixing it needs a soldering iron.

And you think in this case allowing the user to brick the system is the sane default, instead of putting it behind an (easy to flick) switch?

It is behind an easy-to-flick switch called su or sudo. The claim is that systemd should add a second switch because the first one is "not enough", a claim the systemd devs disagree with.

There are more operations that root can do that can brick a system or destroy hardware. Why should systemd try even harder to make root not do that?

Because working with su/sudo is still something that's often enough required for normal operations, that IMHO shouldn't have side-effects of that level. The "with great power..." spiel sudo displays is nice, but it isn't just experienced sysadmins running sudo anymore.

Since the OS doesn't provide permission levels to express this difference, it makes sense to create that isolation otherwise.

I've run rm -rf as root in the wrong directory before, and nuked stuff that required a backup to fix. I'd prefer if everything worse than that required some extra mental confirmation that, yes, I'm sure I want to do that.

There is always SELinux which can limit root. It was fairly easy to set up last time I tested it, and there have been attempts in the past to put it in as default.

A lot of distros also alias "rm" to "rm -i", something that many users explicitly disable. It's a complex problem of security vs usability where most discussions have been rehashed several times.

Personally I find rm too accepting, and rm -i too restrictive.

Using rm on its own will happily perform the command without further verification.

On the other hand, rm -i will request a yes/no on every last file involved.

Personally I have taken to using mc for any "complex" file system manipulations.
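The middle ground between plain rm and rm -i described above is to ask once, up front, when a lot is about to be deleted (GNU rm's -I option does roughly this). A rough sketch of the idea follows; the threshold of three and the function names are illustrative choices, not anything rm itself exposes.

```python
import os
import shutil

def count_entries(path):
    """Count a path plus everything under it (a plain file or symlink counts as 1)."""
    if os.path.islink(path) or not os.path.isdir(path):
        return 1
    total = 1
    for _root, dirs, files in os.walk(path):
        total += len(dirs) + len(files)
    return total

def remove_with_one_prompt(paths, ask=input, threshold=3):
    """Delete paths, asking a single yes/no question if more than threshold entries go."""
    total = sum(count_entries(p) for p in paths)
    if total > threshold:
        answer = ask(f"remove {total} entries? [y/N] ")
        if answer.strip().lower() != "y":
            return False  # nothing was deleted
    for p in paths:
        if os.path.isdir(p) and not os.path.islink(p):
            shutil.rmtree(p)
        else:
            os.remove(p)
    return True
```

Small deletions go through silently, big ones cost exactly one confirmation instead of one per file.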

I sympathize with you that the UEFI implementations are bad, but I don't agree that it's an excuse for everything on top of UEFI to be bad.

I think the two competing ideas are:

Either you make libraries that strictly conform to the spec and are "technically perfect", leaving any bugs that stem from non-compliance to the violating parties (i.e. the UEFI implementors)

Or, you make libraries that conform to the spec and have some "dirty" handling to work around holes left by crappy UEFI implementations.

The first option feels great to write as a developer, but the second option is what most users really need - it's the "it just works" factor that users care about. I know it hurts to code around shitty implementations, but there's no other alternative if reliability and idiot-proofing matter.

The best I can suggest is to make the workarounds "pluggable" so that developers don't have to deal with the harsh realities unless they specifically go looking for the plugged-in workarounds.
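One way to picture "pluggable" workarounds is a small registry keyed by vendor, so the spec-conforming path never mentions any particular broken firmware. This is only a sketch: the vendor string, the request dictionary shape and the quirk itself are all hypothetical.

```python
# Registry of workarounds, keyed by firmware vendor string.
QUIRKS = {}

def quirk(vendor):
    """Decorator that plugs a workaround in for one vendor."""
    def register(func):
        QUIRKS.setdefault(vendor, []).append(func)
        return func
    return register

@quirk("ExampleVendor")  # hypothetical vendor with fragile firmware
def refuse_variable_deletes(request):
    # On this firmware, deleting variables is assumed to be unsafe.
    if request["op"] == "delete":
        return {**request, "op": "deny"}
    return request

def apply_quirks(vendor, request):
    """Run the request through every workaround registered for this vendor."""
    for fix in QUIRKS.get(vendor, []):
        request = fix(request)
    return request
```

Code that follows the spec calls `apply_quirks()` once and otherwise never sees the workarounds; conforming vendors simply have no entries.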

Working around ACPI bugs has been a big part of platform support pretty much from the inception of ACPI. It's already been a fact of life that systems ship with firmware that only just barely manages to not completely break with the intended OS (Windows du jour, or OS X for Macs) while Linux needs piles of workarounds. But UEFI significantly expands the exposure surface of bad firmware and the situation truly is moving toward being untenable.

The computer industry really needs to overhaul how system firmware is developed, tested, and deployed. Linux devs and users aren't the only ones feeling the pain; they're just the ones who are both skilled enough to trace the problem back to firmware, and inclined to rant about it publicly.

It's a cultural problem in a lot of these companies (Foxconn, Samsung, etc). The software is a cost center, the hardware is the product. So the developers who write the code are rarely best in class or treated well. They don't have any influence in their companies to cause change. The suits would rather have everything proprietary with a flashy gif riddled loud glossy UEFI graphical screen than actually working because it looks good in board meetings.

We have seen this happen elsewhere, and seen how it turned out. Browser rendering engines used to be NIH proprietary behemoths and IE6 was the pinnacle of the failure of that model. Graphics APIs and hardware are another great example - game developers are writing broken shaders that driver developers need to work around because the API is broken and the hardware is undocumented proprietary bullshit. Then you end up with insanity like Nvidia having the de-facto standard OpenGL implementation in the industry such that a ton of OpenGL programs break on AMD drivers because they don't have nuances and breakages of the Nvidia one.

But gradually people realize that it's not a profit center to be a closed source broken douchebag. Either the industry goes through the pain of correcting the damage (gecko / webkit / blink / etc) or just abandons the insanity (Metal / Vulkan / DX12 / Mantle).

The thing is we are not putting any financial pressure on motherboard makers to not be complete proprietary douchebags. Libreboot and Coreboot are at best infant hobbyist projects with extremely low penetration and they take half a decade to support products or just ship firmware blobs that were half the problem in the first place anyway (cough, Chromebooks). Network routers used to be that way, until the grassroots OpenWRT projects were so successful manufacturers started making sure their routers were supported (TP-Link / Linksys) or even in the best scenario selling hardware running open source firmware (Buffalo). This needs to happen with x86 chipsets and motherboards desperately because UEFI is making the ACPI mess worse when the answer has been obvious this whole time - open documentation and specifications and working together rather than working in secret.

Which also reminds me of something else, which is the problem of multiple hardware vendors making extremely similar commodity "beige boxes" that can only compete on price. One of the entries on my wishlist for an Intel/AMD branded laptop is a "high quality UEFI BIOS": https://www.reddit.com/r/hardware/comments/42cnbq/would_inte...

As an aside, it's still crazy and undocumented in the video-game world.

God forbid you should try and play a new AAA game without the latest patches for your driver!

The problem there is largely game writers doing stupid/non-standard things. The driver writers then special-case the game in their drivers to patch the wrong behaviour. That's why many driver releases from nVidia have only one changelog entry, "optimised for <popular AAA game here>", and coincide with the launch of the game.

This is wrong on so many levels. It makes it really hard to compete with nVidia/AMD when you need to hire hundreds of driver writers to patch every game under the sun if you want to have reasonable performance.

Sadly MS may well have been the very entity that institutionalized this behavior.

For example, they included an exception in their memory protection code for SimCity. This because said game held a "use after release" memory bug that had gone undetected. And rather than have the game crash on later MS product releases, MS put in said piece of code.

> Or, you make libraries that conform to the spec and have some "dirty" handling to work around holes left by crappy UEFI implementations.

It's a long-standing tradition to have workarounds in kernel modules for broken hardware.

Hopefully there is a sufficiently-well-funded testing lab out there to get a complete list of UEFI implementations that are broken in this way so that the next rev of the uefifs module can properly refuse to do whatever it is that's bricking these dangerously broken motherboards.

> but there's no other alternative if reliability and idiot-proofing matters.

But that is my whole point - there is. We need market pressure that makes it unacceptable to sell computer hardware whose controlling software's source is undisclosed.

There is no solution for this user, but settling for broken implementations and broken libraries is the stuff of the Windows world where everyone behaves like a baby who refuses to share their toys.

This isn't Facebook or any consumer proprietary product where the software is the product. This is hardware controlling firmware that is never pretty or sexy and whose implementation is not a trade secret or the difference between selling units.

MSI will absolutely refuse to disclose the firmware to his laptop so that he can make it so his replacement does not also brick itself

The IBM PC, XT, and AT came with a physical copy of the complete source code for the BIOS, several hundred pages in a 3-ring binder. (You can get a digital copy of it here: https://archive.org/download/bitsavers_ibmpcat150ferenceMar8... ) IBM did this, despite the fact that they strongly recommended applications use the BIOS interfaces and not access the hardware directly. They could've saved so much paper if they just documented the API, but they didn't --- they released the whole thing.

Yet 30 years later, in an era where it's easier than ever to distribute large amounts of information, companies regard such details as confidential and proprietary, hiding them even from their customers --- the real owners of their products. It costs next to nothing to distribute a DVD with each machine containing all the source code and documentation, or just put it up for download. Unfortunately the only recourse seems to be the occasional leak[1], and what's more disappointing is the overwhelming response that this is a "bad" and "insecure" thing, when it's really liberating what users should've received with their hardware.

[1] http://adamcaudill.com/2013/04/04/security-done-wrong-leaky-...

It wasn't useful for cloners either anyway, as it was copyrighted.

Well, it could be used to write a spec sheet that was then used for a clean-room implementation of a clone. But I guess there was a risk that some terminology or similar could leak through.

Compaq managed it with their "Portable".

> but until enough people care - I mean, enthusiasts and techies already don't care, and we would need a popular general consumer movement to care to inspire real change - it will only get worse.

Create a virus that goes around messing with UEFI vars to brick the machine. Don't damage any user data, just the BIOS.

It'll get fixed.

Not difficult to do it under Windows either BTW.

> It is not a systemd bug to mount efivars read/write. The efitools - efibootmgr et al - require write access to that table. By the spec, this should not brick computers.

Why a filesystem and not a library? Write access doesn't require a filesystem mount.

Part of the UNIX philosophy is to expose as a file what can be exposed as a file. That way, open, read, write and close is all you need.
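To make the "open, read, write and close is all you need" point concrete for efivars: once the variable is a file, reading it is ordinary file I/O plus a tiny bit of parsing. The sketch below assumes the efivarfs layout in which each file starts with a 4-byte attribute mask (little-endian on x86) followed by the variable's data; the function names are illustrative.

```python
import struct

def parse_efivar(raw):
    """Split the raw bytes of an efivarfs file into (attributes, data)."""
    if len(raw) < 4:
        raise ValueError("efivarfs file shorter than the 4-byte attribute header")
    # Assumed layout: uint32 attribute mask, then the variable payload.
    (attributes,) = struct.unpack("<I", raw[:4])
    return attributes, raw[4:]

def read_efivar(path):
    """Read one variable with nothing but open/read/close."""
    with open(path, "rb") as f:
        return parse_efivar(f.read())
```

The flip side, and the subject of this thread, is that unlink() on such a file is just as ordinary as read().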

Is it? I understand there are many things in UNIX that are exposed as file like objects, but is it part of the philosophy that this should be done with anything that it _can_ be done on? Can you cite me any sources?

I've heard and read that for Plan 9, but I too would like the source for UNIX.

Everything a file, even the computer itself: https://media.giphy.com/media/P7PmvHY6kzAqY/giphy.gif

That's not a real good reason given the damage and since we're talking about systemd.

Given that much of the criticism of systemd is based on, "it's not like Unix," that certainly seems like enough of a reason to me. People would complain even more if systemd required a custom library to install your bootloader than they do when calling rm -rf / on a system without GNU extensions (--preserve-root by default) bricks a motherboard.

FreeBSD uses a library and I don't think it suffers from criticism that it's not UNIX. UNIX isn't Plan 9, and exposing something that can brick your system in the filesystem is just irresponsible. You can blame the vendor, but it should have never gotten as far as the vendor's bad decisions.

> It is not a systemd bug to mount efivars read/write. The efitools - efibootmgr et al - require write access to that table. By the spec, this should not brick computers.

How often do you need to run efitools though? Should it always be mounted read/write for the one or two times in the lifetime of the system that you need to adjust the boot variables? Wouldn't it be more reasonable for efitools to suggest you might need to remount it read/write while you're changing the variables?

He needs one, and you shouldn't get downvoted for saying this.

Systemd has brought lots of good things to Linux, but tons of bloat and insanity too. I'm quite scared because the next thing they seem to be tackling is one of the core things that makes Linux so nice, package management [1].

They intend to container-ize everything. While that might be in principle good (NixOS got it right), their solution is likely to end up being a mess: no control over container contents. Imagine a critical security bug on e.g. OpenSSL. Good luck patching all your dozens of unmanaged containers running different OpenSSL library instances.

[1] http://0pointer.net/blog/revisiting-how-we-put-together-linu...

Reinvention without purpose often appears to take credit for fixing "old==bad" and dodging blame for churn, usually orthogonal to any understanding of users' needs: enterprise, industrial embedded, IoT, desktop, mobile, etc. and the actual pain and cost of change at large scale (beyond people one knows and non developers).

Note that NixOS does not use containers. Instead it checksums every "package" so that only the dependencies that match exactly get used at run time. Thus you may have multiple copies of a single lib version, depending on the compile flags used.
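The checksum idea can be illustrated with a toy model. To be clear, this is not Nix's actual hashing scheme, and the /store prefix and function name are made up; it only shows why the same library version built with different flags ends up as two separate copies, and why dependents pick exactly one of them.

```python
import hashlib

def store_path(name, version, flags, dep_paths):
    """Toy content-addressed path: hash of name, version, build flags and dependencies."""
    h = hashlib.sha256()
    for part in [name, version, *sorted(flags), *sorted(dep_paths)]:
        h.update(part.encode())
        h.update(b"\0")  # separator so adjacent fields cannot run together
    return f"/store/{h.hexdigest()[:16]}-{name}-{version}"

# Same version, different compile flags: two distinct paths side by side.
shared = store_path("openssl", "1.0.2", ["shared"], [])
static = store_path("openssl", "1.0.2", ["static"], [])

# A dependent's own path changes with the exact dependency it was built against.
nginx = store_path("nginx", "1.9", [], [shared])
```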

I think what gets quite a few riled up over systemd, besides the continued shoggoth-like scope creep, is the flip flopping on how things are handled.

For example, if you run mount -a, mount will present an error per failed mount, but will continue trying to mount the rest of the entries in fstab. Systemd on the other hand will fail the whole mount unit on just a single such error, and this in turn will fail the boot because various other units depend on the mount unit completing.

This has bitten admins of remote systems that have gotten "upgraded" to systemd as part of an update of stable distro releases. All because they had a straggling entry in fstab that may have been sitting there quietly for a decade.
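A rough pre-upgrade lint for such straggling entries could flag every fstab line whose failure would be fatal under the behaviour described above. This sketch assumes that entries marked nofail (or noauto) are the ones systemd treats as optional, and it skips the root filesystem and swap; the function name is illustrative.

```python
def risky_fstab_entries(fstab_text):
    """Return mount points that are mounted at boot without nofail/noauto."""
    risky = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed line, ignore in this sketch
        mountpoint, fstype, options = fields[1], fields[2], fields[3].split(",")
        if mountpoint == "/" or fstype == "swap":
            continue
        if "nofail" in options or "noauto" in options:
            continue
        risky.append(mountpoint)
    return risky
```

Anything it reports is worth a second look before rebooting a remote machine into systemd for the first time.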

Then you have them overriding a kernel default on mount namespacing, because the containers people want it that way, while others have come to rely on the kernel default being, well, default.

I don't think they have yet to "solve" handling NFS mounts, instead giving the world their own take on ctrl-alt-del, while disabling the sysrq sequences.

Or how systemd would (will?) blank out an environment variable via systemd-pam when su -l was used, resulting in people's personal settings getting mauled by root. Apparently su is "broken", according to Poettering.

And now Poettering goes ahead and closes this report with what amounts to yet another "wontfix".

Theodore Ts'o seemed to hit the nail on the head nearly 2 years ago[1].

[1] https://plus.google.com/+TheodoreTso/posts/4W6rrMMvhWU

I know NixOS is not based on containers. But you can containerize any environment (same in Guix) for sandboxing purposes.

Yes, but let's distinguish between providing the option and having it as part of the design requirements.

And iirc, containers are crap sandboxes.

From my perspective, containers just make explicit what's always been true: the library dependencies linked into the "release" of an "app" need to be stewarded by the app developers, rather than the OS distributor. System components are owned by the system maintainer, but apps are owned by their developers, and the SLAs are separate.

If Nginx, for example, has a security flaw, it's properly Nginx's job to make a new container. That flaw might be because of Nginx's dependency on OpenSSL, but it might also be because of a vulnerability in Nginx itself. Nginx (and every other "app" developer) needs the process in place to get new releases out in the latter case; if they have that process, there's no reason to not apply it also to the former case.

Distro maintainers have been picking up the slack of creating "security releases" for app software for decades now, but they only do it because app developers were unwilling to take on the responsibility of maintaining packages in several different package ecosystems. App developers are not nearly as timid when it comes to container-image releases, since it's just the one QA process rather than five. So, properly, the responsibility—and the SLA—is reverting back to the app developer.

You can't expect, now, that paying Redhat or Canonical for a support contract will mean that Redhat/Canonical will be required to rush you patches for your app software. It never worked this way anywhere else; Microsoft and Apple aren't responsible for patching the vulnerabilities in the apps in their ecosystems. (Even the ones you buy through their stores!) The Linux world is just coming back to parity with that.

Now, on the other hand, system components—things the OS itself depends on—still need to be patched by the OS manufacturer. Microsoft still delivers IE patches; Apple still delivers Webkit patches. Because those are fundamentally system components, used by the OS to do things like drawing notification dialogs.

Those components happen to come with apps, but the system component and the app are decoupled; there's no reason the version of Webkit the OS uses, and the version of Webkit Safari uses, need to be the same. And they're not: you can download a fresh Webkit from Webkit.org and Safari will pick it up and use it. Apple, thus, only takes SLA-responsibility for the system-component Webkit—not the "app Webkit." The same is (soon to be) true for Linux distro-makers.


The near-term effect of all this, of course, isn't that app developers universally pick up the maintenance-contract stone and start carrying it. Most have spent too long in the idyllic world of "downstream support" to realize, right away, what this shift spells for them. Instead, in the near term, it'll be your responsibility, as a system administrator, to be a release-manager for these container-images. This was always true to some degree, because every stack tends to depend on some non-OS-provided components like Nginx-current or the non-system Ruby. But now it'll be pretty much everything.

An interesting incentive this creates is to move your release-management more toward monolithic synchronized releases. If you're managing the releases of all the components in your stack, it's much easier to think of yourself as releasing a whole "system" of co-verified components, rather than independently versioning each one. When you upgrade one component, you bump the version of the system as a whole. This sort of release-management is likely familiar to anyone who has managed an Erlang system. :)

But don't you think that something like Nix gives us the best from both worlds?

I can have containers, but these containers are still manageable, not black boxes.

Furthermore, since packages come as declarative recipes, one can try to reproduce binaries (see guix challenge command). Otherwise you are at complete mercy of the packagers.

For very large deployments, Docker-like containers are fine. But for desktop applications I think it's not the way to go.

The thing you don't get with Nix is the ability to package software that was developed for some silly legacy distro. Containers embed the required bits of whatever OS the app was built for; Nix requires the app was written for Nix (or written to be autotools-style flexible.) Pragmatically, in the case where the app developer isn't the app packager, containerization wins—I can just slap together a container-image formula that pulls in the e.g. Ubuntu-PPA-only libs the software depends on, that might not be available in Nix's ecosystem.

If you're writing software from scratch, though, by all means, target Nix as your package-dep ecosystem, and make your own software into a Nix package. Nix is a great source for creating "community verified-build" container images, for any "legacy" container ecosystem. Where a Nix package exists, it's exceedingly simple to write a container-image formula sourcing it. Thus, having a Nix package makes the software much more accessible to many platforms.

It would also, as an aside, make perfect sense to enhance systemd with the ability to run Nix "units" that are just references to Nix packages that get resolved and downloaded+built at launch-time. (systemd would still be better off turning those Nix units into container-images, though—systemd relies on containers to provide runtime sandboxing, ephemeral container filesystems, and cross-image deduplication of base-files.)

This would seem to work great for large projects with large teams, but what about smaller teams with more modest library requirements?

Smaller teams aren't usually so concerned with where their SLAs come from, as they are with just having software that works—or rather, software releases that are incentivized by some force or another to be both stable and fresh.

So, for small teams, I see a likely move more toward the model we see with "community-made verified-build Docker container images": a GitHub repo containing the container-image formula for the release, that many small teams depend on and submit PRs to when vulnerabilities occur.

While not ideal, this is far better than the Ubuntu PPA style of "some arbitrary person's build of the release." It doesn't give you anyone to blame or bill for downtime, but it does give you frequent releases and many-sets-of-eyes that hopefully make your downtime quite small.

It's a bit like the atomized "bazaar" equivalent to the "cathedral" of a distro package-management team, now that I think about it. Each verified-build formula repo is its own little distro with exactly one distributed package, where anyone with a stake in having good releases for the software can "join" that distro as one of its maintainers. :)

He doesn't seem to see a difference between "merely" having to reinstall the OS and restore from backups, and a machine that can't boot any OS anymore? And he's apparently a "senior software engineer" at RedHat?

It doesn't even matter if you're pro or anti-systemd. That sort of response just shows a huge lack of understanding about the severity of the problem.

I have also seen him brush away the difference between su and su -l, claiming that su was fundamentally broken and that we should instead ssh into localhost.

Then let the distro handle it. If your users are bricking machines, mount efivars read-only. (This is trivially done even when systemd is hard coded to mount it read/write).
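
For anyone wondering what "let the distro handle it" could look like, here is a rough sketch of a check for whether efivarfs is currently mounted read/write. It only parses text in the /proc/mounts format. The function name and the sample line are my own illustration, not anything taken from systemd or a distro.

```python
def efivarfs_is_writable(mounts_text):
    """Return True/False for the efivarfs mount state, or None if not mounted.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        fstype, options = fields[2], fields[3].split(",")
        if fstype == "efivarfs":
            return "rw" in options
    return None


# Illustrative sample line (hypothetical), not captured from a real system.
sample = "efivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0\n"
print(efivarfs_is_writable(sample))
```

On a real machine you would feed it the contents of /proc/mounts. If it reports a writable mount, something along the lines of `mount -o remount,ro /sys/firmware/efi/efivars` should flip it, with the caveat that tools like efibootmgr would then need it remounted rw before they can change anything.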

This was discussed yesterday[1]. I still don't think Lennart is saying what people seem to think he's saying. Specifically, "The ability to hose a system is certainly reason enought to make sure it's well protected and only writable to root."

Note that he says writable only by root and well protected. As far as I can tell, that's agreement that there needs to be something done to make it safer. All his other statements seem to be noting that it's not as simple as making it always read-only, as there is legitimate need for write access in some instances, and tips on how to mitigate the current issue.

1: https://news.ycombinator.com/item?id=10999335

Poettering's behaviour ultimately reflects poorly on his employer, Red Hat. It seems like they just allow this behaviour to continue unchecked.

The personal attacks on Poettering should be directed to Red Hat instead, who actually have the power to do something about it.

Disclaimer: I work at Red Hat.

I don't think you quite understand the relationship Red Hat has with its employees ("associates" in company lingo but I abhor such doublespeak). Allowing those employees to act and speak independently is kind of a core value. That freedom is only curtailed when it directly impacts paying customers to the extent that it would be considered a breach of that relationship. Upstream, in the open-source community, almost anything goes. Yes, that means systemd developers can be a bit priggish. It also means other developers, also employed by Red Hat, can call them out for it. It's the only way to ensure the diversity and meritocracy that are essential to innovation. Otherwise, you end up not being able to trust a word that employees of a company say because you know they'd never dare say anything even slightly inconsistent with the party line. I used to see that when I worked at EMC, just for example, and it's really quite stifling.

Personal attacks on Poettering should not be redirected anywhere. For one thing, personal attacks don't get anyone anywhere. Legitimate criticism of his views should be directed at him, just as legitimate criticism of my views should be directed at me and legitimate criticism of your views should be directed at you. There's no reason to bring any third party into it. No matter how much you hate them or why, that's simply irrelevant.

>Legitimate criticism of his views should be directed at him

Meanwhile, on the GitHub issue:

>Locking this one. Note sure which peanut gallery site linked this...

This is clearly a guy who thinks his own opinion is above reproach and the unwashed masses have no right to question him.

Most of the discussion on that issue was just useless noise after it got picked up by HN and other sites though, so it makes sense to close that one down for now.

After it happens enough times you would think a lightbulb would turn on somewhere...

I think there was a light going on: they found that bugzilla.freedesktop.org doesn't offer the right ACL tools to keep the trolls away, so they migrated to github.com's issue tracker.

> Disclaimer: I work at Red Hat.

The problem you seem to miss is that his arrogant antics _do_ reflect on Redhat, and reflect badly on Redhat. I for one will _never_ use anything from Redhat, nor pay Redhat any money for anything as long as Poettering and the rest of his crew (Sievers et al.) are employed by Redhat.

I am sure I am not alone in this viewpoint. As more realize that the problem is a problem at Redhat, eventually Redhat management will be forced to intervene.

PS as an employee, you should be pointing your upper management to these types of discussions, for the very reason that the actions of a loose cannon group are reflecting very badly on Redhat as a whole.

> As more realize that the problem is a problem at Redhat, eventually Redhat management will be forced to intervene.

I think you're missing the point. It's not about who you tell. It's about who you're criticizing. Red Hat is not the one making these comments. If you condone a company summarily removing a project leader you don't like, then you also condone a company summarily removing a project leader you do like. That doesn't end well. In fact, I could name projects on which I feel Red Hat has forced their will on upstream entirely too much, to the detriment of both. It's not the way a meritocratic community is supposed to work. I think in general it's better to let technical communities deal with their own issues, and in general Red Hat is wise enough to recognize that.

Believe me, I know where you're coming from. I almost didn't join Red Hat myself because of people like Ulrich Drepper (who was still there) and Al Viro (who would even have been in my group). I understand the sentiment. Criticize Poettering if you want, make sure Red Hat knows the effect that his behavior has on your purchasing decisions, but don't blame them for trying to do the right thing by adhering to a policy with a solid philosophical and practical foundation behind it. Do you want corporate overlords to be meddling in every project's internals?

This is what I hate. This whole "attack the employer for the employee's opinion" thing. This is backward and disgusting. This is the same tactic used by tumblr and twitter in their witch-hunts against people who have an opinion they don't like. It's despicable.

Another example is Red Hat's long support of the awful Ulrich Drepper.

I know I'm a tiny fish in the game, but because of the behaviors of Red Hat employees, I am disinclined to purchase products they make, and even use Linux as a whole. People ask my advice, I give it, but don't have control over pursestrings, or anything like that. However, the behavior of Red Hat employees, like Poettering, Sievers, and Drepper before them makes me believe that Red Hat is not an organization that does a good job of creating leaders. The aforementioned people are poor stewards of projects, showing poor technical decision making and poor communications ability, and it reflects poorly on the organization as a whole.

Poettering, being a high profile employee of Red Hat, is one of the faces of the company. Because of this, I would argue that Red Hat has a duty to step in from time to time to tell him that what he is doing is harmful in the long term. If they don't do it soon, I can almost guarantee there will be a lot fewer support contracts for Red Hat in the months and years to come; they just don't create dependable developers.

What you're doing is a bullshit power move in attempting to threaten someone's job because you don't like their attitude. One discussion where someone was an arrogant douche might be outweighed by the sum total of all their contributions.

If all Poettering did was community outreach, you would have a much stronger case that he's bad in all aspects of his job and should be fired. Trying to pressure Red Hat by saying it 'reflects poorly on you' is basically saying 'you should get rid of this guy even if you think the good outweighs the bad, because I don't like him.'

No, this is a decision based on the fact that he has been an arrogant douche in many, many discussions. It is also based on his many technical missteps, such as including the fundamentally broken efivarfs in systemd, choosing poor defaults for other portions of systemd (like pointing the default ntp server at a source which says not to use it as such), and making the easily-corrupted journal an inextricable part of systemd. He has a long track record of making questionable decisions and then getting downright petulant when criticized. He's being paid in part to be a leader of a few high profile projects when he shows a lack of leadership ability. That's why I feel he should be gotten rid of.

Perhaps you're right that he should be gotten rid of, but redirecting criticism of him toward Red Hat is precisely not the way to achieve that. Red Hat won't care if you criticize them as a company. Why should they? However, effective community leadership/stewardship is part of the job description for someone like Poettering. If a person's conduct is poor, criticism should remain focused on them individually. If there's enough such criticism, then the project's own community should act to resolve the situation. It's both less optimal and less likely for an employer or sponsor to take action, but even then it would only be due to criticism of the individual and not themselves. One of the nice things about open source is that it's resistant to such political "get you in trouble with your boss" backstabbing. Address the individual directly and honestly, or forget it.

Since HN won't let me respond to uuoc directly, I'll respond here. I wasn't saying criticism should only be directed to its target. That's not the kind of "redirection" people seemed to be suggesting. What I'm saying is that the criticism should remain about that person, not displaced onto some third party. Believe it or not, it's possible to see how someone's behavior is affecting a project, without relying on that person to convey such information themselves. We've all done it here, after all. Anybody who thinks it's reasonable to blame individual behavior on someone's employer should consider whether they'd be comfortable with the idea when it's their behavior and their employer involved. I doubt it.

> However, effective community leadership/stewardship is part of the job description for someone like Poettering.

If this is true, then this is exactly _why_ the criticism should be directed towards Redhat. Because in his arrogant world view, he can do no wrong, so he will not report to his bosses that he has ineffective community leadership/stewardship. The only way his bosses will know of his ineffective leadership/stewardship is if the criticism is directed towards Redhat, and therefore, his bosses.

I.e., the criticism has to go around the roadblock, Poettering being that roadblock.

> Poettering, Sievers, and Drepper

Three employees among thousands, representing two projects out of hundreds. Why generalize from that sample? Why ignore all those contributions to the Linux kernel or Fedora, OpenStack or Kubernetes, gcc or coreutils? Some pretty good leaders in there. And if "creating leaders" is supposed to be how we judge companies, what should we make of much larger companies where few employees engage with the community at all? I don't even mean unpopular companies like Microsoft or Oracle. What about Google, for example, or Apple? When it comes to community leadership, they're net negatives; existing leaders go in, and are never heard from again. When every single developer at a billion-dollar-a-year software company is engaged with some open-source community or other (often several), there will be a few losers. That's a poor reason to insult thousands of others.

The way i see it is not so much the individuals, but that so much of the traffic between them happens within the corporate realm that by the time it hits the public repositories they have all agreed on some iron clad world view.

The whole thing reminds me of how priests and monks would debate the number of angels that could dance on the head of a pin.

> That freedom is only curtailed when it directly impacts paying customers

So that's basically what uint32 proposes: to make an impact.

That sounds like a lovely bit of legal ass covering.

Redhat unfortunately seems beyond reproach by many, even with all the bad behavior they've allowed under their watch. Ulrich Drepper was an employee of theirs during his most infamous time as glibc maintainer. Their purchase of Cygnus scattered a lot of real interesting systems development tools, like the Insight debugger, to the wind. Unfortunately, they're one of the loudest mouths in the room, so a lot of their bullshit doesn't get called out, leaving everything worse for the wear.

And having them all under one (virtual) roof may be producing an echo chamber effect among them.

I can't seem to locate it now. But I seem to recall a video released within the last year or so that showed various people from within RH, regarding the history of Linux. And at one point one of them energetically declared "we won" in regards to some "unix war".

All in all the video gave the impression of a company culture that had the mentality that they could do no wrong.

The "Unix wars" refers to the fights among Linux and the various proprietary Unixes (AIX, HP-UX, Tru64, Solaris etc.). "We won" referred to Linux, not Red Hat. Of all the proprietary Unixes, only Solaris is still alive and it's partly open source too.

Hear hear, until Redhat begins to feel some heat from Poettering's behaviour (and that of Sievers and the rest of these fools), they will just continue on the arrogant bull-in-a-china-shop path they are presently on.

Start informing Redhat that these responses on Poettering's part reflect badly on them as a whole, and especially _stop_ paying Redhat any money until the issue is resolved, and Redhat will resolve the issue very quickly.

At the risk of a No True Scotsman Fallacy, No linux user uses Redhat for more than 2 years before hating everything about life, or moving on to something better (or getting paid to support the monstrosity). The company seems to thrive on PT Barnum's philosophy of a sucker being born every minute.

Poettering and Sievers are just freedesktop.org adding another layer of feces on the pile. The fact that people switch distributions to avoid his applications (NetworkManager, PulseAudio, Avahi, systemd), and the fact that someone hacked XBill to include him because he's a worse villain than Bill Gates, should put him on a level that even Ulrich Drepper couldn't touch, and everyone hated that guy.

Redhat has been a useless organization since they chose to promote Havoc Pennington over Carsten Haitzler (longer than any of you have known they exist). Blaming them does nothing to stop their slow lurch over the linux landscape. Boycotting them might, but there are too many RHCE's out there. Best to focus on FDO and Poettering/Sievers.

On that note, Freedesktop was founded by Pennington...

RedHat employees have always been quite quirky types working alone with a "luser" attitude. Sad but true.

No surprise. This guy is dangerous.

Things like this add to my impression that UEFI is a solution looking for a problem. The fragility caused by all this extra complexity is extremely undesirable for a system component whose only job should be to test and setup the hardware minimally, then leave the rest to the OS.

Then again, there's also the question of why removing EFI configuration variables would make the machine unbootable; you would think that in the absence of any explicit configuration, the firmware should just choose a sane default. That would be like making an rm -rf / also reset your BIOS settings, which is probably surprising but easily recoverable behaviour. This seems as crazy as mounting the whole flash ROM as a filesystem, so deleting it erases the firmware. The symptoms described do sound like what happens if you try to boot a motherboard with a completely blank BIOS chip (from personal experience...) --- the system will power on and just stay on, but nothing else will happen, not even a POST beep.

Edit: given that the majority of users have probably never touched BIOS/UEFI settings so that they remain at defaults, resetting them would not be noticed by them either. It's likely the advanced users, the overclockers and so forth, which will be running non-defaults.

It took Linksys months to make their firmware for their "new open source" WRT routers reasonable enough that DDWRT / OpenWRT didn't have to deny their patches.

All these hardware companies - the motherboard makers, the wifi radio makers, the hard drive makers, the peripheral and chipset makers - all write the most awful insecure disastrous code in the industry. But because nobody cares enough to put their wallet where their mouth is and refuse to buy hardware without access to the firmware to audit, improve, fix, or replace it this is what you are left with, and you get what you pay for.

And there are plenty of firmware settings you might want to change even without overclocking. Change the default boot hard drive? Firmware. Turn off unused ports on your motherboard? Firmware. Change fan speed settings? Firmware. Any implementation of network / usb booting? Firmware. Full HD encryption? Firmware.

I just know the next laptop I buy will be whatever the highest end liberated Chromebook at the time is, preferably without cancerous firmware blobs that control everything, but that seems unlikely considering how anti-freedom Intel is.

I just hope AMD saves x86 computing in the Zen generation. They are the underdog, they have reasons to not throw users under the bus for complete control of the platform like Intel does. But their hijacking of radeonHD and injection of firmware blobs there doesn't make me hopeful for first-gen freedom respecting hardware from them any time soon either.

RISC-V, save us!

Agreed, but one small pedantic correction: the Linksys firmware was written by Chinese contractors. I know because I had to deal with them. It's also why the open source process took so long - the firmware was a binary blob and the contractors refused to cough up the GPL'd source.

> I just know the next laptop I buy will be whatever the highest end liberated Chromebook at the time is, preferably without cancerous firmware blobs that control everything, but that seems unlikely considering how anti-freedom Intel is.

Yep, I don't think you'll find x86 Chromebooks without blobs. Nowadays the only way Intel provides to boot their hardware is "here, take this binary and put it at the beginning of your BIOS".

Same thing with AMD, btw.

I'm fine with ARM; hopefully future iterations of the ASUS C201[1] will be portable to Libreboot by the time I'm shopping for a new laptop. It won't be for a while, I hate giving these scum freedom denying companies my money for locked down puke.

[1]: https://libreboot.org/docs/hcl/c201.html

> RISC-V, save us!

You presented evidence that hardware vendors write awful firmware code - why will these people write decent firmware as soon as RISC-V comes, i.e. what evidence do you have that we won't have the "accepted" behaviour of having RISC-V and bad firmware?

The point is that RISC-V will be documented; you don't need to blindly accept whatever firmware is bundled, because it's possible to write alternative firmware. This is actually the case for a lot of embedded processors and SoCs too, but it doesn't do as much good there because even when things are sufficiently documented they're still not actually standardized. Given the opportunity, projects like coreboot and OpenWRT produce great results.

The problem there is that x86 is well-documented too. Painfully so in a lot of cases. Overly so in some. You can get, with a few clicks, the documentation, including pinouts, register settings, etc for any processor Intel has, the chipset, peripheral controllers, pretty much everything. The biggest thing you can't get documentation for is the $10 thunderbolt controller some boards have, for asinine reasons. The problem more is the companies that make the boards farm out the coding to the lowest common denominator, so all that gory documentation goes to waste all too often. RISC-V isn't going to save us from crappy motherboards, unfortunately.

I like to hope and pray that since RISC-V was founded on the principles of royalty free open instruction sets and blueprints that anyone who sockets the architecture might follow through.

A lot of the problem with x86 motherboard vendors is probably that they legally cannot disclose a lot of the internal documentation and code handed to them by Intel, because Intel uses some of it as a trade secret against ARM / AMD.

Intel gives a fuckton of information to any and everyone, they submit code to the TianoCore project, they pretty much bend over backward to say, "here, take this, make your stuff not suck." The UEFI forum's pretty similar in access to tools and specifications. This all comes down to a race to the bottom for coders because software's a liability to a lot of the companies developing hardware, even if things are handed to them pretty much on a silver platter.

Poke around Intel's firmware developer center http://firmware.intel.com/develop . There's pretty much everything there you need to make a firmware that isn't terrible, but companies will find a way to provide one anyway.

I'm not seeing where firmware.intel.com has any documentation for stuff that's lower-level than EFI modules and drivers. They've got a Firmware Support Package that "provides a binary package to initialize the processor, memory controller, and Intel chipset initialization functions", but only for some mobile and embedded platforms. Otherwise, they seem to just refer you to Phoenix, AMI, and friends so you can license their buggy stuff to build a custom GUI for. There's really not much benefit to the community in making it possible to build better EFI modules if you've got no way to integrate them into a usable replacement firmware for your existing hardware.

The real problem is that SoCs are SoCs, not Cs. The peripherals (networking, graphics, disk interface, etc.) don't necessarily need to be open on a RISC-V processor.

This is also true with ARM, which is why I doubt RISC-V is a solution to the problem of bad firmware and/or lack of SoC documentation. ARM is "open" in the sense that the CPU architecture is publicly documented --- you just can't legally implement the CPU without licensing it from ARM.

FYI the pre-Pentium subset of x86 has been public-domain and free of patents for a long time, and I believe several more Pentium-level patents are going to expire soon, so in that sense a lot of the basic x86 instruction set is more open than ARM. No doubt if RISC-V becomes popular there will be plenty of proprietary extensions to it too.

The good news is that firmware blobs such as the Intel FSP would not be the problem in cases like this and never would be.

The problem is the Linux based boot loader that circumvents Windows Activation Technology. I am convinced that -- and that alone -- is why we live with UEFI.

au contraire, the problem is windows thinks it owns the machine I bought for the sole purpose of running linux.

It's poorly phrased, but I think he is agreeing with you.

Are you confusing WAT with SecureBoot?

Probably WAT, those bootloaders patching SLIC so Windows thinks it's a certain OEM system. I think those hacks are/were based on GRUB.

The thing I find particularly hard to grok is that they came up with this idea of putting disk-specific boot information on the motherboard. Then it was one simple step for board manufacturers to mix this with low level hardware settings which also need to be stored somewhere and produce the kind of trouble we are seeing.

Systems would be safer if this stuff was stored on disk and OSs never had any reason nor even possibility to tinker with motherboard's configuration memory.

> Then it was one simple step for board manufacturers to mix this with low level hardware settings which also need to be stored somewhere and produce the kind of trouble we are seeing.

The "low level hardware settings" were always stored on the motherboard, ever since the PC/AT. The big difference is that a simple CMOS reset would reset those to the defaults, and the machine would be bootable with the defaults. In the old days some errant program could corrupt CMOS (writes to port 70h/71h), but that was relatively easily fixed.

With this UEFI stuff, it appears the configuration data is stored in nonvolatile RAM, there's no easy way to "reset to defaults", and the defaults are either missing/unusable.

> the machine would be bootable with the defaults

The good-old-days weren't all Wine and Roses. For example, there was this: http://webpages.charter.net/danrollins/techhelp/0054.HTM

If you wiped out CMOS, a "simple CMOS reset" was not sufficient to allow booting, because knowledge of the type of disk you had installed was lost.

Sure, you could iterate through those preloaded disk types and stumble upon the correct one. BUT, having a fixed selection of types proved to be too limiting. So there was a scheme to add drive types. That info was also stored in CMOS, so if you lost that, it was quite difficult to restore the configuration:

   Newer BIOSes provide a means to define a custom
   setting.  The setting will be stored in an
   undocumented location in CMOS memory (and is lost
   if the battery ever fails -- so write it down!)

It wasn't an insurmountable problem. It simply meant that you had to come up with a few bytes, perhaps by calling the manufacturer or integrator. But this was in the days before Google, so it wasn't easy to find this info with a quick search.

Drives have had CHS values printed on them since the very beginning. It does mean you have to open the case to read it, but you do have to open the case to reset CMOS anyway...

But the problem of drive geometry detection essentially disappeared with IDE autodetection, which quickly became the norm sometime in the early 90s.

> Systems would be safer if this stuff was stored on disk and OSs never had any reason nor even possibility to tinker with motherboard's configuration memory.

But isn't boosting the security of PC systems a selling point of the UEFI/Secure boot implementation? Or was that all a lie?

> But isn't boosting the security of PC systems a selling point of the UEFI/Secure boot implementation? Or was that all a lie?

SecureBoot is a farce. 99.99% of users will never be the target of the attack it supposedly prevents and the other less than 0.01% of users know who they are.

On top of that it doesn't even work. The premise is for it to be used in combination with full disk encryption (since otherwise the attacker could just remove the drive), to protect the integrity of the boot shim that prompts you for the decryption password so the attacker can't replace it with one that gives the attacker the password. But there is necessarily an unencrypted analog connection between the human and the computer and the attacker can still capture the password that way.

I'm not sure how moving boot settings from one OS-accessible place to another improves security.

If you are bothered by people booting unauthorized disks on your hardware, enforce signature checking on OS images.

I think the term is second-system. It, much like ACPI and systemd (heh), has been designed with the intent of solving all the "deficiencies" of what it is replacing in one fell swoop.


Reminds me of Cantrill, more so his story when they completely bricked a machine of a fellow worker once and then took a close look at the standard[1].

# rm -rf /

Among other things, it will delete the current directory. The standard does not say what to delete first. In their implementation it tries to remove the current directory first -> undefined behaviour -> it fails.

The logic behind it: when is it really your goal to delete your entire machine? Mostly never. You don't type it out by accident, but shell scripts with unset variables might do it.

And regarding Poettering's response[2] (not trying to start a fight): It's Poettering, what do you expect? You can hate or love systemd, but part of why people hate it is his intellectual arrogance in everything he does.

[1] He tells the story somewhere in here https://www.youtube.com/watch?v=l6XQUciI-Sc (quite entertaining)

[2] https://github.com/systemd/systemd/issues/2402
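
The unset-variable case is easy to picture: a script runs `rm -rf "$BASE/$SUBDIR"` and both variables turn out to be empty. A minimal sketch of the kind of guard a deletion helper could apply before touching anything is below. The function name and its rules are my own assumptions for illustration, not what any particular rm implementation does.

```python
import os


def safe_delete_target(base, subdir):
    """Return the normalized path to delete, or raise ValueError if it looks unsafe.

    Models a script doing `rm -rf "$BASE/$SUBDIR"` where either variable
    may be unset and therefore expand to an empty string.
    """
    if not base or not subdir:
        raise ValueError("refusing: empty path component (unset variable?)")
    target = os.path.normpath(os.path.join(base, subdir))
    if target == os.path.sep:
        raise ValueError("refusing to delete the filesystem root")
    return target


# Hypothetical paths, purely for demonstration.
for base, subdir in [("/srv/app", "cache"), ("", ""), ("/", ".")]:
    try:
        print("would delete", safe_delete_target(base, subdir))
    except ValueError as err:
        print(err)
```

In an actual shell script the usual equivalents are `set -u` or `${VAR:?}`, which abort on an unset variable before rm ever runs.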

> And regarding Poettering's response[2] (not trying to start a fight): It's Poettering, what do you expect? You can hate or love systemd, but part of why people hate it is his intellectual arrogance in everything he does.

I don't think it's just his arrogance, it's that it's not backed up by substance. Linus is an arrogant prick, yeah? But his kernel works pretty well, so he gets some slack. All pulseaudio ever did for me was waste my time and break my ability to output sound. Systemd wastes my time, makes my computer work different for no reason that's apparent to me, and now makes it easy for me to brick my machine if I'm not careful and I have a terrible bios[1].

[1] I've yet to meet a bios that isn't terrible, although hopefully few are terrible in this specific way.

I disagree with Poettering's technical decisions more often than I agree with them, but to be fair, some of PA is because the distros adopted it early. Heck, when Ubuntu picked it up the readme still described it as something along the lines of "The server that breaks your audio system."

EDIT: "bricked" is wrong, they completely deleted the machine. Had to correct this ;)

It is both terrifying and amusing that rm'ing files in /sys can brick my Linux machine.

I find it even more worrisome that some compare this mistake to accidental clobbering of /dev/sd?.

Lesson learned: if you want to semi-permanently take a system offline, hope they have a bad EFI implementation and rm -rf /sys. The fact that a malicious actor can compromise your hardware via software like that is incredible.

That's an insane design decision and if I replicated that in my professional capacity designing heavy machinery I'd be rightly fired and sued to oblivion because the equivalent result is dead people. This is a basic case of the principle of safety-in-design.

Which is why the campaign for liberated firmware is so important. If motherboard manufacturers were committing work to a common project like libreboot then hundreds of eyes would be upon it and awful code that does this insanity would never enter official repos.

Linksys had this problem a few years back with their new line of "open source" routers - it took them months to clean up their awful internal coding styles to get patches accepted into DDWRT, and even then they were accepted on a compromise where DDWRT developers had to fix a lot of it to make it less of a security, portability, and readability nightmare.

These hardware vendors at all levels - storage controllers, chipsets, radios, and more - all have absolutely no QA on their code, and by being so extremely proprietary nobody can do anything about it, and not enough people care to speak with their wallet to change these terrible habits.

> These hardware vendors at all levels - storage controllers, chipsets, radios, and more all have absolutely no QA on their code

Hardware vendors do have QA, but it's mainly about ensuring that things work, not about trying to break them in every possible way. Safety and security seem to be notoriously hard for people who have been taught how to make things work, but not how to make them fail.

Which is exactly why it would be so valuable to have that code in the open.

I know I'm repeating myself but I still think the interactions between OpenWRT and the Chinese firmware vendor that was pushing Linksys firmware upstream are a valuable example of why open source is valuable in this context even if you are not intimately involved in the development, testing, or inspection of such code. Public code by its nature requires more scrutiny and it's harder to get people to accept something broken or poorly written when they can see just how bad it is.

If you want to develop awful coding habits, only work with people who never develop free software. If you want to have really good habits, work in a very popular free software community, because when your work is in the open like that and everyone is a volunteer nobody is going to put up with crap.

  > then hundreds of eyes would be upon it and awful code that
  > does this insanity would never enter official repos.
As the example of OpenSSH clearly shows.

Well, the insanity would rarely enter official releases.

There is no comparison between the bugginess of BIOSes and OpenSSH.

OpenSSH would not fit my definition of a popular project, which is exactly why it has become a security disaster. Though another contributing problem is that C as a language is awful for writing secure or trustworthy code in the first place, which is the primary cause of most of OpenSSH's problems.

There are degrees of return on code visibility, though. Even a dozen competent developers could miss arcane buffer overflows or bad page execution issues in a large patch because the language is awful and lets you do crazy shit. That is one aspect of development quality that doesn't go away when you move from closed to open source.

But the best practices - consistent code style, documentation, reasonable variable names, reasonable line lengths, and the need to defend your contributions are all products of open collaborative development processes.

I'd argue in many ways that the open nature of OpenSSH is why we have only had three (four?) major security vulnerabilities out of it in the last five years. It's a sixteen year old ANSI C codebase, of course it's a security nightmare, but it is a lot less dangerous than it could have been - imagine having heartbleed on a proprietary TLS implementation where developers could not immediately fix it or easily deploy the fix.

You can do exactly the same thing on Windows by calling SetFirmwareEnvironmentVariable. The problem is on the firmware side!

Yeh I wasn't laying blame on the *nix side of things either. No OS should be able to do this because the OS shouldn't have that level of control over the system it's sitting upon. Firmware absolutely shouldn't fail to a broken state. If you hose some configuration and it crashes, it should revert back to a known good configuration e.g.: a factory reset / fail-safe configuration.

You shouldn't be able to hose it completely except through special equipment, for example by connecting to system programming terminals on the motherboard with external hardware. The fact that a higher-level system can damage a lower-level system is just bad design.

It's interesting to contrast this with Apple's solution to the same problem: El Capitan's rootless.

As of OSX 10.11, the live, everyday-use OS doesn't have write access to EFI variables. Instead, to fiddle with EFI vars (which happen to include the OS's kernel-module code-signing grant store, which is how people most often run into rootless) you have to reboot into the recovery partition.

In other words, instead of creating a custom BIOS setup as a special UEFI app with privileges that the OS never has, Apple has instead given OSX the equivalent of SysV runlevels—and then made EFI only writable in the single-user maintenance runlevel. Instead of transitioning between these runlevel-equivalents "online", you reboot between them; and instead of being modes of the same OS image, they're two distinct copies of the same OS. But the usage semantics are the same.

(The key to security here, if you're wondering, is that the recovery OS is a single solid image that's been code-signed as a whole, with the signer's pubkey kept in another EFI var. The live OS can't just be made to overwrite the recovery OS into something malicious, even though the live OS has full control of the disk it sits on and is responsible for replacing the recovery OS when it receives updates.)

Personally, I think something similar might be the best solution for Linux as well. People are suggesting something like a wrapper program, but a wrapper can still be used maliciously. It's far easier to secure a "maintenance mode" of the OS that must be rebooted into, and doesn't bring up the network; such a mode necessitates (remote virtual) console access to actually do what you want, rather than allowing you to simply trigger off a destructive EFI op over SSH.

This can still be automated; your automation just needs to be able to speak to the remote console. And tools like grub-install can still work; they just need one program on the live-image side and one on the recovery-mode side, where the live OS's grub-install just records your desired changes, sets the recovery-mode flag, and reboots; and where the recovery-mode grub-install agent reads the file, actually performs the op, unsets the flag, and reboots back.
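The record-then-reboot flow described above can be sketched roughly as follows. This is a minimal illustration, not how grub-install works today: the file names `pending-efi-ops.json` and `boot-into-recovery`, both function names, and the shape of an "operation" are all hypothetical, and the actual reboots and EFI writes are left out (the recovery side just hands each operation to a callback).

```python
import json
import tempfile
from pathlib import Path

# Hypothetical file names; nothing here corresponds to a real grub-install feature.
REQUEST_FILE = "pending-efi-ops.json"
FLAG_FILE = "boot-into-recovery"


def request_efi_change(state_dir: Path, operations: list) -> None:
    """Live-OS side: record the desired changes and set the recovery flag."""
    (state_dir / REQUEST_FILE).write_text(json.dumps(operations))
    (state_dir / FLAG_FILE).touch()
    # A real tool would trigger the reboot into recovery mode here.


def run_recovery_agent(state_dir: Path, apply_op) -> int:
    """Recovery side: apply recorded operations, then clear the request and flag.

    Returns the number of operations applied (0 if no flag was set).
    """
    flag = state_dir / FLAG_FILE
    request = state_dir / REQUEST_FILE
    if not flag.exists():
        return 0
    operations = json.loads(request.read_text())
    for op in operations:
        apply_op(op)  # the only place that would ever touch EFI variables
    request.unlink()
    flag.unlink()
    # A real agent would reboot back into the live OS here.
    return len(operations)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        state = Path(tmp)
        request_efi_change(state, [{"set": "BootOrder", "value": "0001,0000"}])
        applied = []
        count = run_recovery_agent(state, applied.append)
        print(count, applied)
```

The point of the split is that the live side only ever writes an ordinary file, so a stray recursive delete there can at worst lose a pending request.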

Well, it's still a solution to a problem which shouldn't exist in the first place. Before UEFI, x86 boxes were hard to brick unless you really knew what you were doing.

Tell that to the bloke who did it with the echo command in 2001:

* http://narkive.com/yG8yWfLt.1

I found the one that described removing the "backup battery" under the wristrest. Once I removed that, and replaced it... everything came back together, and the laptop booted (with generic factory settings).

Difficulty in getting to the battery aside, he just did a regular CMOS reset, the standard technique for getting otherwise unusable systems back to a good state.

Sorry, I forgot about ACPI :)

Now that you posted this, I think I recall one friend telling me that hibernation killed his laptop. But this was over 10 years ago and I only know about one such incident.

OTOH, what UEFI gave us is basically a portable and convenient API to brick any machine from any OS.

Dangerous tasks should have the most safety interlocks, but not require manual, needy attention that makes it harder to automate deployments. This edge-case functionality may still be useful for self-destructing / remote bricking sensitive embedded devices.

    efidestructivecmd opts... --really-brick-myself-and-catch-fire  # fire optional

If you call SetFirmwareEnvironmentVariable, then you reasonably expect firmware memory to be updated. You do not expect any firmware to be modified by "rm -rf /". So the efivarfs filesystem interface is the problem. It violates the principle of least astonishment. In Unix we say that "everything is a file" but it is false; efi variables definitely aren't files.

>fact that a malicious actor can compromise your hardware via software like that is incredible.

Why is it incredible? It's no news that you can flash various things if you've got root.

Fun anecdote, a while back I was installing Windows onto a box with Debian. When I needed to get GRUB set up again, efibootmgr was throwing an inscrutable error when installing the boot loader, but I had no issues manually booting it from GRUB on a USB.

Ended up being the case that the EFI pstore was filled (half full?) with Linux crash dumps from before I ironed out some OC stability issues. Had to manually mount it and then delete files named "dump-type0-" from the BIOS NVRAM to resolve the issue, which was pretty fun.

Something along these lines: https://bugzilla.redhat.com/show_bug.cgi?id=947142
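For anyone hitting the same thing, here is a cautious sketch of that cleanup. It assumes only what the anecdote says, namely that the stale entries have names starting with "dump-type0-"; the directory is passed in by the caller rather than hardcoded, the function names are made up for illustration, and nothing is deleted unless `dry_run` is explicitly turned off.

```python
import tempfile
from pathlib import Path


def find_crash_dumps(directory: Path, prefix: str = "dump-type0-") -> list:
    """Return entries whose names start with the crash-dump prefix, sorted by name."""
    return sorted(p for p in directory.iterdir() if p.name.startswith(prefix))


def remove_crash_dumps(directory: Path, dry_run: bool = True) -> list:
    """Report crash-dump entries; delete them only when dry_run is False."""
    dumps = find_crash_dumps(directory)
    if not dry_run:
        for path in dumps:
            path.unlink()
    return [p.name for p in dumps]


if __name__ == "__main__":
    # Demonstration against a throwaway directory, not a real firmware mount.
    with tempfile.TemporaryDirectory() as tmp:
        fake_mount = Path(tmp)
        (fake_mount / "dump-type0-1-1-example").write_bytes(b"oops")
        (fake_mount / "BootOrder-example").write_bytes(b"keep")
        print(remove_crash_dumps(fake_mount))                 # dry run, only lists
        print(remove_crash_dumps(fake_mount, dry_run=False))  # actually deletes
        print(sorted(p.name for p in fake_mount.iterdir()))
```

Matching on an explicit prefix, rather than clearing the directory wholesale, is the whole idea: everything else in there stays untouched.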

forgot account name

More precisely, rm'ing files as root in /sys can brick certain machines with broken UEFI implementations.

Here are some interesting items to think about.

Here's OpenRC also mounting this filesystem read-write, since 2013:

* https://github.com/OpenRC/openrc/blob/e52b5f59c22283b22e2b5a...

* https://github.com/OpenRC/openrc/commit/02a7d3573d551c5d169e...

Here's a Debian bug from a year and a half later, asking for systemd to do the same on Debian Linux:

* https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=773533

And here's Finnbarr P. Murphy in 2012 explaining the whitelisting that the old efivars system imposed upon variable access, stating that this system "should be retired", and questioning why these checks are not performed in applications-mode code rather than in kernel-mode code. I suspect that a lot of people can now answer that question, with hindsight. (-:

* http://blog.fpmurphy.com/2012/12/efivars-and-efivarfs.html

One thing of note is that OpenRC is done via shell script, so going in and adding a ro option should be straight forward.

With systemd it is done in the C code of their init binary, thus you have to work around it by a remount on fstab.

dupe: the matching systemd bug (https://github.com/systemd/systemd/issues/2402) has been extensively discussed yesterday https://news.ycombinator.com/item?id=10999335

Wasn't the fear of such bloat developing on Linux one of the main reasons why so many/few people were against systemd?

Many regular (and somewhat tech-minded) individuals may not fully understand the issue, but this issue was a problem that crept up due to pure coincidence that somebody tried to rm -rf something, otherwise that bug would linger like how OpenSSL bugs did.

I also really wish the pro-systemd crowd would stop attacking anybody that does not agree with their views. Linux was and always will be about community, and if you alienate the rest of the users, freedom means they will (and probably should) move on, even if 90+% of Linux-variants now use systemd.

This issue doesn't really have anything to do with the scope of systemd and its sidecar utilities (unless you're concerned that it mounts /sys/firmware/efi/efivars/). It's just a debate about what the default mount options should be.

This really doesn't have anything to do with systemd specifically. Any init system on current linux on x86(-64) will have to deal with the efivars pseudo-filesystem (provided by the kernel), because certain utilities (mainly grub-install) need to write to it. It might be wrong to mount it read-write by default (as systemd does), but other init systems (OpenRC, e.g.) have also mounted it read-write.

I am still left with the same thought as yesterday's article, why was the decision made to mount efi as a filesystem in the first place? It's obviously dangerous to mount it when there are unintentional interactions that can have extreme effects. A library approach such as how FreeBSD does is much safer.

tl;dr: This can permanently brick some/many/all? UEFI devices. Three cases confirmed so far on Lenovo/MSI/unknown hardware.

Linux distributions other than Arch are likely to be affected as well because systemd is hardcoded to mount the EFI variables pseudo-filesystem with RW access on every boot.

Matthew Garrett proposed a solution to this problem, namely to disable writes to all non-standard UEFI variables until we can sort out the bugs: https://twitter.com/mjg59/status/694004077923938304?s=09
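To make the "only standard variables are writable" idea concrete, here is a small sketch of such a check. It is not the actual proposal or any kernel code: the function name is hypothetical, and the allow-list is a small illustrative subset of the variables the UEFI spec defines under its global-variable GUID, not a complete or authoritative list.

```python
import re

# GUID the UEFI specification assigns to its globally defined variables.
EFI_GLOBAL_VARIABLE_GUID = "8be4df61-93ca-11d2-aa0d-00e098032b8c"

# Illustrative subset only; a real list would follow the spec's full table.
ALLOWED_NAMES = {"BootOrder", "BootNext", "Timeout", "OsIndications"}

# Numbered boot entries: Boot0000 through BootFFFF.
BOOT_ENTRY = re.compile(r"Boot[0-9A-F]{4}")


def is_write_allowed(name: str, guid: str) -> bool:
    """Allow writes only to known spec-defined variables; refuse vendor ones."""
    if guid.lower() != EFI_GLOBAL_VARIABLE_GUID:
        return False
    return name in ALLOWED_NAMES or BOOT_ENTRY.fullmatch(name) is not None


if __name__ == "__main__":
    print(is_write_allowed("BootOrder", EFI_GLOBAL_VARIABLE_GUID))
    print(is_write_allowed("VendorSetting", "01234567-89ab-cdef-0123-456789abcdef"))
```

A default-deny list like this fails closed: an unknown vendor variable is simply never written or deleted, whatever the firmware would have done with it.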

From the comments on that forum, seems like a lot has changed since http://linuxbsdos.com/2012/04/29/what-will-rm-rf-actually-do... was published.

What I got from the thread: efivars are NVRAM registers for UEFI and are mounted on /sys/firmware/efi/efivars and doing rm -rf clears/deletes them. And they are required for booting.

Is this correct ?

Yes and no, UEFI will look at those vars when determining how to boot the computer but clearing them shouldn't brick the computer.

Basically the UEFI spec is crap to begin with and there are implementations that make egregious mistakes that don't matter if the guest OS is Windows but go horribly sideways if something atypical comes up.

For instance, a while back there was a case of Linux bricking a bunch of laptops because while efivars had space available, trying to use some of this "free" space would leave you with a paper weight.
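As background for the question above about what actually lives under /sys/firmware/efi/efivars: efivarfs exposes each variable as a file named Name-GUID, and the file content is a 4-byte little-endian attribute word followed by the variable's data. The sketch below only parses names and bytes handed to it and never touches the filesystem; the helper names are made up for illustration.

```python
import struct

# Attribute bits from the UEFI specification.
EFI_VARIABLE_NON_VOLATILE = 0x1
EFI_VARIABLE_BOOTSERVICE_ACCESS = 0x2
EFI_VARIABLE_RUNTIME_ACCESS = 0x4


def split_efivarfs_name(filename: str) -> tuple:
    """Split 'Name-GUID' into (name, guid); the GUID is the last 36 characters."""
    if len(filename) < 38 or filename[-37] != "-":
        raise ValueError(f"not an efivarfs-style name: {filename!r}")
    return filename[:-37], filename[-36:]


def parse_efivarfs_payload(raw: bytes) -> tuple:
    """Split file content into (attributes, data)."""
    if len(raw) < 4:
        raise ValueError("payload shorter than the 4-byte attribute header")
    (attributes,) = struct.unpack_from("<I", raw, 0)
    return attributes, raw[4:]


if __name__ == "__main__":
    name, guid = split_efivarfs_name(
        "BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c"
    )
    attrs, data = parse_efivarfs_payload(b"\x07\x00\x00\x00\x01\x00\x00\x00")
    print(name, guid, bool(attrs & EFI_VARIABLE_NON_VOLATILE), data)
```

Seen this way, unlinking one of these files is a request to the firmware to delete the variable, which is why a recursive delete reaches further than it looks.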



Systemd decides of its own accord (that is, the distro cannot tell systemd not to do this) to mount a dangerous filesystem read-write by default.

"Solutions" provided by systemd developers:

  "Well, there are tools that actually want to write it. We also expose
  /dev/sda accessible for root, even though it can be used to hose your system."
We make sda writeable, yes. But it's much more difficult & unlikely one would write a script that opens and overwrites random block files to destroy hard drive data than it is one could unintentionally unlink random files, in this case resulting in the destruction of hardware.


  "I don't see that particular behaviour as much of a problem. The problem
  is that buggy systems can be bricked; it could just as easily happen
  because of, say, a bug in gummiboot or refind."
So since other software can also brick the hardware, systemd's behavior does not need to be fixed. Got it.


  "So all fixes mentioned here can only protect from accidental deletion -
  not malicious intent."
So because someone could intentionally brick some hardware by being malicious, it's pointless to prevent someone from accidentally bricking their hardware. Got it.


  "As long as distribution that are aimed at consumers remount it ro and 
  on updating kernels wrap grub with remount this is a complete non-issue."

  "If anyone needs protection from idiocy, mount it as ro in /etc/fstab."
So by default, every distribution in the world - and a bootloader - needs a workaround for your software's dangerous behavior. Got it.


  "To make this very clear: we actually write to the EFI fs in systemd.
  Specifically, when you issue "systemctl reboot --firmware" we'll set the
  appropriate EFI variable, to ask for booting into the EFI firmware setup.
  And because we need it writable we'll mount it writable for that."
One of the commenters (devs?) mentioned that systemd could mount it read-write, apply this change, and mount it read-only again, which would work around the danger we've been talking about. But from this final comment it seems like you (poettering) basically don't care about the problem.

Got it.

A lot of people are missing the point here. Of course it's not systemd's fault here, but it's completely absurd for them to completely dismiss a fix just because they're not at fault. efibootmgr requires rw access, but it is a root process, so it could mount, do whatever it needs, and unmount again. It's a much cleaner solution than making users edit their fstab to make it automount as ro after install. That's just an ugly hack.
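The "open it up, do the one thing, close it again" approach suggested above could look something like this sketch. It uses a remount between rw and ro rather than a full mount/unmount, assumes the usual /sys/firmware/efi/efivars mountpoint, and the helper names are hypothetical; the command runner is injectable so the logic can be exercised without root.

```python
import subprocess
from contextlib import contextmanager

EFIVARS_MOUNTPOINT = "/sys/firmware/efi/efivars"


def remount_command(mode: str, mountpoint: str = EFIVARS_MOUNTPOINT) -> list:
    """Build the mount(8) invocation that flips the mountpoint to 'ro' or 'rw'."""
    if mode not in ("ro", "rw"):
        raise ValueError(f"mode must be 'ro' or 'rw', got {mode!r}")
    return ["mount", "-o", f"remount,{mode}", mountpoint]


@contextmanager
def writable_efivars(run=subprocess.check_call, mountpoint=EFIVARS_MOUNTPOINT):
    """Remount read-write for the duration of the block, then back to read-only."""
    run(remount_command("rw", mountpoint))
    try:
        yield mountpoint
    finally:
        # Runs even if the wrapped operation raises.
        run(remount_command("ro", mountpoint))


# Intended use, as root (not executed here):
#
#     with writable_efivars():
#         subprocess.check_call(["efibootmgr", "--bootnext", "0001"])
```

This narrows the window for accidents but does not close it: any other root process can also write while the block is running, so it protects against a stray rm, not against malice.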

> some desktop motherboard allow to reinstall a corrupted firmware by putting a special named file on a usb key

This sounds fun and dangerous.

Probably not in this case, because affected systems don't give even a slightest sign of life.

"Eventually he ended up sending the machine to MSI for repair, which will be covered by warranty." explains most of it - broken hardware should be fixed by the manufacturer.

These sorts of problems wouldn't exist if *nix didn't think everything is a file...

True. It also wouldn't exist if people didn't rm -rf /.

There are good things that come from treating everything as a file.

True, and I generally like how I can poke around /sys with just ls and cat, but perhaps somewhat sensitive boot configurations should live behind an ioctl interface (or some other syscall) instead of being shoehorned into the filesystem.

ioctls are definitely the wrong interface (they require an open file, for one). To be honest, there are very good reasons why the "everything is a file" model is very useful. You can use it to do many more cool things with shell scripts (such as changing the fan speed, or backlight brightness or any other dodgy cowboy stuff). And at the end of the day, why is the "everything is a file" model bad?

> And at the end of the day, why is the "everything is a file" model bad?

Because not everything is a file? And therefore forcing people to treat everything as a "file", which they have certain natural expectations for, leads to problems such as the one here when the objects cannot fulfill the users' expectations?

> You can use it to do many more cool things with shell scripts (such as [...] dodgy cowboy stuff)

I feel this is fairly obvious... if your argument for the everything-is-a-file model is a love for dodgy cowboy hackery, then is it really that surprising that you're sacrificing something (in this case, usability/sensibility/etc.) in the process? I mean, yeah, those who feel like cowboys might find your system intuitive, but do you not see how it might not be very usable by (or useful to) other people?

> > And at the end of the day, why is the "everything is a file" model bad?

> Because not everything is a file? And therefore forcing people to treat everything as a "file", which they have certain natural expectations for, leads to problems such as the one here when the objects cannot fulfill the users' expectations?

"Everything is a file" doesn't mean that everything is associated with a block on disk. Filesystems are a very good (and intuitive) way of describing hierarchies, and benefit from requiring literally only one syscall interface that works with all of your tools and programming languages without needing to update the stdlib. How would you propose to represent hierarchies using syscalls? Would you have a "set_uefi_variable" syscall? How would that not become unwieldy? ioctls wouldn't work (not just because they're ioctls but also because you'd need to open a file, and devices aren't files either -- because "everything is a file" is bad, right?). You could try doing it all with kdbus (or something), and that might even be somewhat plausible. Until you realize there's no way of doing anything that doesn't require breaking out a C compiler. Shell scripts couldn't do simple things like change the dimness of your backlight.

> > You can use it to do many more cool things with shell scripts (such as [...] dodgy cowboy stuff)

> I feel this is fairly obvious... if your argument for the everything-is-a-file model is a love for dodgy cowboy hackery, then is it really that surprising that you're sacrificing something (in this case, usability/sensibility/etc.) in the process?

It also doesn't require 500 pages of API docs each time you want to change your backlight with a shell script. The whole point of Unix is to solve problems quickly, not to have to break out your C compiler each time you want to do something less trivial than renaming a file.

> I mean, yeah, those who feel like cowboys might find your system intuitive, but do you not see how it might not be very usable by (or useful to) other people?

Filesystems are intuitive to almost all users. When teaching Unix to my friends, I start by saying "everything is a file" and move on from there. Why? Because it's actually a useful abstraction. Filesystems are already incredibly intuitive. Not to mention that nobody was actually complaining about how intuitive it is to have efivarfs, it's a non-issue.

My take on this is simple: is it a bug: no. Should we add more protections/add quirk handling: yes, assuming we don't like people having a bad day over this.

As a BIOS and OpenRC user I always chuckle at this kind of news.

It's very funny until they drop BIOS compatibility mode from new machines.

As long as they still let you turn off Secure Boot, I don't see why you can't just put a BIOS CSM in your EFI system partition and load it before proceeding with business as usual.

As a human being, I don't chuckle at other people's pain, nor do I brag about it.

> As a human being, I don't chuckle at other people's pain, nor do I brag about it.

I guess Germans aren't human beings? [0] :/

[0] https://en.wikipedia.org/wiki/Schadenfreude

To the downvoters:

mwfunk is making the incredibly exclusionary statement that human beings should not (and that he does not) take pleasure in the pain and/or misfortune of others.

We can infer from this things like:

* mwfunk does not chuckle when a series of unfortunate and highly improbable events results in someone suffering very minor injury.

* mwfunk enjoys almost no comedy, because the vast majority of humor revolves around the retelling of stories where at least one of the participants has been harmed in some way, however minor.

* mwfunk does not feel any form of pleasure when a Bad Guy has gotten his comeuppance and is now being punished for the wrongs he inflicted on others.

It seems rather unlikely that these three points (and the hundreds more like them that could be inferred) all apply to mwfunk. The more likely explanation is that mwfunk is speaking from a rather high horse, and hopes that we groundlings can't hear when he chuckles at a rather good joke in a stand-up skit.

You don't like the movie Dumb and Dumber?

As (presumably) an ACPI user, you should rather be sympathizing.
