
Linux 4.5-rc5: efivarfs fixed to avoid “rm -rf /” bricking UEFI - the_why_of_y
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0389075ecfb6231818de9b0225d3a5a21a661171
======
nkurz
For context, this is in reference to a bug that was discussed a couple weeks
ago:
[https://news.ycombinator.com/item?id=10999335](https://news.ycombinator.com/item?id=10999335)

    
    
      Systemd mounted efivarfs read-write, allowing motherboard bricking via 'rm' 
    

Essentially, systemd defaulted to a configuration where the computer's
motherboard could be permanently destroyed by removing a 'file' from the
command line. The bug reporter argued that this was unduly dangerous, but the
systemd developers thought that systemd was working as intended.

Here's a reasonably impartial discussion on a FreeBSD list that gives an
overview:
[https://forums.freebsd.org/threads/54951/](https://forums.freebsd.org/threads/54951/)

And from that thread, here's a link to Matthew Garrett (the creator of
efivarfs) saying that efivarfs is at fault here rather than systemd:
[https://twitter.com/mjg59/status/693494314941288448](https://twitter.com/mjg59/status/693494314941288448)

~~~
kbenson
> but the developer's thought that it was working as intended

Really? Is that evidenced by Lennart's response to this, which stated "The
ability to hose a system is certainly reason enought to make sure it's well
protected and only writable to root."[1]? I think it implies the _opposite_.

1:
[https://github.com/systemd/systemd/issues/2402](https://github.com/systemd/systemd/issues/2402)

~~~
datenwolf
It happens rarely, but actually I totally agree with Lennart on this one.
Maybe not for the very same ultimate reasons, but nevertheless, I agree.

Being able to brick hardware through a very commonly used action (unlinking a
filesystem entry) throws us back into the times where one could damage display
devices beyond repair by feeding them scan frequencies outside their
operational range or by destroying hard disks by smashing the heads into a
parking position outside of the mechanical range. We left those days behind us
some 20 years ago: Devices got smart enough to detect potentially dangerous
inputs and execute failsafe behaviour. It's just reasonable to expect this
from system firmware.

When talking about (U)EFI variables we're not talking about firmware updates,
which are kind of a special action (and even for firmware updates it's
unacceptable that a corrupted update bricks a system¹). Manipulating (U)EFI
variables is considered a perfectly normal day-to-day operation and the OS
should not have to care about sanity checks and validity at boot time. (U)EFI
is the owner and interpreter of these variables, so it is absolutely
reasonable to expect the firmware to have safeguards and failsafe values in
place.

IMHO (U)EFI is a big mess, a bloated mishap of a system bootstrap loader. And I'm
totally against trying to work around all the br0kenness in higher levels. The
more often systems brick due to the very fundamentals of (U)EFI being so
misguided, the sooner we'll move on to something that's not overengineered.

\----

¹: Just to make the point: When we developed the bootstrap loader for our
swept laser product we implemented several safeguards to make it unbrickable.
It's perfectly fine to cut the power or reset the device in the middle of a
firmware upgrade. It safely recovers from that. Heck, the firmware in flash
memory could become damaged by cosmic radiation, the bootloader would detect
it and reinstall it from a backup copy in secondary storage.
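
A minimal Python sketch of that detect-and-restore idea follows. It is not the product's actual bootloader: the CRC32 check, the `select_image` helper and the in-memory images are assumptions chosen purely for illustration.

```python
import zlib
from typing import Tuple


def crc_ok(image: bytes, expected_crc: int) -> bool:
    """Return True if the image's CRC32 matches the stored checksum."""
    return zlib.crc32(image) == expected_crc


def select_image(primary: bytes, backup: bytes, expected_crc: int) -> Tuple[str, bytes]:
    """Pick a bootable image, falling back to the backup copy.

    Hypothetical helper: a real bootloader would read both copies from
    flash and secondary storage instead of taking them as arguments.
    """
    if crc_ok(primary, expected_crc):
        return "primary", primary
    if crc_ok(backup, expected_crc):
        # Primary is damaged (interrupted update, bit flip, ...):
        # the caller should rewrite it from the backup before booting.
        return "backup", backup
    raise RuntimeError("no valid firmware image; stay in recovery mode")


if __name__ == "__main__":
    good = b"firmware v1.2"
    crc = zlib.crc32(good)
    damaged = b"firmware v1.3"  # simulate a corrupted primary copy
    source, _ = select_image(damaged, good, crc)
    print("booting from", source)
```

The point of the design is that the decision is made before anything is executed, so an interrupted update only ever costs a restore, never the device.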

~~~
rjzzleep
I completely disagree. I looked at the patch above. I personally don't like it.
Mounting efivars rw is akin to mounting boot rw by default.

A sane Linux distro will mount it ro and switch to rw whenever needed.
Defaulting to rw efivars is, excuse the language, stupid.

I've done a fair share of efi debugging even removing some of the variables
that the kernel will now protect you from breaking.

If the issue is that users should be able to remount efivars as rw whenever
needed then that should be addressed, not prevent you from doing stuff to it
because there is a rogue init system doing crazy stuff.

EDIT: BTW, I don't think systemd does anything besides write to the various
Boot* variables, but I may be wrong. I don't see why that can't be addressed
with a remount. If you replace the boot.efi you still have to remount the efi
partition anyway.

While Matthew may be right that there is an issue that needs to be addressed,
in one of his tweets he basically says the kernel should fix it because
tooling isn't and BIOSes suck. Well, maybe tooling should be forced to fix it.

or from the issue:

    
    
        Matthew-Jemielity commented 24 days ago
        What needs efivars mounted at all anyway? So far I've seen:
    
        grub
        systemctl --firmware-setup reboot
        efibootmgr
        Since those likely need superuser, couldn't they handle (un)mounting it themselves?
        
        @annejan
        annejan commented 23 days ago
        As long as distribution that are aimed at consumers remount it ro and on updating kernels wrap grub with remount this is a complete non-issue.
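
The remount-on-demand approach suggested in those comments could look roughly like this. It is a hedged sketch: `/sys/firmware/efi/efivars` is where efivarfs is conventionally mounted, while the helper names and the sample mount table are made up for illustration. The code only inspects a mount table and prints the commands a wrapper would run; it never calls `mount` itself.

```python
EFIVARS = "/sys/firmware/efi/efivars"  # conventional efivarfs mount point


def efivarfs_mode(mount_table: str, mount_point: str = EFIVARS):
    """Return 'rw', 'ro', or None if efivarfs is not mounted there.

    mount_table uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _, where, fstype, options = fields[:4]
        if where == mount_point and fstype == "efivarfs":
            return "rw" if "rw" in options.split(",") else "ro"
    return None


def wrap_with_remount(command, mount_point: str = EFIVARS):
    """Hypothetical wrapper: the command sequence for a tool that needs
    write access only for the duration of one operation."""
    return [
        ["mount", "-o", "remount,rw", mount_point],
        list(command),
        ["mount", "-o", "remount,ro", mount_point],
    ]


if __name__ == "__main__":
    sample = "efivarfs /sys/firmware/efi/efivars efivarfs ro,nosuid,nodev,noexec,relatime 0 0\n"
    print("current mode:", efivarfs_mode(sample))
    for step in wrap_with_remount(["efibootmgr", "-v"]):
        print(" ".join(step))
```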

~~~
datenwolf
> I personally don't like it. Mounting efivars rw is akin to mounting boot rw by
> default.

No, it's not the same. Mounting `/boot` rw by default does not put your system
in the danger of getting damaged beyond repair. If you hose the boot partition
you can always start a recovery system (live Linux or similar) to repair the
damage.

But if deleting efivars renders a system inoperable on a firmware level you're
essentially SOL, save for rewriting the contents of the system firmware flash
using an external programmer and a clean image. That is an absolutely
unacceptable situation. The year is 2016 and hosing a firmware by writing
malformed values into the firmware API is, simply put, a software
vulnerability that allows one to permanently DoS a system. As such this is a
security issue that must be fixed where the security issue happens. And in
the case of efivars the issue is that certain input is not properly validated
and/or sanitized. If a system firmware cannot properly start with certain
variables being unset or removed or set to invalid values, it should be an
implementation requirement to validate input on such variables before
executing the change.
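
The kind of check meant here can be sketched in a few lines of Python. This is illustrative only: the `VariableStore` class, its `required` set and the example variable names are assumptions, not anything taken from the UEFI specification or from a real firmware.

```python
class VariableStore:
    """Toy model of a firmware variable store that validates changes
    before committing them (hypothetical, for illustration only)."""

    def __init__(self, variables, required):
        self._vars = dict(variables)
        self._required = set(required)  # variables the firmware cannot boot without

    def delete(self, name):
        if name not in self._vars:
            raise KeyError(name)
        if name in self._required:
            # Reject the request instead of leaving the system unbootable.
            raise PermissionError(f"{name} is required for boot; refusing to delete")
        del self._vars[name]

    def set(self, name, value):
        if name in self._required and not value:
            raise ValueError(f"{name} must not be empty")
        self._vars[name] = value

    def names(self):
        return sorted(self._vars)


if __name__ == "__main__":
    store = VariableStore({"VendorSetup": b"\x01", "Scratch": b"tmp"},
                          required={"VendorSetup"})
    store.delete("Scratch")          # harmless variable: allowed
    try:
        store.delete("VendorSetup")  # critical variable: rejected
    except PermissionError as err:
        print(err)
    print(store.names())
```

An alternative with the same effect would be to accept the deletion and fall back to a built-in default at the next boot; either way the validation sits with the owner of the variables.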

> Defaulting to rw efivars is, excuse the language, stupid.

It probably is. But it's not the responsibility of the OS to sanitize values
that are not intended for being used by the OS. efivars are intended to be
used by (U)EFI and hence it's the (U)EFI implementation's task to properly
sanitize access to them.

Essentially we're talking Bobby Tables here, just with a different API.

~~~
capitalsigma
> (U)EFI implementation's task to properly sanitize access to them.

If the implementation is bad (which it is), that doesn't excuse reckless
userland software that refuses to acknowledge flaws in the firmware.

~~~
EmanueleAina
There's a layer between userland and firmware where those things should be
handled, and indeed everyone agreed that it should be handled in the kernel.

Saying that Lennart is wrong has become a rather popular sport, let's not go
overboard to say it even when he's right by all accounts.

------
useerup
This is a case of a _leaky abstraction_
([https://en.wikipedia.org/wiki/Leaky_abstraction](https://en.wikipedia.org/wiki/Leaky_abstraction)).
The "everything is a file" philosophy is the real problem here.

Specifically, the way a mental model of a hierarchy is broken by mounting a
higher-order resource (UEFI variables) as a subordinate of a file system that
is itself a subordinate of the OS.

UEFI vars are just hardware resources. Mapping them as a file system object is
just unnatural and, yes, stupid.

Trying to use a permission model ("only root can do it") overlooks the real
problem: The user does not _expect_ higher order objects to be mapped as
subordinates of the file system.

When you delete from the file system, you expect objects to be deleted from
the _disk_ \- not UEFI variables to be altered or deleted! And because the
user does not expect such behavior, there's a good chance she/he will override
warnings and go ahead with the operation expecting only file system objects to
be affected.

This is "everything is a file" taken a bridge too far.

~~~
mjhoy
Where does Linux promise that files are bits on a disk? As a user I certainly
don't expect that. Perhaps you have a problem with the name "file" but the
abstraction itself still seems useful. (And yet I do find it quite odd when I
have to do something like `echo "TPAD" > /proc/acpi/wakeup` to disable wake-
on-trackpad.) That said I don't disagree with you that UEFI variables should
not be delete-able, but there are many files on Linux that you can't delete.

~~~
useerup
It is a _file_ system, consisting of file system objects, like files and
directories.

Already exposing processes as files is an abstraction. It somewhat works
because you can imagine the file representation being maintained by the
process. But it is an abstraction, because a process _is not_ a file.

But what is more important: A file system is a _hierarchy_. At the root is the
most fundamental object. Each level has _subordinate_ objects. That is the model
you expect.

Having UEFI variables mounted as a file is a surprising loop back to something
even more fundamental than the OS itself: The firmware of the physical
computer. It is a breach of the mental model.

It breaks one of the most fundamental principles that should be followed in
man-machine interaction: The principle of least surprise.

I have a machine. I have installed an operating system on it. The OS manages
several disks. On the disks the OS manages file systems. I expect the files of
that system to be managed by the OS.

I _do not_ expect that regular file system actions have effect _outside the
hierarchy_ of the directory on which I perform the actions. Specifically I _do
not_ expect files on that system to manage the physical computer.

~~~
deathanatos
> _Already exposing processes as files is an abstraction._

No. The _file system_ , is the abstraction. Adding /proc onto it is a use of
that abstraction.

There's two basic extreme positions, and you're adopting one, your parent is
adopting something closer to the other.

a. The filesystem only exposes filesystems actually on disk, mapped to some
hierarchy. As you say, " _On the disks the OS manages file systems._ "

b. The filesystem is (roughly) a hierarchical container of named binary blobs
(called "files") with some defined associated metadata, such as permissions.

While you can adopt (a), and that's fine, some of us (myself included) see a
lot of value in (b). The biggest problem with only exposing "real" file-
storing FSes in the file hierarchy is that it leaves you with a ton of
questions about how to expose all the other things. Taking the stance that
we're only going to expose "real" files in the file hierarchy leaves us with
several classes of objects that aren't files-on-disk, and you need to name
them s.t. the user can interact with them. It is certainly possible to expose
each different type of thing in a completely separate namespace. You'll
probably also need to be able to associate permissions with those objects¹,
so now you've got a named, ACL'd list or hierarchy of objects, and it's
starting to look a lot like a filesystem. You now also need another set of
tooling to work with each of these classes of objects. You need another set of
syscalls for each of these objects.

The great thing about having a unified file hierarchy in the (b) abstraction
is that tooling works on all of these different classes of objects.
It's really just the "CRUD" idiom, and normally it allows things to
interoperate quite smoothly. I can write a bash script that draws a progress
bar of my battery, and it requires no knowledge other than where in the file
hierarchy the battery is.
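
For illustration, here is the same idea in Python rather than bash. It is a sketch under assumptions: on many Linux laptops the charge percentage is exposed at `/sys/class/power_supply/BAT0/capacity`, but the battery name varies by machine, and `render_bar` is just a helper invented for this example.

```python
from pathlib import Path

# Assumed location; the battery may be BAT1 or named differently on your machine.
BATTERY_CAPACITY = Path("/sys/class/power_supply/BAT0/capacity")


def render_bar(percent: int, width: int = 20) -> str:
    """Draw a text progress bar for a 0-100 percentage."""
    percent = max(0, min(100, percent))
    filled = percent * width // 100
    return "[" + "#" * filled + "." * (width - filled) + f"] {percent}%"


def battery_bar(path: Path = BATTERY_CAPACITY) -> str:
    """Read the percentage with plain file I/O and render it."""
    return render_bar(int(path.read_text().strip()))


if __name__ == "__main__":
    if BATTERY_CAPACITY.exists():
        print(battery_bar())
    else:
        # No battery file here (desktop, VM, other OS): show a demo value.
        print(render_bar(94))
```

Nothing in the script is battery-specific beyond the path, which is exactly the appeal of position (b).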

This is, of course, a case where the power is somewhat biting us. That doesn't
make the abstraction wrong, nor does it mean the abstraction isn't leaky. (In
fact, in this case, the abstraction works really well, I'd say. Any other
implementation of UEFI variables is going to have a "delete" call, AFAICT.
What bit us here is that all the objects are in one bucket together, and thus
rm -rf / removes more than just files.)

> _It breaks one of the most fundamental principles that should be followed in
> man-machine interaction: The principle of least surprise._

While I agree, that doesn't mean we need to throw out all the power of having
a unified file system, but it might beget some way of ensuring the user
understands what `rm -rf /` actually does. There's certainly more than one way
to solve this, some of which don't involve limiting what can be done with the
FS. (As some examples: perhaps rm shouldn't recurse to a different FS, and
objects of similar types are on different FSs, which prevents the very error
that got us here; perhaps some files force "user acknowledgement" of their
removal; perhaps it really does get mounted read-only.)
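
The first of those options, not recursing into a different filesystem, can be sketched as follows (GNU `rm` already offers something similar as `--one-file-system`). The sketch assumes a POSIX system where `st_dev` differs between mounted filesystems; `removable_paths` is a hypothetical helper that only lists what would be deleted and never deletes anything.

```python
import os


def removable_paths(root):
    """Yield (action, path) pairs for entries under root.

    Directories that live on another device (i.e. mount points, such as
    a separately mounted efivarfs) are reported as skipped and are not
    descended into.
    """
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        keep = []
        for name in dirnames:
            full = os.path.join(dirpath, name)
            if os.lstat(full).st_dev == root_dev:
                keep.append(name)
            else:
                yield ("skip-other-fs", full)
        dirnames[:] = keep  # prune: os.walk will not enter removed entries
        for name in filenames:
            yield ("remove", os.path.join(dirpath, name))


if __name__ == "__main__":
    for action, path in removable_paths("."):
        print(action, path)
```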

¹While you might be able to get away with "only root accesses UEFI vars" in
the scenario that they're not in the file hierarchy, if you remove all non-
real-files then you've got a lot of other things to deal with: unix sockets,
block devices, terminals, all the various I/O ports, temp sensors, battery
data… the list is extensive.

~~~
useerup
> No. The file system, is the abstraction. Adding /proc onto it is a use of
> that abstraction.

Agree that the file system is an abstraction. Makes us think in terms of
directories (containers) and files (items). Everything in the file system is
designed around the idea of files and directories. Permissions (rwx),
operations (create, move, copy, append, delete).

However, already adding /proc challenges that. What does it mean to have
"execute" right to a process? It is already running? What does it mean to
append to a process? to move it? If processes are "files", why can I not kill
the process by deleting the file? Processes are not naturally files. Yes, it
makes some sense if you think of /proc as status information being
maintained for each process, i.e. they are extracts, owned by the OS.

But UEFI vars make absolutely no sense. It is a true leaky abstraction. If
one needs to be able to write to UEFI vars, then create an API for it, possibly
some utilities. That way I need not risk altering fundamental firmware
settings by performing seemingly ordinary file system operations whose effect
I expect to be limited to the hierarchy!

> The great thing about having a unified file hierarchy in the (b) abstraction
> is that tooling works on all of these different classes of objects
> different. It's really just the "CRUD" idiom, and normally it allows things
> to interoperate quite smoothly. I can write a bash script that draws a
> progress bar of my battery, and it requires no knowledge other than where in
> the file hierarchy the battery is.

But it is actually just sweeping complexity under the rug. I need documentation
for _what_ the file contains on each "line" \- what it means to write to it,
etc. It is not discoverable at all. If you expose system resources as actual
resources and do not try to map them onto files, you can actually make a
discoverable system. An example of such a regime is CIM. On Windows,
PowerShell (or Python or VBScript or ...) can be used to interact with such
fundamental system resources. To use your example of a progress bar of the
battery, here is the entire process on Windows, from _discovering_ the
correct resource (the battery) to displaying a progress bar, all
_without_ consulting documentation:

    
    
    			PS C:\> #there's probably some class for batteries. let's look for it by name
    			PS C:\> get-cimclass *battery*
    
    			   NameSpace: ROOT/cimv2
    
    			CimClassName                        CimClassMethods      CimClassProperties                                                                                                                                                                      
    			------------                        ---------------      ------------------                                                                                                                                                                      
    			CIM_Battery                         {SetPowerState, R... {Caption, Description, InstallDate, Name...}                                                                                                                                            
    			Win32_Battery                       {SetPowerState, R... {Caption, Description, InstallDate, Name...}                                                                                                                                            
    			Win32_PortableBattery               {SetPowerState, R... {Caption, Description, InstallDate, Name...}                                                                                                                                            
    			CIM_AssociatedBattery               {}                   {Antecedent, Dependent}                                                                                                                                                                 
    
    			PS C:\> # the Win32_Battery probably offers the most specific information
    
    			PS C:\> Get-CimInstance Win32_Battery
    
    			Caption                     : Internal Battery
    			Description                 : Internal Battery
    			Name                        : DELL 1C75X31
    			Status                      : OK
    			Availability                : 2
    			CreationClassName           : Win32_Battery
    			DeviceID                    : 647Samsung SDIDELL 1C75X31
    			PowerManagementCapabilities : {1}
    			PowerManagementSupported    : False
    			SystemCreationClassName     : Win32_ComputerSystem
    			...
    			BatteryStatus               : 2
    			Chemistry                   : 6
    			DesignCapacity              : 
    			DesignVoltage               : 12992
    			EstimatedChargeRemaining    : 94
    			EstimatedRunTime            : 71582788
    			ExpectedLife                : 
    			MaxRechargeTime             : 
    			...
    			ExpectedBatteryLife         : 
    
    			PS C:\> # yep - that's it. lets save this instance in a variable
    			PS C:\> $bat = Get-CimInstance Win32_Battery
    
    			PS C:\> # display a progress bar; re-query the battery every 10 secs so the value stays current
    			PS C:\> for(){ $bat = Get-CimInstance Win32_Battery; Write-Progress Battery -PercentComplete $bat.EstimatedChargeRemaining -Status "Charge remaining"; sleep 10 }
    
    

> This is, of course, a case where the power is somewhat biting us.

No, what is biting us is a leaky abstraction that surprises us: We can
accidentally delete firmware variables because file system operations are not
constrained to the directories/files they operate on.

> That doesn't make the abstraction wrong

It is an abuse of the abstraction.

> Any other implementation of UEFI variables is going to have a "delete" call,
> AFAICT.

Indeed. In PowerShell you can discover the commands for manipulating them with
`gcm *UEFI*`

> What bit us here is that all the objects are in one bucket together, and
> thus rm -rf / removes more than just files.

No, what bit us is the broken expectation (a surprise) that a higher-level
resource was mapped below some file system directory.

~~~
stinos
Off-topic, but imo HN needs more PS-promoting posts like this. I have the
impression there are still tons of people out there who're stuck in the
'windows has no proper command line so administration means ugly batch files
and registry hacks' mindset. Examples like this should open their eyes. Sure
you can't pipe text around (and go through hoops to parse it correctly) like in
bash so it takes some getting used to depending on your background, but once
you get the hang of it you realize it really is _power_ shell.

------
babarock
A lot of people are busy arguing who to blame, I just think it's interesting
that sometimes you need to support non-standard software. Actually, I think
it's interesting that highly successful programs are not the ones who go "not
my fault, you should just implement the spec better".

Raymond Chen talked[1] about the importance of supporting programs that ran on
Win95 but broke on WinXP, even if they weren't complying with Microsoft specs.

I also remember reading that web browsers had to go to great lengths to render
completely non-compliant web pages.

In your experience, when should you decide to support "non-complying"
behavior?

[1]: Unfortunately I cannot find the original article by Chen but I could find
extensive mentions of it in this article by Joel Spolsky:
[http://www.joelonsoftware.com/articles/APIWar.html](http://www.joelonsoftware.com/articles/APIWar.html)

~~~
fpgaminer
This may be the Raymond Chen article you are referring to:
[https://blogs.msdn.microsoft.com/oldnewthing/20031224-00/?p=...](https://blogs.msdn.microsoft.com/oldnewthing/20031224-00/?p=41363)

------
Mic92
Instead of flaming systemd developers for mounting efivars read/write, the
kernel is the right place to fix the problem for everybody!

~~~
ajross
No, the firmware is the right place to fix the problem. A BIOS that bricks
itself because of within-specification deletion of variables via a standard
API is just plain broken.

But in the real world no one ever fixes firmware bugs, so this is the best we
can do.

~~~
spoiler
> But in the real world no one ever fixes firmware bugs, so this is the best
> we can do.

PREFACE: This is an anecdote, but I do believe it reflects the general state of
hardware vendors, because when I Google'd, it showed that people had similar,
if not worse problems than I did.

And this is so incredibly sad. Especially when you buy a $2.5k laptop which
only works with Windows (with quirks).

I bought a laptop^[model] on which you couldn't even install another OS
because of a crippling firmware bug. It wasn't until a shit storm on their
forums that they released a firmware update which fixed the issue (which was
that the SATA controller was stuck in RAID mode, and you couldn't change it to
AHCI), which prevented any OS from being installed (even Windows, which _was_
installed already, which is bizarre) because no OS could recognise the PCIe
NVMe M.2 SSDs.

After the update was released, I did happily install Linux on it, but the ACPI
DSDT was so broken, I didn't know where to begin with fixing it (apart from
this whole hardware stuff being outside of my domain). Other than that jack
detection is jack shit (pun intended). I literally can't use my headphones
without special OEM or Realtek software (forgot which) on Windows, and I can't
use them at all on Linux because there's no equivalent. I tried playing with
various modes^[modes] and output configurations, but to no avail.

Also, on Windows I hear a subtle scratchy sound from somewhere in my laptop,
but I don't hear it on Linux. I noticed it the most while moving my USB
mouse or when there's a lot of CPU intensive work. No, all the solutions
recommended online didn't work, and this is apparently an issue with Windows
on Asus/Realtek for _years_ , if not decades.

Furthermore, there's a bizarre flicker which subtly intensifies and then
subtly goes away on Windows (and it interestingly happens only in some
applications which appear to use GPU acceleration) which doesn't happen on
Linux (even during an intensive OpenGL benchmark followed by a WebGL
benchmark).

The things I thought I'd have most issues with (the GPU and the Skylake
processor) turned out to be the _least_ of my problems. Actually, 0 problems
with them. So, kudos to NVIDIA for their proprietary Linux drivers (the
nouveau ones worked great, too, but I decided to go for the proprietary ones
due to the slight performance benefit).

So, no, this isn't a Linux issue to anyone who wants to scream "boohoo linux is
bad for consumer PCs". This is all an issue of shitty hardware vendors.
There's probably over a hundred models documented on the Archlinux
Wiki[archwiki] with all their various quirks and what not. Most of those are
actually hardware problems, and there's no way for Linux to fix all these
problems without there being some giant database with each laptop model and
its quirks and applying configuration fixes, and this would also have to be
distro-agnostic or cover various distros to work properly. The only reason why
most of it _kinda_ (not flawlessly) works on Windows is because the various
vendors actually cooperate with the Windows developers (I imagine), and it's
rare that I see them even trying to cooperate with Linux developers; maybe I
just missed it, but each time someone _does_ cooperate, it's met with this
grand praise that's quite hard to miss, so I doubt I missed it (this excludes
certain vendors who have always cooperated with Linux devs, or who
specifically write drivers for linux in the first place).

It's so, so solemnly sad that people blame most of this, if not all, on Linux.
Especially considering Linux does its best to try and patch this endless
stream of oncoming shitty hardware and nobody (not literally nobody, but a
very small percentage) sees or recognises that effort.

\----------

[model]: ASUS ROG G752, for anyone wondering

[modes]: [https://www.kernel.org/doc/Documentation/sound/alsa/HD-Audio...](https://www.kernel.org/doc/Documentation/sound/alsa/HD-Audio-Models.txt)

[archwiki]:
[https://wiki.archlinux.org/index.php/Category:Laptops](https://wiki.archlinux.org/index.php/Category:Laptops)

~~~
voltagex_
The DSDT problems aren't just for the high end laptops either - I have a $350
Z3735F system that requires a DSDT edit which I don't understand [1]
[https://gist.github.com/voltagex/eec041092d719c77483e](https://gist.github.com/voltagex/eec041092d719c77483e)

systemd can't take all the blame either - I bricked (yes really bricked) one
of these by grub installing a "stub" that only booted into grub-rescue on my
EFI partition. I can't get into the firmware settings and the rescue loader
can't read the partition tables -> bricked unless I can corrupt the EEPROM
somehow and force a menu (no CMOS battery in these low-end devices to pull)

~~~
ajross
FWIW: that DSDT hack is replacing a constant return value from the first
method in the "HAD" device. It was specified to return 0, but now returns 0xf
instead. This is almost certainly the _STA (status) method, which informs the
OS about the operational status of the described device. I forget the exact
meaning of the bottom four bits offhand, but 0xf is the standard value for
"device present and operating normally -- use it!".

That it was returning zero would cause the linux ACPI framework to ignore it
and not probe its driver. My vague understanding is that windows works
differently, and calling _STA is done by the driver, so it's possible to just
not do it and still have a working system.
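
For reference, the low bits of an `_STA` return value, as defined in the ACPI specification, can be decoded like this. It is a small sketch; the `decode_sta` helper is made up for the example, and how a particular OS reacts to each bit is not modelled.

```python
# Bit meanings of the _STA return value per the ACPI specification.
STA_BITS = {
    0: "present",
    1: "enabled and decoding its resources",
    2: "should be shown in the UI",
    3: "functioning properly",
    4: "battery present",
}


def decode_sta(value: int):
    """Return the descriptions of the _STA bits that are set."""
    return [desc for bit, desc in STA_BITS.items() if value & (1 << bit)]


if __name__ == "__main__":
    for value in (0x0, 0xF):
        print(hex(value), decode_sta(value) or ["no bits set"])
```

So the patched value 0xf sets all four of the low bits, while the original 0 reports a device that is not even present.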

I don't know what the device itself is, but given that the script says "audio"
in there it's probably the audio codec.

~~~
voltagex_
PendoPad 11" 'laptop' running Windows 10 Home 32 bit (despite having a 64 bit
capable processor).

I replaced the bricked device and I'm going to be a lot more careful this
time.

Booting Ubuntu Wily works, but there's no battery (status/charging?), wifi,
audio or touchscreen. So if you use the XDA scale it's working perfectly!

I have another Z3735 device (MeegoPad T01 - Intel Compute Stick knockoff), but
it's unusable because the clock runs fast, then slow, then fast - enough that
an NTP sync makes the clock go backwards and then everything breaks.

These chipsets are turning up everywhere and most of the time the
implementation is garbage. I hope Intel did better with the reference
implementation/s but I can't afford them at the moment.

~~~
yuhong
The idea of multiple vendors making the same "beige boxes" and only competing
on price is bad. One of the things on my wishlist for an Intel/AMD branded
laptop is a high quality UEFI implementation.

~~~
voltagex_
Keep wishing. Even a $1400 Lenovo X1 had terrible UEFI when I used it - what's
the point of re-implementing your old keyboard-only BIOS UI exactly if you
have mouse support? Dell does better, at least in the business grade products.

~~~
yuhong
I am talking about an Intel/AMD _branded_ laptop.

~~~
voltagex_
I get you now - do they exist outside of CES and the Intel employees toting
around prototypes?

~~~
yuhong
I know it does not currently exist. The point is that it would be a better
idea than what we have now.

------
ktRolster
This commit message shows why I like the Linux Kernel team:

    
    
      >These fixes are somewhat involved to maintain
      >compatibility with existing install methods 
      >and other usage modes, while trying to turn
      >off the 'rm -rf' bricking vector.
    

They go out of their way to make sure changes are backwards compatible.

~~~
5ilv3r
Like unbricking ftdi chips, the kernel team has a great track record of taking
care of users where the vendors fuck them over.

------
RaleyField
It shouldn't be possible to brick UEFI if it was sanely designed. That's a bad
firmware.

~~~
semi-extrinsic
As usual when someone mentions "UEFI" and "sane" in the same sentence, I post
this quote from Matthew Garrett [1] (of Linux EFI maintainer fame):

""" UEFI stands for "Unified Extensible Firmware Interface", where "Firmware"
is an ancient African word meaning "Why do something right when you can do it
so wrong that children will weep and brave adults will cower before you", and
"UEI" is Celtic for "We missed DOS so we burned it into your ROMs". """

[1] [https://lwn.net/Articles/444666/](https://lwn.net/Articles/444666/)

~~~
yuhong
[http://linux.slashdot.org/comments.pl?sid=8693705&cid=514191...](http://linux.slashdot.org/comments.pl?sid=8693705&cid=51419159)

------
gpm
So if I understand this correctly, now instead of bricking the system it will
just fuck up the bootloader, even if the bootloader is completely unrelated to
the linux install you are `rm -rf /sys`ing. Since the useful efivars that set
up bootloaders must be on the whitelist.
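
Roughly, the whitelist approach as I understand it works like the sketch below. The GUID is the standard EFI global variable GUID and efivarfs names its files `Name-GUID`, but the pattern list here is an illustrative assumption and not the kernel's actual table.

```python
from fnmatch import fnmatchcase

EFI_GLOBAL_GUID = "8be4df61-93ca-11d2-aa0d-00e098032b8c"

# Illustrative patterns only; the real list lives in the kernel source.
SAFE_TO_REMOVE = [
    "Boot[0-9A-F][0-9A-F][0-9A-F][0-9A-F]",
    "BootOrder",
    "BootNext",
    "Timeout",
]


def is_deletable(filename: str) -> bool:
    """Return True if an efivarfs entry matches the whitelist.

    Anything else would be treated as protected, so a stray recursive
    delete fails on it instead of removing it.
    """
    # A GUID is 36 characters, preceded by a '-' separator.
    if len(filename) < 38 or filename[-37] != "-":
        return False
    name, guid = filename[:-37], filename[-36:]
    if guid.lower() != EFI_GLOBAL_GUID:
        return False  # vendor-specific variable: protect it
    return any(fnmatchcase(name, pattern) for pattern in SAFE_TO_REMOVE)


if __name__ == "__main__":
    for entry in ("Boot0001-" + EFI_GLOBAL_GUID,
                  "SomeVendorVar-" + EFI_GLOBAL_GUID):
        print(entry, "->", "deletable" if is_deletable(entry) else "protected")
```

Which is the concern above in a nutshell: the entries that stay deletable are the boot entries themselves.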

It's an improvement, but it seems like we should do this _in addition to_
default mounting read only.

~~~
protomyth
It still seems to me that Linux should follow FreeBSD and not mount it as a
filesystem and just use a library to manipulate the values. It clearly has
some huge problems with being a filesystem. This isn't Plan 9 and everything
does not have to be a file.

~~~
floatboth
FreeBSD actually doesn't have any support for EFI variables at all! It just
installs the loader into the default location (bootx64.efi) and the loader
does everything.

~~~
protomyth
I guess I counted the loader as part of the system. It's installed by FreeBSD
to do the job.

------
literally_
And so, when we say " _permanently destroy_ " do we really mean that something
is " _destroyed_ " and so done with " _permanence_ "?

This motherboard... It refuses any sort of reflashing of the firmware? Taking
the button cell out of the battery slot, and removing all power from the board
does nothing? The motherboard won't enter BIOS, upon pressing F10 at power on?

What is this... " _bricking_ " we speak of, here?

~~~
nkurz
Yes, this is one of the rare cases where "brick" is being used in the correct
technical sense of rendering the motherboard permanently unusable without
repairs involving a soldering iron. There is no reset option.

~~~
Laforet
Depending on the package used, it may be possible to reprogram the UEFI chip in
situ with test clips[1]. But of course one will still need to acquire a
reprogrammer and a known-good copy of the firmware.

[1][https://www.digikey.com/product-search/en/test-and-measureme...](https://www.digikey.com/product-search/en/test-and-measurement/test-clips-ic/2294786)

------
chris_wot
Ok, so I was wrong when I said Lennart Poettering's original response was
"pretty appalling".

Unlikely he saw it, but my apologies nonetheless.

~~~
coldpie
Poettering gets shit on a lot. While his software does often have problems, he
is really well-intentioned and working on really hard things that really do
need fixing. New software simply has bugs and problems, and while the
transition period can be rough, I think things will be really improved when we
break through to the other side.

~~~
cyphar
People don't hate systemd because it has bugs. People hate it because it's
clear that it thinks it owns your system and that its manifest destiny is to
contain all software that exists.

~~~
cbd1984
No, they hate it because they think hating it is a good way to troll people.

It isn't honest hatred, it's just dancing around making ugly faces, like a
six-year-old.

~~~
chris_wot
I really like systemd, but I think that's not accurate. People have
well-thought-through reasons for disliking systemd.

~~~
coldpie
However, "it's clear that it thinks it owns your system and that its manifest
destiny is to contain all software that exists" is not one of those well-
thought-through reasons :P

~~~
chris_wot
I don't disagree :-)

It would be hilarious for it to take over LibreOffice or Firefox.

~~~
cyphar
Well, it's already taken over your logging system (storing things in a binary
format, which means you're fucked if it gets corrupted), your bootloader, top,
it provides an alternative container runtime, etc. It's just a matter of time.
:P

------
devit
It seems to me that the real issue is that "rm -rf" should by default not
recurse into mounted filesystems, but should at most try to unmount them.

In addition to clearing EFI variables, the current behavior will also attempt
to clear any mounted removable drives and any mounted network drives, which is
usually even more harmful than messing with EFI.

Of course that would be a backwards incompatible change, although I don't
think many scripts rely on this behavior.
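
The "don't recurse into mounted filesystems" idea can be sketched in Python as
a dry run that only plans the removal. This is an illustration under some
assumptions, not how rm is implemented: `plan_removal` is a made-up name, it
detects mount points by comparing `st_dev` with the starting directory, and it
will therefore miss bind mounts of the same filesystem.

```python
import os
import tempfile
from typing import List, Tuple


def plan_removal(root: str) -> Tuple[List[str], List[str]]:
    """List what a recursive delete would touch, skipping other filesystems."""
    root_dev = os.lstat(root).st_dev
    to_remove: List[str] = []
    skipped: List[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        keep = []
        for d in dirnames:
            path = os.path.join(dirpath, d)
            # A different device number means something is mounted here.
            if os.lstat(path).st_dev == root_dev:
                keep.append(d)
            else:
                skipped.append(path)
        dirnames[:] = keep  # prune so os.walk never descends into mounts
        to_remove.extend(os.path.join(dirpath, n) for n in keep + filenames)
    return to_remove, skipped


if __name__ == "__main__":
    # Demo on a throwaway directory; nothing is actually deleted.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "a", "b"))
        planned, skipped = plan_removal(tmp)
        print("would remove:", planned)
        print("would skip (other filesystem):", skipped)
```

GNU rm already offers this as an opt-in via `--one-file-system`; the proposal
above amounts to making that the default.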

~~~
Etzos
To be fair "rm -rf /" doesn't just work. You have to confirm that you really
do want to delete everything. Destroying / in itself is pretty harmful. If
you're planning to do that you should already know not to have anything you
want to keep mounted.

~~~
protomyth
For the few use cases where a system admin wants to "rm -rf /", there are
hundreds of bad scripts that can screw up a system. I believe Solaris did the
right thing and made it not work.

[https://www.youtube.com/watch?v=l6XQUciI-Sc&t=81m](https://www.youtube.com/watch?v=l6XQUciI-Sc&t=81m)

~~~
Etzos
To be clear, the problem described in the video is not something that can
happen. "rm -rf $1/$2" where $1 and $2 aren't defined (therefore making it "rm
-rf /") will not run. If you really want to destroy your root directory you
have to specify the --no-preserve-root flag. No more accidents from scripts
that assume things poorly, but it will still do exactly what the user asks.
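
A script can also protect itself instead of relying on rm's root check. Here is
a minimal Python sketch of that kind of guard; `safe_target` is a hypothetical
helper, and the rules it enforces (no empty parts, never resolve to the root)
are my assumptions about what a careful script would want.

```python
import os


def safe_target(base: str, name: str) -> str:
    """Join base and name, refusing empty parts or a result that is the root."""
    if not base or not name:
        # This is the "$1/$2 with both unset" case: fail instead of guessing.
        raise ValueError("refusing to build a path from an empty component")
    target = os.path.realpath(os.path.join(base, name))
    if target == os.path.abspath(os.sep):
        raise ValueError("refusing to operate on the root directory")
    return target


if __name__ == "__main__":
    for base, name in [("", ""), (os.sep, "."), (os.sep, "some-dir")]:
        try:
            print(repr((base, name)), "->", safe_target(base, name))
        except ValueError as exc:
            print(repr((base, name)), "-> rejected:", exc)
```

In a shell script, the rough equivalent is to write `"${1:?}/${2:?}"`, which
aborts when either parameter is unset or empty.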

~~~
protomyth
Yet, I can still delete my home directory by accident (e.g. Steam patch). The
idea that any rm can kill the directory I'm in is just bad. A flag on rm is
the wrong solution. It should just fail.

~~~
Etzos
I disagree with this almost completely. If I tell my computer to do something
it should just do it (possibly after some complaining). You cannot delete the
directory you are immediately in, so that at the very least is prevented. But
as you move away from root, being able to delete nearby folders and files
actually becomes useful. And putting those kinds of operations behind a flag is
absolutely a reasonable solution. It keeps you (and scripts) from shooting
yourself in the foot but lets you do things as long as you acknowledge what you
are actually doing.

~~~
protomyth
I said "The idea that any rm can kill the directory I'm in is just bad. A flag
on rm is the wrong solution. It should just fail."

You say "You cannot delete the directory you are immediately in, so that at
the very least is prevented."

I have no idea what the rest of your comment is in relation to what I said
other than I'm pretty sure you can accidentally delete a directory you're in
given what Steam did.

Deleting your current directory is against the POSIX standard. It should not
be allowed.

~~~
Etzos
What I was saying was that "rm -rf ." just won't work. You cannot delete the
directory you are in directly ("." and ".." are not valid options).

If however you delete a directory that is higher up the directory tree (e.g.
the parent directory), it will be deleted.

As far as I can tell this does not violate the POSIX standard[1], as that
situation is left as undefined (since in theory the directory you are deleting
will chain to the directory you are currently in which is open in the tty).
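
The difference between naming the directory as "." and naming it by its full
path can be checked with a short Python sketch. It works on an empty temporary
directory only, and `try_rmdir` and `demo` are names invented for this example.
On Linux I would expect the first call to fail with EINVAL and the second to
succeed; other systems may answer EBUSY for the second, which POSIX permits.

```python
import errno
import os
import tempfile
from typing import Tuple


def try_rmdir(path: str) -> str:
    """Attempt os.rmdir(path) and describe the outcome."""
    try:
        os.rmdir(path)
        return "removed"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


def demo() -> Tuple[str, str]:
    """Try to remove the current directory via '.' and via its full path."""
    original = os.getcwd()
    workdir = tempfile.mkdtemp()
    try:
        os.chdir(workdir)
        via_dot = try_rmdir(".")      # final path component is "."
        via_abs = try_rmdir(workdir)  # same directory, named by full path
    finally:
        os.chdir(original)            # always leave the doomed directory
        if os.path.isdir(workdir):
            os.rmdir(workdir)         # clean up if the system refused
    return via_dot, via_abs


if __name__ == "__main__":
    via_dot, via_abs = demo()
    print("rmdir('.'):      ", via_dot)
    print("rmdir(full path):", via_abs)
```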

Edit: The rest of my previous comment was trying to say that the utility of
being able to self destruct the current directory is arguable. Why should it
be prevented (especially when it could just be hidden behind a flag to prevent
accidental destruction)?

Edit2: D'oh. Forgot the reference:

[1]
[http://pubs.opengroup.org/onlinepubs/9699919799/functions/rm...](http://pubs.opengroup.org/onlinepubs/9699919799/functions/rmdir.html)

~~~
protomyth
"If the current working directory of the process is being removed, that should
be an allowed error."

Ok, I read that wrong a long while back, but "allowed error" is really odd. I
guess I still side on an error is an error and it should not be allowed.

------
smaili
I've never actually seen or had the courage to attempt "rm -rf /". Does anyone
know what would happen if it were run?

~~~
floatboth
You wouldn't be able to run any programs from the hard drive, and when you
reboot, you'll end up with no operating system.

Unless you're running FreeBSD (or Illumos) with ZFS and Boot Environments, in
which case you'd just select a backup boot environment and continue working
:-) Probably without your home directory though, as that is usually excluded
from boot environments. But you can set them up however you want.

But if you're running Linux (before this update) on a laptop with terrible
piece of shit firmware, you'd end up with a brick.

P.S. found a cool post about rm -rf / in my bookmarks:
[https://lambdaops.com/rm-rf-remains/](https://lambdaops.com/rm-rf-remains/) –
you can recover a running rm'd Linux machine by using a running shell and
/dev/tcp :D

~~~
dingo_bat
I wanna try it out but I'm not brave enough to do it directly in my OS. What
would happen if I did it in Ubuntu inside a VM? What if VirtualBox has mounted
some directory on my system to the guest? I'm afraid to try this too :/

I once tried 'format C:' on a Windows 10 laptop I didn't care about and I just
got a boring error message.

------
milkey_mouse
good

------
chm
Well that is a nice follow-up! However I can't remember if I read someone
arguing this was a feature...

------
tvbn
They disabled a kernel feature to prevent a systemd bug? It doesn't make any
sense.

~~~
kbenson
It's not a systemd bug, it was just _exposed_ by systemd. This was what most
everyone seemed to completely miss in the prior exchanges.

Another thing that was missed was that Lennart wasn't being unreasonable, nor
was he saying it wasn't a problem (he specifically stated the opposite, in
fact). I had a feeling at the time (based on his responses) that the reason he
wasn't specifically stating he was going to fix it or open a bug report for it
in systemd was that he was going to push it up-stack to a more appropriate
place, and it looks like that's what happened.

~~~
eeZi
> Another thing that was missed was that Lennart wasn't being unreasonable,
> nor was he saying it wasn't a problem (he specifically stated the opposite,
> in fact). I had a feeling at the time (based on his responses) that the
> reason he wasn't specifically stating he was going to fix it or open a bug
> report for it in systemd was that he was going to push it up-stack to a more
> appropriate place, and it looks like that's what happened.

This. It's really hard to blame this on systemd (not that people didn't try
anyway).

~~~
5ilv3r
My BSD init and SysV init machines do not have this issue. They don't
automount every damn filesystem they see at boot for no reason.

~~~
ambrice
It seems to me like it's a bad thing that an accidental rm in an efivarfs
filesystem can brick the system, regardless of whether the filesystem was
automounted or not.

