Red Hat and CentOS systems aren’t booting due to BootHole patches (arstechnica.com)
113 points by thg 49 days ago | 60 comments

As usual, there are prophetic warnings from Linus:


BIOS should be simple, because it is buggy anyway. Handing over to a bootloader in the MBR is all that a BIOS should do. Now one is at the mercy of NVRAM, grub2 and loads of gratuitous complexity.

EFI really was a step in completely the wrong direction. Massive complexity & feature set for no good reason.

Legacy BIOS had to go; it was designed in the era of the 8086 and DOS and was really out of step with modern HW and OS needs. The replacement could have been even simpler though, now that OSes really want to take over everything themselves and no longer lean on the BIOS the way that DOS used to. Instead EFI created a monster and now we're stuck with it.

XSFI, the Extremely Simple Firmware Interface (currently in its second draft) is a good solution. https://www.alm.website/misc/specs/xsfi

It could be implemented in UEFI, and bundled with disks as a compatibility layer, for machines that support UEFI but not XSFI.

Few will remember but Intel came up with something simpler: Simple Firmware Interface.

Wikipedia only mentions the mobile platform, but it was also what the card-based Xeon Phi was using (own research).

BIOS is a good idea. It should have been expanded. For example, I should be able to download an Nvidia driver, install it into the BIOS, and then talk to it via a standard interface (something like Vulkan). Then every operating system could just use that interface and a well-tested driver from the manufacturer, rather than asking the manufacturer to write that driver for every operating system.

> install it into BIOS and then talk to it via standard interface

Isn't that how it's worked since forever? The graphics card already comes with a driver in a flash ROM chip, the BIOS installs it during boot, and programs can talk to it through the standard INT 10h interface. AFAIK, modern graphics cards also already come with an EFI driver in the same flash ROM chip, which the BIOS installs during boot when in EFI mode, and EFI programs and operating systems can talk to it through standard EFI interfaces.

"Via standard interface": I don't think you can gloss over the differences between, say, DirectX and Metal that easily. Maybe you're arguing they should be gotten rid of, but I think that's rather pie in the sky.

That’s a nice idea, but the history of graphics APIs and drivers suggests otherwise. The path between the hardware and the application has only gotten shorter and more direct, not less.

I hope RISC-V will get booting right.



Early discussions (cannot find a trace online, from my mail archives):

From: ron minnich
Subject: Re: [sw-dev] SBI extension proposal v2
Date: Sat, 10 Nov 2018 08:46:07 -0800

At Google and other places, we've been struggling now for years with overly complex firmware that is implemented incorrectly, enabling exploits and other bad things. The list of things vendors get wrong in firmware, both enabling exploits and enabling others to enable exploits, is long and it continues to this day. There is an unbelievable amount of money out there all involving firmware exploits, very little of it involving nice people.

I'm currently working on deleting all use of the x86 version of M mode, i.e. SMM. There are many proposals out there for deleting SMM from the architecture. I've also shown at a talk in 2017 how we could redirect SMM interrupts back into the kernel. We're also removing all use of callbacks into UEFI on x86. We're almost there.

Which is why I'm a bit unhappy to see this (to me) cancerous growth in proposals for M-mode code. PPP in firmware? Really? multiple serial devices? really? We've been here before, in the 1970s, with something called the BIOS. If you're not familiar with it, go take a look, or you can take my word for it that these proposals implement that idea. We spent over 20 years freeing ourselves from it on x86. Why go back to a 50 year old model on a CPU designed to be in use for 50 years?

My early understanding of M mode was that it was an Alpha PALCode like thing, enabling access to resources that were behind a privilege wall. I did not like it that much, but I was OK: it was very limited in function, and the kernel could replace it, or at least measure it. I also accept that every cpu vendor uses m mode like things (e.g. ARM TF) for reasonable purposes and also (let's be honest here) for dealing with chipset mistakes. But that does not mean you need to recreate BIOS.

The SBI should be hard to add to, deliberately. It should be used only when there are no possible alternatives. It needs to be open source and held in common. It should be possible for a kernel to replace or at least measure it. And, further, there needs to be some work done on why you add to it, and why you don't, with bias against adding to it. This proposal works against those ideals, as it explicitly enables vendor-specific forks of the SBI. Sure, this can happen, but why make it so easy?

see https://github.com/riscv/riscv-sbi-doc/pull/12 for other thoughts.

Also, I've had discussions with some security folks in our firmware community about the fact that the PMP can be used in a way that the kernel can not measure the SBI, since SBI might read-protect itself. This is a real step backwards, FYI. Not sure if it can be changed at this point.

ron

p.s. For interleaving debug and console output firmware, use the oldest trick in the book: ASCII is 7 bits. Since console out is 8 bits, reserve 128 values for console out, and 128 for debug stream, and if the debug stream needs 8 bit for some words, you know what to do. It's very easy and doesn't require that we add multiple UART support to SBI.
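
For anyone who hasn't seen the trick in the p.s., here is a minimal Python sketch of it, assuming the high bit is the one used to mark debug bytes. The names `DEBUG_FLAG`, `tag_console`, `tag_debug` and `demux` are made up for illustration and are not part of SBI or any firmware. The escape scheme for 8-bit debug data that the mail alludes to is not implemented here.

```python
from typing import Tuple

DEBUG_FLAG = 0x80  # assumption: the high bit marks a byte as belonging to the debug stream


def tag_console(data: bytes) -> bytes:
    """Console bytes go out unchanged, but must fit in 7 bits."""
    if any(b & DEBUG_FLAG for b in data):
        raise ValueError("console stream must be 7-bit ASCII")
    return bytes(data)


def tag_debug(data: bytes) -> bytes:
    """Debug bytes get the high bit set so the receiver can tell them apart."""
    if any(b & DEBUG_FLAG for b in data):
        raise ValueError("8-bit debug data would need an escape scheme (not shown)")
    return bytes(b | DEBUG_FLAG for b in data)


def demux(wire: bytes) -> Tuple[bytes, bytes]:
    """Split one interleaved byte stream back into (console, debug)."""
    console = bytes(b for b in wire if not b & DEBUG_FLAG)
    debug = bytes(b & 0x7F for b in wire if b & DEBUG_FLAG)
    return console, debug


# Both streams share a single UART; the receiver separates them again.
wire = tag_console(b"boot") + tag_debug(b"dbg") + tag_console(b" ok")
console, debug = demux(wire)
```

The point of the design is that the sender needs no framing or second UART: one bit per byte carries the stream identity.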

Not at all like systemd then. I had hoped by now we'd be using something that took the arguably good qualities (simple unit files) and none of the complexity (why does it implement DNS?) and we could have the better subset.

It will continue to bite until enough people petition for its replacement.

Don't get me wrong, some of the features are good, but the kitchen sink is not.

systemd does not implement DNS. There's a helper program that can resolve DNS queries (systemd-resolved), but it is optional.

systemd is largely architected the way you'd hope it to be. systemd itself, the core program, is responsible only for maintaining the lifecycle of other programs. The other functionality is provided by satellite daemons, which you can choose to use (or not use!) at will.

Now, distro maintainers may choose to use the whole kit-and-caboodle, but that is a deliberate decision on their part. It's not forced by the systemd authors, and you are free to override your distro maintainer's choice if you so desire.

> systemd does not implement DNS. There's a helper program that can resolve DNS queries (systemd-resolved), but it is optional.

So it implements DNS.

...and I must add, implements it badly [0]

> Now, distro maintainers may choose to use the whole kit-and-caboodle, but that is a deliberate decision on their part. It's not forced by the systemd authors

Not forced, instead they "gently push" [1]

0. https://news.ycombinator.com/item?id=8595335

1. https://lists.freedesktop.org/archives/systemd-devel/2010-Se...

This is part of the mess that Oxide Computer wants to clean up, right? Are they going to use EFI at all? Anyone from Oxide around?

I’m not saying Linus isn’t extremely competent, much more so than I and probably anyone reading this will ever be, but in general terms I think anyone arguing against complexity in software design, while in a way “doing the Lord’s work” even in my own opinion, will never be wrong: in the cases where their prophecies never come true, the opinion will still be considered by most to be “correct in theory”, and if something does break down the line, the opinion can be referred to as prophetic.

So, while I agree that we should keep things simple and modular, it’s a thankless job trying to solve issues and being forced to add complexity. Nobody defends complexity, and maybe we shouldn’t, in order to stay on our toes, but defending simplicity is also about the safest thing you can do.

How is EFI responsible for a security bug in an open source bootloader that's not even EFI-specific?

It's a lot simpler on embedded devices. U-Boot for example is extremely simple and much more robust than GRUB or LILO.

Supposedly the biggest attack vector was the BIOS, and they wanted to secure that with UEFI; it has been nothing but headaches with my dual-boot installs.

I really do miss the simplicity of GRUB 0.97; I'm keeping it as long as I can on my older boxes.

Grub simple?

If you want simple use systemd-boot. It's a hell of a lot leaner than grub ever was. The config is sane and doesn't require a billion additional modules to be installed.

It should be the default at this point, especially after all of this fiasco.

Next thing I will hear "Linux simple? Systemd-kernel is a lot leaner"

I think we have different definitions of lean and simple. After about a decade of Linux use, I just found out GRUB has modules, thanks to you. You point it at a disk and it installs. You edit the config file and that works. That's my experience with it. I have had many debates with people who like systemd-isms; it's a fundamental difference in use case and philosophy.

systemd-boot was not originally designed around the systemd philosophy. It started out as gummiboot and was later adopted into the systemd project.

The explicit design goal was to be a minimal alternative to grub.

"New software will eventually contain bugs", and then bringing it up 14 years later, seems to bring nothing to the table.

Hence the “prophetic” bit.

Ubuntu had this as well, but got a release out quickly so it seems to have not been too huge an issue.

Bug report: https://bugs.launchpad.net/cloud-init/+bug/1877491

Description/remediation: https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/GRUB2Secu...

> Currently, we populate the debconf database variable grub-pc/install_devices by checking to see if a device is present in a hardcoded list [1] of directories:

Is there a reason why this patch was silently applied? For something as risky as breaking the boot process, you'd think you want user confirmation before proceeding. It can obviously be done, eg. https://d11a6trkgmumsb.cloudfront.net/original/3X/f/5/f55e36.... Also, recovering from this might be easy if you're technically inclined, but it could be worse if you have FDE enabled with the boot keys sealed in TPM. Changing the boot loader or the secure boot settings in that case might lead to the TPM refusing to release the disk encryption keys, which could lead to permanent loss of data.
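
To illustrate why a changed boot loader can leave a TPM-sealed key unrecoverable, here is a rough Python sketch of the "extend" operation behind measured boot. The extend formula (new PCR = hash of old PCR concatenated with the measurement digest) is how TPM PCRs work; everything else is a simplifying assumption: a single PCR, SHA-256 only, and hypothetical helpers `measure_boot` and `unseal` standing in for what the firmware and the TPM do internally.

```python
import hashlib
import hmac
from typing import Iterable, Optional


def extend(pcr: bytes, blob: bytes) -> bytes:
    """TPM-style extend: new PCR = SHA-256(old PCR || SHA-256(blob))."""
    return hashlib.sha256(pcr + hashlib.sha256(blob).digest()).digest()


def measure_boot(components: Iterable[bytes]) -> bytes:
    """Hypothetical helper: fold every boot component into one PCR, starting from zeros."""
    pcr = bytes(32)
    for blob in components:
        pcr = extend(pcr, blob)
    return pcr


def unseal(sealed_to: bytes, current_pcr: bytes, key: bytes) -> Optional[bytes]:
    """Hypothetical stand-in for TPM unsealing: release the key only if the PCR matches."""
    return key if hmac.compare_digest(sealed_to, current_pcr) else None


# The key was sealed while the old shim was installed (component names are placeholders).
sealed_to = measure_boot([b"old shim", b"grub", b"kernel"])
after_update = measure_boot([b"new shim", b"grub", b"kernel"])
key = unseal(sealed_to, after_update, b"disk-key")  # the PCR changed, so no key is released
```

Because each extend hashes in the previous value, changing any one component changes the final PCR, which is why full-disk-encryption setups usually keep a separate recovery key.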

Number of users saved from attack by secure boot: 0

Number of systems bricked by fixes to secure boot vulnerabilities: Many

Sad to see TSA in software form.

We have a centos 8 server that has been bricked, and there doesn't seem to be any solution right now. Downgrading the packages shows "lowest version already installed, cannot downgrade it.", and I can't manually install the old packages because there doesn't seem to be any way of getting them. Someone says to use http://mirror.centos.org/centos-8/8.2.2004/BaseOS/x86_64/os/... but that seems to have the very latest packages and even after updating the affected ones it still fails to boot.

> Someone says to use http://mirror.centos.org/centos-8/8.2.2004/BaseOS/x86_64/os/... but that seems to have the very latest packages

At least for me, that page seems to have both shim-x64-15-13.el8.x86_64.rpm from 2020-07-29 22:10 and the older shim-x64-15-11.el8.x86_64.rpm from 2020-05-07 19:53; the older one should work. Worst case, you could manually copy the shim executables from a working server to the EFI partition of the broken server (from what I have read at https://bugzilla.redhat.com/show_bug.cgi?id=1861977, in the RHEL/CentOS case it's the shim executable which is broken, so you don't have to do anything to the grub executables).

Yes, I see I needed to look for the 81 version numbers. I have now installed those packages (and confirmed I have the older grubx64.efi in /boot/efi/EFI/). However it is still giving the same error on reboot.

I installed these rpms (which are the affected ones for my system):

grub2-common-2.02-81.el8.noarch.rpm
grub2-efi-x64-2.02-81.el8.x86_64.rpm
grub2-pc-2.02-81.el8.x86_64.rpm
grub2-pc-modules-2.02-81.el8.noarch.rpm
grub2-tools-2.02-81.el8.x86_64.rpm
grub2-tools-efi-2.02-81.el8.x86_64.rpm
grub2-tools-extra-2.02-81.el8.x86_64.rpm
grub2-tools-minimal-2.02-81.el8.x86_64.rpm

Is anything else needed? I'm thinking the easiest solution is for me just to wait a few days and do a "yum update" in rescue mode once an update fix is available. Luckily this is a non-critical server.

It just links to the redhat info which doesn't work on centos 8.2 (error message is that those packages are already at lowest versions).

I see both the old and new packages on mirrors.kernel.org:


Hey all, just wanted to make you aware of the mortar project here: https://github.com/noahbliss/mortar

It takes a comprehensive approach rather than piecemeal like a lot of these patches, leveraging technology already in your system to build a conceptually airtight and fully audited system. Happy to get some of your opinions on it, constructive criticism, and pull requests!

I still don't understand why this even needed to be fixed. Finding a way to circumvent UEFI DRM seems to be a good thing.

Secure boot can usually be turned off if it's unwanted. I agree it's not necessary on every device, but if it is enabled, the OS should assume the user wants security at that level of the chain. It certainly shouldn't just circumvent it as a matter of course.

> Secure boot can *usually* be turned off if it's unwanted.

Emphasis mine. The main objection to secure boot is that, some time in the future, it will be mandatory; that already is the case for Windows devices with ARM CPUs (https://www.softwarefreedom.org/blog/2012/jan/12/microsoft-c...).


Windows Client (laptops/tablets/...) devices with 64-bit Arm CPUs have Secure Boot unlockable. That article applied to Windows RT and Windows Phone, which were earlier projects.

My objection to Secure Boot is right in the name. What's it designed to be secure from? From me, the user.

But it's not, it's meant to be secure from rootkits!

A lot of secure boot implementations let you add your own keys. Some don't, and that's bad, but it's not the fault of secure boot!

Number of rootkits Secure Boot has saved me from: zero

Number of times Secure Boot has locked out a legitimate user: too many

The only reason everyone used the Fedora key was because the alternative was registering with Microsoft, paying $99, and hoping for approval. Microsoft are as much a gatekeeper in this as they've always been, and the whole framing of the news around this feels like an attempt to discredit those who would go around the gatekeeper: https://mjg59.dreamwidth.org/17542.html

Where are you even buying computers that make secure boot mandatory? I know they exist but I’ve never run into them. Did you consider just getting a different model?

Please don't victim-blame :)

There's no guarantee they'll be unlockable in later projects.

I'm kind of out of the loop - is BootHole a UEFI or GRUB2 vulnerability?

BootHole is a GRUB2 vulnerability that could allow a local attacker to bypass Secure Boot protections. The patches for RHEL/CentOS lead to failures in booting due to a crash in the UEFI loader for GRUB2.

The most secure boot is the one that does not take place ;) So maybe there is a method in this madness?

It seems that it's a really critical issue. How did this patch pass Red Hat's testing?

Bugs get through testing quite frequently, and this one was probably somewhat fast-tracked because it's a security fix; it happens.

FWIW, I tested upgrading a few (test) CentOS virtual machines at work to see if I can trigger this bug, but they worked fine, so perhaps the bug only triggers with a configuration they happen to have not tested.

> I tested upgrading a few (test) CentOS virtual machines at work

Were the virtual machines using EFI? Most virtual machines I've seen boot through legacy BIOS, not EFI.

They boot using BIOS, yes. I suppose some of our EFI-booting hardware hosts might be affected, but all of our virtual machines boot using BIOS since it happens to be the default and there's not much point in switching them to EFI.
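
For anyone wanting to check which way a given Linux machine booted: the kernel only exposes /sys/firmware/efi when it was started through EFI. A small Python sketch follows; the function name `booted_via_efi` and its `sysfs_root` parameter are made up here, the latter only so the check can be pointed at a test directory.

```python
from pathlib import Path


def booted_via_efi(sysfs_root: str = "/sys") -> bool:
    """Return True if <sysfs_root>/firmware/efi exists, i.e. the kernel was booted via EFI."""
    return (Path(sysfs_root) / "firmware" / "efi").is_dir()


if __name__ == "__main__":
    print("EFI boot" if booted_via_efi() else "legacy BIOS boot (or not Linux)")
```

On a BIOS-booted VM like the ones described above this reports a legacy boot, which would explain why the shim problem never showed up there.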

Fixes now available for RHEL, still waiting for CentOS.

Is Secure Boot on Linux actually secure?

I remember reading it was like a signed loader and that's it. But I presume that's incorrect?

"Is it secure?" is an incomplete question -- it has to be "Is it secure against...?". And then the answers will depend on whatever threat model winds up in the dots.

That said, all secure boot even tries to assure is that the software that's booting is the same thing you thought you installed. If that is, say, a Linux distribution running a webapp which has problems, well... the boot mechanism can't save you from those.

Ah yes. The question I should have asked is: Is it more secure than not using it, and in what way?

Like for example, the Linux kernel isn't signed, right?

The code for the kernel is signed through git, but you or your distro maintainer need to then compile it. The resulting binary could be signed, but it wouldn't be by Linus/The Linux Foundation. Given that there are a million options specific to your use-case and hardware, you can't really rely on reproducible builds in the general case.

Secure Boot is one step in a chain of verification you'd need to do to make sure you're only running the binaries you've approved:

> Secure Boot is a technology where the system firmware checks that the system boot loader is signed with a cryptographic key authorized by a database contained in the firmware. With adequate signature verification in the next-stage boot loader(s), kernel, and, potentially, user space, it is possible to prevent the execution of unsigned code.
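
The chain in that quote can be sketched in a few lines of Python. This is a deliberately simplified model, not how any firmware implements it: real Secure Boot verifies signatures against keys, whereas this sketch swaps that for a plain SHA-256 allowlist so it runs with the standard library alone. `verify_chain` and the stage layout are hypothetical.

```python
import hashlib
from typing import List, Set, Tuple


def digest(image: bytes) -> str:
    """SHA-256 of a boot image, as a hex string."""
    return hashlib.sha256(image).hexdigest()


def verify_chain(firmware_db: Set[str], stages: List[Tuple[bytes, Set[str]]]) -> bool:
    """Each stage is (image, digests it will accept for the next stage).

    The firmware checks the first image against its own database; every later
    image is checked by the stage before it. Any mismatch stops the boot.
    """
    trusted = firmware_db
    for image, accepts_next in stages:
        if digest(image) not in trusted:
            return False
        trusted = accepts_next
    return True


# Placeholder images standing in for a shim -> boot loader -> kernel chain.
shim, grub, kernel = b"shim", b"grub", b"kernel"
stages = [(shim, {digest(grub)}), (grub, {digest(kernel)}), (kernel, set())]
ok = verify_chain({digest(shim)}, stages)
```

The firmware itself only ever checks the first link; everything after that depends on each stage doing its own verification, which is why a flaw in one boot loader undermines the rest of the chain.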


> I remember reading it was like a signed loader and that's it.

That is literally what secure boot is, nothing more:


That's the great thing about open source, though. You guys can fix it yourself. Me, I'm stuck on Windows 10.

There are probably more people capable of fixing Windows 10 at Microsoft than there are people capable of fixing bootloaders in Linux in the world. Just because something is open source doesn't mean anyone can just jump in there and change the oil.

Exactly my point.
