Grub2 security update renders system unbootable (redhat.com)
134 points by beefhash 8 days ago | 84 comments

I assume this is related to this from yesterday https://news.ycombinator.com/item?id=23990075 Which is about revoking secure boot keys

There was a story on HN last night from Debian where they laid out this issue, and basically stated "Yes, this security update is going to render some systems unbootable, here is why we're doing it anyway."


Stability is important, especially when it comes to unbootable machines, but I don't quite know what anyone was supposed to do here. If a user has secure boot enabled, the OS has to assume that the user wants/needs security at that level of the chain, and it is therefore responsible for ensuring the chain's integrity. In this case, there was no way to do that without some machines (temporarily) failing to boot.

What would have been a better way to handle this?

There are other problems with the grub configuration and install, as a result of these changes, it seems. I had a few AWS EC2 instances, running an ubuntu-20.04 AMI, not come back after a reboot this morning, due to this brand new issue:


"here is why we're doing it anyway.":

1. Because if you even use SecureBoot then not booting is a better alternative than your security being compromised.

2. Because you can do a rescue boot and update other relevant software.

3. Because if you really want to, you can just disable SecureBoot.

Is there proof that secure boot actually prevented an attack?

Secure boot has, for the most part, prevented the underlying system from being compromised by malware that can run before boot (and therefore hide itself from the OS). I don't think we can really prove a negative here, but it means no more "boot sector" viruses, either on the HDD itself, or on external bootable media. That certainly is a good first step to preventing foot-cannon incidents, and giving the OS a decent chance at preventing malware running before the OS has control of the system.

Couple that, as another post said, with a TPM which only releases the decryption key if the system is in a valid state with secure boot running, and it helps a lot with basic cold boot protection. TPM is far from perfect, but it is arguably better than a user not encrypting their disk at all, and this prevents attacks like replacing utilman.exe with cmd.exe etc.
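To make the "valid state" idea concrete, here is a minimal sketch of how a measured boot chain folds into a single TPM register (PCR) value. The extend rule, new value = hash(old value + hash of component), is the standard one; everything else here is a simplification. Real TPMs have many PCRs and several hash banks, and the component names below are hypothetical stand-ins for real boot binaries.

```python
import hashlib
from typing import Iterable

PCR_SIZE = 32  # SHA-256 bank: a PCR starts out as 32 zero bytes


def extend(pcr: bytes, component: bytes) -> bytes:
    """Return the PCR value after measuring one boot component."""
    measurement = hashlib.sha256(component).digest()
    return hashlib.sha256(pcr + measurement).digest()


def measure_chain(components: Iterable[bytes]) -> bytes:
    """Fold a boot chain into one PCR value, in boot order."""
    pcr = bytes(PCR_SIZE)
    for component in components:
        pcr = extend(pcr, component)
    return pcr


# Hypothetical boot chains; the byte strings stand in for real binaries.
expected = measure_chain([b"shim", b"grub", b"kernel"])
tampered = measure_chain([b"shim", b"evil-grub", b"kernel"])
reordered = measure_chain([b"grub", b"shim", b"kernel"])

print(expected != tampered, expected != reordered)
```

A key sealed against `expected` would not be released for `tampered` or `reordered`, because the register can only be extended, never set directly.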

I don't know, but if you don't think secure boot is valuable and/or worth the trade-off, you should turn it off on your machine. Anyone who does so will not be affected by this update.

I know there are some environments that don't allow turning off secure boot, but this really isn't the OS vendor's fault.

A huge number of devices today (mobile or not) have their disk encryption keys protected by secure boot or similar mechanism where the TPM vends the keys only when securely booted, and they are not breakable trivially. So I'd say the answer is yes.

I will state the same comment as last time.

Can distros maybe consider moving to systemd-boot at some point? Systemd is already built in and can handle things like mounting pretty easily and simply.

It is a hell of a lot leaner than grub, doesn't use a billion superfluous modules. That and it is a lot easier to prevent tampering compared with the cumbersome nonsense that is grub passwords.

Oh and it enables distros to gather accurate boot times and enables booting into UEFI direct from the desktop.

It works with secureboot/shim/Hashtool. Also each distro has its bootloader entries in separate folders to avoid accidental conflicts.

systemd-boot (AKA, gummiboot) is leaner, for sure- much in the same way a bicycle is leaner than a truck.

It might surprise you to know this but systemd-boot does not support, among other things: "BIOS"/MBR boot, EXT4, XFS, mdraid, LUKS (hidden boot), OPAL (aka hardware FDE, self-encrypting drives) or even btrfs!

mdraid being pretty critical to a lot of server work.

Then there's the topic of changing bootloaders for distros which value stability; it wouldn't happen in a point release.

> systemd-boot does not support, among other things: "BIOS"/MBR boot,

That's a feature :). Unless your firmware doesn't have UEFI support (I'm looking at you, Gen8 Proliants...), you should use UEFI.

> EXT4, XFS, mdraid, LUKS (hidden boot),

It doesn't support ANY filesystem (does it defer to builtin UEFI filesystem drivers?); it loads kernel and initramfs from the EFI partition itself. You can then have whatever you want compiled in.

Grub, on the other hand, has read-only drivers, so it will load your kernel and initramfs from any /boot partition you want (except zfs pool with some features enabled).

> OPAL (aka hardware FDE, self-encrypting drives)

Now this could be a problem, however, normal UEFI has to work somehow.

> mdraid being pretty critical to a lot of server work.

Many servers boot from internal sd-card or usb, and if they have local drives, they are put into use when whatever system from the sd-card or usb boots.

> Unless your firmware doesn't have UEFI support (I'm looking at you, Gen8 Proliants...), you should use UEFI.

Why is that? It's a serious question. I use BIOS boot on all my systems after failing to install Linux (Qubes, Arch, Fedora) on several machines until I switched to BIOS boot.

Is there anything I'm missing out on by using BIOS?

It has less magic behavior.

The legacy BIOS way is basically that the BIOS will read the first sector on the first disk, check the magic bytes and, if they match, jump to it (execute it). All this happens in 16-bit real mode. For runtime services (sleep, hibernate, display backlight, etc), BIOS32/ACPI entry points are used. The services are more limited than the UEFI versions, no Secure Boot for example. For preparing/installing such a boot sequence, you need special tools, and you have to hope that it won't clash with anything; eventually filesystems learned not to use the first few megabytes of their partition.
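For the curious, the "magic bytes" check is tiny. A minimal sketch, assuming the classic 512-byte sector with the 0x55 0xAA boot signature in its last two bytes; the function names are made up for illustration, and the sample sectors are synthetic rather than read from a disk.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # bytes 510 and 511 of a bootable first sector


def is_bootable_sector(sector: bytes) -> bool:
    """Mimic the firmware's check: right size and the signature at the end."""
    if len(sector) != SECTOR_SIZE:
        return False
    return sector[510:512] == BOOT_SIGNATURE


def read_first_sector(path: str) -> bytes:
    """Read the first sector of a disk or disk image (needs read access)."""
    with open(path, "rb") as disk:
        return disk.read(SECTOR_SIZE)


# Synthetic examples so the sketch runs without touching a real disk.
blank = bytes(SECTOR_SIZE)
bootable = bytes(510) + BOOT_SIGNATURE

print(is_bootable_sector(blank), is_bootable_sector(bootable))
```

On a real machine you could pass something like a disk image path to `read_first_sector`; note the BIOS does not validate anything beyond this signature before jumping to the code.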

If you have multiple operating systems, they might not play nice with each other and each might fight for the ownership of this magic sector.

For network boot, PXE is used. Basically DHCP request, with reply containing TFTP server and image name. BIOS will then download that using tftp and execute it (still 16-bit real mode). To avoid the limits, many use iPXE - basically download more advanced loader over tftp and hand over control to it.

For UEFI, there is no magic. There is a partition formatted with the FAT32 filesystem (some implementations also support other filesystems, e.g. Intel supports NTFS and Apple has its own, but the specification makes only FAT mandatory), called the ESP - it has a specific GUID, though - where operating systems place files that they want the boot manager to find. The boot manager is an integral part of the firmware, so if you have multiple operating systems, you can choose which one to boot even if the operating systems (or their bootloaders) do not cooperate and do not allow booting other operating systems on your machine. For OEMs, it allows placing system tools, like diagnostics or system restore, into the ESP too. For power users, you can place a UEFI shell (if your firmware doesn't include one) here. Boot happens in 32-bit protected mode (or 64-bit long mode on x86-64 firmware), and the runtime services are much more advanced - maybe too much, the specification has several hundred pages and many consider it bloated. A thing that might interest some is that you can instruct the firmware how it is supposed to boot next time - you can reboot from the operating system into firmware configuration without pressing magic keys at boot at the right time (`systemctl reboot --firmware-setup`). There's also a standardized way to update the firmware itself ("UEFI capsule"), so there's no problem updating it even if you are not running Windows, which used to be a problem in the past. Network boot can happen via HTTP, including DNS resolution, where DHCP returns a URL and the firmware fetches it. And there are many other quality-of-life improvements.
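The "specific GUID" is how the firmware recognises the ESP in a GPT partition table. A small sketch, assuming the usual 128-byte GPT partition entry whose first 16 bytes are the partition type GUID stored in mixed-endian form; the entries below are synthetic, and the helper names are made up for illustration.

```python
import uuid

# Partition type GUID for the EFI System Partition.
ESP_TYPE_GUID = uuid.UUID("c12a7328-f81f-11d2-ba4b-00a0c93ec93b")
# Type GUID for ordinary Linux filesystem data, used here as a counterexample.
LINUX_DATA_GUID = uuid.UUID("0fc63daf-8483-4772-8e79-3d69d8477de4")

ENTRY_SIZE = 128  # common GPT partition entry size


def partition_type(entry: bytes) -> uuid.UUID:
    """Decode the type GUID from the first 16 bytes of a GPT entry."""
    return uuid.UUID(bytes_le=entry[:16])


def is_esp(entry: bytes) -> bool:
    """True if this partition entry is marked as an EFI System Partition."""
    return partition_type(entry) == ESP_TYPE_GUID


# Synthetic entries: type GUID followed by zeroed remaining fields.
fake_esp_entry = ESP_TYPE_GUID.bytes_le + bytes(ENTRY_SIZE - 16)
fake_linux_entry = LINUX_DATA_GUID.bytes_le + bytes(ENTRY_SIZE - 16)

print(is_esp(fake_esp_entry), is_esp(fake_linux_entry))
```

The `bytes_le` form matters: the first three GUID fields are little-endian on disk, so the ESP GUID starts with the bytes 28 73 2A C1 rather than C1 2A 73 28.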

There have been some machines that had their firmware thrown over the wall (mostly those cheap netbooks and tablets), where if it booted the supplied system, it was considered ready to ship. On normal, business class laptops, enthusiast desktop motherboards or server machines, I've never had a problem (except those unfortunate Gen8 ProLiants, where HP doesn't support UEFI intentionally and limits you to legacy BIOS). The important thing to avoid having problems with UEFI is to treat it as UEFI, not as legacy BIOS.

As I wrote elsewhere on HN, Intel had a roadmap wrt UEFI, where since 2020 they wanted to go UEFI class 3 only (that means no CSM - legacy BIOS - support, UEFI only). Presentation on this should be findable on the interwebs.

I don't use it, and I don't respond in the least bit of snark (honestly!), but it DID surprise me to learn all the things it doesn't support.

I feel like you're making this sound worse than it actually is

* systemd-boot doesn't support those filesystems as the /boot partition, all of those things for your / are totally fine. For most people having your /boot partition be XFS vs FAT32 is firmly in the "who cares?" realm.

* Nobody really supports OPAL, even grub. The only reliable solution is using https://github.com/sedutil/sedutil to unlock the disk and then soft reboot into any normal bootloader.

* You can do mdraid, you just have to stick the metadata at the end of the drive rather than the beginning. The utilities that set up your raid even warn you about this because this setup is so standard.

> systemd-boot doesn't support those filesystems as the /boot partition, all of those things for your / are totally fine. For most people having your /boot partition be XFS vs FAT32 is firmly in the "who cares?" realm.

XFS is also not supported. But yes, your /boot can be FAT32, but saying "nobody cares" is just needlessly dismissive; I can think of many reasons to care at least a little. Primary among them is the lack of encryption and of the ability to mirror easily; another is the possible corruption of /boot (FAT is not typically used for its data consistency properties).

> Nobody really supports OPAL, even grub. The only reliable solution is using https://github.com/sedutil/sedutil to unlock the disk and then soft reboot into any normal bootloader.

I guess this is a fair statement, if you ignore that the "PBA" is actually using grub itself, but yes, the PBA is non-encrypted and usually lives as a separate partition, making it functionally equivalent between systemd-boot and grub2.

> You can do mdraid, you just have to stick the metadata at the end of the drive rather than the beginning. The utilities that set up your raid even warn you about this because this setup is so standard.

You can't boot to an mdraid device with systemd-boot, you can only have a separate (non mirrored or raid'd) partition on a drive, and point to that drive to boot.

> You can't boot to an mdraid device with systemd-boot, you can only have a separate (non mirrored or raid'd) partition on a drive, and point to that drive to boot.

Sorta kinda true in theory but not in practice. Yes systemd-boot isn't raid aware but if you configure mdraid to store its metadata at the end of the drive then you wind up with a, say, mirrored disk that to the firmware doesn't look like it's mirrored at all. And since the firmware just reads the partition it's safe and works.

So yes, you have to configure your firmware to look at both drives in a mirrored setup as a fallback but you can boot.
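As a rough illustration of the "metadata at the end" trick: mdadm's metadata versions differ in where the superblock lives on each member device, and only the end-of-device ones leave the start of the partition looking like a plain filesystem. The table below reflects my understanding of the mdadm documentation; the function name and the rule that only RAID1 qualifies (each mirror member holds a full copy) are assumptions for the sake of the sketch.

```python
# Where each mdadm metadata version keeps its superblock on a member device.
SUPERBLOCK_LOCATION = {
    "0.90": "end",
    "1.0": "end",
    "1.1": "start",
    "1.2": "4 KiB from start",
}


def looks_plain_to_firmware(level: int, metadata: str) -> bool:
    """True if a single array member reads like an ordinary partition.

    Assumption: only a mirror (RAID1) qualifies, because every member
    then carries a complete copy of the filesystem from offset zero.
    """
    if level != 1:
        return False
    return SUPERBLOCK_LOCATION.get(metadata) == "end"


print(looks_plain_to_firmware(1, "1.0"))  # mirror, superblock at the end
print(looks_plain_to_firmware(1, "1.2"))  # mirror, superblock near the start
```

This only captures the read side; it says nothing about what happens if something writes to one member behind the array's back.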

> but saying "nobody cares" is just needlessly dismissive

Fair, I'm more saying that having firmware that only supports a small set of simple filesystems is usually the more normal situation; the fact that GRUB comes with all the drivers makes it the weird one.

> You can do mdraid, you just have to stick the metadata at the end of the drive rather than the beginning. The utilities that set up your raid even warn you about this because this setup is so standard.

You're screwed when the firmware decides to write to the ESP though.

This is 100% true but I haven't been so unlucky as to encounter firmware that does this under normal operation. This has been the "firmware doesn't understand software raid" solution for ages.

Indeed, it hasn't happened to me but I remember reading about someone who discovered the hard way that their firmware fell back to storing boot variables in a file on the ESP for some reason. It's the sort of thing you only notice once you're staring at an error message, wondering why your ESP is corrupted.

It's hardware RAID or nothing, at least for the ESP for me!

> systemd-boot (AKA, gummiboot) is leaner, for sure- much in the same way a bicycle is leaner than a truck.

I think this is the perfect analogy why systemd-boot is better: you don't want to use a truck to get to your local bakery, it is going to be either a walk (ESP boot directly with kernel anyone?) or with a bicycle.

> mdraid being pretty critical to a lot of server work.

No, mdraid is not critical for server work. Never run mdraid for production, it will fail, you will lose data.

Servers use hardware RAID, either on a dedicated card or integrated into the MB.

Can systemd-boot offer me editor at boot time where I can change kernel cmdline?

Is it able to use serial console for remote control?

Can it boot non-Linux non-systemd OSes?

Does it handle encrypted disks better than GRUB2 does?

Does it work well if I prefer bios/dos/mbr mode?

> Can systemd-boot offer me editor at boot time where I can change kernel cmdline?

Yes, just press 'e' as long as editor=yes in the loader config.
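A quick sketch of checking that setting programmatically. As far as I know, loader.conf uses whitespace-separated `key value` lines with `#` comments (so the setting is written `editor yes`), and the editor is on unless disabled; treat both as assumptions. The sample file contents and function names are hypothetical.

```python
TRUE_VALUES = {"1", "yes", "y", "true", "t", "on"}


def parse_loader_conf(text: str) -> dict:
    """Parse 'key value' lines, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def editor_enabled(settings: dict) -> bool:
    """Assumption: the editor is enabled when the key is absent."""
    return settings.get("editor", "yes").lower() in TRUE_VALUES


# Hypothetical loader.conf contents.
SAMPLE = """\
# example loader configuration
default  arch.conf
timeout  4
editor   no
"""

settings = parse_loader_conf(SAMPLE)
print(settings["default"], editor_enabled(settings))
```

Disabling the editor is the usual hardening step on machines where you do not want someone at the console appending things like `init=/bin/sh` to the kernel command line.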

> Can it boot non-Linux non-systemd OSes?

Yes, auto-entries will find Windows automatically and other EFI images can be set up through loader configs. Not sure why it wouldn't work with non-systemd systems considering that's an init system and it's a bootloader.

> Does it handle encrypted disks better than GRUB2 does?

Doesn't the initramfs handle decryption of disks?

> Does it work well if I prefer bios/dos/mbr mode?

No, it doesn't support BIOS boot.

Doesn't chain loading BOOTMGR break BitLocker?

It doesn't chain-load as GRUB/MBR does. For UEFI boot loader, other operating systems are just EFI images. UEFI boot loaders do not care what specific OSes these images contain.

This is also used for boot time or boot repair tools. UEFI shell for example is just another EFI image.

I thought it validates the attestation chain so the TPM won't play ball. Hang on...


I can't find the reference I'm looking for, which was a decent explanation of why the days of traditional-style dual booting, where the Linux boot loader (GRUB) lets you choose between Linux and Windows, are over, but that question basically covers the same ground.

On dual-boot machines, I've always preferred to use the built-in UEFI boot manager (yes, UEFI has one!). This way, Linux boots using systemd-boot/grub/whatever, Windows using NTLDR and whatever else is there, it uses its native way too. It avoids exactly this, where the TPM doesn't consider that there might be an additional component, properly signed, but might cause havoc with sealed values.

Indeed, I only meant that the days of dual booting with GRUB (other boot loaders are available) in the boot chain are over. At least, if you want to use BitLocker with keys in the TPM anyway!

Yes, yes, yes, no if you want encrypted /boot; yes if not, no.


Unfortunately, I think this move would receive quite a lot of flak due to it being systemd. I totally agree though.

I just wish systemd would upstream the ProxMox pve-efiboot-tool. The thing works great, it syncs multiple ESPs on different disks for redundancy and it just has a hook to run on kernel updates, etc. /boot isn't even left mounted during normal operation and only gets mounted to update the contents meaning it's less likely to experience corruption during a power failure or accidentally get deleted by some script gone awry, etc.

Systemd also supports using kexec to reboot a machine booted with systemd-boot for faster OS updates. Of course you can use kexec without all of that, but anyone who has knows that it's not exactly streamlined whereas with systemd-boot it'll use the kernel image, initrd, and kernel cmdline straight from what the bootloader is configured for. For UEFI it really is great, only thing "missing" IMHO is support for adding boot entries for EFI stub kernels directly to skip the bootloader entirely for the fast path and rely on the UEFI bootloader if you need access to systemd-boot or an older kernel. Not too relevant for servers where it shaves off like 700 ms or so but still a neat trick and I like my desktop booting without an extraneous extra bootloader when it's not needed.

If you are wondering why many people hate updating working systems, no matter what the security implications, look no further than this.

Time and again it is an innocent security update that ends up in a reinstall, finding a bunch of bloatware on a system, losing critical functionality, data loss and time lost.

Updates should be restricted to the absolute minimum and tested to the point that deploying them does not put customer data at risk.

This is also why I pay someone to change my oil. It's a simple procedure, but the list of things that can go wrong is a list.

At a small shop I ran our VCS on my workstation (because otherwise we had none) and at the end of a day I rebooted to do some maintenance on the machine, and it didn't come back online until around 1pm the next day. On the plus side we finally got a dedicated server.

That said, you still gotta touch systems.

If you get off the upgrade treadmill, you will find yourself one day facing a CVE issue that has no patches for the versions you use. And then not only do you have to do a security update, you have to fast-track a version upgrade that your code might not work with anymore.

Last time we had this, I got the team to agree to putting one upgrade of OS or libraries per month onto the task list. We even let the assignee choose what they were going to upgrade (a few had no preferences, and so we'd have them work on whatever was most overdue).

We put some caches in place at work recently, to improve perf and give us some protection against partial outages. Already we are seeing services that cannot handle the same workload they could three months ago.

To paraphrase Jim Highsmith: there are no answers here, and we waste enormous amounts of time looking for them. There are only tradeoffs, with their own sets of consequences to be managed.

The tradeoff I prefer is to drill it into people that shit happens. Something is going to be down, and the more things we have, the higher the likelihood that one of them will be down at any given moment. You can put all of your eggs in one basket and have one outage a year, in which everyone not involved in fixing the damned thing might as well go home, or you have a bunch of systems where I can do 80% of my duties if any one of them is offline.

Systems that are never seen to break (or at least stumble) tend to become siloed, insular. Being able to talk about the 'bigger issue' around an outage is often a time when teams finally communicate with each other (unless you have dysfunctional managers, in which case: run)

And if all else fails, there is always some set of tasks you want people to do but they never seem to have the time for. Having a captive audience isn't the worst thing that could happen to you.

> This is also why I pay someone to change my oil. It's a simple procedure, but the list of things that can go wrong is a list.

This used to be me. I motorcycle, but unlike perhaps your typical motorcyclist I don't have a good brain/hand/mind for working on engines. I just tend to do a shitty job, get bored before I'm done, or can't wrap my head around the geometry of something (so I'll almost always turn a clutch adjuster the wrong way).

So I'd take my bike in for everything. But I was noticing that things didn't quite seem actually fixed, so I brought the bike over to MotoGuild, a place where you can rent time on a motorcycle lift and they have all kinds of tools you can use included, and popped the engine cover off.

I found horrifying evidence that I can't trust anybody to work on my bike other than me. The sprocket where the chain connects to the engine was loose to nearly falling off - it should be tightened to a shitload of newtons and secured with a locking washer. The clutch area was chunky with crud, despite me having paid for a clutch cable swap and them assuring me they had cleaned everything to avoid my new clutch cable getting gnarly and gross like the last one did (I'd lubricate it and black nasty would come out the cable cover). The chain wasn't really cleaned at all, they had clearly just run a rag on the outside, but between the links and on the tire side, it was nasty. Therefore, it wasn't actually effectively lubed, either.

Bunch of other random things as well.

I am like the LAST person to ever say "if you wanna get something done right, do it yourself," because I am flush with cash and bereft of discipline and willpower, but holy shit, apparently, if you wanna get something done right...

On anything with two wheels you better be sure that you inspect it and if at all possible work on it yourself. Your life depends on just about every single bolt. Think ground bound helicopter levels of care.

This is why mainframes running COBOL still are a thing.

If it works why muck it up?

People forget how dependent we have become on computerized automation.

> This is also why I pay someone to change my oil. It's a simple procedure, but the list of things that can go wrong is a list.

If you're not going to change your own oil, take it to the dealership or a mechanic that you personally trust (I doubt this exists as mechanics don't really do routine oil changes). Putting the wrong amount or wrong type of engine oil into your engine is the absolute worst thing you can do to your engine.

I encourage everyone to change their own oil. If you run full synthetic (and you should), you will save a fair amount of money. Even if the shop gives you the right weight, and right amount of oil, you're not getting premium oil, and your oil filter is likely the equivalent of an empty tin can at most Tire and Lube type places.

OEM dealerships are the only places I trust to do any kind of fluid change if I'm not doing it myself.

> as mechanics don't really do routine oil changes

What's this now? Every single mechanic I've ever met is more than happy to do oil changes. Even my "BMW guy," who only works on BMWs and regularly works on $100k+ cars and fun projects like performance upgrades is more than happy to change the oil on my decidedly pedestrian BMW (12 year old entry-level 328xi, worth maybe $5-6k tops).

I won't let anyone change my oil because a monkey once severely crossthreaded the drain plug by feeding it on with an impact wrench rather than hand starting it.

I'm fine with that because I just make damn sure they fix it, and if I have to fight at all for it, I take my business elsewhere.

I had a plumber crack the porcelain on the sink when tightening the drain. They told me about it and bought me a new sink. That's my plumber until I move. In another town the same thing happened and they put some silicone on it and didn't tell me. That plumber was never called again except to tell them why.

I failed an emissions test. The repair place near work wanted to replace the fuel injectors, it was going to cost a third of what the vehicle was worth.

The one near my house pointed out that there was a chunk of EGR hose missing and the rest were ratty. They could replace them all for $280 (above the minimum required to address a failed test).

It was the hoses. Went from above the limit on two tests to 10% of the limit. I used that mechanic until I moved across town.

Oh, the pain is real. Cross threading is one of my nemeses. My eye still twitches when I see people cut a bolt on Youtube without threading the nut on first to work as a die (as in 'tap and die'). You're gonna cross-thread that thing! AAH.

I worked as a bicycle mechanic for a bit (lots of bolt cutting, more cross threading than I wanted), but also spent a cold winter working with my dad on a vintage car and I discovered that I don't have as much nostalgia for car work as I thought I might. For the same reason my boss pays for licenses (because you can yell at them and make them pay for the repairs) I pay others to do some of my handiwork.

Despite still being able to turn a bolt the right way while upside down and with the bolt facing away from me, I figure I'm almost as likely to get a rude surprise due to my own stupidity as theirs. If, and this is a big If, I've developed a relationship with the vendor, instead of just driving down to the Jiffy Lube where I'm nobody.

Here's a simple trick that will help you to never ever have a cross threaded bolt: turn the bolt the wrong way until it clicks. That's the step of the threads disengaging from each other. Now they're perfectly aligned to mate so you move forward just a 10th of a turn or so and it will immediately engage. If it doesn't there is something wrong and you need to inspect the threads.

In my opinion, the true antidote is reproducible, and ideally immutable, systems. Maybe this bug would’ve been harder since it’s all the way back to the bootloader, but assuming a more minimal bootloader (perhaps EFIstub directly, or systemd-boot,) you could have something like NixOS where you can, at boot time, jump to a previous configuration with all of the old software.

(Works even better for docker or VM images where you have a supervisor that can move around disk images. Can just construct immutable root images and roll back by booting with an older one.)

I'd also add that you should reboot after updates--even if they haven't touched anything that requires a reboot to put in use.

The reason is that there is plenty of configuration information that is only read during startup, and plenty of programs either only used during startup or that do things during startup that they do not do later.

If an upgrade broke any of that you might not find out until the next reboot. It is a whole lot easier to deal with if you find out right away that the update broke booting than if your first reboot is six months and several intervening updates later.

We hit something like that once. An extended power failure took down several servers. When they came back up several of our things were messed up. We eventually tracked it down to some update months earlier had made a change to how network filesystems got mounted, making them show up later in the boot process--which turned out to be late enough to be after some of our stuff that expected the network drives to be available.

This is especially concerning since the entire value proposition of RedHat is stability and meticulously maintained packages

I thought their value proposition was enterprise support for large companies that want to outsource that?

Well, and they still do a better job than... a lot of companies.

Notably, this is the second time in a few months that this happens. The previous one was a microcode update, though that one was Intel's fault (according to Intel, it happened only when the microcode was loaded by the kernel; they probably had tested only loading it from the firmware).

> according to Intel, it happened only when the microcode was loaded by the kernel; they probably had tested only loading it from the firmware

I followed the Spectre situation quite closely, and never heard anything along those lines. Do you have a source?

Many system vendors pulled the faulty firmware from their own sites, and Dell went so far as to recommend rolling back if you had installed it, so the partner ecosystem didn't want people running the firmware at all.

> > according to Intel, it happened only when the microcode was loaded by the kernel; they probably had tested only loading it from the firmware

> I followed the Spectre situation quite closely, and never heard anything along those lines. Do you have a source?

Sure, it was this one: https://github.com/intel/Intel-Linux-Processor-Microcode-Dat...

"Intel identified an issue when OS loading microcode update revision 0xDC for cpuid 406E3 and 506E3. The microcode update has been reverted to revision 0xD6. This issue does not affect the microcode update when loaded from BIOS."

Ah, I see. This is for a different firmware issue. They have so many these days, it's hard to keep track. :/
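For anyone wondering what "cpuid 406E3 and 506E3" in that note means: those are CPUID signatures, which pack family, model and stepping into bit fields. A minimal sketch of decoding them using the standard x86 layout; the affected signatures and revision 0xDC come from the quoted note, while the function names are made up for illustration.

```python
def decode_cpuid(signature: int):
    """Split an x86 CPUID signature into (family, model, stepping)."""
    stepping = signature & 0xF
    base_model = (signature >> 4) & 0xF
    base_family = (signature >> 8) & 0xF
    ext_model = (signature >> 16) & 0xF
    ext_family = (signature >> 20) & 0xFF

    # The extended fields only apply to certain base families.
    family = base_family + ext_family if base_family == 0xF else base_family
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


# Values taken from the quoted Intel note.
AFFECTED_SIGNATURES = {0x406E3, 0x506E3}
BAD_REVISION = 0xDC


def is_affected(signature: int, revision: int) -> bool:
    """True if this CPU/microcode pair matches the reverted update."""
    return signature in AFFECTED_SIGNATURES and revision == BAD_REVISION


for sig in sorted(AFFECTED_SIGNATURES):
    family, model, stepping = decode_cpuid(sig)
    print(f"{sig:#x}: family {family}, model {model:#x}, stepping {stepping}")
```

Both signatures decode to family 6, stepping 3, with models 0x4E and 0x5E respectively.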


Such updates are rarely deployed into production without testing in non-prod environments.

But that is part of the resistance to updating. A whole pile of testing must be done again.

But the testing should be automated so that incremental changes are just that: incremental. Stop-the-world massive changes are anathema, as it is so much harder to pinpoint the breaking change.

What can you recommend as a balanced linux distro between the constant updates of arch on one hand and mostly very outdated package versions of debian/centos on the other?

You can't start with a wobbly foundation and then build sturdy structures on top, so:

start with Debian stable and add your cutting-edge packages on top, then watch their sources and mailing lists carefully.

There is of course Fedora. The latest is quite cutting edge, but you can choose when to upgrade to the next release. And there are security updates for older releases, so you don't have to immediately upgrade. Still, keeps you in the Red Hat universe.

This is exactly why I settled on Fedora after a few years on Arch, a few on Ubuntu/Debian/etc. Fedora is the best blend I've found. New releases about twice a year but you can skip a release if you want and upgrade every other release (or always stay on release n-1 which some people I know do).

Arch and Fedora follow the same conventions so everything on Fedora is where you expect it from Arch (this also makes much of the great Arch wiki applicable directly).

There will always be a soft spot in my heart for Arch, but Fedora is the perfect balance for me. I can't have my work machine getting broken and forcing me to read forums and mailing lists, etc.

Instead of being afraid of updates, you should be able to roll back any problematic update, and it should be impossible to end up with a broken system because an update was interrupted. The only distros offering that are NixOS and GuixSD.

That's a great start, but does not help if the data was also affected. For instance, running a new version might have upgraded a database to a newer format, or added new data with a format that the older version does not understand.

True, schema upgrades are an issue which hasn't been discussed[1]. Maybe some kind of versioning or pre-update backup in case of non-invertible data upgrade could help?

[1]: https://discourse.nixos.org/t/automatic-database-schema-upda...

I've found Ubuntu LTS to be a reasonable balance between stability and "newness," and many applications and libraries test with it.

Obviously, perfect stability and perfectly up-to-date packages are not possible, but I think Ubuntu LTS is a good compromise. I've used Arch (which I enjoy using to improve my understanding of Linux w/their excellent docs), Debian, Fedora, OpenBSD, etc and I always find myself going back to Ubuntu LTS for my work machine.

Lots of projects also put out PPAs with frequent releases for Ubuntu, which are ideal if you need the very latest and greatest. Just realize that you're relinquishing root access to every PPA you add, so be very careful whose PPAs you install, and try to limit them to the extent possible (same applies to any apt repo, obviously).

You might take a look at openSUSE Leap. I use the rolling update variant Tumbleweed for my laptops. But my feeling is that they are working hard to find the sweet spot between stable and bleeding edge.

So you want stable + cutting edge?

This one is actually quite interesting: on a long running server with kernel livepatching, you might not even notice it if it gets fixed before you reboot the next time.

The stablest systems I had the pleasure to manage were two identical rhel6 clusters in different geographical locations for higher availability and fault tolerance. The systems were installed, turned on and never touched again: kernel 2.6.32, with about six to seven years of uptime up to mid-2019. We did a lot of work on those systems, mounting and unmounting iSCSI devices, starting and stopping stuff, turning network interfaces and the clustered filesystem on and off (thanks to Veritas cluster manager).

The key move was never updating.

Such systems were literally mission critical, without that cluster the whole company was unable to produce its main products.

Considering how much stuff they ran and how many simultaneous users were connected, I was humbled by their stability (and by rhel's stability).

If you're getting angry at this post: the customer was not in the IT field and was completely okay with buying new hardware and doing a full reinstall every X years.

Usually security is one of the top reasons to update software systems. Would a security flaw in these clusters have been an issue? Were they aggressively firewalled?

Yes, there were multiple firewall layers. Also, services were not accessed from outside the corporate network.

The link is to a redhat bug report, but the issue is also affecting other distros: https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1889509

This is why I dislike grub. It's really, really bloated. A bootloader just needs to pick the partition to boot, and little else. I switched to gummiboot ages ago, and it's so simple. There's far less to go wrong (gummiboot got absorbed by systemd, so it's now called systemd-boot)

Am I reading this correctly that "yum update" is all I have to do to screw up an 8.2 minimal install?

sudo yum update

"This update enhances security by making the system unbootable, which is the most secure a computer can be."

After reading this I decided to downgrade my Ubuntu machine for now until it's figured out. There are instructions here: https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/GRUB2Secu... under the heading "DOWNGRADE `GRUB2`/`GRUB2-SIGNED` TO THE PREVIOUS VERSION FOR RECOVERY"

Under the heading is a small shell script that will download the old debs for you. At first glance it looks like you only have to change GRUB2_VERSION and GRUB2_SIGNED_VERSION, but for it to work without wget spamming 404s, you have to replace the entire GRUB2_LP_URL and GRUB2_SIGNED_LP_URL with the links in the little table.

Grub2 is not a bootloader. I'm not even really sure what it is. In Grub 1 you had a configuration file with the operating systems you wanted to boot. Simple, effective. With grub2 all I'm seeing is a bunch of sh scripts that are impossible to write by hand.

Again? I remember when zfs didn't init in time, so grub found no root. That's a fun panic when it's on a virtualization server.

On this subject, a few months ago I had Ubuntu servers suddenly failing due to a bizarre automatic snap update that basically took most of our Docker containers down. Did anyone experience this with production systems?

Any word on whether the equivalent update on Debian has similar issues?

Yes, there are reports of failures on Debian: https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1889509...

Well, a system that can't be booted is totally secure against many kinds of remote & local attacks. So, "mission accomplished", I guess?
