I've come across similar issues with OEM installs before, where the machine simply will not boot if anything but the OEM's factory image, installed from the OEM's install disks, is on it.
Profoundly annoying, but thankfully not commonplace, and the couple of times it's happened to me I've been within the return window, so back they went. I figure that's as big of an FU as I can muster: the OEM dealing with a higher than normal return rate.
Last time it was an HP laptop running a god-awful, bloatware-infested version of Windows 8.1. Never again.
Oh God fuck HP, their EFI always seems to be the most crippled yet most BS-laden one. Unfortunately we currently have a deal with HP at work for workstations and have had all kinds of B/S going on with the setup utility, upgrading it, enabling stuff etc. We were with Fujitsu for ages before, and they mostly just had very stock-looking setups that were even still good old text mode.
Ultimately, each are OS-specific solutions, while it would be interesting to make a crossover between efibootmgr and GetSetVariable (and that C program) to create a tool working on both Linux and Windows with cosmopolitan to restore/hack 8BE4DF61-93CA-11D2-AA0D-00E098032B8C variables, because a quick search on that magic shows some people have uploaded their efivar to github for other models so it must be a common issue!
> Ultimately, each are OS-specific solutions, while it would be interesting to make a crossover between efibootmgr and GetSetVariable (and that C program) to create a tool working on both Linux and Windows
It seems that nobody except for Linux (for some not yet determined reason) is having issues with retrieving EFI variables on this hardware, and one could potentially classify this as a bug in `efibootmgr` as well (due to how it handles creating the new entry in unknown conditions).
In either case, Linux is the only thing affected by this, so in real-world setups Linux is going to be the only boot option that is available while the boot menu is in a broken state.
There's a pretty decent chance nobody else is actually trying. The only other OS getting installed is Windows, and that's probably coming straight from a disk image or a recovery partition.
Especially in laptops, a lot of hardware / firmware issues are simply "solved" by baking a fix into the pre-installed Windows version. It's a solution for 99% of users, so why bother spending time looking into the root cause?
> The only other OS getting installed is Windows, and that's probably coming straight from a disk image or a recovery partition.
> Especially in laptops, a lot of hardware / firmware issues are simply "solved" by baking a fix into the pre-installed Windows version. It's a solution for 99% of users, so why bother spending time looking into the root cause?
The Windows versions I installed for testing were non-OEM versions. They still behaved as expected.
Notably, Windows didn't just know about all the standard UEFI variables, but also about a non-standard one that I added for testing. This means that there definitely is a way to ask for the list of variables so that the UEFI accepts it (sadly, reverse engineering that is a pain), and that the Linux kernel is most likely the place where an actual fix has to happen.
Of course, yes, at the end of the day, the root cause is a specification non-conformity in the UEFI itself.
> (sadly, reverse engineering that is a pain), and that the Linux kernel is most likely the place where an actual fix has to happen.
You did most of the work already, and it's super interesting (or at least, it's the kind of things I find super interesting lol) so you may want to finish fixing the issue?
It's funny how, outside RU.EFI, there are no nice tools for such a basic feature as tweaking UEFI variables, so if you are into this kind of thing, you may also be interested in writing a better efibootmgr: many people (including myself, and now you) are dissatisfied with the issues it can create: https://old.reddit.com/r/archlinux/comments/18j6o7x/rfc_what...
>There's a pretty decent chance nobody else is actually trying. The only other OS getting installed is Windows, and that's probably coming straight from a disk image or a recovery partition.
From TFA: "Note: At this point, I checked that Windows and various other UEFI tools are able to read the variables just fine, so Linux’ output is confirmed to be incorrect."
> From TFA: "Note: At this point, I checked that Windows and various other UEFI tools are able to read the variables just fine, so Linux’ output is confirmed to be incorrect."
I just think it'd be nicer to have a multiplatform way to tweak UEFI boot variables, so you can fiddle with your UEFI variables from either Linux or Windows without having to actually go into the UEFI shell or use a PE32 like RU.EFI: https://ruexe.blogspot.com/
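For what it's worth, the Linux half of such a tool doesn't need much. Below is a minimal Python sketch, assuming the usual efivarfs layout (a 4-byte little-endian attribute word followed by the variable data) and the `EFI_LOAD_OPTION` layout from the UEFI spec for `Boot####` entries. The function names are made up for illustration, and it only reads, it never writes.

```python
import struct

# Bit 0 of EFI_LOAD_OPTION.Attributes marks an entry as active.
LOAD_OPTION_ACTIVE = 0x00000001


def split_efivarfs_payload(raw):
    """Split the contents of an efivarfs file into (attributes, data).

    efivarfs prefixes the variable data with a 32-bit little-endian
    attribute word (non-volatile, boot-service access, runtime access, ...).
    """
    if len(raw) < 4:
        raise ValueError("payload is shorter than the 4-byte attribute header")
    (attributes,) = struct.unpack_from("<I", raw, 0)
    return attributes, raw[4:]


def parse_load_option(data):
    """Decode the data of a Boot#### variable (an EFI_LOAD_OPTION).

    Layout: UINT32 Attributes, UINT16 FilePathListLength, a NUL-terminated
    UTF-16LE description, the device path list, then optional data.
    """
    if len(data) < 6:
        raise ValueError("load option is shorter than its fixed header")
    attributes, path_len = struct.unpack_from("<IH", data, 0)

    # Walk 16-bit code units until the terminating NUL of the description.
    end = 6
    while end + 2 <= len(data) and data[end:end + 2] != b"\x00\x00":
        end += 2
    if end + 2 > len(data):
        raise ValueError("description is not NUL-terminated")

    path_start = end + 2
    return {
        "active": bool(attributes & LOAD_OPTION_ACTIVE),
        "description": data[6:end].decode("utf-16-le"),
        "device_path": data[path_start:path_start + path_len],
        "optional_data": data[path_start + path_len:],
    }
```

On a real system the input would be the bytes of something like `/sys/firmware/efi/efivars/Boot0000-8be4df61-93ca-11d2-aa0d-00e098032b8c`. The device path is left as raw bytes here because decoding it properly is a project of its own.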
> Especially in laptops, a lot of hardware / firmware issues are simply "solved" by baking a fix into the pre-installed Windows version. It's a solution for 99% of users, so why bother spending time looking into the root cause?
I've never really seen a single example of this. Could you provide some?
Around 20 years ago I had a Fujitsu Siemens V3505 that, beyond giving me a lot of grief when it came to getting ACPI, the hardware buttons, or Fn-less function keys to work, had the weird characteristic of requiring a motherboard replacement every single time I installed Linux on it.
As it was advertised as OpenSUSE compatible (and even came with stickers and such), the first one or two times they made a fuss about it, but after that they replaced it pretty much on demand.
I wonder if that was also related to the issue the article mentions. But such a design lasting for multiple years?
That’s an interesting failure mode. Is there any reason why the manufacturer didn’t consider that boot entries can be created and destroyed by the customer?
PC firmware is a mess because pre-UEFI boot behavior was mostly a custom rather than a defined standard, but then standards got added over time. That basically adds up to the success target for most manufacturers being "does it boot Windows?" - once it did that, then they were done.
UEFI was supposed to make this better, I guess, by specifying everything in several hundred (if not thousand) pages, but it added a lot of complexity (esp. with Secure Boot and such). It doesn't help that most end-user visible firmware functions are rarely accessed, so as long as it does boot Windows in the manufacturer supplied hardware configuration, most people won't care or even know of any issues.
PC firmware bugs aren't really anything new. When ACPI first came out in the late 90s/early 00s, initial implementations were buggy, so buggy that I think Windows, starting from Vista, won't boot if the BIOS date is before 2000. The Linux source code has numerous BIOS workarounds in it.
El Torito, the standard enabling bootable CDs, also had problems when it first came out.
I had a Haswell-era Mini-ITX Gigabyte motherboard that had a similar issue. Installing Clover/Chameleon and booting into macOS messed up the EFI and the system wouldn't boot. Went through two motherboards before I bought a different brand.
Early UEFI firmware having quirks like this is pretty common, and it's also not uncommon for the BIOS Menu and other things you'd hope to be persistent (Diagnostics, Secure Erase Tool on Lenovo laptops) to just be boot entries with some hardcoded (but not absolute) protections.
I remember the opposite problem being the case on the T420 BIOS. If you didn't set a newly added entry as NextBoot, it'd just disappear after reboot.
Is there any indication why Linux can't read these efivars? That seems like a short fix away from the research already done, and would take care of everything. (Once you have the C skills already demonstrated in the post, kernel code is not especially difficult to debug.)
As far as I have tracked it down (quite literally up to the point where it switches into the EFI context to run the respective service handler), the UEFI denies a call to `GetNextVariableName` with `EFI_INVALID_PARAMETER` (that part is actually indicated in `dmesg`) even though the request appears to be specification-compliant (and the existing implementation evidently hasn't been an issue on any other notable hardware).
The main issue with fixing it properly is that I'd most likely have to reverse engineer the Windows kernel or the UEFI firmware itself (note to self: I haven't yet checked whether any of the *BSDs can read EFI variables in general and on this hardware in particular) to figure out where the request is going wrong/what Windows is doing differently.
It's not impossible, given that one can unpack the UEFI PI firmware image into all the separate modules, but going through them to figure out where variable management is implemented will still take me a few weeks at least (not due to any particular challenge, it's just consuming a lot of time that I don't have right now).
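For anyone following along who hasn't looked at the enumeration contract: here is a toy Python model of how I read `GetNextVariableName` in the UEFI spec. You start with an empty name, feed each returned name/GUID pair back in, grow the buffer on `EFI_BUFFER_TOO_SMALL`, and stop on `EFI_NOT_FOUND`. `ToyVariableStore` and `enumerate_variables` are invented for this sketch. It models the contract only, not the Linux kernel code and certainly not this laptop's firmware.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "EFI_SUCCESS"
    NOT_FOUND = "EFI_NOT_FOUND"
    BUFFER_TOO_SMALL = "EFI_BUFFER_TOO_SMALL"
    INVALID_PARAMETER = "EFI_INVALID_PARAMETER"


class ToyVariableStore:
    """Stand-in for firmware that follows the spec's enumeration rules."""

    def __init__(self, variables):
        self._vars = list(variables)  # ordered list of (name, guid) pairs

    def get_next_variable_name(self, buffer_size, name, guid):
        """Return (status, size, name, guid), mimicking the in/out arguments."""
        if name == "":
            index = 0  # an empty name asks for the first variable
        else:
            try:
                index = self._vars.index((name, guid)) + 1
            except ValueError:
                # The caller passed a name/GUID pair that does not exist.
                return Status.INVALID_PARAMETER, buffer_size, name, guid
        if index >= len(self._vars):
            return Status.NOT_FOUND, buffer_size, name, guid

        next_name, next_guid = self._vars[index]
        # Size in bytes of the UTF-16 name including its NUL terminator.
        needed = len(next_name.encode("utf-16-le")) + 2
        if buffer_size < needed:
            # Inputs stay untouched; only the required size is reported back.
            return Status.BUFFER_TOO_SMALL, needed, name, guid
        return Status.SUCCESS, needed, next_name, next_guid


def enumerate_variables(store, initial_buffer_size=2):
    """Walk the whole variable list the way an OS-side caller would."""
    buffer_size = initial_buffer_size
    name, guid = "", None
    found = []
    while True:
        status, size, next_name, next_guid = store.get_next_variable_name(
            buffer_size, name, guid
        )
        if status is Status.BUFFER_TOO_SMALL:
            buffer_size = size  # retry the same request with a larger buffer
            continue
        if status is Status.NOT_FOUND:
            return found  # normal end of the list
        if status is not Status.SUCCESS:
            raise RuntimeError(f"enumeration failed with {status.value}")
        name, guid = next_name, next_guid
        found.append((name, guid))
```

The point of the model is that a compliant caller has very few knobs: the buffer size it announces and the name/GUID it hands back. Any difference between a caller that works and one that gets `EFI_INVALID_PARAMETER` has to be in one of those.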
One thing I've gotten used to over the years is Linux complaining that some part of the machine's EFI is buggy no matter what machine it is installed on. Apparently it's just too complicated for hardware manufacturers to get right. I also feel like there is far too much duplicated effort in the industry with everybody making their own version that is slightly buggy in a unique way.
In the old BIOS + ACPI days, the OS carried hardware-specific hacks. These hacks were buggy and hard to keep up to date.
We (the community as a whole) decided it was better to leave the hardware-specific hacks to the hardware; UEFI was supposed to provide enough abstraction for all we need.
The result is, of course, that the hacks, with all their bugs, have moved to the firmware.
Uncompress it, and dd it to your USB drive (`dd if=FreeBSD-14.0-RELEASE-amd64-memstick.img of=/dev/sdb bs=1M conv=sync`, assuming sdb is your USB stick; GNU dd on Linux wants the uppercase `M` in `bs=1M`).
Makes sense! I wonder if there is a way to dynamically watch the Windows call, to compare it with the Linux one, to avoid the tedious reverse engineering. Or if the syntax of Windows GetNextVariableName() use is generally understood/documented?
This could happen either through somehow getting logging from the Windows end, or somehow changing the UEFI to be one you control and logging there, or finding a different BIOS/OS that can read the vars and getting it to log its work.
> Makes sense! I wonder if there is a way to dynamically watch the Windows call, to compare it with the Linux one, to avoid the tedious reverse engineering. Or if the syntax of Windows GetNextVariableName() use is generally understood/documented?
The userspace interface is somewhat documented by third parties (because it is technically internal). However, the important parts happen kernel side, and I'd rather avoid diving too deep into Windows because some very interesting job postings (understandably) have "No exposure to Microsoft code or reverse-engineering of Microsoft software" in them.
I already tried getting to the service handler implementation via Linux, but memory protections made it weird enough that I was even questioning whether it was returning correct raw data when trying to read it from memory (or I have been looking at the wrong set of headers).
Last I checked (which was about a decade ago) Windows doesn't call GetNextVariableName() - it just accesses variables on demand. We should probably handle that in a cleaner way.
It seems like that is no longer the case. I was able to successfully retrieve a non-standard variable using the `UEFIv2` PowerShell module [1] (which is just a thin wrapper around the undocumented `NtEnumerateSystemEnvironmentValuesEx` function) without actually naming the variable in question.
To the untrained layman like me, this sounds like Windows actually is querying via `GetNextVariableName`, because UEFI doesn't seem to offer any other interfaces that aren't "get/set variable by name".
Ok, yes, sounds like it is in that case. Which means figuring out how Linux is doing this differently to Windows, sigh. The easiest validation is to boot Windows under qemu with a debug-enabled EDK2 build to trace the calls.
I guessed which chip it was (with a little bit of help from leaked schematics for similar laptop models), bought the first programmer that I saw online that appeared to be compatible (anything with a CH341A in it, apparently) and ran `flashrom --programmer ch341a_spi -r bios.bin`.
Had there been any more than one possible type of chip and more than three of these similar looking chips on the [accessible part of the] motherboard, I'd probably still be sitting here trying to figure out what to do.
In any case, I'll try and add it to the blog post once I figure out how to do footnotes. :^)
I have tried to use a generic USB CH341A previously, and ended up seemingly frying the chip. By chance, would you happen to know the pinout on the CH341A, or something else that may be useful? I’ve stayed away from using it since, out of fear of frying a more expensive/important device.
The programmer that I bought was already wired up for 24 series and 25 series chips, so I assume the pinout matches whatever is in that respective datasheet.
The 25Q32BVSIG operates on 3.3V, so this particular programmer was compatible by default, but for 1.8V one would have to watch out indeed.
If I may guess: with a dongle supported by flashrom, soldering wires to the flash chip, and a lot of patience, because doing the same thing from, say, Linux can run into permission problems. The firmware really doesn't want you to read the flash chip.
Luckily, it was the type of flash chip that I could use an external programmer and a test clip for. I haven't actually ever tried reading/writing through the internal programmer.
Had to reflash an SOIC I2C chip before and just soldered wires on. Some pins are easier where you have a via to solder to instead of the pin. Had to carefully lift the VCC pin to avoid powering up the whole board. Not all is lost if you break a pin: you can carefully file down a bit of the plastic chip encapsulation to expose enough of the pin to rebridge it once complete.
It's really nice to see how much cheaper ZIF sockets, test clips and programmers have gotten over the last few decades.
I was able to fix a laptop I bricked that way. Note to self: never disable USB on a device with only USB input. The device had no BIOS clear pins, so I got a chip flasher and a SOIC clip and reflashed the firmware the hard way. Luckily it worked "on board".
A process note: I did not really know what I was doing, and nothing was flashing correctly. Most of my problems ended up being with the cheap soic clip I had bought. After buying a nicer one it flashed first try. If anyone wants a recommendation, the nicer clip was from pomona electronics.
> Most of my problems ended up being with the cheap soic clip I had bought. After buying a nicer one it flashed first try. If anyone wants a recommendation, the nicer clip was from pomona electronics.
I can +1 for that clip. Had similar issues pulling data off a flash chip. If `flashrom` could see it at all, each read would come back with a different `sha1` hash.
Spent _way_ too long fighting that before I had the "wonder if this cheap clip is the issue..." thought. The pomona clip is so much better made and holds on to the chip well.
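If anyone wants to automate the "do my reads even agree" check before trusting a dump, a minimal Python sketch could look like the one below. It assumes you have already read the chip several times into separate files (for example by repeating the `flashrom ... -r` call with different output names). `dumps_agree` and the file names are made up for illustration. It uses SHA-256 rather than `sha1`, but any hash works for this purpose.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def dumps_agree(paths):
    """Compare several reads of the same chip.

    Returns (all_identical, {path: digest}). Differing digests usually
    point at a flaky connection rather than at the chip contents.
    """
    paths = [str(p) for p in paths]
    if len(paths) < 2:
        raise ValueError("need at least two dumps to compare")
    digests = {p: file_digest(p) for p in paths}
    return len(set(digests.values())) == 1, digests
```

Something like `dumps_agree(["read1.bin", "read2.bin", "read3.bin"])` returning `True` is no proof the dump is correct, but a `False` is a pretty clear hint to reseat (or replace) the clip before writing anything.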
Interesting, a quick search seems to indicate that it is considered more informal or colloquial even, which is likely why it sounded weird to my ear. But you are right, it is considered grammatically correct.
Someone clearly thought that was a good idea, and I wouldn't be surprised if they thought the same of the bloated monstrosity that is UEFI. "Let's make the setup an EFI application" sounds like a reasonable argument, but they don't realise that it's a very important application, one which should be accessible under all circumstances short of having the BIOS erased.[1]
We're approaching 10 years since this happened: https://news.ycombinator.com/item?id=5139055
And almost 8 years since this: https://news.ycombinator.com/item?id=11008449 https://news.ycombinator.com/item?id=10999335
[1] Most if not all BIOSes on EEPROM (late 90s onwards) before UEFI had "boot block recovery" which would automatically detect if they were corrupt and attempt to recover by flashing from a specially formatted floppy disk.