
Security of BIOS/UEFI System Firmware from Attacker and Defender Perspectives - adulau
https://github.com/advanced-threat-research/firmware-security-training
======
oneplane
And this is why I'm trying very hard to get coreboot to work on my systems, and
I'm very eager to see Libreboot-type of FSP/BSP-free images in the future so
we can actually verify our boot chain.

Keeping all of the firmware code secret makes no sense, and seems to be
artificially enforced by 'patents', and 'trade secrets' while most likely the
vendors are trying to keep buggy code secret and hope that obscurity alone
will keep them safe.

~~~
minipci1321
In my opinion, vendors are oblivious to the notion of buggy code. From a
product-release standpoint, there is only tested or untested code. While
personally I wouldn't be surprised if someone knowingly released firmware
containing known "stability issues", I doubt any vendor hides source code out
of fear that someone else will discover some hypothetical bugs. Among other
problems, finding bugs in unfamiliar firmware is not so simple, as it requires
an intimate knowledge of the underlying hardware (and a fair share of the bugs
are closely related to its quirks), which might not be as familiar as, say,
the well-established (and relatively well-documented) Intel Architecture.

------
zkms
Seeing slides in BIOS-UEFI-Security.6-Mitigations.pdf that imply that there's
critical crypto being done in SMM mode makes me feel unfathomable
hopelessness. The whole x86_64 platform security model (which includes all the
privilege levels and the corresponding access control mechanisms) is one hell
of an overgrown clusterfuck that could not be _more_ hostile to formal
verification.

There's lots of wantonly convoluted stuff going on -- a random example is the
access control mechanism for the PCH's GPIOs (search "GPIO registers
lockdown", in quotes). This isn't, of course, implemented with a register
which contains bitfields determining which privilege level can write to which
sets of registers. That would be too simple. The GPIO Lockdown-Enable bit
_can_ [sic] be changed by the same software that the lockdown mechanism is
meant to deny access to! This seems like utter pointlessness -- what use is an
access control mechanism if the agent being restricted can change the
parameters of the mechanism willy-nilly -- but Intel has a solution! Changing
the lockdown-enable bit triggers an SMI and shunts control to SMM mode, which
is, naturally, expected to include code that figures out that this bit should
not be disabled, and is to flip it back on (and return from the interrupt,
having trashed the caches a bit).

This is, of course, a pathologically needless level of complexity -- in the
ARM world, we have registers with silly names like "NSACR" that higher-
privileged execution modes can set to restrict access to certain resources.
There's certainly no BIOS-OEM-provided code that needs to exist and be correct
in order to implement such a basic task. In the end, this level of access-
control is equivalent to a bloody Boolean operation or two, for heaven's sake!
All the CPU needs to do is to decode the instruction, realise it's a
potentially-privileged instruction, decode what the instruction tries to
modify, look up in a table which register holds the relevant permissions
bitfield, and do the relevant boolean operation between that register's
contents and an appropriate bitmask, and fault depending on the result. Since
there's an extremely close match between the properties that need to hold
(the truth table of all combinations of "privilege level _x_ can modify
resource _y_ iff bit _n_ in register _z_ is set") and the mechanism that
enforces it, it's easy to reason about this scheme and not difficult to either
prove an implementation correct or find counterexamples.
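
A minimal sketch of that kind of check, in Python. The resource names, bit
positions and the two-level privilege split are hypothetical and correspond to
no real ARM or x86 register layout; the point is only that the whole policy is
one table lookup and one AND.

```python
# Toy model of a register-based access check.
# RESOURCE_BIT and the "privileged" flag are illustrative assumptions,
# not the layout of NSACR or any other real register.

class AccessFault(Exception):
    """Raised when a less-privileged caller touches a restricted resource."""

# Hypothetical mapping: resource name -> bit index in the permission register.
RESOURCE_BIT = {"gpio": 0, "fpu": 1, "debug": 2}

def check_access(perm_reg: int, resource: str, privileged: bool) -> None:
    """Fault unless the caller is privileged or the resource's bit is set."""
    if privileged:
        return
    mask = 1 << RESOURCE_BIT[resource]
    if (perm_reg & mask) == 0:
        raise AccessFault(resource)

def allowed(perm_reg: int, resource: str, privileged: bool) -> bool:
    """Boolean wrapper so the full truth table can be enumerated."""
    try:
        check_access(perm_reg, resource, privileged)
        return True
    except AccessFault:
        return False

def truth_table():
    """Enumerate every (register value, resource, privilege) combination."""
    return {
        (reg, res, priv): allowed(reg, res, priv)
        for reg in range(1 << len(RESOURCE_BIT))
        for res in RESOURCE_BIT
        for priv in (False, True)
    }
```

Because the state space is this small, `truth_table()` can simply be compared
against the intended policy, entry by entry.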

Meanwhile, the "wake up SMM and hope it'll countermand the illegal write"
scheme depends on a lot more machinery. How does it work on a multi-
core/multi-socket platform? How does this mechanism interact with the caches
or the memory model? Is it possible to set up a race condition where the
illegal write ends up going through uncountermanded because SMM mode can be
made to not see the register in an illegal state? This is orders of magnitude
more complex than analysing a lookup-table and a
bitmask -- we need to understand the semantics of memory reads/writes, of
caching, of mode switches to SMM and the SMI interrupt, and how all of this
clusterfuck is affected by the fact that there are _multiple cores_ in our
system. LANGSEC people will call this a "shotgun parser" -- when input data
checking / recognition is interspersed with processing logic.
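
To make the ordering concern concrete, here is a toy, single-threaded Python
model of a "let the write land, then have a handler undo it" scheme. It is not
a model of real Intel hardware: the event names, the `LockRegister` class and
above all the assumption that another agent can act between the write and the
handler are hypothetical. Whether a real platform permits that ordering is
exactly the open question raised above.

```python
# Toy model: the lock bit can be cleared by software, and a separate
# handler (standing in for SMM code) is expected to set it again.
# All names and the event ordering are illustrative assumptions.

class LockRegister:
    def __init__(self):
        self.lock_enabled = True
        self.gpio_writes = []  # writes that actually reached the GPIO block

    def write_gpio(self, value: int) -> bool:
        """GPIO writes are dropped while the lock bit is set."""
        if self.lock_enabled:
            return False
        self.gpio_writes.append(value)
        return True

def smi_handler(reg: LockRegister) -> None:
    """Stand-in for the firmware-provided handler: re-assert the lock."""
    reg.lock_enabled = True

def run(events):
    """Replay a sequence of events against a fresh register."""
    reg = LockRegister()
    for ev in events:
        if ev == "clear_lock":
            reg.lock_enabled = False  # the write lands; the handler runs later
        elif ev == "smi":
            smi_handler(reg)
        elif ev == "gpio_write":
            reg.write_gpio(0xFF)
        else:
            raise ValueError(f"unknown event: {ev}")
    return reg

# Handler runs before anything else happens: the illegal write is blocked.
safe = run(["clear_lock", "smi", "gpio_write"])

# Something slips in before the handler: the illegal write goes through.
raced = run(["clear_lock", "gpio_write", "smi"])
```

In this model the outcome depends entirely on event order, so any argument
that the scheme is safe has to rule out the second ordering on every core,
which the bitmask scheme never has to do.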

Even if all of this miraculously works and there's literally no way that all
the cores working together can send an illegal write that SMM code won't
countermand -- there's still the issue of making sure that the specific SMM
blob that our BIOS OEM wrote cooperates properly with this. Indeed, making
BIOS OEMs implement these sorts of convoluted and critical mechanisms and
expecting them to get all of them perfectly right requires a level of optimism
that doesn't yet exist. The situation has devolved to the point that there is
literally a tool called "chipsec" that lets you test for the presence of a
handful of well-known security-critical things (from time to time someone
discovers a new one, of course; UEFI/ACPI and the x86_64 privilege model are
too complex for anyone to be sure that we've found all the issues) that UEFI
programmers are _notorious_ for messing up. That this tool needs to exist is
shameful. Of course, the security of the x86_64 platform doesn't just depend
on a bunch of magic access control registers being set right; there's Turing-
complete code that needs to be implemented by the BIOS OEM (and runs in the
most privileged execution mode that isn't Intel's "management processor") that
is _security-critical_, and, well, it's hard to prove that arbitrary Turing-
complete code behaves correctly.

The auxiliary CPU mode (SMM) initially meant to hide APM and emulate PS/2 mice
in 90s-era computers is now _critical_ to platform security, and does
dangerous stuff like crypto and handling pointers from UEFI / the OS. Every
few months someone I follow on Twitter finds some new way to trick some
widely-deployed SMM code into writing to a memory region it shouldn't; it's
quite depressing. Great. Another pointless defender/attacker arms race that
the defenders could decisively win had Intel thrown away the spitefully
complex intricacies of SMM and the x86 security model and replaced them with a
clean, formal-verification-friendly set of privilege levels whose correct
operation doesn't depend on platform firmware code. Even AArch64 is less
broken when it comes to this.

