Replacing exploit-ridden firmware with a Linux kernel [pdf] (schd.ws)
368 points by dmmalam 72 days ago | 78 comments



These slides are from the talk given at the Open Source Summit (formerly known as LinuxCon) last week in Prague. The abstract on the talk page (https://osseu17.sched.com/event/ByYt/replace-your-exploit-ri...) is:

With the WikiLeaks release of the vault7 material, the security of the UEFI (Unified Extensible Firmware Interface) firmware used in most PCs and laptops is once again a concern. UEFI is a proprietary and closed-source operating system, with a codebase almost as large as the Linux kernel, that runs when the system is powered on and continues to run after it boots the OS (hence its designation as a “Ring -2 hypervisor”). It is a great place to hide exploits since it never stops running, and these exploits are undetectable by kernels and programs.

Our answer to this is NERF (Non-Extensible Reduced Firmware), an open source software system developed at Google to replace almost all of UEFI firmware with a tiny Linux kernel and initramfs. The initramfs file system contains an init and command line utilities from the u-root project (http://u-root.tk/), which are written in the Go language.


Because the schedule does not provide it, here's the direct link to the talk: https://youtu.be/iffTJ1vPCSo


Video of the talk was posted yesterday: https://news.ycombinator.com/item?id=15572978


>"UEFI is a proprietary and closed-source operating system, with a codebase almost as large as the Linux kernel"

Does anyone know how they were able to compare the size of the codebase given that UEFI is proprietary/closed-source?


UEFI is just a spec. Intel did release an open source implementation: http://www.tianocore.org


My understanding is that most of the UEFI implementations out in the wild are based on this one, since the spec is pretty big and definitely not something a random mobo vendor would like to implement.


Most UEFI in the wild is based on Tianocore/EDK-II, which has about 1.5 million SLOC, plus about 400,000 source lines in header files. Some subset of this would actually be running on the device, but this doesn't include the drivers and the very base of the firmware. Let's say maybe two million SLOC is running on a typical UEFI firmware blob.

Linux also lacks the most basic level of the firmware (which is most likely handled by Coreboot? I'm not fully clear on what NERF is overall). A much smaller proportion of the Linux codebase would be running on a given device, as compared to the EDK-II, though. Who knows. /kernel and /arch/x86 together make about 440k SLOC.

Funnily enough, this is actually the second time a "Linux BIOS" has been tried. The first time was called LinuxBIOS (later renamed to Coreboot), and the idea was quite literally to put a kernel on ROM and have as little firmware as possible to get to a booted system.


Non-disclosure agreement? Or just asked the UEFI vendor(s)?


Compare the size of the binary, perhaps?


While that would also be a somewhat useful measure in this context, there's very little correlation between the size of the codebase and the size of the binary.

Programming languages and styles can vary in verbosity a lot. For example, if UEFI is written in some form of assembler code, then you quickly get a lot of lines of code while the binary barely grows. On the other hand, high-level programming languages like Java, or even more so functional and logic programming languages like Haskell and Prolog respectively, can generate huge binaries out of even just a few lines of code.


What is Apple’s privacy reasoning for accepting UEFI?


Cost.


dotTK, now there is a TLD I haven't seen in a loong while


The last time I saw them with any regularity they were mostly spam and guild/clan pages.


Ever since I read that Intel's UEFI reference implementation has a complexity comparable to the Linux kernel[1], I've thought it's absolutely nuts to have something that complex just to set up the hardware for another operating system. And that's just UEFI, not even touching ME and other stuff.

[1]: https://mjg59.dreamwidth.org/10014.html


On the other hand, isn't quite a lot of the code in the Linux kernel for the same thing, handling hardware complexity?


Even if it is, that still means you're doing it again.


Power9 servers appear to already do this. They have minimal firmware to load the on-ROM Linux, which then uses a userland application (petitboot) to figure out which installer media or already installed Linux system you want to kexec() into:

https://sthbrx.github.io/blog/2016/05/13/tell-me-about-petit...

This seems logical: if you are going to run Linux, why reimplement all the drivers in UEFI?


Nitpick: NERF isn't coreboot, not even "coreboot 2". It's a separate project.

Nice project though, and sad that it's needed in the first place. Oh well.


So far AMD has ignored fixing this pretty much because "only privacy nerds" care about it. Well, some of those privacy and security nerds happen to work for companies like Google and even have influence over those companies' hardware purchases.

So now AMD has just lost a big chunk of potential Google business, and who knows what kind of other potential contracts, because it continues to ignore this issue instead of actually showing some leadership on it.

Maybe it's time for AMD to actually differentiate from Intel on this issue?


Devil's advocate: if large customers care, AMD can easily give them a "special" microcode image with it disabled. Intel does the same thing with the "High-Assurance Platform" for the intelligence community, which is effectively a ME killswitch.

However, these fixes are generally not offered to the public even when they already exist (again, see: Intel HAP).

The rules of the game are different when you're one of the Super 7 instead of just some random pleb. And increasingly those guys are no longer interested in leaving this threat vector open, which says something.


>" Intel does the same thing with the "High-Assurance Platform" for the intelligence community, which is effectively a ME killswitch."

Do you have any links or documentation you could share regarding these exceptions? Thanks.



Seeing that the intelligence community has a special mode to disable ME simply confirms that the ME is exploited, if any further proof was needed.


This article discusses Intel Management Engine and a company offering laptops with it disabled: https://www.theregister.co.uk/2017/10/21/purism_cleanses_lap...


>> Well, some of those privacy and security nerds happen to work for companies like Google and even have influence over those companies' hardware purchases.

>> So now AMD has just lost a big chunk of potential Google business...

> Devil's advocate: if large customers care, AMD can easily give them a "special" microcode image with it disabled...

> However, these fixes are generally not offered to the public even when they already exist (again, see: Intel HAP)...

Google could take the "don't be evil" motto one step further: they could require CPU vendors to give them publicly available microcode that disables these "features." Basically, they could use their market clout to push the PC market in the right direction.


Google servers and home users have very, very different privacy and security needs. The bottom line is that the biggest threat vector on most home machines is the user. Securing a PC against threats when the user installs a browser extension for coupon printing and blasts through UAC prompts without reading them is very, very different than securing a server farm.


There's also the factor that some people, at least, would like to secure PCs from the user, particularly in the case of DRM video (which has an implementation in Intel ME). There's no real equivalent for Google servers.


Maybe it's time to bag the stupid X86 instruction set entirely and move to a 64-bit chip that has a sane instruction set and that embraces loyalty to the end user all the way down to the metal.


> "only privacy nerds"

There's nothing weird in caring about security.

Data in tech companies and banks is worth hundreds of millions. Attackers are therefore willing to pay millions for a ME 0-day.


What I find shocking is that nobody among the big names who should be conscious about security seems to give a rat's ass about that. If I were, say, a high-profile banker and someone told me that all my computers contained code that could be used to spy on me, my data, my business and my personal life, that would drive me crazy.


The high profile banker knows that computer security can be a money sinkhole. Unfortunately there's no clear and measurable relation between how much you spend on security and the benefits you get.


I find the use of Go for the userland pretty cool. It makes sense in this case considering if you disable cgo the only external dependency you have is the Linux syscall table.

I've implemented a number of system tools in Go and it is pretty well suited to this sort of job IMHO. One particular case that I've struggled with, however, is when I've needed to modify some characteristic of my running thread using an OS mechanism. For instance, say I have a number of goroutines running and then I want to make sure a filesystem is unmounted from all mount namespaces the kernel has. My only pure-Go option right now is to execute another Go program (or re-execute the same program with some arguments) that will enter the namespace via syscall, do the work in that namespace and then quit. If you have a bunch of namespaces this seems wasteful.

This is because you don't have complete control over what Go does with the OS thread you are working in. Yes, you can lock your goroutine to a particular OS thread, but you can't stop Go from using that somewhat special and potentially more privileged thread to create other OS threads to service goroutines, thereby potentially sprinkling your goroutines with different capabilities and/or namespaces.

I ended up using cgo, which was a shame. Perhaps someone knows some neat trick to work around this?


Hey there.

In snapd (which is mostly implemented in Go) we have this problem a lot. The real issue is that certain system calls fail if more than one thread exists in the calling process. One of those is setns(2), as is documented in the manual page.

What I ended up doing is to use a small C preamble that parses command line arguments, figures out where to go and uses setns before the go code even begins initializing.

This solved the particular case we were working on, but in my opinion golang's opinionated approach to threading makes it unsuitable for writing many system tools.

My wishlist item for golang 2.x is a build mode where threading is 100% under developer control but this seems to be at odds with the design for non-blocking IO.


Interesting. I don't have access to the code I was talking about anymore, but I'm pretty sure that I did have different threads in different mount namespaces.

Looking at the man-page briefly it looks like the restriction you are talking about applies only to user namespaces.

Interesting to know that there are even more situations to be considered.


The question is,

Why Linux? You already have Linux for OS, why the same on firmware?

Why not Minix? ( Used in Intel ME Already )

Why not OpenBSD? ( Very Secure )

Why not NetBSD? ( Extremely Portable )

Why not FreeBSD?

Why not Coreboot?

Why not the open source UEFI implementation?

Why not switch back to simple and easy BIOS?


> Why Linux? You already have Linux for OS, why the same on firmware?

The Linux they're flashing to the firmware ROM is a custom minimal build. Using that, they can then boot a standard distro kernel. And depending on your threat model, you might not need to update the firmware kernel as often as the distro kernel, as it's only used for booting. Note that this is for servers; for an embedded system you might as well boot directly to the final kernel.

> Why not Minix? ( Used in Intel ME Already ) Why not OpenBSD? ( Very Secure ) Why not NetBSD? ( Extremely Portable ) Why not FreeBSD?

A few guesses:

- Linux is the one most familiar to the developers of this. And to the Google sysadmins, presumably.

- Compatibility: If the distro kernel resides on, say, an XFS filesystem on an MD-RAID device, the firmware boot kernel needs drivers for that.

- It uses kexec which is a quick and easy way to boot a Linux kernel from another Linux kernel. If you want to boot from another OS you'd have to figure out something else.

> Why not Coreboot?

Coreboot is preferable, but since the low-level firmware on modern x86 server platforms is tightly controlled with hardware signing keys, no data sheets released etc., coreboot hasn't been able to run on an Intel x86 server platform for the past decade. This project (NERF) is a pragmatic compromise, by replacing only the upper parts of the firmware stack and removing/disabling the ME as much as possible.

> Why not the open source UEFI implementation?

It's designed-by-committee crap which is far less battle tested than Linux.

> Why not switch back to simple and easy BIOS?

It's less evil than UEFI sure, but I guess it's unfortunately only a question of time until it's removed from PC firmwares (Apple already did it many years ago, AFAIU). Besides, AFAIK switching to BIOS mode does nothing to disable the ME.

And if you're going to do something from scratch, you might as well use something sane like coreboot.


> I guess it's unfortunately only a question of time until it's [BIOS is] removed from PC firmwares

As far as I know, Legacy Boot, an option on (most/all) current PCs that simulates BIOS and which many people confuse with disabling UEFI, is actually a mode of UEFI. For more detail, look up Compatibility Support Module, and the difference between Class 2 and Class 3 UEFI.


Thanks, I stand corrected.


There's also Open Firmware.[1] Sun and Apple used to use that.

[1] https://en.wikipedia.org/wiki/Open_Firmware


It sounds like the whole point of this project is to not have two separate OSes. Linux+Linux is much simpler than Linux+Minix or whatever.


If you are going to use a Linux kernel (on layer 0) anyways it reduces the attack surface by half!


Mods, please change title to replace Coreboot with "Linux kernel". This is not Coreboot and doesn't involve Coreboot at all :)


I watched the talk video and I don't remember it being stated anywhere that Google will replace UEFI/Intel ME. The correct title is "Replace your exploit-ridden firmware with a Linux kernel".


Doesn't Coreboot implement UEFI? I suppose a more accurate title is "replace proprietary UEFI implementation"... nonetheless, the idea of putting a full Linux kernel in the firmware (if I'm reading the article correctly), presumably just to boot another one in the actual OS, despite the openness, sounds like it would increase complexity even more.

I think Linus' opinion remains relevant over a decade later:

http://yarchive.net/comp/linux/efi.html

https://plus.google.com/+LinusTorvalds/posts/QLe3tSmtSM4

I'd actually love to see a "replace UEFI with regular BIOS" project ;-)


Over 20 years ago, we had MILO, the "mini-loader" for DEC Alpha machines that was a Linux kernel used as a bootloader to find the real Linux kernel we'd then boot. In that case, it would initialize built-in device drivers to be able to use the network or the block devices, load another kernel file into RAM, and then kexec the new kernel and disappear until the next system reset.

It is an interesting wrinkle to ponder leaving such an embedded kernel running in place of UEFI or other SMM hypervisors. Should there be a new syscall layer between the two kernels, or are current SMM traps, ACPI, and UEFI constructs really the right way to do it, even if the same open source community is in charge of both layers?

However, a general practice has been that the firmware loaders would be replaced less often and stick with "known good" versions, while the booted kernel could be updated more often. But in this new and threatening world, the firmware kernel can be "known bad", so you need a way to push out updates frequently. You also need a fairly good fail-safe technique to recover when the new update isn't so good after all. And you need to avoid solving this problem with yet another firmware layer which lets you choose between your firmware copies, but itself can be compromised and cannot be easily patched...


> Doesn't Coreboot implement UEFI?

No. Though I think there is some work to allow people to build a coreboot + upper layers of UEFI (presumably using the open source tianocore UEFI implementation) combination in order to boot operating systems that require UEFI. Just like it's possible to build a coreboot + seabios combination in order to boot OS's that require BIOS services (such as DOS).

> nonetheless, the idea of putting a full Linux kernel in the firmware (if I'm reading the article correctly), presumably just to boot another one in the actual OS, despite the openness, sounds like it would increase complexity even more.

UEFI is very complex. E.g. it contains a TCP/IP stack (v4 & v6), and whatnot. The Linux IP stack is certainly a lot more battle tested than the UEFI one. And further, the idea is to allow the owner of the hardware to update the boot Linux kernel and not be beholden to the whims of Intel and the HW manufacturer, who may not have the owner's interests as a primary concern.


> UEFI is very complex. E.g. it contains a TCP/IP stack (v4 & v6), and whatnot.

For better or worse. Most regular BIOSes contain IPv4 for PXE anyway.

UEFI just allows you to do PXE a little more cleanly directly from firmware without needing a disk to bootstrap the process with other stuff like iPXE.


> Most regular BIOSes contain IPv4 for PXE anyway.

Indeed. But that's UDP only, not TCP which is a lot more complex. UEFI contains TCP as well (IIRC recent versions of UEFI support boot over HTTP, similar to iPXE).

> UEFI just allows you to do PXE a little more cleanly directly from firmware without needing a disk to bootstrap the process with other stuff like iPXE.

That's correct. OTOH when your firmware + bootloader stuff requires a TCP/IP stack, HTTP, support for booting from all kinds of software RAID setups, filesystem support in order to be able to find and load the kernel, and whatnot, replacing all of that with a minimal Linux kernel + userspace doesn't sound so crazy anymore.


> Most regular BIOSes ...

Can you still buy machines with non-UEFI BIOS from mainstream manufacturers?


No, UEFI actually leaves components running after the OS boots. This talk discusses how to replace the proprietary UEFI BootManager with a Linux kernel, and then use it to kexec the Linux kernel from the boot device.


From the presentation, it looks like, as of today, a Chromebook is way more secure than a MacBook. Is that right?


I don't think much is publicly known about Apple's UEFI implementation; what if it's already stripped down?


They confuse exploit with vulnerability. Exploit = ready made program to take advantage of a security hole.


Yeah, but it's also extremely common to conflate those two. It's not as bad as how everything was somehow a "zero day" for a few years there.


What are the reasons the hardware the microcode is running on isn't directly exposed to the user? Why shouldn't we teach our compilers to generate code for the lowest level possible?


There is actually a significant technical issue with user-supplied microcode on modern fast CPUs: the control store is not fully writable and only small parts of it can be patched in the field. The reason is that the control store is probably the fastest block of memory in a modern PC, as it has to have latency on the same order as the CPU clock period and usually has a very wide data bus. Implementing this as entirely writable RAM would be uneconomical.


Backwards compatibility is a big reason. If the OS handles everything then you can only use an OS that has the latest drivers. Driver bugs could also brick your hardware in some cases.


Slide 3 refers to "ISH" and "IE" among "X86 CPU(s) you don’t know about". What are they referring to?


ISH: Integrated Sensor Hub

IE: Innovation Engine

There's also the PCU Package/Power Control Unit that no one talks about.


So many layers with a single goal: preventing the user from owning the hardware.


Does this mean we end users can also apply it to our own gaming PCs, for example?


AFAIU at this point it's quite early, and the way they're working on it at the moment is to remove the ROM flash chip (which, depending on the motherboard, may require desoldering the chip) and flash it with an SPI programmer.

If you're up to that stuff, I think they'd be happy to have your help. Otherwise maybe you should wait and hope they'll figure out how to flash it from Linux, and make it robust enough that there's a relatively low risk of bricking your system.


Isn't GRUB supposed to play that role? Why a full kernel?


I think grub would usually be called after the boot firmware (like UEFI, or coreboot, or a traditional BIOS) has already initialized the hardware. Some of these systems can also directly load your operating system, and I would guess that one of the few things missing from the boot firmware would be the filesystem support, and not much more.


  > The problem
  > ● Linux no longer controls the x86 platform
  > ● Between Linux and the hardware are at least 2 ½ kernels

Please release it for the general public, so that everyone has the choice (for Intel+AMD CPUs).

  > code you don't know about: Kernel -3, Minix 3

WTF, "Minix 3" running invisibly behind the scenes?


Parts of Intel ME are probably derived from Minix 3.

http://blog.ptsecurity.com/2017/04/intel-me-way-of-static-an...


Compiling on demand is cool, but why is it needed?


If I understand correctly, it's to make it more auditable. Still, the toolchain remains prebuilt, so the trusting-trust issue is still there. More auditable is better than less auditable.


Why is compiling on demand more auditable than just having open source BIOS?


UEFI is open source to some extent. There is an Intel project [1] that every company's UEFI is based upon. Even having a fully open source BIOS is not enough by itself. The binary that is on your system can be anything. You can somewhat trust it only if you can verify it, which requires the project to use reproducible builds. But the hardware can still lie to you. What I also have in mind is the trusting trust issue [2].

[1] https://github.com/tianocore/edk2

[2] http://wiki.c2.com/?TheKenThompsonHack


How I read the title: replacing one piece of exploit-ridden software with another piece of exploit-ridden software (but this one is open source).


Linux has undergone a lot more (hostile and friendly) penetration testing and bug-fixing than UEFI.


Linux is still certainly more "bug-ridden" than UEFI, because of size alone.


From the abstract:

> UEFI is a proprietary and closed-source operating system, with a codebase almost as large as the Linux kernel...

And this kernel build is a trimmed-down one, removing functionality that isn't necessary to initialize the hardware and boot the full kernel.


The other one is also open source (maybe a bit less). However Linux should be a bit less exploit-ridden


So there is the Intel reference implementation. But vendors are free to make changes, and their versions are not open source.


64x



