OpenBSD Will Get Unique Kernels on Each Reboot (bleepingcomputer.com)
409 points by SwellJoe on July 6, 2017 | 111 comments

In addition to a new kernel at boot, several libraries in base are also randomly re-linked, including libc and libcrypto, which are prime targets.



This means that in addition to the dynamic linker loading libraries at random addresses, in a random order, the offsets inside the library itself are different on each boot, and on each system.

Robert Peichaer (rpe@) added the kernel re-linking at install/upgrade: https://marc.info/?l=openbsd-cvs&m=149884116824098

Any ideas how these are randomized? Just wondering if these are still vulnerable to a timing attack...

What part? At least for the library randomization, a special libc.so.X.Y.a is part of the base install, objects are extracted with ar(1), and then re-linked in a random order (sort -R/arc4random(3)) to create a working shared library.
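A rough sketch of the "re-link in random order" step described above, in Python rather than the shell and make machinery the base system actually uses. The object names, the output name, and the `cc -shared` command line are placeholders for illustration, not what OpenBSD runs:

```python
import secrets

def random_link_order(objects):
    """Return a shuffled copy of `objects`, using the OS-backed CSPRNG."""
    shuffled = list(objects)
    secrets.SystemRandom().shuffle(shuffled)
    return shuffled

def link_command(objects, output="libdemo.so.1.0"):
    # Hypothetical command line; the real build has its own flags and scripts.
    return ["cc", "-shared", "-o", output, *random_link_order(objects)]

objs = ["a.o", "b.o", "c.o", "d.o"]  # stand-ins for objects extracted with ar(1)
cmd = link_command(objs)
print(" ".join(cmd))
```

The same set of objects goes into every link; only their order, and therefore the offsets inside the resulting library, changes from run to run.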

The kernel has a similar reordering stage, best explained by Theo de Raadt: https://marc.info/?l=openbsd-tech&m=149887978201230&w=2

The full implementation is in the tree.

Isn't secure random number generation a problem immediately after boot? Curious what arc4random is using as an entropy source.

OpenBSD's bootloader seeds the kernel, mixing in other entropy sources. High quality random numbers are available very early compared to other operating systems.

See Theo's talk from Hackfest 2014, an updated talk originally given at EuroBSDcon 2014:


Page 19 and beyond explain this in detail: https://www.openbsd.org/papers/hackfest2014-arc4random/mgp00...

Does it actually do the randomisation at boot time? It's unclear from the article, but OpenBSD could be randomising the next kernel to boot after the current one has been loaded. This could be performed some time after the boot process, allowing more randomness, and minimising the boot-time workload/code complexity.

No. After boot time. As you said.

Many modern systems have hardware RNGs, but they may also be using stored seeds.

Yep, there's a stored seed (/etc/random.seed) that gets added to the mix at boot.

> Many

^[which?] ^[citation needed]

On desktops, Intel Ivy Bridge and newer and everything AMD since June 2015.

On mobile, most mobile SoCs include security stuff, Qualcomm seems to have had them since at least the Snapdragon 805. See here for the addition of the RNG to the linux kernel in 2013: https://lwn.net/Articles/570158/

Even common embedded SoCs like those used in the ESP8266 include hardware RNGs.

Really, there's no excuse for not using it as at least one factor. If you're concerned about possible backdoors, xor it with your own CSPRNG in software like the Linux kernel does.
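A toy illustration of the XOR idea, not any kernel's actual code. The function name is made up, and the all-zero "hardware" output stands in for the worst case of a broken or backdoored RNG. Assuming the two sources are independent, the XOR is no easier to predict than the stronger of the two:

```python
import os

def mix_entropy(hw_bytes: bytes, sw_bytes: bytes) -> bytes:
    """XOR two equal-length byte strings together."""
    if len(hw_bytes) != len(sw_bytes):
        raise ValueError("inputs must have the same length")
    return bytes(a ^ b for a, b in zip(hw_bytes, sw_bytes))

hw = bytes(16)        # assumption: a hardware RNG that only ever returns zeros
sw = os.urandom(16)   # software CSPRNG output
mixed = mix_entropy(hw, sw)
```

With the hardware side stuck at zero, `mixed` is simply the software output, so the bad source cannot make things worse on its own.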

TPM has one too.

Recent AMD and Intel systems with AES-NI provide a hardware RNG. Although as far as I was aware this is not used on OpenBSD for fear of hardware backdoors.

See https://en.wikipedia.org/wiki/RdRand and (same page) https://en.wikipedia.org/wiki/RdRand#Reception for info on concerns about backdoors.

> Having KARL on other OS platforms would greatly improve the security of both Windows and Linux users.

This is surely true, but at least on Windows the central security holes do not lie in Windows itself (these kinds of holes exist - but exploits are very expensive, which shows that they are typically rare and not easy to exploit), but in third-party applications.

For example, the current 2017 version of the Petya ransomware was spread via a security hole in the software update mechanism of the Ukrainian tax preparation software M.E.Doc. Other well-known attack vectors that are commonly used to attack Windows PCs are Flash Player and the Java browser plugin.

Well, it's true that the initial vector is often third-party software. But once you're able to run arbitrary code in a user-mode process running in a limited security context, you still need to attack some high-privilege component to get full control of the machine. Usually this component is the kernel, so additional kernel mitigations do help protect you.

Also, at least for those of us on Windows 10 Professional and Enterprise, there is secure kernel.


"This is surely true, but at least on Windows the central security holes do not lie in Windows itself"

They don't ? IIRC, stuxnet was an autorun exploit of some kind and the recent unpleasantness was all based on built-in SMB functionality ... right ?

This is so wrong that I have no idea why anyone would make such an assertion. Win32k.sys alone is a bottomless pit of EoP vulnerabilities. They are a dime a dozen and not particularly difficult to exploit. Also, in the context of kernel security, tax software, flash, and java browser plugins have no relevance.

> They are a dime a dozen and not particularly difficult to exploit. Also, in the context of kernel security, tax software, flash, and java browser plugins have no relevance.

Indeed, but I wanted to illustrate that while kernel security is important, there exist much more dangerous "open barn doors" (I don't know whether this English translation of the German phrase "offene Scheunentore" is proper English).

Oh, that is fair. Kernel security is indeed largely inconsequential in the real world. My initial read of that has it sounding like saying Windows (kernel) doesn't have exploitable vulnerabilities, third party software does.

I still disagree, but less strongly :) Flash has always been a weak point, and Java was (but has not really been hit for a few years). But not only have there been exploits hitting MSIE/Edge/Office, they deserve much of the fault for the poor security architecture that facilitates exploitation of plugins in my opinion. Like untrusted fonts in the kernel, they seem to agree in so far as Edge no longer supports ActiveX at all.

The number of exploits overall has gone way down, but there are still a ton of security patches rated as Critical RCE coming out monthly in all the usual Windows targets. And now that Tavis shone some light on their AV engine, it has been revealed that is a gaping hole both in design as well as in implementation.

Regardless, there are far more practical realities that make Windows a security liability. If you survey 100 random penetration testers, you might find one that uses RCE exploits regularly (before shadowbrokers gave everyone new toys anyway). The playbook for everybody else largely consists of spear phishing to get a "beachhead" and then moving laterally with Pass-the-hash and similar things that are technically possible to defend if you read the documentation and set the right group policies, but that nobody in the real world does.

It makes sense to me, but we say "left the barn door open." I think the most natural translation is "gaping holes."

Don't forget fonts!

> Don't forget fonts!

It is today considered a design mistake that GDI was moved to kernel mode in Windows NT 4.0 for performance reasons (https://en.wikipedia.org/w/index.php?title=Architecture_of_W...). But with Windows 10 Anniversary Update font parsing is done in user mode within an AC ["AppContainer"] (source: https://blogs.technet.microsoft.com/mmpc/2017/01/13/hardenin...).

Was it really a mistake in a practical sense? Does Microsoft move their consumer base to the NT kernel via XP a few years later (a massive win for their platform overall) if GDI performance had to take a big hit?

Mark Russinovich, before Microsoft hired him, once demonstrated that moving the GDI into the kernel wasn't necessary for performance.

Windows NT was a nice, clean system from Dave Cutler, but wouldn't run a lot of code that ran under Windows 95. Especially 16-bit programs, which ran in a compatibility box under NT which was not tolerant of 16-bit programs doing things they were not supposed to be doing. XP put a lot of marginal Windows 95 code in the NT kernel and supported bad 16-bit programs. It took a decade for Microsoft to dig out from that mess.

> Was it really a mistake in a practical sense?

This is what many people on the internet say. This does not mean that there might not be good arguments for the opposite standpoint, too.

At least I can tell that Microsoft is working to move parts of GDI step by step from the kernel back to user mode again, which should provide evidence that they consider this decision as a historical mistake, too, because it opens too many potential gateways for security flaws.

> which should provide evidence that they consider this decision as a historical mistake, too, because it opens too many potential gateways for security flaws.

Or perhaps it was the correct decision at the time, but now (decades on, with computing power orders of magnitude cheaper and security vulnerabilities orders of magnitude more expensive) a different decision is appropriate?

I don't know about that... plenty of other operating systems didn't render fonts in the kernel at the time, and they seemed fast enough. Let's just say it was a mistake but that we can be happy Microsoft is now fixing it.

> plenty of other operating systems didn't render fonts in the kernel at the time, and they seemed fast enough.

Which ones? I'd be surprised if either of classic MacOS or BeOS didn't have the display layer in the kernel; Solaris had it in userspace but was pretty slow; BSD was still tangled in lawsuits and Linux barely existed.

> BSD was still tangled in lawsuits and Linux barely existed.

Yea, but they seemed fast enough with rendering fonts in the userspace (xfstt). Which is what I thought we were talking about.

> Yea, but they seemed fast enough with rendering fonts in the userspace (xfstt).

I don't remember there being enough GUI applications around on Linux/BSD to be able to talk about whether font rendering was fast or slow. Anything that used motif was slow, netscape was very slow. xterm was fast but fixed-font.

Classic MacOS can't really be said to have a kernel, as it had no privilege separation and no security whatsoever; user applications ran with full access to the hardware, and used cooperative multitasking instead of being preemptable. The QuickDraw routines were in ROM but the only difference that made is you had to modify a jump table instead of being able to overwrite them directly.

(All of the above supports your point; it's just that the Mac went further than you implied.)

The most amusing part of this was in NT 3.x they made a big deal about everything not being in the kernel in all their marketing vs Netware where a bad NLM could crash a Netware server.

Happy to hear they've got it sandboxed further than just "user mode". Whenever I hear fussing about code running in userland vs kernel I think of this comic: https://m.xkcd.com/1200/

Non-mobile link:


Note that this is only the case for untrusted fonts. Trusted fonts stayed in the kernel, which suggests that Microsoft still does believe the performance reasons are valid.

Oh, I didn't know that! I haven't used Windows in any serious capacity since 7, so that's refreshing to hear.

Wasn't NotPetya spread via SMB?

That was AFAIK WannaCry

Also Petya/NotPetya. And it used NSA's EternalBlue, too.


Both are correct. The NSA SMB exploit is typically ineffective for initial entry into a network because SMB is almost always blocked at the network boundary but almost never blocked internally. So both Petya and WannaCry had different means for the initial infection, then used SMB attacks to wreak havoc once inside. WannaCry was initially delivered using plain old email attachments, and Petya was delivered via a software update through a hacked update server.

If the kernel image is different each time, you can't verify its integrity through a checksum.

I'm curious to hear from people working in infosec: is that real problem? How do you see the tradeoff?

I haven't noticed any boot time checksum verification on Linux or OpenBSD in the past 15 years. If it's there, I've missed it. But it doesn't seem like this change introduces a new problem. Secure boot springs to mind, but that's something you probably have to disable anyway to run OpenBSD...

Think about this angle: if you're concerned about infosec, and there is a malicious actor with the capability to replace your kernel (which you don't do unless you're root), you do have a real problem. Even if the kernel were verified at boot time, that same actor should have countless other attack vectors.

But it's something worth considering if a trusted chain from machine firmware all the way to the application level is established. It's not there yet.

What is the solution when you need to upgrade some kernel or program and the new version has a different checksum? I don't see why that same solution, whatever it is, wouldn't work in the relinking case.

> I haven't noticed any boot time checksum verification on Linux or OpenBSD in the past 15 years.

Some Linux distributions support kernel and kernel module signature verification in combination with secure boot. As far as I understand, RHEL does this automatically when secure boot is enabled:



AFAIK ANY distribution that supports secure boot does this, it's entirely pointless to sign the bootloader and kernel but then allow arbitrary kmod's to load in and harm your trusted system.

I disabled secure boot on ubuntu because I couldn't get the virtualbox kernel module... So pretty sure the modules have to be signed.

Anyways, with Lenovo who is to say they didn't leak their private key out of sheer incompetence :)

I faced the same issue. Though disabling secure boot just to get the vbox driver running isn't the best idea. There are quite many detailed tutorials that list how you can sign the vbox modules and it won't take that much time, I promise. If the kernel devs are providing us with a security mechanism, might as well use it. :)

You could just feed the individual kernel symbols in a canonical order (i.e. sorted) to the checksum algorithm. It's a bit more involved, but since boot loaders like Grub already have ELF parsers, it shouldn't be too hard.
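A minimal sketch of that canonical-order checksum, assuming the loader can already hand over (symbol name, code bytes) pairs. The symbol names and byte strings below are invented, and the sketch ignores relocations, which would change the bytes themselves from one link to the next and would need extra handling:

```python
import hashlib

def canonical_digest(symbols):
    """Hash (name, code_bytes) pairs in sorted order, independent of link order."""
    h = hashlib.sha256()
    for name, code in sorted(symbols):
        # Length-prefix each field so field boundaries are unambiguous.
        for field in (name.encode(), code):
            h.update(len(field).to_bytes(8, "big"))
            h.update(field)
    return h.hexdigest()

# Two "boots" with the same symbols laid out in a different order.
boot_a = [("sys_read", b"\x01\x02"), ("sys_write", b"\x03"), ("sys_open", b"\x04\x05")]
boot_b = [boot_a[2], boot_a[0], boot_a[1]]
tampered = [("sys_read", b"\x01\xff"), boot_a[1], boot_a[2]]

print(canonical_digest(boot_a) == canonical_digest(boot_b))
print(canonical_digest(boot_a) == canonical_digest(tampered))
```

The digest stays the same across reorderings but changes as soon as any symbol's contents change.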

You could generate a checksum each time you generate a new kernel. If you need some pre-kernel checksum validation, that might be a problem though.

In fact OpenBSD already generates a checksum for each kernel.

According to [1],

> "At boot time, a unique kernel is built and installed for the next boot"

Therefore, if the building code is itself trusted it can make a checksum and sign it. So each boot can verify the next boot, in a blockchain-ish way.

[1]: https://news.ycombinator.com/item?id=14711983

How does this help though?

If you are in a position to replace the kernel, can't you also replace the code that does this verification?

That is exactly how games are cracked, as I understand.

No, because the code doing this verification is also checked by the loader, which is checked by the secure boot module. The secure boot module provides the trust root.

What is the attack vector on KASLR that KARL prevents?

My best guess: A leaked kernel pointer could be used to find an offset for the KASLR kernel, and that offset could produce a working payload for some other unrelated kernel shell code exploit.

If that's correct, KARL seems like a pretty fringe improvement over KASLR. Can anyone educate me?

That's exactly it, more or less. KASLR is almost always useless in practice because there are so many ways to leak kernel pointers that most seasoned researchers are convinced they will never all be plugged. Even if they were, it's often the case that the same bug you are exploiting can also be used to leak enough information.

Here is a long anti-ASLR rant by the folks who invented the ASLR mitigation in the first place, explaining why attempts to repurpose the idea for kernel attacks are misguided:


I agree that it is probably not a meaningful improvement.

I don't think it's meant to be a massive improvement - just another one of the many barriers you want to put in front of an attacker. Assume they will get through, but make it as difficult as possible in the hopes that reduces the people who attempt it.

Or make it as difficult as possible so it will eat up the attacker's time.

KASLR just requires ONE leaked pointer to calculate the base offset of the kernel, and from there it's the standard ROP chain technique. The kernel is still one big identical blob, just mapped at a different starting address.

If I understand KARL correctly, they reorder the internal code (and data?) in the kernel. Therefore a single pointer leak does not expose all the ROP gadgets anymore. Either more information leaks are necessary, or the attacker has to make do with fewer gadgets. Therefore imho this is a much better protection than KASLR.
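To make the difference concrete, here is the arithmetic with made-up numbers. Every address, offset, and name below is invented for illustration. The attacker knows the offsets from the shipped kernel binary and gets one leaked function pointer:

```python
# Offsets the attacker reads out of the distributed kernel image (made up).
BUILD_OFFSETS = {"leaked_fn": 0x1000, "gadget": 0x5230}

def guess_gadget(leaked_addr, offsets=BUILD_OFFSETS):
    """Derive the kernel base from one leaked pointer, then locate the gadget."""
    base = leaked_addr - offsets["leaked_fn"]
    return base + offsets["gadget"]

# KASLR: the whole image slides by one random amount, internal offsets unchanged.
base = 0xFFFFFFFF81000000 + 0x2A0000
kaslr_layout = {name: base + off for name, off in BUILD_OFFSETS.items()}
kaslr_guess = guess_gadget(kaslr_layout["leaked_fn"])

# KARL-style relink: the objects were reordered, so the internal offsets differ too.
karl_offsets = {"leaked_fn": 0x7400, "gadget": 0x0C10}
karl_layout = {name: base + off for name, off in karl_offsets.items()}
karl_guess = guess_gadget(karl_layout["leaked_fn"])

print(kaslr_guess == kaslr_layout["gadget"])  # the single leak is enough
print(karl_guess == karl_layout["gadget"])    # the same trick misses
```

With only a slide, one leak pins down every address. Once the link order changes, the leak says where one function is and little about where anything else landed.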

There's this thing called return oriented programming that this is a hard counter to. Return oriented programming replaced shell code a while back when stacks became non executable across the board.

And why would you ever reboot openbsd computer? :)

To verify that it still boots correctly after an update.

If an update does something that breaks your startup sequence it generally is much better to find out about it right away than it is to find out about years later in the middle of the night when you get a forced reboot due to hardware or power issues, and find that things are broken and you have no idea which of a dozen allegedly minor updates broke it.

Then the package manager is broken on your OS of choice, the package manager (or another related utility) should ensure whatever it updates or installs actually works.

Please tell me about this method of writing bug-free software and building bug-free hardware that you have; I am very interested.

Can you also detect whether a program will loop forever?

I have an oracle machine that I'd like to sell you! ;)

Hey i wonder if such an app would be a great startup idea? The 'predict if program will terminate' one.

For a long time now many people are already working on this topic.

These apps are called static program analyzers and some can prove totality of code.

> Can you also detect whether a program will loop forever?

Assuming the current state of knowledge about our universe, no program will loop forever.

But maybe our universe is a kind of infinite loop by itself. Who knows.

I know you're just being an ass but I'll entertain you. You write post-install scripts to ensure basic functionality.

Suppose I use program foo as part of some enterprise application on a server. I am inadvertently relying on undocumented or undefined behavior of foo.

An update to foo changes what happens in that undocumented or undefined case, and with the new behavior my application does not start correctly.

There is nothing the package manager can do to ensure that this does not happen because there is nothing wrong with anything the package manager is managing. The bug is entirely in my code. All the update did was expose it.

The question then is when that now-exposed bug will actually get hit, so that I become aware of it and fix it.

The purpose of the reboot is to make sure that exposure happens at a time when it will not cause much harm and I will not have a lot of trouble finding it.

You assess basic functionality, e.g. when installing an httpd, before the install completes the package manager spins it up on localhost and ensures it can serve HTTP.

To watch KARL do its thing!

to quit Vim

I laughed and then remembered I did this once as a teen.

Then I laughed some more.

:q is not that hard ;) I upvoted you though because I liked the joke.

To upgrade to a new snapshot :)

Because you use it on a laptop with hard-drive encryption and tend to avoid suspend to RAM.

Liking the advocacy though

I tend to reboot after each patch, and of course, every six months for the new release.

I must admit that syspatch has saved me a ton of time.

Just curious - how would they deal with kernel crash dumps?

The re-link is recorded, so if needed you can reproduce the same kernel. But as dimman has said, you will have symbols.

It was the first thing that popped into my mind too, but then I started thinking about it, and off the top of my head the only issue I could think of would be if symbols are stripped (you can't "load them from another kernel"). But as long as they're in there I think there shouldn't be any problems? But please elaborate on what you mean by the "deal with" part.

Wouldn't the symbols being there defeat the purpose of the whole randomization thing?

The purpose is to defeat a generic exploit.

For example, with ASLR, it's easy to retrieve the current layout of the process and therefore to debug it (if you have the appropriate symbols) despite the randomization. But an exploit has to be built for the specific instance of the application running. If you share the same privileges as the process, you can get the current mapping but an exploit is useless. If you don't, you don't have access to the mapping either and it's difficult to get it while you are executing code inside the process as you can't easily access the functions you would need for that.

Binutils has had debuglink support for, like, a decade now. OpenBSD uses it to put the symbols in a separate file. gdb knows how to find it using information in the executable.

The relative offsets within individual compilation units are the same, so debugging is unaffected by this.

Not entirely true -

e.g. if your stack is hosed, etc, things could be affected

Also, much of kernel debugging involves poking at core images, which wouldn't be coherent with the kernel binary.

That said, I'm fairly certain I recall reading that the logic saves your previous kernel for such a case, and otherwise someone could make scripts to save more copies if needed.

Others have pointed out reconstructing kernels for debug purposes.

What's the performance impact of this? Are OpenBSD reboots suddenly going to take 5-10 minutes because the kernel has to be re-linked?

The relinking is done in a background job fired off at the very end of the rc script.

Per Theo de Raadt:

"At boot time, a unique kernel is built and installed for the next boot"

I am grateful that OpenBSD exists, but I find it sad that the most novel advances employed in systems security are layers of security/obscurity over vulnerable systems, instead of advances in intrinsically secure systems. Type-safety would eliminate a large class of vulnerabilities in systems software. We enjoy such benefits at the application layer, but systems software is far behind.

I have not followed these features - are they proposed in any way for adoption in FreeBSD ?

Nice technique but it cannot be used in a secure boot chain. Maybe that's not an issue in most contexts OpenBSD is used in, but if I were to choose I wouldn't turn off secure boot to activate this feature.

why can't it be used in a secure boot? The image and checksum are created at the end of last boot, so it could be signed then. If the bootloader is signed it should be able check the kernel.

the more i hear about OpenBSD the more i like it, i really should do more with it as my current experience begins and ends at using pfsense (well and a PS4 which i believe runs a BSD based OS if you count that)

Doesn't this prevent using UEFI secure boot mechanisms?


is there any book/document about building embedded systems with openbsd/netbsd/freebsd ?

Embedded, not so sure.

Depending on your definition of 'embedded', you might look at the various wireless router repackagings and scripts that are available. Also, the NetBSD 'rump' kernel might be an interesting thing to review.

Generally speaking, the projects overall have excellent documentation, including manual pages or HTML docs on kernel, c library interfaces, building the system and packages from source, etc. Also, the whole system is in the source tree, which can be downloaded as a set.


OpenBSD in general: There's 'Absolute OpenBSD', kind of more 'user/admin' level.

For lower-level stuff like APIs, kernel organization, etc, the McKusick books (The Design and Implementation of the {4.3BSD, 4.4BSD, FreeBSD} Operating System), though either dated or more specific to FreeBSD, respectively, still cover quite a bit of stuff that applies to OpenBSD (most changes have been incremental, and can be tracked through the source tree history back to the original USG sources if needed)

I'd suggest running an install 'from source' on a spare machine/vm/etc for a while and reading the docs and the source tree. This will get you familiar enough with the system to have an idea where to go next.

I guess this is more of a quick start but along the lines:


Distinct, not unique, I believe.

Random linking of at least 160 files will give more combinations than there are atoms on Earth. I counted 1206 *.o files in OpenBSD kernel. Unique enough?

Distinct enough, sure.

Unique means something else.

Oh, I'm sorry, unique in the current Universe from its birth many many trillions of trillions of years until after the death of all currently known galaxies. Still not unique?

No other OpenBSD installation had or will ever have the same kernel, so each kernel is unique.

You seem to think that I'm arguing that collisions might be a problem. I'm not. I'm raising a point of English usage. "Unique" is not the right word to use in this context; "distinct" is correct.

If a set has a single member, that member is unique; e.g., 2 is the unique even prime number.

If a thing is different from all other things, it's distinct.

> You seem to think that I'm arguing that collisions might be a problem.

To be fair, your short original post left readers free to guess at which unnecessarily pedantic point you were trying to make.

Huh? From the set of all possible linked OpenBSD kernels (2^number_of_dot_o_files), the kernel you re-link is unique. And distinct, too.

I think that you're not catching my point, and recommend that you look up the word "unique" in a good dictionary, preferably of mathematics.

(It's tautological and uninformative to say that any x is the unique member of the set of things equal to x. And it's simply incorrect to say that any member of a set with multiple elements is unique wrt that set.)

It's n!, which is much bigger than 2^n.
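For anyone who wants to check the numbers: a link order is a permutation, so n object files give n! possible orders, while 2^n counts subsets. The 10^50 figure for atoms on Earth is a rough, commonly quoted order of magnitude, used here as an assumption:

```python
import math

ATOMS_ON_EARTH = 10**50  # assumption: rough order-of-magnitude estimate

def orderings(n):
    """Number of distinct link orders for n object files."""
    return math.factorial(n)

def subsets(n):
    """Number of subsets of n files (the 2^n figure from upthread)."""
    return 2**n

print(orderings(160) > ATOMS_ON_EARTH)
print(orderings(1206) > subsets(1206))
print(len(str(orderings(1206))), "decimal digits in 1206!")
```

Python integers are arbitrary precision, so the factorials are computed exactly rather than approximated.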

Thanks for correction.

That's a very shiny pedant.
