How is the Linux kernel tested? (embeddedbits.org)
159 points by sprado on March 28, 2020 | 56 comments

We run our own CI for building Linux kernels with Clang (ClangBuiltLinux.github.io). We take Debian's nightly package of top-of-tree Clang, make a Docker image with the minimum tools we need, and use Buildroot ramdisks that have a custom init script that powers down the machine once it reaches init. Our CI fetches various kernel trees and branches, builds them, then boots them in QEMU. The machine has two minutes to power up and down (it usually takes less than 10s); otherwise we consider the machine hung and fail the run. We use Travis CI for the reporting, but are looking to offload the building.

Also, Linaro's Toolchain Working Group runs a ton of CI on the kernel as well. There's an effort from Red Hat called CKI to aggregate all of these reports.

There was a bug in a released kernel that I ended up having to hunt down last summer. It required a very specific set of circumstances to trigger, but luckily for us the bug had some obvious side effects, so it was possible to get the relevant information via a very simple statically compiled C program (a certain value would appear under /sys).

I used that C program as the initramfs for qemu, with it spinning up an instance based on the test kernel, and could grep for what I wanted in the output to get a relevant exit code.

Combine that with git bisect's automatic bisecting process, and I was able to automatically git bisect through a few thousand commits in the kernel to find the cause. It only took about 50 minutes in the end, which was helped a great deal by ccache (and the sheer build parallelism granted by building on a 52-core system).

That was a fun, somewhat out of the blue, task.

The process I took was based on https://ldpreload.com/blog/git-bisect-run, with various adaptations to suit the needs and the relevant build environment.

so you have a very well tested boot sequence -

is this not a slightly better kernel equivalent of 'it builds, ship it' ?

> a slightly better kernel equivalent of 'it builds, ship it' ?

Surely, "it runs, ship it"? That seems quite a bit better.

The great majority of code in the Linux kernel will never be hit by booting it in QEMU with init=/bin/shutdown. The pure line coverage of these builds is probably around 5 or 10%.

The majority of code will never be hit in QEMU or any testing in general, because 56%+ lines are in drivers. https://unix.stackexchange.com/a/223763/29119

With 16%+ lines in all 24 architectures combined, you're not going to reach more than one of them at a time.

10% is going to be an extremely happy case. If we can test 5-10% that's a great achievement. To get past that you need to start booting real hardware with specific configurations.

Note that our CI does way, way more testing than most kernel developers do. It's not meant to exercise 100% of the kernel, just to be a basic smoke test that tells us if we really messed something up.

From triaging reports daily from kbuild test robot aka "0day" bot, I'd say 1/3 to 1/2 of kernel commits pushed by various developers to their trees have never even been compiled (very obvious mistakes regardless of toolchain).

We get way more coverage in actually shipping these kernels in Android and ChromeOS.

Why is there such little emphasis on “traditional” testing, that is, regular unit tests? At least some portion of the code base is surely suitable for normal unit tests. For example data structures, scheduling algorithms, file systems, ...

I think the main reason is that the Linux kernel (and similarly the *BSD kernels) is written in a programming language (C) that doesn't make it easy to do that. Code is often built directly on top of other kernel subsystems without any dependency injection whatsoever. This means that it's still possible to unit test parts of the kernel, but it takes a crazy amount of effort, such as overriding symbols, overriding include paths, providing stub headers, etc.

I am well aware that it's also possible to have dependency injection in C by using structs with function pointers, but I think we can all agree that it's a lot less pleasant to use than C++ abstract base classes, Go interfaces or Rust traits. This is why the Linux kernel only tends to use this sparingly (e.g., inode operations).

This is probably the reason. I work at a place where most software is written in C, and I see the same thing here: literally all tests are either manual or integration tests. Unfortunately, this also means that it takes about half a day to 'run the tests'.

I work on IOS-XR, a router OS written in C.

I agree: UT is a pain to write precisely because you need to spend quite a bit of effort to stub out your dependencies. And when you do stub things out, they usually end up being “dumb” stubs where the function just returns EOK. Thankfully, there has been a recent effort in XR to leverage the Cmocka test framework to make stub functions a bit smarter.

Even if you have great UT, there is a bigger issue: the UT only tests and validates your code, but does not validate interactions with other components. With a system as complex as IOS-XR, there are non-trivial situations that you simply cannot trigger with UT.

This is where IT shines, imo: you can bring up a full router and test all known interactions at the system level. The test runtime is much longer, of course, but in my experience, it’s worth the wait to avoid hitting the issue down the line.

You don’t need a dependency injection framework to write unit tests, you just need cleanly separable units with well defined interfaces.

Note that I am not saying you need a dependency injection framework (like Google Guice/Dagger for Java); I’m merely talking about dependency injection as a concept.

Abstract base classes, interfaces and traits allow you to add dependency injection with relatively little code. In C it is simply more of a hassle, which is why folks don’t tend to do it.

And, in the absence of a dependency injection framework, it's likely that the units are not cleanly separable - because, without a DI framework, all classes are (presumably?) instantiating their dependencies directly.

Unless I've missed something? I've only ever worked in Java, so maybe things are different in C-world.

> without a DI framework, all classes are (presumably?) instantiating their dependencies directly. ... I've only ever worked in Java so maybe things are different in C-world,

Well, for one thing, there are no classes in C. :) It is possible but unfun to emulate them with function pointers. Iiuc, little of the Linux kernel is written in that style.

Also, FYI, for many years we did DI without frameworks, using the factory pattern and other techniques. It wasn't always fun but it can certainly be done without Spring or whatever the new thing on the block is.

> Well, for one thing, there are no classes in C. :) It is possible but unfun to emulate them with function pointers. Iiuc, little of the Linux kernel is written in that style.

object-structs with function-pointers-for-methods are super-common in the Linux kernel and basically used everywhere for everything where modules can plug something into the kernel (e.g. virtually all drivers have at least one of these).

Thanks for the correction. I was going off the little bit of Linux code I've read, which seems to call most functions directly. And also another comment on this story. I don't know what to think now.

Linux is monolithic, but also modular. While drivers are almost entirely implemented with these kinds of objects, "core" modules have less pluggable functionality, and so you don't see it as much. For example, contrast the page cache code (that's basically mm/) with the VFS code (fs/). You'll notice how almost anything I/O uses these kinds of objects heavily.

This is a major advantage of NetBSD's Rump kernels that are used for automated testing. Some people have tried doing the same for Linux but I'm not sure if any such efforts are still in progress.

  > it takes a crazy amount of effort
I agree with basically everything you've said but I don't buy that it takes a crazy amount of effort to do anything. You have C. If it's hard to do in C, you have a Makefile. If it's hard to do with a Makefile, you can run a script during the build process. Anything can be streamlined.

  > it's also possible to have dependency injection in C by using structs with function 
  > pointers, but I think we can all agree that it's a lot less pleasant to use than C++ 
  > abstract base classes
I hate function pointers, and void* context pointers even more, so I wrote macros to do binary search and sorting so I didn't have to pass a void* to qsort(3) and bsearch(3) (also, bsearch(3) doesn't tell you the insertion point of a missing element).

If you want to sort an array:

  int arr[] = {5, 10, 15, 17, 20};
  size_t size = sizeof(arr) / sizeof(*arr);
  QSORT(arr, size, arr[a] < arr[b]);
If you want to find the value 5 in that array:

  ssize_t index;
  BSEARCH_INDEX(index, size, arr[index] - 5);
  // Now 'index' has the result.

With regard to "anything can be streamlined": sure, but it's also about the amount of investment that would take. You could spend days or weeks automating all of this for C. Meanwhile, for Go there exists a tool called 'mockgen' (https://github.com/golang/mock) that can automatically stamp out mocks for any interface type declared in code. Not just the ones in your own codebase, but literally arbitrary ones: interfaces in the Go standard library, or ones declared in third-party dependencies.

The fact that you hate function pointers and void* context pointers is an exact confirmation of my premise: people think it’s too much of a hassle.

  > With regards that anything can be streamlined: sure, but it’s also about the amount of investment 
  > that would take.
Yes, I can't deny there is more up-front cost in C for some things.

  > The fact that you hate function pointers and void* context pointers is an
  > exact confirmation of my premise: people think it’s too much of a hassle.
My point was that there's usually a better way to get around a language's (in this case, C) limitations, and it's not necessarily macros every time. At least for the problem of abstract base classes, I rather liked your hinting of the linker swapping out the desired implementation for test binaries. That makes sense, since I think I've never seen an abstract base class (which is abstract for testing purposes) have more than one implementation per binary.

As for mocks, the fact that they're hard to do in C may be a feature in disguise...

The Linux kernel project predates what you call traditional unit testing practices†. Its success in its first decade and a half, coupled with the pragmatics of hardware testing, made the flow what it is now.

† Regression testing was certainly known then, but it was not a dogmatic movement yet.

I understand there are hurdles due to legacy, language, low level etc. But if 1 or 2% of a huge code base is easily testable, shouldn’t it be? At least if/when regressions are found in functionality that can be easily testable (pure functions etc) it would seem prudent to add regression tests to prevent the thing from happening again. Even in a 30 year old C code base.

You could test data structures (which tend to be implemented using macros), but any nontrivial subsystem of a monolithic kernel lives in a web of dependencies with other subsystems. This is especially true of Linux.

To solve this, you would need to either use a hierarchical decomposition of subsystems or do some crazy mocking to run subsystems outside of the full kernel.

There is a recent project (KUnit) to add a unit testing framework to the Linux kernel, but it remains to be seen how much adoption it will get.

NetBSD has their rump kernel tech, which specifically exists to run chunks of kernel code independently. It can do a lot of neat things (I liked the "run netbsd drivers on other OSs" trick, personally), but one of the big uses they've mentioned is that it helps testing and development.

A lot of the Linux kernel is implemented as modules, which does allow said modules to be run under different OSes. You could probably use the module interface to implement unit testing.

I think many kernel developers view that as a problem as well. KUnit hopefully will change this when it becomes more widely used. I recommend more kernel developers check it out, it's quite nice even in its current form!

How do I test a driver for which I do not possess the hardware?

You write the software in such a way that instead of just reading and writing registers or memory you exercise some set of functions. In normal operation you pass the driver a set of real functions that read and write real registers. In testing you pass functions that do other things. This makes it quite easy to exercise the features of the hardware that are rarely seen in the wild. For example most IO adapters and NICs have some kind of signal that they are overheating. Most Linux drivers simply ignore or malfunction when these conditions are raised, because the author of the driver never got a chance to manually exercise that feature.

This is basic design for unit testing but it's impossible in Linux because Linux lacks a zero-cost abstraction that would let you mock out a device. C only has costly abstractions such as tables of function pointers.

You can compile object file in isolation and provide mocked implementations for all imported functions.

You test the logic in unit tests with any hardware interaction mocked. Even if you can't test a lot of the driver, there is surely some logic (data structure manipulation, buffer construction, etc) that you have factored out into testable functions.

What if there's no mock for the hardware? Whoever wrote the driver didn't supply one. Is it better to not accept drivers unless there's mocks? Do you know how few drivers Linux would have in that case?

FWIW, I agree 100% with you. It's just simply not the way the world works.

You write your own simple mocks?

I should add that AWS is the only thing I’ve ever mocked that had third party mock tools available, everything else I’ve ever worked on required us to write our own. I’ve never written device drivers, so I’m not arguing that it’s easy or common to do, just that’s what I would do at least as much as possible.

Write hardware emulation for VM with any behaviour you want to test.

Then I end up with a driver that conforms to the emulated hardware and not its real counterpart.

It seems like that's a concern with any testing strategy that mocks out some part of the system. Obviously there's no getting around actually testing against the hardware, but it seems like it could still be useful for the same reason tests with mock implementations are useful generally.

That's how unit tests work.

You port the driver to Rust, obviously. Then run the driver in a Docker container that communicates with a serverless unit testing framework written in Node.js via JSON commands.

> Another major challenge is to automate device drivers and hardware platform tests because they require the hardware to be tested.

Anyone have any thoughts on how to automate testing of coupled hardware-software systems? This is really hard; we've attacked it in the past by writing hardware simulators in accordance with the ICD. This falls flat once you find that the hardware doesn't precisely match the ICD, and usually it's much cheaper to change the software than the hardware. And at that point, the simulator hasn't actually helped you at all.

I recently built a system which involved several tightly coupled hardware components and we fought many bugs on a tight schedule. It would have been nice to find a good way to think about this beyond the basic hardware-in-the-loop manual testing.

So, here's a thought experiment:

Start a company whose mission is to provide batch processing of hardware tests for its clients. Test jobs are submitted online with a packaged version of the software and a test script written in some job control language.

This company has a huge warehouse (to start with) of all sorts of hardware. Every port and connector (ethernet, USB, HDMI, PS/2, serial, etc.) is hooked up via a giant network to a central server.

The central server can then run the batch test jobs. It will deploy the software, and can even simulate interactive execution by routing keyboard/mouse signals to the device and scanning the display output signal.

Obviously, sometimes the hardware will have to be reconfigured, which increases turn-around time, but clients can pay extra to have hardware set aside in their particular configuration.

Eventually they could build up a library of emulators which have been empirically tested to match actual hardware behavior (rather than the spec). Hardware that has been emulated can be tossed to allow room for new hardware. Customers might even be able to run on demand tests in "the cloud" using just the emulators.


Basically, I think it's too expensive to do in an ad-hoc basis. You really need a setup that can benefit from the economies of scale.

I can't tell you precisely how it was done, but I worked at a company with many different types of hardware, containing many complex configurations of FPGAs, lasers, optics, and microcontrollers coupled to a computer, and they managed to simulate it quite well for what seems like a decade. One of the scientists there was one of the few geniuses I've ever met, and they managed to simulate all those devices and a sufficient number of their variations. So I can confirm it's possible; maybe it just requires an overworked genius?

> So I can confirm it's possible, maybe it just requires an overworked genius?

I believe that, but unless you can find a genius and/or mass-produce their work, does it help the rest of us?

Maybe. Knowing something is possible is sufficient for enough people to give things a try.

I know I've been successful in doing so; I've never built a simulator of this magnitude but I've successfully solved difficult problems with novel solutions simply from hearing it was possible to solve them in a given manner.

There has been automated static analysis with smatch for over ten years now:


This has found thousands of bugs in the kernel.

So what happens when Linus Torvalds dies? Is there someone else who will become BDFL?

Greg Kroah-Hartman has filled in before when Linus has had to take time off.

Long way of saying "it's not".

LTP is only a small part of testing Linux.

This has not been mainlined as far as I know.

The commit history looks pretty healthy

"Mainlined", not "maintained".

It is, by everyone who uses it every day.

Oh my! I thought they were doing TDD :)
