Older Cisco routers ran IOS directly on proprietary hardware. At some point Cisco decided to switch to Intel hardware but didn't port their kernel: they used a Linux kernel and ran IOS as one huge 50MB+ binary. The presenter got shell access and found only one ethernet device when running ifconfig; the actual switching hardware was being handled in userspace by the large binary.
I'm guessing they probably just wrote some shim layers to connect their PCI drivers up to Linux's userspace PCI API.
I think even if the driver were implemented in kernelspace, it would still probably not expose any of its physical interfaces to userspace as plain ethernet devices, apart maybe from virtual/mgmt ones to run SSH on, and perhaps one so that the kernel can handle packets the router doesn't have flows programmed for (as in OpenFlow).
Not absurd at all. Cumulus (which I cofounded) does exactly that. There are >1000 customers, including several of the largest cloud operators in the world.
It works really well in practice, since you can just fall back to the kernel for non-fast-path stuff like ARP. IOS/NX-OS implement ARP (and everything else) themselves; we can just use the kernel's implementation.
The idea is essentially to use the lightning fast forwarding ASIC as a hardware accelerator for the networking functionality the kernel already has.
That's basically how switch development works in a nutshell, look at Broadcom's OpenNSL.
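The fast-path/slow-path split described above can be sketched as a toy model. Everything here is invented for illustration (`FakeAsic`, `SlowPath`, the port names): it is not the Cumulus or OpenNSL API, just the shape of the idea that a flow-table miss punts to software, which resolves the next hop and programs the hardware so later packets never leave the ASIC.

```python
# Toy model of "ASIC as accelerator, kernel as slow path".
# All names and values here are made up for illustration.

class FakeAsic:
    """Stands in for the forwarding ASIC's flow table."""
    def __init__(self):
        self.flows = {}      # dst_ip -> egress port
        self.punted = []     # packets the hardware couldn't forward

    def forward(self, dst_ip):
        if dst_ip in self.flows:
            return self.flows[dst_ip]   # fast path: pure hardware
        self.punted.append(dst_ip)      # miss: punt to software
        return None

class SlowPath:
    """Stands in for the kernel's neighbor/route handling."""
    def __init__(self, asic, neighbor_table):
        self.asic = asic
        self.neighbors = neighbor_table  # what ARP would discover

    def handle_miss(self, dst_ip):
        port = self.neighbors[dst_ip]    # resolve (ARP, routing, ...)
        self.asic.flows[dst_ip] = port   # program hardware for next time
        return port

asic = FakeAsic()
slow = SlowPath(asic, {"10.0.0.2": "swp1"})

first = asic.forward("10.0.0.2") or slow.handle_miss("10.0.0.2")
second = asic.forward("10.0.0.2")   # now hits the hardware flow table
```

The key property is that only the first packet of a flow pays the software cost; everything after it stays in hardware.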
Once GPUs arrived, the ability to do latency-critical management of the device state became important and the register management moved into the kernel. But for traditional framebuffers the device setup was for the most part done once, and there's no particular need for that to be managed outside userspace.
One important one is that accessing the PCI config space via IO ports 0xCF8/0xCFC is racy with the kernel, since a read or write requires writing the BDF address to 0xCF8 and then reading/writing the data at 0xCFC. If the kernel tries to do this dance while the X server is doing it as well, one of them is going to read or write the wrong address.
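The race is easy to see in a toy model of the shared address latch (pure simulation, no real port I/O; the `ConfigSpace` class and the register values are invented for illustration). CONFIG_ADDRESS at 0xCF8 is a single piece of global state, so whoever wrote it last decides which register the next 0xCFC access hits:

```python
# Toy model of PCI configuration mechanism #1 (0xCF8/0xCFC).
# The addresses and register values below are made up.

class ConfigSpace:
    def __init__(self, registers):
        self.registers = registers   # BDF+offset -> value
        self.address = None          # the 0xCF8 latch (shared!)

    def out_cf8(self, bdf_offset):
        self.address = bdf_offset    # step 1: select a register

    def in_cfc(self):
        return self.registers[self.address]  # step 2: read it

cs = ConfigSpace({0x8000_0000: 0x8086,    # device 0: what the kernel wants
                  0x8000_0800: 0x1234})   # device 1: what X wants

# Kernel starts its two-step read of device 0 ...
cs.out_cf8(0x8000_0000)
# ... but the X server's access to device 1 sneaks in between the steps:
cs.out_cf8(0x8000_0800)
x_value = cs.in_cfc()        # X reads the register it intended
kernel_value = cs.in_cfc()   # kernel reads device 1's register, not device 0's
```

Without a lock shared between the kernel and the userspace driver, there is no way to make the two-step sequence atomic, which is exactly the problem described above.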
Interestingly, this design required the X server to run under binary translation in VMware's monitor, even though it was userspace code, because it had to elevate its IOPL to be able to read/write the IO ports. CSRSS.EXE on Windows also ran under BT, since it too was driving the graphics card before NT4. After NT4 moved the graphics code into the kernel, no one remembered to take out the IOPL elevation code, so at least until XP (and probably later) CSRSS.EXE ran with elevated privileges that it didn't need.
It is not very useful to limit its privileges.
If it can be done outside the kernel, it shouldn't be in the kernel.
I don’t know of anything that used it, but I’m sure there were custom PCI cards for data acquisition, hardware control, etc etc. that used it.
The main benefit is reliability. Driver code is usually lower-quality than the rest of the code that runs in kernels, and the hardware itself can act weird in ways that mess the drivers up. The infamous Blue Screen of Death on Windows was usually a driver error. Isolating drivers in their own address space prevents their errors from taking the system down. One might also use safe coding, static analysis, model checking, etc. when developing the drivers themselves; Microsoft eliminated most of their blue screens with the SLAM toolkit for model-checking drivers. Of the two approaches, isolation with restarts is the easier, given that you can use it on unmodified or lightly-modified drivers in many cases.
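The "isolate in its own address space and restart on failure" idea can be sketched in miniature by supervising a deliberately crashy child process. The "driver" here is a stand-in script that faults once and then recovers, not a model of any real driver:

```python
# Sketch of isolate-and-restart: run a buggy "driver" in its own
# process so a crash is contained, and restart it when it dies.
# The driver script and flag-file mechanism are invented for this demo.

import os
import subprocess
import sys
import tempfile

DRIVER = """\
import os, sys
# Stand-in "driver": crashes until a flag file exists, then succeeds.
flag = sys.argv[1]
if not os.path.exists(flag):
    open(flag, "w").close()
    sys.exit("driver hit bad hardware state")   # simulated crash
print("driver ok")
"""

def supervise(max_restarts=3):
    with tempfile.TemporaryDirectory() as tmp:
        flag = os.path.join(tmp, "crashed_once")
        for attempt in range(max_restarts):
            # Each run is a fresh process: a crash only kills the child.
            result = subprocess.run([sys.executable, "-c", DRIVER, flag],
                                    capture_output=True, text=True)
            if result.returncode == 0:
                return attempt, result.stdout.strip()
        raise RuntimeError("driver kept crashing")

attempts, status = supervise()   # first run crashes, second succeeds
```

The supervisor never shares an address space with the driver, so the simulated fault can't corrupt it, which is the same property a microkernel gets from running drivers as user processes.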
As far as security goes, it really depends on the design of the system and hardware. Basic isolation mechanisms like MMUs might restrict a rogue driver enough if the attack just goes after other memory addresses. If the driver uses DMA, the attacker might control the DMA to indirectly hit other memory, or even go for peripheral firmware. If the DMA is restricted, then maybe not. As I said, it all depends on what the hardware offers you plus how the system uses it.
All these possibilities are why high-assurance security pushed in the 1980's-1990's for formal specifications of every component, hardware and software, mapping every interaction of state or flow of information. That didn't happen for most mainstream stuff. Without precise models, there are probably more attacks to come involving drivers interacting with complex underlying hardware. It's why I recommend simple RISC CPUs with verified drivers for high-security applications. Quite a few folks from the old guard even use 8-16-bit microcontrollers with no DMA specifically to reduce these risks.
As far as verifying drivers goes, here's a sample of approaches I've seen that weren't as heavyweight as something like seL4:
Note: Including that last one specifically for the I/O verification part.
The main advantage is that you don’t have to deal with all the limitations of kernel mode programming.
Expect a release soon, for the first time in years. And it's a major one.
It "only" delayed Minix 3.4 for 2 years.
Edit: Cisco hardware was mentioned in this thread
- lower control-transfer overhead, up to and including going completely polled-mode. Getting from an interrupt into the kernel service thread, through the kernel stack, into epoll, and into a user thread takes some time
- use of device-specific features without having to plumb them through all the various kernel interfaces
- native async I/O, removing the overheads associated with I/O thread pools
- exploitation of workload-specific optimizations that would be defeated by the kernel scheduler, memory management, buffer cache, and other machinery
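The polled-mode point in the first bullet can be illustrated with a toy ring buffer: the consumer spins on shared memory instead of sleeping in epoll, so there is no interrupt, no kernel service thread, and no scheduler wakeup on the receive path. The list-based ring below is a stand-in for a NIC RX descriptor ring, not any real kernel-bypass API:

```python
# Toy single-producer/single-consumer polled RX loop.
# A real bypass stack (e.g. DPDK-style) polls hardware descriptors;
# here a thread stands in for the NIC DMA engine.

import threading

RING_SIZE = 8
ring = [None] * RING_SIZE          # descriptor slots; None = empty

def produce(packets):
    # Stands in for the NIC DMA-ing packets into the ring.
    for i, pkt in enumerate(packets):
        slot = i % RING_SIZE
        while ring[slot] is not None:   # back-pressure: wait for drain
            pass
        ring[slot] = pkt

def poll_rx(expected, out):
    # Busy-poll: no syscalls, no interrupts, no scheduler wakeup.
    for i in range(expected):
        slot = i % RING_SIZE
        while ring[slot] is None:       # spin until a packet lands
            pass
        out.append(ring[slot])
        ring[slot] = None               # hand the slot back

received = []
packets = [f"pkt{i}" for i in range(32)]
rx = threading.Thread(target=poll_rx, args=(len(packets), received))
rx.start()
produce(packets)
rx.join()
```

The trade-off is exactly the one the bullet implies: the spinning core is burned at 100% in exchange for taking the interrupt-to-userspace path entirely out of the latency budget.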
This in turn led to Microsoft balking at supporting said hardware, as Windows is deeply reliant on PCI (even the ARM SoCs powering the Windows RT products support PCI).
In turn, Intel developed Moblin, which later merged efforts with Nokia's Maemo to become MeeGo, which was later still foisted onto the Linux Foundation.
Similarly, I think the Mach kernel powering Apple's OSes is a "fat micro" where various things that should be in userspace, if one followed microkernel orthodoxy, reside in kernel space.
Perhaps the only orthodox microkernel OS out there is QNX, these days languishing in the bowels of Blackberry's holdings.
Edit: it is somewhat ironic that the Alpha's memory protection model is designed in such a way that the natural way to implement any OS would be to write your own microkernel as OS-specific PALcode (something between firmware and microcode, written in an extended Alpha ISA, and the only thing that the CPU hardware sees as privileged code), but none of the Alpha OSes is implemented this way. In OSF/1 you thus get a limited microkernel-ish thing that runs two process-ish things, one of which is the Mach kernel and the other the currently running Mach task, which in turn is either the essentially monolithic Unix kernel or a Unix userspace process.
There is no relationship to Tru64 except that HP did also support OpenSTEP at one point.
L4 running on most GSM radio chips.
Many embedded RTOSes targeted at critical systems are microkernels as well, for example the offerings from Green Hills.
There are also modular kernels, which are neat when implemented right (Linux is basically a modular kernel at this point)
"BSD is Dying"
As I recall, in that presentation the redheaded stepchild is OS X.