Hacker News new | past | comments | ask | show | jobs | submit login
Automated Verification of a Type-Safe Operating System [pdf] (microsoft.com)
122 points by muraiki on July 10, 2017 | hide | past | favorite | 31 comments

Keyword here is "automated". Quoting, "In the end, the Verve design, implementation, and verification described in this paper took just 9 person-months, spread between two people". This is considerably less than, say, seL4, which took 20 person-years(!). Well, seL4 is larger than Verve, but still.

It's disappointing that little verified OSs like these don't show up little boxes that need them, such as home DSL routers, printers, webcams, and various IoT devices. Those all have one job, and run a fixed program or programs. You want them to do that one job and nothing else. Not act as attack vectors against other systems.

The interesting this out of all this is that Microsoft is the one often doing research like this, from F*, Project Everest, to complete kernels, but I've yet to see Windows have any of the new techniques/tools arising from MSR.

I wonder if it is a problem of the research being fairly new, or a lack of software developers inside Microsoft willing to replace their existing code and learn new tools and languages.

Until recently, MSR was also very disconnected from product development. There was a distinct change last year.

Also, in general most papers are more useful for writing further papers to improve on their limitations (until something's great), than to be adopted right away. This one is pretty cool, but both the conclusion and the benchmarks set out significant challenges for further research. After all, this is the first OS using automated verification, not the last.

> a lack of software developers inside Microsoft willing to replace their existing code and learn new tools and languages.

I'm sure research on the topic hasn't ended, so should they switch now or wait for more research? Or wait for research on verifying existing code? And if a new paper comes out, should they switch again? Also, do universities train enough developers to use such technology?

Because you are not looking into the right spots:

- Windows 7 kernel reorganization

- MDIL .NET AOT compiler for Windows 8 and .NET Native for UWP

- async/await for .NET

- generics for .NET

- UWP based on COM (original idea was called ExtVOS before .NET existed)

- Task Programming Library for .NET

- Pico-processes on Windows 10

- Secure kernel on Windows 10

- Hyper-V based containers

- Driver verification framework using a theorem proving framework

- The P language used for writing Windows 8 USB drivers

I guess they just don't adopt more due to management, and how the windev and devtools relate to each other in regards to technology adoption.

Half of what you mentioned doesn't come close to some of the research you see MSR publishing. The ones you mentioned are nice, but check out their papers. Some of their papers are so outlandish (for the lack of a better word), that I doubt they can even find a proper conference/publication to submit them to.

The most surprised I've been was when they started using P lang for drivers (which is also used for verification, so your last two points fall under a single point if I'm not wrong). This was the first time I've seen them using a new language/tool rather than make additions like the rest of the points you made. (Also FPGAs in their data centers but that's a different topic)

I'm not deriding their decision to do so, just an observation. Google's X does research which can possibly be an actual product, and I guess I expected Microsoft to do the same. Using such cutting-edge research and include it in their products might earn them cool points from HN, but I'd say the management is perfectly content with the numbers in front of them. OTOH, I doubt any manager is willing to take the blame if their initiative fails in a released product. Microsoft has enough bugs that people complain about as it is (refer to the front page submission about performance)

So I feel that I need to provide the background of the items I mentioned

Research on ExtVOS:

- .NET

- generics for .NET

- UWP based on COM (.NET was to be originally based on COM, not CLR which came to be due to Java's influence)

By products of Singularity and Midori projects:

- MDIL .NET AOT compiler for Windows 8 and .NET Native for UWP

- async/await for .NET

- Task Programming Library for .NET

By products of DrawBridge and IronClad projects:

- Pico-processes on Windows 10

- Secure kernel on Windows 10

- Hyper-V based containers

By products of Dafney, Z3 and theorem proving research:

- Driver verification framework using a theorem proving framework

- The P language used for writing Windows 8 USB drivers

Looks like I was wrong on multiple counts. Thanks for taking the time.

Any links on the driver verification framework? I've only seen people talk about using z3 for hardware verification but didn't know about MS doing it and talking about it publicly (could only find this: https://pdfs.semanticscholar.org/b100/b1562ba0e75191d29d4c15...)

Yes, here they are:

"SMT in Verification, Modeling, and Testing at Microsoft"


"Efficient evaluation of pointer predicates with Z3 SMT Solver in SLAM2"


"Tools for Verifying Drivers"


I think seL4 saw some adoption in routers?

I'd like a source on that since I try to collect examples of high-assurance security developed or adopted. Especially adopted as it's so rare.

MirageOS is another project to look into. I don't think it's high-assurance by itself, but it attracts the kind of people who are into that sort of thing. (HalVM, too.)

Like this awesome guy from CompSci:


Anil is quite the character indeed!

seL4 so far makes a lot of sense for what we're working on. The hardest part is finding seL4 expertise.

What you need is embedded microkernel expertise with some training in INFOSEC. Look up Nizza Secure-Systems Architecture paper for how to use such architectures well.


Whatever happened to that? Several papers in 2010, then nothing. Wikipedia says there was a 2013 release.

Microsoft had a later project, Ironclad (2014) which built on that to build some simple applications.[1] That's an active project at Microsoft.[2]

[1] https://www.usenix.org/system/files/conference/osdi14/osdi14... [2] https://www.microsoft.com/en-us/research/project/ironclad/#

"Whatever happened to that?"

Isn't the simple answer "Microsoft had a later project, Ironclad (2014) which built on that to build some simple applications."?

Chris Hawblitzel worked on both projects.

However, I wonder about your claim "That's an active project at Microsoft". Last publication on that page is from 2015, and last fit commit is from a year ago.

Quite possibly the same that has happened to other OSes at MSR like Singularity and Midori: the team moved on to a new research project which involved writing another new OS.

I keep hoping one day such projects will have right level of management support and actually deliver something big into the mainstream.

We got .NET Native, async/await, W10 pico-processes, secure kernel,windows internal refactoring... out of those projects, but they still feel they could have been much more with the right amount of management support.

As "Someone" said, they moved onto another project:


Reminded me of this bit from Guido van Rossum's PyCon 2012 keynote:


"I know my operating system kernel will never be proven correct."

There's two assertions:

1. They're not proving anything relevant to current software development.

That's a joke to anyone following formal verification. Key protocols, microkernels, hypervisors, safe concurrency, CompCert compiler, part of NaCl, Microsoft's static verification for device drivers... the list gets bigger every year. Quite a few are already practical.

2. I know my operating system will never be proven correct.

That's probably true if he uses one not designed for verification. The high-assurance field gave up on UNIX after prototypes like UCLA Secure UNIX and even Mach work showed it was inherently too complex and leak-prone to secure in a high-assurance way. They moved to security kernels, separation kernels, language runtimes, trustworthy hardware, and so on. Each time, tried to make the trusted part as correct and secure as possible. If they couldn't, a recovery option was added.

So, we have numerous secure and high-uptime systems deployed whose methods were published even if source wasn't. He or others can always build on these if they want to replicate the results. His OS will at least be secure against most attacks, quite reliable in practice, and most errors recoverable with the rest not leading to data loss if good backup/restore plan. Good enough for me. :)

> So, we have numerous secure and high-uptime systems deployed whose methods were published even if source wasn't. He or others can always build on these if they want to replicate the results.

I know that's technically true, but not helpful, and that's why this line of thinking is becoming unfashionable.

As a researcher in Programming Languages, who publishes and has reviewed code: By that argument, there's no need to publish CompCert or Coq, which seems ridiculous. Publishing code can help adoption even if the code gets rewritten.

Publications without code often fail replication for many reasons (not necessarily involving foul play or even errors). I can't document all tricks needed to get the performance results in 20 pages. And replication in another scenario might show the results don't generalize easily (and nobody knew). So having sources provides the extra confidence you need before bothering.

And compared to other publications without code, the needed expertise seems much less common.

OTOH, papers do get used by the research community, possibly steering it in some direction (including then wrong one). And a few papers matter enough that people will attempt replication anyway (and learn if they work).

When the results of a paper are "here's what we tried and got, new ideas are needed" (honestly, many papers on hard problems) then maybe sources don't matter.

It looks like you're making a good argument for a different conversation or tangent to this one. The comment you replied to was aimed at Guido van Rossum's claim up-thread that formal verification was useless and couldn't help his OS. I countered that claim. You're now saying techniques in published papers aren't beneficial without source. I'll address that since you took time to write it.

Actually, let's look at not publishing CompCert's source as an example. Under your theory, I can't learn anything at all from just the paper or binary. Here's what I think I can learn from it:

1. They claim they verified most of C language in a compiler down to the assembly in Coq. John Regehr's testing confirms it has the correctness in terms of defects detected that one would expect in a verified compiler. So, now I know (a) they did do a verified compiler for C, (b) the methods they describe might help other formal developers in their verifying compiler work (give ideas at minimum), (c) a successful verification of a C compiler clears the way for more projects/funding on C or other compilers, and (d) anyone needing a verified C compiler has a supplier, AbsInt, whose work was rigorously evaluated.

2. Building on former, the fact that the compiler will produce correct output more than most compilers means it can be used to check correctness of highly-optimizing compilers. Automated testing can be directed at an optimizing compiler and CompCert where the output of programs run through each is checked for equivalence. If not equal, probably a bug in the unverified one but maybe CompCert itself. Likewise, I considered re-running a pile of test cases for a C program that were showing bugs through CompCert version of that same program. Any bugs disappearing might mean those were compiler bugs. Help narrow things down.

3. A key method for both CompCert and CakeML was using a series of intermediate languages working from highest to lowest level w/ equivalence checks on each pass. People doing formal or informal methods for developing compilers can then apply that technique, maybe even those DSL's, to see if it improves their own work. They might start creating their own DSL's. If Racket and Red are any hint, just giving the ability to do DSL's with worked examples can lead to productivity or correctness improvements in many types of software.

Let's look at another example from high-assurance security. One of the demonstrators was closed-source GEMSOS. Its design and assurance methods are described here:


I used such papers to develop solutions that were highly resistant to penetration because the techniques supporting them naturally lead to that. Especially little code in TCB, using FSM's to brute-force the analysis, input validation, avoiding pointers much as possible, safer language, and so on. Difference between popular/mainstream INFOSEC and what I learned from INFOSEC's founders in high-assurance was mind-blowing. Common example is covert-channel analaysis of OS's is almost non-existent but Kemmerer's paper tells you how (Shared Resource Matrix). Likewise, highly-secure repo's are uncommon but Wheeler's survey (SCM Security) tells you how to fix it. Heck, I even learned to make sure my stuff could be formally verified later by just reading what was easy, moderately difficult, or hard in formal methods papers. I called Design for Verification which I later found was similar to how hardware engineers do things (i.e. DFT, DFM). I learned that from reading on Cleanroom methodology where they verified simplified code in box structures by eye before building for test runs.

So, I reject based on worked examples the claim that it's "not helpful" to have papers w/out source code that describe worked examples in enough detail to replicate for critical review or use for new project. It's been very helpful to all kinds of people. I also agree with you that there's many reasons to prefer source and/or data used in research over black boxes. It's good they published the source of CompCert. Even better that CakeML is fully open versus CompCert's restrictions. As I hinted at in 1, other teams started using CompCert and CakeML as backend infrastructure for their own work on other languages. Them being commercial w/ free use for researchers would probably have reduced adoption but I bet there still would be some if their DSL's were published. People could still do useful things, push state-of-the-art, and publish the work with much fan fare.

So, still benefit even if not maximal benefit to have detailed papers with no source.

Of course he believes that, he's the creator of a language that doesn't even have a type system...

It does, he just didn't want to attach it very strongly to his naming system

well it kind of does but idiomatic python is not very functional

What does that have to do with functional programming?

(I know Python3 has some optional gradual typing. And I know that Python ain't very functional---especially if you go by how Guido wants Python to be written. But I still don't see the connection---apart from that functional programming and academic theory on typing go hand-in-hand.)

He is very probably right.

Applications are open for YC Winter 2024

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact