
Hacker-Proof Code Confirmed - sprucely
https://www.quantamagazine.org/20160920-formal-verification-creates-hacker-proof-code/
======
acfoltzer
I'm one of the researchers at Galois who's working on the quadcopter platform
for HACMS. Let me first say that this headline makes me cringe just as much as
anyone. I'd also like to give folks a pointer to some of the work we've
developed on the project–it's all open source.

First we have Ivory, the DSL for doing safe systems programming that takes
buffer overflows and other memory-safety bugs off the table:
[http://ivorylang.org/](http://ivorylang.org/) [1]

Second, the flight control software we wrote for the 3DR Iris+:
[http://smaccmpilot.org/](http://smaccmpilot.org/) The adventurous among you
who fly with a Pixhawk can build it and try it out yourselves, and we'd love
to hear your thoughts.

[1]: When we began work on HACMS, Rust was around but nowhere near mature
enough to base a proposal on. These days, it covers a lot of the same bases,
although currently Ivory is more suited for formal analysis with SMT solvers
and model checkers.

~~~
eggy
Glad to hear you say that.

Aside from that the work seems very interesting. I am going to check out
Ivory. I have been reading up and playing with Idris and F*.

I like F* so far, and that you can output F# and OCaml. Idris's goals seem in
line with what you were originally looking for. Have you evaluated it?

~~~
acfoltzer
Indeed, we're fans of Idris here, and even hosted a series of Idris tech
talks: [https://galois.com/blog/2015/01/tech-talk-dependently-typed-functional-programming-idris-1-3/](https://galois.com/blog/2015/01/tech-talk-dependently-typed-functional-programming-idris-1-3/)

The goals of Idris and similar languages are different from the goals we had
with Ivory, though. We sometimes describe Ivory as "medium-assurance" as
opposed to the high assurance one can get from a language with a more
expressive type system.

It makes sense for some systems or components to be formally verified, for
example the seL4 verified microkernel, which is also used on our copter.
However, we simply would not have been able to formally verify (or even get to
typecheck with fancier dependent types) something on the scale of an autopilot
given the resources we had on the project. Instead we rely on getting quite a
bit of bang for the buck from the memory-safety features, and then augment
with assertions.

We ended up with a working autopilot (and board-support package with drivers)
in only ~4 engineer-years, so we think the tradeoff is working well so far :)

~~~
eggy
Sounds great. I'll check out the talks link and Ivory some more. Thanks!

------
lazaroclapp
Quick reminder that verified code is only hacker-proof provided that:

a) The specification correctly describes all correct behavior and correctly
proscribes all incorrect behavior (basically, you trade a huge codebase for a
smaller one in the form of the spec, which is a massive improvement, but as
long as humans write the spec it is not clear it can ever be fully
hacker-proof)

b) The trusted computing base works as modeled in the spec. Even in the limit,
that means that the physical properties of the hardware are as we expect them
to be and that our understanding of the laws of physics is correct. Slightly
below this threshold there are things like Row Hammer attacks
([https://en.wikipedia.org/wiki/Row_hammer](https://en.wikipedia.org/wiki/Row_hammer)).
Before even this: microcode bugs, proof-verifier bugs, etc.

That said, formal methods do provide a huge improvement over the state of the
art of completely unspecified software, and have been shown to work in
practice to greatly reduce the number of vulnerabilities of important large
systems such as compilers, kernels and now remote controlled helicopters. They
might not be appropriate for every piece of software quite yet, but I would
greatly like to see this stuff for things like self-driving cars, airplane
autopilots, smart grids and smart factories and in general all of the rapidly
approaching "heavy" IoT applications. Also, some verified properties of
commodity OS would be very nice to have (say container isolation in a cloud
environment or non-leaking of keys for a smart wallet process).

~~~
andrewchambers
> basically, you trade a huge codebase for a smaller one in the form of the
> spec

Most formally verified things I have seen require huge amounts of 'spec' code
compared to the implementation.

~~~
microcolonel
You're confusing the specification with the equivalence proof. A spec is
likely going to be about as simple as an implementation, with the benefit that
it can be read more consistently.

~~~
adrianN
All the specs for real world things I've seen are long and use fuzzy language.
Making them formal enough to provide a basis for verification will only make
them longer.

~~~
more_original
I wouldn't be so pessimistic about the possibility of producing useful
specifications of real-world systems.

One recent successful large-scale verification is the CompCert C compiler. The
compiler is verified relative to a formal specification of a large subset of
C. This specification is on the order of 2,000 lines of Coq code (see
[http://gallium.inria.fr/~xleroy/publi/Clight.pdf](http://gallium.inria.fr/~xleroy/publi/Clight.pdf)
). So it is an example of a specification that is much smaller than an
implementation and that can be verified manually.

Also, there is good evidence that the specification is correct. From
[https://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf](https://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf):

> The striking thing about our CompCert results is that the middle-end bugs we
> found in all other compilers are absent. As of early 2011, the
> under-development version of CompCert is the only compiler we have tested
> for which Csmith cannot find wrong-code errors. This is not for lack of
> trying: we have devoted about six CPU-years to the task.

------
alexchantavy
> “We’re not claiming we’re going to prove an entire system is correct, 100
> percent reliable in every bit, down to the circuit level,” Wing said.
> “That’s ridiculous to make those claims. We are much more clear about what
> we can and cannot do.”

This is an important statement that should be highlighted more prominently in
the article (whose title I disagree with).

Formal methods are promising, and I'm glad that the researcher acknowledged
that when your project relies on so many other abstractions and systems it
doesn't matter how rigorously developed your solution is if a hacker can find
a way to accomplish their objective through one of your dependencies.

~~~
rocqua
Specifically, hardware level exploits are wholly disregarded by formal
verification. This means bit-banging could still work. In many systems I'd
also be wary of the OS itself.

~~~
nickpsecurity
They're disregarded by _software_ verification. There's more formal
verification deployed in hardware than software. There are also techniques to
verify hardware and software together. The hardware problems you see are often
deliberately left in, either due to corner-cutting or backward compatibility
with bad designs of the past. Both are inspired by the desire to see the
number next to net income continue to rise. ;)

------
eriknstr
I Google'd _HACMS project_ and found an Open Catalog page at DARPA with a
bunch of links to open source software and paper PDFs.

Seems worth checking out for anyone whose interest was piqued by the article.

[http://opencatalog.darpa.mil/HACMS.html](http://opencatalog.darpa.mil/HACMS.html)

------
codemac
Has anyone successfully brought formal methods to an area of code that is
_newly_ being developed?

I have successfully convinced management to allow me to use formal methods on
portions of very old code (~17 years old at the time) that were being
rewritten... but I've never been able to get management on board with the time
required for new development with formal methods. The up-front planning time
is more than what a lot of companies are willing to trade off for
time-to-market.

~~~
Ben-G
Amazon has made use of formal methods during the development of AWS. Relevant
paper can be found here: [http://cacm.acm.org/magazines/2015/4/184701-how-amazon-web-services-uses-formal-methods/abstract](http://cacm.acm.org/magazines/2015/4/184701-how-amazon-web-services-uses-formal-methods/abstract)

~~~
codemac
Found a much better link :)

[http://research.microsoft.com/en-us/um/people/lamport/tla/formal-methods-amazon.pdf](http://research.microsoft.com/en-us/um/people/lamport/tla/formal-methods-amazon.pdf)

------
kelvin0
A while back I was exploring the use of 'soft' formal methods to help evaluate
the correctness of some new modules we had to develop. I had come upon Alloy,
which seems like a gentler introduction to formal methods (unlike Z or Coq).

[http://alloy.mit.edu/alloy/](http://alloy.mit.edu/alloy/)

Anyone have any experience using Alloy, or some other formal method?

~~~
maramono
I've used it several times, but the big problem with it is that once you have
a nice, correct model that has been checked (and no counterexamples found),
there's no way to translate it to code. This means that you're basically back
at square one in terms of implementation.

I really like (and use quite often) the ASM method instead.

More of my thoughts here: [http://ortask.com/how-i-designed-built-and-tested-a-temperature-logger-with-arduino-part-2/](http://ortask.com/how-i-designed-built-and-tested-a-temperature-logger-with-arduino-part-2/)

~~~
nickpsecurity
There have actually been imperative extensions and Prolog compilers for it.
Here you go:

[https://homes.cs.washington.edu/~emina/pubs/alloy.mscs13.pdf](https://homes.cs.washington.edu/~emina/pubs/alloy.mscs13.pdf)

Also, for ASMs, the most interesting one I've seen is a certified compiler
project that takes far fewer lines of code than CompCert:

[https://www.complang.tuwien.ac.at/andi/papers/hipeac14.pdf](https://www.complang.tuwien.ac.at/andi/papers/hipeac14.pdf)

------
lordnacho
Can we get an actual example of a formally verified piece of code? If I have a
function adding a and b, how does it look when it's "formally verified"?

~~~
nickpsecurity
Good news is that most of them abstract the stuff away for common operations.
They'll let you annotate code that's transformed into a set of verification
conditions that the prover automatically discharges. SPARK is an example:

[http://www.skein-hash.info/SPARKSkein-release](http://www.skein-hash.info/SPARKSkein-release)

Let's say that work isn't done. Then you have to hand-model the properties of
what you're doing. Arithmetic, expressed in bits, is harder than it appears to
describe in a way that covers all its properties. Here's one on SmartMIPS:

[https://staff.aist.go.jp/reynald.affeldt/documents/affeldt-jssst2006-en.pdf](https://staff.aist.go.jp/reynald.affeldt/documents/affeldt-jssst2006-en.pdf)

------
TickleSteve
Although good, hasn't this just shifted the burden up a level?

With formal methods, we can get a proven-ish correct implementation; now we
need bug-free specifications... this in itself will be the next challenge.

In the real world, specifications are very incomplete and fuzzy, partly
because the business-level specifications are also fuzzy.

This will not be a panacea; it simply shifts the burden up to a (possibly
more manageable) level.

~~~
nickpsecurity
"With formal methods, we can get proven-ish correct implementation, now we
need bug-free specifications.... this in itself will be the next challenge."

That's _much_ better. See, before you needed to (a) understand a vague
specification, (b) get your concrete version of it right, and (c) implement
that specification. Now, you just trust a precise specification. That it's
precise, especially if it's in Z or something, lets you check it against the
people giving you vague stuff by asking them questions. Solves lots of
problems.

Altran/Praxis Correct-by-Construction provides an illustration of how that
works:

[http://www.anthonyhall.org/c_by_c_secure_system.pdf](http://www.anthonyhall.org/c_by_c_secure_system.pdf)

At another extreme, some companies capture requirements in formal logic then
just compile them to code with logic programming languages like Mercury:

[http://www.missioncriticalit.com/development.html](http://www.missioncriticalit.com/development.html)

Note: Not sure how well the last one works but worth mentioning that it's at
least possible to go from specs to code in some places.

------
mirceal
For anybody interested in this field:

Specifying Systems by Leslie Lamport, the same guy who gave us logical clocks
and Paxos (free on the MS Research website):

[http://research.microsoft.com/en-us/um/people/lamport/tla/book-02-08-08.pdf](http://research.microsoft.com/en-us/um/people/lamport/tla/book-02-08-08.pdf)

------
peterbonney
Aside from the obvious point that software can only be as secure as the
hardware it runs on, isn't Gödel's incompleteness theorem a problem with
scaling from smallish problems to biggish ones?

Granted it has been a long time since I understood it even moderately well,
but my main recollection is that a complete system of logic can't be
consistent, and a consistent system can't be complete. That would seem to be
an issue with extending formal methods to any computer system of broad general
use.

Or am I misapplying the theorem here?

~~~
MrManatee
Gödel's incompleteness theorems are more of a theoretical limitation than a
practical one.

Roughly speaking, Gödel's incompleteness theorems say that all proof systems
for number theory are limited in some way. For example, first-order Peano
arithmetic is such a system, and one of its limitations is that it doesn't
support transfinite induction. (It doesn't matter if you don't know what it
is.) In other words, if you want to translate a mathematical proof to Peano
arithmetic, you have to come up with a way to do it without transfinite
induction. Sometimes, such as in the case of Goodstein's theorem, this is
impossible. To prove Goodstein's theorem, you have to choose a stronger proof
system to begin with.

So, Gödel's theorems guarantee that no matter how strong your proof system is,
there are always number-theoretic statements that are beyond its reach. But
for reasons that are not currently completely understood, this doesn't really
happen in practice. "Naturally occurring" examples of number-theoretic
statements almost always turn out to be provable in surprisingly weak systems.

Instead, you run into practical problems: the theorem is provable in the
system, but actually writing out the proof is utterly inconvenient. As an
analogy, there are Turing-complete programming languages that don't have the
concept of functions. In theory, they are capable of all kinds of
computations, but in practice you don't want to use them.

And if, instead of mathematics, we concentrate on proofs of correctness, then
this is even less of a practical problem. To quote Leslie Lamport, proofs of
correctness "are seldom deep, but usually have considerable detail." The
proofs may be long and complicated, but as long as they don't use any kind of
ridiculously abstract techniques, they are just the kind of proofs where
computers can have an advantage over humans.

~~~
peterbonney
Interesting. Thank you for the explanation!

------
Mao_Zedang
I have always been interested in this area of programming. One thing I have
always wanted is the ability to enforce restrictions on data types beyond
memory. By that I mean: say I am using an int, and I want to restrict it to
the range 1-55. You can solve this using run-time error handling, but wouldn't
it be better if we had static checks, where upstream we could see an unbounded
int being passed into this method and throw a compile error?

~~~
manmal
I think you could do that, but as long as you allow the usual operations on
bounded variables (say, multiplication), you would have to ensure that the
other operand is also bounded, or else you get an unbounded result
nonetheless. Meaning, all variables used in any expression would eventually
have to be bounded. There's also the problem that the compiler would have to
infer that [1-10] - [1-2] can be negative, but does not have to be. So with
every expression on bounded variables, you widen their bounds' range, likely
eroding the system's benefits.
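
To illustrate the widening, here is a toy sketch in Python (the Interval class
is made up for illustration, not a real library):

```python
# Toy interval type: arithmetic must cover the worst case, so bounds widen.
class Interval:
    def __init__(self, lo: int, hi: int):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __sub__(self, other: "Interval") -> "Interval":
        # Smallest possible result: my minimum minus their maximum.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __repr__(self) -> str:
        return f"[{self.lo}, {self.hi}]"

# [1-10] - [1-2] can be negative, but does not have to be:
print(Interval(1, 10) - Interval(1, 2))  # [-1, 9]
```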

~~~
Mao_Zedang
I agree it's super complex, and these bounds wouldn't be limited to
integer-based variables. I would also argue there are very narrow instances
where you would want to bound a variable and also do arithmetic on it. The
language should make bounding optional and unbounded variables possible.

------
jeyoor
I think rigorous application of formal methods plus deployment techniques like
unikernels have the potential to improve the quality of a lot of software.

~~~
__jal
I'm more of a pessimist. If you can't get C programmers to handle memory
buffers correctly, how are you going to get them to use actual formal methods
or annotations that rely on understanding substantially more math than the
average programmer has?

~~~
kordless
Make the infrastructure you run things on provide what appears to be formal
methods for interactions, but also make it appear inconsistent over a given
regular timeframe. Abusers of the system will find lots of issues with it
because it will break unpredictably at random times if they are hammering
away. Happy users of the system will have it break every now and again, but
will likely avoid making things overly complex as a result of it increasing
the chances it will break over a given time period.

Expecting infrastructure to be completely reliable is a fallacy and should be
embraced instead of chased.

Suffering on the part of the former "programmer" (hacker) will be paid while
obtaining results of various runs to exploit the system from the outside
(external). Suffering on the part of the latter "programmer" (coder) will be
paid internally while working on more efficient solutions.

We'll need to code up the two types of suffering into stores of value, and tie
them to cryptocurrencies, but that should be fairly straightforward to
implement.

------
usgroup
You might regard unit tests and formal proof as opposite ends of the same
scale, with model-driven testing being somewhere in between.

One may be well advised to pick according to what's needed: formally proving
the behaviour of your whole web app would likely take an order of magnitude
longer than writing it.

------
virgulino
"Beware of bugs in the above code; I have only proved it correct, not tried
it." (Donald Knuth)

~~~
AstralStorm
A statement hopefully obviated by seL4's verified Isabelle-to/from-C
toolchain.

------
tracker1
I think the biggest issue is that, from a business person's perspective, the
formal requirements needed for this type of development (software development
as an engineering discipline) are too slow to produce, and not as open to
change as the agile (craft) methods tend to be in software.

~~~
saretired
It depends what business you're in and what your needs are. Agile doesn't get
much traction in flight control software or telephone switches.

~~~
tracker1
Exactly... anything dealing with critical hardware will/should be done as an
engineering discipline wrt development, where TFA's practices could be better.
That isn't the majority of software development, which is one-off
line-of-business apps.

------
eggy
This is a big win for Haskell, since government contracts and approved vendors
tend to have long arcs of doing future business. The fact that an EDSL in
Haskell was used greases the skids for any future Haskell acceptance on jobs.
Good luck!

------
gravypod
I don't get the idea of a formal method. You're attempting to prevent bugs in
code by writing code that describes the final product. In any reasonably
complex system the descriptions will be large, complex, and bug-prone. How is
a "formal method" any better than one of the following:

    
    
       - Unit testing: Provide every test case possible
       - Pure Functions: Use small methods that cannot error and always return sane values
       - Fuzzing Testing: To find if the implementation is actually correct or if it is correct by chance
    

What is the benefit of a formal method? It's just inconceivable for me to see
"Hacker-Proof code" and not laugh at whoever wrote this.

~~~
pka

        - Unit testing: Provide every test case possible
    

Even if your program only takes a 32 bit integer as an input you'd have to
provide 4294967296 test cases. Most programs' inputs are a bit larger than
that.
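
To put numbers on that (a toy sketch; clamp_u8 is invented for illustration):

```python
# Exhaustive testing is feasible only for tiny input spaces.
def clamp_u8(x: int) -> int:
    # Invented example: clamp an int into the unsigned-8-bit range.
    return max(0, min(255, x))

# An 8-bit input space can be checked exhaustively in an instant:
assert all(0 <= clamp_u8(x) <= 255 for x in range(256))

# A single 32-bit input already means this many test cases:
print(2**32)  # 4294967296
# ...and two 32-bit inputs mean 2**64. Exhaustive unit testing can't scale.
```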

    
    
        - Pure Functions: Use small methods that cannot error and always return sane values
    

"cannot error" is exactly what formal verification is about. Pureness doesn't
prove correctness.

    
    
       - Fuzzing Testing: To find if the implementation is actually correct or if it is correct by chance
    

Fuzz testing is good for finding buffer overflows and such, and those bugs can
be ruled out even without formal verification methods (e.g. in Rust).
Otherwise you'd need to specify every possible test output anyway, and you're
back to point 1.

~~~
gravypod
> Even if your program only takes a 32 bit integer as an input you'd have to
> provide 4294967296 test cases. Most programs' inputs are a bit larger than
> that.

How do I go about defining a test using a formal method?

Also, you know as well as I do that you don't need to test every input, only a
representative subset, if you use unit tests and a fuzz-testing library.

Using fuzzing, unit testing, pure functions, and liberal preconditions
throughout your project during development, you'll likely come up with a
product that does exactly what you are promising. Add a linter on top of that
and you've got the safety of healthy paranoia for free.

Do you have an example of a project that uses "formal methods" so I can see
what they are? I'd also like to see something written that has a comprehensive
security audit.

Again, from what I'm seeing you're writing code to verify code and calling it
"bug proof" just because you've written double the code. That's insanely
misguided in my book.

~~~
pka
> Also, you know as well as I do that you don't need to test every input, only
> a representative subset if you use unit tests and use a fuzzing testing
> library.

The thing is that you can never be sure that you chose a "representative
subset". Unless you test every conceivable input you've verified that your
program works for the subset you chose to test, nothing more. For example, how
sure are you that your unit tests cover _every_ conceivable edge case?
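
As a toy illustration (first_char and the input generator below are invented),
a randomized test suite can pass thousands of runs while never generating the
one input that breaks the function:

```python
import random

def first_char(s: str) -> str:
    return s[0]  # raises IndexError on "" -- the edge case

random.seed(0)
# A "representative"-looking fuzzer that, by construction, never emits "":
for _ in range(10_000):
    s = "".join(random.choice("ab") for _ in range(random.randint(1, 8)))
    first_char(s)  # passes every single time

# The untested input still fails:
try:
    first_char("")
except IndexError:
    print("edge case was never covered")
```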

> How do I go about defining a test using a formal method?

You don't, it's a proof. You don't need tests.

Some links from another comment [0].

I'd also point you to Idris [1] for a friendly dependently typed language if
you are interested in learning more.

[0]
[https://news.ycombinator.com/item?id=12544477](https://news.ycombinator.com/item?id=12544477)

[1] [http://www.idris-lang.org](http://www.idris-lang.org)

~~~
gravypod
> You don't, it's a proof. You don't need tests.

Ok, well, what does it look like? Can you show me an example of this proof and
how it is tested? You can't just say it's proven and remove the need to prove
you've implemented it correctly. How is the proof validated, and how is the
source code checked against it?

Also by the act of checking the source code you've essentially just created a
test case. So we are back to my initial phrasing.

What I'd like is for someone to show me how I go about proving something.
Could you, if possible, walk through "proving" an implementation of a
Fahrenheit-to-Celsius converter or something? What do you do? Where do you
start? What does this look like?

That's what's been driving me crazy about this whole "formal methods" thing.
People say "it's proven", and when you ask "how do you check your code or how
do you verify what you did was correct" they say "we made a proof, it's
proven". No information is exchanged other than the assertion of "what I did
was right".

~~~
pka
Well, a Fahrenheit-to-Celsius converter is not a terribly interesting case,
because it would be trivial to implement. When you think about it, how would
you test this in your language of choice? By computing the expected value with
the _same_ formula you used to compute the value you are testing in the first
place.
But here's another example that came up recently. Let's say you want to write
a function that takes a string and returns its first character. Now, it would
be an error to call this function with an empty string, because then what
would it return?

In a normal language, you'd probably check if the string was empty and then
return null or something. When testing the function, you'd have to remember
the empty-string case and write a unit test for it (now, I don't know about
you, but I can't say I'd always remember to write a test case for this
particular edge case; normally I'd forget).

Can we do better? It turns out we can.

We can encode invariants about our program on the _type level_. Types can be
more than just strings, ints and chars. They can also be _propositions_ , i.e.
statements about our program. In this particular case we can construct a type

    
    
        NonEmptyString
    

that can only be constructed by a function like this (using Haskell syntax):

    
    
        neString :: String -> Maybe NonEmptyString
    

(This means that neString takes a String and returns a Maybe NonEmptyString).
But what is this Maybe (also called Option, etc)?

It's like a union type - it can either contain some value or be empty. Now the
cool thing about Maybe is that we can't just take out the
potential value contained inside it - we have to _pattern match_ on it! What
this means is, that whenever you want to see what's inside a Maybe the
compiler _forces_ you to tell it what to do in both cases - when there's
something inside and when there's nothing inside.

Maybe you can see where this is going. This is how our takeFirstChar function
would look:

    
    
        takeFirstChar :: NonEmptyString -> Char
    

Notice - we say that it only works on NonEmptyStrings! If we try to pass a
regular string to it, that would be a compiler error. But now we want to read
something from the command line and return its first char. Let's say we have a
read function:

    
    
        read :: String
    

Every time you call this function it gives you a string read from the
terminal. Ok, let's put our program together now:

    
    
        a = read
        print (takeFirstChar a)
    

But wait! We can't do that - as I said above, the compiler complains, because
we try to pass a String to a function that expects a NonEmptyString. Instead
we have to:

    
    
        a = read
        case neString a of
          Just st -> print (takeFirstChar st)
          Nothing -> print "String was empty"
    

Notice how we _pattern match_ on the result of (neString a). If the string was
empty, we'd match the Nothing clause. In all other cases we'd print the first
char.

What we did is _prove_ that the program will behave correctly in all cases -
no tests needed! We can't put the program together in any other way, because
it would be a type error. In a sense, the fact that the program exists is a
proof of its type (programs have types too!)

Now this is a pretty trivial invariant to encode at the type level, but in a
dependently typed language you could encode whatever you'd like: for example,
that a list is sorted or that a number is the GCD of two other numbers, etc.

This is often not trivial to do, and at some point you start seeing
diminishing returns, because the time it takes to rigorously encode a proof is
longer than any possible return on investment from proving that there are no
bugs. But it doesn't have to be an either-or situation. You can encode as many
invariants as you want in your code. I think Haskell strikes a good balance in
that regard, although Idris is a very interesting language too.

(Sorry for typos, errors etc, didn't proofread).
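
For what it's worth, the same smart-constructor idea can be approximated
outside Haskell. Here's a hedged Python sketch where a NewType plus mypy
stands in for the compiler (the check is static, not runtime, and nothing
stops a deliberate cast, so it's weaker than the real thing):

```python
from typing import NewType, Optional

# Distinct from plain str as far as the type checker is concerned.
NonEmptyString = NewType("NonEmptyString", str)

def ne_string(s: str) -> Optional[NonEmptyString]:
    # The only sanctioned way to obtain a NonEmptyString.
    return NonEmptyString(s) if s else None

def take_first_char(s: NonEmptyString) -> str:
    return s[0]  # safe by construction: s is never empty

line = "hello"        # stand-in for reading from the terminal
ne = ne_string(line)  # forces us to handle both cases, like Maybe
if ne is None:
    print("String was empty")
else:
    print(take_first_char(ne))  # prints h; mypy rejects take_first_char(line)
```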

~~~
gravypod
> But here's another example that came up recently. Let's say you want to
> write a function that takes a string and returns its first character. Now,
> it would be an error to call this function with an empty string, because
> then what would it return?

This is wrong because it depends on the language. The behavior depends on the
implementation, but if you are meant to return the first character of the
string and the string has no characters, then you should return an empty
string.

From there, if given null you should return null or throw an exception,
depending on the language.

My function would, abstractly, look:

    
    
      function get_first_letter(s):
        if s == null:
          return null
        if s == "":
          return ""
        return s[0]
      
      value = get_first_letter(read_string())
      
      switch (value):
        case null:
          print("No string entered")
          return
        default:
          print(value)
          return
    

Is this code "proven" just because it handles every case? What makes this a
proof, or proven? This is basically just good practice: handle every case that
a function can create. This is something that we have all been doing. What
makes a formal method "proven" or different?

> What we did is prove that the program will behave correctly in all cases -
> no tests needed! We can't put the program together in any other way, because
> it would be a type error. In a sense, the fact that the program exists is a
> proof of its type (programs have types too!)

This seems no different from any other language. You've just spread the code
you had to write over "neString" and "takeFirstCharacter" and created a new
type for it. Why couldn't you have just used the underlying functionality of
String to make those checks when needed? For instance, "neString" must be
using some sort of string method to check if a String is a NotEmptyString. Why
not just put that code where it needs to be?

This just seems like a way to force your hand into using more types that have
the same underlying implementation, and I just see that as a way to confuse
someone coming to work on your project. When I look at the source and say
"wait a minute, they have 10 types defined for strings and all they do is
something that String could do beforehand", I'm a little irked. What makes the
implementation that you have created "proven"? Just that the compiler said
that all of the types line up? That doesn't make you any safer than before.

If you make even a single cast, a single unsafe block, or anything like that,
then you need to test the entire system. Your house of type-verifying cards
falls down once you hit even the smallest section of unverifiable code. Making
types for every one of your billion special cases doesn't mean that you don't
need to test the code.

For instance, in Rust the standard library uses unsafe{} blocks often. This
means that any code that goes into or comes out of those blocks can't be
verified by the type system. Using defined types to create it doesn't mean the
programmer has correctly implemented anything.

Also, you are assuming the implementation of neString is correct. What if that
has a bug? Again, just because you're passing your data around the correct way
doesn't mean the correct things are happening with it.

You can't just make more types and call it all good.

~~~
pka
> This is wrong because it depends on the language. The behavior depends on
> the implementation but if you are meant to return the first character in the
> string and the string has no characters then you should return an empty
> string if provided a string with no characters.

How? The type is:

    
    
        takeFirstChar :: String -> Char
    

There's no such thing as an empty char. Your get_first_letter essentially has
the type:

    
    
        get_first_letter :: Either Null String -> Either Null String
    

Which doesn't guarantee anything and is arguably a lot worse than just:

    
    
        get_first_letter :: NonEmptyString -> Char
    

This is why you can't have proven code in a lot of languages, because they
permit you to lie about the actual types. From there, any proof is invalid
because it is based on invalid propositions.
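
The NonEmptyString -> Char signature above can be approximated even in
languages without dependent types, via the smart-constructor pattern. A
hedged sketch in Rust (the `NonEmptyString` type and its method names are
ours, purely illustrative): because the only constructor checks for
emptiness, `first_char` needs no failure case.

```rust
// A wrapper whose invariant (non-emptiness) is checked once, at construction.
// Outside this module there is no other way to build a NonEmptyString,
// so every value of the type is known to hold at least one character.
pub struct NonEmptyString(String);

impl NonEmptyString {
    // The only constructor: rejects the empty string up front.
    pub fn new(s: String) -> Option<NonEmptyString> {
        if s.is_empty() {
            None
        } else {
            Some(NonEmptyString(s))
        }
    }

    // Total: the invariant guarantees a first character exists.
    pub fn first_char(&self) -> char {
        self.0.chars().next().unwrap() // never panics given the invariant
    }
}
```

The empty case is handled exactly once, at the boundary where a plain String
is converted; code holding a `NonEmptyString` never re-checks.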

Dependently typed and other languages do not allow you to lie about the types
(for example, a value can't be null unless this is reflected in the type; you
can't throw exceptions or do IO unless this is reflected in the type, etc),
hence a well typed program in such a language is a valid proof. You can not
implicitly cast between types, i.e. a function like:

    
    
        cast :: a -> b
    

doesn't exist.

A function like print actually has the type:

    
    
        print :: String -> IO ()
    

Which tells you that it is doing IO. You can't call such a function from a
pure function for example, the types wouldn't permit it.

> Also, you are assuming the implementation of neString is correct. What if
> that has a bug? Again, just because you're passing your data around the
> correct way doesn't mean the correct things are happening with it.

I was simplifying, but neString would actually have a type similar to:

    
    
        neString :: String n -> Maybe (String (n != 0))
    

where n is the length of the string. Notice, the length is encoded in the
type! A wrong implementation (one that returns, e.g., Maybe (String 0)) will
not type check.

But yes, you can have errors in your proof, no doubt. But these are
specification errors, which there's no way to get around. What a proof
guarantees is that there are no errors in the _implementation_ of your
specification.

~~~
gravypod
OK, so if you are given a specification, you use formal methods throughout,
and at the end the result doesn't conform to the specification you were
provided with, then you have made an error. The specification is not at fault
here. What you are doing is claiming that the types into and out of functions
decide whether they are functioning correctly. That is not the case at all.

I think that most bugs occur not because the type you were expecting was
wrong, but because the data produced by a function was wrong.

Implementations cannot be judged just because they accept the correct types.

For instance, if I am told to write a function to predict lottery numbers and
do the following

    
    
       LotteryNumberType
       get_lottery_number :: LotteryNumberType
    

If I call `get_lottery_number`, by your logic it will always provide me with
the real winning lottery number since the types check out. That is absolutely
incorrect. Types can't prove functionality. They do provide protection from a
set of bugs, but those bugs are also avoided easily by handling every case
that a function can return, which was already being done before formal methods
were an idea.

You can't say a program will have no bugs because the types check out. Is that
what you are saying? Your code's types match in all uses -> this means there
are no bugs -> this means I don't need to test my code?

Is that really what you think?

~~~
pka
> What you are doing is claiming that the types into and out of functions
> decide whether they are functioning correctly. That is not the case at all.

When I say proof, I mean it in the mathematical sense. The Curry–Howard
isomorphism [0] formalizes the relationship between (mathematical) logic and
type theory. Unless you are saying mathematical logic is inconsistent (and
thus every mathematical proof is invalid) this is absolutely the case.

The same way you can construct a mathematical proof that e.g. n is prime, you
can construct a type Prime n that proves that n is prime [1].

> I think that most bugs occur not because the type you were expecting was
> wrong, but because the data produced by a function was wrong.

This is where your misunderstanding stems from. Continuing with the example
from above, if you have a function:

    
    
        makePrime :: n -> Prime n
    

the function _can not_ construct (Prime n) if n is not a prime.
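
In a language without dependent types this can only be approximated at
runtime, but the shape of the idea carries over. A hedged Rust sketch (the
`Prime` type and `try_prime` constructor are hypothetical, our own names):
possessing a `Prime` value is evidence the primality check succeeded, because
no other way to construct one exists.

```rust
// Possessing a Prime value is runtime evidence that its number passed the
// primality check: the field is private (outside this module) and try_prime
// is the only constructor.
pub struct Prime(u64);

impl Prime {
    // Trial division; returns None when n is composite or n < 2.
    pub fn try_prime(n: u64) -> Option<Prime> {
        if n < 2 {
            return None;
        }
        let mut d = 2;
        while d * d <= n {
            if n % d == 0 {
                return None; // found a proper divisor: n is composite
            }
            d += 1;
        }
        Some(Prime(n))
    }

    pub fn get(&self) -> u64 {
        self.0
    }
}
```

A dependently typed language moves this check to compile time: makePrime
either produces a term of type Prime n or the program fails to type check, so
the evidence is static rather than dynamic.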

> If I call `get_lottery_number` by your logic it will always provide me with
> the real winning lottery number since the types check out. That is
> absolutely incorrect. Types can't prove functionality.

If you give me a specification that constructs predicted lottery numbers I
could absolutely turn it into a type :)

> They do provide protection from a set of bugs but those bugs are also
> avoided easily by handling every case

But you don't have proof that you are handling every case :)

> You can't say a program will have no bugs because the types check out. Is
> that what you are saying?

Yes, if the types correspond to the specification then there are no bugs. The
specification can still have bugs though, but that's unavoidable. The thing is
that you've proven the program does what you told it to do.

Once you stop thinking about types in the Java sense and start thinking about
them as logical propositions it will all make sense. I provided you with a
couple of links in the other comments and here if you are still interested.

[0]
[https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspon...](https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence)

[1] [https://github.com/agda/agda-
stdlib/blob/master/src/Data/Nat...](https://github.com/agda/agda-
stdlib/blob/master/src/Data/Nat/Primality.agda)

------
phantom_oracle
Here are my questions (to anybody with knowledge of this field):

\- Is this type of programming related to the functional-programming field?

\- How is this different to OOP/imperative code?

\- What language(s) will be used to write code for this?

~~~
AstralStorm
I can speak about Isabelle/HOL specifically only.

1) Functional programming is a subset of mathematical proofs. Isabelle syntax
is somewhat similar to what is offered in the ML family of functional
languages. Unlike functional languages, you can use more advanced constructs
than bijections (you can model superposition, etc.), and it is easier to state
things over sets.

2) Imperative code is a subset of this too, with an added order of operations.
The language is somewhat different, though Isabelle has a module with the
necessary proofs to verify imperative programs.

------
bikamonki
Just thinking out loud: if a program can check the correctness of another
program against a formal spec, can't it just _write_ the checked program from
the spec?

~~~
MrManatee
As others have already replied, sometimes it can. But sometimes, and
particularly if we care about efficiency, this is so difficult that we're not
even close to being able to automate it.

For example, here is my "formal definition" of a primality checker:

    
    
        IsPrime(Int n) = n > 1 and not(exists a, b in Int: a > 1 and b > 1 and a * b == n)
    

It is not directly executable, because it uses the "exists" quantifier over
all integers. A clever code extractor should be able to somehow convert this
to a finite computation. But would it be able to come up with the polynomial-
time AKS primality test? [1] I highly doubt it.

Unless, of course, there is a special case for recognizing this particular
definition. But I don't think that really counts, because I'm only using
primality checking as an example. You can't have a special case for
everything.

[1]
[https://en.wikipedia.org/wiki/AKS_primality_test](https://en.wikipedia.org/wiki/AKS_primality_test)
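
The gap between the definition and an efficient algorithm shows up even in the
most naive transcription. A hedged Rust sketch (function name is ours): to make
the unbounded "exists" executable at all, one already needs the insight that
any factorization of n has a factor no larger than sqrt(n), and that is a tiny
step compared to discovering something like AKS.

```rust
// Naive executable reading of:
//   IsPrime(n) = n > 1 and not(exists a, b: a > 1 and b > 1 and a * b == n).
// The unbounded quantifier is made finite by observing that any proper
// factorization a * b == n has min(a, b) * min(a, b) <= n.
fn is_prime(n: u64) -> bool {
    if n <= 1 {
        return false;
    }
    let mut a = 2;
    while a * a <= n {
        if n % a == 0 {
            return false; // witness found: a and b = n / a with a * b == n
        }
        a += 1;
    }
    true
}
```

This runs in time roughly sqrt(n), i.e. exponential in the bit length of n,
whereas AKS is polynomial; a code extractor would have to invent genuinely new
mathematics to bridge that distance.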

~~~
AstralStorm
Actually, Isabelle has methods that check isomorphism of proofs in a
brute-force way. It is pretty slow, so it is almost always better to just say
which specific proof you want to use.

------
TwoBit
All we need now is a proof that the proof program is correct. And a proof that
the proof proving program is correct.

~~~
AstralStorm
Internal consistency is relatively easy to verify. Completeness to some extent
as well.

The manual task is to verify all the assumptions.

------
jbb555
I don't understand this.

In order to use formal verification you have to provide a complete and correct
specification of what you want the software to do. We already have formal
languages for doing this called programming languages.

How is this any different from a programming language?

~~~
fmap
Programming languages have to be (efficiently) executable, specifications can
be logical/declarative. E.g. in a specification I can state things like
"forall functions, with uncomputable property foo, we have bar". Plenty of
things are uncomputable (e.g. quantifiers over natural numbers), but useful in
specifications.
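
The difference can be made concrete with a small sketch (Rust here, with
illustrative names of our own): the specification states what a correct
result looks like as a predicate, while the program commits to one particular
way of computing it.

```rust
// Declarative "spec": states WHAT a correct sort produces, not HOW.
// A correct output is ordered and is a permutation of the input.
fn satisfies_sort_spec(input: &[i32], output: &[i32]) -> bool {
    let ordered = output.windows(2).all(|w| w[0] <= w[1]);
    // Permutation check via sorted copies (a sort is used here only as a
    // checking device, not as the implementation under test).
    let mut a = input.to_vec();
    let mut b = output.to_vec();
    a.sort();
    b.sort();
    ordered && a == b
}

// The "program": one particular way of achieving the spec.
fn insertion_sort(mut v: Vec<i32>) -> Vec<i32> {
    for i in 1..v.len() {
        let mut j = i;
        while j > 0 && v[j - 1] > v[j] {
            v.swap(j - 1, j);
            j -= 1;
        }
    }
    v
}
```

Verification means proving that every output of the program satisfies the
predicate, for all inputs, rather than checking it case by case at runtime.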

There are good and bad specifications, same as for programs. For the semantics
of a programming language you could essentially transcribe an interpreter in
the form of a structural operational semantics, and these are relatively
error-prone. Instead, you could give an axiomatic semantics, which is a lot
more high-level. An equivalence proof between the two gives you a high level
of assurance that the operational interpretation you had in mind while writing
your interpreter actually means what you think it means.

A recent example, the formal specification of the weak memory model of C11
turned out to be wrong (in the sense that it forbids common compiler
optimizations, because programs have access to a time machine), but this was
discovered when trying to develop a program logic for C11.

In practice, most broken specifications I have seen were written by people who
never really worked in formal verification. I am not aware of a single
instance where a piece of formally verified code was broken because of a
broken specification. There are cases where the specification had to be
extended. E.g. CompCert was initially developed as a whole program compiler
and the spec had to be extended for separate compilation. This broke the alias
analysis.

------
heimatau
To save others time. I've copied some of the biggest points of the article. My
comments are in [brackets], except these^^ first few sentences.

"Between the lines it takes to write both the specification and the extra
annotations needed to help the programming software reason about the code, a
program that includes its formal verification information can be five times as
long as a traditional program that was written to achieve the same end."

"Jeannette Wing, corporate vice president at Microsoft Research. “Any natural
language is inherently ambiguous. In a formal specification you’re writing
down a precise specification based on mathematics to explain what it is you
want the program to do.”"

"“We’re not claiming we’re going to prove an entire system is correct, 100
percent reliable in every bit, down to the circuit level,” Wing said. “That’s
ridiculous to make those claims. We are much more clear about what we can and
cannot do.”"

[In the experiment the article is mostly based on] "The hackers were
mathematically guaranteed to get stuck. “They proved in a machine-checked way
that the Red Team would not be able to break out of the partition, so it’s not
surprising” that they couldn’t, Fisher said. “It’s consistent with the
theorem, but it’s good to check.”"

"In a return to the spirit that animated early verification efforts in the
1970s, the DeepSpec collaboration led by Appel (who also worked on HACMS) is
attempting to build a fully verified end-to-end system like a web server. If
successful, the effort, which is funded by a $10 million grant from the
National Science Foundation, would stitch together many of the smaller-scale
verification successes of the last decade. Researchers have built a number of
provably secure components, such as the core, or kernel, of an operating
system. “What hadn’t been done, and is the challenge DeepSpec is focusing on,
is how to connect those components together at specification interfaces,”
Appel said."

~~~
mkstowegnv
"Jeannette Wing, corporate vice president at Microsoft Research. “Any natural
language is inherently ambiguous."

Lojban (nee Loglan) is a conlang designed to be syntactically unambiguous but
as usable in speech and thought as a natural language
[https://en.m.wikipedia.org/wiki/Lojban_grammar](https://en.m.wikipedia.org/wiki/Lojban_grammar)

------
walter_bishop
Such 'hacker-proof code' can only be as secure as the underlying hardware. If
such a cure already existed, we wouldn't be subjected to the current
hacking/phishing epidemic.

"these malfunctions could be as simple as a buffer overflow"

No amount of 'formal verification' will detect or prevent buffer overflows,
which are a defect in how the hardware allocates memory. The cure must also be
found in the hardware.

"Back in the 20th century, if a program had a bug, that was bad, the program
might crash, so be it,”

It must have been on a different planet, because I can remember using
networked computers in college way back in 1984, and the first Internet worm
occurred in 1988.

~~~
marcoperaza
Hardware bugs are NOT the major source of widely-exploited vulnerabilities in
practice. They're a tiny fraction of exploits being used in the wild, and
certainly of the ones being used to target the general population.

Further, a buffer overflow is absolutely not a defect in how hardware
allocates memory. It's a logical flaw in the software that reads/writes beyond
the limits of the memory the programmer intended to access.

~~~
mistaken
It's also worth noting that some critical hardware components (e.g. CPUs since
the infamous division bug) are formally verified as well, so manufacturing
defects should be the largest concern for them.

~~~
AstralStorm
Only a tiny fraction of the CPU is formally verified. There are frequent
errata. And plenty of undefined behaviour as well as races you can hit if you
really try.

------
neilellis
This article was just wildly misleading; the title, well, it speaks for
itself.

It's an interesting language, but rather than being 'formally proven' as the
article wildly claims, it is a strongly typed system à la Haskell. So yep, it
looks like a great idea for secure systems - but formally proven - bah! :-)

Has anyone here actually formally proved a non trivial program, say a couple
of hundred lines on NCSS?

How about tens of thousands, hundreds of thousands, or millions of lines of
code? The proof would be so insanely complicated that the expertise needed to
define the requirements would be (if at all within the limits of human
endeavour) within reach of maybe a handful of people. And do those people make
good business analysts, or hackers, or entrepreneurs, or even just
communicators?

Jeez the misinformation that gets spread!

~~~
tzs
> Has anyone here actually formally proved a non trivial program, say a couple
> of hundred lines on NCSS?

I don't know what NCSS is, so I am going to answer as if your sentence stopped
two words earlier.

How about the CompCert C compiler [1]? Or CakeML [2]?

And let's not forget seL4 [3], which has a formal proof that the
implementation in C conforms to the specification, and a further formal proof
that the binary code is a correct translation of the C code.

[1]
[http://compcert.inria.fr/compcert-C.html](http://compcert.inria.fr/compcert-C.html)

[2] [https://cakeml.org](https://cakeml.org)

[3] [https://sel4.systems](https://sel4.systems)

