The Piano Test for Program Verification (2011) (regehr.org)
37 points by tosh on Jan 26, 2022 | 23 comments



> I recently reviewed a paper that kept saying things like “we rigorously prove” and “we formally verify” — despite the fact that the paper contained nothing but proof sketches

Yep. Some people think "formal" means "mathematical," whereas it actually means "mechanical". A formal proof is one that can be correctly checked by a computer.
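
To make "mechanical" concrete, here is a minimal machine-checked proof in Lean (any proof assistant would do; this example is my own, not from the article). The point is that the kernel, not the reader, validates every step:

    -- A trivial but fully formal proof: the Lean kernel checks it mechanically.
    theorem two_plus_two : 2 + 2 = 4 := rfl

    -- Slightly less trivial: addition on naturals is commutative.
    theorem my_add_comm (m n : Nat) : m + n = n + m := Nat.add_comm m n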

> The second problem is deeper: it is extremely difficult to argue convincingly that a verification result is correct. By “correct” I mean not only “mathematically sound” but also “the result is faithful to the intent of the verification effort.”

Worse. Even if we're convinced that the properties we verify are the right ones for correctness, we can only prove (even formally so) that an abstract algorithm satisfies them. We cannot prove that a system — which ultimately comprises a lot of hardware — does.

On the other hand, because reducing the probability of (catastrophic) failure is all we can hope for anyway, formal verification in its various forms can help do that better than traditional testing methods alone: it can reduce the probability further, and at lower cost. For example, because a formal specification can be quite short, it might be easier to convince ourselves that the properties we're checking are indeed the ones we're interested in than it would be with thousands of tests. I absolutely agree, though, that formal methods researchers often make the mistake of focusing on the wrong thing when communicating the value of their work. Formal methods can never prove a system correct, but they can help make it more correct, with less effort, than traditional assurance methods alone.

In other words, formal methods help catch some dangerous bugs more easily than traditional testing, and that is where their value is. Using the word "proof" when describing the goals of formal methods is doing them a disservice.


You'd be foolish to sit under the piano if only some piece of software is verified, and not the entire operating system and the silicon. And not just the correctness of the silicon's blueprint, but freedom from manufacturing defects in that exact unit of silicon and all the hardware around it: every DRAM memory cell, every capacitor. That still leaves the possibility that a bit is flipped by radiation striking the silicon; consumer-grade hardware is quite susceptible to random errors caused by cosmic rays.

All verification makes assumptions about some lower layer that is not itself verified.

For instance, you can't deduce the issues uncovered by Meltdown and Spectre from the semantics of the x86 instruction set architecture; so if you use that alone, you can falsely prove that x86 programs don't have security problems.

Rationally speaking, you shouldn't take any risk, however small, where there is no benefit. You wouldn't risk $1000 if the expected value of the entire risky event is $900. If you're putting your life on the line, there has to be something in it beyond convincing people that you really believe your work is correct.

Consider this: people have died for the sake of showing that their belief (e.g. religious) is correct! For instance, they have set themselves on fire, blown themselves up, or suffered as martyrs all the way to death. Just because some programmer sat under the piano while their code was being fuzzed doesn't mean the code is correct, just that the programmer is reckless and has a fanatical belief in the work. (And if the programmer is reckless, I'm disinclined to trust their code.)


If we were able to write a complete, correct, and machine-readable specification for a piece of code, then we wouldn't need the code: we could use the specification as the code. (This does not mean that programs cannot be verified; I believe they can be, at least to the same degree bulletproof vests are, and people routinely bet their life to demonstrate the latter. It's just that it shouldn't require two specifications.)


A specification isn't the same as executable code. For example, the specification for sorting an array A with indexes [0 .. n) is: the resulting array B is ordered, B[0] <= B[1] <= ... <= B[n-1], and there exists a permutation P of [0 .. n) such that B[i] = A[P(i)] for every i. But this specification says nothing about how to find such a permutation. You could do it by brute force because the space is finite, but that wouldn't be a practical sorting algorithm. So a practical algorithm will have a different flavor, and a different proof, than the simplest specification.
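
To make the distinction concrete, here's a minimal Python sketch (my own illustration, not from the article): the spec becomes a checkable predicate, and "executing" it by brute-force search is correct but useless beyond tiny inputs.

    from collections import Counter
    from itertools import permutations

    def satisfies_sort_spec(a, b):
        """The spec as a predicate: b is ordered and b is a permutation of a."""
        is_sorted = all(b[i] <= b[i + 1] for i in range(len(b) - 1))
        return is_sorted and Counter(a) == Counter(b)

    def sort_by_brute_force(a):
        """'Execute the spec': try all n! permutations until one satisfies it."""
        for b in permutations(a):
            b = list(b)
            if satisfies_sort_spec(a, b):
                return b

    assert sort_by_brute_force([3, 1, 2]) == [1, 2, 3]
    # Fine for n = 3; hopeless for n = 20 (20! is about 2.4e18 candidates).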


It looks like too big a specification, more like a wish. We could equally well write that the result must be a solution to the traveling salesman problem. A specification can only be useful if there exists a piece of code that can take it, apply it to the algorithmic part, and tell us the result.

I'm thinking more along the lines of "when this field is X, that field can only be Y or Z" or "when this condition holds, these fields are not accessed". And these declarations should be part of the main code so that they can refer to its internals.
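
For instance, a minimal Python sketch of such in-code declarations (the Connection type and its fields are invented for illustration):

    class Connection:
        def __init__(self):
            self.state = "closed"   # "closed", "opening", or "open"
            self.socket_fd = None

        def check_invariants(self):
            # "When this field is X, that field can only be Y or Z."
            if self.state == "closed":
                assert self.socket_fd is None
            # "When this condition holds, this field must be set."
            if self.state == "open":
                assert self.socket_fd is not None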


We would still need code because the "specification" and "code" have become synonymous by virtue of the specification becoming executable. It isn't possible to need the specification, but not need the code, when they are the same thing.

The problem is that producing a complete and correct specification is almost as difficult as producing complete and correct code. And the "almost" disappears when we require the specification to be executable.

Writers of non-executable specifications rely on being rescued by the code: that someone will "read between the lines" here and there when they produce the code.

There is also a very good reason for needing two specifications: the specification proper and some code written to the specification (even if the specification is executable). Because producing specifications is difficult, the redundancy helps: whenever your specification and code have a discrepancy, that could be a defect.

If specifications are executable, the implementations are severely constrained; they have to do exactly what the specification does. If the executable specification for a programming language is an interpreter for that language, then a compiled version of the language doesn't conform to the spec. It's a different spec.


I'd argue that a specification must be a homomorphic version of the real code: a specification has fewer parts, but the relationships between the parts are the same as in the code. It would be interesting to write a general specification for, say, a math typesetting algorithm and then add concrete specs that describe how that algorithm materializes in, say, different instruction sets. (I wish there were such specifications, because right now, when you want to understand the general principle, you have to rely on texts or dig that knowledge out of the idiosyncrasies of different programming languages and individual styles.) In other words, a specification that can be shared between different programming environments: something that captures the fundamental programming principles, not the "modern" features.


I did something less extreme than the piano test but still risky. I implemented the Bitcoin cryptography stack from scratch - prime field, elliptic curve addition, elliptic curve multiplication, Base58Check, ECDSA, RIPEMD-160, SHA-256. I added hundreds upon hundreds of test cases, and some parts were mathematically proven. Then I computed private keys, public addresses, and outgoing transactions using my own code, and processed a handful of transactions. If I made a mistake in the math or algorithms, I could have sent money to unspendable addresses, which would have resulted in thousands of dollars in losses - and remember that transactions are irreversible on the blockchain. That was the degree to which I trusted my code to do the right thing.
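
For flavor, here is a minimal Python sketch of the elliptic-curve core of such a stack (my own illustration, not the commenter's actual code; it is far too slow, and not constant-time, for real use):

    # secp256k1 domain parameters (curve y^2 = x^3 + 7 over GF(p))
    P = 2**256 - 2**32 - 977  # field prime
    N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
    G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
         0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

    def point_add(a, b):
        """Add two curve points; None represents the point at infinity."""
        if a is None:
            return b
        if b is None:
            return a
        (x1, y1), (x2, y2) = a, b
        if x1 == x2 and (y1 + y2) % P == 0:  # a == -b
            return None
        if a == b:                           # doubling
            lam = (3 * x1 * x1) * pow(2 * y1, -1, P) % P
        else:                                # distinct points
            lam = (y2 - y1) * pow(x2 - x1, -1, P) % P
        x3 = (lam * lam - x1 - x2) % P
        return (x3, (lam * (x1 - x3) - y1) % P)

    def point_mul(k, point):
        """Scalar multiplication by double-and-add."""
        result = None
        while k:
            if k & 1:
                result = point_add(result, point)
            point = point_add(point, point)
            k >>= 1
        return result

    # Sanity checks of the kind the commenter describes: N*G must be the
    # identity, and a derived public key must actually lie on the curve.
    assert point_mul(N, G) is None
    pub = point_mul(123456789, G)
    assert (pub[1] ** 2 - pub[0] ** 3 - 7) % P == 0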


Yup. The article is excellent on cybersecurity in general. When I taught a cybersecurity class, on day one I said "Cybersecurity is presently very bad," and presented a simple way to prove it, with literally zero tech knowledge required.

Find a cybersecurity company that will indemnify you in case of a breach or problem. (Hint: you won't.)

Skin in the game.


Some transportation systems are formally verified. I don't think anyone riding them except for extreme nerds would even know. A big part of engineering is, after all, preventing harm to the public through its works.

All we can prove is that some property holds. I think the major benefit is that it helps humans think about those properties and about what is important. Especially in the world of computer systems, it is impractical to expect a human to glance at a few thousand lines of code and spot where the race condition is. But if we simplify the model and check it with a computer, it can usually find it rather fast.
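
As a toy illustration of that last point (a made-up Python sketch, not a real model checker): shrink the system to a two-thread model of a non-atomic increment and let the machine enumerate every interleaving.

    def interleavings(a, b):
        """Yield every merge of step lists a and b that preserves their order."""
        if not a or not b:
            yield list(a) + list(b)
            return
        for rest in interleavings(a[1:], b):
            yield [a[0]] + rest
        for rest in interleavings(a, b[1:]):
            yield [b[0]] + rest

    def run(schedule):
        """Execute one interleaving; each thread does tmp = counter; counter = tmp + 1."""
        counter, tmp = 0, {}
        for tid, op in schedule:
            if op == "read":
                tmp[tid] = counter
            else:  # "write"
                counter = tmp[tid] + 1
        return counter

    thread = lambda tid: [(tid, "read"), (tid, "write")]
    results = {run(s) for s in interleavings(thread(0), thread(1))}
    print(results)  # {1, 2} -- the 1 is a lost update, found mechanically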

Formal methods don't make systems inherently safer or fool-proof, but they help us think about and understand complex systems. The name of the game, after all, is to not fool ourselves; the challenge is that we are the easiest ones to fool (paraphrasing Feynman here).


Your program might as well be an infinite loop sampling a boolean; a SIGSEGV can be thrown after a hardware error. I wouldn't sit under that piano even if your code were formally verified and that formal verification was verified to be verified. :)


This setup tests the validity of the entire computer: drivers, operating system, libc. I think that's what makes me uncomfortable with it, far more than a bug a fuzzer can find in openssh.


"systems are tested in realistic environments and failures are rigorously studied and the results are fed back into the design of the next iteration of the system. This is not incompatible with formal verification, but it’s also not a point of view that comes out very often in the verification literature."

Yes, indeed. Life is infinite like the Sierpiński triangle, and at some point we deem the risks small enough to fall within engineering tolerances. If only the code review engineers checking each change were as forgiving as the auditors.

Startups know this (business SWOT analysis). Hacker News knows this (karma). Some people, like Elon Musk, know this very well, so much so that SpaceX is proud of its failures.

Humility is the source of integrity.


We used to play a game at work where we would write deliberately bad code. It started off as bits we would find in our own code base, but it morphed into a game where we would try to write bad code. We would add constraints like "make a for loop without the for condition". The whole group got into it. It was fun, and it gave everyone concrete examples of what bad code is and how to look for it.

Failure is an excellent teaching tool. When the stakes are low, orgs can seek it out. When the stakes are 10k people losing their jobs, orgs tend to build in caution and tripwires around the known bad things. Elon was at one point one rocket going kablooie away from SpaceX not existing. That is no longer true, so he can seek out more risk.


Elon Musk is the biggest egomaniac that ever roamed the Earth; he doesn't fit in with your comment in the slightest.


Since I am a coyote, I will simply flop around like an accordion while birds circle my head.


Offer ten bucks too and someone might actually take you up on it.


Doesn't have anything to do with a piano.


Don't look up.


It just shows what the author would do with a piano (he is likely not a musician); the true story is about a rope, an I-beam, and a knife controlled over USB.


Weird comment. It's a piano because falling pianos have been a gag in movies and cartoons since forever. (Pianos really have been raised and lowered via cranes above the sidewalk.)


It's simply not relevant to the analogy that there is a piano; it could be anything heavy. Maybe children find it funny when a piano falls on someone's head, just as they find jokes about people's disabilities funny.


Taking a wild guess, you aren't a writer. It's simply a colorful (and memorable) analogy that uses a situation well known as a gag within our culture (similar ones involve anvils and safes).

Nobody here finds it funny when a piano actually falls on someone. But we've all seen Road Runner vs. Wile E. Coyote, and most of us aren't so uptight that we get upset at such concepts.

Would you have found it better if he had said a television set? Hundreds of people are actually killed by toppling TVs every year, while no one has ever been killed in the way depicted in the gag (and alluded to in the article): a piano hoisted on a cable falling on someone below (typically while the piano is being moved above a city sidewalk).

To compare it to joking about people with disabilities is just bizarre. Are you just concern trolling here?

https://www.straightdope.com/21344194/has-anyone-ever-been-k...



