Depending on the system you're using, formal methods will rely on some set of preconditions (the state before a function), postconditions (the state after a function), and general state. There's nothing magical about formal methods that prevents errors here, whether logical/technical or by omission. And very small changes in program flow or logic can suddenly result in enormous amounts of rework.
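As a minimal sketch of what those pre/postconditions look like in practice (the `Stack`, `Element`, `Is_Empty` and `Size` names here are hypothetical), Ada 2012 / Spark attaches them directly to the subprogram declaration:

```ada
--  Hypothetical stack API: the contract is part of the declaration.
procedure Pop (S : in out Stack; E : out Element)
  with Pre  => not Is_Empty (S),               --  state required before the call
       Post => Size (S) = Size (S'Old) - 1;    --  state guaranteed after the call
```

The proof tools then try to show that every caller satisfies the `Pre` and that the body always establishes the `Post`.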
They're really fun and interesting, and they also help you look at your code in a different way even when you're not using them. I think everybody has this sort of 'epiphany' moment the first time they actually play with them. But they don't work out in practice anywhere near as well as they do in theory or at very limited scale -- something you quickly realize as the effort involved grows exponentially against linear increases in program size/complexity.
It's true that to date we don't have tools that can efficiently verify a program all the way from code to the high-level global correctness conditions (and, TBH, I don't think we'll have them in the foreseeable future, if ever), but there are tools that help narrow the gap: code-level tools that work at the level of individual code units (function preconditions), and high-level tools that work on the large-scale system description. Using both (which is exactly what companies that use formal methods for software, like Altran UK, do) lets you work from both directions -- top and bottom -- to minimize the "uncertainty" gap, and, most importantly, to do it affordably.
It's also true that formal methods are a tool that we've not yet learned to use effectively in practice, and we don't yet know the best ways to deploy them. But we're learning, and we're already at the point where at least some users are seeing considerable payoffs.
It is really just a matter of choosing the right tool for the job. Sometimes you need to be really really sure everything goes as you expect, and testing alone won't cut it.
Also, the level of confidence offered by formal methods is often configurable to lower costs. Some formal methods can affordably scale to very large systems, with the tradeoff that the confidence is lowered, as it only covers the general algorithm design rather than actual code.
I think Frama-C & Spark should do more to sell the AoRTE (absence of run-time errors) stuff. If your software at least won't crash or use uninitialized variables, you'll sleep better. The world would probably be a better place if things like X.509 and ASN.1 decoding code and all kinds of file format parsers were proved for 'just' the absence of runtime errors...
The nice thing about most modern formal verification tools is that they're not all-or-nothing propositions. Suppose you already have some Spark code that you've checked (proof) for AoRTE, but didn't go for full proof of all the functional properties of your software (you went for pure testing with a pinch of static analysis and some code review on critical stuff). You're already high on the correctness scale...

Then you get a report from a customer or security researcher or code auditor that there is a strange behaviour in your code. They can't nail down the reason, but sometimes you send inconsistent data. No crash or memory corruption, just an output that seems impossible to you (even after full code review and exhaustive testing) and out of spec anyway. No reproducer, no idea what was sent to your system, just 'please stop sending us inconsistent data'.

One way to find the bug is to try to prove it's not there. Add a postcondition 'fields a and b and c should never be 0 at the same time' and use the tools. Maybe start with structured testing, following the code yourself... Then go for unsound static analysis or fuzz testing. Then, once you've run 2 weeks of afl and honggfuzz without triggering this postcondition, and once you've exhausted all the CodePeer (or PVS-Studio or Coverity) false positives and they still haven't found the code path to your bug, and you still need to find it, you can use the proof tools and be sure you'll get an answer (after some work) on this specific question.
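For illustration, that 'fields a and b and c should never be 0 at the same time' postcondition might be written as follows (the `Encode`, `Raw_Data` and `Message` names, and the `A`/`B`/`C` fields, are hypothetical):

```ada
--  The postcondition encodes the suspected invariant; the prover must
--  either confirm it holds on all paths or point at where it can fail.
function Encode (Input : Raw_Data) return Message
  with Post => not (Encode'Result.A = 0
                    and Encode'Result.B = 0
                    and Encode'Result.C = 0);
```

An unproved check here is exactly the lead you were looking for: the tool reports the path on which the property can be violated.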
Two big caveats (for Spark at least): first, you'll have to restrict yourself to the Spark subset of Ada (mostly: no pointers, no aliasing -- but this subset is expected to grow as the proof tools improve...); second, automated proof with floating-point numbers is not there yet, and it is a hard research topic.
If you look around there's also EVE (for Eiffel), which B. Meyer mentioned some time ago on his blog. They didn't have to add a specification language or supplement their original language; it was built into Eiffel from the start :-D.
One funny thing about specification languages is that they 'need' to be expressive to be useful. The Ada 2012 spec added lots of sugar to help with writing contracts: quantifiers over collections/arrays (for all, for some), if- and case-expressions (pattern matching isn't far off), and expression functions... and, in Spark 2014, contract-cases. I expect some more in Ada 2020. Since contracts must be executable, you get all that sugar for your regular Ada too.
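A quick sketch of that sugar in one place (types and names hypothetical, illustrative only):

```ada
--  Expression function + quantified expression (Ada 2012):
function All_Positive (V : Int_Array) return Boolean is
  (for all I in V'Range => V (I) > 0);

--  If-expression in a postcondition, plus Spark 2014 contract-cases:
procedure Saturate (X : in out Integer; Max : Integer)
  with Post => X = (if X'Old > Max then Max else X'Old),
       Contract_Cases =>
         (X > Max  => X = Max,      --  guard on the pre-state,
          X <= Max => X = X'Old);   --  consequence on the post-state
```

Contract_Cases additionally obliges the prover to show that the guards are disjoint and complete, which catches missing-case bugs in the spec itself.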
That depends on the tool. OpenJML just uses SMT solvers, but if you need bigger guns, Krakatoa lets you use either SMT solvers or a proof assistant (either Isabelle or Coq).
Do you think such methods can be expanded to handle more of it, like Spark does for runtime errors? Or will it stay really hard?
My guess (as a bystander and user of the tech) is that it'll stay hard for 5-10 years (slow progress of research projects, lots of lead bullets, and far more cores on our dev machines...). If you're motivated and ready for semi-automated proof, you can probably use lemmas (pre-proven properties) in Spark, but if you can't find your lemmas in the Spark toolset, then good luck with Coq or Isabelle.
Some encouraging stuff is going on in the FP static analysis world though, with projects like Herbie (based on sampling only, IIRC) and the Toccata lab, and I've heard some good things about Numalis (a start-up).
Edit: it seems Gappa can be integrated through Why3, and it seems they did some work on collaboration with Alt-Ergo, so it should be usable through Spark. Something to check in the next release :-D.
That's true, but may be changing.
> a) the state space is much smaller than for software (although it can still be prohibitively large for complex designs and brute force formal methods)
That's true as well, but that is why some formal methods -- like my favorite, TLA+, and all the sound static analysis tools (they work by abstract interpretation) -- try to tackle the problem of software verification by abstraction (in a precise mathematical sense, namely a description of the system using a sound approximation with fewer states). In TLA+ you can even describe your system at different levels of detail and show that they form a sound abstraction/refinement relation. This is not to say that it's a silver bullet -- it takes work and thought, and isn't appropriate for everything -- but a system like TLA+ lets you at least choose the level of detail you wish to work at, and so the cost you'll have to pay. The lower the level, the more confidence you get, but the more work you need to do. The early signs of TLA+'s success in industry show the practice to be worthwhile in some/many very important -- and quite common -- circumstances.
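To make "choosing the level of detail" concrete, here is (roughly) the classic hour-clock example from Lamport's book: a spec that only says the hour advances, ignoring minutes and seconds entirely. A finer-grained spec with minutes can then be shown to refine it.

```tla
---- MODULE HourClock ----
EXTENDS Naturals
VARIABLE hr
Init == hr \in 1..12
Next == hr' = IF hr = 12 THEN 1 ELSE hr + 1
Spec == Init /\ [][Next]_hr
====
```

At this level the model checker explores only 12 states; every detail you add buys confidence at the cost of a bigger state space.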
True -- "trial runs" for ASIC production cost impossibly huge amounts of money, as mask sets are extremely expensive to make even for a three-generations-old process.
Only top players can afford them, and even some of those choose to limit themselves to extensive verification.
Another big thing is time: a trial run in actual silicon takes many times longer than the most extensive verification run you can buy for an equivalent amount of money on a current mainstream litho process.
There are only a few software products that are formally verified: some data diodes and real-time operating systems (at least partially).
That depends on what you mean by "formally verified". If you mean verification all the way from global, high-level properties down to program code or even machine code -- what's known as end-to-end verification -- and that the verification method is deductive proof, then you're right. But there is plenty of embedded software that's verified end-to-end with model checkers, and a growing list of popular software that's verified by various means, though not end-to-end -- for example, most of Amazon's web services.
For everything else, people use good old functional verification, because the state complexity of more complex designs simply explodes.