
A small British firm shows that software bugs aren't inevitable (2005) - cjg
http://spectrum.ieee.org/computing/software/the-exterminators
======
stevoski
Even algorithms "proved" correct sometimes turn out to have bugs:

"I was shocked to learn that the binary search program that Bentley proved
correct and subsequently tested in Chapter 5 of Programming Pearls contains a
bug."

That's from Joshua Bloch's discovery in 2006 that Java's binary search
implementation was broken:
[http://googleresearch.blogspot.com.es/2006/06/extra-extra-read-all-about-it-nearly.html](http://googleresearch.blogspot.com.es/2006/06/extra-extra-read-all-about-it-nearly.html)
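
The root cause Bloch describes is integer overflow in the midpoint computation. A minimal Java sketch (hypothetical method names) of the broken form and the standard fix:

```java
public class Midpoint {
    // Broken: low + high can exceed Integer.MAX_VALUE and wrap negative,
    // so the "midpoint" comes out as a negative array index.
    static int midBroken(int low, int high) {
        return (low + high) / 2;
    }

    // Fixed: for 0 <= low <= high, the difference never overflows.
    static int midFixed(int low, int high) {
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        int low = Integer.MAX_VALUE - 1, high = Integer.MAX_VALUE;
        System.out.println(midBroken(low, high)); // negative, due to overflow
        System.out.println(midFixed(low, high));  // the correct midpoint
    }
}
```

Bentley's proof holds for mathematical integers; it's only once `low + high` must fit in 32 bits that the bug appears.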

~~~
georgerobinson
Can you elaborate on this further?

Would this mean that Bentley's proof was in fact incorrect (i.e. not a valid
proof at all) or that it simply didn't consider integer overflow for fixed
size ints?

I wonder to what extent the implementation (i.e. the size of the integer type
used) should affect the proof that an algorithm is correct?

~~~
chriswarbo
I suppose different people may answer your questions in different ways,
depending on how they define words like "proof" and "correct". I would answer
like this: the proof was correct, but it was solving the wrong problem
(unbounded ints rather than bounded).

 _Inside_ a formal system it's relatively easy to understand what's going on;
the real difficulties are at the boundaries: how do we build a formal model of
a real system, and how do we translate formal results back to the original
system? It's a lot like programming: manipulating data is easy, the problems
are getting the right data and interpreting what the output means.

In this case, the model didn't capture overflow; so the proof is valid _for
that model_, but it's not a useful model for Java's ints.

------
knappador
A small, referentially transparent function has always been easy to write
bug-free, assuming the OS doesn't lose a page-table entry, a network interrupt
isn't exploitable, and cosmic rays don't flip bits. The issue has never been
that we can't write something that's correct.

The issue is getting that correctness to propagate, which depends on the
strictness of all the tools involved and the accuracy of their construction.
That leaves us needing either to automate proofs with Coq or to prove things
in more general ways that lead to undecidable problems, etc. Still, the fact
that one function can be written bug-free and known to be bug-free does
indicate that bugs are not an inevitability of probability playing out as code
grows.

We have null-pointer exceptions and no Maybe type. The APIs that emit nulls
sometimes make me expect sewage to leak from light sockets. It's possible to
do better. We just don't, and none of us love it. Rust and Kotlin are at least
very exciting. I'd like to understand some Coq more in the context of x86.
What routines are strictly necessary to even do Coq? A response based on some
statement about how the Y-combinator enables X where X leads to Y would not
surprise me.

~~~
efaref
With nicer syntax, nulls are equivalent to Maybe types. Recent C#, for
example, has the `?.` and `??` operators, which are flatMap and orElse.
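
The same pattern can be sketched in Java's `Optional` terms (a hypothetical `lengthOrZero`, standing in for C#'s `s?.Length ?? 0`):

```java
import java.util.Optional;

public class NullOps {
    // C#: s?.Length ?? 0  -- null-conditional (?.) plus null-coalescing (??)
    static int lengthOrZero(String s) {
        return Optional.ofNullable(s)       // lift the nullable into Optional
                       .map(String::length) // the ?. step
                       .orElse(0);          // the ?? step
    }

    public static void main(String[] args) {
        System.out.println(lengthOrZero("hello")); // 5
        System.out.println(lengthOrZero(null));    // 0
    }
}
```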

~~~
chriswarbo
> With nicer syntax, nulls are equivalent to Maybe types.

No they're not, since you can't nest NULLs.

Say I have a database of "Users", with "birthdates" mapping a "User" to a
"DateOfBirth" and "spouses" mapping a "User" to a "User", I can look up a
User's DateOfBirth either using a Maybe or by allowing NULLs:

    
    
        -- Maybe
        getDOB :: User -> Maybe DateOfBirth
    
        -- Nullable
        getDOB :: User -> DateOfBirth
    

I can also look up a User's spouse with either a Maybe or a NULL:

    
    
        -- Maybe
        getSpouse :: User -> Maybe User
    
        -- Nullable
        getSpouse :: User -> User
    

So far they're pretty much equivalent. However, what if I want to look up a
spouse's DateOfBirth?

    
    
        -- Maybe
        spouseDOB :: User -> Maybe (Maybe DateOfBirth)
        spouseDOB u = fmap getDOB (getSpouse u)
    
        -- Nullable
        spouseDOB :: User -> DateOfBirth
        spouseDOB u = let s = getSpouse u
                       in if s == NULL then NULL
                                       else getDOB s
    

These are no longer equivalent. In the Maybe case we can distinguish between
three kinds of values:

- `Nothing` indicates that we can't find the spouse

- `Just Nothing` indicates that we can find the spouse but not the DOB

- `Just (Just x)` indicates that we found both the spouse and the DOB

In the nullable case we can only distinguish between two kinds of values:

- `NULL` indicates that _either_ we couldn't find the spouse _or_ we couldn't
find the DOB, but we don't know which

- Anything else indicates that we found the spouse and the DOB

The Maybe approach gives us strictly more information; although we can choose
to ignore it if we like, either by using `join` or by replacing `fmap` with
`>>=` (AKA "bind", which is a combination of `fmap` and `join`).
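
For comparison, the same distinction shows up with Java's `Optional`, where `map` preserves the nesting and `flatMap` collapses it (hypothetical stand-ins for `getSpouse`/`getDOB`):

```java
import java.util.Optional;

public class SpouseDOB {
    // Hypothetical stand-ins: alice has a spouse (bob), bob has no DOB on file.
    static Optional<String> getSpouse(String user) {
        return user.equals("alice") ? Optional.of("bob") : Optional.empty();
    }
    static Optional<String> getDOB(String user) {
        return user.equals("bob") ? Optional.empty() : Optional.of("1970-01-01");
    }

    public static void main(String[] args) {
        // map (fmap) keeps the nesting: all three states are distinguishable.
        Optional<Optional<String>> nested = getSpouse("alice").map(SpouseDOB::getDOB);
        // flatMap (>>=) collapses the nesting, like the NULL version.
        Optional<String> flat = getSpouse("alice").flatMap(SpouseDOB::getDOB);

        System.out.println(nested); // spouse found, DOB not: Optional[Optional.empty]
        System.out.println(flat);   // can't tell which lookup failed: Optional.empty
    }
}
```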

~~~
david-given
Yeah, but you're comparing apples and oranges. The nullable case isn't
preserving the information you want, because you haven't written the code to
preserve it, while in your Maybe case, you have. The two examples you give
aren't doing the same thing.

Your nullable example is better compared to this, which also discards the
required information:

    
    
        spouseDOB :: User -> Maybe DateOfBirth
        spouseDOB u = case getSpouse u of
                        Just s  -> getDOB s
                        Nothing -> Nothing
    

Likewise, the nullable case, modified to preserve it (in a language which
actually has nulls this time):

    
    
        DateOfBirth* spouseDOB(User& user) {
          Spouse* spouse = getSpouse(user);
          if (spouse)
            return &spouse->dateOfBirth;
          return NULL;
        }
    

Maybes are just pointers. There's nothing magic about them.

~~~
chriswarbo
> Your nullable example is better compared to this, which also discards the
> required information:

Exactly. Your version is just doing what `join` or `>>=` would do, which I
mentioned at the end; i.e.

    
    
        mySpouseDOB u = fmap getDOB (getSpouse u)
    
        -- Using join
        yourSpouseDOB u = join (mySpouseDOB u)
    
        -- Using >>=
        yourSpouseDOB u = getSpouse u >>= getDOB
    

> Maybes are just pointers. There's nothing magic about them.

Of course there's nothing magic; `Maybe` just adds one extra value to a type.
I was phrasing myself in the context of a language like Java, which has NULL
but no pointer manipulation.

------
taspeotis
Related reading: how NASA does it [1].

[1] [http://www.fastcompany.com/28121/they-write-right-stuff/](http://www.fastcompany.com/28121/they-write-right-stuff/)

~~~
ckozlowski
That was a fantastic read, thanks. =)

------
jaziek
Interesting.

I think the point about the mathematics being too difficult for a "rank and
file" programmer is probably a little off base. It isn't so much that the
concepts are difficult as the fact that it seems prohibitively time consuming
for relatively little gain that would put most people (especially those
managing a project) off using formal specification / verification to build
their software.

In most cases besides safety-critical systems, it is acceptable to ship a
product in 1/10th of the time, knowing that there is a chance that there are
imperfections.

Additionally, having used the Z language mentioned in the article, and taken
some classes in university on formal methods, the thing that put me off the
most was the (necessary) verbosity of the specification languages used for
these types of things. I never wrote something that wasn't incredibly trivial,
because it would have taken too long.

~~~
DennisP
As a rank and file programmer who didn't take university classes on formal
methods, I've found it pretty difficult to dig up accessible tutorials.

~~~
mannykannot
This wasn't my introduction to these methods, but I wish it was:

[http://www.amazon.com/Error-Free-Software-Know-How-Correctness-Engineering/dp/0471930164](http://www.amazon.com/Error-Free-Software-Know-How-Correctness-Engineering/dp/0471930164)

------
throwaway7767
Can we change the title here? It's directly quoting the byline, but it
contradicts the article.

Title/byline: "A small British firm shows that software bugs aren't inevitable
(2005)"

Article: "Praxis doesn't claim it can make bug-free software, says Amey, now
the company's chief technical officer. But he says the methodology pays off.
Bugs are notoriously hard to count, and estimates of how common they are vary
hugely. With an average of less than one error in every 10 000 lines of
delivered code, however, Praxis claims a bug rate that is at least 50--and
possibly as much as 1000--times better than the industry standard."

It's impressive, and I'd like to see more formal verification in software
especially for security-critical components. But the title is factually
incorrect.

~~~
dang
I can't think of a better (more accurate and neutral) title right now, but if
you or anyone suggests one, we can change it.

------
mamcx
Note this:

""" Only after Praxis's engineers are sure that they have logically correct
specifications written in Z do they start turning the statements into actual
computer code. The programming language they used in this case, called Spark,
was also selected for its precision. Spark, based on Ada, a programming
language created in the 1970s and backed by the U.S. Department of Defense,
was designed by Praxis to eliminate all expressions, functions, and notations
that can make a program behave unpredictably. """

It's not only that our industry lacks discipline; it's that almost everyone
is so resistant to using better tools.

Why do some insist on using C++/JavaScript/PHP/MySQL/Mongo/etc. (tools with
bad design, needless complexity, bug-prone flaws) with the _excuse_ that it's
possible to use them "well" _if only_ we are more "disciplined" and "pay
attention"?

When tools are bad, discipline is not the answer. The answer is to _fix_ the
tool, or get rid of it.

Developers understand that if an end-user has a high error rate with a
program, it's a problem _with the program_; so why, when that happens with a
language or tool for developers, don't they think the same?

------
adrianN
I think an important thing that we're currently missing is some kind of
quantitative assessment of what influence various measures have on software
quality. All we have are "best practices" that all sound like common sense,
but without much data backing them.

How much do formal methods really buy us in terms of quality? What's the
defect rate if you use language X vs language Y? Is it better to spend lots of
time gathering requirements and then produce something that exactly matches
the requirements or is it better to iterate as quickly as possible and gather
requirements as you go? What's the impact of the team as compared to the
methodology?

I suspect the answers to these questions are really murky, and it's not clear
whether there is a single methodology that produces the best quality for the
money invested.

~~~
HeyLaughingBoy
Oh, we have those. The metrics gathered by a team doing TSP
([https://en.wikipedia.org/wiki/Personal_software_process](https://en.wikipedia.org/wiki/Personal_software_process))
are useful since the SEI
([https://en.wikipedia.org/wiki/Software_Engineering_Institute](https://en.wikipedia.org/wiki/Software_Engineering_Institute))
has data on thousands of projects and the metrics do provide good guidance
_when used appropriately_. It is possible to consistently achieve < 0.1
defects/kloc using TSP.

The problem is, having spent seven years on a PSP/TSP team, that it greatly
slows the development process so it's really only useful for those few types
of projects where the cost of quality is worth it. For the vast majority of
software, I'd say that it's better to accept a small number of minor defects
in order to ship faster.

------
scorchio
For engineers, the last paragraph is the most pertinent:

"The key weapon is abstraction," he says. "If you can build abstractions well
enough, you should be able to break things down into bits you can handle."
That maxim guides every other discipline in engineering, not least the design
of computer hardware. Why not apply it to software, too?

~~~
andyjohnson0
_" Why not apply it to software, too?"_

On the whole, we do. But sometimes abstractions leak [1]. Sometimes you get
impedance mismatches when you try to layer them. Sometimes the abstraction
isn't what you think it is or want it to be.

An example of the last point is alluded to in a comment by stevoski about a
binary search algorithm that turned out to be broken because it assumed that
an integer type in the implementation language behaved the same way as
integers in mathematics; it didn't, because it didn't allow arbitrary-size
values. The abstraction provided by the integer type was faulty.

[1]
[https://en.wikipedia.org/wiki/Leaky_abstraction](https://en.wikipedia.org/wiki/Leaky_abstraction)
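
That faulty integer abstraction is easy to exhibit; a minimal Java sketch:

```java
public class IntLeak {
    public static void main(String[] args) {
        // Mathematically, MAX + 1 > MAX; in 32-bit two's complement it
        // silently wraps around to the smallest representable value.
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE); // true
    }
}
```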

~~~
mannykannot
If one were using formal methods in these circumstances, they should, in
principle, reveal those leaks and impedance mismatches to you (or, more
precisely, reveal any problems they cause in your specific design). The
existence of leaks and impedance mismatches in informal abstractions is an
argument for formality, not against it.

~~~
andyjohnson0
I agree in principle, although in practice I am sceptical about the
practicality of applying formal methods to large systems. The ratio of effort
to reward just seems too large for most domains.

------
boothead
Interesting read. I actually used the first Mondex system at Exeter Uni and it
worked very well. Never knew why before :-)

------
osullivj
LTU has a good discussion on Fetzer's classic critique of program
verification: [http://lambda-the-ultimate.org/node/2783](http://lambda-the-ultimate.org/node/2783)

