

Bug finding is slow in spite of many eyeballs - jonaslejon
http://daniel.haxx.se/blog/2015/02/23/bug-finding-is-slow-in-spite-of-many-eyeballs/

======
netcan
Despite being big and different and having developed some grumpy old forum
symptoms, HN is still smart. As such, I think we have a great example of
memetic churn and progression here. We start with manifesto-ey essays trying
to cut through some habitual thinking and paradigms. The new fundraising
model. Bootstrapping. Web-based SaaS is an ocean with a lot of paradigm
shifts, the tech options, the business models, the development cycle. Those
get people excited and defensive and there's fun to be had arguing it out.
There's all sorts of pedantic arguing on the fringes pointing out that this
thing's not _strictly_ true. Then there are manifesto-busting pieces. Case
studies and anecdotes about manifesto-ed approaches going terribly wrong.

I'm not down on this process, it's progress.

Anyway, I think this is a part of it. "Enough eyeballs" is a big hairy hefty
concept, part of the big hairy open source concept. No one could have
predicted open source. The cultural and economic (BTW, economy is a big
part of culture) machinations of open source add up to something big and
impressive and unpredictable. "Enough eyeballs" is a (somewhat squishy) big
concept trying to make sense of it. This is like trying to boil down 'platform
wars' or 'platform openness' into slogans and rules of thumb. You may come up
with an interesting, informative and useful way of looking at things. But,
underneath it is a big hairy tangle of machinations, circumstances and ungodly
complexity. Hence the need for rules of thumb and manifestos in the first
place.

This post is part of the more mature stage. The audience is expected to be on
Linus' side already and the novel part is discovering that the rule of thumb
isn't working in all cases. Digging into the machinations. Who are all these
eyeballs? What are these bugs? Are some types of eyeballs more or less
effective against some bugs?

I'm not sure where I'm going with this… be excellent to each other?

~~~
shin_lao
Anyone who has a couple of years in software engineering knows that there are
diminishing returns on the number of "eyeballs" and that some problems are
just goddamn hard, or need to be looked at with the problem in mind.

Software quality is orthogonal to the openness or closedness of the source
code.

I think it is more about the will to do something great, the ability to listen
to the user base and of course the technical skills of the authors.

~~~
lmm
The article shows it's not about that:

> Perhaps you think these 30 bugs are really tricky, deeply hidden and
> complicated logic monsters that would explain the time they took to get
> found? Nope, I would say that every single one of them are pretty obvious
> once you spot them and none of them take a very long time for a reviewer to
> understand.

For me the takeaway is that even something as basic and mature as curl can't
get it right in C. Two buffer overflows in as many years, along with TLS and
HTTP failures. It's past time to move onto better tools. I hope the "bitcoin
piñata" calls attention to the fact that there's now a full SSL stack
available in a substantially safer language.

~~~
jacquesm
It's the devil that we know. New environments: new exploit classes.

~~~
schoen
There are some great examples of that (return-oriented programming as a way of
getting around non-executable data segments), but surely there are _some_
tools where adopting them was a pure win for safety and correctness over the
status quo.

------
fdej
Eventually, we need to move towards formally verified software. Some bugs slip
past any number of human eyeballs. It won't be feasible to do formal
verification of all software anytime soon, but it should be done for operating
system kernels, networking and crypto libraries, virtual machines, compilers,
and similar security-critical software.

Switching to programming languages with type systems powerful enough to
protect against common classes of bugs is a good first step (but not enough).

~~~
mcdoug
I feel like if we did this we'd never get to an actual working system. How do
you formally verify that a remote file system interacting with a faulty hard
disk is doing the right thing? This reminds me of Hurd, which is arguably a
superior design that will never be finished.

~~~
fdej
That's certainly an issue, and there's no easy way around the fact that even
formally specifying the behavior of a program that has to interact with the
outside world is problematic.

Nonetheless, some of the academic research that has been done on formal
verification is quite impressive (including the development of actual
nontrivial working software). And it can be done piecemeal -- you could
formally certify the functional parts of a crypto library (and demand this
standard of future replacements) independently of what goes on in the world of
file system drivers.

Realistically speaking, as worse is better, I think a large push for formally
verified software probably won't happen for decades to come, but it will
happen eventually. It already happened in parts of the hardware industry
(Intel certainly don't want the division bug to happen ever again). The main
issue is that we need tools that make development easier, to allow keeping up
with the rapidly changing requirements in the software world.

~~~
jdc
What are some examples of non-trivial working formally verified software?

~~~
maemre
There is Quark [1], a web browser with a formally verified kernel. Here, the
kernel is a process which manages other slave (helper) processes which
actually render the page.

[1]: [http://goto.ucsd.edu/quark/](http://goto.ucsd.edu/quark/)

------
krig
"Because in reality, many many bugs are never really found by all those given
“eyeballs” in the first place. They are found when someone trips over a
problem and is annoyed enough to go searching for the culprit, the reason for
the malfunction."

That /someone/ is able to go searching for the culprit instead of having to
rely on someone ELSE to look at the source and figure out what is going on is
the whole point of the quote, no?

~~~
dwc
Yes, open source is indeed awesome that way. Anyone can, in theory, go chase
down that bug they tripped over. And many people do! But not nearly as many
people as are capable of it (say, developers with some familiarity with the
language, et al).

I've made just a few bug fixes to open source software that I didn't have some
ownership of. From talking with other devs over the years, that makes me
quite unusual, in that almost none of them have made _any_ fixes to _other
people's_ code. And here I was feeling bad that I hadn't done more.

~~~
krig
I don't think it should be surprising that few users actually look at the code
or are willing to dig into a foreign code base. It's still true that open
source makes it possible, which is a huge step up from any other model.
There's a lot of backlash right now due to some very high profile bugs that
have been around for a very long time. But would those bugs have been found if
the programs hadn't been open? Also, look at what happened after heartbleed:
Another group of people decided to dig into the OpenSSL code base and try to
clean it up, finding and fixing lots of other issues, without any authority
from the original authors. That's the benefit of openness, in my opinion.

------
toddkaufmann
So how many bugs remain?

Mostly rhetorical question, but can any extrapolation be done? If you go back
five years, can any of those numbers correlate to the findings since? Do any
metrics such as cyclomatic complexity, #defects/kLoC[1][2], unit tests or code
coverage help?

In most cases the definition of "defect" is not well-defined, nor in many
cases easily comparable (e.g., a typo in a debug message compared to handling
SSL flags wrong). Is it a requirements or documentation bug: the specification
given to the implementer was not sufficiently clear, or was ambiguous? Also, when
do we start counting defects? If I misspelled a keyword and the compiler
flagged it, does that count? Only after the code is committed? Caught by QA? Or
after it is deployed or released in a product?

Is it related to the programming language? Programmer skill level and fluency
with language/libraries/tools? Did they not get enough sleep the night before
when they coded that section? Or were they deep in thought thinking about 4
edges cases for this method when someone popped their head in to ask about
lunch plans and knocked one of them out? Does faster coding == more
"productive" programmer == more defects long term?

I'm not sure if we're still programming cavemen or have created paleolithic
programming tools yet[3][4].

p.s.: satisfied user of cURL since at least 1998!

    
    
        [1] http://www.infoq.com/news/2012/03/Defects-Open-Source-Commercial
        [2] http://programmers.stackexchange.com/questions/185660/is-the-average-number-of-bugs-per-loc-the-same-for-different-programming-languag
        [3] https://vimeo.com/9270320 - Greg Wilson - What We Actually Know About Software Development, and Why We Believe It's True
        (probably shorter, more recent talks exist (links appreciated))
        [4] https://www.youtube.com/watch?v=ubaX1Smg6pY - Alan Kay - Is it really "Complex"? Or did we just make it "Complicated"?
        (tangentially about software engineering, but eye-opening for how much more they were doing, and with fewer lines of code) (also, any of his talks)

------
berkes
Technically it is possible to use statistics as "proof" with N=30, but it is
stretching it a bit, IMHO.

For example, stating that the number of reports per year corresponds to the
amount of code added because both are "somewhat linear" is not very
solid. I could just as well state that the amount of reports per year is
"somewhat exponential" and conclude that it does not correspond to the amount
of lines of code added.

This does not make the overall point any less true, it is just that the
foundation, the numbers, is too thin to draw any grand conclusions from.

~~~
jmorrison
Might I humbly suggest that anybody serious about this issue read (sadly, the
late) Manny Lehman's "FEAST" publications? He attempts to quantitatively model
software evolution, which includes complexity, errors of omission (limitations
of domain model), errors of commission ("bugs"), etc. It is fascinating
reading. I remember many "Aha!" moments when seeing the graphs. It also
contains many quantitatively-derived principles one can operate by, some of
which underlie pg's "beating the averages" argument. His wikipedia page is
here
[https://en.wikipedia.org/wiki/Manny_Lehman_%28computer_scien...](https://en.wikipedia.org/wiki/Manny_Lehman_%28computer_scientist%29),
and the FEAST pubs are here
[http://www.eis.mdx.ac.uk/staffpages/mml/feast2/papers.html](http://www.eis.mdx.ac.uk/staffpages/mml/feast2/papers.html)

------
harkyns_castle
I've been using open source for ages, and have rarely taken the time to look
at the source.

But at least it's there.

I sometimes wish there wasn't such motivation to hack software. If we were all
working towards a common good (in my case I'd like to see us doing a bit of
space exploration, renewable energy tech, etc.), we wouldn't _need_ to
exploit stuff.

But, we're humans. Greedy little scumbags hehe, always looking for the short
term gain. All varying forms of politicians.

------
iwwr
Whereas closed source proprietary software has no eyeballs on it. Open source
is at least an opportunity to identify problems by third parties without
reverse-engineering. Open source also allows code analysis tools to do
automated tests across wide numbers of codebases.

~~~
ryanlol
Closed source often has eyeballs specifically paid to... eyeball it.

~~~
awalton
All four combinations exist in nature: Open source code with plenty of
eyeballs, open source software with no eyeballs, closed source software with
tons of reviewers, and closed source software with no reviewers. It'd be
interesting to see what the percentage breakdown is amongst these, but it
probably wouldn't be surprising.

It's also another tick in the box as to why pre-checkin code review is so
important - bad code is often immortal, and it can be really hard to patch out
bugs if people have grown to rely on broken behavior, so it's best not to get
yourself in that state to begin with.

~~~
amelius
Also interesting would be a breakdown by language. For example C++ versus
Haskell.

------
codehero
Linus' law may scale, but even assuming these eyeballs produce bug fixes,
applying those fixes to the source tree does not scale as well.

I have taken time to put my eyeballs on bugs in spidev's ioctl() and TI's spi
driver but my bug fixes are not in the tree.

Signing off, adhering to the source standard and attaining enough respect from
the established devs to get your fix accepted are the limiting factors.

I already invested significant amounts of time finding these bugs and fixing
these issues; I don't have any more to spend to make Mark Brown or other
kernel devs happy.

I don't even care about getting the credit for my fixes, but it seems the
kernel devs don't want to take my code to the next step and get it integrated.

------
jpollock
I've got some different takeaways. First, time-to-discovery is different
from the shallowness of the bug. Increasing the number of eyeballs looking at
the code increases the likelihood that:

    
    
      a) someone will encounter the problem.
      b) that someone will be interested enough to dig into it.
      c) they will then find the problem before giving up.
    

That's the power of many eyeballs. It's an expression of interest. With fewer
eyeballs, you might get bug reports (a), but not have enough people looking to
get (b+c) out of the community. With closed source, (b+c) can _only_ be
provided by the team. Open Source means that this can be provided by a
sufficiently large community.

That open source projects are having security bugs reported can probably be
explained by economics. It's becoming harder to find security problems in
closed source projects (windows), so researchers shift to open source ones.
With an open source project, there's a lot of low hanging fruit to be had with
static analysis, copied code and fuzzing. Closed source is a lot harder, so
people go for where the ROI is good - either towards the rewards or towards
open source for practice.

Finally, shifting to Java just shifts the attack to the JVM, which is just as
hard to secure. I still remember the year of Java exploits, complete with a
remote DoS attack based around sending a floating point number to the
server[1]. There will always be bugs. If your goal is to write secure
software, open source is good. If your goal is to avoid bad press, making it
expensive to test is probably the way to go instead.

[1] [http://www.oracle.com/technetwork/topics/security/alert-cve-...](http://www.oracle.com/technetwork/topics/security/alert-cve-2010-4476-305811.html)

~~~
JoachimS
And: d) the bug report and fix will not be ignored.

Considering how OpenSSL handled bug reports and fixes (by letting them sit in
the tracker for years), or Ulrich Drepper's, let's say, less than welcoming
attitude, the work done by the eyes can be a waste. And the many eyes soon
wither to nil.

------
erikb
Well, the conclusion is not surprising. Bugs aren't found because many people
"read" the source code, but because many people of many different skills use
it, so every problem that is hard for you will some day find a person who
has the specific domain knowledge to fix it much more easily.

Also I'm a little surprised at the size of the dataset and the choices, given
that open source probably fixes an easier bug faster than a higher priority
one.

------
camperman
Linus' Law is not some kind of catch-all that applies to auditing code for
security weaknesses. It specifically refers to the rapid quality control that
happens when you release early and release often - the bazaar method of
software development, as outlined here:

[http://www.catb.org/esr/writings/homesteading/cathedral-baza...](http://www.catb.org/esr/writings/homesteading/cathedral-bazaar/ar01s04.html)

~~~
Animats
Yes, that's the party line. If it were true, the number of open bugs would
decrease over time. For most open source programs, it increases.

Mozilla has passed the 1 million bug mark.

~~~
rndgermandude
To be fair, Mozilla uses Bugzilla to track everything, not just "bug" bugs.
Each new feature in development has a bug, or multiple bugs. When code is
refactored there is a bug for that, when somebody wants commit access there is
a bug for that, when an employee needs a new laptop there is a bug for that,
when a community organizer wants some money or gear for an event there is a
bug for that, and so on... Other organizations use their bug/issue trackers in
a similar manner.

------
JoachimS
I've always read the expression backwards: With enough eyeballs, the shallow
bugs will be found.

Things like spelling errors in print statements, comments, etc. get corrected.
But off-by-one errors, use-after-free, and all kinds of subtle logical
problems that only manifest once in a while and after long execution will
require a focused effort to find (and, more and more, good tools). Daniel's
description matches this sentiment.

~~~
chris_wot
At least you've never read it as "With enough bugs, the shallow eyeballs will
be found".

~~~
JoachimS
True. Nor that enough eyeballs with bugs will be found at the shallow end.

------
DonHopkins
To quote Theo de Raadt:

My favorite part of the "many eyes" argument is how few bugs were found by the
two eyes of Eric (the originator of the statement). All the many eyes are
apparently attached to a lot of hands that type lots of words about many eyes,
and never actually audit code.

