I'm not down on this process, it's progress.
Anyway, I think this is a part of it. "Enough eyeballs" is a big hairy hefty concept, part of the big hairy open source concept. No one could have predicted open source. The cultural and economic (BTW, economy is a big part of culture) machinations of open source add up to something big and impressive and unpredictable. "Enough eyeballs" is a (somewhat squishy) big concept trying to make sense of it. This is like trying to boil down 'platform wars' or 'platform openness' into slogans and rules of thumb. You may come up with an interesting, informative and useful way of looking at things. But underneath it is a big hairy tangle of machinations, circumstances and ungodly complexity. Hence the need for rules of thumb and manifestos in the first place.
This post is part of the more mature stage. The audience is expected to be on Linus' side already, and the novel part is discovering that the rule of thumb isn't working in all cases. Digging into the machinations. Who are all these eyeballs? What are these bugs? Are some types of eyeballs more or less effective against some bugs?
I'm not sure where I'm going with this… be excellent to each other?
Software quality is orthogonal to the openness or closedness of the source code.
I think it is more about the will to do something great, the ability to listen to the user base and of course the technical skills of the authors.
> Perhaps you think these 30 bugs are really tricky, deeply hidden and complicated logic monsters that would explain the time they took to get found? Nope, I would say that every single one of them are pretty obvious once you spot them and none of them take a very long time for a reviewer to understand.
For me the takeaway is that even something as basic and mature as curl can't get it right in C. Two buffer overflows in as many years, along with TLS and HTTP failures. It's past time to move onto better tools. I hope the "bitcoin piñata" calls attention to the fact that there's now a full SSL stack available in a substantially safer language.
The article addresses how hard some of the bugs were (they were not hard), but additionally, sometimes (often?) code needs to be looked at without the problem in mind. Who hasn't written code (or a paper, or a comment on HN), proofed it immediately, committed it, and then realized it has an error your mind elided because you were so sure "what you meant" was there that you glossed over a dumb mistake? Having a different mind review these things avoids that problem.
Edit: I glossed over this -- I was also referring to proofing your work much later, so it's not still fresh in your mind "as you think it is". So, bring a "different mind" to the table.
s/completely different mind/different mind/
I think it's just a matter of being complicated when you dig into it. Open source has certain strategies for quality that work, certain advantages (and disadvantages), and different economies, which means different abundances and scarcities of different resources. This "Linus' Law" expresses something novel about the way open source projects can benefit from the advantages of being what they are. It's not an empty statement, just not an airtight, always-right rule.
A simple model for code quality is something like Q = (good programmers) / (good + bad programmers). I would actually argue that it's not even related to the average, and that bad programmers can degrade quality disproportionately to their actual number. I think it might be something closer to Q = (good programmers)^2 / (good + bad programmers)^2. This is what it seems like people are getting at with the whole "negative productivity" in the good/bad/10x programmer framework.
Openness/closeness of the source code isn't the whole picture but the open-source model can more easily run into the too-many-cooks problem if the source is not properly gate-kept and reviewed in the aggregate. Thus I don't think it's entirely orthogonal. Of course closed-source software is typically commercial, which runs into its own set of pressures that degrade code quality.
I think much more sophisticated testing systems are what will really boost code quality. More advanced detection suites that throw more warnings, mandated full unit test coverage, plus CI frameworks that make sure that your code isn't merged if it doesn't clear the warnings and pass the full-coverage unit test. Randomized address systems and mandated array-bounds checking that catch undefined behavior and off-by-one errors. Basically, making things fail noisily instead of silently and forcing contributors to pay attention.
The C compiler in particular is really bad on the "throwing warnings" front, even before you get yourself into other kinds of trouble. As in, according to the C standard, "rm -rf /" (or literally any other behavior the compiler wants) is a valid output behavior if you miss a closing quote, rather than a compile-time error. That's an absurd definition in a security-minded world, and that's just the most egregious example of undefined behavior allowed by the C standard.
JVM-style managed code and declared exceptions are annoying but do seem to be a step in the right direction from a security perspective.
I'll give you a high grumpy old forum five though.
Switching to programming languages with type systems powerful enough to protect against common classes of bugs is a good first step (but not enough).
Nonetheless, some of the academic research that has been done on formal verification is quite impressive (including the development of actual nontrivial working software). And it can be done piecemeal -- you could formally certify the functional parts of a crypto library (and demand this standard of future replacements) independently of what goes on in the world of file system drivers.
Realistically speaking, since worse is better, I think a large push for formally verified software probably won't happen for decades to come, but it will happen eventually. It already happened in parts of the hardware industry (Intel certainly doesn't want the division bug to happen ever again). The main issue is that we need tools that make development easier, to allow keeping up with the rapidly changing requirements in the software world.
The practice of test-driven development is a great solution to this. The problem is very few people use it, many people are actually against it, and many of those who do do it, do it incorrectly (not on purpose). I would think that small chunks of singly-purposed functionality would be much easier to verify.
I don't mention this in order to criticize test-driven development or the improvements it can bring to software reliability or safety, just to point out that there's still a big gap from there to a formal proof of correctness.
That /someone/ is able to go searching for the culprit instead of having to rely on someone ELSE to look at the source and figure out what is going on is the whole point of the quote, no?
Some apparently "bug free" programs are actually riddled with bugs but they are not found because almost nobody uses them. They are probably not fixed for the same reason ;-)
I've made just a few bug fixes to open source software that I didn't have some ownership of. From talking with other devs over the years, that makes me quite unusual, in that almost none of them have made any fixes to other people's code. And here I was feeling bad that I hadn't done more.
Mostly rhetorical question, but can any extrapolation be done? If you go back five years, can any of those numbers correlate to the findings since? Do any metrics such as cyclomatic complexity, #defects/kLoC, unit tests or code coverage help?
In most cases the definition of "defect" is not well-defined, nor in many cases easily comparable (e.g., a typo in a debug message compared to handling SSL flags wrong). Is it a requirements or documentation bug: the specification given to the implementer was not sufficiently clear or was ambiguous? Also, when do we start counting defects? If I misspelled a keyword and the compiler flagged it, does that count? Only after the code is committed? Caught by QA? Or after it is deployed or released in a product?
Is it related to the programming language? Programmer skill level and fluency with language/libraries/tools? Did they not get enough sleep the night before when they coded that section? Or were they deep in thought about 4 edge cases for this method when someone popped their head in to ask about lunch plans and knocked one of them out?
Does faster coding == more "productive" programmer == more defects long term?
I'm not sure if we're still programming cavemen or have created paleolithic programming tools yet.
p.s.: satisfied user of cURL since at least 1998!
 https://vimeo.com/9270320 - Greg Wilson - What We Actually Know About Software Development, and Why We Believe It's True
(probably shorter, more recent talks exist (links appreciated))
 https://www.youtube.com/watch?v=ubaX1Smg6pY - Alan Kay - Is it really "Complex"? Or did we just make it "Complicated"?
(tangentially about software engineering, but eye-opening for how much more they were doing, and with fewer lines of code) (also, any of his talks)
For example, stating that the number of reports per year corresponds to the amount of code added, on the grounds that both are "somewhat linear", is not very solid. I could just as well state that the number of reports per year is "somewhat exponential" and conclude that it does not correspond to the number of lines of code added.
This does not make the overall point any less true; it is just that the foundation, the numbers, are too few to make any grand conclusions from.
But, at least it's there.
I sometimes wish there wasn't such motivation to hack software. If we were all working towards a common good (in my case I'd like to see us doing a bit of space exploration, renewable energy tech, etc.), we wouldn't need to exploit stuff.
But, we're humans. Greedy little scumbags hehe, always looking for the short term gain. All varying forms of politicians.
Most bugs I have seen figured out have been a collaborative effort. One person finds one part, which leads to the next person figuring something else out, etc. Much harder to do when it is just your job, and you may not even be paid for this work.
Or a motivated attacker to reverse it for exploits.
Shared source (http://en.wikipedia.org/wiki/Shared_source) also has these properties (but not the freedom properties of open source).
Automated testing tools are also available at the binary level.
It's also another tick in the box as to why pre-checkin code review is so important - bad code is often immortal, and it can be really hard to patch out bugs if people have grown to rely on broken behavior, so it's best not to get yourself in that state to begin with.
Do they in fact register the bug, ignore it, file a report, solve it? Without source code we are left to trust that the vendor actually finds and solves bugs. The vendor and/or its staff might be under pressure to leave bugs in place. We just don't know.
Closed source (usually) gives you Ts&Cs, a support model and some level of guarantee of fitness for purpose, which is why we pay to use it. This is not to say that there aren't bugs that seem to take forever to fix, but usually that is not the case, at least not when there is a risk of the product becoming unsalable.
Of course there are closed source software or appliances that just are not worth their cost because they fail to mitigate the risks of having us trust what we cannot see, but these usually fail to gain any significant market share.
Sometimes there are conflicts of interest. We have documented cases of 3-letter agencies paying companies to leave bugs or unsafe options in code. Sometimes the backdoors may be more valuable than the product itself.
I suspect many corporate clients found out these days that their SSL MITM software they used made their infrastructure vulnerable.
Most people using open source software trust it as implicitly as they would have to trust closed source software, they have to, because reviewing and comprehending all of the code running on any typical machine is impossible. Most people simply use open source and don't care about (or aren't capable of) dealing with the code, and of the ones that do, there is no guarantee that they're going to be competent.
Given enough headcount all problems are solved in line with the time plan is the enterprise version of the law. With enough engineers, you can just treat them as replaceable units, just numbers in an Excel sheet. Move them around as needed to meet staffing requirements, or resource allocation requests as they are often called.
(No, I'm not cynic at all.)
I have taken time to put my eyeballs on bugs in spidev's ioctl() and TI's spi driver but my bug fixes are not in the tree.
Signing off, adhering to the source standard and attaining enough respect from the established devs to get your fix accepted are the limiting factors.
I already invested significant amounts of time finding these bugs and fixing these issues; I don't have any more to spend to make Mark Brown or other kernel devs happy.
I don't even care about getting the credit for my fixes, but it seems the kernel devs don't want to take my code to next step and get it integrated.
a) someone will encounter the problem.
b) that someone will be interested enough to dig into it.
c) they will then find the problem before giving up.
That open source projects are having security bugs reported can probably be explained by economics. It's becoming harder to find security problems in closed source projects (windows), so researchers shift to open source ones. With an open source project, there's a lot of low hanging fruit to be had with static analysis, copied code and fuzzing. Closed source is a lot harder, so people go for where the ROI is good - either towards the rewards or towards open source for practice.
Finally, shifting to Java just shifts the attack to the JVM, which is just as hard to secure. I still remember the year of Java exploits, complete with a remote DoS attack based around sending a floating point number to the server. There will always be bugs. If your goal is to write secure software, open source is good. If your goal is to avoid bad press, making it expensive to test is probably the way to go instead.
Considering how OpenSSL handled bug reports and fixes (by letting them sit in the tracker for years), or Ulrich Drepper's, let's say, less than welcoming attitude, the work done by the eyes can be a waste. And the many eyes soon wither to nil.
Also I'm a little surprised at the size of the dataset and the choices, given that open source probably fixes an easier bug faster than a higher priority one.
If Linus's Law and so on really did rely on people encountering the bugs by chance in everyday use of the software, it's no wonder (in hindsight) this doesn't help with a lot of the security bugs we face, many of which would never be triggered randomly in normal use, but rather require constructing elaborate attack scenarios. Even those that might occur by chance are not likely to be repeatable, and so not likely to get reported or analyzed.
Maybe this points to a change in our prototypical concept of a "bug". When ESR first wrote "The Cathedral and the Bazaar", I would have associated "bug" with something like "the TIFFs produced by program A can't be read by program B", or "program C seems to crash if you have a non-ASCII filename", or "program D drops its network connections if more than 65536 packets are received". Today, I would associate it with something like "an attacker who sends a certificate containing an extension with invalid ASN.1 encoding that follows one or more syntactically valid extensions that are marked Critical and that are unknown to the user-agent can get remote code execution" or "an attacker who sends an XML payload that is parsed correctly by library A and incorrectly by library B due to discrepant handling of Unicode escaping can request operations that should be forbidden".
Well, a literal hot-off-the-presses example would be a bug in handling multibyte characters in the regular expression library in Flash Player, which was exploitable:
In other words, these bugs are often complicated artifacts that require research to find and malice to use -- not annoyances and breakages that are frustrating end users every day.
Given the universal quantifier, one would assume that ESR meant security bugs here as well.
Also, it is stated as a law, but ESR did not back it up with much statistical evidence to support it. I would be surprised if it holds in reality for all bugs.
- Does releasing early and often lead to more (non-shallow) eyeballs? Not really: see Heartbleed or the OpenSSL RNG bug in Debian.
- Do you need more eyeballs, or just the right eyeballs? (Given the right eyeballs, all bugs are shallow)
- Isn't the effect of moving to a safer language larger?
The existence of those bugs doesn't disprove the claim. You'd need to show that software that is not released early and often doesn't have (even) fewer eyeballs.
Generally speaking, 'eyeballs' aren't actively looking for problems. They're passively using software and noticing when something seems out of place. Many security bugs will never seem out of place, and are only discoverable when someone is intentionally probing for security bugs.
Mozilla has passed the 1 million bug mark.
Some possibly more useful statistics:
For the "Core" product (read: Gecko), there are as of right now 52k open bugs and 230k closed bugs.
For the "Firefox" product (desktop, not Firefox android), there are 21k open bugs and 128k closed bugs.
Note that these "bugs" include feature requests, so unless you think the number of requested features will decline over time....
I know of at least one study that contradicts this claim:
"We found that with shorter release cycles, users do not experience significantly more post-release bugs and bugs are fixed faster, yet users experience these bugs earlier during software execution (the program crashes earlier)."
Things like spelling errors in print statements, comments etc. get corrected. But off-by-one errors, use-after-free and all kinds of subtle logical problems that only manifest once in a while and after long execution will require a focused effort to find (and, more and more, good tools). Daniel's description matches this sentiment.
My favorite part of the "many eyes" argument is how few bugs were found by the two eyes of Eric (the originator of the statement). All the many eyes are apparently attached to a lot of hands that type lots of words about many eyes, and never actually audit code.