
From personal experience (anecdote alert!), errors are also common in the ostensibly stone-cold-hard field of algorithms in computer science. A few years back I went on a string algorithm kick, and started dredging up old algorithm papers from the 80's on which to build Wikipedia articles.

Often, the papers would get the general idea right, but if implemented as described would not work at all or fail on edge cases. The best example I have is an algorithm to find the lexicographically-minimal string rotation[0]. The simplest and fastest algorithm to do this is based on the KMP string search algo, and is tribal knowledge among ACM ICPC competitors. I thought it was pretty neat and wanted to cement this algorithm in popular knowledge, so I set about researching and writing the Wikipedia article.

I found the KMP-based algorithm in a 1980 paper[1] by Kellogg S. Booth. The paper has very detailed pseudocode which does not work. At all. The tribal knowledge version I inherited had similarities in the general idea of the algorithm (use of the KMP preprocessing step) but everything else was different. I scoured the internet for a retraction or correction, but all I found was a paper written in 1995[2] which mentioned in passing errors in the 1980 paper.
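For the curious, here's roughly what the working KMP-based version looks like. This is a Java sketch of my understanding of the corrected algorithm (the one that ended up in circulation), not Booth's original pseudocode:

```java
public class LeastRotation {
    // Booth's algorithm (corrected form): returns the starting index of
    // the lexicographically minimal rotation of s, in O(n) time, using a
    // KMP-style failure function computed over s doubled.
    static int leastRotation(String s) {
        String t = s + s;               // every rotation is a substring of s+s
        int n = t.length();
        int[] f = new int[n];           // failure function
        java.util.Arrays.fill(f, -1);
        int k = 0;                      // start of the best rotation so far
        for (int j = 1; j < n; j++) {
            char c = t.charAt(j);
            int i = f[j - k - 1];
            while (i != -1 && c != t.charAt(k + i + 1)) {
                if (c < t.charAt(k + i + 1)) k = j - i - 1;
                i = f[i];
            }
            if (i == -1 && c != t.charAt(k)) {
                if (c < t.charAt(k)) k = j;
                f[j - k] = -1;
            } else {
                f[j - k] = i + 1;
            }
        }
        return k;
    }
}
```

For example, `leastRotation("cbab")` returns 2, because "abcb" is the smallest of the four rotations.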

I do wonder exactly how common this is. I emailed a professor who co-wrote one of the papers, and he replied that "it seems to me that all the algorithms (including our own) turned out to have errors in them!" Has anyone done studies into errors in computer science papers?

[0] https://en.wikipedia.org/wiki/Lexicographically_minimal_stri...

[1] http://www.sciencedirect.com/science/article/pii/00200190809...

[2] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.55.9...




There is a point of view that says that computer science conferences exist for the purpose of gaming the tenure system. The name of the game is plausible deniability: you're not supposed to submit papers that contain known false claims, but everything else is fair game. And this has become such an integral part of the culture that technical correctness is no longer a necessary condition for accepting a paper [1]. I think in this light it's quite clear why many scientists are happy to leave their papers hidden behind the ACM paywall.

[1] http://agtb.wordpress.com/2013/04/14/should-technical-errors...


Thank you, that was a fascinating read. It is understandable that technical errors are given a pass, as they aren't the meat of the paper. In the case of the Booth paper, I really should state I do not mean to attack him. The idea of using the KMP preprocess to solve the problem is a wonderful approach and works very well despite the actual implementation being technically incorrect. If I recall, the bug had to do with the termination condition; the algorithm had to run twice as long to terminate correctly. I will say my understanding of the algorithm improved as a result of debugging it!


I think it's pretty common. Here's a note I wrote regarding Luc Devroye's highly-regarded (and excellent!) book on univariate random number generation:

Although not dissenting from the other reviews which tout the comprehensiveness of the treatment and its level of detail, I have to add an unpleasant fact about the algorithms: the codes may not work as written, and if they don't, there's not an easy way to track down the problem. (This is because of the nature of the constructions used in the complex constant-time algorithms -- this opaqueness is not a problem for the elementary algorithms which, alas, may not run in constant time.)

A look at the author's web site (currently at errors.pdf off his main page) shows that, e.g., the algorithm on page 511 [of the book] for Poisson r.v.'s has four serious bugs as originally published. This means that the main algorithm for one of the most important discrete distributions was not coded and tested by the author before the book appeared!

In fact, I believe this algorithm has at least one more bug, because I'm still seeing a small off-by-one anomaly in my implementation. The algorithm for binomial r.v.'s may have trouble as well -- I see problems for N=400, p=0.05. After 10 million draws (i.e., enough to get good statistics) I see deviations of counts in some bins near the peak (i.e. number of integer outcomes of the R.V.) of 8 standard deviations from the expected number of counts. So, be careful, and consider alternate implementations of the more complex algorithms.

There's a lot of detail in the book, and the techniques are valid. But as we all know, implementation sometimes reveals algebra errors!
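A practical tip when implementing from the book: keep a deliberately dumb reference sampler around to cross-check the fancy constant-time algorithms against. Here's Knuth's product-of-uniforms Poisson sampler in Java; it's O(lambda) per draw, but short enough to verify by eye:

```java
public class PoissonSample {
    // Knuth's simple Poisson sampler: multiply uniforms until the product
    // drops below exp(-lambda). Slow for large lambda, but trivially
    // correct, which makes it a good oracle for testing complex samplers.
    static int poisson(double lambda, java.util.Random rng) {
        double limit = Math.exp(-lambda);
        double product = rng.nextDouble();
        int k = 0;
        while (product > limit) {
            product *= rng.nextDouble();
            k++;
        }
        return k;
    }
}
```

Comparing bin counts from this against a constant-time implementation, over a few million draws, would have caught the kind of 8-sigma deviations described above.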


Implementations of common algorithms are even worse.

The Java binary search implementation had a bug that eluded detection for nine years, and it was based on an implementation from the 1986 book "Programming Pearls" that contained the same bug (TL;DR: it's an integer overflow in the midpoint calculation that computers in 1986 would probably never have triggered; who could imagine having an array with more than 2^30 elements?!).

Even worse: "While the first binary search was published in 1946, the first binary search that works correctly for all values of n did not appear until 1962" -- and this bug suggests it is likely that even the 1962 version would have failed the same way.

See http://googleresearch.blogspot.com.au/2006/06/extra-extra-re...
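The whole bug boils down to one expression. A minimal Java sketch of the broken and fixed midpoints (variable names mine, not from the JDK source):

```java
public class MidpointOverflow {
    // The buggy midpoint from the 1986 "Programming Pearls" code and
    // Java's pre-2006 Arrays.binarySearch: (low + high) / 2 wraps
    // negative once low + high exceeds Integer.MAX_VALUE, i.e. for
    // arrays bigger than about 2^30 elements.
    static int buggyMid(int low, int high) {
        return (low + high) / 2;
    }

    // The fix: compute the offset first, so the intermediate value
    // never exceeds the larger index.
    static int safeMid(int low, int high) {
        return low + (high - low) / 2;
    }
}
```

With `low = 1 << 30` and `high = (1 << 30) + 2`, the buggy version returns a negative number, which the search then uses as an array index.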


Curiously, most C programs with 2^30-element arrays will probably have 64-bit ints, and will therefore work. At the time the original code was written, I suppose the ints would have been 16-bit?


This is a problem in computational electrophysiology as well. There are a lot of typos and other simple mistakes in classic papers. Say you want to implement a finite-differences model for a certain type of voltage-gated potassium channel. If you go back to the original paper, it's not uncommon to find minus signs omitted, parentheses misplaced, or other unfortunate bugs. It can take a lot of head scratching and wasted time to get to the point where you can reproduce the figures from the paper!

Granted, when something has been in the literature for a long time, the derivative papers and popular implementations (in, e.g., NEURON) are usually right, but there is rarely anything in the scholarly record that documents these errors. It's all tribal knowledge and side channels.
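For a flavor of what these equations look like, here's a sketch of a Hodgkin-Huxley-style potassium gating variable in Java, using the standard modern-convention rate constants (V in mV, time in ms) and forward Euler, which is exactly where a dropped minus sign or misplaced parenthesis bites:

```java
public class KChannelGate {
    // Hodgkin-Huxley potassium-channel activation variable n.
    // alpha and beta are the classic squid-axon rate constants in the
    // modern convention (resting potential near -65 mV).
    static double alphaN(double v) {
        return 0.01 * (v + 55.0) / (1.0 - Math.exp(-(v + 55.0) / 10.0));
    }
    static double betaN(double v) {
        return 0.125 * Math.exp(-(v + 65.0) / 80.0);
    }
    // One forward-Euler step of dn/dt = alpha*(1 - n) - beta*n.
    static double step(double n, double v, double dt) {
        return n + dt * (alphaN(v) * (1.0 - n) - betaN(v) * n);
    }
}
```

A quick sanity check: at rest (V = -65 mV), n should relax to alpha/(alpha+beta), which is about 0.318 -- if your implementation settles somewhere else, go hunting for a sign error.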


Ugh, that sounds terrible. During a previous internship at a HPC company[0] I implemented a computational electrodynamics FDTD algorithm as given in the Taflove book[1], and I made more than enough errors even without the book containing mistakes! Two fields, each with three components and subtly different equations for all. What a nightmare. Especially since it's impossible to tell what is wrong when watching the EM wave propagate in an impossible oblong fashion during your simulation.

[0] http://www.acceleware.com/

[1] http://www.amazon.ca/Computational-Electrodynamics-Finite-Di...
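To show how little it takes to go wrong, here's a toy 1-D Yee-scheme update in normalized units (a sketch with the Courant number baked in at 1, not the full Taflove formulation): the two update loops differ only in which neighbor gets subtracted, and swapping them is all it takes to get that impossible oblong propagation.

```java
public class Fdtd1d {
    // Minimal 1-D FDTD (Yee) leapfrog step in normalized units.
    // Ez lives on integer grid points; Hy lives on the staggered
    // half-grid (hy[i] sits between ez[i] and ez[i+1]).
    static void step(double[] ez, double[] hy) {
        for (int i = 0; i < hy.length - 1; i++)
            hy[i] += ez[i + 1] - ez[i];     // H update: right neighbor minus self
        for (int i = 1; i < ez.length; i++)
            ez[i] += hy[i] - hy[i - 1];     // E update: self minus left neighbor
    }
}
```

In 3-D you have six of these loops instead of two, each with its own pair of offsets and signs, which is where the nightmare comes from.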


This is why I think it is important to have a language designed to express algorithms whose correctness can be verified, at least up to a point.


> This is why I think it is important to have a language designed to express algorithms whose correctness can be verified, at least up to a point.

Do you mean math?


Mathematicians make errors. I want something that can be checked formally.


> Mathematicians make errors. I want something that can be checked formally.

So you mean math?




