Have it require approval for each bug report. While it isn't as fast as a fully automatic bug-finding system, it is faster than finding and filing bugs manually. Plus it takes care of those situations where the behavior was intended (poor coding, but it works).
I've had some similar ideas myself, consider me interested. I've written some (very little) code to search projects among the big open source hosts (SF, Google, Github etc). Also have a nice domain for it.
This will be very useful as long as it finds confirmed bugs. Otherwise it will be more like an unasked-for code style check. (For example, one can argue that using functions like strcpy is unsafe, but unless it's really possible to get too many characters into the buffer, it's not a bug.)
"cue debate about tools that hold your hand vs understanding what you are actually writing"
For a start, explain why, in Java, (int) Math.random() is a bug. A programmer who happens upon this thread now knows "this is bad", but has no clue why. A tool will only make the programmer even more reliant on tools and IDEs, so this kind of bug will be eliminated, but the frame of mind that spawned it will live on.
Findbugs, the popular Java tool I linked to in another comment, explains the bug like this:
"A random value from 0 to 1 is being coerced to the integer value 0. You probably want to multiple the random value by something else before coercing it to an integer, or use the Random.nextInt(n) method."
That's about as good an explanation as you'd get from anyone.
The way I see it is this. Simple bugs like this happen because _human beings are flawed._ All it takes is a momentary lapse of concentration and you've put the cast in the wrong place in a method you don't write a test for because you're in a hurry, and all it's doing is generating a random number with a standard API so why bother? (Or it's something you wouldn't normally even test for, like assuming something is re-entrant when it isn't)
We're inevitably going to make a certain number of mistakes a day, and it's our duty to put systems into place that catch those mistakes before they cause any more damage than they should.
Slight difference here. There are actually two types of errors in the Java code:
1. int foo = (int) Math.random() * some_max_value;
The error here is assuming that the multiplication takes place before the truncation. This isn't happening in the Python code because int(expression to truncate) is unambiguous. (+1 to Python here for making it hard to shoot yourself in the foot).
2. int foo = (int) Math.random();
The error here is assuming that Math.random returns something outside [0.0, 1.0). This is the error that all five of the Python examples are showing. (Boo to Python AND Java programmers.)
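To make the first case concrete, here is a minimal sketch of the two spellings side by side. The class and method names are made up for illustration; the only thing it relies on is that a Java cast binds tighter than multiplication.

```java
class CastPrecedenceDemo {
    // Buggy: the cast applies to Math.random() alone, so this is always 0 * max.
    static int buggy(int max) {
        return (int) Math.random() * max;
    }

    // Intended: scale first, then truncate; the result lies in [0, max).
    static int fixed(int max) {
        return (int) (Math.random() * max);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println("buggy=" + buggy(10) + " fixed=" + fixed(10));
        }
    }
}
```

The buggy column prints 0 on every line, while the fixed column varies between 0 and 9.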
Third error: This is not the correct way to randomly pick a number in a set range. The proper way is actually quite complicated. Imagine you do (int) (Math.random() * 10), this could give you numbers from 0 to 10. However, you only get 0 if Math.random() * 10 is less than 0.5, but you get 1 if the value is between 0.5 and 1.5. You are half as likely to see a zero!
I can't speak for Python, but in Java it's quite simple to do it right; Random#nextInt(int) "Returns a pseudorandom, uniformly distributed int value between 0 (inclusive) and the specified value (exclusive)" (also consider using SecureRandom).
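A rough sketch of what that looks like in practice, with a made-up rollDie helper. It takes any java.util.Random, so a SecureRandom can be passed in as well, since SecureRandom is a subclass of Random.

```java
import java.security.SecureRandom;
import java.util.Random;

class NextIntDemo {
    // nextInt(6) yields 0..5 (upper bound exclusive), so add 1 for a die face.
    static int rollDie(Random rng) {
        return rng.nextInt(6) + 1;
    }

    public static void main(String[] args) {
        Random plain = new Random();
        Random secure = new SecureRandom(); // slower, but not predictable from a seed
        System.out.println("plain roll:  " + rollDie(plain));
        System.out.println("secure roll: " + rollDie(secure));
    }
}
```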
At least in Java and some other languages, casting to int truncates rather than rounds the number, so: ((int) (0.9999999)) == 0
But then there's Math.floor and Math.ceil for what you're describing which can be used as well.
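For anyone who wants to see the difference, a small sketch comparing the cast with the Math methods (the truncate helper is just a name I picked for the plain cast). The negative input is where truncation and floor disagree.

```java
class TruncationDemo {
    // A plain cast to int drops the fractional part, i.e. rounds toward zero.
    static int truncate(double x) {
        return (int) x;
    }

    public static void main(String[] args) {
        System.out.println(truncate(0.9999999));    // 0
        System.out.println(truncate(-1.7));         // -1
        System.out.println(Math.floor(-1.7));       // -2.0
        System.out.println(Math.ceil(1.2));         // 2.0
        System.out.println(Math.round(0.9999999));  // 1
    }
}
```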
You (and everyone else) are of course right. I remembered the wrong thing, what I said doesn't apply to that method in Java. What does apply is that there are lots of subtleties in generating random numbers, and there's rarely a reason to re-invent the wheel. There's a good example of how subtle pseudorandomness is in the Java documentation, so I will link that instead of embarrassing myself further: http://download.oracle.com/javase/1.4.2/docs/api/java/util/R...
Actually, the Python equivalent of "Returns a pseudorandom, uniformly distributed int value between 0 (inclusive) and the specified value (exclusive)" would be random.randrange(0, n, 1).
random.randint(a, b) returns a uniformly distributed int value between a (inclusive) and b (inclusive).
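Going the other way, the same exclusive/inclusive distinction can be sketched in Java on top of Random.nextInt. The helper names below are hypothetical, and the inclusive version assumes a <= b and that b - a + 1 does not overflow an int.

```java
import java.util.Random;

class RangeDemo {
    // Like Python's random.randrange(0, n): a value in [0, n), n exclusive.
    static int exclusiveUpTo(Random rng, int n) {
        return rng.nextInt(n);
    }

    // Like Python's random.randint(a, b): a value in [a, b], both inclusive.
    // Assumes a <= b and that (b - a + 1) fits in an int.
    static int inclusiveBetween(Random rng, int a, int b) {
        return a + rng.nextInt(b - a + 1);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        System.out.println(exclusiveUpTo(rng, 5));       // one of 0, 1, 2, 3, 4
        System.out.println(inclusiveBetween(rng, 1, 6)); // one of 1, 2, 3, 4, 5, 6
    }
}
```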
I bet those 5 were written by folks with more (or stronger) skill with C/C++/Java or another strongly typed language where casting is required/common.
Casting is uncommon and "weird" in Python. It usually means you're being unpythonic. In this case you should be using randint or randrange rather than casting to int.
The main cause here is that randomness is hard to test, so that code was just assumed to be correct. With randomness you can never be sure, http://www.random.org/analysis/
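That said, a crude sanity check would have caught this particular bug. The sketch below just counts how often each value shows up; the class, the histogram helper, and the sample sizes are all my own choices, and it is nowhere near a real statistical test of randomness.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.IntSupplier;

class BucketCountDemo {
    // Counts how often each value in [0, buckets) is produced by the source.
    static int[] histogram(IntSupplier source, int buckets, int samples) {
        int[] counts = new int[buckets];
        for (int i = 0; i < samples; i++) {
            counts[source.getAsInt()]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        // Buggy generator: the cast happens first, so every sample is 0.
        int[] buggy = histogram(() -> (int) Math.random() * 10, 10, 10_000);
        // Library method: samples spread over all ten buckets.
        int[] fixed = histogram(() -> rng.nextInt(10), 10, 10_000);
        System.out.println("buggy: " + Arrays.toString(buggy));
        System.out.println("fixed: " + Arrays.toString(fixed));
    }
}
```

With the buggy generator every sample lands in bucket 0, which is hard to miss even without any statistics.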
I know it's safer, but personally I find it harder to read. Especially if the constant is a #define or enum and is similar in format to the variable in question.
I personally like Yoda Conditions, but a few coworkers have pointed out that any linter worth its salt will catch an = inside a conditional, so this doesn't actually gain much in safety. I've stopped using them because a large number of coworkers find them confusing.
Maybe I am feeling dense, but I do not understand what you are trying to point out. None except one are errors as they relate to embedded sql statements (where Equal To is the comparison operator).
The first few examples are fine. It's the unscaled ones later on, like "longarr[i] = (int) Math.random();" and especially "[Math.abs((int) Math.random()) % 3];" where the authors didn't notice that random() "returns a double value with a positive sign, greater than or equal to 0.0 and less than 1.0."
The cast has higher precedence than multiplication, so the scaling doesn't help at all, unless you use parentheses to force the right order of evaluation.
They are all in error. "(int)Math.random()" always evaluates to 0 because Math.random() returns a value in [0.0, 1.0) (i.e., it never returns 1). Casting that return value to an int will always give 0.
Indeed. Take a look at http://www.google.com/codesearch/p?hl=en#TXS-ndgbCls/trunk/j... for instance: if you weren't looking for an incorrect cast, it would be very hard to notice. Two out of three casts are correct: only one is missing the parentheses.
A lot of the results I'm seeing in the search have more problems with order of operations -- so long as you multiply the random() result by a good-sized constant before you cast to int, it actually does what it's "supposed" to do.
This is the point of the post. There are no examples that aren't casting Math.random() to an int before doing anything with it thus using a pretty expensive way to represent 0.
The comment I was replying to was simply saying that casting the return value of Math.random() to an int always gives 0. Which is true, but is not the issue -- order of operations (for people who think the multiplication happens before the cast) is the issue.
> so long as you multiply the random() result by a good-sized constant before you cast to int, it actually does what it's "supposed" to do.
Yeah but since cast binds tighter than multiplication, any code following the pattern `(int) Math.random() * a` is broken and generates 0 every single time.