
A Large Scale Study of Programming Languages and Code Quality in GitHub [pdf] - todd8
http://macbeth.cs.ucdavis.edu/lang_study.pdf
======
lighthawk
"RQ1. Are some languages more defect prone than others" is misleading.

It shows TypeScript with a coefficient of -0.43, Clojure at -0.29, Scala at
-0.28, and Haskell at -0.23, each with a small standard error. That ignores a
few things: 1. per Table 2, the amount of data analyzed differs widely from
one language to another, and 2. projects that are newer or less active will
have fewer issues raised. It's a bad idea to equate that in any way to code
quality.
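
To make the sample-size point concrete, here is a rough sketch with made-up numbers (none of them come from the paper, and `defect_rate_with_se` is just a hypothetical helper). Two projects with the same share of bug-fix commits look very different in raw counts, and the smaller one carries a much wider binomial standard error:

```python
import math

def defect_rate_with_se(bug_fix_commits, total_commits):
    """Share of commits that are bug fixes, plus a simple binomial standard error."""
    p = bug_fix_commits / total_commits
    se = math.sqrt(p * (1 - p) / total_commits)
    return p, se

# Hypothetical projects: same underlying rate, 100x difference in activity.
big_rate, big_se = defect_rate_with_se(30_000, 100_000)
small_rate, small_se = defect_rate_with_se(300, 1_000)

print(f"big:   rate={big_rate:.3f} se={big_se:.4f}")
print(f"small: rate={small_rate:.3f} se={small_se:.4f}")
```

The raw bug counts differ by a factor of 100 even though the rates are identical, and the uncertainty on the small project is about ten times larger. This is only a toy model; the paper's regression is more involved than a per-project proportion.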

------
edu
Skimming the document, I find that on page 3 they have categorized the
languages by paradigm in a very weird way.

They list 3 paradigms: procedural, scripting and functional.

First, scripting is not a programming paradigm. The main difference is that
compilation is not an explicit step. But even then, some of those languages
(Python!) are compiled.

Then, in the Procedural category they have C and Go, but then add C++, C#,
Objective-C and Java which are not procedural but OO!!!

The categorization seems completely arbitrary and wrong!

~~~
tinco
C++, C#, Objective-C and Java are all procedural languages. This has little to
do with whether they are object oriented or not. That said, it would have been
interesting to see whether object oriented features of languages have an
effect on software quality.

------
numlocked
Python and Ruby come out with quite opposite results. There is absolutely no
reason why those languages, on their own, should lead to wildly different
error rates. Both are dynamically typed, general-purpose, and broadly
equivalent languages. So if this study seems intuitively like a fool's errand,
that would be your smoking gun.

The study admits the effects are small. The one clear conclusion is an obvious
one:

"Result 4: Defect types are strongly associated with languages; Some defect
type like memory error, concurrency errors also depend on language primitives.
Language matters more for specific categories than it does for defects
overall."

Does Clojure have an inherently lower defect rate than C++? It's arguably an
interesting question to ask, but looking at the Python and Ruby results makes
it clear, to me anyway, that we'll have to stick to arguing about it over
beers, rather than finding the answer in data.

~~~
tinco
The Ruby ecosystem has a very strong emphasis on tests and test-driven
development. Perhaps bugs in Ruby are often simply fixed before they're
committed because the test suites have better coverage.

Also, I think Ruby developers are more likely to be professional (as in paid
and educated to be programmers), whereas Python developers might (slightly
more) often be mathematicians, scientists or sysadmins.

As is noted in the paper, all other factors that were controlled for were
strongly dominant. That means who you are and what you are doing is much more
important than what language you do it in.

------
JackMorgan
Came up with some similar results a couple months ago. I looked at it from
just bugs per commit, but I also have some cool charts!
[http://deliberate-software.com/safety-rank-part-2/](http://deliberate-software.com/safety-rank-part-2/)

------
lithos
Why does Go do so badly with concurrency errors, when it's supposed to be a
darling at concurrency?

~~~
actsasbuffoon
It probably has something to do with concurrency being common in Go code. The
primary implementations of Ruby and Python (for example) have almost no
meaningful form of concurrency. That means most Ruby and Python apps have no
concurrency, thus no concurrency bugs.
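
A back-of-the-envelope sketch of that exposure effect, using invented figures (the fractions below are assumptions for illustration, not measurements, and `concurrency_bug_share` is a hypothetical helper):

```python
def concurrency_bug_share(concurrent_fraction, bug_rate_when_concurrent):
    """Expected share of all commits that fix a concurrency bug.

    concurrent_fraction: share of commits touching concurrent code (assumed).
    bug_rate_when_concurrent: chance such a commit is a concurrency fix (assumed).
    """
    return concurrent_fraction * bug_rate_when_concurrent

# Lots of concurrent code, relatively few bugs per use.
go_like = concurrency_bug_share(0.40, 0.05)
# Very little concurrent code, many more bugs per use.
script_like = concurrency_bug_share(0.02, 0.20)

print(f"go_like={go_like:.3f} script_like={script_like:.3f}")
```

Under these assumptions the Go-like language shows five times as many concurrency bugs overall, even though each piece of concurrent code is four times less likely to have one. A per-language total cannot separate how often concurrency is used from how safe it is when used.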

------
JoeAltmaier
Seems to fall right in line with expectations.

