
The impact of language choice on github projects - llambda
http://corte.si/posts/code/devsurvey/index.html
======
mbostock
This is from 2009. Note the rise in popularity of JavaScript since then:
<https://github.com/languages>

Also, box plots would provide more useful comparison than medians; box plots
can show several quantiles simultaneously. This is almost essential if you
don't know anything else about the distribution. Still enjoyed the analysis,
though, and I think it's an interesting space for further exploration!
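
A rough sketch of what that comparison could look like, in Python. The commit counts and the two "languages" below are made up for illustration and are not figures from the article. The function name `five_number_summary` is also just a placeholder. It computes the five numbers a box plot draws, and shows two samples that share a median but have very different spreads:

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot displays."""
    ordered = sorted(values)
    # n=4 splits the data into quartiles; "inclusive" treats the sample
    # as containing its own minimum and maximum.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], q1, q2, q3, ordered[-1])


# Hypothetical commits-per-project samples (assumed data, not from the post).
lang_a = [8, 9, 10, 10, 10, 11, 12]
lang_b = [1, 1, 2, 10, 40, 90, 300]

summary_a = five_number_summary(lang_a)
summary_b = five_number_summary(lang_b)

for name, summary in (("lang_a", summary_a), ("lang_b", summary_b)):
    low, q1, med, q3, high = summary
    print(f"{name}: min={low} Q1={q1} median={med} Q3={q3} max={high}")
```

Both samples report the same median, so a median-only chart would make them look alike, while the quartiles and extremes show that the second sample is far more spread out.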

------
aveeno
"it becomes harder and harder to contribute to a Perl codebase, the bigger it
gets."

I've never written any perl, but is this really true? I fail to see how it's
really any worse than ruby/python.

~~~
zdw
Perl and ruby are pretty similar (much of ruby was inspired by perl according
to Matz), but perl is much older than ruby, and thus many people are used to
language idioms that aren't particularly modern.

There's also a tradition of writing one-liners à la sed/awk in perl, and
going from that to more complete programs generally isn't the best for
program structure.

Python's whitespace requirements tend to make it clearer to read, as there are
fewer ways to format identical code, and it's much less of a TMTOWTDI
language than perl.

_tl;dr:_ Perl code is often written by old timers who expect everyone to be a
perl genius like they are.

~~~
chromatic
_Python's whitespace requirements tend to make it clearer to read..._

What projects have you worked on such that consistency of indentation was at
all a meaningful factor to maintenance? I worry more about duplication and
near duplication, testing efficacy, proper factoring, symbol naming, effective
error handling, the possibility of fencepost errors, and clarity of intent.

~~~
kamaal
Very well put.

In my experience, unless you are hiring exceptionally bad programmers, you
don't really have to worry about code indentation.

Seriously, is code indentation all people worry about? I have bigger problems
to worry about.

------
ccashell
I found the statistics to be quite interesting. The conclusions drawn by the
author, however, left a lot to be desired. Most of them were not directly (or
even indirectly) supported by the data, and consisted primarily of random
opinions that left me feeling negatively about an otherwise interesting post.

I would also have liked to see data on the mean for some of the statistics, and
not just the median. In some cases, one or the other can provide very
misleading statistics, and providing both to compare would have helped smooth
over concerns there.
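
As a small illustration of how far the two can diverge, here is a sketch in Python with invented numbers (these are assumed commit counts, not data from the article). One very active contributor is enough to pull the mean far away from the median:

```python
import statistics

# Hypothetical commits per contributor: most people commit a handful of
# times, one person commits a lot (assumed data for illustration only).
commits = [1, 1, 1, 2, 2, 3, 5, 400]

mean_commits = statistics.mean(commits)      # sum divided by count
median_commits = statistics.median(commits)  # middle of the sorted values

print(f"mean={mean_commits} median={median_commits}")
```

With a skewed sample like this, the median describes the typical contributor and the mean is dominated by the outlier, which is why seeing both side by side is informative.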

------
flatline
I found the median commits and committers to be interesting - there must be a
lot of people making only a few commits, which is what I would expect more
in e.g. forum posts than in source code. I would like to have seen the mean too.
As to the title, I don't think that the numbers really say anything about the
impact of language choice.

------
shawnps
> First, the sample size. Clearly, github is very popular with the Ruby crowd,
> with more than four times as many projects as Python, the runner-up.

It looks like Python is at ~400 in the graph, and Ruby is at ~1200. Am I
missing something?

------
chrismealy
How much of the javascript on github is just jquery bundled in rails projects?

~~~
dpkendal
I'm fairly sure that GitHub counts languages by projects, not by files. So a
few JavaScript files inside a Ruby project would not affect the ranking.

------
davorb
No Haskell? I'm surprised.

~~~
dons
While Haskell is ranked ahead of e.g. scala or erlang in total repos on github,
the majority of larger/active projects in Haskell are still in darcs repos, on
places like code.haskell.org.

Historical note: the Haskell dev world jumped on distributed vcs a few years
before git appeared, via darcs, and that early move has resulted in the
situation today, where it's mostly only the younger projects in git.

~~~
dons
And just as an example, I have around 200 darcs repos, including big projects
like xmonad, yi or bytestring, that might never be on github, since they're in
darcs on c.h.o.

The Haskell world is really an anomaly in this regard.

~~~
gnaritas
Not really, same applies to Smalltalk, and I'd guess Lisp as well.

------
nknight
"Median" and "average" are being used inconsistently in this article, making
it more difficult to meaningfully interpret the dataset.

~~~
cortesi
Hi there. I think you're interpreting "average" as being equivalent to "mean",
but this isn't my understanding of the term. Both the "median" and "mean" are
measurements of "averageness". I use "median" when I'm talking directly about
figures or a diagram. I use "average" when I'm making some general point, and
only after I've already specified what my exact measure of central tendency
was.

<http://en.wikipedia.org/wiki/Average>

~~~
spullara
Median is an exact term that means the middle value in a sorted list of
values.

~~~
cortesi
Yes, I don't think anyone has disputed that.

