I wouldn't put too much stock in those numbers. To start with, they are not measuring actual programming language usage. If you want to know which languages people are using, the only way to really do it is to survey a simple random sample of programmers and ask them.
What these statistics do is go to specific communities and attempt to measure which languages those communities use. Using those numbers to measure language popularity in the population at large makes the assumption that those communities are representative of the population at large. That's a faulty assumption to make.
Do you, for example, think that Slashdot has a large segment of the "Excel VBA Macro Developer" market posting to it? Probably not.
There are other sites, like Stack Overflow, that show a 2:1 preference for C# over Java. Does that mean there are 2x as many C# programmers as Java programmers? No. It just means that more posts on Stack Overflow are about C# than about Java.
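To make the sampling problem concrete, here is a minimal sketch. Every number in it is made up for illustration (the user counts and the `posting_rate` values are assumptions, not measurements of any real site): three languages with identical numbers of users, whose users post on a given site at different rates.

```python
# Hypothetical numbers throughout: none of these figures come from a real survey.
true_users = {"C#": 1000, "Java": 1000, "Excel VBA": 1000}

# Assumed fraction of each language's users who post on a particular site.
posting_rate = {"C#": 0.20, "Java": 0.10, "Excel VBA": 0.01}


def observed_share(users, rate):
    """Share of posts per language if each user posts at the assumed rate."""
    posts = {lang: users[lang] * rate[lang] for lang in users}
    total = sum(posts.values())
    return {lang: posts[lang] / total for lang in posts}


shares = observed_share(true_users, posting_rate)
for lang, share in shares.items():
    # Every language has exactly one third of the actual users.
    print(f"{lang}: {share:.1%} of posts, {1 / 3:.1%} of actual users")
```

Under these assumptions the site shows a 2:1 ratio of C# to Java posts and almost no Excel VBA at all, even though all three have the same number of users. The post counts tell you about the site's audience, not about the population.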
Information about the number of unique books written about a language at a particular store does not imply anything about popularity. If you considered "unit sales", that might give you a measure of "popular book subjects", but not about language usage. For example, it could be possible that Java programmers buy more books on Java than C# programmers do on C#. Many of the C++ programmers I know have several books on C++. I don't think that means C++ has more users than, say, Windows batch scripts.
The web site does say "these results are not scientific", but I don't think that's a strong enough statement. The "scientificness" has to do with how the data is collected, and how the results are reported. Even if they measured the same things using scientific means, the numbers still wouldn't imply what they claim them to imply. A better disclaimer might be "these results are not useful for making any decisions whatsoever."
The actual data being shown is very clearly explained. They aren't drawing any conclusions so I don't see what your problem is. There is a 'Normalized Comparison' graph but the weights are right there for anyone to adjust. The authors are very honest about what this data is and what it means. It's just interesting, nothing more. They don't claim or even imply any of the things you say they do.
You're inferring things based on your own bias. Nowhere on that page did I get the impression that they even believe, much less claim, that those data are representative of true language popularity. They claim it's impossible to measure. What is popularity anyway? Number of users? Programs? Bytes? A critical reader would want a definition of that term before drawing any conclusions, and if you read beyond the first paragraph you'll find that they define what they are measuring precisely. Basically you're complaining that they didn't do enough to discourage lazy brains from jumping to conclusions.
The sections that describe the measurements say things like:
"This is a fairly crude approximation of popularity, however, it's worth including, because all other things being equal, the more popular a language is, the more pages will exist mentioning it."
"Popular languages are used more in industry, and consequently, people post job listings that seek individuals with experience in those languages. "
"Books are also a lagging indicator, but a good way to eliminate languages that aren't "established"."
Those statements indicate that the measurements are proxies for language popularity. My point is that those statements are not true, that the numbers do not serve as proxies for language popularity. They are not crude proxies, they are not lagging proxies, or any other form of proxy.
Most conclusions you could draw from the numbers would be wrong.
> Information about the number of unique books written about a language at a particular store does not imply anything about popularity.
Sure it does. Java is more popular than Forth, and as a consequence, there are a lot more Java books than Forth books. Sure, there are various other factors involved, but there is very definitely some correlation. Anyone else remember how many Ruby and Rails books started being pumped out when it started getting popular?
Yeah, sales numbers would be nice too, but over time, editors won't publish books on things that won't sell.
Also, that "particular store" is not just some mom and pop bookstore in Podunkville, it's Amazon.com. It is, last I checked, one of the most popular book stores in the world.
I can't really say I agree with you. There are a lot of good books on the subject, and there are a lot of great free tutorial sites on the web (look at Scott Stevenson's work, or Cocoa Is My Girlfriend, or CocoaDev). The mailing lists are pretty heavily trafficked, as is the #macdev IRC channel (if not always entirely helpful).
The API documentation is better than most development environments I've worked with. Admittedly, Apple's secrecy problem gets in the way of publicly acknowledging bugs, missing features, or planned upcoming features. These things can be frustrating. Apple publishes a large selection of sample code though. In fact, they ship a lot of it with the developer tools themselves (plus all the docs), so you don't always have to go looking through the website.
iPhone development is undeniably worse right now. I imagine this will change over time, but I agree there is a lot of work to be done on that front.
There's a Google group where people can make suggestions for additions to the site. Objective-C is probably a sensible one to include, at this point, though.
Odd data point: PHP is well-represented everywhere except Lambda the Ultimate (no surprise there) and ... Amazon. Why are there so few books on PHP when it's so well represented in Craigslist job postings?
My best guess is that books are a really lagging indicator, because books stay in Amazon's database for so long: there's no way for PHP, Python, and Ruby, or even Perl, to come close to catching up with C and Java, given the large number of C and Java books published before those languages were even invented.
Actually, I feel that at some point in the past the market for PHP books was saturated. Until perhaps PHP 6 comes along, there is little need for new books. Additionally, the way many PHP programmers usually start is with online tuts and docs.
Does anyone know how many programmers are programming per language? I'm most interested in Ruby, Python, C#, Java, but any other language would do too. Very rough figures? Are there 10^4 C# programmers, 10^6?
(Interestingly, the stats are pretty easy to change. Perl was 2%, then Yuval Kogman and I moved our repositories there, and got it up to 3%. We advertised heavily on IRC, and now it is at 6%. Tripling the amount of code on github in a few weeks is certainly a sign that Perl is not dead. It's also interesting to consider that Yuval and I are 1% of the open source community ;)
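As a rough check on that arithmetic, here is a small sketch. It assumes the percentages are shares of total code and that everyone else's code stayed constant over the period; both are assumptions, and `growth_factor` is just a name made up for this example.

```python
def growth_factor(old_share, new_share):
    """Factor by which one language's code must grow to move its share
    from old_share to new_share, assuming all other code stays fixed."""
    # share = own / (own + others)  =>  own = others * share / (1 - share)
    old_own = old_share / (1 - old_share)
    new_own = new_share / (1 - new_share)
    return new_own / old_own


# Going from a 2% share to a 6% share.
print(round(growth_factor(0.02, 0.06), 2))  # prints 3.13
```

So under those assumptions a jump from 2% to 6% needs slightly more than a tripling of the code, because the total grows along with it. If the other languages were growing at the same time, the required factor would be larger still.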
github is too new and too unstable in terms of numbers, so far. Last I checked (a few months ago), Ruby was the most popular language according to github, which doesn't jibe with any of the other data sources (or my own gut instincts). I've been keeping an eye on it, though. Stack Overflow is another one that's interesting, but currently has too much bias.
Twitter... hrm. Yeah, might add that in the talk section.
HN... I'd feel kind of bad about having an influence on the numbers myself;-)
I was curious to see the numbers for Python vs. Ruby. I had expected Python to be more popular than shown; depending on the source, its lead seems to range from 20% more than Ruby to 2x Ruby. I would think a regression on the changing popularity of each language would show Ruby passing Python at some point. Does anyone think that will actually happen?
My favorite argument for Python over Ruby was always that Python is more popular, but it seems like that will become less true over time.