1) The data are what they are - these are the rates people are charging by listed skill. It would be very interesting to look at prices by actual completed projects, but that's just another approach one can take.
2) I make it very clear that there's an association, not a causal claim. I flag the fact, make a lame joke about it and put "might" in the title.
Yes, but not rates being charged for using the listed skill.
As to the dreaded axis values: I think your chart, in this context, does exaggerate the variation in hourly wage by skill. At first glance, almost all your data points are between $25 and $35. Clojure hourly wage is around $31 per hour from the interactive chart. Your lowest hourly wage for a skill, haml, is $25.90. This simply is not much of a difference and your chart should make that clear.
I like your article. It is interesting, and I appreciate your efforts to take this data and provide some value out of it to the community.
I think we need to be more conservative in drawing any conclusions out of this data though.
The vital piece of information in the graph is the relative size of the wage. If the author had started at 0, it would be far harder to distinguish differences between the skills, by the same principles that make pie charts mostly unusable.
Relative (i.e. proportional/percentage) size is only apparent when you start a chart at 0. The absolute difference between 25 and 31 is 6. If you start the chart at 25, that's graphed as an ∞% difference, when it's really only a 24% difference.
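To make the arithmetic above concrete, here is a minimal sketch in Python. The function names are made up for illustration, and the $25 and $31 figures are the rounded values quoted in this thread, not the underlying data. It compares the true ratio of two wages with the ratio of the bar lengths a reader sees when the axis starts somewhere other than 0.

```python
import math


def true_ratio(low: float, high: float) -> float:
    """Ratio of the two values themselves (what a zero-based axis shows)."""
    return high / low


def apparent_ratio(low: float, high: float, axis_start: float) -> float:
    """Ratio of the drawn bar lengths when the axis starts at axis_start."""
    low_len = low - axis_start
    high_len = high - axis_start
    if low_len <= 0:
        # The shorter bar has no visible length, so the ratio is unbounded.
        return math.inf
    return high_len / low_len


# Rounded figures quoted in the thread (illustrative only).
low_wage, high_wage = 25.0, 31.0

percent_diff = (true_ratio(low_wage, high_wage) - 1) * 100
print(f"true difference: {percent_diff:.0f}%")
print("bar ratio, axis at 0: ", apparent_ratio(low_wage, high_wage, 0.0))
print("bar ratio, axis at 25:", apparent_ratio(low_wage, high_wage, 25.0))
```

With the axis at 0 the bar ratio matches the real 1.24 ratio; with the axis at 25 the shorter bar vanishes and the ratio is infinite, which is the ∞% referred to above.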
Thanks for making my point for me! The chart really misleads re: the difference between languages. Do we even know if Clojure wages are statistically different from, say, Scala rates?
1. As hinted above, what are the standard deviations? Do we even know if Clojure is statistically different from HAML? Do Clojure rates exhibit higher variation than HAML?
2. $4 difference between pattern recognition v. machine learning, where pattern recognition is, effectively, a subset of ML? We see this with legal services v. contract drafting, and (arguably) info architecture v. interaction design (less overlap with these, to be sure).
Also, while John gets credit for joking about causality, I'd be very concerned about sampling bias if he uses oDesk data to examine human capital decisions.
We'll have to disagree about the axes: I think the choice of where to start axes depends on what you're trying to illustrate. For me, absolute differences in wages were what I wanted to highlight.
Re: Standard error bars - you're absolutely right. I should have included them, but it was my first time using the googleVis package and I didn't see an easy way to add them. I don't have the numbers right in front of me, but the standard deviation for each skill was on the order of ~$7/hour and each skill had to have at least 30 observations, though many were much, much higher. If we say n=50, that's an SE of about $1/hour, so clearly one shouldn't put much stock in very small differences in wages. My goal was just to unearth some interesting data. I probably should tone down the implied recommendation lest I be responsible for a glut of unemployable lisp hackers in a few years :)
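For anyone who wants to check the back-of-the-envelope numbers, here is a rough sketch in Python. It uses only the approximate figures mentioned above (standard deviation of about $7/hour, n=50), not the real per-skill data. The step to the standard error of a difference assumes two independent skills with the same spread and sample size, which is an assumption added here for illustration.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


def se_of_difference(se_a: float, se_b: float) -> float:
    """Standard error of the difference of two independent means."""
    return math.sqrt(se_a**2 + se_b**2)


# Approximate figures from the comment above, not the actual data.
sd_per_skill = 7.0  # dollars per hour
n_per_skill = 50

se = standard_error(sd_per_skill, n_per_skill)
se_diff = se_of_difference(se, se)  # assumes both skills look alike

print(f"SE of one skill's mean wage: ${se:.2f}/hour")
print(f"SE of a difference between two skills: ${se_diff:.2f}/hour")
```

Under those assumptions the SE of a single mean is about $0.99/hour and the SE of a difference is about $1.40/hour, so a gap of a dollar or two between two skills is well within the noise, while a gap of $5 or $6 is several standard errors wide.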
The graph is visual: it shows the relation visually. If you're trying to calculate the actual % change, then a table is needed. But drawing every bar at full length makes it nearly impossible to see the differences between bars whose lengths are within a few percentage points of each other.
I see the graph as answering the question: What is the difference in achievable wages between different "hacking" skills? The graph, as shown, displays that difference well. Like I said, there are no hard and fast rules, but I would have done it the same way!