This one in particular is very clear:
C is next, with significantly less swearing.
C# and Java are roughly tied a bit below C.
Python and PHP have, comparatively, almost no swearing.
Was that really so hard? When the data is already subjective (what is and isn't a swear word) and intended almost solely for humor, do we really need more precision than a pie chart offers?
It is at best hyperbolic and at worst dishonest to say you "have no idea" how to interpret this. You have an idea. You just don't have precision.
Of course. Python users are happy people.
I wonder what happens with PHP... ;-)
Ignorance is bliss?
> Of course. Python users are happy people.
Heh, I've been using Python lately and have felt lots of urges to swear, but I can't get myself to commit it. Shoot.
This is the problem. In the pie chart it's almost impossible to determine which of those three has the most. In the bar chart, it's fairly obvious to my eye that C++ wins, though JS/Ruby are very close.
All the languages are equally represented by commit count.
I like how he had to tweak the data collection process to make the visualization method fit.
This is normal and has nothing to do with how you choose to represent it.
It would have been meaningless to show any graph or table saying "Python has the most messages with profanity" if Python projects made up 80% of all the projects out there.
He should just collect as many commit messages as possible, then divide each language's profanity count by that language's commit message count. That has lower standard error [and no more bias] than what he did.
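For what it's worth, here is a rough sketch of that per-language rate in Python. The function name `profanity_rates` and the numbers are made up for illustration and have nothing to do with the actual data set. The standard error uses the usual binomial approximation for a proportion, which assumes each commit message is an independent sample.

```python
import math

def profanity_rates(counts):
    """Map language -> (rate, standard_error).

    counts maps language -> (profane_commits, total_commits).
    """
    rates = {}
    for lang, (profane, total) in counts.items():
        if total == 0:
            continue  # no commits sampled, so no estimate for this language
        p = profane / total
        # Binomial standard error of a proportion: sqrt(p * (1 - p) / n)
        se = math.sqrt(p * (1 - p) / total)
        rates[lang] = (p, se)
    return rates

# Hypothetical counts, purely illustrative (not from the original post).
sample = {
    "C++": (30, 1000),
    "Python": (40, 8000),
}

for lang, (p, se) in profanity_rates(sample).items():
    print(f"{lang}: {p:.4f} +/- {se:.4f}")
```

With these invented counts, Python has more profane commits in absolute terms (40 vs. 30) but a much lower rate, and because it has more commits its estimate also comes with a smaller standard error. That is the argument for normalizing by commit count while keeping every commit you can get, rather than trimming each language down to the same sample size.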