I'm no statistics wizard, but I find it interesting that you conclude that we're "doing it wrong" while at the same time suggesting that a scatter plot where a huge percentage of the points overlap is the way to go.
I appreciate that you try to up the resolution to counter this, but it still strikes me as the wrong presentation.
Very true, but the whole idea behind the blog post seems to be that the representation of data is done wrong.
I would rather the author go into a bit of a discussion about what the data represents and the different ways it could be presented. I can't help but get the feeling that the choice of a scatter plot is more or less arbitrary.
I agree with your criticisms. I wrote this last night and submitted this morning, but after reflection, I even made some edits. For example, I started by saying that the first graph was "wrong". Upon reflection, it's not. I think more accurately, it just doesn't represent what people think it does.
If you Google search for "apache bench gnuplot", you'll find a very similar gnuplot template that has been circulated for a very long time, but everyone seems to think that the resulting plot is response time over time.
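To make the distinction concrete, here is a rough Python sketch rather than the gnuplot template itself. It assumes the `ab -g` output is a tab-separated file with a header row that includes `seconds` (request start, epoch seconds) and `ttime` (total time in ms), and that, as far as I understand it, the rows are written sorted by `ttime` rather than by start time. The function name and the sample values are made up for illustration.

```python
import csv
import io


def response_times_over_time(tsv_text):
    """Return (elapsed_seconds, ttime_ms) pairs ordered by request start time.

    Assumes a tab-separated header row containing 'seconds' and 'ttime'.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = [(int(r["seconds"]), int(r["ttime"])) for r in reader]
    if not rows:
        return []
    # Re-sort by start time; plotting rows in file order would instead
    # show requests ranked from fastest to slowest.
    rows.sort(key=lambda pair: pair[0])
    first_start = rows[0][0]
    return [(start - first_start, ttime) for start, ttime in rows]


# Hypothetical sample, laid out the way the file is assumed to arrive:
# ordered by ttime, not by start time.
SAMPLE = (
    "starttime\tseconds\tctime\tdtime\tttime\twait\n"
    "Tue Jan 01 00:00:02 2013\t1356998402\t0\t10\t10\t9\n"
    "Tue Jan 01 00:00:00 2013\t1356998400\t1\t24\t25\t20\n"
    "Tue Jan 01 00:00:01 2013\t1356998401\t2\t38\t40\t35\n"
)

if __name__ == "__main__":
    for elapsed, ttime in response_times_over_time(SAMPLE):
        print(elapsed, ttime)
```

Plotting the second value against the first gives response time over time. Plotting `ttime` against the row number, which is what the circulated template appears to do, gives a sorted curve of response times instead.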
I'm definitely going to follow up on the issue. I'm a mediocre programmer, so gnuplot is pretty hard for me, but I keep having these "aha" moments. My next post will look at each graph in more detail and try to explain just what is being shown in each.