It's been a long (long!) time since I touched Apache Bench or Gnuplot but isn't that first graph "number of responses at this response time"? i.e. about 3000 of your responses were ~100ms or quicker; 4000 were under ~150ms; etc.

Your scatterplot has just unaccumulated that data - but it's the same data.

I definitely agree that it is the same data, and I plan to write a follow-up to this piece, because my understanding of what I just learned is still evolving and the post has shortcomings. By "doing it wrong", I mean that many people believe this plot shows response time on the y-axis and chronological time on the x-axis.

I hate to single out anyone, but Philip is obviously a smart guy, and I think he's competent enough not to be hurt by a simple oversight like this. If you look at his write-up here, he uses the oft-circulated gnuplot template (check the comments):


> On first sight, we immediately see from the graph that the response time using Puma at the end of the 10000 requests is pretty bad with 100 concurrent requests, with the longest request taking around 60 seconds. I’m not entirely sure why this happens or what happens near the end, but here’s one plausible explanation:

> When the benchmark starts, 100 concurrent requests are sent to the web server. A maximum number of 16 threads, and thus 16 requests, are allocated by Puma at once. The 17th request will block until one of the 16 threads currently in use is finished. However, since we’re executing 100 concurrent requests, there will be 84 requests waiting (100-16). Looking at the requests in the generated puma.dat file (generated with ab -r -n 10000 -c 100 -T 'application/x-www-form-urlencoded' -g puma.dat -p ../live_streaming/post), we see that exactly 84 requests have been waiting for execution. These are the requests that were issued first, but have never been allocated to a thread by Puma. As a result, they have been waiting for the entire benchmark. I’m not sure why Puma would behave like this.

That protracted explanation is predicated on the inference that the data is ordered chronologically. Many, many people make this mistake (google "apache bench gnuplot"). I always thought it was as well. I don't know why I never looked at the starttime or seconds columns of the data.

Yeah, I see what you mean from his explanation. I wonder how/why the misreading happens? Maybe it depends how much statistics you got beaten into you at school or something.

I think two factors have caused this confusion:

1) Most of the people using the gnuplot template really don't understand what it's doing

2) We all assume that the output of `ab -g plotfile` is a serial log

If you hand it a single column of data, gnuplot just uses the order for the x axis. The x label in the first plot should be 'Row number' or something like that.

Since each row is a single request, and ab writes the file sorted by response time, the first plot is effectively a sideways cumulative histogram. In other words, you can see that 4000 out of 5000 requests were served in under 150ms, etc. Arguably this is more informative than the scatterplot, although I suspect the OP is right about how commonly the graph is misunderstood.
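For anyone who wants to check this against their own file, here is a rough Python sketch. It assumes the `ab -g` output is tab-separated with a header row that includes `starttime`, `seconds`, and `ttime` (total time in ms) columns. The sample rows are made up for illustration, and `load_rows`, `served_within`, and `chronological` are hypothetical helper names.

```python
import csv
import io

# Made-up sample in the assumed `ab -g` layout: tab-separated, with a header,
# and rows ordered by ttime (total response time in ms), not by start time.
SAMPLE = (
    "starttime\tseconds\tctime\tdtime\tttime\twait\n"
    "Tue Jan 01 00:00:03 2013\t1356998403\t1\t80\t81\t79\n"
    "Tue Jan 01 00:00:01 2013\t1356998401\t1\t95\t96\t94\n"
    "Tue Jan 01 00:00:04 2013\t1356998404\t2\t140\t142\t139\n"
    "Tue Jan 01 00:00:02 2013\t1356998402\t1\t499\t500\t498\n"
)


def load_rows(text):
    """Parse the file contents into dicts holding only the columns used here."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{"seconds": int(r["seconds"]), "ttime": int(r["ttime"])} for r in reader]


def served_within(rows, limit_ms):
    """Cumulative reading of the first plot: requests served in limit_ms or less."""
    return sum(1 for r in rows if r["ttime"] <= limit_ms)


def chronological(rows):
    """Re-order rows by start second, which is what the scatterplot needs."""
    return sorted(rows, key=lambda r: r["seconds"])


rows = load_rows(SAMPLE)
print([r["ttime"] for r in rows])                 # file order: ascending ttime
print(served_within(rows, 150))                   # requests at or under 150 ms
print([r["ttime"] for r in chronological(rows)])  # same data, ordered by time
```

The row number in the first plot is just the index into `rows`, so reading a point as "row N has response time T" is the same as saying N requests took T ms or less.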

Coincidentally, I'm in a Skype chat trying to explain this same thing. Now that I understand it, I'm growing to like the sideways cumulative histogram because it gives a good representation of the time factor. If we were to flip the axes, putting response time on the x axis with something like 5 ms binning and a count of the requests in each bin on the y axis, we'd lose the significance of a request that takes 500ms. On our "proper" histogram, it would be represented by a tiny bar far to the right. I'm not sure that's preferable.
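To make the binning concrete, a minimal Python sketch follows. The response times are invented for illustration and `bin_counts` is a hypothetical helper, not something ab or gnuplot provides.

```python
from collections import Counter


def bin_counts(times_ms, bin_width_ms=5):
    """Count requests per bin, keyed by the lower edge of each bin in ms."""
    counts = Counter((t // bin_width_ms) * bin_width_ms for t in times_ms)
    return dict(sorted(counts.items()))


# Invented response times in ms, including one slow outlier.
times = [81, 83, 96, 142, 500]
hist = bin_counts(times)
print(hist)  # {80: 2, 95: 1, 140: 1, 500: 1}
```

The 500 ms request ends up as a single count in the bin starting at 500, which is the tiny bar far to the right described above.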

The choice of cumulative distribution vs. time-dependent response times depends a lot on what you're trying to measure. The cumulative distribution is useful for showing the likely response time and its variation, but assumes a constant state. Your new plot style is useful for seeing how changes in loads affect the response time distribution (if you see double the hits, do you get longer response times and/or more variation in times?).
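One way to get at the time-dependent view is to group requests by the second they started in and summarize each group. This is only a sketch in Python: the `(second, response_ms)` pairs are invented, and `mean_ttime_per_second` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean


def mean_ttime_per_second(samples):
    """Average response time (ms) of the requests that started in each second."""
    buckets = defaultdict(list)
    for second, ttime_ms in samples:
        buckets[second].append(ttime_ms)
    return {second: mean(values) for second, values in sorted(buckets.items())}


# Invented (start second, response time in ms) pairs.
samples = [(0, 80), (0, 100), (1, 120), (1, 180), (2, 400)]
per_second = mean_ttime_per_second(samples)
for second, avg in per_second.items():
    print(second, avg)
```

A rising average across seconds would point to load-dependent slowdown, which the cumulative view cannot show because it discards the ordering.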

That's a really good point. In fact what you're describing is what the OP says people expect the first plot to show. Just because we (I?) can't perceive trends in the OP's second plot doesn't mean we couldn't if he increased the ab parameters.

That makes sense.

I guess there should be a better label for the y axis though.

Pretty much what I was thinking. They're both useful charts, if you understand what is being charted.
