
How Not to Measure Computer System Performance - sidereal
https://homes.cs.washington.edu/~bornholt/post/performance-evaluation.html
======
ltratt
Benchmarking practices are currently poor, almost without exception. For peak
VM performance, we have started to use Kalibera/Jones's method
[http://kar.kent.ac.uk/33611/7/paper.pdf](http://kar.kent.ac.uk/33611/7/paper.pdf)
(we reimplemented the statistical computations at
[http://soft-dev.org/src/libkalibera/](http://soft-dev.org/src/libkalibera/)
to make it more accessible). I don't think this method is the end of the story, but it's
a definite improvement: we were surprised at some of the odd effects it
highlighted (non-determinism was not what I was expecting). It's definitely
changed how I think about benchmarking.
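
To give a flavour of the kind of statistical treatment involved, here is a
minimal sketch of a percentile-bootstrap confidence interval over repeated
benchmark timings. This is not the Kalibera/Jones method itself (which handles
multiple levels of repetition), nor libkalibera's API; the function name
`bootstrap_mean_ci` and the timings are made up for illustration.

```python
import random
import statistics


def bootstrap_mean_ci(samples, confidence=0.95, resamples=2000, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `samples`.

    Hypothetical helper for illustration; a fixed seed keeps it repeatable.
    """
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(samples, k=n)) for _ in range(resamples)
    )
    lo_idx = int((1 - confidence) / 2 * resamples)
    hi_idx = int((1 + confidence) / 2 * resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical wall-clock times (seconds) from 10 runs of the same benchmark.
timings = [1.02, 0.98, 1.05, 1.01, 0.97, 1.10, 0.99, 1.03, 1.00, 1.04]
low, high = bootstrap_mean_ci(timings)
print(f"mean = {statistics.fmean(timings):.3f}s, 95% CI = [{low:.3f}, {high:.3f}]")
```

Reporting the interval rather than a single number makes it harder to mistake
run-to-run noise for a real speedup.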

------
oneofthose
Great article. The gist I get from it is: running an experiment in computer
science is easy (just ./bench), but running one correctly is hard. I agree
with this assessment.

This plotty tool [0] seems interesting and valuable - but I'm not sure how it
relates to the problem the author talks about.

[0]
[https://github.com/jamesbornholt/plotty](https://github.com/jamesbornholt/plotty)

------
mstromb
Why would linking order affect runtime performance? Something to do with the
interaction between offsets and cache, maybe?

Would it be possible to determine ahead of time what order would maximize
performance, or would that require profiling?

~~~
rectang
I'd speculate that if you're unlucky about link order, two hot cache lines may
get mapped to the same slot in an N-way associative cache -- whereas if you're
lucky, they end up going to different slots and don't continuously evict each
other.
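
A rough sketch of that mapping, assuming a simple cache that picks the set from
the address bits just above the line offset. The 64-byte line size, the 64 sets
(roughly a 32 KiB, 8-way cache), and the addresses are all assumptions for
illustration; real hardware may index on physical addresses or hash the index.

```python
LINE_SIZE = 64  # bytes per cache line (assumed)
NUM_SETS = 64   # assumed: 32 KiB, 8-way cache -> 32768 / (64 * 8) sets


def cache_set(addr, line_size=LINE_SIZE, num_sets=NUM_SETS):
    """Set index an address maps to in a simple set-associative cache."""
    return (addr // line_size) % num_sets


# Hypothetical addresses of two hot objects under two different link orders.
a = 0x1000
b_unlucky = a + LINE_SIZE * NUM_SETS  # exactly LINE_SIZE * NUM_SETS bytes apart
b_lucky = b_unlucky + LINE_SIZE       # shifted by one cache line

print(cache_set(a), cache_set(b_unlucky), cache_set(b_lucky))  # 0 0 1
```

In an N-way cache two lines in the same set can coexist, so the continuous
eviction would only kick in once more than N hot lines compete for that set.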

With regard to alignment... do linkers typically pack objects so tightly that
the start of each object isn't aligned on a cache line boundary? AFAIK cache
lines are typically 32, 64, or 128 bytes.
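
To make the alignment question concrete, here is a small sketch that counts how
many cache lines an object spans given its start offset. The helper name
`lines_touched` and the 64-byte default line size are assumptions for
illustration.

```python
def lines_touched(start, size, line_size=64):
    """Number of cache lines spanned by `size` bytes starting at `start`."""
    first = start // line_size
    last = (start + size - 1) // line_size
    return last - first + 1


print(lines_touched(0, 64))   # 1: aligned, fits in a single line
print(lines_touched(16, 64))  # 2: same size, but straddles a line boundary
```

So a 64-byte object that shifts by 16 bytes because of a different link order
would touch two lines instead of one.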

~~~
fleitz
Cacheline boundary?

Probably, because cache line sizes are an implementation detail, not part of
the architectural specification.

------
peterwwillis
There are lies, damned lies, and software benchmarks.

~~~
jacques_chester
Lies, damn lies, and $100 million investments.

------
a3089268
I think the author has computer science and computer engineering confused.

~~~
scott_s
I don't think he does. What he describes is part of what my colleagues and I
consider "computer science." We typically consider "computer
engineering" the _design_ and _making_ of hardware. But to be a systems
researcher in computer science, you must know how these things work, and be
able to reason about how they affect the software systems you care about.

