
Fibs, Lies, and Benchmarks (2019) - kgwxd
http://wingolog.org/archives/2019/06/26/fibs-lies-and-benchmarks
======
nemo1618
>Friends, it's not entirely clear to me why this is, but I instrumented a copy
of fib, and I found that the number of calls in fib(n) was a more or less
constant factor of the result of calling fib. That ratio converges to twice
the golden ratio, which means that since fib(n+1) ~= φ * fib(n), then the
number of calls in fib(n) is approximately 2 * fib(n+1). I scratched my head
for a bit as to why this is and I gave up; the Lord works in mysterious ways.

We can model the number of calls with a doubly-recursive function, much like
fib itself:

    
    
      calls 0 = 1
      calls 1 = 1
      calls n = 1 + calls(n-1) + calls(n-2)
    

However, it's easier to work with this if we split into three parts: the base
case of fib that adds 0, the base case that adds 1, and the recursive case. If
you examine the number of each type of call, you get the following table:

    
    
      add 0:   1, 0, 1, 1, 2, 3,  5 ... = fib(n-1)
      add 1:   0, 1, 1, 2, 3, 5,  8 ... = fib(n)
      recurse: 0, 0, 1, 2, 4, 7, 12 ... = fib(n+1)-1
    

So the total number of calls is fib(n-1) + fib(n) + fib(n+1) - 1. Since
fib(n+1) = fib(n) + fib(n-1), we can express the sum as 2 * fib(n+1) - 1.
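
The breakdown above can be checked empirically. What follows is a minimal Python sketch, not code from the article: the helper names `fib`, `fib_iter`, and `call_breakdown` are made up for this example, and the three labels simply mirror the rows of the table.

```python
from collections import Counter


def fib(n, counts):
    """Naive doubly-recursive fib that tallies each kind of call in `counts`."""
    if n == 0:
        counts["add 0"] += 1      # base case contributing 0
        return 0
    if n == 1:
        counts["add 1"] += 1      # base case contributing 1
        return 1
    counts["recurse"] += 1        # recursive case
    return fib(n - 1, counts) + fib(n - 2, counts)


def fib_iter(n):
    """Iterative fib, used only as a reference value for the check."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def call_breakdown(n):
    """Return (fib(n), Counter of call kinds) for the naive recursion."""
    counts = Counter()
    value = fib(n, counts)
    return value, counts


if __name__ == "__main__":
    # Start at n = 1 so that fib(n-1) needs no extension to negative indices.
    for n in range(1, 16):
        value, counts = call_breakdown(n)
        assert counts["add 0"] == fib_iter(n - 1)
        assert counts["add 1"] == fib_iter(n)
        assert counts["recurse"] == fib_iter(n + 1) - 1
        assert sum(counts.values()) == 2 * fib_iter(n + 1) - 1
```

For n = 6 this gives 5, 8, and 12 calls of the three kinds, matching the last column of the table, for a total of 25 = 2 * fib(7) - 1.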

And of course, there's an OEIS entry for this sequence:
[http://oeis.org/A001595](http://oeis.org/A001595). I notice that one of the
definitions given there is "odd numbers whose index is a Fibonacci number,"
which matches our definition nicely, since the k-th odd number is 2*k-1 and
here k = fib(n+1).

------
0xff00ffee
I've encountered this lazy trope for over 30 years working with performance
analysis.

It always comes from one direction: engineering.

Benchmarks have multiple audiences and multiple uses. What serves one
customer (perhaps a microarchitect tuning a pipeline) does not serve another
(a company building a product that has to choose a particular component) nor
another (a professor looking at historical trends).

Of course if you try to turn a screw with a hammer it's not going to work, so
choose the right benchmark for your analysis.

~~~
gammadens
I agree, but I also think a comprehensive set of standard benchmarks can be
useful to get an overall sense of how language instantiations perform and how
things change. It's fairly clear that in general some language implementations
are much slower across the board than others, even if for many other
comparisons the distinctions are fine or depend on domain.

My overall sense is that there's been a pull back from general benchmarking
compared to say, 15 years ago, and it's unfortunate, because it leaves the
benchmarking to developers of languages, compilers, and whatnot. This provides
an opportunity to show off the best-case scenarios for the languages, but also
for them to hide the areas of weakness -- and those hidden areas are often the
mine traps for those deciding whether to invest resources in a new language.

Having a standard, comprehensive set of problems helps address this "hiding."
I also think there's value in naive benchmark programs as well as "expert"
tuned ones: not everyone is going to optimize every single scenario in every
language.

The one thing I've never seen implemented well is some measure of "ergonomics"
or "high-level" versus "low-level" aspects of a language, which also seems
important to me. Some of that is going to be subjective but some of it not.

------
AtlasBarfed
"The microbenchmark is no longer measuring call performance, because GCC
managed to reduce the number of calls. If I had to guess, I would say this
optimization doesn't have a wide applicability and is just to game benchmarks.
In that case, well played, GCC, well played."

Loop unrolling and/or equivalent in recursion is a very basic optimization.

It's weird that an optimizing compiler optimizes away something that he is
somewhat arbitrarily benchmarking, and he declares it something specifically
targeted at only improving artificial benchmarks.
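
As a rough illustration of how inlining one level of recursion cuts the number of calls, here is a Python sketch. It is a hand-written analogy under stated assumptions, not a description of what GCC actually does to fib; `fib_plain`, `fib_inlined`, `count_calls`, and the one-element `counter` list are hypothetical names introduced for this example.

```python
def fib_plain(n, counter):
    """Naive fib; counter[0] is bumped once per actual function call."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_plain(n - 1, counter) + fib_plain(n - 2, counter)


def fib_inlined(n, counter):
    """Same result, with the body of fib(n - 1) pasted in place of that call."""
    counter[0] += 1
    if n < 2:
        return n
    # Inlined copy of fib(m) for m = n - 1: no call is made for this level.
    m = n - 1
    if m < 2:
        first = m
    else:
        first = fib_inlined(m - 1, counter) + fib_inlined(m - 2, counter)
    return first + fib_inlined(n - 2, counter)


def count_calls(f, n):
    """Run f(n) and return (value, number of calls made)."""
    counter = [0]
    value = f(n, counter)
    return value, counter[0]
```

For n = 5 the plain version makes 15 calls and the inlined one makes 11, while both return 5. Each extra level of inlining removes more calls, at the cost of a larger function body.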

"In Guile you can recurse however much you want."

Sooo... if your recursion depends on accumulated stack data, which is what
most non-tail-recursive recursions want to do... can you recurse however much
you want? This seems like a disingenuous statement. Not every recursive
algorithm can be magically refactored into perfect tail recursion.

When I was a wee CS student, I heard that any recursion algorithm can be
converted to a loop. With puzzlement I asked the prof who said that about
something that pretty clearly needed stack-state, and he further qualified it
with "well, if you have a stack datastructure in the loop."
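
The professor's qualification can be sketched in Python. This is only an illustration with made-up names (`fib_recursive`, `fib_explicit_stack`): the loop version replaces the call stack with an ordinary list, and it gets away with storing only pending arguments because fib(n) is just the sum of its base cases. A recursion that does more work after each call returns would need to keep more state per stack entry.

```python
def fib_recursive(n):
    """Reference version: non-tail recursion that relies on the call stack."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_explicit_stack(n):
    """Same computation as a loop; pending subproblems live on a list."""
    total = 0
    stack = [n]                # work still to be done
    while stack:
        k = stack.pop()
        if k < 2:
            total += k         # a base case contributes its own value
        else:
            stack.append(k - 1)  # the two "recursive calls" become pushes
            stack.append(k - 2)
    return total
```

The loop does the same exponential amount of work as the recursion. The state has moved from the call stack into a heap-allocated list, so it is still there, only in a different place.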

There seem to be pervasive overclaims here, and narcissistic self-focus.
Doesn't inspire confidence in the language.

------
jxy
Somebody should update this R7RS benchmark:
[https://ecraven.github.io/r7rs-benchmarks/](https://ecraven.github.io/r7rs-benchmarks/)

Though even if Guile is currently 3 times better than version 2.2.6 on that
benchmark, it is still far away from the top tier.

