This really depends on where you set the baseline standard.

You could treat the worst-performing variants of a given model as the baseline, with the better ones sitting above it.

But it's perfectly legitimate to set the baseline at the best-performing variants of a given model instead, with the rest unfortunately falling below it.

Personally, I prefer to set expectations high and define the baseline by the best that has been achieved so far. The worst-performing variant from 2014 should still be expected to exceed, or at the very worst match, the best-performing variant from 2013.