I know that, at least for the JetStream benchmark, higher scores are better (i.e., they're not just time measurements). If an assumption you make causes the parent comment to contradict itself, it makes more sense to question that assumption than to conclude the parent comment is contradictory.