What I would really like to see in such tests is a comparison of output quality. They mentioned a few times which drivers they are using; it would also be great if they compared the feature sets of both. If a driver doesn’t support a certain API, it might just implement a simple stub and skip some computation cycles, which could make it look faster in a benchmark without actually doing the same work.
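To make the output-quality point concrete, here is a rough sketch of the kind of check I have in mind: score each driver's output against a reference with PSNR. Everything here is an assumption for illustration. The `psnr` helper, the frame names, and the pixel values are all made up, and a real test would load actual captured frames rather than tiny hand-written lists.

```python
import math


def psnr(reference, candidate, max_value=255):
    """Peak signal-to-noise ratio (dB) between two equal-length sample lists."""
    if len(reference) != len(candidate):
        raise ValueError("frames must have the same number of samples")
    if not reference:
        raise ValueError("frames must not be empty")
    # Mean squared error over all samples.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # outputs are identical
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit samples, invented for this example.
reference_frame = [10, 50, 90, 130, 170, 210]
# A driver that does the full work, with small rounding differences.
full_driver_frame = [10, 51, 90, 129, 170, 210]
# A driver whose stubbed call left part of the frame unrendered.
stubbed_driver_frame = [10, 50, 90, 0, 0, 0]

full_score = psnr(reference_frame, full_driver_frame)
stub_score = psnr(reference_frame, stubbed_driver_frame)

print(f"full driver PSNR: {full_score:.1f} dB")
print(f"stubbed driver PSNR: {stub_score:.1f} dB")
```

A much lower score for one driver would suggest it is skipping work, so its timing numbers shouldn't be compared directly with the other's. PSNR is only one possible metric; I picked it because it is simple, not because the tests in question use it.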