> It's almost certainly best just to compare TFLOPS - also a bit of a slippery concept, as that depends on the precision
Agreed. Some quibbles about the slipperiness of the concept.
Flops are floating-point operations. IMO it shouldn't be confusing at all: just count single-precision floating-point operations, which every device can do, and which are explicitly defined in the IEEE 754 standard.
Half-precision flops are interesting, but they should be called out for the non-standard metric they are. Anyone counting half-precision flops as flops is either being intentionally misleading or is confused about user expectations.
On the other hand, lots of scientific computing folks would rather have doubles, but IMO we should get with the times and learn to deal with less precision. It's fun: you get to make some trade-offs, and you can see whether your algorithms are really as robust as you expected. A free 2x speedup even on CPUs is pretty nice.
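A quick NumPy sketch of that trade-off (sizes and the error threshold here are my own illustrative choices): float32 takes half the bytes of float64, which is where most of the "free 2x" comes from on CPUs (twice the SIMD lanes, half the memory traffic), and you can check directly whether the lost precision actually matters for your computation.

```python
import numpy as np

n = 1024
a64 = np.random.rand(n, n)       # float64 by default
a32 = a64.astype(np.float32)     # half the bytes per element

# 8 bytes vs 4 bytes: half the memory/bandwidth per element.
print(a64.itemsize, a32.itemsize)

# Robustness check: run the same matmul in both precisions and
# compare. If the relative error is acceptable, take the speedup.
c64 = a64 @ a64
c32 = (a32 @ a32).astype(np.float64)
rel_err = np.abs(c64 - c32).max() / np.abs(c64).max()
print(rel_err)
```

For a well-conditioned problem like this, the relative error lands far below anything most applications care about; for ill-conditioned solves it may not, which is exactly the trade-off being argued for.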
> and also whether the application can make use of the sparsity feature
Eh, I don’t like it. Flops are flops. Avoiding a computation by exploiting sparsity is not a flop. If we want to take credit for flops not executed via sparsity, there’s a whole ecosystem of mostly-CPU “sparse matrix” codes to consider. Of course, GPUs have this nice 2:4 structured-sparsity feature (50% zeros), but nobody wants to compete against PARDISO or iterative solvers for really sparse problems, right? Haha.
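To make the accounting argument concrete, here's a toy pure-Python sketch (my own illustrative code, not how the hardware works): with 2:4 structured sparsity, the zeroed multiply-adds are simply skipped, so counting them as executed flops double-counts work that never happened.

```python
def dense_dot(a, b):
    """Dense dot product: every multiply-add is counted as 2 flops."""
    acc, flops = 0.0, 0
    for x, y in zip(a, b):
        acc += x * y
        flops += 2            # one multiply + one add
    return acc, flops

def sparse24_dot(a, b):
    """With a 2:4 pattern (at most 2 nonzeros per group of 4 in a),
    the zero entries are skipped entirely: those flops never execute."""
    acc, flops = 0.0, 0
    for x, y in zip(a, b):
        if x != 0.0:
            acc += x * y
            flops += 2
    return acc, flops

a = [1.0, 2.0, 0.0, 0.0] * 256    # 2:4-sparse vector, length 1024
b = [0.5] * 1024

print(dense_dot(a, b))            # (384.0, 2048)
print(sparse24_dot(a, b))         # (384.0, 1024): same answer, half the flops
```

Same result, half the work: quoting the dense number as the "sparse TFLOPS" of the device is taking credit for the skipped half.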
They don’t have much application outside ML, at least as far as I know. Just call them ML ops, and then they can include things like those funky shared-exponent floating-point formats, and/or stuff with ints.
Or they could be measured in bits per second.
Actually I’m pretty interested in figuring out if we can use them for numerical linear algebra stuff, but I think it’d take some doing.