Does anyone understand why such minor upgrades resulted in a major version bump? Is this some sort of stability checkpoint? Or some other versioning convention?
It supports a whole new architecture (Ampere) and all the good stuff that comes with it: Multi-Instance GPU partitioning, new number formats (TF32, i.e. TensorFloat-32, and sparse INT8), third-generation Tensor Cores, and asynchronous copy/asynchronous barriers. These are huge features.
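To give a feel for what TF32 trades away: it keeps FP32's sign bit and 8-bit exponent but only 10 of the 23 mantissa bits. Here is a rough host-side sketch in plain C++ that emulates that precision loss. The helper name `truncate_to_tf32` is made up for illustration, and plain truncation is an assumption; the hardware's actual rounding behaviour may differ.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a CUDA API): keep the sign, the 8-bit exponent and
// the top 10 mantissa bits of an FP32 value, which matches the TF32 layout.
// Assumption: simple truncation; real Tensor Core rounding may differ.
float truncate_to_tf32(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // reinterpret without aliasing UB
    bits &= 0xFFFFE000u;                   // clear the low 13 mantissa bits
    float out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}
```

So a step of 2^-10 above 1.0 survives, while a step of 2^-11 is lost: same range as FP32, roughly FP16-level precision.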
Well, I think a new microarchitecture means a major bump. So between that and version bumps due to actual major software features, you get to 11 within 13 years or so.
Also, GCC 9.x compatibility may seem minor to some, but is significant for others. I also think there's some C++17 support in kernels - that's something too.
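As a small illustration of why C++17 matters for kernel code, here is a sketch using `if constexpr`, which lets a single template pick a branch at compile time. It is written as plain host C++ so it compiles anywhere; the function name `halve` is just an example, and the assumption is that the same body would work in a `__device__` function when building with nvcc's `-std=c++17`.

```cpp
#include <type_traits>

// Example only: one template, two code paths chosen at compile time.
// Before C++17 this typically needed tag dispatch or SFINAE overloads.
template <typename T>
T halve(T x) {
    if constexpr (std::is_integral_v<T>) {
        return x / 2;        // integer path: truncating division
    } else {
        return x * T(0.5);   // floating-point path: multiply by one half
    }
}
```

Only the selected branch is instantiated for a given `T`, so there is no runtime branching cost.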
Ooh, I missed those. Support for C++17 is pretty major. Thanks. Perhaps my memory is fuzzy, but I just remember the CUDA 9->10 switch having some significant (but not major) performance and feature changes.
I got confused for a sec by "removing Pascal support", as some of the 10XX GPUs are only 2 years old. Looks like Pascal stays, and Maxwell is indeed removed (it was deprecated in 10.2).