* Able to port (“hipify”) most CUDA code to HIP with the hipify-perl tool
* Manual porting required for optimal performance in several critical sections
* Competitive performance for both scientific and machine learning codes
* Debugging and profiling tools will be key for future work
* Initial results are promising, but the overall environment needs to mature
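For anyone wondering what "hipify" looks like in practice: as far as I understand it, hipify-perl is largely a text-rewriting pass that renames CUDA API identifiers to their HIP equivalents. Below is a toy Python sketch of that idea, not the real tool. The `toy_hipify` function and the small `CUDA_TO_HIP` table are my own hypothetical names, and the table covers only a handful of well-known renames. It doesn't touch kernels at all.

```python
import re

# Hypothetical, deliberately tiny subset of CUDA -> HIP renames.
# The real hipify-perl tool handles far more than this.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaFree": "hipFree",
}

# Try longer names first and require word boundaries, so that
# cudaMemcpyHostToDevice is never rewritten as a partial cudaMemcpy match.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")\b"
)


def toy_hipify(source: str) -> str:
    """Rename the known CUDA identifiers in `source` to their HIP names."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(1)], source)


if __name__ == "__main__":
    cuda_snippet = (
        "#include <cuda_runtime.h>\n"
        "cudaMalloc(&d_buf, n);\n"
        "cudaMemcpy(d_buf, h_buf, n, cudaMemcpyHostToDevice);\n"
        "cudaFree(d_buf);\n"
    )
    print(toy_hipify(cuda_snippet))
```

A pure rename like this is also why the second bullet matters: it gets the code compiling against HIP, but it does nothing to tune the performance-critical sections for the new hardware.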
Traditionally, power measurements are taken at the wall with something like a Kill-a-Watt meter. There's no guarantee that AMD's and NVidia's power-draw functions are calibrated against each other.
With that being said, it's cool to see that MI50 is competitive against A100 in scientific compute. Both A100 and MI50 are on TSMC's 7nm node, so that is a pure architecture vs architecture competition.
ROCm is definitely less stable and has fewer features than CUDA. However, since so much code is in CUDA, it's good for AMD to provide a porting framework to their GPUs.
In my experience, the hardest thing about ROCm is the lowered expectation compared to NVidia GPUs. In particular, AMD only supports a select number of chips on ROCm, mostly coinciding with AMD's "MI" line of cards.
The supported architectures are Fiji, Polaris, Vega, and Vega 7nm (aka: Rx Fury, Rx 580, and Vega64 / Radeon VII). As soon as you deviate from these cards and use, say, a 5700 XT, ROCm compatibility plummets and you start seeing issues.
In contrast, NVidia supports even a low-end 1660 pretty well. Fortunately, the RX 580 has dropped down to nearly the price of a 1660, but if you make the mistake of buying an RX 550 instead (or a 5500 XT), things suddenly suck.
With enough research, a computer builder can get an RX 580 and build out a cheap dev-box for ROCm. But this "trap" of not supporting all hardware has probably made AMD lose a few customers.
With that being said, it seems like, flop-for-flop, the AMD cards offer superior price/performance. MI50 is very similar to Radeon VII (MI50 simply adds card-to-card communications and full-speed double-precision). As such, you can get MI50-like performance with a $700 Radeon VII (so long as you don't care about double-precision).
AMD is upfront about which GPUs they support ("The MI-line"). However, they need to be clearer about which consumer cards match the MI-line, and show sample computer builds for the hobbyist community.
These tests used V100 (TSMC 12nm), not A100.