In my experience, bitmap comparison testing is hard to sustain unless you have a dedicated service for maintaining the reference bitmaps and updating them when needed, a reporting system that shades the difference area, and people with enough time to review the results, figure out what each difference means, and determine root cause. I'm sure some of that has gotten easier over time with off-the-shelf tooling, but since I still see bespoke systems out there doing it, I doubt it's become turnkey yet.
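The core mechanism is simple even though the surrounding process is not. As a rough sketch (not any particular tool's API; the grayscale-list image format and threshold parameter are illustrative assumptions), a differ just walks both images and reports the region that changed, which a report can then shade:

```python
# Sketch of the bitmap-diff step: compare two same-sized images
# (here, 2D lists of grayscale pixel values) and return the bounding
# box of the differing region, so a reporting UI could shade it.
def diff_bounding_box(baseline, candidate, threshold=0):
    """Return (min_row, min_col, max_row, max_col) covering all pixels
    whose values differ by more than `threshold`, or None on a match."""
    box = None
    for r, (base_row, cand_row) in enumerate(zip(baseline, candidate)):
        for c, (b, k) in enumerate(zip(base_row, cand_row)):
            if abs(b - k) > threshold:
                if box is None:
                    box = [r, c, r, c]
                else:
                    box[0] = min(box[0], r)
                    box[1] = min(box[1], c)
                    box[2] = max(box[2], r)
                    box[3] = max(box[3], c)
    return tuple(box) if box else None

baseline = [[0] * 4 for _ in range(4)]
candidate = [row[:] for row in baseline]
candidate[1][2] = 255   # two changed pixels...
candidate[2][1] = 255
print(diff_bounding_box(baseline, candidate))  # → (1, 1, 2, 2)
```

Everything hard about this kind of testing lives outside that function: deciding whether a non-None result is a regression or an intentional change, and keeping the baselines current.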
It's also not something you want to do until the UI has solidified in that spot--which, for some apps, never happens. There's also the problem that you can usually only pixel-compare two shots from the same rendering system: same browser version if you're testing in a browser, same graphics drivers and rendering subsystem version if you're native, and so on. Frequently that means testing against frozen reference environments, which grow less representative of in-field behavior over time and are themselves a maintenance load to qualify and update.
When thinking about the "test pyramid" and why, for example, UI tests sit at the peak--high fragility, low specificity, high effort to triage and maintain--I'd put bitmap testing at the very tippy-top, on an antenna above the pyramid. They're useful, but they have a large hump to set up, a long tail of rather heavy maintenance, a failure could mean almost anything under the hood, and they churn like a mofo during any flow or visual refactor whatsoever, with no possible way to abstract them to mitigate that.
At that point, it's not really about usefulness anymore; it's about the opportunity cost of not spending the time on something else. I think bitmap tests are usually pretty questionable unless you're actually testing a rendering system where the bitmap is the primary output. A custom-rendered component ancillary to the application probably wouldn't meet that bar in most cases, unless it were complex and central enough to merit the operational risk, and stable enough to keep the maintenance burden manageable.