Tracks with what I’ve seen: A/B tests used primarily as a mechanism for gradual rollout rather than pure data-driven experimentation. Basically, expose functionality to internal users by default, then slowly expand it outwards to early adopters, then ramp it to 100% for GA.
It’s helpful in continuous delivery setups, since you can test and deploy the functionality first and move the release bottleneck past deployment itself: shipping the code and exposing the feature become separate decisions.
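A minimal sketch of what that staged rollout might look like, assuming a hypothetical flag config with an internal allowlist and a rollout percentage (names like ROLLOUT and is_enabled are made up for illustration):

```python
import hashlib

# Hypothetical rollout config: per-flag internal allowlist plus the
# percentage of the general population the flag is currently ramped to.
ROLLOUT = {
    "new_checkout": {"internal": {"alice", "bob"}, "percent": 5},
}

def is_enabled(flag: str, user_id: str) -> bool:
    """Deterministically decide whether a user sees the flagged feature.

    Internal users always see it; everyone else is bucketed by a stable
    hash of (flag, user_id), so ramping 5% -> 25% -> 100% only ever adds
    users and never flips someone back and forth between variants.
    """
    cfg = ROLLOUT.get(flag)
    if cfg is None:
        return False
    if user_id in cfg["internal"]:
        return True
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    return bucket < cfg["percent"]
```

The deterministic hash is the design point: GA is just editing "percent" up to 100, with no redeploy needed.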
If you roll it back upon seeing problems, then you're doing something meaningful, at least. IMO 90+% of the value of A/B testing comes from two things: a) forcing engineers to build everything behind flags, and b) making sure features don't crater your metrics before they get frozen in and become much more difficult to remove (both politically and technically).
Re: b), if you've ever gotten into a screaming match with a game designer angry over the removal of their pet feature, you will really appreciate the political cover that having numbers provides...
Not the parent, but speaking for some actual practitioners: a change is made based on gut feeling, and it's usually correct, but internal politics require demonstrating impartiality, so an "A/B test" is run to show that the change is "objectively better", whether the statistics actually show that or not.
I think gradual rollout can use the same mechanism, but for a different reason: avoiding pushing out a potentially buggy product to all users in one sweep.
It becomes an A/B test when you measure user activity to decide whether to roll out to more users.
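A rough sketch of that measurement step, assuming per-variant exposure and conversion counts have been collected while the flag sits at a partial rollout (the numbers and the ramp/hold decision rule here are invented for illustration):

```python
from collections import Counter

# Hypothetical metrics gathered during a partial rollout: how many users
# saw each variant, and how many of them converted.
exposures = Counter(enabled=10_000, control=10_000)
conversions = Counter(enabled=1_180, control=1_000)

def conversion_rate(variant: str) -> float:
    return conversions[variant] / exposures[variant]

lift = conversion_rate("enabled") - conversion_rate("control")

# Gate the next ramp step on the measured delta. A real test would also
# check statistical significance (e.g. a two-proportion z-test) rather
# than acting on the raw difference.
if lift >= 0:
    print(f"metrics healthy (lift={lift:.1%}); ramp to the next stage")
else:
    print(f"regression detected (lift={lift:.1%}); hold or roll back")
```

Same bucketing machinery either way; the only difference is whether the rollout decision is driven by those numbers or by the calendar.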