The point of multi-armed bandit situations is that there is a trade-off to be made between gaining new knowledge and exploiting existing knowledge. This comes up in your charts - the "MAB"s always have better conversion rates, because they balance between the two modes. The "A/B testing" always gain more information quickly because they ignore exploitation and only focus on exploration.
I should say also that multi-armed bandit algorithms also aren't supposed to be run as a temporary "campaign" - they are "set it and forget it". In epsilon-greedy, you never stop exploring, even after the campaign is over. In this way, you don't need to achieve "statistical significance" because you're never taking the risk of choosing one path for all time. In traditional A/B testing, there's always the risk of picking the wrong choice.
You aren't comparing A/B testing to a multi-armed bandit algorithm because both are multi-armed bandit algorithms. You're in a bandit situation either way. The strategy you were already using for your A/B tests is a different common bandit strategy called "epsilon-first" by wikipedia, and there is a bit of literature on how it compares to epsilon-greedy.
Also, despite this being a slightly pro a/b testing post, I have to say it's actually made me more interested in trying out Myna's approach MAB algorithm.
/former AB test software dev who fought my users to try to stop them from misinterpretation results, and failed.
Acceptance of this fact has avoided a lot of potential ulcers for me.
See my longer top-level comment for some of the trade-offs.