Maybe this is just me, but those performance claims seem a bit dubious. Or rather, the visualisation is suspect. While it's clear that DoH performed better in the cases where DNS had a hiccup, I can't say with certainty that DoH didn't have similar hiccups that were hidden by the way they presented the data.

In fact, assuming there's no deep, meaningful dependence between the performance of DoH and DNS requests, ordering by the performance of DNS requests clusters all the bad DNS requests together and shuffles the DoH requests randomly. It's therefore not surprising that after averaging you see a spike of bad DNS requests which is absent for the DoH requests (you're essentially looking at the quantile function of the DNS requests minus the average of the DoH requests).
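To make that concrete, here's a rough simulation of the effect. Everything in it is my own assumption, not their data: both "protocols" draw latencies independently from the same made-up distribution (about 50 ms, with a 5% chance of a large hiccup), so neither is actually better. Sorting the pairs by DNS latency and averaging per bucket still produces a DNS spike with no matching DoH spike.

```python
import random
from statistics import mean

def simulate(n=10_000, buckets=10, seed=0):
    """Bucket-average paired latencies after sorting by the DNS value only."""
    rng = random.Random(seed)

    def latency():
        # Hypothetical latency model (an assumption, not measured data):
        # roughly 50 ms, plus an occasional large hiccup.
        value = rng.gauss(50, 5)
        if rng.random() < 0.05:
            value += rng.expovariate(1 / 500)  # hiccup with mean 500 ms
        return value

    # Each pair is (dns_ms, doh_ms); the two draws are independent
    # and come from the SAME distribution.
    pairs = [(latency(), latency()) for _ in range(n)]

    # Order by DNS latency only, as the graph appears to do.
    pairs.sort(key=lambda p: p[0])

    size = n // buckets
    rows = []
    for i in range(buckets):
        chunk = pairs[i * size:(i + 1) * size]
        rows.append((mean(p[0] for p in chunk), mean(p[1] for p in chunk)))
    return rows

rows = simulate()
for i, (dns_avg, doh_avg) in enumerate(rows):
    print(f"bucket {i}: DNS {dns_avg:7.1f} ms   DoH {doh_avg:7.1f} ms")
```

The last bucket collects nearly all the DNS hiccups, while the DoH hiccups are spread evenly across every bucket and get averaged away. Sorting by DoH instead would flip the picture.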

Edit: unless what we're looking at is the difference of both quantile functions, but then the language describing the graph is a bit confusing, and just plotting both quantile functions would have avoided the ambiguity.
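For what it's worth, computing the two quantile functions separately is only a few lines. A minimal sketch, assuming you have the raw per-request latencies for each protocol; the sample numbers and the names `dns_ms` and `doh_ms` are invented for illustration:

```python
from statistics import quantiles

def quantile_curve(samples, n=10):
    """Return the n-1 cut points that split samples into n equal-probability groups."""
    return quantiles(samples, n=n, method="inclusive")

# Made-up latencies in milliseconds, purely illustrative.
dns_ms = [48, 50, 51, 52, 49, 53, 50, 900, 47, 51, 52, 1200]
doh_ms = [58, 60, 61, 59, 62, 60, 350, 57, 61, 63, 59, 60]

# Each curve is sorted on its own, so a hiccup in either protocol
# shows up in that protocol's upper quantiles.
dns_q = quantile_curve(dns_ms)
doh_q = quantile_curve(doh_ms)

for i, (d, h) in enumerate(zip(dns_q, doh_q), start=1):
    print(f"p{i * 10}: DNS {d:7.1f} ms   DoH {h:7.1f} ms")
```

Plotting `dns_q` and `doh_q` against the same percentile axis would show both tails side by side, without one series being ordered by the other.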

