
Azure Storage: How fast are disks? - ed_elliott_asc
https://www.grsplus.com/blog/2018/07/azure-storage-how-fast-are-disks/
======
partiallypro
Perhaps this is separate, but Azure has a bit of a latency problem with the
storage on their App Service Plans. I know that it has something to do with how
they are able to offer instant scaling up, down and out, and the ability to
move an app slot to another service plan instantly. However, it does cause some
major performance issues. It's also very slow to unzip items: unzipping from
the terminal will end up just timing out (though the process will complete
eventually). This causes auto-update failures way too often. WordPress,
Magento, etc. objects load really slowly, for example. I hope someone at
Microsoft reads this and figures out a way to bring this latency down. It's a
really great service otherwise, in that you don't have to worry about patching
your server.

I've never had an issue with their blobs or CDN speeds, but the app service
disk latency is an issue. Specifically, in my case, with PHP apps.

~~~
educanon
We’ve had similar issues and timeouts in Node and PHP apps.

~~~
partiallypro
There is a cache setting you can put in your slot that will fix it, but the
app has to be restarted to clear the cache. It's not selective on what gets
cached...and it doesn't work well with most apps. If you could selectively add
directories that cache it would work so much better, or if it worked in some
asynchronous way if a file changed in the cache (for apps that auto-update.) I
don't know the solution, but it's an annoying problem.

If you Google/Bing for "Azure Website slow", this issue is the cause of 99% of
all the complaints. The current solution, "Local Cache", is lacking. It does
work, but not in my use cases.

[https://docs.microsoft.com/en-us/azure/app-service/app-servi...](https://docs.microsoft.com/en-us/azure/app-service/app-service-local-cache-overview)

~~~
h1d
If something works poorly without manually applying cache, it reminds me of
WordPress. It seems it's nearly impossible to get Azure on par with the rest
in terms of performance.

------
thejosh
Azure disk speed for VMs is absolutely horrendous for the amount you have to
pay. Our workloads are so bad because of this.

~~~
nvivo
I completely agree. For some reason they're not 100% SSD, and that changes
everything. That is the main reason I avoid Azure. Last time I tested (about 8
months ago), I got a new Windows Server instance with 4 GB RAM and ran Windows
Update. On my laptop it takes 20 minutes, on AWS 40 minutes or so, and on Azure
almost an entire day! Partly this is because MS doesn't keep their images as
updated as AWS does, but mostly it is because HDD rather than SSD is the
default. Not advocating anything, just my experience.

~~~
h1d
What is the point of using Azure if the performance is way worse than
competitors?

~~~
teddyuk
Is the performance worse than AWS?

[https://www.datadoghq.com/blog/aws-ebs-latency-and-iops-the-...](https://www.datadoghq.com/blog/aws-ebs-latency-and-iops-the-surprising-truth/)

~~~
nvivo
I know what I see in my apps. Yes, Azure is worse than AWS. AWS has its flaws,
but Azure is simply very slow in comparison.

------
jen20
It would be far more useful to see the upper percentiles of latency than an
average, and to see the tests run over a sustained period of time.

As usual, the interesting information comes not during regular operation, but
during incidents, where it is not uncommon for the latency to spike
_massively_.
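
To illustrate why the upper percentiles matter more than the average, here is a minimal sketch in Python. The sample data is made up for illustration (it is not from the article), and `latency_summary` is a hypothetical helper using the nearest-rank percentile method.

```python
import math
import statistics


def latency_summary(samples_ms):
    """Return the mean and upper percentiles for latency samples in ms."""
    ordered = sorted(samples_ms)

    def percentile(p):
        # Nearest-rank method: the smallest sample such that at least
        # p percent of all samples are less than or equal to it.
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "mean": statistics.fmean(ordered),
        "p50": percentile(50),
        "p99": percentile(99),
        "p99.9": percentile(99.9),
        "max": ordered[-1],
    }


# Made-up data: 998 operations at 1 ms and 2 stalls of 1 second each.
samples = [1.0] * 998 + [1000.0] * 2
summary = latency_summary(samples)
for name, value in summary.items():
    print(f"{name:>6}: {value:.3f} ms")
```

With this data the mean is about 3 ms and the 99th percentile is still 1 ms, so only the 99.9th percentile and the maximum reveal the one-second stalls.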

~~~
ed_elliott_asc
Hi,

The percentiles are in the images, I was being lazy and didn’t add them to the
text. If enough (well a few) people want them I’ll add them.

Azure storage either works for days and weeks 100% consistently, or they have
an outage and are down for days, so even when I have run longer tests they
don’t show any difference.

If you monitor and they have an outage, the latency basically goes to a day per
IOP :)

~~~
jen20
Heh - I only did a scan read, but the 99.9th percentile would be much more
useful than the average in the table.

I agree on the last part - but I think it's important to measure the effects
during outage too. One we often see is fsync operations taking minutes with
write caching disabled - this is more common than just during outages however!
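
A rough way to observe this kind of stall is to time each write-plus-fsync individually and look at the worst case. The following is only a sketch: `time_fsync_writes` is a hypothetical helper, the block size and write count are arbitrary assumptions, and the scratch directory would need to sit on the disk being tested to say anything useful about it.

```python
import os
import tempfile
import time


def time_fsync_writes(directory, count=100, block_size=4096):
    """Write `count` blocks, fsync after each, return durations in ms."""
    payload = b"\0" * block_size
    durations_ms = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to flush this file to the device
            durations_ms.append((time.perf_counter() - start) * 1000)
    finally:
        os.close(fd)
        os.remove(path)
    return durations_ms


# Assumption: a temporary directory stands in for the disk under test.
with tempfile.TemporaryDirectory() as scratch:
    durations = time_fsync_writes(scratch, count=20)

print(f"worst write+fsync: {max(durations):.2f} ms "
      f"over {len(durations)} writes")
```

Run over a sustained period, the per-operation durations can be summarized by upper percentiles rather than an average, which is where multi-second or multi-minute fsync stalls would show up.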

~~~
ed_elliott_asc
Leave it with me and I’ll get some longer runs

------
he0001
I ran some quick tests on Azure with an app which relies heavily on fast write
ops. On a dedicated server with SSD this is very fast, in the range of <1ms
(negligible latency in comparison). But on Azure there were random spikes of
>1s writes. I haven’t tested any further but will do some more thorough tests.

------
cm2187
> HDD uncached read latency: 11ms, SSD: 4.5, local SSD: 1ms

I am surprised by the poor performance of the remote SSD. I was under the
impression that the IP latency within the same datacentre would be <1ms. Does
anyone know what is causing this?

~~~
nik736
We even have two completely different datacenters within Frankfurt, connected
through redundant waves that are most of the time < 0.6-0.8ms

~~~
ed_elliott_asc
This is a good point, I’ll replicate some of the tests across regions to
compare, I expect there will be a difference, maybe within datacenters too

------
CyanLite2
A lot of FUD in this thread here.

If you're using the free tier of App Service plans, yes you're going to have
perf limitations. And that's not Azure specific.

If you choose hard drives that are HDD and not SSD, you're going to have perf
limitations. That's not Azure specific.

The cost of SSD is about the same between AWS/Azure/GCP.

------
what-the-grump
I’ve pulled 40k IOPS across 3 disks in Azure running SSD. The new managed
disks are fast enough for our workloads. Yes, they cost money, but looking at
overall cost, disk is ~15-18% of total spend for us.

There are some bugs with the new archive tier and some weird behaviors for
non-SSD disks when you start forcing high queues.

~~~
thejosh
What sort of monthly spend?

~~~
ed_elliott_asc
We stripe two sets of 4 P30s to get up to 2,000 MB/sec on GS5s. Storage
costs about £800 per VM per month for that.

