
MySQL performance optimization: 50% more work with 60% less latency variance - kilimchoi
http://engineering.pinterest.com/post/122520169079/mysql-performance-optimization-50-more-work-with
======
thaumaturgy
My jaw landed on the floor when it said that pinterest runs on AWS. AWS is
fantastic but I would've expected that the performance-per-dollar ratio is way
below having your own colo'd hardware at pinterest scale -- and would leave
you spending talent on trying to optimize your software, instead of just
throwing more hardware at your operation.

Is this common now, that pinterest-sized businesses are running on AWS?

~~~
Someone1234
It actually can make more sense for larger websites than for smaller ones, as
Pinterest can scale with demand and thus be running 50% or fewer VMs during
quiet periods (e.g. middle of the night).

I'm not exactly sure how you jumped from them using AWS to the assumption that
they're "throwing more hardware at [their] problems" rather than optimizing
their software. AWS or other cloud services don't really indicate a certain
priority as far as internal development, just like first party hardware
doesn't (and there are companies that throw additional hardware at problems,
just like there are companies that throw extra virtual capacity at it).

~~~
fweespeech
> It actually can make more sense for larger websites than for smaller ones,
> as Pinterest can scale with demand and thus be running 50% or fewer VMs
> during quiet periods (e.g. middle of the night).

You do understand we can buy things at 33% the price AWS charges, right? And
have 24/7 access?

Oh, did you know you can do the same thing at other providers for cheaper? :|

AWS is overpriced.

~~~
eli
Is it not possible that the peak load is more than 3x the average load? With
your own hardware you have to be provisioned for peak load 100% of the time.
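
To put rough numbers on that, here is a minimal sketch of the break-even
point. It assumes the "33% of the AWS price" figure quoted above, perfectly
elastic scaling, and hourly billing. The function names and the load profile
are made up for illustration.

```python
# Hypothetical model: cost is proportional to servers * hours * unit price.
# Owned hardware must cover the peak hour all day; elastic capacity tracks load.

def owned_cost(hourly_load, unit_price):
    """Cost of provisioning for the peak hour, around the clock."""
    return max(hourly_load) * len(hourly_load) * unit_price

def cloud_cost(hourly_load, unit_price):
    """Cost of paying only for the servers each hour actually needs."""
    return sum(hourly_load) * unit_price

def break_even_peak_to_average(owned_price, cloud_price):
    """Peak/average load ratio above which elastic capacity is cheaper."""
    return cloud_price / owned_price

# Made-up day: 20 quiet hours at 10 servers, 4 peak hours at 100 servers.
load = [10] * 20 + [100] * 4
owned = owned_cost(load, unit_price=1.0)   # owned priced at 1/3 of cloud
cloud = cloud_cost(load, unit_price=3.0)
print(owned, cloud, break_even_peak_to_average(1.0, 3.0))
```

In this made-up profile the peak is 4x the average, so the elastic option
comes out cheaper even at 3x the unit price. With a flatter load it would not.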

~~~
fweespeech
> Oh, did you know you can do the same thing at other providers for cheaper?
> :|

[https://www.stormondemand.com/servers/baremetal.html](https://www.stormondemand.com/servers/baremetal.html)
32/64GB - $0.75/hr

[https://www.linode.com/pricing](https://www.linode.com/pricing) 20/64GB -
$.96/hr _and_ bandwidth

[http://aws.amazon.com/ec2/pricing/](http://aws.amazon.com/ec2/pricing/)
16/64GB - $1/hr + bandwidth costs @ [varies]

Even if you take your premise at face value, AWS is literally the most
expensive option.

Personally, I've never seen a real world use case where you dropped below 33%
capacity. In practice, you can get hourly billing for cheaper than AWS anyway
with providers large enough to meet most customers' needs.
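
For reference, a rough sketch of what those hourly rates work out to. It
assumes the first number in each listing is a core count, a 730-hour month,
and no bandwidth charges (which the AWS listing bills separately). The labels
are just shorthand for the three links above.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours

# (label, cores, ram_gb, usd_per_hour), copied from the listings above.
offers = [
    ("stormondemand", 32, 64, 0.75),
    ("linode", 20, 64, 0.96),
    ("aws", 16, 64, 1.00),
]

def monthly_cost(usd_per_hour):
    """Cost of running one instance for a full month, bandwidth excluded."""
    return usd_per_hour * HOURS_PER_MONTH

def per_core_hour(cores, usd_per_hour):
    """Hourly price divided by the (assumed) core count."""
    return usd_per_hour / cores

for label, cores, ram_gb, rate in offers:
    print(f"{label}: {cores} cores / {ram_gb}GB, "
          f"${monthly_cost(rate):.2f}/month, "
          f"${per_core_hour(cores, rate):.4f}/core-hour")

# Cheapest by raw hourly rate.
cheapest = min(offers, key=lambda offer: offer[3])[0]
```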

~~~
tracker1
Linode, and most alternatives, don't have S3/Blob storage as a service. Also
missing are distributed key-value storage as a service and SQL as a service.

Generally, for any given service, it makes sense to run your own instances as
you scale. But realistically, someone who knows _insert-tech_ better than the
AWS/Azure admins is a pretty rare breed, and in-sourcing those specific skills
costs time and money, redundancy even more so.

If you have a team of 3-5 people, you can do far more with AWS or Azure than
you can with most alternatives... Employees aren't free.

------
Kassandry
It sounds like they really ran afoul of the stable page writes bugs that exist
in Linux 3.2.31 to 3.9.

Thread discussing the bug and a test program to show the issue.

[https://lkml.org/lkml/2012/10/9/210](https://lkml.org/lkml/2012/10/9/210)

Patch introducing the bug:

[https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux....](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/?id=3d08bcc887a1c8d12be8d81f747ffa2e8a44b67b)

Patch fixing the bug:

[https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux....](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/?id=1d1d1a767206fbe5d4c69493b7e6d2a8d08cc0a0)

It causes a sharp rise in latency and a huge dropoff in throughput by causing
all filesystems to block even when they don't need to.

In short, 3.2 is lethally bad in terms of database workloads.

------
dsheth
Would be interesting to see how this compares to RDS--i.e. perhaps all these
optimizations are already in place, or perhaps it makes sense to not use RDS,
and optimize mysql for your own workload.

------
acd
A question: with such high load, why not run it on dedicated hardware with PCI
Express SSDs? You are paying around $1000-2500 per month for the instance
type; for that you can get a dedicated machine and reduce the total number of
servers.

~~~
MichaelGG
Or use Google Cloud which will stick a 375GB PCIe SSD in your VM for $80 a
month.

~~~
campers
Yeah their SSD pricing example seemed to blow AWS away
[http://googlecloudplatform.blogspot.com.au/2015/04/understan...](http://googlecloudplatform.blogspot.com.au/2015/04/understanding-cloud-pricing-part-2.html)

~~~
MichaelGG
Awesome.. As much as I dislike Google as a business, I love their engineering
and despite being totally biased coming on to GCP, they won me entirely in a
day. We're so, so very much happier dealing with GCP. No RAID to get perf -
just ask for more. No complexity. Simple, damn fast, and cheap.

I cannot imagine how Azure or AWS compete (esp Azure - even without making a
cheap comment about their terrible new portal) when it comes to IaaS. GCP is
just flat out better.

------
Thaxll
When I see the price/perf that AWS offers compared to high-end hosting like
Rackspace, it's crazy.

I don't know why people keep running MySQL on AWS.

~~~
pbz
Rackspace is on the high end as far as price. You can find DCs for much much
less (multiple times less).

------
rastem
This article doesn't talk about the optimizations, it just mentions they were
done. Clickbait, I'm afraid.

~~~
dragonne
The link at the end points to this slide deck, which has more specific
details:
[http://www.slideshare.net/denshikarasu/all-your-iops-are-bel...](http://www.slideshare.net/denshikarasu/all-your-iops-are-belong-to-us-a-pinteresting-case-study-in-mysql-performance-optimization)

I recommend taking a look at it.

------
thrownaway2424
It's amazing to me that anyone still tries to run mysql with the glibc
allocator. That allocator is not suitable for any kind of multithreaded
workload. Use tcmalloc or jemalloc for everything.

