

Ask HN: Dealing with a huge MySQL database - help - webvet

We have been dealing with a huge MySQL database on one of our projects.

Total byte size: > 150GB. Record count: > 100M (and growing daily).

The primary issue is performance (this is a live site), currently hosted on a (fairly powerful) VPS.

Not having prior in-house experience with such data volumes on MySQL, we hacked together our own table partitioning architecture of sorts and also optimized the queries, table structures, etc. to the best of our abilities. All this has been working reasonably well for us so far.

But now, with rising traffic levels, things are beginning to slow down again.

At this point, we're considering the following 3 options:

1) Throw more hardware resources at it
Pro: Instant gratification :)  Cons: Cost + vicious cycle(?)

2) Try to optimize the database/config, table structures and queries further
Pro: Always useful   Cons: Time + Effort

and

3) Migrate to PostgreSQL (or something else?)
Pros/Cons: Still studying

What do HNers with experience in such matters suggest? We're open to all suggestions. Thanks in advance.
======
debaserab2
There is no silver bullet.

1) Analyze your indexes on all tables. Start with the high-volume insert/read
tables.

2) Look at your slow query log. Pick the slowest three queries. Optimize them.
Set up a recurring meeting with your team to do this every week (a sketch of
this workflow follows below the list).

3) Start looking at which data is less important, can be flattened, and could
be moved into a key-value datastore.

4) Consider adding a replica (slave) server and offloading some of the
high-volume read queries to it. It also serves as redundancy.

5) Consider where it may make sense to move tables to InnoDB. Table-level
locking can be extremely time-consuming when traffic is high; row-level
locking can help here.

6) Figure out what your hardware upgrade path looks like. You will have to do
this eventually. Identify when you might hit a hardware ceiling with your
hosting provider and get a plan in place. Coordinate with your marketing/sales
team and upgrade before you hit a traffic peak, not during.
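A minimal sketch of the workflow in points 1 and 2; the orders table, its
columns, and the index name are hypothetical, not from the thread:

    -- See which indexes already exist on a hot table:
    SHOW INDEX FROM orders;

    -- Ask the optimizer how a slow query is executed; type=ALL in the
    -- output means a full table scan:
    EXPLAIN SELECT * FROM orders WHERE user_id = 42 AND status = 'open';

    -- Add a composite index covering the WHERE clause:
    ALTER TABLE orders ADD INDEX idx_user_status (user_id, status);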

~~~
webvet
Thanks. 1) and 2) - we've done those to the max. 4) is already in place, with
some differences.

~~~
debaserab2
Unless your development is completely frozen, #2 is never done

~~~
webvet
Development froze almost 2 years ago :) Since then, just the data volume keeps
growing.

~~~
cellover
Wouldn't it be interesting to unfreeze the development and see if you can't
refactor to avoid unnecessary queries? Code topology can heavily affect the
number of queries.

~~~
webvet
What I meant was feature development... that's what got frozen long ago...
code revision exclusively for performance optimization has been ongoing
continually.

------
charlesju
I have a little bit of experience dealing with large datasets and MySQL. This
is going to be a little counter-intuitive to standard startup culture advice,
but generally, once you're past the initial optimization pass, the engineering
effort to keep optimizing the database is not worth it compared to throwing
money at the problem. You have to weigh the fixed cost of the optimization and
the variable cost of the upkeep of the system against simply patching the
system for now.

If you are willing to throw money at the problem, which you seem like you're
wavering on, here are some options on how to throw money at the problem:

1. If you are willing to be hands-on and maintain your MySQL cluster
yourselves, you should move away from EC2 (I am presuming) to a local colo
where you can install super fatty SSD machines with massive RAID setups that
will greatly improve your performance per server (on the order of up to 10x).

2. You can do a hybrid solution with your existing infrastructure. It sounds
like a lot of the data is not necessarily going to be used that much (probably
tracking all events, from what I am assuming, or something similarly big-data
intensive); you can throw that into a separate database infrastructure.

If my assumption is correct (you are using a large portion of the database for
stat tracking, not user information), then the "proper" way to throw it
somewhere is to put it into S3 as fat logs that you can later process with
something like MapReduce. Other options for more efficient storage are Redis
and MongoDB.

3. We have taken this to the extreme and actually moved all of our live data
to DynamoDB. It is a NoSQL database storage system operated by Amazon. This
has let us concentrate on features and the user experience. The cost is that
our database is probably 5-10 times more expensive than a self-rolled,
SSD-backed Redis equivalent hosted at our local colo.


Just some other general database scaling advice for MySQL. The steps to
scaling your MySQL database for live use are as follows:

1. Do not use any relational calls (JOINs) on your data. They are fine for
databases that aren't serving live web transactions, but they can seriously
back up a live system. (A denormalization sketch follows at the end of this
comment.)

2. Put indexes on all of your heavily used columns for finding data.

3. Look at the slow queries and optimize those queries out.

4. Call up Percona and ask for a full consultation to make sure your configs
and systems are set up properly for MySQL.

[http://www.percona.com/products/mysql-consulting/overview](http://www.percona.com/products/mysql-consulting/overview)

5. Email me if you have more questions: charlesju at gmail.
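A hedged sketch of what dropping relational calls can look like in practice;
the tables and columns are hypothetical, not from the original post:

    -- Instead of joining on every page view:
    --   SELECT o.*, u.name FROM orders o JOIN users u ON u.id = o.user_id;
    -- store the joined value redundantly at write time:
    ALTER TABLE orders ADD COLUMN user_name VARCHAR(255);

    -- Reads then become a single indexed lookup on one table:
    SELECT * FROM orders WHERE user_id = 42;

The trade-off is write-time bookkeeping: the denormalized column must be kept
in sync whenever the source value changes.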

~~~
webvet
Hey, I'll have to check with the others here, but I think we have not been
aware of 1. !!! Will certainly investigate this ASAP - thanks.

2\. and 3. - already implemented to the max (of our abilities :)).

------
a3n
"Not having had in-house prior experience with such data volumes on MySQL ...
At this point, we're considering the following 3 options:"

4. Hire or contract a consultant/expert. Part of their assignment could be to
teach you.

~~~
webvet
Hey, thanks :) We are actually considering this, but it skipped my mind while
making the post.

~~~
king_magic
This is probably the right answer. Look at it from a cost-benefit POV - how
much money will getting the right answers _now_ save you in the long run?

------
geophile
What exactly is getting slow? Inserts? Updates? Queries? Which ones? Have you
looked at the slow query log? Have you checked the plans for the slow queries?

Are these MyISAM tables or InnoDB?

150GB and 100M records is not huge, not even close. You are looking for a
quick fix and there isn't one. You need to start with (2).

~~~
webvet
> 150GB and 100M records is not huge, not even close.

Close to what, if I may ask?

~~~
nfailor
not even close to 'huge,' although my English syntax is a little rusty.

~~~
webvet
Regardless of your English syntax (rusty or not), what would you define as
'huge' for a MySQL app?

------
ainsej
Caching is definitely a must-have: both an external cache (Memcache, Redis)
and tuning MySQL's own caches in the configuration. Using InnoDB means you can
configure a large buffer pool in the MySQL config to speed things up
considerably.
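A sketch of checking whether the InnoDB buffer pool is actually absorbing
reads; these are standard MySQL status variables, and the pool size itself is
set in my.cnf (a restart is needed to change it on MySQL of this vintage):

    -- Current pool size, in bytes:
    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
    -- If Innodb_buffer_pool_reads (disk reads) grows fast relative to
    -- Innodb_buffer_pool_read_requests (logical reads), the pool is
    -- too small for the working set:
    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';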

Disk speed could also be a huge issue, especially on a VPS where resources are
shared between many virtual machines. Consider migrating your database off to
a dedicated server (preferably one with an SSD); depending on where you are in
the world, OVH might be a good choice.

------
netaustin
Here are some guiding questions:

1) Do you actively need all 100M records, or is there a period after which you
can archive them?

2) Have you partitioned data across multiple servers, or just multiple tables
on one server?

3) What is the nature of your expensive queries? Are you generating reports?
User dashboards?

I think a general idea of what your application is doing would help us provide
some guidance.

~~~
webvet
1) Yes, and no 2) Both (3 servers) 3) We've analyzed/optimized this to the max
(of our abilities)

------
mjhea0
In my experience, neither the size of the database nor the number of records
really matters. It's the number of queries that you are handling at one time
that can really affect performance.

Sure, if you have a single table with a ton of records this can slow things
down - but not nearly as much as dealing with a high number of queries.

I highly recommend this book -
[http://www.amazon.com/dp/0596101716/?tag=stackoverfl08-20](http://www.amazon.com/dp/0596101716/?tag=stackoverfl08-20)

Yes, caching can really help. Logically placed indexes can too. Switching to
Postgres will not help; the bottleneck will still be there. The key is to find
it before you start tuning/optimizing:
[http://www.singlehop.com/blog/analyzing-mysql-performance-and-bottlenecks/](http://www.singlehop.com/blog/analyzing-mysql-performance-and-bottlenecks/)

Good luck!

~~~
webvet
Will get the book and check out the blog soon. Thanks. :)

------
WestCoastJustin
I would not migrate to Postgres without knowing the issue first. Here are some
suggestions: Enable replication (master plus one or more slaves), assuming you
have a read-heavy architecture. Enable the slow query log [1], which will give
you something like /var/log/mysql-slowquery.log; you can then look in there to
find queries that take a long time. Find out what the bottleneck is - I'm
willing to bet you have limited I/O and that is the source of your issues. Can
you use memcached [2] to limit the database hit rate? You want to remove or
limit direct database access where requests keep pulling the same data out.
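A minimal sketch of enabling the slow query log at runtime; the variables are
standard MySQL system variables, while the path and threshold are just
examples:

    SET GLOBAL slow_query_log = 'ON';
    SET GLOBAL slow_query_log_file = '/var/log/mysql-slowquery.log';
    SET GLOBAL long_query_time = 1;  -- log anything slower than 1 second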

Before you start to optimize I would profile things to get a baseline for how
many selects/inserts you are doing and how long they are taking, plus system
load, etc. Based on your 100M statement (treating that as one day's worth of
inserts), you are doing roughly 1157.4 inserts/s. Maybe that's 250 inserts/s
during a slow time and 3,000 inserts/s at peak, but it would be nice to know.

    
    
    100,000,000 (inserts) / 86,400 (seconds/day) = 1157.4 (inserts/s)
    

I guess it all depends on the insert size and index but you should be able to
scale this. If you cannot then you need to partition/shard your data. After
reading all this, you are probably thinking, I should just throw more hardware
at the problem ;)

P.S. Have you stopped to ask if you really need all this data?

[1] [http://dev.mysql.com/doc/refman/5.6/en/slow-query-log.html](http://dev.mysql.com/doc/refman/5.6/en/slow-query-log.html)

[2] [http://memcached.org/](http://memcached.org/)

~~~
webvet
[1] Been doing already and [2] Will re-visit soon

Thanks :)

------
apalmblad
Hmm.

Bringing in a consultant is probably your best bet. Database optimization
generally needs to take into account YOUR data and usage patterns.

If you want to learn a bit yourself, pick up a copy of "High Performance
MySQL." I've found that to be the most useful MySQL scaling book I own.

Figure out your slow queries. Run "SHOW PROCESSLIST" on the server and see
what's typically running.

Look at the actual queries - if you're frequently running a query that looks
through most rows of your biggest table, that's going to be a tough query to
optimize. Instead, look to either caching or regular pre-computation of the
results.

I've found that an unfortunate number of people aren't aware of "EXPLAIN" and
its use to help figure out query issues. Learn and use it.
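A short sketch of both tools in use; the query and table are hypothetical, and
the columns described are the standard EXPLAIN output columns:

    -- What is running right now? Long-lived entries under "State"
    -- are candidates for closer inspection:
    SHOW FULL PROCESSLIST;

    -- Check the plan for a suspect query; a "type" of ALL means a full
    -- table scan, and "Using temporary" / "Using filesort" under Extra
    -- are common red flags:
    EXPLAIN SELECT * FROM big_table WHERE some_column = 'value';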

You're on a VPS. That may be fine, but it may have horrible I/O throughput. If
you're writing a lot of data, or running queries that aren't being absorbed by
some sort of MySQL or OS cache, I/O will be your bottleneck. Make sure it's
fast enough. Look at average I/O wait times. Test max throughput.

MyISAM tables may be an issue. MyISAM uses table-level locking: only one
session at a time can update a table. This is quite possibly a problem if
you're doing any updates or inserts to a table that is also frequently read
from. Look to move to InnoDB soon. See:
[http://dev.mysql.com/doc/refman/5.1/en/internal-locking.html](http://dev.mysql.com/doc/refman/5.1/en/internal-locking.html)
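A minimal conversion sketch for a single table (the table name is
hypothetical; note that a plain ALTER locks the table while it rebuilds, which
is why big tables are often migrated with an online tool instead, as mentioned
elsewhere in this thread):

    -- Confirm the current engine:
    SHOW TABLE STATUS LIKE 'events';
    -- Convert; the table is locked for the duration of the rebuild:
    ALTER TABLE events ENGINE=InnoDB;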

~~~
webvet
>Database optimization generally needs to take into account YOUR data and
usage patterns.

I'd say this is quite true.

>I've found that an unfortunate number of people aren't aware of "EXPLAIN" and
its use to help figure out query issues. Learn and use it.

Yes, we discovered this some years ago.

Thanks for your input.

------
janlin1999
It sounds like your database is roughly the same size as the database that we
work with, and MySQL performance has been a recurring issue for us. Assuming
that you've done a reasonable job of option 2, I gravitate towards option 1
and maybe option 4 (depending on how confident you are of your team's
technical capabilities and how expensive consultants are relative to
additional hardware).

These are intermediate steps that have helped, giving us some time to figure
out whether we want to migrate to a different database:

1) We started using InnoDB a few years ago.

2) One of our common queries joined parts of multiple large tables; we
essentially cached the useful parts of the join and put the results into a
memory table (a sketch follows after this list).

3) Our non-production server is under a lot of load from our back-end
processes, and we decided to move the database to a RAM disk, which greatly
improved performance. It's important to note that, while inconvenient, a power
outage is not catastrophic in our case (e.g. we don't deal with financial
transactions); we do daily backups to non-volatile memory. For around $5,000,
you can build your own server that has over 250 GB of RAM and then put it into
a co-location facility.
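A hedged sketch of the memory-table trick from point 2, with hypothetical
table and column names; note that MEMORY tables cannot hold TEXT/BLOB columns
and their contents are lost on restart:

    -- Materialize the hot part of an expensive join into RAM:
    CREATE TABLE hot_join_cache ENGINE=MEMORY AS
      SELECT o.id, o.user_id, u.name, o.total
      FROM orders o
      JOIN users u ON u.id = o.user_id
      WHERE o.created_at > NOW() - INTERVAL 7 DAY;

Reads then hit hot_join_cache directly; the cache has to be rebuilt or
refreshed on a schedule.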

We seriously considered migrating to Postgres, but #3 has bought us some time
on that front. It might be that we eventually still migrate, but we prefer to
do it after the company is on better financial footing. It might also be that
while we wait, other less painful options open up over time (e.g. maybe
MariaDB will suffice).

I hope the situation resolves well.

------
mattjaynes
Consider that 37Signals probably has a much larger dataset and is still
happily scaling vertically due to Moore's law.

[http://highscalability.com/blog/2012/1/30/37signals-still-happily-scaling-on-moore-ram-and-ssds.html](http://highscalability.com/blog/2012/1/30/37signals-still-happily-scaling-on-moore-ram-and-ssds.html)

It's probably the cheapest, quickest, and least risky option if your
engineers' time is expensive (which it most likely is).

Of course, you'll ultimately also want to find bottlenecks, tune the DB, and
cache the hell out of everything, but scaling vertically will at least buy you
some good time so you can do that on a more relaxed timetable.

Related links:

[http://37signals.com/svn/posts/3202-behind-the-scenes-the-hardware-that-powers-basecamp-campfire-and-highrise](http://37signals.com/svn/posts/3202-behind-the-scenes-the-hardware-that-powers-basecamp-campfire-and-highrise)

[http://37signals.com/svn/posts/3090-basecamp-nexts-caching-hardware](http://37signals.com/svn/posts/3090-basecamp-nexts-caching-hardware)

[http://37signals.com/svn/posts/3112-how-basecamp-next-got-to-be-so-damn-fast-without-using-much-client-side-ui](http://37signals.com/svn/posts/3112-how-basecamp-next-got-to-be-so-damn-fast-without-using-much-client-side-ui)

------
morgo
Hi!

There are some good suggestions in this thread already. Step #1 is to
determine why it is slow.

If it is slow under load, then I am going to suspect that you are hitting
table locks, which means you need to migrate to InnoDB (row locking + MVCC).
Important to point out: a bigger server may not actually help much here,
because while sessions are waiting on locks, nothing can proceed even though
there is free capacity sitting idle.

If it is just generally slow related to growth, it can probably be improved by
indexing - which will help you reduce what data needs to be in RAM.

In either case, two third-party tools to suggest:

- pt-query-digest: aggregates your slow query log. You want to use 0 seconds
as the threshold and record 20 minutes of queries during regular activity (a
sketch follows below). Methodology described here:
[http://gtowey.blogspot.ca/2012/07/slow-query-log-is-not.html](http://gtowey.blogspot.ca/2012/07/slow-query-log-is-not.html)

- pt-online-schema-change: you can migrate from MyISAM to InnoDB online via
triggers (a plain ALTER is not online by default).

Both tools are part of [http://www.percona.com/software/percona-toolkit](http://www.percona.com/software/percona-toolkit)
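A sketch of the capture step, using standard MySQL system variables (the
20-minute window is from the linked methodology; remember to restore the
threshold afterwards):

    -- Capture every query for a sampling window:
    SET GLOBAL slow_query_log = 'ON';
    SET GLOBAL long_query_time = 0;
    -- ...let ~20 minutes of regular traffic accumulate, then:
    SET GLOBAL long_query_time = 1;  -- restore a sane threshold

The resulting log file is then fed to pt-query-digest, which prints a ranked
digest of total time per query fingerprint.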

~~~
webvet
We did use a tool (a Perl script) called mysqltuner.pl on this sometime last
year and it did give us some useful/actionable insight. Will check out what
you've suggested too.

Thanks. :)

------
tdonia
In line with questions 2 & 3: what's the bottleneck, reading or writing or
both? If you haven't already, it's worth looking into a master (writes) /
slave (reads) setup. It provides a bit of redundancy and lets you scale reads
very easily by adding more slaves. It also helps with writes because it can
take a lot of pressure off the master.
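For reference, a rough sketch of pointing a fresh replica at a master; the
host, credentials, and log coordinates are placeholders, and server IDs plus
binary logging must already be configured in my.cnf:

    -- Run on the replica (MySQL 5.x-era syntax):
    CHANGE MASTER TO
      MASTER_HOST='master.example.com',
      MASTER_USER='repl',
      MASTER_PASSWORD='secret',
      MASTER_LOG_FILE='mysql-bin.000001',
      MASTER_LOG_POS=4;
    START SLAVE;
    SHOW SLAVE STATUS\G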

...assuming you're webvet.com, it looks like you're scaling Drupal, so these
would be my first questions: are you using the Views module? If so, kill that,
as well as anything else by merlin (Panels, etc.), and look at your indexes.
Pressflow + Percona, master & slave DBs, and reverse proxy caching (Varnish or
nginx) are all good places to start. Host-wise, if you're not already on
Amazon, you can get a lot of mileage out of a service like Voxel that lets you
mix VPSs (cheap webheads) with physical hardware. It's not great, and it may
well be worth biting the more expensive bullet and going to AWS for long-term
needs, but it's a lot better than Linode for an underwater DB.

My email is in my profile if you'd like more detail. Drupal is a bear at
first, but it can be scaled for a while.

~~~
webvet
Thanks for your input - and no, we're not webvet.com

~~~
tdonia
ah ok, cool - good luck!

------
IgorPartola
One of the less documented tuning parameters I found is the thread cache size
(thread_cache_size). Normally it is set to a very small value relative to the
number of simultaneous connections you likely support. Increasing it can
eliminate random pauses/slowdowns.

In general, it is impossible to "tune MySQL" though. You can allow it to use
more system resources, and in some cases you can make it less durable to gain
write performance, but to get order-of-magnitude increases you need to
restructure your data. Try sharding or partitioning data. Increase or reduce
the number of indexes. Denormalize data. Archive logs to flat files. Put the
OS, /var/lib, and /var/log on separate physical drives. Run MySQL on bare
metal, since disk I/O is often a bottleneck and virtualization can add
overhead. Restructure the data. At one point, a write-heavy app I was
optimizing was logging things to the database (that needed to be queried).
Each log entry contained extra data we did not query. One speedup that worked
was throwing the majority of the extra data into a single gzipped JSON blob.
This made it easier for MySQL to fetch fewer pages. Less I/O, faster operation.
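A sketch of checking and bumping the thread cache at runtime; the variable and
status counter are standard MySQL names, while the value 64 is just an
arbitrary starting point:

    -- If Threads_created keeps climbing under steady load, the cache
    -- is too small and each new connection spawns a fresh OS thread:
    SHOW GLOBAL STATUS LIKE 'Threads_created';
    SET GLOBAL thread_cache_size = 64;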

~~~
webvet
I believe we did tune and tweak the heck out of my.cnf, but will investigate
thread_cache_size all the same.

I also believe we're archiving at least some logs (and such) to flat files
already, but will re-visit this soon too.

Thanks for all your input.

------
Demiurge
Have you considered rewriting MySQL in Go?

jk. What is the real nature of the queries, and what kind of caching is being
used? MySQL should not really get hit that often if you have 150GB of data and
a site, unless it's not just serving content but running some sort of deep
processing. In such a case, MySQL seems like the wrong tool for the job.

------
nla
Without doing an analysis, which is really warranted, here are some coarse
grain recommendations:

1) Get off the VPS if you can. If you can't, check out linode.com; they offer
better price/performance than most cloud services.

2) Take a look at the Percona build of MySQL - it is faster, especially with
more cores.

3) Use in-memory tables if possible.

4) Switch to InnoDB or XtraDB (in the case of Percona) tables... MyISAM
implements table-level locking, whereas InnoDB and XtraDB implement only
row-level locking. (A query to find remaining MyISAM tables follows below.)
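As a starting point for point 4, a quick inventory query; this uses the
standard information_schema, so it should work on any reasonably recent MySQL:

    -- List tables still on MyISAM, biggest first:
    SELECT table_name, engine, table_rows
    FROM information_schema.tables
    WHERE table_schema = DATABASE() AND engine = 'MyISAM'
    ORDER BY table_rows DESC;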

------
patrickg_zill
My first recommendation would be to get a dedicated server with a decent
amount of RAM (32GB or more) and put MySQL on that.

1. Greater memory bandwidth. 2. Dedicated disk resources. 3. You can ensure
that extra RAM is used as read and write cache.

I am not the cheapest provider by far, and I rent these kinds of systems for
$200 a month. If you go with someone else the price could be under $100 a
month.

------
gauravgupta
Cleanup - probably one of the most effective but often ignored tips for DB
management. Do you really need all that data in real time? Are you querying
all of it in your app? If not, it makes sense to archive parts of it to
another disk. The size reduction would boost performance tremendously.
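A common archiving sketch, with a hypothetical events table and cutoff date;
in production you'd typically do the copy/delete in small batches to avoid
long locks:

    -- Move old rows to an archive table, then delete them
    -- (assumes events_archive exists with the same schema):
    INSERT INTO events_archive
      SELECT * FROM events WHERE created_at < '2012-01-01';
    DELETE FROM events WHERE created_at < '2012-01-01';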

~~~
webvet
>Do you really need all that data in real time? Are you querying all of it in
your app?

Yes and yes. ALL the data records could potentially be queried via site
navigation.

------
CWIZO
Do you have any sort of caching in your app? Memcache, for instance, can go a
long way in helping you scale.

~~~
webvet
We don't have Memcache (or anything similar) presently, though I do believe we
looked at it sometime last year. Will get a re-check done on this - thanks. :)

~~~
ks
You should look into caching. If you are talking about webvet.com it looks
like a perfect site for using something like Varnish. Most of the content
probably doesn't change often so you should be able to have a long TTL.

------
benjaminwootton
Option 4) would be to hire an expert in these matters.

100M rows is not a lot of data nowadays.

~~~
webvet
Thanks - 4) has been on the cards, I just forgot to mention it.

------
lotsofcows
Not particularly large for MySQL. Consider using master / slave to separate
reads and writes, cache heavily and consider sharding if appropriate.

Alternatively, as suggested elsewhere, give Percona a ring.

------
Joyfield
You give us two values (150GB and 100M+ rows). That is not enough to give you
ANY suggestions other than give us more info.

~~~
webvet
Please let me know what more info would be helpful and I'll be happy to
provide the same.

~~~
amirouche
Are your queries uniform over the data? For instance, do most queries concern
only a small fragment of one table, such as the most recent rows? Can you
shard them?
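If the access pattern really is skewed toward recent rows, native range
partitioning is one way to exploit it; a minimal sketch with hypothetical
table and column names (note MySQL requires the partitioning column to be part
of every unique key on the table):

    ALTER TABLE events
      PARTITION BY RANGE (TO_DAYS(created_at)) (
        PARTITION p2013_07 VALUES LESS THAN (TO_DAYS('2013-08-01')),
        PARTITION p2013_08 VALUES LESS THAN (TO_DAYS('2013-09-01')),
        PARTITION pmax VALUES LESS THAN MAXVALUE
      );

Queries that filter on created_at can then be pruned to the newest partition
instead of scanning the whole table.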

------
petervandijck
Call Percona first. $10K well spent.

Then (probably, and assuming you have the basics right) just throw hardware at
it.

------
sgaither
How slow is "slow" for you right now? And writes vs. read? Which MySql storage
engine are you using?

~~~
webvet
Let me try to answer: Slow, as in load averages (from top, on CentOS) staying
between 1 and 2 pretty much throughout the day, closer to 2 than to 1, and
peaking ALL the way up to 10 and even beyond several times a day, for periods
as long as 15 to 30 minutes.

Storage engine: MyISAM

~~~
viraptor
Do you have some other measurements? slow_log, queries without indexes, low
key cache usage? Load is just one metric and can be very deceiving. Do you
have any backup system which hits the disks at the same time, batch jobs, or
something similar? Depending on your workload, "load" can vary - it just means
the writes are being queued up.

It's definitely not a good sign, but try to get more specific. I've seen
servers doing heavy network I/O with "normal" load over 5 times the number of
cores.

Also, unless you're doing loads of selects and very few modifications, you
could gain a lot by switching to InnoDB, or XtraDB, rather than MyISAM.
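For the measurements mentioned above, a couple of standard knobs and counters;
the variable names are real MySQL ones, though how useful they are depends on
the MyISAM-heavy setup described elsewhere in the thread:

    -- Also log queries that run without using any index:
    SET GLOBAL log_queries_not_using_indexes = 'ON';
    -- MyISAM key cache effectiveness: Key_reads (disk) should stay
    -- small relative to Key_read_requests (logical reads):
    SHOW GLOBAL STATUS LIKE 'Key_read%';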

~~~
thesis
Have to second this. You should monitor load for sure... but depending on how
many cores you are running, a load of 1 or 2 could be absolutely nothing in
the grand scheme of things.

Personally, the most important metric for us is average query response time.

Additionally as others have stated where you can use memcache or some other
key/value store you should be working on implementing that.

------
masterkain
stackoverflow -> this way

