
Tuning Your DBMS Automatically with Machine Learning - blopeur
https://aws.amazon.com/blogs/ai/tuning-your-dbms-automatically-with-machine-learning/
======
falcolas
If you are on MySQL RDS, the first thing you need to do is change the
INNODB_LOG_FILE_SIZE to something reasonable, like 2 gigs. I have preemptively
stopped at least 5 vertical scaling requests with this one change.

The default of 128mb is plain stupid. I get why Amazon chose it, because it
directly eats 2x the value of your backing store - something that can be hard
to explain to customers with 4gb disks attached and not really running any
appreciable load through it.

But when you have 100+ gig disks allocated on a 2xlarge instance, the small
value makes no sense whatsoever.

~~~
tudorconstantin
At my previous employer, a small (back then, 10x larger today) outourcing
company, I saved a customer's project by increasing innodb_buffer_pool_size
from 8MB to 2GB. As a context, it was back in 2010 when MySQL had MyISAM as
the default storage engine and I think 99% of the tutorials on tuning MySQL's
performance were focused on MyISAM, while the customer had most of its tables
as innodb. As a bonus, the value for innodb_buffer_pool_size was expressed in
bytes, showing a large, 7 digits, number. Before my arrival there all the
senior devs tried to improve the performance of that db. The customer bought a
server with 8GB of RAM and dedicated it solely to the db, while the db was
about 4GB in size. It was useless. When traffic to the web app increased, the
db daemon started paginating and the server load increased to a point of 0
responsiveness. There were simple queries on indexed fields in tables with
less than 1k records which took 8 seconds to execute.

It took me about 2 weeks of internet scavenging to find the culprit, which was
just the innodb_buffer_pool_size size. That was my first commercial project I
worked on for the employer and it got me the reputation of THE db performance
tunning expert in the company. I was with them for 5 years and that was a
quick&easy way to position myself as a valuable team member.

------
gerdesj
On GitHub: "No description, website, or topics provided."

In the article: "and collects its Amazon EC2 instance type and current
configuration"

... and I switched off.

I recently diagnosed a MySQL latency snag on a well known cloudy platform for
a customer. I run rather a lot of comparative bonnie++, MySQL bench and Lord
knows what else. I was able to convince the customer that my office PC ran
MariaDB better simply because my single SSD on a rather shag Lenovo PC (a cast
off from another customer!) had better i/o and latency than whatever they were
being given by said cloudy provider.

I suggest you start with the basics: CPU, RAM, disc I/O and latency, network
I/O and latency. Optimise those first and then work up the stack (and down,
then back up etc.)

If you start with "assume a spherical EC instance" you may not be considering
the whole problem -> solution -> realisation thing.

~~~
hashhar
But then there's the issue of cost. Tuning performance on the software side
only costs time and expertise but not money. And management doesn't really
want to spend money unless you tell them they are going under within a week.

~~~
throwanem
Sounds like GP is a contractor. In that world, time and expertise _is_ money -
contracting rates aren't perceived as a sunk cost the way salaries are - and
an SSD won't cost more than a few hours' billable time would. (Less than one
hour, for high-end firms.)

------
tkyjonathan
Technically there is a perl tool to make recommendations for mysql settings -
although not with AI. As a DBA, I would probably use this, but 80% of the
performance improvements come from indexing, fixing bad data models and
archiving - especially with RDS where the options for performance optimization
are limited.

~~~
danieltillett
I assume you are referring to this project [1].

1\. [https://github.com/major/MySQLTuner-
perl](https://github.com/major/MySQLTuner-perl)

~~~
tyingq
I love that project, but it seems like it could use a refresh. For example, it
currently still complains if the query cache is turned off...but Mysql does
that on purpose, and will remove it entirely soon. Last time I used it, it
also seemed like it was still straddling suggestions between MyISAM and
InnoDB. It seems safe enough at this point to have the tool strongly suggest
converting MyISAM tables and focusing solely on newer storage backends.

------
nosefouratyou
Reminds me of this: [https://blog.acolyer.org/2017/01/17/self-driving-
database-ma...](https://blog.acolyer.org/2017/01/17/self-driving-database-
management-systems/)

~~~
elvinyung
Yep! Peloton is by the same people.

------
morgo
Product Manager for the MySQL Server here.

The default configuration for MySQL is for a server with 512M RAM:
[https://dev.mysql.com/doc/refman/5.7/en/memory-
use.html](https://dev.mysql.com/doc/refman/5.7/en/memory-use.html)

.. so some improvement should be expected. The paper shows what has been tuned
in each configuration (pasting below).

It would be diligent to point out that the DBA configuration actually changes
the semantics so you may lose data (disabling doublewrite, flush-log-at-trx-
commit=0).

(a) OtterTune Configuration (MySQL)

=============================

innodb_buffer_pool_size 8.8 G

innodb_thread_sleep_delay 0

innodb_flush_method O_DIRECT

innodb_log_file_size 1.3 G

innodb_thread_concurrency 0 # this is the default

innodb_max_dirty_pages_pct_lwm 0 # this is the default

innodb_read_ahead_threshold 56 # this is the default

innodb_adaptive_max_sleep_delay 150000

innodb_buffer_pool_instances 8 # this is the default

thread_cache_size 9 # this is the default

(b) DBA Configuration (MySQL)

=========================

innodb_buffer_pool_dump_at_shutdown 1

innodb_buffer_pool_load_at_startup 1

innodb_buffer_pool_size 12 G

innodb_doublewrite 0

innodb_flush_log_at_trx_commit 0

innodb_flush_method O_DIRECT

innodb_log_file_size 1 G

skip_performance_schema

(c) Tuning Script Configuration (MySQL)

==============================

innodb_buffer_pool_instances 4

innodb_buffer_pool_size 4 G

query_cache_limit 2 G

query_cache_size 2 G

query_cache_type 1

(d) Amazon RDS Configuration (MySQL)

===============================

innodb_buffer_pool_size 10.9 G

innodb_flush_method O_DIRECT

innodb_log_file_size 128 M

key_buffer_size 16 M

max_binlog_size 128 M

read_buffer_size 256 k

read_rnd_buffer_size 512 M

table_open_cache_instances 16

thread_cache_size 20

------
stevehiehn
So cool! Its not hard to image this same technique used on clouds to tweek
infastructure for workloads.

------
mandeepj
Similar concepts can be applied to optimize applications performance e.g. .net
web apps or any similar stack. I got something to research during coming week

------
ocowchun
I just see that Postgres have better performance than MySQL with default
configuration.

