
A Look at MyRocks Performance - PeterZaitsev
https://www.percona.com/blog/2018/04/30/a-look-at-myrocks-performance/
======
cat199
Just going to point out that Percona is, IMHO, hugely underappreciated in the
MySQL world...

At $work we are managing a fairly heavy MySQL instance and having scaling
issues (~5TB, lots of blobs, frequent user-managed schema changes for
interactive/batch data analysis, and many concurrent complex joins involving
multiple 1M+ row tables). After reviewing the various MySQL derivatives, it
seems that the Percona people are really focusing on 'real deal' operational
reliability/scalability issues (e.g. minimizing transaction locking issues
for coherent backups, real-time db cloning, replication data integrity
testing, etc). I strongly recommend that anyone looking at MySQL flavors not
overlook their offerings. Looking forward to doing some production tests of
their XtraDB MySQL flavor in the coming months (for now usage has been limited
to unit tests and the Percona toolkit/xtrabackup).

Also: no I am not a paid sponsor.

~~~
chucky_z
I've found MySQL to have issues with datasets that don't fit into the innodb
buffer pool. Are you doing a lot of partitioning to make everything work well?

If everything is behind an ORM, have you considered trying out PostgreSQL w/
the MySQL FDW? pgsql 10's improvements to FDW's make it a truly viable option.
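(For anyone curious what that looks like: a minimal sketch of wiring up the
mysql_fdw extension from the Postgres side. Server/table/credential names here
are made up for illustration.)

```sql
-- Run in PostgreSQL with the mysql_fdw extension installed.
CREATE EXTENSION mysql_fdw;

-- Point a foreign server at the MySQL instance (hypothetical host/port).
CREATE SERVER mysql_srv FOREIGN DATA WRAPPER mysql_fdw
  OPTIONS (host '127.0.0.1', port '3306');

-- Map the local role to a MySQL account (hypothetical credentials).
CREATE USER MAPPING FOR CURRENT_USER SERVER mysql_srv
  OPTIONS (username 'app', password 'secret');

-- Expose a MySQL table locally; queries against it are pushed to MySQL
-- where the FDW supports it.
CREATE FOREIGN TABLE orders (id bigint, total numeric)
  SERVER mysql_srv OPTIONS (dbname 'shop', table_name 'orders');
```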

Also to clarify: I'm just saying this as pgsql seems to handle queries that
involve disk access and large datasets much better. I have found that as long
as your hot dataset fits into the innodb buffer pool and/or you're doing only
key lookups (e.g.: select * from tbl where pk=1;) MySQL is most certainly
faster for real-world usage. If you're pulling and sorting/filtering millions
(or billions) of rows per query, I find that pgsql stands up extremely well to
that. Even in a single huge server setup.

Also a semi-unrelated ditty, but as someone who has to run migrations on
pretty big (100m rows) tables, ALTER TABLE with LOCK=NONE, ALGORITHM=INPLACE
is pretty great (here's a table of what you can use per situation:
[https://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html#innodb-online-ddl-summary-grid](https://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html#innodb-online-ddl-summary-grid))
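(Concretely, a sketch of what such an online migration looks like, using a
hypothetical `orders` table; both operations support INPLACE/NONE on InnoDB
in MySQL 5.6+ per the summary grid linked above.)

```sql
-- Add a column without blocking concurrent DML; MySQL errors out
-- immediately if the requested algorithm/lock level isn't supported.
ALTER TABLE orders
  ADD COLUMN shipped_at DATETIME NULL,
  ALGORITHM=INPLACE, LOCK=NONE;

-- Secondary index creation is also an online operation on InnoDB.
ALTER TABLE orders
  ADD INDEX idx_shipped_at (shipped_at),
  ALGORITHM=INPLACE, LOCK=NONE;
```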

If you made it this far in my comment you'll probably realize I didn't grasp
the article well. I just re-read it and tl;dr, MyRocks works well (as
expected) with datasets that don't fit into the innodb buffer pool. :)

~~~
cat199
Our workload is application-specific and currently requires MySQL. 100% of
queries are on primary keys, so I think the 'hot' paths tend to stay resident,
at least as far as query operations go. Also, we tend to fetch 'chunks' and
operate on those in memory rather than processing server-side, so some of the
processing is offset by local operations.

Would be good to dig a bit deeper into this - thanks for the pointers.

~~~
PeterZaitsev
Yeah. Innodb is heavily optimized for PK lookups.

------
alfiedotwtf
Good to see Percona get some love on HN. They put out really solid work, and
have great blog posts!

~~~
niroze
Indeed! Their expertise and quality are something I truly value. Percona
Server is killer.

~~~
PeterZaitsev
Wait to see when Percona Server 8 comes out :)

~~~
niroze
Exciting!

------
threeseed
MySQL's pluggable engine really is a killer feature.

From these results you would think it makes sense to have RocksDB the default
for MySQL and then have InnoDB be there for the larger users.
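(The pluggable-engine bit really is just a per-table switch. A sketch with a
made-up table, assuming a MyRocks-enabled build such as Percona Server or the
Facebook MySQL branch with the RocksDB plugin loaded:)

```sql
-- Check that the ROCKSDB engine is available on this build.
SHOW ENGINES;

-- Opt a single table into MyRocks; other tables can stay on InnoDB.
CREATE TABLE events (
  id BIGINT NOT NULL,
  payload BLOB,
  PRIMARY KEY (id)
) ENGINE=ROCKSDB;
```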

~~~
felixhandte
I actually think the correct conclusion is the reverse: if you can fit your
tables in RAM, choose Inno. Otherwise, Rocks outperforms Inno when the working
set no longer fits in memory. OP's testing is a little confusing because the
working set remains constant and the memory is scaled, where normally we think
of this in terms of seeing how much we can scale the workload on a node with
fixed resources.

~~~
adventured
That's the right conclusion. Most use cases will fit into memory and InnoDB
will win vs Rocks when all of your data fits into memory. Very large users are
the ones who would more typically benefit from Rocks, as their data sets may
dramatically outstrip available memory.

~~~
ggg9990
But larger users can also afford more RAM.

~~~
PeterZaitsev
This is a common misconception. The larger the scale, the higher the cost of
sub-optimal performance. If you spend $1000/month on your infrastructure,
halving infrastructure cost will save you $500/month, which can't justify a
lot of investment. Now if it is $100M/month, saving $50M a month is worth a
lot.... this is why Facebook, for example, has created many custom-built,
highly optimized systems like RocksDB

------
aphextron
Really confusing title with "Facebooks" missing an apostrophe

~~~
threeseed
Can we instead just have the original title: A Look at MyRocks Performance

~~~
tlb
Changed, thanks.

------
amanzi
Could this be a solution for those of us trying to run a big database on a
small VPS?

~~~
dajonker
Perhaps, but it depends on your workload, as is described in the article. In
any case: do your own measurements on your own database/application to see if
it works for you.

