
Building a PostgreSQL load tester - lawrjone
https://blog.lawrencejones.dev/building-a-postgresql-load-tester/
======
michelpp
> When faced with a technical problem it’s often better to use an existing
> tool than jump into writing one yourself. Having benchmarked Postgres
> clusters before, I was already familiar with a tool called pgreplay that I
> thought could do the job.

There is also pgBadger, which can run parallel jobs:

[https://github.com/darold/pgbadger](https://github.com/darold/pgbadger)
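
For what it's worth, a minimal parallel invocation might look something like this (the log path and job count are made up; -j is pgBadger's parallel-jobs flag):

    pgbadger -j 8 -o report.html /var/log/postgresql/postgresql.log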

pgBadger is also used by nancy, a benchmark experiment driver, so you can
replay your logs, then change something (like adding an index) and replay them
again, and nancy will show you the difference in performance:

[https://gitlab.com/postgres-ai-team/nancy](https://gitlab.com/postgres-ai-team/nancy)

~~~
lawrjone
Hey, author here! Perhaps I'm missing something, but I think pgBadger is just
for log analysis, no?

pgBadger is the second component of the setup this article is about. The idea
is to replay the original production logs and capture the target cluster's logs
while you do it. After the benchmark is complete you can pass those logs into
pgBadger to understand what happened to performance, and compare the report to
one produced for production.
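
As a rough sketch of that comparison step (the file names here are hypothetical; -o is just pgBadger's output option):

    # report for the benchmark run, built from the target cluster's logs
    pgbadger -o benchmark-report.html benchmark-postgresql.log

    # report for the same window of production traffic, to compare against
    pgbadger -o production-report.html production-postgresql.log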

I hadn't come across nancy before, though; this looks cool! For replaying real
logs nancy appears to use the original pgreplay
([https://github.com/laurenz/pgreplay](https://github.com/laurenz/pgreplay)),
so it would have been subject to the same issue that motivated building
pgreplay-go, but I bet providing an option to switch between the tools would
be quite easy.

~~~
sbuttgereit
I came here to basically state the same point, or to ask if there was a way to
apply pgBadger, the log analysis tool, to this problem. I use pgBadger quite a
lot, but not for your use case. Perhaps you could take the generalized
versions of queries that emerge from pgBadger analysis, along with the query
counts, and fake something together, but there are significant flaws with that
approach, such as sequencing, transaction relationships, etc. Anyway...

There is pgbench, of course, but that probably doesn't fit your use case, as
it's a load tester with a contrived schema, data access pattern, queries, etc.
Load testing the way you actually use a database is a harder problem than
doing so generically.
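
For completeness, a generic pgbench run looks roughly like this (the database name and numbers are arbitrary):

    pgbench -i -s 50 mydb             # initialise pgbench's built-in schema at scale factor 50
    pgbench -c 16 -j 4 -T 300 mydb    # 16 clients, 4 worker threads, run for 5 minutes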

But here's a question: were the logs really the best option for getting that
"seed" data? I wonder if some of the work done facilitating logical
replication could be harnessed for such a task, without some of the noise that
comes with logs. I don't know whether this is possible, and it may simply be
too intrusive, especially if you're just trying to work with existing history;
but if it's possible and the admin overhead isn't too high, it seems like this
would be an easier way to accomplish something like getting a load testing
data set.

~~~
lawrjone
> were the logs really the best option for getting that "seed" data?

It seemed to us like they were. The database under test belongs to a large
monolithic app that has long reads and heavy batch write jobs, and is serving
API traffic. My team and I struggled to find an alternative that we didn't
feel was too fake, one that we could really have confidence in.

Even though the preparation of the benchmark loses some realism, it's still
close enough to real life that it gives you great signal on how things might
degrade. Running this replay against a new cluster and simultaneously
verifying the performance of a critical endpoint was the gold standard for
verifying we were safe to move to the new hardware.

Whether this is all true for others, I'm really not sure. We were willing to
invest the time because this was so critical for us, but it might not make
sense for other people to take the same approach.

> I wonder if some of the work done facilitating logical replication

Perhaps? You could capture writes, but the majority of our traffic was reads,
which wouldn't show up via logical replication.

> getting a load testing data set

Anecdotally, capturing the logs seemed quite easy: just turn Postgres logging
up to max and shove the logs onto a different disk. That said, if we weren't
in GCP with quick disk snapshotting, things would be a lot more awkward!
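
"Up to max" was roughly the kind of postgresql.conf setup the pgreplay-style tools document; as a sketch (not our exact config):

    log_connections = on
    log_disconnections = on
    log_statement = 'all'                # log every statement, not just slow ones
    log_line_prefix = '%m|%u|%d|%c|'     # timestamp, user, database, session id
    lc_messages = 'C'                    # keep log messages in English for the parser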

~~~
sbuttgereit
> Perhaps? You could capture writes, but the majority of our traffic was
> reads, which wouldn't show up via logical replication.

Ah yes... of course... forget I mentioned it. I do a lot of ERP-type work, so
I'm in a very similar boat most of the time, but I'm currently doing the
information architecture for a greenfield development project and am very
focused on some other priorities, so a little work-induced myopia on my part.

------
ahachete
Cool development. Great to have it handy in the toolbox.

I still have two concerns when simulating a workload in order to test
configurations / DDL changes:

\- Processing the logs implies generating a log entry for every query that
arrives. While this may be acceptable in some cases, it is unacceptable for
high-volume sites: logging every query may bring the db down. PostgreSQL needs
a different mechanism to report this information, like a UDP stream of query
performance metrics.

\- Independently, this only reproduces the read workload, but what about the
write one? Ideally, you should combine both. The write workload may well be
captured and replayed using logical decoding / replication (a sketch follows
below), but it would still need a lot of coordination / tooling with the
read-replay tool to achieve the desired effect.
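
As a minimal sketch of what capturing the write stream could look like with the built-in test_decoding plugin (the slot name is made up; requires wal_level = logical):

    -- create a logical replication slot using the test_decoding output plugin
    SELECT pg_create_logical_replication_slot('capture_writes', 'test_decoding');

    -- inspect the decoded change stream: only INSERT/UPDATE/DELETE appear here, reads never do
    SELECT * FROM pg_logical_slot_peek_changes('capture_writes', NULL, NULL);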

All in all, complex problems but very important and interesting :)

------
earless1
This is a space that I am deeply interested in, but my team has not found a
good pattern for it with MySQL. We currently use a combination of MySQL Slap
and JMeter, but we feel like there are better ways to do this.

Anyone have nice patterns for this around MySQL?

------
Manozco
I've extensively used pgreplay for work and I'd like to add a few things:

\- I recently (May 2018) added a dry-run mode to pgreplay where it just parses
the log and prints stats (no replay). This could be used to benchmark the
log-parsing part.

\- Instead of parsing the log as you did, I would have configured PG to output
logs in CSV (it can even do both at the same time, I think, as sketched
below), which would have been easier to parse.

\- pgreplay has a well-documented binary replay file format; I would probably
have used that in your case.
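
The CSV setup I have in mind is just the standard postgresql.conf knobs, something like (a sketch, not a complete config):

    logging_collector = on
    log_destination = 'stderr,csvlog'    # emit both the plain-text and the CSV log at once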

Otherwise seems like a nice tool, I might give it a try in the future :)

------
sharadov
I've used this in the past:
[https://www.hammerdb.com/about.html](https://www.hammerdb.com/about.html).
Not for Postgres, but for running load tests on SQL Server.

------
mistrial9
There is a rumor that the US West Coast grocery store Safeway built their
business, at least in part and perhaps a lot more, using Postgres in the
backend. Now that Safeway has expanded to... how many locations? how many
real-estate deals? Perhaps it is time to ask broadly: did this company put any
resources into building a load tester for future versions of Postgres, using
their own expertise, mocking the data? Let's also recall that the grocery
store business is not "cost conscious"... but eternally, vigorously, to
incredible degrees, penny oriented, for not months or years, but decades...
relentlessly, you might say.

OK yes, it is possible that Safeway groceries has not used Postgres to build
their business, or that they have been very active and productive contributors
to the project, based on the vast business resources they have accumulated
over sixty years... but how about it: a load tester would kind of be useful
now, wouldn't it? Maybe Safeway is mocking the Postgres project, not the
data... no clue, rumor only.

