

Stopping expensive queries before they start - adamnemecek
http://adpgtech.blogspot.com/2015/02/stopping-expensive-queries-before-they.html

======
WALoeIII
This is very useful for batch or cleanup jobs.

For example, imagine a system where items are soft-deleted immediately upon
user action, but not actually deleted for a few days to facilitate restoration
(recycle bin).

There is going to be some nightly/hourly/scheduled job that actually really
deletes these records. Initially it will have little work to do, but over time
as the system grows it may become slow. Typically this would be hard to
separate from other slow queries, you would have to catch it running while
causing other queries to pile up. Query time isn't necessarily useful here as
you may have enough I/O to cover the slow query replacing pages in the cache,
but that I/O would be better serving user facing requests instead of this
cleanup job.

This feature allows for the "work" it really takes to serve the query to cause
it to error, instead of time which may grind down for other reasons. At this
point you know its time to re-think the soft-deletion strategy, you disable
the job. Maybe you sweep more frequently? Maybe you keep a look-aside of
things to sweep to avoid scans? Maybe you sweep during a low-traffic time?
Whatever. It buys you time to think.

I wish there were something comparable for MySQL.

------
apalmer
Not sure the correct choice is to just disable queries at runtime if there
performance is a net negative.

Would expect these decisions to be made at design time, or perhaps that alerts
are sent to ops when these situations occur.

~~~
hyperpape
Our system allows users to do reporting based on a wide variety of criteria.
In general it works, but nothing prevents a user from running queries that
will timeout. Timing out is a waste of everyone's time, and it would be better
to just kill the request early.

We do try to keep the reporting limited to what can be done efficiently, but I
don't know if it's really viable to avoid everything ahead of time.

~~~
hyperpape
In fact, I'll go further: it's basically impossible for our system (and I
think it's not an uncommon system for certain types of enterprise software).
Part of the structure of the application is that there's a hierarchy
representing our customers and their customers or business units. So company A
has "sub-companies" (literal sub-companies, business units or customers) B
through Z. A query that runs at the sub-company is perfectly acceptable, but
excessively expensive at the higher level.

Except for the simplest cases, the application has no clue which queries will
be problematic. Who knows that information? Only the database.

------
earless1
Link to github repo:
[https://github.com/pgexperts/pg_plan_filter](https://github.com/pgexperts/pg_plan_filter)

------
nowarninglabel
Can this be extrapolated out to other systems such as MySQL or Vertica?

------
1PuPqpDc1V7biSy
Is there something similar for Oracle?

------
anilshanbhag
This is like 50 lines of code, wondering how much they charged for it ?

~~~
joshstrange
Are you insinuating that LOC should equal pay for said code in some way?
That's a terrible idea and has proven to lead to bloated and bad codebases. We
should be glad that they are OS-ing this at all, they don't have to and not
all clients would even let them.

