Hacker News new | past | comments | ask | show | jobs | submit login

No, query planners are generally hopeless; for my use cases, it's trivial to get orders of magnitude better performance with different query structures.

If you have just a predicate, or a single join, most query planners are just about ok; with multiple tables and predicates touching those tables, there's ever increasing risk it will start the join on the wrong table and produce too many rows somewhere in the plan. Engines like PostgreSQL are unpredictably stupid (but sometimes very smart), while MySQL is predictably stupid (but consistently fast when you keep it simple and avoid specific constructs).

When designing for interactive use, you want to design for a maximum run time on the query. If that means using a query which is often not the fastest, but is never slower than X, it's the query to use. Smart planners are more of a hindrance than a help, because they reduce predictability.

The worst case is when a planner starts with a small row set (planners usually like starting with small row sets) and joins against a big table expecting a similarly small joined set, but ends up with an enormous result set because of the cardinality of the join. Estimating join cardinality is a tough problem for database vendors to solve, and it's critical to figuring out the correct join order - particularly since the number of join strategies grows with the factorial of the number of tables to join, so the planner can't spend much time on it. Meanwhile, the application developer may very well know the designed cardinalities and a join sequence that minimizes the chance of a blowup, the most important thing for interactive use.




Well yes, this is exactly what a query optimiser is meant to do.

What you're meant to do is add a few indexes, maybe sprinkle some hints into your query (platform specific), based on your knowledge based on your rough knowledge of the statistics of your data structures and let the optimiser/planner work out the best way to execute it.

Now yes, the optimiser may sometimes get it hilariously wrong, but I guess that's the price you pay for a relatively portable (across DB platforms, with some tweaking of the hints) solution.

The risk of going too far down the rabbit hole of dynamically re-writing the SQL is that you're essentially trying to second-guess any query optimiser/planner and for most of people making software they're going to do a poor job building a 'shadow optimiser' based on 'experience' of a specific platform and how it behaves in observed scenarios. If the people building databases are finding it hard, there's probably a reasonably good reason.

I'd also note that this isn't really a problem with ORMs per-se, but with relational DB platforms and their performance when running arbitrarily complex SQL.


You can concretely measure how much time it takes to scan X million rows of a table; how much slower it is for every join you add; how much slower an index is to scan for every column that's added to it; how much slower it gets for every column you add to the select clause; etc. And determining optimal join order in my case isn't particularly hard (it's a lot like a star schema; the key thing is not to bring in the whole of the big table in the middle).

I was a compiler engineer in my previous job; transforming and optimizing expression trees are my bread and butter. I'd give myself good odds of improving MySQL for my particular use case - I'd know where to look, what stats to collect and how to approach things differently. But likely my optimizations would be pretty specific to my situation, and not everybody would be happy with more predictable, sometimes slower performance.


The query optimizer faces the same limitations as an optimizing compiler - improving the query as much as it can given limited information, in as short a time as possible.

And it's frequently unsuccessful, for the exact same reasons as optimizing compilers - proper query optimizations require more information and time than what is available to those optimizers.

Hand tuned code created by an expert will always be faster than that produced by an optimizing compiler; hand tuned SQL created by an expert will always be faster than that produced by the query optimizer.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: