
How Does the SQL Server Query Optimizer Work? - xameeramir
http://xameeramir.github.io/How-SQL-Server-Query-Optimizer-Works/#.WIEaVWGMEKY.hackernews
======
lobster_johnson
Tangentially, I've always wondered why databases today still aren't capable of
being self-tuning. In 2016 we're still manually creating indexes, and
databases still use the exact same physical data structures for everything
they store, no matter if they're storing 1,000 rows or 1 billion.

The problem isn't too dissimilar from the challenge that tracing profilers try
to solve. One key difference, of course, is that dynamically adapting the
physical storage strategy to usage patterns is relatively expensive, resource-wise.
But not every problem needs to be solved for those users with 1 billion
rows. I would argue that most applications — web apps, anyway — have datasets
that are large enough to optimize, yet still small enough that a dynamic,
self-tuning system could be entirely feasible.

The resource problem can be mitigated in several ways. One is to off-load
processing horizontally, so while a master is chugging away, a slave is
building a table using a different strategy. Another is to simply perform
expensive optimizations during off-peak hours.

Also, I find it interesting that databases still operate in a mode where
reads and writes happen in the same physical store. I have an idea for a
database that separates the two: You have the transactional, highly
consistent, non-horizontally-scalable OLTP-style "transaction store" where a
client carefully constructs reads and writes against the absolute truth, but
where reads aren't super efficient; and then you have the "query store" which
is a read-only, highly replicated and sharded set of mirrors that uses highly
optimized data structures that are created on demand based on the topology of
the data and usage patterns (in particular, the same data can potentially be
indexed in multiple ways (B-tree, compressed columns, etc.) until the profiler
finds the best strategy). The mirrors' replication should be tunable so that
some "hot" subsets can be immediately consistent, and others can use slower
batch transfers to reduce lock contention on updates.
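
A toy sketch of that split (plain Python, all names made up — just to illustrate the shape of the idea, not a real implementation):

```python
# Toy sketch (not a real database): a strongly consistent "transaction
# store" paired with a read-optimized "query store" that is refreshed
# from it in batches, mimicking the OLTP-truth / indexed-mirror split.

class TransactionStore:
    """Authoritative row store: simple, consistent, not read-optimized."""
    def __init__(self):
        self.rows = {}          # primary key -> row dict
        self.version = 0

    def write(self, pk, row):
        self.rows[pk] = row
        self.version += 1       # every write bumps the version

class QueryStore:
    """Read-only mirror that builds whatever index the workload needs."""
    def __init__(self, source):
        self.source = source
        self.synced_at = -1
        self.by_city = {}       # one example of a derived index

    def refresh(self):
        # Batch "replication": rebuild derived indexes from the truth.
        self.by_city = {}
        for pk, row in self.source.rows.items():
            self.by_city.setdefault(row["city"], []).append(pk)
        self.synced_at = self.source.version

    def find_by_city(self, city):
        return sorted(self.by_city.get(city, []))

truth = TransactionStore()
truth.write(1, {"city": "Oslo"})
truth.write(2, {"city": "Lima"})
truth.write(3, {"city": "Oslo"})

mirror = QueryStore(truth)
mirror.refresh()
print(mirror.find_by_city("Oslo"))   # -> [1, 3]
```

The "hot subset" tuning would amount to calling refresh (or an incremental variant) at different frequencies for different derived indexes.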

Right now, when people build scalable systems, they often apply this design by
putting the truth in PostgreSQL or whatever, and then indexing it into
Elasticsearch or Solr. But there are huge benefits to combining the two into a
single, unified system.

~~~
dom0
Because it doesn't really matter. A tree works just fine as an index, be it
100k or 1 billion records. The other part you mention - self tuning - is also
quite clear imho: either the DB is so small that indexes genuinely do not
matter (and this can be surprisingly large today) or it's large enough that
you don't want the DB to reorganize itself on a whim. We all know "smart"
systems, and we've all experienced how flaky their outputs are. You don't want a
DB that quite literally scratches the scratch storage because of that.

And if we are talking about serious need for indexes, then we almost certainly
are talking about subtle tradeoffs that are unlikely to be made like we hope
by algorithms that optimize aggregate metrics (eg. average/median query time,
cache efficiency etc.) -- consider complicated reporting queries that don't
run often, but take a while (and since they are seldom run, they are not
influential on the database's performance metrics). A "smart" DB could decide
to drop an index, because it accelerates a common write operation, and no
important queries needed that index anyway. But now your complex reporting
queries are all turned into nested full-table scans. Oops...

~~~
quizotic
No. Not true at all. At around a billion records, you start getting into
distributed/scale-out systems, and distributed indexes are performance death
whether for OLTP or analytics. Even before you cross into distributed query
processing, the random access nature of indexes levies a heavy tax. Indexes
are a tiny piece of this issue. They address the problem of finding a single
needle in a haystack, but if you need to aggregate lots of data, indexes are
largely worthless.

What about different compression schemes for different sets of "column" values
- with the idea that you execute against the compressed values, rarely ever
decompressing. Even column store systems that allow you to specify a
compression scheme aren't adaptive. Why the hell not? This is almost 2020.
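
To illustrate the idea of executing against compressed values (a minimal dictionary-encoding sketch in Python; real column stores are far more elaborate):

```python
# Minimal sketch of dictionary compression where a filter runs directly
# on the compressed codes, decompressing nothing.

def dict_encode(values):
    # Map each distinct value to a small integer code.
    dictionary = {}
    codes = []
    for v in values:
        if v not in dictionary:
            dictionary[v] = len(dictionary)
        codes.append(dictionary[v])
    return dictionary, codes

def filter_eq(dictionary, codes, literal):
    # Translate the literal into code space once...
    code = dictionary.get(literal)
    if code is None:
        return []
    # ...then scan only the small integer codes.
    return [i for i, c in enumerate(codes) if c == code]

col = ["red", "blue", "red", "green", "red"]
d, codes = dict_encode(col)
print(filter_eq(d, codes, "red"))   # row positions -> [0, 2, 4]
```

An adaptive system could pick the encoding (dictionary, run-length, delta, ...) per column based on observed value distributions, which is exactly the tuning knob that today mostly stays manual.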

~~~
needusername
> At around a billion records, you start getting into distributed/scale-out
> systems, and distributed indexes are performance death whether for OLTP or
> analytics. Even before you cross into distributed query processing, the
> random access nature of indexes levies a heavy tax.

I assume you mean once you process/aggregate a billion records. Having a billion
records in a single table on a single, non-distributed system is not an issue
at all.

------
adamnemecek
If this tickles your fancy, you should check out this site [http://use-the-
index-luke.com](http://use-the-index-luke.com) or a book by the same author
[https://www.amazon.com/Performance-Explained-Everything-
Deve...](https://www.amazon.com/Performance-Explained-Everything-Developers-
about/dp/3950307826/ref=as_li_ss_tl?ie=UTF8&qid=1484664477&sr=8-1&keywords=Sql+performance&linkCode=ll1&tag=akhn-20&linkId=c4b9565b49e6d7df8926ea4b2c38638d)

He goes into great detail explaining how exactly SQL indices work and how to
leverage that knowledge when writing queries.
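
One of the book's central lessons, sketched with SQLite standing in for SQL Server (hypothetical table and index names — the leading-column rule itself is general):

```python
import sqlite3

# A composite index on (a, b) serves predicates on its leading column a,
# but a predicate on b alone can't seek into it. SQLite's EXPLAIN QUERY
# PLAN stands in for SQL Server's execution plans here.

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b INTEGER, c TEXT)")
con.execute("CREATE INDEX idx_ab ON t(a, b)")
con.executemany("INSERT INTO t VALUES (?, ?, 'x')",
                [(i % 10, i) for i in range(1000)])

def plan(sql):
    # Concatenate the "detail" column of the plan rows.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

print(plan("SELECT * FROM t WHERE a = 5"))   # seeks via idx_ab
print(plan("SELECT * FROM t WHERE b = 5"))   # falls back to a table scan
```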

------
rusanu
I have a blog article that goes into a bit more detail:
[http://rusanu.com/2013/08/01/understanding-how-sql-server-
ex...](http://rusanu.com/2013/08/01/understanding-how-sql-server-executes-a-
query/)

~~~
swasheck
I remember the first time that I read this. I still reference it many years
later.

------
protomyth
_The reality is that, even after more than 30 years of research, query
optimizers are highly complex pieces of software which still face some
technical challenges._

Yep. It still amazes me how queries can go from absolutely great to scanning
tables with a row added in the wrong place. _WITH RECOMPILE_ still scares the
heck out of me. It can really help when distributions change, but it can also
kill your performance with a bad query plan choice.

In addition to the hints shown, you can also force certain indexes. This didn't
seem to be as big a problem on SQL Server as in its parent code base, Sybase
ASE. Sybase made some really, really bad choices on big tables.

No amount of optimization will fix a bad schema design.

This is also a field where talking to people about how you built your machines
and where the indexes are located can make a world of difference. If you
cannot have the whole database on the fastest storage, then at least put the
indexes there.
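
On forcing indexes: SQL Server spells this WITH (INDEX(name)); SQLite's INDEXED BY is a rough analogue, used here (with made-up table and index names) just to show the idea of overriding the optimizer's choice:

```python
import sqlite3

# Overriding the optimizer's index choice. SQLite's INDEXED BY clause
# plays the role of SQL Server's WITH (INDEX(name)) table hint.

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, customer INTEGER, day TEXT)")
con.execute("CREATE INDEX idx_customer ON orders(customer)")
con.execute("CREATE INDEX idx_day ON orders(day)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(i, i % 50, "2016-01-%02d" % (i % 28 + 1)) for i in range(500)])

def plan(sql):
    return " ".join(r[3] for r in con.execute("EXPLAIN QUERY PLAN " + sql))

# Let the optimizer pick, then pin a specific index ourselves.
free = plan("SELECT * FROM orders "
            "WHERE customer = 7 AND day = '2016-01-03'")
forced = plan("SELECT * FROM orders INDEXED BY idx_day "
              "WHERE customer = 7 AND day = '2016-01-03'")
print(free)
print(forced)
```

As with any hint, pinning an index trades robustness for predictability: the forced plan stays the same even after the data distribution changes under it.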

~~~
swasheck
> It still amazes me how queries can go from absolutely great to scanning
> tables with a row added in the wrong place.

I think that this is a bit of an oversimplification, though I understand the
sentiment. Statistics skew is one of the primary reasons for such a dramatic
shift in query planning and adding a row can create such a problem if you're
already at some sort of skew tipping point and that one row pushes you over
the edge.

It would be nice to know more about where the edge is and how to plan to
ensure you don't drive your application's/server's performance off the cliff,
though.

~~~
protomyth
> I think that this is a bit of an oversimplification, though I understand the
> sentiment.

No, it really isn't. Cache/memory isn't a sliding scale; once you exceed it,
you fall off the cliff. Database queries can fall off a cliff if you exceed
your memory. I've watched a query that was executing in a couple of seconds
drop into many-minutes territory because it had to start paging.

There is a real machine with limited memory running your queries and it has
been my experience that warnings are rare.

~~~
jve
Even if you have tons of RAM and the data IS there, performance is still
disastrous if you have to perform seven-, eight-, or nine-digit logical reads
(millions or billions). This happens when SQL Server picks a loop join
because it estimated there was only one row to join. Oops.
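
A back-of-the-envelope illustration of why that mis-estimate hurts (plain Python, counting row comparisons rather than logical reads):

```python
# A nested loop join looks cheap when the optimizer thinks the outer
# side has one row; when it actually has many, the work is outer*inner,
# while a hash join stays linear in the two input sizes.

def nested_loop_join(outer, inner):
    comparisons = 0
    out = []
    for o in outer:
        for i in inner:           # full inner pass per outer row
            comparisons += 1
            if o == i:
                out.append((o, i))
    return out, comparisons

def hash_join(outer, inner):
    probes = 0
    table = {}
    for i in inner:               # build the hash table once
        table.setdefault(i, []).append(i)
    out = []
    for o in outer:               # one probe per outer row
        probes += 1
        for i in table.get(o, []):
            out.append((o, i))
    return out, probes

outer = list(range(1000))
inner = list(range(500, 1500))
r1, c1 = nested_loop_join(outer, inner)
r2, c2 = hash_join(outer, inner)
print(c1, c2)   # 1,000,000 comparisons vs 1,000 probes
assert sorted(r1) == sorted(r2)
```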

~~~
protomyth
Oh yeah, memory fails aren't the only cause, but they do tend to hit you very
quickly. Nothing like doing production patches because you went from
comfortably able to get done in the time period needed to "won't finish until
next week". I can just see the developer's face on seeing the loop join where
it shouldn't be. Almost as bad as seeing FSM (full sort merge) on Ingres or a
full table scan on ASE or SQL Server.

That's really why I recommend SQL writers study the generated plan. It's the
closest thing to studying assembly to learn what the optimizer is thinking.

I still think the old Ingres plan viewer was pretty neat.

------
psi-squared
There's a really nice article about the Postgres query optimizer, which goes
into much more detail about the algorithms used (it's likely that at least the
basic ideas are shared with SQL Server, though I can't say for sure). I really
like the exposition in this:

[http://jinchengli.me/post/postgres-query-
opt/](http://jinchengli.me/post/postgres-query-opt/)

~~~
quizotic
As a "newer" engine, SQL Server has a couple of tricks that aren't found in
most other systems, including intra-query parallelism across multiple threads.
This capability requires occasional reshuffling of data between threads, which
makes SQL Server's internals a little more like a distributed DBMS (think
Netezza/Teradata) than the wonderful, but ancient, Postgres. All of which
means that SQL Server has a larger palette of execution options and some
different optimization decisions than PG.

------
arnon
There's very little actual data here...

Oracle has a bit more data on their website:
[https://docs.oracle.com/database/121/TGSQL/tgsql_optcncpt.ht...](https://docs.oracle.com/database/121/TGSQL/tgsql_optcncpt.htm#TGSQL94983)

~~~
smcl
I think "SQL Server Database Engine" refers to MS SQL here, so I'm not sure
how useful the Oracle stuff is for MS SQL Server (I have never used Oracle).

~~~
arnon
They all operate the same way, give or take.

------
morgo
I have a guide that shows the equivalent for MySQL. It's kind of split between
these two pages:

[http://www.unofficialmysqlguide.com/introduction.html](http://www.unofficialmysqlguide.com/introduction.html)

[http://www.unofficialmysqlguide.com/server-
architecture.html](http://www.unofficialmysqlguide.com/server-
architecture.html)

------
mozumder
Is there a Postgres equivalent to reoptimize prepared statements?

~~~
quizotic
I think you can just prepare them again, no? Of course the onus is then on
poor you to know when that might make sense because of changes to the
referenced tables. Though Postgres is so incredibly open, you could probably
do that with a trigger or a hook into ANALYZE or VACUUM or COPY FROM.

~~~
anarazel
Prepared statements are re-planned on use if the underlying statistics change.

~~~
quizotic
Ah. Thank you for the correction!

