
Database Learning: Toward a Database That Becomes Smarter Every Time - kawera
https://arxiv.org/abs/1703.05468
======
Roritharr
I recently wondered why we still don't have self optimizing databases,
creating their own indexes and tweaking their settings when needed. Are there
such products out there or is there no demand?

~~~
jandrewrogers
Good databases do self-optimize quite a bit but they select where and how they
do that optimization carefully. There are two big constraints on practical
dynamic optimization: the cost must be less than the benefit and it must be
applied to the system incrementally.

These constraints have effects that are not always obvious. For example,
architecting a database kernel so that it _can_ be dynamically optimized slows
down computational throughput in the general case! This means that the
benefits of dynamic optimization have to be sufficiently large that it offsets
the performance tax of doing dynamic optimization at all. Also, because the
performance tax is general, there is an implication that the optimization
needs to be general across the set of runtime operations as well i.e. you
can't optimize one operation at the cost of slowing down others. Nonetheless,
there many areas such as cache replacement algorithms where dynamic
optimization has big performance payoffs for the added complexity.

Second, dynamic optimization needs to be applicable incrementally, otherwise
it tends to have a "stop the world" effect on performance when it is being
applied, which is bad. A good example of this is dynamically optimizing
storage for the actual distribution of queries it sees. Dynamically adding a
new secondary index imposes a huge background cost that is visible in
performance, so databases don't do it. However, dynamically optimizing
individual _pages_ for queries is common because it can be applied
incrementally for pages you are already accessing anyway for a small one-time
cost per page distributed over time as pages are accessed. And it only applies
to pages you actually touch; building a new index pulls a lot of cold storage
through the cache. Page level query optimization may be less effective overall
than adding a second index but the fact that it can be applied incrementally
makes it a preferred form of dynamic optimization.

In short, dynamic optimization has been studied and tried for many decades.
Sophisticated databases do a lot of it but the mechanisms are chosen carefully
to minimize adverse side effects in real world conditions.

~~~
Roritharr
Great Answer, thank you! Going further, would this mean one could build a DB
where i could send off "optimize now" queries and telling thereby the DB that
I accept stop-the-world type of effects for a limited time? Or even specifying
the effects that i'm willing to accept?

My main issue is that most of the DB Stuff i do as a non-dba is so basic that
it could be automated, but isn't for reasons that until your post came up
weren't clear to me.

------
einrealist
How abou a database that uses behavioral pattern detection or behavior
profiles to make data exfiltration through connected applications harder?

