
Compiled or Bust? - fogus
http://www.codinghorror.com/blog/2010/03/compiled-or-bust.html
======
pragmatic
It's interesting to me that he's got a top 1000 site with standard tools. No
"NoSql" or the other such echo chamber flavor of the day fads.

Can get away with this because c#, .net, asp.net and sql server are pretty
darn fast?

<fud> It's been my experience that the standard 2 tier app with a compiled
language is usually pretty fast out of the box. Databases are an _almost_
solved problem. The db folks have been at it a long time and you can wring
some pretty amazing performance out of sql server|oracle|etc.

Problems: php is usually slow as sh!t and requires something like eAccelerator
to provide decent performance.

Java could be good, but the App Server/Broker/N-tier architecture usually
kills performance rather than enhancing it.

Just me experience. No hard science here. </fud>

~~~
raganwald
Some domains suit the relational database model better than others. If he was
doing a huge amount of machine learning-type stuff he might have a different
architecture.

Then again, given that he (like most people, myself included) has a lot of RDB
experience, he designed his application around what RDBs do well. Perhaps if
he had 10+ years of experience with distributed hash tables he might have
designed a site with a different feature set.

~~~
bad_user
Yeah, but services these days that are announcing the usage of a NoSql storage
aren't doing any machine learning, they are just storing comments, tags and
relationships.

------
kamens
There's real confusion in this post.

Rico's statement "Each execution builds the expression tree, and then builds
the required SQL." is talking about turning a LINQ expression tree into a SQL
string _before_ SQL Server ever gets a chance to reach into its query cache.

Jeff's conclusion "The computational cost of pre-processing a given query is
only paid the first time the database sees the new query....Not so in Linq to
SQL." is just plain talking about the wrong thing.

SQL Server doesn't care what abstraction generated the SQL query that it's
receiving, it'll still use the same old cache.

Rico is legitimately discussing the time taken for the web server to compile a
query to SQL, not the time for a SQL Server to look up a cached or uncached
query.

~~~
j_baker
"In short the problem is that the basic Linq construction (we don’t really
have to reach for a complex query to illustrate) results in repeated
evaluations of the query if you ran the query more than once."

It sounds to me like Rico is advocating the prepare/execute model. With ODBC,
it's possible to generate a query plan on the _client_ side. What Rico is
saying is that it generates a query and then sends the raw query to the
database. This would have been a valid point _before_ SQL Server 2005, but
according to Microsoft "the prepare/execute model has no significant
performance advantage over direct execution, because of the way SQL Server
reuses execution plans."

<http://msdn.microsoft.com/en-us/library/ms175528.aspx>

------
andrewcooke
how does a sql engine cache [results and plans given] queries?

i would naively assume that there are two caches (ignoring compiled queries -
see comment below). one, up front, is a literal cache: repeat a query text
exactly and bang, you get the previous results.

but then it would make sense for a second cache based on the structure of the
ast after parsing a query. if the structure is the same, but some parameters
have changed, then it should be possible to re-use much of a previous [plan].

if i understand the article correctly, it's saying that this doesn't happen.
the only way to get this is to use a "compiled query" that you then call with
a variety of parameters.

is that right? if so, why? seems like my second level cache wouldn't be that
hard to implement...

[edit: i realise this wouldn't avoid the needless passing from linq to text
and back to ast, and i agree with the general tone of the article that you
need to justify worrying about optimisations - i'm just curious about how sql
engines work]

[edit2: clarified by explicitly mentioning plans]

~~~
raganwald
What happens is this: There is something called a "query plan." For trivial
queries, it is the same as the query. But for more complex queries it works
out how to use indices, joins, and so forth. It's kind of a lower-level query
language. Figuring out the query plan can be costly for complex queries.

The database maintains a cache of query plans, so when you give it the same
query, you get the query plan back right away and the database runs it.

There are data caches for other reasons, but the problem Jeff is describing is
that the C# VM must interpret all the steps of building a SQL query in text
every time it's run. The database may have the query plan cached, but even
though the code never changes, building the SQL query string happens every
time.

~~~
andrewcooke
i thought he was saying more than that.

i thought that he was saying that a new plan is _always_ constructed. what i
was saying is that a new plan only needs to be constructed for "sufficiently
different" queries, where "sufficiently different" can be something much more
stringent than the hash of the literal text.

if we're only worrying about the _parsing_ time (as opposed to the time to
build and optimize a plan) then it's even more pointless to worry about this
(i'm assuming parsing is significantly faster than actually building a plan).

[edit: i'm confused who is who here; have swapped person twice, hope this is
correct]

~~~
raganwald
I think the database is smarter than a simple literal text for query plans. I
think it reverse-engineers literals so that if you query WHERE t.initials =
'ac' and then later query WHERE t.initials = 'rb', it re-uses the cached query
plan.

But the point of the article is that the problem is not in the database, it's
over in the CLR which is spending a lot of time composing the SQL text that
gets sent to the database.

~~~
raganwald
UPDATE:

In Oracle, when you use literals any change to the text results in another
hard parse on the database side. However, you can use bind variables instead
of literals and it can re-use the query plan even when you are searching for
different values.

<http://www.akadia.com/services/ora_bind_variables.html>

The OP is talking about LINQ. I assume that its Oracle implementation is smart
enough to use bind variables, but I am not going to look that up right now...

~~~
andrewcooke
the article gives an example of how to do compiled queries. which strongly
implies that "ordinary" linq queries are not compiled as you seem to expect.

