
All Abstractions Are Failed Abstractions (Linq vs Raw SQL) - pchristensen
http://www.codinghorror.com/blog/archives/001281.html
======
Maro
There's no way fetching IDs then fetching each row one-by-one is faster than
SELECT * with the the same WHERE clause, especially taking into account
roundtrip times and whatnot. His measurements are wrong (eg. due to caching)
and/or he's confused.

~~~
bradford
He's not selecting each row one-by-one. He's getting 48 records back using two
queries. The first query gets the 48 id's of the most recently updated posts.
The second query selects the entire row that is associated with each id:

1) select top 48 Id from Posts

2) select * from Posts where Id in (<id's selected from previous query)

Unless the second query is 'abstracted' into 48 individual queries (which
would make for an interesting abstraction-article in itself) your accusation
is completely baseless.

I think Jeff's notation in the SQL queries were a bit confusing, but if you
actually read the article his intention is clear and his point is valid. I'm
baffled by the retaliation here on HN. Please rebut me and restore my faith in
the HN community.

~~~
ntoshev
_He's not selecting each row one-by-one._

Yes he does: "even if we did it in this naive way, the total execution time is
still a very reasonable".

 _He's getting 48 records back using two queries._

This is equally silly. The two queries would be equivalent to the single one
if there weren't two round trips instead of one, more traffic (the ids are
retrieved twice instead of once and they are also sent to the server) and the
overhead from concatenating and then parsing all the ids in the second query.

~~~
bradford
I concede. Got confused with the original posts wording, thanks for clearing
it up!

------
pj
_But even if we did it in this naive way, the total execution time is still a
very reasonable (48_ 3 ms) + 260 ms = 404 ms*

I can't read things like this and avoid wondering why people keep reading him.
He is so wrong so often I ... it's just beyond me.

Query execution time is just _one_ part of performance. You still have to
create the connection and transfer all the data across the wire. Sending N
records in 1 big package is much more efficient than sending N+1 records in
individual packages.

Run a performance measure on the actual code execution and I am almost
_positive_ you'll get worse performance than would be indicated by your query
execution times alone.

From now on, I'm flagging every codinghorror post.

~~~
smokinn
It really depends though. If you're using mysql with myisam tables you can
easily come up with a sequence of events that lead to N+1 records being much
faster due to table locking issues, especially when joins start getting
involved. In that case if you can split your query into N+1 requests by
primary key and avoid the join that locks both (or worse, many) tables for a
little while it'll be much smoother and faster in production.

Now, of course, if you can do that you really should be thinking about
memcached and/or a key/value store with possibly a distributed index rather
than a relational db anyway.

------
jpeterson
Jeff is really doing us all a disservice here. Now I'm going to have junior
programmers fighting me, citing published evidence that running a thousand
queries is actually faster than running a single query with a thousand rows
returned.

------
michael_dorfman
A strange article, even by Atwood standards. The anecdote doesn't match the
moral-- nothing would be altered by removing Linq-to-SQL from the equation.

~~~
shin_lao
I agree with you, I don't understand what point the author is trying to prove.

I truly think that mixing code and requests is acceptable only for small
projects. Not a single word about stored procedures, which are in a way an
abstraction of your database logic.

------
smhinsey
Every time I read one of these posts, which admittedly is only via HN lately,
the suspicion that I am being trolled grows. This almost seems to be borne out
by the numerous corrections that are almost always to be found in the
comments.

More seriously; in some ways, I feel bad for the position Jeff Atwood is in.
On the one hand, I don't think he really intends to speak with the authority
implied by his prominence, but on the other hand, he has his prominence.

~~~
mquander
He appears to be purposefully trolling in this post:
<http://www.codinghorror.com/blog/archives/001257.html>

However, he never admits it. I suspect that although many of his posts might
not start out as trolls, he is quite happy to engender a loud reaction without
any real concern as to the accuracy or utility of what he writes.

------
profquail
There is a certain level of functionality available in all relational
databases...LINQ to SQL has been designed to be isomorphic to that
functionality. The result of that (plus the very powerful reflection classes
in .NET) means that the only problem here is that Microsoft hasn't optimized
the expression tree to SQL output as much as possible. After another update or
two to the .NET framework, the performance should be virtually identical; it
may possibly be even better with LINQ to SQL since it may 'think' to optimize
queries in ways that you wouldn't think to.

On a side note, the author also neglects to mention that LINQ to SQL is a
great way to make your code more portable to other platforms (Mono) or other
databases (MySQL, PostgreSQL, etc.) so that you don't have to learn the
intricacies of each databases' SQL syntax.

~~~
logicalmind
Linq to SQL produces expression trees. These expression trees are interpreted
by the query provider. In the case of Linq to SQL the query provider produces
SQL for the SQL Server dialect. The SQL can be optimized. This SQL is sent to
SQL Server which then parses and optimizes the execution plan for the SQL (or
not if it is already in the procedure cache).

Are you saying that the Linq expression to SQL optimization will replace the
SQL Server SQL plan optimization?

The gory details of producing a Linq query provider are covered in this series
of blog posts: <http://blogs.msdn.com/mattwar/pages/linq-links.aspx>

~~~
profquail
No, I'm not saying that Linq to SQL will replace the SQL Server plan
optimization; I'm merely pointing out that a well-optimized (and perhaps
dynamic, using reflection) query provider could probably generate better, more
efficient SQL queries than 99% of developers in 99% of use cases. You might
miss out on some potential optimizations simply because you're trying to get
the code working right instead of spending hours to shave a few milliseconds
off of your query's execution time (plus, nobody's perfect!)

------
jerryji
Now I start to understand better why HN readers oppose CH.

~~~
Confusion
Please explain why, so we don't have to actually read the article ourselves.
You could be doing other community members a great service by saving them the
time to read articles that, according to your comment, aren't worth reading.
However, your comment is useless without explaining why we needn't bother to
read the article.

~~~
ntoshev
The article is full of messy and wrong arguments that are hard to summarize.

------
ankhmoop
I'd be curious to hear the reasoning of the users who 'upvote' Jeff Atwood
articles such as this one.

~~~
thamer
I upvoted this article because the conversation on this page is interesting
and provides a lot more information than the original article. I always look
at the comments before reading an article on HN, and often feel that they are
more valuable and insightful than the linked page. This is the case here.

------
pchristensen
To everyone who's bashing, he's just saying that it's not a straightforward
relationship under the covers of an abstraction. He's still using Linq2Sql and
still returning datasets in one query. He just wants people to be aware that
even though you have a good abstraction, there are still complex
considerations.

~~~
batasrki
Yeah, but at the same time, he's being misleading about the effects of N+1
query problem. It's a problem for a reason.

Anyone experienced enough in web development will maybe get his point, but the
newbies who read his blog will not and will fight more senior developers on
this topic, all because "Jeff Atwood said it".

That's where the bashing is centered on.

~~~
jrockway
Someone should tell these newbies that "Jeff Atwood said it" is actually an
argument _against_ whatever he said.

~~~
batasrki
Someone should, indeed.

------
euroclydon
I'm sure Google and other large dev shops will be chomping at the bit to buy
the premise version of Stack Overflow once they see how sharp Jeff is.

------
TweedHeads
The author doesn't know/understand SQL and he should be kept away from
databases.

