
Why LINQ beats SQL - douche
https://www.linqpad.net/WhyLINQBeatsSQL.aspx
======
koolba
> And if you need to support older databases (prior to SQL Server 2005), it
> gets worse: [ _some bad SQL_ ]

If you're stuck using a database version that first came out a decade ago,
you've got bigger problems than legacy syntax for pagination.

> Not only is this complicated and messy, but it violates the DRY principle
> (Don't Repeat Yourself). Here's same query in LINQ. The gain in simplicity
> is clear: var thirdPage = query.Skip(20).Take(10);

If you're using OFFSET / LIMIT to do paging then you're doing it wrong. See
here for the real way to handle paging for _large_ result sets: [http://use-
the-index-luke.com/blog/2013-07/pagination-done-t...](http://use-the-index-
luke.com/blog/2013-07/pagination-done-the-postgresql-way)

~~~
pmontra
Great link, thanks. Is PostgreSQL still the only db supporting that?

~~~
MarkusWinand
The right charts on slide 45 says "X" for SQLite. That changed recently:
[https://www.sqlite.org/releaselog/3_15_0.html](https://www.sqlite.org/releaselog/3_15_0.html)

DB2 LUW has also decent row-values support: [http://use-the-index-
luke.com/blog/2014-11/seven-surprising-...](http://use-the-index-
luke.com/blog/2014-11/seven-surprising-findings-about-DB2#blog-db2intro-
rowvalues)

Need to update the slides...

------
agentgt
I know many are going to disagree but I have come to despise almost all SQL
generators unless they are as close to 1-1 as possible (aka Lukas Eder's jOOQ
is acceptable in my book).

Even more controversial... I don't want to make it any easier for developers
to interact with the database unless it is absolutely correct and transparent.
I know.. I know... how could I say such I thing! But you see I want every
interaction with the our data repository... our precious data repository...
that is usually the bottleneck... I want you to know WTF you are doing when
you touch it.

We (my company and probably others) are past the days of monolithic rapid
prototype apps with ORMs. Now you pick a database and you learn the hell out
of it because you can stick to it. Back in the day one of the reasons ORMs
were so popular because companies would have to ship on premise ware that
would have to work with multiple databases. With the cloud that is no longer
the case. However I still think a lot of .NET shops come from the on premise
mentality. Thus performance is less of concern (speed of release is more
desired).

~~~
openasocket
For me there are two primary reasons for having an ORM or ORM-like framework

1\. Compile time checking. I hate when I'm writing a straight SQL query in a
string literal and misspell one of the column names, because everything will
compile and unit tests will pass, and the mistake isn't caught until I run
integration tests. I want a short edit-compile-debug cycle. jOOQ is a good
example of something that helps here.

2\. Arbitrary filters. Often I have to expose a REST API where the user can
add an arbitrary number of filters to their query. You can't write SQL
literals for these because there's a combinatorial explosion of possibilities.
So you wind up dynamically generating SQL yourself by concatenating strings in
the WHERE clause, hoping and praying you don't accidentally allow SQL
injection. I want an ORM to deal with that shit for me.

~~~
ReidZB
jOOQ helps with both 1 and 2. With jOOQ you can write something like (totally
off-the-cuff code incoming, written outside of an IDE with only basic syntax
highlighting, be warned, etc)

    
    
        List<Condition> conditions = new ArrayList<>();
    
        if (notNull(lessThan)) {
            conditions.add(MY_TABLE.MY_COLUMN.lt(lessThan));
        }
    
        if (notNull(greaterThan)) {
            conditions.add(MY_TABLE.MY_COLUMN.gt(greaterThan));
        }
    
        if (notEmpty(name)) {
            conditions.add(MY_TABLE.NAME.eq(name));
        }
    
        dslContext.selectFrom(MY_TABLE).where(conditions).fetch();
    

(This is obviously really contrived but I hope it illustrates the ideas)

We use constructions like this with jOOQ relatively frequently, especially for
cases where we offer all sorts of arbitrary filter knobs (useful for internal
tools especially).

In general, you can do some pretty fancy manipulation with `Condition` in
jOOQ, like you can build a list of conditions and then call
`DSL.or(conditionList)` to OR them all together. Stuff like that.

Source: use jOOQ in production code -- been _very_ pleased so far, other than
sometimes poor documentation

~~~
louthy
I much prefer:

    
    
        var query = from c in db.MyTable
                    where lessThan    == null ? true : c.MyColumn < lessThan    
                    where greaterThan == null ? true : c.MyColumn > greaterThan 
                    where name        == null ? true : c.Name == name        
                    select c;
    

Although IQueryable supports your approach also:

    
    
        var query = db.MyTable;
    
        if (lessThan != null) 
            query = query.Where(c => c.MyColumn < lessThan);
    
        if (greaterThan != null) 
            query = query.Where(c => c.MyColumn > greaterThan);
    
        if (!String.IsNullOrEmpty(name)) 
            query = query.Where(c => c.Name == name);
    
        var results = query.ToList();

------
PeterisP
It seems like an apples to oranges comparison - it is looking at the
readability of code returning similar results, but it doesn't consider how
these results are going to be generated, which is a major part of writing
queries.

Do the LINQ examples, especially the more complex ones, result in the same or
better execution plans? Maybe they do, but the article doesn't tell, and
without it it's kind of inappropriate to make judgments or even
recommendations.

~~~
godshatter
This was my concern as well. When I saw the "query.Skip(20).Take(10)" syntax,
I was wondering if the previous line that built the query variable downloaded
all 15 bajillion rows in my Customers table of people whose names started with
'A' so it could get rows 21 through 30.

If it actually only ends up creating some complex sql behind the scenes to get
those ten rows, how horribly convoluted did it become and wouldn't it be
better to write a straight SQL query to do the same thing, even if it's a
little more complicated than the pseudo code? I guess I prefer more control.
If my SQL query turns out to be slow, then I can learn from that and fix it.
If the generated query from LINQ is slow, I have no recourse over it.

~~~
jdmichal
> I was wondering if the previous line that built the query variable
> downloaded all 15 bajillion rows in my Customers table of people whose names
> started with 'A' so it could get rows 21 through 30.

That would be a silly thing to do. LINQ is made to be lazy. At worst it would
pull the first thirty rows, twenty in Skip and ten in Take.

------
4bpp
I was pretty bewildered by the convoluted way they posited for selecting a
subrange of rows, but then I googled around and learned that LIMIT is a
nonstandard MySQL extension. Huh.

(Apparently, while LIMIT n,k is MySQL-only, MySQL, PostgreSQL and sqlite all
support LIMIT k OFFSET n. So I assume this blog post is only really relevant
for people who live in a Microsoft ecosystem?)

~~~
Someone
It took a while, but SQL Server 2012 introduced OFFSET/FETCH:

    
    
        SELECT * FROM items
        ORDER BY key
        OFFSET 100 ROWS
        FETCH NEXT 10 ROWS ONLY
    

([https://technet.microsoft.com/en-
us/library/gg699618(v=sql.1...](https://technet.microsoft.com/en-
us/library/gg699618\(v=sql.110\).aspx))

------
skywhopper
"SQL is a very old language—invented in 1974."

Not a good start. SQL is still in widespread use 43 years later because of how
good it is. It's not a legacy weighing us down. Rather it has consistently
proven its usefulness over and over again. Some of the constructs can feel
awkward to be sure. But part of our impression of awkwardness is really just
due to SQL's declarative nature, which makes the structure of queries look a
lot different than the procedural languages we spend most of our time in.

But it's not clear to me from this article that LINQ is even anything
different that what already exists in most every modern application platform.
It provides a cleaner syntax for some subset of common query patterns. Okay.
There's a ton of SQL generators out there. Is LINQ doing something _else_ for
us? This article doesn't say. I don't know, and I didn't learn that from this
article.

If you use C# or VB to write apps against relational databases, then by all
means, use this to write your SQL. But don't pretend the SQL isn't there or
isn't important. LINQ just places you one more potentially leaky abstraction
away from your dataset.

~~~
eeperson
> SQL is still in widespread use 43 years later because of how good it is.
> It's not a legacy weighing us down.

I'm rather dubious of this claim. This is kind of like claiming the reason we
use Javascript is because of how good it is. The awkwardness of SQL is not
because it is declarative, otherwise LINQ would have the same awkwardness
because it is also declarative. I think the real reason SQL is because
database vendors have not made an effort to add support for other languages.

------
moomin
Only it doesn't. The examples given _compile_ to SQL. So you're not comparing
LINQ with SQL but LINQ-to-SQL (or the entity stuff) with SqlCommand.

SQL's a pretty grim language, but it's the only language the database
understands as a first-class citizen. Until someone designs a new foundation
(like WebASM is doing), the rest is just syntactic sugar.

~~~
skissane
> SQL's a pretty grim language, but it's the only language the database
> understands as a first-class citizen. Until someone designs a new foundation
> (like WebASM is doing), the rest is just syntactic sugar.

SQL is an ugly COBOLesque veneer over an elegant relational calculus core.

But the "new foundation" you speak of already exists, and has for decades.
Competing query languages based on relational calculus are less ugly – QUEL,
D, etc.

Sadly, those better looking query languages have failed in the
marketplace–ugly SQL won.

~~~
cwyers
I have looked at D, and read some of CJ Date's books. As far as I can tell, D
doesn't solve any of the actual problems with SQL, like composibility and code
reuse, and just solves Date's imagined problems about not following relational
theory close enough.

------
olmo
It's funny how async/await has influenced so many other languages but LINQ,
that in my opinion is much more interesting (specially the DB stuff) for
solving down to earth problems is sistematically ignored by other languages
(Python, Java, JavaScript, etc...)

~~~
amyjess
In the case of Python and Java, I think it's because generator expressions and
Streams, respectively, accomplish much the same thing.

Honestly, if you've seen what LINQ compiles to, it looks an awful lot like a
Java 8 Stream.

Using Wikipedia's example of LINQ translation
([https://en.wikipedia.org/wiki/Language_Integrated_Query#Lang...](https://en.wikipedia.org/wiki/Language_Integrated_Query#Language_extensions)):

Written LINQ is:

    
    
        var results = from c in SomeCollection
                      where c.SomeProperty < 10
                      select new {c.SomeProperty, c.OtherProperty};
    

It compiles to:

    
    
        var results =
             SomeCollection
                .Where(c => c.SomeProperty < 10)
                .Select(c => new {c.SomeProperty, c.OtherProperty});
    

And in Java 8:

    
    
        Stream<> results = someCollection.stream()
                                         .filter(c -> c.getSomeProperty() < 10)
                                         .map(c -> new AbstractMap.SimpleEntry<>(c.getSomeProperty(), c.getOtherProperty()));
    

(of course, iterating over them is different; you just use a for loop in C#,
but in Java you have to either use .forEach() or .collect() to a collection
and then for over that)

I think Streams have more LINQ in them than most people think.

Disclaimer: it's been about a year since I've written Java 8, and this is off
the top of my head (and the last time I used it, my employer's codebase had a
class for pairs that was better than just SimpleEntry), so the code could be
wrong.

Edit: and just for completeness... the same in Python generator expressions:

    
    
        results = ((c.some_property, c.other_property)
                   for c in some_collection
                   if c.some_property < 10)
    

Not very LINQ-like, but it serves the same purpose.

~~~
rjbwork
In the real world, MOST people are using the "it compiles to" version directly
in the code. I certainly don't write the "from x in y select" version, except
in the case of multiple hairy joins. Even those have become second nature at
this point.

It's just more readable, and turns it into an object pipeline/functional
programming style instead. The Java 8 streams were pretty much directly taken
from LINQ and given their traditional functional names.

~~~
louthy
> In the real world, MOST people are using the "it compiles to" version
> directly in the code

What's your source for this claim? Because anecdotally I don't see that at
all.

~~~
gregmac
Our codebase at work has hundreds if not thousands of LINQ queries, mostly
using method syntax. I generally prefer it over query syntax except in a few
cases, such when there's multiple joins or subqueries, or the when the
statement is more than a few dozen lines. Occasionally a very complex LINQ
statement requires using a `let` declaration which also is easier with query
syntax. Luckily Resharper has an action to convert back and forth on the fly,
so it's easy to use that to switch over when necessary.

I also don't use LINQ to SQL at all. It's all on in-memory stuff. I pretty
much never write code like this:

    
    
        var result = new List<string>();
        foreach (var item in input) 
        {
            result.Add(item.Value);
        }
        return result; 
    

instead, using LINQ:

    
    
        return items.Select(x => x.Value);
    

The real power comes when you start mixing conditions:

    
    
        return items
            .Where(x => IsValidKey(x.Key))
            .Select(x => x.Value);
    

Or doing quick checks:

    
    
       if (items.Any(x => x == null || x.SomeValue == null)) 
           throw new InvalidArgumentException(nameof(items));

~~~
rjbwork
And now, with C#6 that last bit becomes:

if(items.Any(x => x?.SomeValue == null))

------
ericls
Elixir's Ecto is very similar to this.

[https://hexdocs.pm/ecto/Ecto.Query.html#content](https://hexdocs.pm/ecto/Ecto.Query.html#content)

------
fnovd
I couldn't help by roll my eyes at that title. ORMs and the like are great,
but they still end up using SQL under the hood. LINQ is by no means the most
aesthetic ORM, either (and half of the article is about much _better_ LINQ
looks than SQL).

The headlines may as well read: "Why [interpreted language] beats assembly".

~~~
WaxProlix
LINQ isn't an ORM though, is it? At least not in the traditional sense.

~~~
contextfree
LINQ is a set of language features and libraries that help enable better ORMs,
among other things.

LINQ to SQL and Entity Framework are examples of ORMs that use the LINQ
features

------
OldSchoolJohnny
If the title says some piece of tech "beats" some other piece of tech can we
just all act like proper engineers and not post it here at all and save us all
some time?

~~~
keithnz
I think sometimes it's worth looking at the usernames of who posts and
consider whether you want to contribute to the thread.

~~~
bdcravens
Judging the username, or the person behind the username?

------
yellowstuff
I agree with the article and try to use write my queries with Linq instead of
SQL almost always.

The article says to avoid using Linq for bulk inserts, but for years I've been
using an extension method to translate Linq to a bulk insert and it works
fine. I forget where I found it.

    
    
        public static class DataContextExtension
        {
            public static void BulkInsertAll<T>(this DataContext dc, IEnumerable<T> entities)
            {
                using (var conn = new SqlConnection(dc.Connection.ConnectionString))
                {
                    conn.Open();
    
                    Type t = typeof(T);
    
                    var tableAttribute = (TableAttribute)t.GetCustomAttributes(
                        typeof(TableAttribute), false).Single();
                    var bulkCopy = new SqlBulkCopy(conn) 
                    {
                        BulkCopyTimeout = 1200,
                        DestinationTableName = tableAttribute.Name
                    };
    
                    var properties = t.GetProperties().Where(EventTypeFilter).ToArray();
                    var table = new DataTable();
    
                    foreach (var property in properties)
                    {
                        Type propertyType = property.PropertyType;
                        if (propertyType.IsGenericType &&
                            propertyType.GetGenericTypeDefinition() == typeof(Nullable<>))
                        {
                            propertyType = Nullable.GetUnderlyingType(propertyType);
                        }
    
                        table.Columns.Add(new DataColumn(property.Name, propertyType));
                    }
    
                    foreach (var entity in entities)
                    {
                        table.Rows.Add(
                            properties.Select(
                            property => property.GetValue(entity, null) ?? DBNull.Value
                            ).ToArray());
                    }
    
                    bulkCopy.WriteToServer(table);
                }
            }
    
            private static bool EventTypeFilter(System.Reflection.PropertyInfo p)
            {
                var attribute = Attribute.GetCustomAttribute(p,
                    typeof(AssociationAttribute)) as AssociationAttribute;
    
                if (attribute == null) return true;
                if (attribute.IsForeignKey == false) return true;
    
                return false;
            }
        }

~~~
NicoJuicy
I'm actually doing something else when updating a webshop from a remote
source..

I convert all objects that i import to a big SQL Query.

Something like

First query: UPDATE table SET active=0;

Second ( big ) query: UPDATE table SET _properties_ = _values_ IF ROWCOUNT =0
INSER INTO table( _properties_ )VALUES( _values_ )

------
NicoJuicy
What i mostly like about LINQ is IQueryable. It's awesome!

I just design the IQueryable with if's, switches, ... and when i need to
execute it. I use .ToList()

The IQueryable is then translated to SQL and executed it.

~~~
OldSchoolJohnny
And then time goes by some guy comes along rips out the LINQ and fixes all
that so it's performant under load with hand crafted sql queries.

~~~
NicoJuicy
And then you have no more errors in the VS Error Window ( LINQ is static
typed), when someone changes the domain layer and everything goes to ____*
when going into production and now you have to manually check all strings if
they are adjusted..

As said before, change to pure SQL WHEN something goes too slow. You don't
have to change everything. It will bite you in the __* later ;)

Gotta love static typed :)

------
siddboots
I currently use a bit of LINQ, SQLAlchemy, Blaze/Odo, and a few other tools,
depending on the project. However, primarily I'm still writing SQL in 2017
despite want of a prettier alternative. Here's the main reasons:

* In pretty much every project I've ever worked on, there's a small, un-standardised set of database functions I rely on. I need to parse strings and dates, round to the start or end of the month or week, group and categorise values in order to build a dimension table. I've never found a SQL generator that can provide this, although I believe SQLAlchemy Core folks are working on it.

* Planners are still not so good that you can ignore them altogether. I often rely on EXPLAIN to see why things are taking seconds instead of milliseconds, and I often end up looking for ways to trick the planner. If I'm using a SQL generator, I'm now trying to trick both the planner and the generator, neither of which I have direct control over.

* If you're building a lot of complicated queries, setting up indexes and views, and so on, it's hard to get past the advantages of an IDE. Currently, the best tools are pretty much all database specific. Even just being able to quickly see your data in a tabular format is huge.

~~~
jsd1982
To your first point, LINQ-to-SQL can bind database functions for use directly
in LINQ queries. I've used this first hand for parsing comma delimited string
parameters into table values and joining from there in LINQ.

I haven't used entity framework or any newer tech beyond LINQ to SQL but I'd
have to believe such a trivial feature is present in them.

~~~
siddboots
You are right, and just about every ORM product gives you _some_ way to get at
raw database functions. It would be great, however, if they also offered a
small RDBMS-agnostic library of functionality that would generate the correct
database functions.

------
spion
The outer join example is not what I'd want LINQ to do. Most of the time it
would be preferable to send two queries, one to retreive customers, the other
to retreive high value purchases, then do the "joining" locally.

Otherwise we're potentially fetching the customer data N times per customer
(depending on the average number of high value purchases)

------
quizotic
Sigh. It's hard to know where to begin critiquing this.

I guess the TL;DR would be something like: "For constrained but not atypical
data models (no N:M relationships, foreign keys explicit in the schema, etc),
it's possible to have a more concise syntax than SQL, with less impedance
mismatch."

But even that underwhelming statement is generous.

In the 1980s, the object database claim to fame was less impedance mismatch,
and ODI's "coding by deletion", but at least ODBs coupled that with claims to
superior performance for network data models.

Maybe it's just the grandiose title "LINQ beats SQL" that's so annoying.
Simpler syntax would certainly be welcome for the relatively rare cases of
humans generating SQL (I think most SQL is machine generated these days)...
but only with a host of etceteris parabis conditions, prominently including
performance.

~~~
quizotic
In the 70s, when I was all bushy-tailed, I cornered Gerry Sussman at MIT to
tell him about "this new relational data model and SQL language". About two
minutes into the conversation I saw him looking at me like "God, why have you
chosen to torture me with this idiot."

Or maybe it was the look you have when you are SURE that someone is telling
you an elaborate joke and suddenly realize that no punch line is coming.

LINQ: I'm looking at YOU :-/

------
tracker1
In most of the examples, it will still execute the more complex SQL against
the database... however, it's somewhat easier to reason about.

That said, I tend to prefer to use the collection extension methods, over the
LINQ syntax, which make more sense in my mind.

------
lukaseder
There will always be 1-2 queries that can be proven to work better in an
alternative query language, other than SQL. For instance, I can surely find a
couple of awesome graph queries in the Cypher query language (Neo4j). Or some
JSON queries in N1QL (CouchBase). Back in the days, they tried to advocate
that QUEL was a BetterThing™. Where's QUEL now?

Same will be said of LINQ in a decade. Whereas SQL will still be there.

------
gigatexal
SO LINQ is more terse but there were no performance metrics given. Given the
same RDBMS with proper indexes, statistics, etc., all functioning properly
which is quicker? I think it'd be murky if LINQ just pulls raw data to the
client and then does the transformation there vs. on the server with SQL.

~~~
douche
There is a bit of an art to writing LINQ queries, but if you know what you're
doing (i.e. spend a little time experimenting, with the traces turned on in EF
or Linq to SQL that output the generated SQL, or a good profiler), you can
write queries that are exactly the same as hand-tuned SQL. There's a little
reflection overhead mapping results to types, but, you either make use of
that, or have to write all that boilerplate yourself, so that's the tradeoff.

Of course, if you don't know what you're doing, you can write some monstrous
Linq queries that will be dog slow and exhibit all the worst n+1 behavior,
pretty easily.

------
proyb2
I guess not many came across this guide.

For LINQ to other languages: Swift Java Kotlin Clojure Dart Elixir

[https://github.com/mythz/swift-linq-examples](https://github.com/mythz/swift-
linq-examples)

------
jedberg
So I'll admit I'm not at all familiar with LINQ and MSSQL, but from reading
this, it looks like LINQ is just an ORM?

And if that's the case, then of course it is easier and looks nicer than SQL
-- that's kind of the point of an ORM.

But I could show you a whole bunch of articles about Python and Ruby and Go
ORMs too.

Seems like an Apples to Oranges comparison.

~~~
jordanlev
It's not an ORM, more like an alternate way to write queries that's kind of
embedded in the language itself (as opposed to SQL where you're putting
together a string). If you're familiar with the Rails world, I'd say it's more
like "Arel" than "ActiveRecord".

~~~
jedberg
But Linq literally generates SQL before sending it to the server:

[https://msdn.microsoft.com/en-
us/library/system.data.linq.da...](https://msdn.microsoft.com/en-
us/library/system.data.linq.datacontext.getcommand.aspx)

Doesn't that make it an ORM?

~~~
contextfree
That link is about LINQ to SQL, not LINQ proper

* LINQ is a set of language features and libraries that help enable better ORMs, among other things

* LINQ to SQL and Entity Framework are examples of ORMs that use the LINQ features

------
ww2
But LINQ is too slow comparing to vanilla SQL.

------
throwawasiudy
There's one BIG problem with LINQ. I'll agree that the syntax is marvelous, a
game changer. The problem is performance.

It's gotten slightly better over the years but compared to an ORM with less
abstraction like Hibernate it's still dog slow. Every MS project I've worked
on was mostly LINQ...then a folder called "dirty SQL" for the heavy stuff.

I'm not sure if it's due to the highly abstracted nature or just not making
performance a priority but in my experience sane Hibernate queries are about
1/2 the speed of native SQL and LINQ is closer to 1/50.

I hope and pray they can make the performance at least within an order of
magnitude of raw SQL or even Hibernate so I can say goodbye to SQL forever.

~~~
NicoJuicy
Use LINQ for development and production, when you need performance use Dapper.
It shouldn' be too hard if you follow the DDD pattern.

You can always use DataContext.Database.SQL(query) ofc

~~~
throwawasiudy
> You can always use DataContext.Database.SQL(query) ofc

that's pretty much our MO :) . Never used Dapper, how reliable is it? I ask
because we use StackExchanges' Redis client for C# and it mysteriously crashes
even after untold hours of debugging

~~~
NicoJuicy
Actually, haven't had any issues with it. So, it's safe to say that i find it
very stable.

