

Writing your own LINQ provider - aashishkoirala
http://aashishkoirala.wordpress.com/2014/03/10/linq-provider-1/

======
aashishkoirala
OP here. I wanted to comment on the ".NET vs. other things" conversation that
has started brewing here. I'm not one to hate on any environment or ecosystem.
I do make fun of PHP or Java at times, but it's all good-natured. Not to be
too PC, but yes, everything has its pros and cons.

Personally, however, I think C# as a language strikes the right balance in
terms of power, ease-of-use, learnability and code readability. Even if you
have a problem with Microsoft, I think it is wrong to let your "MS-hate"
translate into "C#-hate". It is a beautiful language and the open-source
community around it is growing rapidly. The progress made by platform-
independent endeavors (case in point Xamarin) is admirable.

In the unlikely scenario that all of programming were to converge into one
language, I would want it to be an open-source, platform-independent version
of C# (with maybe just a tad more constructs borrowed from the functional
programming world).

P.S. I also think it is a mistake to lump C# and Java in the same bucket. Ten
or more years ago, maybe. Today, they are VERY different.

~~~
doorhammer
I do C# dev in my day to day and I really enjoy it, though my personal
preference for a "one language" scenario would probably be F#, if it were to
come from the MS camp. (I don't mean this as a comparison of languages; just
casually throwing out a personal preference)

Was wondering what, in your opinion, are the largest differences between Java
and C#? It's been a long time since I've done any Java, and while I can
definitely name a bunch, I thought it'd be interesting to get your take on the
most impactful or meaningful differences.

~~~
aashishkoirala
Yes, I've heard a lot of praise around F# and its power. I've been meaning to
get into it. Coming into it cold, though, I found the syntax a bit cryptic,
but that's probably just me. I do plan to power through, though.

In terms of differences with Java (and limiting this to just the language and
the syntax), LINQ is a good example of it :) I guess you could classify a lot
of the other differences as syntactic sugar, but those little things add up
and make a real difference in terms of code brevity, readability and developer
productivity. To name a few (and I know that Java, especially with v8, has
started adding support for some of these): Properties (and auto-properties),
Delegates, Events, Lambdas/Anonymous Functions/First Class Functions,
Extension methods, "using" for IDisposable (although Java 7 has something
similar in AutoCloseable), the lack of forced checked exceptions, the newly
added async/await keywords, object and collection initializers and anonymous
types, etc.

~~~
doorhammer
Awesome, thanks.

F# is definitely a nice language. I haven't done any large scale projects in
it, but what I have done has been very nice. Check out type providers if you
haven't already. They're really interesting and handy. I switched over to F#
as my side/fun language from Clojure when my day-job became .NET. It's not
that I'd say they're equivalent languages, but the way of thinking about most
things was similar enough that it felt right. On top of that, it was nice
having a fairly strong type system to back things up. It meshed more with how
I think about things.

Hah, I probably should have assumed LINQ would top the list in a LINQ post. I
remember before I knew what LINQ was about. Once I started using it, it
quickly became one of my favorite parts of the language/ecosystem.

I really do feel that the things you've mentioned do make it a much cleaner
and more consistent language than Java. I've got a friend who switched from
full time Java to C# recently. I'll have to see what he thinks as well.

------
Guillaume86
Nice post. If you plan on writing a full-featured provider, I recommend using
the ReLINQ framework:
[http://relinq.codeplex.com/](http://relinq.codeplex.com/)

I regret that I didn't know about it when I wrote my first LINQ provider.

~~~
jdaley
Although it's quite a bit older than re-linq, I'll throw a recommendation in
for IQToolkit: [http://blogs.msdn.com/b/mattwar/archive/2008/11/18/linq-
link...](http://blogs.msdn.com/b/mattwar/archive/2008/11/18/linq-links.aspx)

I used it to build a LINQ provider for a legacy mainframe database system,
abstracting away the incomprehensible table and column naming, weird date
systems, EBCDIC, and other nastiness involved in querying that system using
straight-up SQL.

One thing I found is that implementing a custom LINQ provider gives you a
better understanding of how Entity Framework runs your queries. You get to see
how queries get split up into the parts executed in the CLR versus the parts
that get translated, and how expressions get rearranged and rewritten.

~~~
aashishkoirala
I second the comment about getting a better understanding of how EF, etc. work
after writing your own provider. Definitely.

------
gum_ina_package
Awesome stuff. As a side note, anyone else see a rise in .NET/Microsoft
related posts recently?

~~~
skrowl
I'm actually GLAD to see this rise in .NET stuff. I feel like us enterprise
developers are left out on [Y] sometimes. Not everyone rolls python / ruby /
node / etc.

~~~
mattgreenrocks
It's a shame that .NET is considered 'just' enterprise.

C# is a great language, and there are several libraries with fantastic design
aesthetic; one that is _much_ better than most of Ruby/JS.

~~~
UK-AL
Most people will admit they like C# and F#. However, they won't like the
Microsoft tax, and the fact it's tied to Windows.

~~~
profquail
How are C# and F# tied to Windows? They run quite nicely on Mono, although
I'll admit that even with the significant, recent improvements Mono has made
it's still not as fast (generally) as the .NET CLR on Windows.

Also -- F# is completely open-source, so there's no "Microsoft tax" to be
paid. In fact, a large portion of the F# community runs exclusively on non-
Windows platforms (e.g., OS X, Linux, FreeBSD, Android).

~~~
platz
They may "run" on mono, but their performance there is terrible.

~~~
profquail
I disagree. The performance on Mono is admittedly not as good as on Windows,
but it's improving steadily and certainly fast enough for many use cases.

~~~
dev360
Why should anyone bother with .NET if Microsoft can't make a wholehearted
commitment to have it run on all platforms?

------
sidmkp96
LINQ is one of the most beautiful things in C# after extension methods!

~~~
corresation
I would say that LINQ is one of the worst things to ever happen to .NET. On
one team I had to make a dictum that barred LINQ from even being used, except
in exceptional circumstances.

I do not mean to raise the ire of defensive .NET programmers by saying this:
as with most things, used right it is a beautiful thing, and there is nothing
_fundamentally_ wrong with LINQ. But what it was used for in practice was
almost always grotesquely inefficient.

If someone had to make a loop to iterate objects to find a member on every
function call, pretty soon they'll realize they should be using a
System.Collections.Generic.Dictionary for instance, or some other appropriate
data structure (e.g. Need sorted data? Then use a sorted tree, for instance).
When it's a simple LINQ query, though, it's one seemingly harmless little
line, so it tends to get a pass.
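To make the contrast concrete, here is a minimal sketch of the two lookup styles. The `Account` type and the `AccountLookup` helper are hypothetical names invented for this illustration, not code from anyone in the thread, and the sketch assumes the `Id` values are unique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, used only for illustration.
public sealed class Account
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class AccountLookup
{
    // One harmless-looking line, but every call scans the list: O(n) per lookup.
    public static Account FindByScan(List<Account> accounts, int id)
    {
        return accounts.FirstOrDefault(a => a.Id == id);
    }

    // Built once in O(n). Assumes Ids are unique; ToDictionary throws on duplicates.
    public static Dictionary<int, Account> BuildIndex(List<Account> accounts)
    {
        return accounts.ToDictionary(a => a.Id);
    }

    // Hash lookup: O(1) on average per call once the index exists.
    public static Account FindByIndex(Dictionary<int, Account> index, int id)
    {
        Account found;
        return index.TryGetValue(id, out found) ? found : null;
    }
}
```

Both lookups return the same result; the difference only shows up when the lookup runs many times over a large collection, which is the situation described above.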

Pretty soon profiles were just hundreds (if not thousands) of different cases
where LINQ was used as a cure-all, destroying performance under any load at
all.

LINQ is not magical. It is doing the same dirty work that you would be doing
yourself if you needed to filter / sort / recast, but provides syntactical
sugar to do so. The compiler and runtime will happily generate horrifically
inefficient code if you request it, with no warning or complaints.

The tl;dr is that it is a fantastic set abstraction that is more often than
not grossly abused.

~~~
jmcqk6
I would change this a little bit. LINQ creates much more readable code, which
is always a good thing.

I think the real problem is Cargo Cult programming. People are using LINQ
(especially in combination with EF) and not understanding what's actually
going on. So they end up getting things like n+1 selects on collections with
thousands of items. They don't understand delayed evaluation. Or how sorting
affects a Where() call.
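As a small illustration of delayed (deferred) evaluation in LINQ to Objects, the sketch below counts how often a filter predicate runs when the same query is consumed twice. `DeferredDemo` and `CountPredicateCalls` are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns how many times the predicate ran after consuming the query twice.
    public static int CountPredicateCalls(bool materialize)
    {
        int calls = 0;
        var numbers = new List<int> { 1, 2, 3, 4 };

        // Nothing is filtered here: Where only records what to do later.
        IEnumerable<int> evens = numbers.Where(n => { calls++; return n % 2 == 0; });

        if (materialize)
        {
            // ToList runs the filter once and stores the results.
            evens = evens.ToList();
        }

        evens.Count(); // first consumption
        evens.Sum();   // second consumption; re-runs the filter unless materialized

        return calls;
    }
}
```

With four source items, the predicate runs four times when the query is materialized first and eight times when it is not, because each consumption re-enumerates the source. The same reasoning applies to ordering: sorting before filtering sorts elements that the filter would have discarded anyway.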

The problem is programmers who think they can take some code from Stack
Overflow and use it without understanding it. That problem is not unique to LINQ
or .NET.

~~~
corresation
_That problem is not unique to LINQ or .NET._

It absolutely isn't unique to it -- powerful features allow programmers to
shoot themselves in the feet, and this has been true since we began this
profession.

When dealing with larger applications and datasets it has the potential of
easily imposing _enormous_ costs, of the sort that absolutely eclipse most
other anti-patterns. Appending on an immutable string through recursive
function calls suddenly looks like child's play.

The truth is that in most apps, these inefficiencies simply don't matter, and
to all of the "just learn how to use it right!" replies, I would say that odds
favor that those people aren't using it right at all. But in the end it just
doesn't matter that much because small amounts of data, low demands, etc., just
make it an irrelevant cost. Most apps are running with an excess of CPU
resources relative to the demands on the app, so it just doesn't matter.

Many .NET developers have never run Visual Studio's profiler. They don't even
know what the costs are, but we're in a world where 1500ms to load a basic
form is okay.

When you deal with very large scale financial data, as I do, however, your
world is a very different world. You don't just have a scrum where you try to
teach people that LINQ isn't magic (which of course we did). The risk factors
become much higher.

~~~
louthy
If the milliseconds matter so much then why aren't you using code reviews or
unit tests to catch the problem queries?

LINQ isn't just about hitting a database either, it enables a more declarative
/ functional approach to development in general, which brings huge benefits in
terms of parallelisation / robustness / testability.

It seems absurd that you would block your team from using it.

Edit: Oh and btw, you're not the only person who has to deal with "large scale
data". Many of us do and manage just fine with LINQ.

~~~
corresation
Who said LINQ was just about the database? Of _particular_ concern, actually,
is LINQ to Objects.

And yes, those milliseconds matter. And it's very strange that I noted that
after repeatedly encountering performance issues with LINQ, you say that we
should have oversight. Do you understand that such _is_ oversight? That the
rules and exceptions came about because the recurring issue with LINQ (
_particularly_ LINQ to Objects) being an enormous performance issue?

It doesn't matter to you, probably because you don't even know what the cost
is [and that isn't meant to be at all a slam. For most software the cost
simply doesn't matter]. And that's fine. But save the continual exclamatives
(which you always add right after defensively down-arrowing).

Though I have to chuckle at the notion that LINQ helps with testability, or
robustness for that matter.

~~~
louthy
> That the rules and exceptions came about because the recurring issue with
> LINQ (particularly LINQ to Objects) being an enormous performance issue?

No, you're exaggerating your position to try and justify it. There is zero
substance in what you're saying. All I see is someone who appears to have made
an irrational decision about a technology because he personally doesn't like
it. There's no fundamental reason why LINQ to Objects should be any slower
than say a foreach over a set. You can get in a mess with either approach. If
you're not prepared to mentor your team, or do code reviews to check for
problematic usage then it's a fault of yours, not the technology.

> It doesn't matter to you, probably because you don't even know what the cost
> is [and that isn't meant to be at all a slam. For most software the cost
> simply doesn't matter].

Of course it's meant to be a slam, you have no knowledge of my experience of
optimisation or my deeper understanding of how frameworks like LINQ work, so
instead you assert that I "don't even know what the cost is". I have spent a
large part of my career doing 'to the metal' optimisation, so I will ignore
that comment.

> Though I have to chuckle at the notion that LINQ helps with testability, or
> robustness for that matter.

If you don't understand why imperative and functional code are different then
you won't understand why LINQ is inherently more robust and testable.

~~~
corresation
_There's no fundamental reason why LINQ to Objects should be any slower than
say a foreach over a set._

There's no fundamental reason why LINQ to Objects should be any _faster_ than
a foreach over a set. Which is _precisely the point_. LINQ has an amazing way
of hiding details in a manner that allows for massively inefficient code to
look completely innocuous. For throw-away, one-off "command line utility to
convert A to B" type code, it is absolutely golden. For long-term use code
that may be called millions of times, it can be disastrous.

You have no interest in a reasoned discussion, and your argument style is best
described as defensive. That is all. Have a nice day.

~~~
louthy
This is quite incredible. Maybe have a read through the various arguments
you're having on this entire topic and see who's being defensive. If you
really cared about performance you wouldn't use C#, you'd use C or C++.
Anything where you could absolutely control every aspect of memory allocation,
memory access, bounds checking, IO, etc.

I'm out of this thread, it's not adding to the discussion.

~~~
corresation
_If you really cared about performance you wouldn't use C#, you'd use C or
C++_

This, in a nutshell, is your argument. And it's an incredibly poor, out of
place argument that is founded on ignorance, and does absolutely no service to
.NET.

Performance in most modern applications is achieved through proper algorithms
and data structures. LINQ is often a hasty short-cut around those, but it
offers the illusion of elegance such that many people (such as, apparently,
you. It's all functional-like and such, so it has to be good, right?), are
blissfully unaware.

Again, a lot of the time that's perfectly fine and causes no real harm --
similar to the way regular expressions might be a costly but powerful catch
all -- but in other cases it is not good. That is the case with large scale
financial software. If you have a different experience, good for you, but simply
saying "incredible!" over and over again does absolutely nothing to make your
case.

~~~
platz
Taking a line out of DHH's playbook, perhaps it would be better to discuss a
concrete code sample. E.g.,
[https://news.ycombinator.com/item?id=7335211](https://news.ycombinator.com/item?id=7335211)

------
thesz
You can get away with providing your own classes and extension methods. I did
that, and my colleagues did that (following my attempt). We needed LINQ access
to a non-SQL (hypergraph) database, so basic LINQ wouldn't do it.

You need to know how to parse function bodies, and I strongly suggest an
algebraic-types approach, e.g., using SpecificType v = genericTypedVar as
SpecificType and checking if v != null. You have to write your own
parameterized type supporting Where, Select, etc.
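A rough sketch of what such a type could look like follows. `Node` and `GraphQuery<T>` are hypothetical names invented for this illustration (this is not the code described above), and the sketch only handles predicates of the form member, comparison operator, constant.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical record type for the illustration.
public sealed class Node
{
    public int Weight { get; set; }
}

// Hypothetical query type. It does not implement IQueryable; it only offers
// a Where method that accepts the lambda as an expression tree.
public sealed class GraphQuery<T>
{
    private readonly List<string> filters;

    public GraphQuery() : this(new List<string>()) { }

    private GraphQuery(List<string> filters) { this.filters = filters; }

    public IReadOnlyList<string> Filters { get { return filters; } }

    public GraphQuery<T> Where(Expression<Func<T, bool>> predicate)
    {
        // Each call returns a new query with one more translated filter.
        var next = new List<string>(filters) { Describe(predicate.Body) };
        return new GraphQuery<T>(next);
    }

    // "as" plus a null check stands in for matching on one case of an algebraic type.
    private static string Describe(Expression body)
    {
        var binary = body as BinaryExpression;
        if (binary == null)
            throw new NotSupportedException("Only binary comparisons are handled.");

        var member = binary.Left as MemberExpression;
        var constant = binary.Right as ConstantExpression;
        if (member == null || constant == null)
            throw new NotSupportedException("Expected: member <operator> constant.");

        return member.Member.Name + " " + binary.NodeType + " " + constant.Value;
    }
}
```

Calling `new GraphQuery<Node>().Where(n => n.Weight > 3)` leaves a single filter description, `Weight GreaterThan 3`, which a real implementation would turn into a query for the target database instead of a string.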

And you're good to go.

It took me three days to write my first LINQ-like thing.

~~~
aashishkoirala
That is true as well, especially if you need to support operations not part of
out-of-the-box LINQ. The real meat there is in the Expressions API. The actual
LINQ methods, if you look at them, don't do much except provide a way to
delegate to the underlying IQueryProvider.
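One way to see this for yourself is to inspect what a standard operator leaves behind. The sketch below uses the built-in `AsQueryable()` wrapper over an array; `QueryableInspection` and its methods are hypothetical helper names made up for this example.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class QueryableInspection
{
    // Builds a query through the standard Queryable.Where operator.
    public static IQueryable<int> BuildQuery()
    {
        // AsQueryable wraps the array in the built-in LINQ to Objects provider.
        return new[] { 1, 2, 3 }.AsQueryable().Where(n => n > 1);
    }

    // Returns the name of the outermost method call recorded in the
    // query's expression tree, or null if the root is not a method call.
    public static string OutermostCall(IQueryable query)
    {
        var call = query.Expression as MethodCallExpression;
        return call == null ? null : call.Method.Name;
    }
}
```

Here `OutermostCall(BuildQuery())` reports `Where`: the operator did not filter anything itself, it recorded the call in the expression tree and handed that tree to `query.Provider`. The filtering only happens when the query is enumerated and the provider executes the tree.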

That said, I think for stuff that is supported, better to stay within the
"standard approach".

But I get your point.

------
aashishkoirala
And just as we were talking about .NET in other environments and Mono and
Xamarin, we hear the news about Microsoft in talks to acquire Xamarin.
Interesting.

------
rip747
LINQ makes moving from CFML to .NET a lot easier. In CFML you're mostly
working with queries, and using LINQ makes you feel right at home.

