> If you're using an RDBMS, bite the bullet and learn SQL.
If this person spent all that time using Hibernate and then SQLAlchemy, and all that time did not know SQL, then their suffering and bad experiences make complete sense. You absolutely need to know SQL if you're going to use an ORM effectively. Good ORMs are there to automate the repetitive tasks of composing largely boilerplate DML statements, facilitating query composition, providing abstraction for database-specific and driver-specific quirks, providing patterns to map object graphs to relational graphs, and marshaling rows between your object model and database rows - that last one is something your application needs to do whether or not you write raw SQL, so you'll end up inventing that part yourself without an ORM (I recommend doing so, on a less critical project, to learn the kinds of issues that present themselves). None of those things should be about "hiding SQL", and you need to learn SQL first before you work with an ORM.
> Good ORMs are there to automate the repetitive tasks of composing largely boilerplate DML statements, facilitating query composition, providing abstraction for database-specific and driver-specific quirks
None of that requires an ORM. A simple query builder will suffice and it will be much easier to debug and much less error prone than an ORM.
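To sketch the distinction: a "query builder" can be as small as a function that composes a parameterized statement and hands it back to you. A toy Python example (the `build_select` helper is invented for illustration, not from any real library):

```python
import sqlite3

def build_select(table, columns, where=None):
    """Compose a parameterized SELECT. `where` is a dict of
    column -> value, combined with AND. Returns (sql, params)."""
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    params = []
    if where:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in where)
        params = list(where.values())
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ada'), (2, 'bob')")

sql, params = build_select("users", ["id", "name"], {"name": "ada"})
rows = conn.execute(sql, params).fetchall()
# rows == [(1, 'ada')]
```

The generated SQL stays one obvious string away from what runs on the server, which is the debuggability argument in a nutshell.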
> providing patterns to map object graphs to relational graphs, and marshaling rows between your object model and database rows - that last one is something your application needs to do whether or not you write raw SQL
So this is the real ORM juice. And ya, without an ORM you have to do this by hand.
So here’s the question: How well can an ORM map to your data model out of the box? In a lot of cases this is where the mess comes from. You need to set up basically a low level AI that can figure out the “right” thing to do as call sites all over your codebase are making arbitrary queries on a huge API surface.
And so the argument against ORMs is that it will take less time and be less error-prone to write bespoke loaders that take rows and build your data model than it will be to configure that AI such that it can do the “right” thing with arbitrary queries.
(If you don’t like the word “AI” here, use “expert system” instead.)
> ORM: You ask for something and because you've researched and found a well-written, quality ORM you trust that it will create a sane query.
Ha. More like:
ORM: You've just joined a project already using an ORM selected by an 'architect' that no longer works here. Everything is fine until you start testing your system with a database sufficiently populated with real-world data. You and the DBA spend the next 6 months trying to get the damn ORM to generate performant SQL (you even call Oracle support, which promptly dispatches an "engineer" who tries to sell you on another $500k of crap you don't need that won't really address the problem). You eventually just start writing queries by hand wherever you find a bottleneck caused by the naive ORM, which you could have done at the outset, but no one who actually knew SQL well enough was on the team back then.
IME, it should never take more than a few hours to run down why the ORM made such a query. At a minimum, most RDBMSs have query logging and can explain queries.
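For instance, with SQLite you can ask the planner directly what it will do with whatever SQL the ORM emitted; a minimal sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Ask the planner how it would run the query; the last column of each
# row is a human-readable description of the plan step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = ?", (42,)
).fetchall()
for row in plan:
    print(row[-1])  # mentions idx_orders_customer when the index is used
```

Postgres (`EXPLAIN ANALYZE`) and MySQL (`EXPLAIN`) offer the same facility, so "why did the ORM's query do a table scan" is usually answerable in minutes.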
Ironically the last time I had a big ORM performance problem was Hibernate eager loading all joins. Diagnosing and fixing it didn't take more than an hour. (Though we did have a very experienced DBA at the time.) YMMV
You have to know when to break free of the ORM. They are great for graphs of CRUD operations and pretty nasty for much of anything else. You may come to a different conclusion depending on the project and dataset, of course.
And when you break free of the ORM, pray to God that your ORM doesn't have a hidden cache somewhere. If it does, you'll spend a week tearing out your hair until you figure out that the ORM's cache caused all the trouble.
A query builder aids you at constructing queries, while an ORM builds queries for you, runs them and maps the output to objects. It's more sophisticated than a query builder.
Isn't that the same reasoning as "I don't like using high-level languages because they limit my visibility of what's running on my processor"? Do you write all your code in assembly?
I think the issue is that if you don't understand what a high-level language is doing under the hood, then you will constantly be surprised by side-effects. You shouldn't be using an ORM to substitute for your lack of understanding SQL; you should be using it to automate tasks.
If you have to log the generated SQL to understand what's happening, you're already behind the curve.
And then what do you do if the ORM is generating junk? If the answer is "use a querybuilder/handcrafted SQL for that one", what's the point of the ORM in the first place?
> And then what do you do if the ORM is generating junk? If the answer is "use a querybuilder/handcrafted SQL for that one", what's the point of the ORM in the first place?
The point may be that 98% of the queries are just fine, and you've saved time vs writing by hand, and it may be easier to read/understand for the next people to have to touch the code.
"saved time" at the least optional time to save time. ORM's are a maintenance burden. They obfuscate DB performance behind usually an enormous API surface.
Most software work is maintenance work. Optimizing for initial deployment is shortsighted.
What "enormous" API surface? It's not about "optimizing initial deployment". It's about "optimizing continuous deployment" and having "always releasable Software". I've worked for departments with 15 devs all working on the same codebase and we could release and rollback releases (A/B deployments) every week because we treated the database as a dumb data store. We had multiple branches at the same time, etc.
I've also worked at companies where all of the logic was in ungodly stored procedures and getting anything released took months, and we had a whole two-week "hardening sprint" because we had to coordinate with the "database developers" and of course the entirety of the business logic was in stored procedures.
Most software being maintenance work is even more of a reason to optimize deployment. What good is unreleased software? The most important part of the business is releasing software. Most of my emphasis as an Architect is making sure we can release fast.
I like the idea of stored procedures. I get some of the value they bring. However, the few times I've been on projects where sprocs where the primary focus of logic/truth/app... it was always a pain.
* The 'developers' weren't allowed to write the sprocs. We were at the biz meetings, but the DB guys were hardly ever there - their meetings were separate for some reason, but because devs were at the meetings, the devs were the ones who also were the face of the project. When the project was behind, we caught it in the neck, even if we were bottlenecked waiting for the DB team to write their logic and expose it to us.
* The DB team was always fewer people, juggling more projects, and other things like system uptime, maintenance, backups, etc.
* The sprocs were never part of version control or part of any source code that we could ever see as part of normal development. They typically weren't subject to any unit testing process.
The answer to all of this is mostly human management, structuring resources differently, coming up with different processes, etc. But any of those things would have changed the power dynamic, which seemed to be the purpose in those environments. Hey, I can write a stored procedure too - let me write them, and if a DBA wants to 'review' them - or really, anyone on the team - please review and let's hammer it out and make it better. But roadblocking projects until the DB guys can 'get around' to writing our mission critical procs is just silly.
One other 'weird' division I saw a few places was this "developers can never have access to production systems - that violates XYZ" (a regulation, or some 'law' that was never produced, etc). I asked what the core issue was, and it was "you can't just have developers going on to production and just making changes on live systems - that's ... (against our policy, etc)". This was particularly challenging in a situation where a critical bug only happened on one production system, and we weren't allowed to replicate the database to another system, nor was anyone with any knowledge of the deployed code allowed to get on to the production system to even see if what was deployed was what we'd developed. But... this was still "our problem". Yet... the DBA in this case was "allowed" to get on the system and hand-write new triggers and sprocs to 'fix' our problem, all without documenting/testing his code, nor committing the code to any repo for us to even have visibility into the data manipulation he was doing to 'fix' the problem we supposedly caused but couldn't investigate.
Again, I know this isn't a problem specifically with stored procedures. When sprocs have been promoted as the primary interface, however, it's usually been a political/power grab more than a technical benefit. And yes, again, I know there are technical advantages in some cases, but usually not enough to outweigh the drawbacks I've encountered.
I believe I pointed out that I realize it's a human issue more than a technology one. It was easier for some people to get suckered into the power dynamics being played because "DBA" was already seen as more of a black-magic art sort of thing, and those guys were the "real wizards" and so forth, so whatever they say goes. It's not been everywhere I've ever worked where sprocs were used, but it seems to have been at the places where "stored procedures are law".
We're conjecturing about hypothetical bugs. No one in this thread knows what bugs were encountered nor how easy they were to fix; they aren't real bugs. They're symbolic of the bugs that we encounter every day, which give us the experience on which we base our conjecture.
How often is the ORM generating junk? To say that the ORM is useless because occasionally it doesn't generate performant code (with EF and Linq, more often than not it does generate performant code) could be applied to any high level construct. But I don't see people giving up modern languages to go back to writing everything in Assembly or even C.
Yes, I optimize when my automated performance/stress testing tells me I need to, and may write handcrafted SQL, but I've also handcrafted some classes in C back in the day when my old Windows Mobile app using the C# Compact Framework wasn't performing.
I meant a good ORM. But then again, my definition of a good ORM is an ORM with a language that treats queries as a first class citizen. EF with Linq doesn't really feel like a separate framework since Linq and Expressions are built into the language.
How is it a "query framework"? Linq works with objects and the Linq expression provider treats the Linq as data that translates the objects to Sql at runtime. That is by definition an Object Relational Mapping.
Of course most ORMs do fall short because most languages don't have the powerful concept of "code as data".
SQL syntax is extremely verbose, and compounds the more tables are involved in a query. You're not counting the time the ORM has saved you before having to resort to logging SQL.
ORMs are more useful as insert builders. Putting data from an object into the database is something of a boilerplate process. Queries vary with what you want to ask. Most of the time, you don't need all the fields, so filling up some object just because it has slots for everything is a waste of effort. Especially if it means references to multiple tables.
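That boilerplate is small enough to sketch: generating a parameterized INSERT from a dict of column values is a few lines (the `build_insert` helper here is invented for illustration):

```python
import sqlite3

def build_insert(table, record):
    """Generate a parameterized INSERT from a dict of column -> value."""
    cols = list(record)
    sql = (f"INSERT INTO {table} ({', '.join(cols)}) "
           f"VALUES ({', '.join('?' for _ in cols)})")
    return sql, list(record.values())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")

sql, params = build_insert("users", {"id": 1, "name": "ada"})
conn.execute(sql, params)
# sql == "INSERT INTO users (id, name) VALUES (?, ?)"
```

Unlike SELECTs, the shape of an INSERT is fully determined by the object being saved, which is why this is the part of an ORM that rarely surprises anyone.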
Good ORMs have Partial<T> and lazy loading of complex properties through proxies, which can be overridden with something like .With(x => x.ComplexProperty).
But of course, as the queries get more and more complex, the flexibility of the ORM syntax approaches the flexibility of SQL. In the end, there are many situations where one would rather just use SQL.
I think the best ORMs are those that just leave out the "Relational" part entirely. So... "OM"?
For example, in Go, I use Gorp, which has a Select() function where you pass in the SELECT query string (plus bound values) and the target type, and it loads every result row into an object of that type. So you can have an arbitrarily complex SQL query as long as it starts with `SELECT one_table.* FROM`. That's a marvelous design.
And when you have to do a query that returns results from multiple tables? Guess what, you just use the normal SQL module from the standard library.
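That Gorp-style split is easy to sketch in Python: you write the SQL yourself, and a small helper handles only the row-to-object marshaling (the `select_into` helper is hypothetical, loosely mirroring Gorp's `Select()`):

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str

def select_into(conn, cls, sql, params=()):
    """Run arbitrary SQL and hydrate each row into cls by column name.
    The helper knows nothing about the query, only the mapping."""
    cur = conn.execute(sql, params)
    cols = [d[0] for d in cur.description]
    return [cls(**dict(zip(cols, row))) for row in cur.fetchall()]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ada')")

users = select_into(conn, User, "SELECT users.* FROM users WHERE id = ?", (1,))
# users == [User(id=1, name='ada')]
```

The query can be arbitrarily complex; the only contract is that the result columns line up with the target type's fields.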
Not just Postgres. That's standard SQL. Columns have data types. Tables have schemata. Schemata are very strongly typed. Can't insert a 'full_name' column into a table without such a column.
Unless you're talking about SQLite, SQL RDBMSs are both static and strongly typed. The systems typically do allow some implicit type conversions, but type is critical to how a table works.
while ORMs can do quite a bit of optimization, they're still general query builders and can't construct the optimal queries for your use case. if you don't know SQL, or you don't know what's going on behind the scenes, your ORM could be performing much larger queries than it really needs to, costing performance and time
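The classic example is the N+1 pattern: a naive ORM (or naive lazy-loading configuration) issues one query per parent row where a single JOIN would do. A sketch with SQLite, counting round trips by hand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ada'), (2, 'bob');
    INSERT INTO books VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")

# N+1 pattern: one query for the parents, then one per parent row.
queries = 1  # the initial SELECT over authors
pairs_n_plus_1 = []
for author_id, name in conn.execute(
        "SELECT id, name FROM authors ORDER BY id").fetchall():
    for (title,) in conn.execute(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id",
            (author_id,)):
        pairs_n_plus_1.append((name, title))
    queries += 1

# The same result in a single round trip.
pairs_join = conn.execute("""
    SELECT a.name, b.title FROM authors a
    JOIN books b ON b.author_id = a.id
    ORDER BY a.id, b.id
""").fetchall()
# queries == 3 for two authors; the JOIN version is always one query
```

With two authors the difference is invisible; with a real-world table it's the "database sufficiently populated with real-world data" failure mode described upthread.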
I built a moderately complex application in Django at a previous workplace, using the ORM for most things, until the queries were too complex for the ORM.
Another guy connected to the same database and built some graphs using PHP and SQL. Guess who had to help him write the SQL when the queries got too complex for him? The ORM user.
It depends on how willing you are to allow your object graph to match your relational schema. The key is to let SQL be SQL. I wrote one that allows you to load data using standard SQL resource files, but it handles the persistence automatically:
Taking the SQL-first approach also allows you to serialize without circular reference problems, since you don't just load the data, you also define a path to decompose the graph into a DAG.
My experiences have led me to the standpoint that most ORMs handle three major things:
1. provide idiomatic domain-object oriented query interface which it in turn translates to SQL
2. provide CRUD sql generation
3. provide some sort of session-based object lifecycle change tracking and management
#1 - The generated SQL is important on many levels. As things like HQL/linq/<your QL here> deviates further from the generated sql transparency is lost. SQL is normally brittle compared to your domain language which has better testability and type safety but still you have a handful of queries where it feels more sensible to write the SQL yourself.
#2 - Code which reflects on a type and generates basic insert/select/update/delete is usually pretty naive and easily done. With the exception of complicated legacy databases and iBatis-style tools ORMs which only support #2 arent really worth bothering.
#3 - I've found this to be the real benefit of an ORM. Being able to scope object lifecycles into clear units of work, buffer pending changes until a discrete point and get scoped caches for "free" have been hugely beneficial.
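A toy sketch of #3's core idea: buffer pending changes until one discrete flush point, applied in a single transaction. Real ORMs track dirty objects and generate the DML; this invented `UnitOfWork` class just queues raw statements to show the shape of it:

```python
import sqlite3

class UnitOfWork:
    """Buffer pending writes; flush them in one transaction on commit()."""
    def __init__(self, conn):
        self.conn = conn
        self.pending = []

    def add(self, sql, params=()):
        self.pending.append((sql, params))

    def commit(self):
        with self.conn:  # one transaction for the whole unit of work
            for sql, params in self.pending:
                self.conn.execute(sql, params)
        self.pending.clear()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")

uow = UnitOfWork(conn)
uow.add("INSERT INTO users VALUES (?, ?)", (1, "ada"))
uow.add("INSERT INTO users VALUES (?, ?)", (2, "bob"))
# nothing has been written yet; commit() applies both atomically
uow.commit()
```

The "scoped caches for free" benefit follows from the same structure: within one unit of work, an identity map can hand back the same object for the same primary key.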
Each ORM unfortunately/inevitably comes with a steep learning curve. It seems to unfortunately/inevitably bring lots of new concepts and conventions to the table that the user simply must understand. Sometimes it requires you to re-arrange the way you write your code to be more session-oriented.
I don't like the idea of AI managing how/when to apply changes to storage media. AI can be smart and efficient, yes, but you lose that important transparency/predictability. On the contrary, I prefer a very dumb/mechanical/predictable ORM, something not elegant but where the behavior of what happens when is well-understood and easily scaled out to a large team. In my experience hibernate, EF, and sqlalchemy have that sort of dumb/predictable behavior (however the SQL they sometimes generate can be performant but unreadable).
You’ve probably refactored the ORM Data layer 3 times in that time period as ORM producers have a hard time figuring out the interface which they wish to provide you
Agreed. I'd say my point still stands because duplicated database calls affect performance and scalability and this can grind the project to a halt at a certain point. Also this refactor is much more dangerous/probably buggy.
On the other hand refactoring your data logic is just business as usual, and you will probably do it in both cases anyway.
caching is a very tricky area with tons of pitfalls. In my opinion, the ORM should not be caching. Let the clients (or any other layer) cache/clear based on their needs.
I can get very creative with SELECTs, making use of Prolog style queries, which are fully done server side on the database.
Most ORMs will download all the data and evaluate them on the client side, with code that is even more convoluted than the SQL one and thus with less performance.
I know EF, which is why my comment also mentions "code that is even more convoluted than the SQL one and thus with less performance".
LINQ only allows for a fraction of what is possible with SQL, and good luck having the best queries generated out of it if the RDBMS doesn't happen to be SQL Server.
> What AI do you need? You map your tables to objects and relationships between objects via FK relationships.
That is only true in a one-to-many entity relationship (and even so, it is debatable). A one-to-one relationship can be modeled in the two objects, in one of them, or delegated to a third entity. A many-to-many entity relationship can also be handled, in OO, in various different ways. Idem for a ternary relationship or, basically, any higher-order relation between objects.
This is known, borrowing a term from electrical engineering, as an impedance mismatch between the two models, and it's not an easy problem by any measure.
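The many-to-many case makes the mismatch concrete: relationally it is simply a junction table, but the object model has to pick an owner for the collection (student holds courses? course holds students? both?). A sketch of the relational side, with one of the possible object-side readings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Relationally, the many-to-many relationship IS the junction table.
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(id),
        course_id  INTEGER REFERENCES courses(id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'ada');
    INSERT INTO courses  VALUES (10, 'databases'), (11, 'compilers');
    INSERT INTO enrollments VALUES (1, 10), (1, 11);
""")

# One possible object-side mapping: a collection hung off the student.
titles = [t for (t,) in conn.execute("""
    SELECT c.title FROM courses c
    JOIN enrollments e ON e.course_id = c.id
    WHERE e.student_id = ? ORDER BY c.id
""", (1,))]
# titles == ['databases', 'compilers']
```

Every ORM must choose where that collection lives and when to load it; the schema itself imposes no such choice, which is the mismatch in miniature.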
> so you'll end up inventing that part yourself without an ORM (I recommend doing so, on a less critical project, to learn the kinds of issues that present themselves).
I recommend anyone that has to deal with ORMs create their own at some point, in a non critical project. Not because they will necessarily create the next big thing (but who knows?), but because nothing quite gives you the perspective and appreciation for what these systems can do and why they have their pain points like making one yourself. You'll most likely not use your own module long after you've created it and then again surveyed what's already available, but it's invaluable in making a good assessment of those options as well.
Similarly, making a web framework yields similar benefits.
In both cases, a good understanding of the underlying technologies they build on (SQL and HTTP) is required, and if you don't have it going in, you'll have it coming out the other side (which is really the reason for this in the end).
Similar things exist all along the spectrum. From embedded OS's and compilers to javascript utility libraries.
What it comes down to is that a tool is best utilized when the person knows when and how to apply it appropriately, and that's as often as not an understanding of the tool as it is of the context.
A person intimately familiar with hammers and their uses can build some interesting wood furniture, but they'll likely never achieve the same level of product as a master woodworker that's just well acquainted with a hammer. Investing time and effort into tools provides only so much benefit. At some point, more knowledge of the craft itself is far more beneficial.
zzzeek - can I take this chance to praise your work on SqlAlchemy. People say there's not enough thanks given to open source developers... here's thanks to you. It's the work of a craftsman.
I'm pretty sure that SqlAlchemy has caused more grief and frustration than any other single library. After all if I didn't know about the excellence of SqlAlchemy, maybe I wouldn't get so cross when I have to do anything non-trivial with the Django ORM ;)
Joking aside, SQLAlchemy is very impressive software, and zzzeek deserves all this praise and more. Every time that there's one of these anti-ORM articles I feel like SQLAlchemy pre-emptively addressed all the substantive criticisms in its flexible, well layered design.
(And in the interests of fairness, Django's ORM is also very good at making simple things simple).
+1 - personally I really like how SQLAlchemy doesn’t abstract away the database so much where all modeling and power is lost. He and the contributors have done an excellent job.
I've always been impressed by zzzeek's near-omnipresent participation. Years ago, I had a question about SQLAlchemy that he'd answered on the mailing list which amazed me given the relative rarity of developer interaction. Yet here we are almost a decade later and he's still answering questions directly, but on many more platforms. I'm actually not convinced he ever sleeps.
I wish more people aspired to be like him, because you know there's no more authoritative answer when you run into a Stack Exchange post and he's offering up his assistance.
But I'm really surprised every time people tell me they look at the schema as defined into the ORM instead of at the table in the database.
My jaw really drops the few times I learn somebody doesn't even know SQL, only the ORM. Maybe they look at it as if it were the reaction of somebody that thinks you must know assembly if you want to program Ruby, Python or Node (I don't.) Still, if you work with a database you must know its internal language, SQL or NoSQL. You're going to need it, or you'll build a mess.
And involving a DBA early in the project can make your database at least twice as fast, with the right schema and the right queries. Then you translate that into the ORM you want to use.
I expect that everybody with a degree knows SQL and I'm realizing that I could be wrong. Maybe sometimes I'm the only one in the room that knows it. I'll check it next time I'm at a technical event leaning on the backend side.
Very happy to not have an ORM on our stack for many years.
I believe that the interaction with the database is just about the most important part of code that you need to rely on. We have had messy data written to the database due to a misused or misconfigured ORM (performance issues, bad orphan handling, session state problems, overflowed sequences, to list a few) and decided that we should not rely on all devs touching that code to be experts in the ORM API to avoid the pitfalls; we'd rather our devs be experts in SQL.
Never had any of them complain about mapping using something like jOOQ. Type checking and composability are also built in.
Although, I think what you plan to do with the data also makes a big difference. Which probably explains why opinions vary so widely.
For short-lived processes, like typical web applications and command line utilities – that load data from the database, do something with it, and then purge it from memory again – I'm becoming less and less convinced that ORMs are actually a benefit.
On the other hand, if your application plans to map the database data to memory for long periods of time, with the need to keep them in sync, then you're probably going to end up writing something that resembles an ORM anyway, and poorly at that. In this case a good ORM is beneficial.
For 3: the proper abstraction for relational data is, surprise, a relation.
I used to work in an environment where we had relations as full first class data structures. They are very pleasant to work with.
We actually had 'relation-object-mappers', ie when we had to interact with other systems that didn't use relations, we often mapped them to relations internally to make them play nicer.
How were those relations represented? If I understand correctly, you were not just wrapping tables in a database, but using some other backing implementation. What were the most common operations, and what was the performance like?
Oh, the implementation was fairly straight-forward. I think just sorted arrays or something.
It wasn't about speed of execution, but expressiveness when coding. Later on they even added proper type system support.
Common operations were things like map/project, extend, filter, join, collect-by-key / expand, etc.
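Those operations are easy to sketch over plain Python sets of tuples (each tuple represented as a frozenset of (field, value) pairs so relations stay hashable; all names here are invented for illustration):

```python
def project(rel, *fields):
    """Keep only the named fields of every tuple (Codd's projection)."""
    return {frozenset((f, dict(t)[f]) for f in fields) for t in rel}

def restrict(rel, pred):
    """Keep only tuples satisfying pred (Codd's restriction/selection)."""
    return {t for t in rel if pred(dict(t))}

def natural_join(r, s):
    """Join on all shared field names."""
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

emp  = {frozenset({"name": "ada", "dept": 1}.items()),
        frozenset({"name": "bob", "dept": 2}.items())}
dept = {frozenset({"dept": 1, "dname": "eng"}.items())}

joined = natural_join(emp, dept)           # only ada's dept exists in dept
names  = project(restrict(joined, lambda t: t["dname"] == "eng"), "name")
```

Nothing here is fast (it's a nested-loop join over sets), but it shows how small the core relational vocabulary is when relations are first-class values.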
Just as Codd pointed out in his original papers, relations allow you to not have to make a choice about a hierarchy for your data.
Using key-value store like a hash-table in your program, or the much vaunted has-a relationship between objects would force you to make these choices. Thus making interacting with the data awkward for all but one access pattern.
Relations work best when your program is written in a style that deals largely in immutable data. (What we call 'purely functional', but people in dysfunctional languages have also picked up on the advantages recently.)
> Just as Codd pointed out in his original papers, relations allow you to not have to make a choice about a hierarchy for your data.
Or, alternatively, make it really painful when you do actually need to query hierarchies, along the lines of "give me all the tuples above this one in the hierarchy".
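That ancestor query is what recursive CTEs were added to SQL for; whether that counts as "painful" is a matter of taste. A sketch in SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A hierarchy stored the usual relational way: each row points at its parent.
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent_id INTEGER);
    INSERT INTO nodes VALUES (1, NULL), (2, 1), (3, 2);
""")

# "All the tuples above this one": walk parent pointers with a recursive CTE.
ancestors = [r[0] for r in conn.execute("""
    WITH RECURSIVE up(id, parent_id) AS (
        SELECT id, parent_id FROM nodes WHERE id = ?
        UNION ALL
        SELECT n.id, n.parent_id FROM nodes n
        JOIN up ON n.id = up.parent_id
    )
    SELECT id FROM up WHERE id != ?
""", (3, 3))]
# ancestors of node 3, nearest first
```

It works, but it is undeniably more ceremony than following a parent pointer in an object graph, which is the trade-off being pointed at here.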
so, i think #3 is less useful for switching out the actual DB server itself in an ongoing project, but i’ve found it immensely useful in a couple of ways:
- replaced the DB driver mid way through a project to a slower, but more complete implementation. this went flawlessly, because it’s all so generic and in the end the same DB under the hood, so a simple change (without an ORM i could see it being a nightmare)
- started a new project where i had to use MSSQL, which I’d never used before, but i’m a big fan of postgres/sqlalchemy. other than ODBC oddities, it was really simple to write the new app with all the same patterns i was used to with things like update on write, lazy joins, constraint deferral, and i think most importantly would be MIGRATIONS! huge help to have the same migration framework that i was used to
> As a theoretical abstraction above the data-store
I worked on a project with a home-grown ORM (in C; it was horrible) that abstracted over both MySQL and Postgres ... except the overarching application required Postgres-specific column types and functions.
ORMs of various stripes were in common use well before 2004 so 1 & 2 don’t sound terribly persuasive. 3 is definitely an evergreen selling point - you might change to a different RDBMS vendor (back when there was such a thing), etc.
This is especially true where the ORM can mask things such as moving a column to an associated table. The ORM knows that object.attribute is now represented in the object_attribute table with the relationship using object_attribute.pk in the object.attribute column (which may or may not be renamed to attribute_id).
No need to rewrite all your SQL, just the ORM description of the model!
Realistically though, if you’re querying the same table from that many places your architecture is already in trouble. So editing the queries shouldn’t be that big of a deal.
> providing abstraction for database-specific and driver-specific quirks
That is quite theoretical. My PRs for fixing non-spec compliant behavior in pgjdbc get rejected because they might break some ORMs (mostly Play). My PRs for adding MariaDB sequence support to Hibernate get rejected because there are additional MariaDB features that Hibernate doesn't support as well.
heaps of this kind of thing are nicely dotted around the code so you don’t have to deal with weird driver quirks; e.g. it maps a “Text” column type to whatever it needs to be in your given database to have an unbounded text blob
In Java land the JDBC driver is supposed to do that. If it doesn't then that's a bug in the driver. If it's a bug in the JDBC driver then the fix needs to go into the JDBC driver, especially if the driver is on GitHub.
The presence of an ORM relying on these bugs should not prevent a bug in the driver from being fixed.
Exactly. It is amazing how many bad ORMs that I've seen in systems from people who "just used SQL" instead of an ORM.
I've found that the people who know SQL pretty well can actually do good work with or without an ORM. But the maintenance is a lot easier when the abstractions are consistent with the abstractions that are used by a popular ORM.
Everything that isn't SQL tuned to your environment forces you to sacrifice performance at some point. At some point it does an inefficient join. How can you go back and fix performance issues if you don't know SQL? (It isn't like you need to fix it in C.)
SQLAlchemy makes dropping down to SQL where necessary incredibly easy. There where necessary you can tune the queries, but where not necessary you can let the ORM create them for you.
This gives you a lot of flexibility and power, but as zzzeek mentioned, you need to know SQL to understand where the ORM is inefficient and to be able to replace it as necessary while still letting the ORM do what it is really good at.
you need to learn SQL first before you work with an ORM.
Analogously, I'd say that being able to write a compiler, down to having it generate machine code or assembly, gives you a leg up when using a compiled language. I've met and interviewed a number of coders who had cringeworthy gaps in their knowledge, because to them, a C compiler was just some kind of "magic."
Hibernate (and presumably any ORM) was never intended to be a complete abstraction of anything-SQL. Like others have mentioned here, an understanding of SQL must be had before using an ORM. The ORM is one piece to the puzzle, not a shield to prevent you from having to touch SQL.
One pattern I typically use is a take on CQRS: Hibernate for writing/updating/fetching/deleting a single instance of deeply-relational object model, SQL (I like jOOQ) for larger-scale fetches and any bulk actions.
I must admit that I've inherited two messes where someone used NHibernate as a replacement for SQL. In both cases, the schema was excruciatingly simple but the code performed excruciatingly poorly.
In one case the project was canceled, in part because it was so late due to the developers not knowing how to use a database. In the other case I put my foot down and removed NHibernate. The schema was so simple that it was just easier to put a few extra minutes into boilerplate code than to put time into learning a new thing.
I'd really like to see a good writeup about the use cases that tools like Hibernate excel at. The problem is that, in both cases, I had to work with a high-level manager who had very bad assumptions about what Hibernate can and can't do.
I'm so glad someone finally confirms my idea about using hibernate. I've been in constant battle with developers that map full object graphs coming from some endpoint to hibernate/jpa annotated classes and then throw it at hibernate. Here, you save this... It just doesn't work that way very well. It's always a mess with relationships. Whereas if you run your logic on entities attached to the session, things work out nicely.
> Hibernate (and presumably any ORM) was never intended to be a complete abstraction of anything-SQL. Like others have mentioned here, an understanding of SQL must be had before using an ORM.
Disagree. You need to understand the relational model, but you don't need to understand SQL-the-language. I've written plenty of successful systems using hibernate without needing to touch SQL, and am much happier for it.
I have my own views on this but I basically agree with everything you said.
I think ORM is a great way to translate tables into real objects that have their own methods and properties that may or may not interact with the DB. I think that's where the real power is.
But for performance-necessary actions, yeah, SQL all the way (or rather a query builder).
99% of the time, an ORM is fantastic and will make you more productive while providing performance, security, and maintainability. They come in many sizes from thin wrappers around a db connection to full-featured frameworks.
For the other 1%, use raw SQL, or perhaps a query building tool to help with parameterization, composability, etc. In fact, modern ORMs will even let you input raw SQL and handle the conversion back to objects if you need it.
Saying ORMs are always wrong is just as dumb of a statement as saying all database access needs to be in raw SQL. They are just a tool and abstraction, like everything else you use in software development. You know the right time to use it.
That being said, not knowing SQL at all means a lack of general understanding in how relational databases work and will almost always cause problems.
All of these endless debates seem to boil down to two different groups who work in two different problem domains talking past each other.
I've personally never seen an ORM lead to success in the long run. But I also work in a space where queries frequently end up involving something that ORMs typically don't handle well: merge statements and pivot statements, window functions, management of the lock escalation policy to fine-tune performance, temp tables... The list is endless.
What I have not ever worked on is a relatively basic CRUD datastore. Which I realize is what most people are using databases for. So at this point, I'm putting my money on ORMs being a hole in one for that application. Because, otherwise, I just can't reconcile a statement like, "99% of the time, an ORM is fantastic" with the reality I'm living in. In my career, 100% of the time, when an ORM was present, it was invariably the single biggest piece of technical debt.
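The kinds of queries in question are easy to show concretely. A minimal sketch in Python with the stdlib sqlite3 module (table and data invented for illustration; needs SQLite >= 3.25 for window-function support) — the sort of RANK()-over-partition query most ORM query APIs can't express without dropping to raw SQL:

```python
import sqlite3

# Invented schema/data; window functions need SQLite >= 3.25.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sale (Dept TEXT, Amount INTEGER);
    INSERT INTO Sale VALUES ('a', 10), ('a', 30), ('b', 20);
""")

# Rank each sale within its department -- one line of SQL, but a
# construct most ORM query APIs cannot express.
rows = conn.execute("""
    SELECT Dept, Amount,
           RANK() OVER (PARTITION BY Dept ORDER BY Amount DESC) AS rnk
    FROM Sale
    ORDER BY Dept, rnk
""").fetchall()
print(rows)  # [('a', 30, 1), ('a', 10, 2), ('b', 20, 1)]
```

Merge statements, pivots, and lock-escalation hints are in the same bucket: expressible as a string, not as an object-graph traversal.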
Yes, you're talking about the 1% of use cases, which is probably more like 10% these days with more complex software. Most business apps are just CRUD, but if you're doing analytics queries and such with tabular/pivot/nested result sets, then an ORM isn't going to do much for you.
SQL is the database interface, so of course using it directly without abstraction gives you all the power and control. I have seen some cases, though, where a query builder with a solid DSL can be a good middle ground.
I guess my experience is limited, but I’ve not seen much of this despite working in an ORM environment with around 75 entities for the last few years (my recent experience anyway, the rest goes back 16 years). Maybe that is small potatoes, I don’t know, but I’ve found that anyone who understands JPA well enough can work to avoid any pitfalls of using ORM. It seems to me that having a good mix of understanding SQL and ORM is a good thing; and especially understanding exactly what the ORM system is doing for you and how it is doing it. Dropping ORM altogether sounds like a bad idea since it provides a number of built-in security features as well as an abstract modeling paradigm that is fairly easy to conceive and maintain; provided, of course, that you learn to say “No” to protect the integrity of the model (such as rejecting the attribute creep the article warns about).
I have found, in my experience, that people who tend to want to write SQL over ORM usually want to do so because they simply know SQL better. That’s okay, there is nothing wrong with that. But that doesn’t immediately mean ORM systems are bad. No need to be tribal about it.
The problem I see is that many new software developers these days sometimes can’t see the forest for the trees because they dwell too much on what they think is better instead of simply seeing the software and abstractions as nothing more than tools in the tool belt. It happens everywhere — PC vs Mac, iOS vs Android, Scala vs Java, SQL vs ORM. It’s fine to have opinions, I have many, but as I’ve aged I’ve become acutely aware that my biases are almost solely rooted in the limitations of my understanding.
The post is from 2014, so I won't be too harsh here. The author's problem is with some specific flavors of ORM he's used, and shouldn't be generalized. Hibernate's expressiveness is/was crippled by Java itself. C# ORMs on the other hand are way better because they benefit from LINQ which adds queries natively into the language. Other more expressive languages have excellent ORMs as well.
The objective of ORMs is not to replace 100% of your queries. The remaining 10% might still require SQL or stored procs, and that's fine.
ORMs give you:
1. Type safe queries
2. Ability to refactor easily, click to rename prop
3. Not having to handcode joins if objects are related
I've been writing C# professionally for ~5 years at this point. While I was initially quite infatuated with LINQ2SQL and EF, I have gone through the same situation as this fellow. I just write SQL in-line using Dapper for parameterization/data mapping, and use stored procs when I need some of the more arcane features of SQL (merges, CTE, etc).
I've been writing C# professionally for ~12 years at this point. I'm extremely comfortable with SQL; the first startup I worked for, for 3 or 4 years in the mid-2000s, did amazing things with it and was extremely anti-ORM. We did things like write SQL that would automatically get translated into XML, which we'd combine with XSLT to create dynamic pages.
Yes, I've hit major problems with the EF (including one on Friday which was causing huge CPU and Mem spikes at the busiest time of our year). Yes, it's a PITA to debug the queries. Yes, it can do stupid things. Yes, some programmers can massively over-complicate it.
But you can prise the EF from my cold, dead hands before I give it up. ORMs rock. It's such a huge time saver as long as you KISS and accept you have to drop down to SQL sometimes.
And screw switching to Core until they've sorted out lazy loading. Lazy Loading can also screw you, but again, it's just wonderful when you use it right.
I've also turned into a huge fan of Code-First and Migrations, having been against them at first.
> "And screw switching to Core until they've sorted out lazy loading. Lazy Loading can also screw you, but again, it's just wonderful when you use it right."
Are you referring to EF Core? I was considering learning it. What's this lazy loading issue? It's not one I've heard of before.
Basic functionality like group by, lazy loading, etc. is missing. By the look of it you can't even load custom types from hand-crafted queries, which is pretty ridiculous.
Haven't really been keeping that up-to-date with it. I personally feel the whole Core thing has been a massive cluster-fuck for their existing customers.
>Lazy Loading can also screw you, but again, it's just wonderful when you use it right.
That's funny - the first thing I do when starting a new project is turn off lazy loading globally. I find it hides poorly-performing code until it's causing problems; the equivalent code without lazy loading usually just throws an exception.
Granted I've only been in the industry for 3 years, so /shrug
Also, question: you mention code-first and migrations... do you think you'd rather use SQL for your schema definition/migrations if the tooling was better? I find SSDT doesn't quite cut it :(
We have far more problems with too many .Includes causing terribly performing queries with bad JOINs than lazy loading problems. Granted this code base is in a bit of a state and we have a complex order structure that can go like 10 layers deep, and ideally we're looking to go even deeper with complex pricing. You do an include with all of that and you're going to get a terrible query.
It's generally very cheap to do a single item query with no joins (as in a nanoseconds db query, yes, nano), and it only has to do it once, if you re-use that object again anywhere else, it's already in the context so it doesn't have to go to the db. Even doing them hundreds of times can be super cheap[1]. Add to that it only has to load each item once you can make intelligent decisions about what to .Include and what to lazy load.
[1] Caveat: Azure db connection latency often sucks, so this isn't completely true; we had 5-6ms instead of the <1ms you'd expect. Presently at about 2-3ms. This is the fault of Azure and not lazy loading, though. It causes a problem when you make hundreds of calls: 100 lazy-loading calls each taking 6ms would add 600ms, well over half a second, to a request.
>We have far more problems with too many .Includes causing terribly performing queries with bad JOINs than lazy loading problems. Granted this code base is in a bit of a state and we have a complex order structure that can go like 10 layers deep, and ideally we're looking to go even deeper with complex pricing. You do an include with all of that and you're going to get a terrible query.
Wait - are you "Include"ing things you don't need? If not...assuming a sane query plan, shouldn't the single query (e.g. "Include" version) outperform the deconstructed series-of-queries that brings back the same data?
select * from Orders o join OrderItems i on o.OrderId = i.OrderId
where o.OrderId = 5
vs.
var order = context.Orders.Single(x => x.OrderId == 5);
var items = order.OrderItems;
which translates roughly to
select * from Orders o where o.OrderId = 5;
select * from OrderItems where OrderId = 5;
As far as I understand the former will outperform the latter, even if you don't take into account the additional connection overhead. If the second version was faster...wouldn't SQL just compile down to a series-of-queries automatically?
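For what it's worth, the two strategies can be run side by side. A sketch in Python's stdlib sqlite3 (schema invented to mirror the Orders/OrderItems example above): both return the same data, but the joined version repeats the parent's columns on every child row, which is part of why wide Includes over deep object graphs get expensive:

```python
import sqlite3

# Invented schema mirroring the Orders/OrderItems example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, Customer TEXT);
    CREATE TABLE OrderItems (ItemId INTEGER PRIMARY KEY,
                             OrderId INTEGER REFERENCES Orders(OrderId),
                             Sku TEXT);
    INSERT INTO Orders VALUES (5, 'alice');
    INSERT INTO OrderItems VALUES (1, 5, 'SKU-1'), (2, 5, 'SKU-2');
""")

# Eager ("Include") version: one round trip, but the order's columns
# are duplicated on every item row.
joined = conn.execute("""
    SELECT o.OrderId, o.Customer, i.Sku
    FROM Orders o JOIN OrderItems i ON o.OrderId = i.OrderId
    WHERE o.OrderId = ? ORDER BY i.ItemId
""", (5,)).fetchall()

# Lazy version: two round trips, no duplicated parent columns.
order = conn.execute(
    "SELECT OrderId, Customer FROM Orders WHERE OrderId = ?", (5,)).fetchone()
items = conn.execute(
    "SELECT Sku FROM OrderItems WHERE OrderId = ? ORDER BY ItemId",
    (5,)).fetchall()

print(joined, order, items)
```

With two child rows the duplication is trivial; multiply it across several one-to-many branches and the joined result set grows multiplicatively.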
The example is too trivial, which is why it looks like it might be better. Here's a real world example, not even complete. A restaurant booking might have:
- A venue associated with it
- A user who booked it
- A menu
-- With courses (starter, main, dessert)
--- of Menu items (steak)
---- with Menu item option groups (think, 'pick one of', 'choose at most 3', etc.)
----- of Menu item options ('rare', 'medium', or 'extra chips', 'onion rings')
- Guests
-- Guest.User
-- Guest selections of menu items
--- Guest selection options
- And Many more! (payments, events, offers, postal addresses, etc.)
There are loads of things that are optional, or even extremely rarely filled in (say, an associated special area of the restaurant, or perhaps an assigned waiter, or a third-party partner who placed the booking). And we can just let the EF load that in lazily. It knows a nullable int means nothing to load, but if there is a third party supplier, it can go off and load that lazily (and extremely cheaply).
As for the query, you can load them in chunks (which we do), but the way the EF works you're limited on how you can do that.
If you try loading that all in one go, you get a very, very slow query. It's beyond the limits of the execution planner to do it well.
Because of the nature of ORMs, the EF can also make decisions which result in horrible sub-selects, or terrible joins of sub-tables where the execution planner can't use the right indexes, especially when you're trying to do groups, counts, sums, etc.
This makes it often better to load things separately and to selectively use the lazy loader to do certain things.
Other scenarios include where say you have a complex object that you've only partially filled in, but in 1 in 5 cases you want to send an email using that object.
Now you could write your email function to load all the data again, or you could let lazy loading do its thing and, overall, save time and decrease db load, because you've already got 80% of the data, it just needs to fill in the missing 20% with some simple queries.
Answer to the earlier question: I used to hand-code my db upgrades, as my opinion is that having correctly structured data is king, and I understood relational db design. But it turns out EF Migrations are wonderful when you know how to use them. I still check every single one to make sure it's doing exactly what I wanted and expected, though, and take them up and down manually. Good way of catching mistakes.
I see - so in your case you gain from the fact that many of the joins are likely to result in zero matches. And since you're joining on a nullable FK, you can tell in advance whether the record exists without a lookup - I ran a test and I verified that you pay a cost for the below join regardless of whether the FK column is null.
select *
from Person p1
--p1.Spouse is null for the record in question
join Person p2 on p1.Spouse = p2.PersonId
where p1.PersonId = 42
(I understand why it can't do it in the execution plan, but I'm surprised it doesn't 'short-circuit' at runtime since the join predicate is trivially unsatisfiable for that row)
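That claim is easy to reproduce. A sketch of the same experiment in Python's stdlib sqlite3 (table and ids taken from the SQL above): the inner join on a NULL FK can never match, so nothing comes back, but the join itself still executes:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (PersonId INTEGER PRIMARY KEY,
                         Name TEXT,
                         Spouse INTEGER REFERENCES Person(PersonId));
    INSERT INTO Person VALUES (42, 'pat', NULL);  -- no spouse
""")

# Inner join on a NULL FK: the predicate is unsatisfiable for this
# row, so no rows come back -- but the engine still runs the join.
rows = conn.execute("""
    SELECT p2.Name
    FROM Person p1 JOIN Person p2 ON p1.Spouse = p2.PersonId
    WHERE p1.PersonId = 42
""").fetchall()
print(rows)  # []
```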
Your other benefit involves conditionally needing data. I will say it's not too hard to structure app code to avoid loading redundant/unneeded data in your email example, but it's certainly easier and more maintainable when property access is fundamentally linked to its actual retrieval - it's impossible for another developer to make changes to your version and 'lose' the efficiency, while the same isn't true for mine.
So it's less black and white than I thought...which it usually is :)
How do you feel about using 'explicit' lazy loading? E.g.
PersonEntity person;
if (IWantToSendEmail) {
    context.Entry(person).Reference(x => x.Email).Load();
    // use email info here
}
It actually is quite hard to do it and have re-usable code.
There are various different ways the same email might get triggered, maybe the booking came from an API call, maybe it came from a new booking form, maybe it came from a 'send reminder' button.
In all cases, I have a booking object that will be in a different state of being filled in. The underlying need for data for the rest of the request is very different: some of them need a fully filled-in booking, some need the bare essentials. Our "fully load this booking" function takes like 150ms, which isn't cheap, and a significant amount of that is DB time, which is again our most in-demand resource.
CPU/memory is (generally) under-utilized on web apps, and letting the EF do its lazy-loading thing is usually the best solution.
As for explicit lazy loading, it's inelegant and way more code. One thing we know for sure, more lines = more bugs.
I'm not saying turning off LL is a bad thing, if it works for you, but I semi-regularly have a SQL profiler running while developing, so I see when it starts kicking out loads of queries unnecessarily.
Is LINQ actually an ORM? It looks like a DSL to do relational stuff in C#, but I don't see where the 'O' part of ORM fits in especially not in your examples to show off LINQ's power and convenience.
The language's ability to treat code as data allows the programmer to express queries in native language syntax and pass it to an ORM (such as EF or Linq to Sql) for execution on a data store.
Add: Where does the 'O' part fit in? You build entities (in plain C#) with relationships to each other, and you could do stuff like:
Thanks. My quip was just that objects (as in object-oriented programming) aren't the right abstraction. Having some kind of mapper in your language between the database and the entities your program is dealing with is useful, and you showed that functional programming is a friendlier host than OOP.
Objects are fine, it's classes that are often limiting. Consider the very first example above (corrected to be valid):
var r = customers.Select(c => new { cust = c, orders = c.Orders })
This gives you an IEnumerable (basically, a forward-only sequence) of objects - but these objects are of an anonymous type that was implicitly defined by "new".
Linq looks like a free monad: you declare the program as a data structure and the runtime implementation just interprets it on the go (something you could - not so easily - try to do with a strategy pattern)
LINQ itself doesn't have anything directly to do with SQL. It's a mechanism for composing expression trees that are compiled into your application. At runtime a LINQ query provider can walk the expression tree and translate it into a query for the underlying data store. It's a very very powerful concept, but IME it doesn't get used all that much for things other than SQL because writing a query provider is a lot of work.
Anyway, the typical C# ORM is Entity Framework, which includes the LINQ-to-Entities query provider.
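The "query as data structure" idea can be sketched in a few lines of Python (all class and function names here are invented, not any real library's API): the query is inert data until an interpreter walks it and emits parameterized SQL, which is essentially what a LINQ query provider does at much larger scale:

```python
from dataclasses import dataclass

# A toy "expression tree": the query is plain data until a backend
# interprets it. All names here are invented for illustration.
@dataclass
class Col:
    name: str

@dataclass
class Eq:
    left: Col
    right: object

@dataclass
class Query:
    table: str
    where: Eq

def to_sql(q):
    # One possible interpreter: walk the tree, emit parameterized SQL.
    # A different interpreter could target another data store entirely.
    return (f"SELECT * FROM {q.table} WHERE {q.where.left.name} = ?",
            (q.where.right,))

q = Query("Orders", Eq(Col("OrderId"), 5))
print(to_sql(q))  # ('SELECT * FROM Orders WHERE OrderId = ?', (5,))
```

Because the tree is just data, nothing runs against the database until an interpreter chooses to translate it — the same deferred-execution property LINQ providers rely on.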
1. So does PostgreSQL...
2. Coding directly in SQL one first normalizes as much as possible, then denormalizes as much as needed to make desired queries performant. If you should need to refactor the schema, things like "rename prop" are trivial, and other things less so, but probably also not automated by most ORMs anyways.
3. Simple joins, sure, but need much more than that and the ORM gets in the way.
4. Well, SQL is lazy.
Object-oriented programming 101 assumes that all your objects are in memory, in a graph, so you can do things like person.getFriends().get(0).getName() [assuming the person in question has >0 friends]. Each step in the graph is essentially a pointer dereference, costing constant effort.
(If your data is small enough to fit in memory, that's what you should generally be doing. People who use hadoop for half a GB of data are usually doing it wrong.)
A relational database assumes that all your data fits on disk, but only a subset of it will be in RAM at any one time (and you generally have a network round trip every time you change that subset). This means you need a completely different way of thinking; this difference is sometimes called the "object-relational impedance mismatch". This is not to do with SQL and OOP just being different APIs for the same thing, they are designed for very different use cases.
ORM tries to pretend that this difference doesn't matter, and works quite well in simple cases when it really doesn't matter.
My standard example why it sometimes does matter: PersonDAO.fetchAll().size() is silly because it forces the database to fetch all Person objects, send them over the network, your application creates the necessary objects for them - and then you throw it all away again because all you needed was the number of people. PersonDAO.count() is much better, even if you have to implement it yourself.
If you don't like the syntax of SQL, sure - use a query builder. In C# or Java you can even get some kind of type safety that way. But you need to understand the difference between an object graph and a relational database to use either of them efficiently, long before you get to advanced ideas such as window functions.
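The fetchAll-versus-count point is easy to demonstrate with Python's stdlib sqlite3 (schema invented for illustration): both give the same number, but one materializes every row first:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Person (Name) VALUES (?)",
                 [(f"p{i}",) for i in range(10_000)])

# fetchAll().size(): ship every row to the client just to count them.
n_slow = len(conn.execute("SELECT * FROM Person").fetchall())

# count(): let the database count and ship back a single integer.
(n_fast,) = conn.execute("SELECT COUNT(*) FROM Person").fetchone()

print(n_slow, n_fast)  # 10000 10000
```

On an in-process SQLite database the difference is modest; add a network round trip per result batch and it becomes the difference between one packet and ten thousand rows on the wire.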
> My standard example why it sometimes does matter: PersonDAO.fetchAll().size() is silly because it forces the database to fetch all Person objects, send them over the network, your application creates the necessary objects for them - and then you throw it all away again because all you needed was the number of people. PersonDAO.count() is much better, even if you have to implement it yourself.
I mean, PersonDAO.fetchAll().size() is just bad code. You need to know how to write ifs properly before programming; likewise, you need to know about databases before using an ORM.
> Object-oriented programming 101 assumes that all your objects are in memory, in a graph,
Yes. We insist on using this OOP style where your objects are Person, Order, Invoice, etc. Ignoring that the data is stored in tables.
But it may not be the only way. What if your objects really are things like Tables, Records, and Relationships (e.g. PersonTable, OrderTable, OrderDetailTable, etc.) and your operations are whatever you can do with these tables? We are trying to abstract away something that is very real. What if we embrace it instead?
I think saying "Just use SQL" is probably a bad idea. You'll most likely end up implementing an ORM anyway, or you will end up with your model code mixed up everywhere with your views.
I do think a lot of people use ORMs as a crutch, which sucks. Also, ORMs often provide too much abstraction, forcing people who actually know SQL to relearn how to do everything the way the ORM happens to like it. I should not have to learn twice as much to be productive due to an abstraction.
What I prefer are really lightweight ORMs that give me models which I can then enhance with custom code. I don't need an ORM that supports plugins or inheritance or a dozen different kinds of joins. All of that can be done more efficiently with custom code.
Also, I think SQL builders are really useful. I think a lot of people conflate SQL builders with ORMs but they're actually very different problems.
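As a rough illustration of that distinction, here is a minimal composable builder sketched in Python (the Select class and its API are invented for illustration): it only assembles parameterized SQL and maps nothing to objects, which is exactly why it is the smaller problem:

```python
# Invented, minimal query builder: it assembles a parameterized SQL
# string and nothing more. Mapping rows to objects is a separate
# (and much harder) problem -- that part is the ORM.
class Select:
    def __init__(self, table):
        self.table, self.conds, self.params = table, [], []

    def where(self, cond, *params):
        # Chainable, so call sites can compose conditions incrementally.
        self.conds.append(cond)
        self.params.extend(params)
        return self

    def build(self):
        sql = f"SELECT * FROM {self.table}"
        if self.conds:
            sql += " WHERE " + " AND ".join(self.conds)
        return sql, tuple(self.params)

q = Select("orders").where("status = ?", "open").where("total > ?", 100)
print(q.build())
# ('SELECT * FROM orders WHERE status = ? AND total > ?', ('open', 100))
```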
> You'll most likely end up implementing an ORM anyway,
This is a really good point. Many people start with the "no ORM" philosophy, realize their application needs some way to map the SQL to the code, time passes..., they have implemented their own half-baked ORM.
A more positive spin is that you'll have an "ORM" that's exactly adapted to your application. Many apps (1) don't need to work with multiple DBMS types and (2) don't use even close to the full panoply of SQL features.
In a language like Java that has a generic DBMS API you can get along just fine with a few classes that handle CRUD operations and transaction management. Somebody familiar with JDBC and SQL can write the bridge classes in about a day, while keeping the overall application vastly simpler.
Either way somebody needs to make an informed choice about ORM vs. direct SQL. It seems as some people get in trouble because they skip that part of the design process.
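A hedged sketch of what such a bridge class might look like — in Python with the stdlib sqlite3 module rather than JDBC, and with Person/PersonRepo and the schema invented for illustration:

```python
import sqlite3
from dataclasses import dataclass

# Invented names throughout: one dataclass per table plus a repository
# with explicit CRUD methods. No reflection, no configuration.
@dataclass
class Person:
    person_id: int
    name: str

class PersonRepo:
    def __init__(self, conn):
        self.conn = conn

    def insert(self, name):
        cur = self.conn.execute(
            "INSERT INTO Person (Name) VALUES (?)", (name,))
        return Person(cur.lastrowid, name)

    def find(self, person_id):
        row = self.conn.execute(
            "SELECT PersonId, Name FROM Person WHERE PersonId = ?",
            (person_id,)).fetchone()
        return Person(*row) if row else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, Name TEXT)")
repo = PersonRepo(conn)
p = repo.insert("ada")
print(repo.find(p.person_id))  # Person(person_id=1, name='ada')
```

The SQL is explicit at every call site, so there is nothing to debug behind the mapping layer — the trade-off is that every new table means writing another repository by hand.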
I mean, stored procedures are fine, but I don't think that actually solves anything? Except maybe for reducing the amount of SQL code you have to send back and forth and (in some databases) allowing for a few more optimizations?
If you use stored procedures, all you've done is move part of the model into the database, so you have to update the stored procedures as part of a deployment. You still need to have the SQL code written out somewhere, and you still need to have something in the application code that knows which procedures exist and how to use the data they return in business logic.
But you can decouple the database logic from the app logic anyway, without using stored procedures. They don't actually help you do this since you still need code that knows what stored procedures to call. Also, I'm not sure how this makes security easier? It seems like security would be the same or maybe a little harder since you now have to track the stored procedures you're currently using as well.
I'm not really sure what you mean by "you can deploy schema changes independently" and "you can change everything and the app should not notice". The stored procedures are basically just an extension of the apps logic right? So you can deploy them at any time, sure, but that isn't different from an app that doesn't use stored procedures, because you could also deploy changes to any part of that app at any time.
I do think stored procedures can be more efficient, because you have a lot more control. But it's not like they are clearly superior from an organizational standpoint. If you write an ORM using stored procedures, it's still an ORM.
In my experience, stored procedures in Postgres are good when you really use the database and care about the data: you need transactions, you need to handle races and concurrency, etc. ORMs break down at this point, or prevent you from ever getting to the point where you can use your database as a database.
Why pretend your SQL database is about objects? It is not... (it is about data)
A stored procedure can act like a view or a query, or use procedural logic. The point is: your app can call it and get a consistent result, no matter what refactoring has been going on.
A direct query needs to know too much about the database (ORM-generated or otherwise), which prevents refactoring and couples the app to the database more tightly.
You can rename or merge tables, views, and functions in the database, but the interface the app uses (stored procedures/DAL) will stay the same and work the same way.
As for app logic... I prefer business logic in the database, not the app, when the data is important. Application logic stays in your application; data-dependent business logic stays with the data.
Every single implementation where I've seen "business logic in the database" has been an unmitigated disaster.
On the other hand, well-factored microservices (out of process) or in-process modules have worked out really well with modern devops and software engineering principles - easy push-button deployments and rollbacks, unit testing, A/B deployments, etc.
I think business logic in the database has prevented disasters in the projects I have worked on. I honestly don't see how it could have been solved better...
It probably depends on the domain/problems.
My experience is with transaction-heavy financial systems or similar, with web frontends and microservices sprinkled around in different languages.
The web app or Java worker should be allowed to focus on its own problems; the business logic needs to live in one central place, which happens to be the database, accessed through a tightly controlled interface in the form of stored procedures.
And what's stopping you from having a tightly controlled interface with a REST Api that is easily deployed, rolled back, unit tested, source controlled and deployed?
I like data. A database is created to handle it, give you tools to query, modify, scale, secure the data.
A rest api... how and why should it be responsible for your data? It solved a different problem.
You might not even need a database I guess, and then anything goes.
I need and like my database, and have suffered trying to get along with different ORMs. SQL is so good at what it is designed to do if you just let it.
(And why just one REST API? How about 100 REST APIs, some microservices, some web apps, some background workers, many different languages. One database. No ORM.)
(Have we come to some max nesting level here? Can't reply to the child comment.)
One db can be a problem, or a strength, depending on the domain; and I really dislike religious design, especially microservices.
I have fewer problems by avoiding ORMs (and religious microservice architecture, or fundamentalist interpretations of REST).
Database handles the shared state in a heterogenous environment.
We need it to be centralized to keep track of money; the apps can't do that, and two independent databases can't do that either. It must be one system that guarantees consistency.
It works great, there is no downtime. The interfaces are defined, the database stands alone, updates are deployed separately.
> Database handles the shared state in a heterogenous environment. We need it to be centralized to keep track of money, the apps can't do that, two independent databases can't do that either.
Why can't apps "keep track of money"? I'm assuming you're referring to transactions. Apps can create transactions and you can share transactions across apps using distributed transaction (I'm not saying distributed transaction is a good idea).
One database is still an issue. When you have a clear slice with one microservice being in charge of one set of data, it's easier to scale, slice, rewrite, and you can deploy and iterate faster without interdependencies.
And you lose all of the benefits of microservices if there is still tight coupling between tables that are unrelated from a domain perspective.
Why would you want to deploy schema changes separately? I would be horrified if someone changed my DB back end without running a full (hopefully automated) set of tests.
The db is separate and the interfaces are defined and the test is for this interface (as part of the schema repository)
You don't need an ORM for testing your code...
But I think this varies from project to project.
How many different applications, in different languages are using your db and do you tolerate downtime?
Why downtime? A developer commits their code; the CI server builds it and runs the non-database-dependent unit tests. It gets deployed to the integration environment, where automated integration tests run - fewer in number, somewhat slower. It gets deployed to the QA environment for a round of manual testing (sometimes); QA signs off, and the build gets deployed to the UAT environment to wait for the business owners' sign-off. Then we turn off the A side of the load-balanced farm, deploy to the A side of the load-balanced production servers, run a round of smoke testing (automated and/or manual), and once everyone is satisfied, we make A live, set the load balancer to use side B, and deploy to B.
All of the manual sign off steps are integrated with the automated release pipeline. As soon as the required approvals sign off, the next step of the pipeline is done.
Rolling back is just redeploying the previous released version. Branching, source control, etc is also a lot easier when all of your business logic is in code and you don't have to sync up the "right" version of your source control with the right version of your stored procedures.
Of course this is even easier when you're using a NoSql solution where your schema is also defined by your class models. But that's another discussion.....
Of course this doesn't have to just apply to code. With things like Packer and Terraform you can do the same with infrastructure. Automated infrastructure deployment is not my expertise...yet
Just use stored procedures? Then you lose the ability to do unit testing without a database dependency; it's a lot easier to roll back code than to roll back code and stored procedures as one; and you don't get full visibility into what the code is doing just by looking at the source.
If all of your business logic is in the stored procedures, what are you actually testing?
And I realize that being able to test queries without database dependencies only really applies to a few languages that treat queries as first-class citizens, like C# with Linq, where you can mock out your actual Linq provider - replace the EF context with an in-memory List<T> - and still test your Linq queries.
> "If all of your business logic is in the stored procedures, what are you actually testing?"
Depends on what you want to test. Can either write unit tests for the stored procedures or unit tests for the code that makes use of those stored procedures.
And then when you write "unit tests" for stored procedures with a lot of developers, you get slow "unit tests" that don't scale across multiple developers because of contention issues.
I think GP meant that you can't/it's hard to test the stored procedures themselves. In this case if you mock the database calls you will not test the database logic.
How would you test data access in a meaningful way without a DBMS? It does not really matter whether you use tables or SPROCs. You'll still need a DBMS instance available.
For PostgreSQL and MySQL you can bring up the DBMS in Docker. That is not a built-in fixture obviously but easy enough to do locally as well as in CI/CD systems like Travis. You'll need to load SQL into the DBMS as a prerequisite to testing. That has the benefit of testing your load/upgrade sequence.
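Where the SQL is portable enough, an in-memory SQLite database can play the same throwaway-fixture role without Docker. A minimal sketch in Python (schema and test are invented for illustration); note that loading the schema in the fixture exercises your load sequence on every run, as suggested above:

```python
import sqlite3

# Invented schema; the same pattern works with a Dockerized DBMS when
# the SQL isn't portable enough for SQLite.
SCHEMA = "CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, Name TEXT);"

def fresh_db():
    # A throwaway database per test; loading the schema here also
    # exercises your load/upgrade SQL on every run.
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def test_count():
    conn = fresh_db()
    conn.execute("INSERT INTO Person (Name) VALUES ('ada')")
    (n,) = conn.execute("SELECT COUNT(*) FROM Person").fetchone()
    assert n == 1

test_count()
print("ok")
```

Each test gets a completely fresh database, so there is no shared state between developers or test runs — the contention problem mentioned above disappears.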
Usually with most modern automated deployments, you keep your build artifacts in a package (zip file, tar, etc.) and run a script to deploy it to your target system.
Rolling back is a simple matter of installing the previous archive. That doesn't just apply to code anymore. You can treat "infrastructure as code" also.
You can do A/B upgrades, rollbacks, etc. There is so much better tooling around regular code than SQL/stored procedures. How many times have you seen stored procedures with hundreds of lines, duplicated stored procedures with V1, V2, etc. appended, commented-out logic, and so on?
I've had to wade through some hairy code too, but at least with code I can do some automated, guaranteed-safe refactoring, find dependencies, keep the interfaces backwards compatible, etc.
You did see me preach about all of the capabilities: unit testing without a database dependency, type safety, flexibility (with Linq I can switch back and forth between an RDBMS and NoSql without any code changes)?
AlphaZero's chess strategies turned out to be quite different from how humans have traditionally come up with chess heuristics/strategies. I wonder what paradigms will be used by AI that does computer programming. Will it organize code into some bits of functional programming, some OOP? Will it structure things into MVC? My guess is whatever AI does will be completely inscrutable to us. It may be optimized for efficiency rather than understandability, which humans need for maintainability.
I have to agree. ORMs can be great when an application is just starting, because that's when you're writing the most tedious queries and statements, but beyond that I personally find that ORMs just get in the way.
As soon as I need to write something more complicated than select-by-id I end up reaching straight for SQL. Otherwise I have to learn the ORM's query API or DSL and build the proper mental model for how it translates to actual SQL.
Or I could just write SQL and be done with it. No mysteries.
Maybe this is just me, but I've never written a join in an ORM that I had the slightest bit of confidence in.
Any discussion on ORMs needs to consider what sort of app and queries you are writing.
ORMs are fantastic for the very common case:
* Fetch 20 rows and display them as a table,
* Fetch 1 row by primary key and display it as a form,
* Write updated fields from the form back into the 1 row in the database.
Anything more complicated, and direct SQL starts being more attractive. But for the common case that ORMs are designed for, they are a major productivity boost.
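That common case can be made concrete. Below is a rough sketch, using Python's stdlib sqlite3 and a hypothetical users table, of the boilerplate an ORM automates for fetch-by-key and write-back:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", ("Ada", "ada@example.com"))

# Fetch 1 row by primary key and display it as a form.
row = conn.execute("SELECT * FROM users WHERE id = ?", (1,)).fetchone()
form = dict(row)  # marshal the row into something the view layer can render

# Write updated fields from the form back into the 1 row.
form["email"] = "ada@new.example.com"
conn.execute("UPDATE users SET name = ?, email = ? WHERE id = ?",
             (form["name"], form["email"], form["id"]))
conn.commit()
```

An ORM collapses all of that into something on the order of "get object, set attribute, commit", which is exactly where the productivity boost in the common case comes from.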
The first time I saw an ORM I was fascinated by this seemingly beautiful idea. But I quickly realized that it's almost useless in real-life projects; plain old SQL seems just much, much better. Now I don't understand why anybody would use an ORM at all.
Also, basic SQL can be easily taught in as little as 10 minutes (I was first taught it in middle school, in an MS Office Query, Access and VBA class). The image of a programmer who can't use SQL (I don't mean the advanced cases, which can indeed be a bit tricky, but those are far beyond the powers of any ORM anyway) seems really bizarre to me.
SQLalchemy is kind of a pain in the ass to actually use, though. The DSL doesn't feel very Pythonic, it's weird and confusing. The way it traverses the Object graph when loading associations between models is magical and opaque and I could never predict when it was going to automatically work and when it wouldn't.
I actually would rather be writing Ruby on Rails, because it's _less magic_ than SQLalchemy.
But my very point was not to make it traverse object graphs at all. You can write nice SQL using it, with selects, joins, functions, etc. All these parts, SQL clauses, are composable, so you can factor out common parts.
Yes, I'm fine working with lists of tuples, not "model objects". Object graphs don't map all that well to the RDBMS model. It's best done on a case-by-case basis, if you care about performance at all.
If you aim to have a Pythonic data layer, I've heard good things about Pony ORM[1]. I mean, you can't get much more Pythonic than the example they have on their website. But I haven't used it myself.
The thing that bit me with SQLAlchemy is joins. You have to structure your SQLAlchemy objects in a specific way to do joins. Usually I write my SQL query, then spend half an hour trying to convert it to SQLAlchemy.
This is a quip that misses the parent's point. SQL isn't very composable, because its syntax requires infix notation, position-dependent phrases, separators, and the like. The abstract syntax tree of SQL is much more valuable than its syntax, which is essentially just a bad English serialization that resembles a natural language sentence.
A DSL which manipulates queries and then produces syntactically valid SQL is tremendously valuable.
Bear in mind that SQL was designed to allow non-technical business users to use it. In most offices, a veteran business user or BI person will know how to use SQL. So SQL is a DSL for relational deconstruction of data models that is conceptually simple enough for people to grasp.
Templating can absolutely save time and boilerplate, but adding a DSL over a DSL in the form of an ORM may well be the problem.
Views are only half the answer. Try factoring out a set of conditions and use them both in a `select` and in `update`.
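One way to get that factoring without views is to keep the condition fragment and its bind parameters together so both statements reuse it. A minimal sketch with stdlib sqlite3 (hypothetical orders table; note the fragment is developer-written constants, never user input, so the interpolation is safe):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL)")
conn.executemany("INSERT INTO orders (status, total) VALUES (?, ?)",
                 [("open", 10.0), ("open", 200.0), ("closed", 50.0)])

# The shared condition lives in one place: a SQL fragment plus its parameters.
stale_orders = ("status = ? AND total < ?", ("open", 100.0))

cond, params = stale_orders
# The same condition drives both the SELECT and the UPDATE.
rows = conn.execute(f"SELECT id FROM orders WHERE {cond}", params).fetchall()
conn.execute(f"UPDATE orders SET status = 'expired' WHERE {cond}", params)
conn.commit()
```

Query builders and SQLAlchemy's expression layer do essentially this, but with an AST instead of strings, which also composes across joins and subqueries.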
Also, views tend to pollute the namespace a bit: you want a set of descriptive names, per application, and possibly per application version. Schemata help here, though.
He says he likes SQLAlchemy, but doesn't say why. I'd be interested to know, though.
To me, the big win with SQLAlchemy is that it separates the SQL expression layer from the ORM layer so cleanly. I can write very complicated queries with the expression layer that wouldn't be possible at the ORM layer, and do it in a much more composable way than concatenating SQL strings. Example: http://btubbs.com/postgres-search-with-facets-and-location-a...
I'm very comfortable in SQL, and would prefer to write my queries. But our team has grown, and we've found that developers say they know SQL, but they really don't. In general, I've found that it's not the syntax that messes people up. There's a significant mental "jump" between the usual procedural coding paradigm and the "set based" paradigm offered by SQL. Some people just never catch on.
So we're going to start using an ORM (Entity Framework Core 2) so that the "mere mortal" developers can pitch in. I know there's a way to run raw SQL, so I know that when we find spots where the ORM fails, we can just rewrite it with some good SQL if we decide that's best.
But maybe that'll never happen? We've been trying to do simpler SQL stuff as of late so that the heavier lifting in the system is done by the application/client instead of the database (bottleneck). As the database sees simpler, less unique queries, more stay in the plan cache, indexes are more reliably hit as expected, and performance increases.
Why are we even having this debate? There are some ORMs that bring so much value to the table that you'd be stupid not to use them. Example being Django's ORM or SqlAlchemy.
Also, be specific about which ORM you are comparing to raw SQL. Are you talking about Hibernate or SQLAlchemy? Are you talking about a query builder?
Good ORMs help tremendously with maintainability and security. They also let you drop down to raw SQL when needed.
I don't think rails would have been as popular if you had to use SQL.
I've used SQLalchemy a lot. I've even written a keyset paging extension for SQLalchemy.
But recently I've switched to writing stored procedures and calling them directly, instead of going through an ORM for everything... And it's so much easier.
In the databases I've worked with extensively (PostgreSQL and, somewhat in the past, Oracle) stored procedures are routinely version-controlled as part of the application; these bits of code are just in a different language than the rest.
The creation of functions/procedures is not tied to state of the database in quite the same way as tables are; the functions/procedures, where they care about the data, do need to recognize the table structure and changes to that structure, but that's no different than any other of the application code which makes use of the data in the database.
I think where many people get caught up on this is that they do something like migrations to get code, including procedural code, into the database... but that's not the only game in town. And really, given what's possible with databases today, I'm not sure migrations are even the best way anymore.
Consider a tool like: http://sqitch.org/ which facilitates not treating stored procedures as migrations, but rather as individual files which change just like any other code.
There are ways to accomplish having good version control on the table/structure side as well, which again, is something you lose with the migration tools I've worked with.
Sure you can. SQL is just plain text, so keep all your create table scripts in your repo, along with deploy and rollback scripts that you’d use to extend or migrate your schemas.
Microsoft has SQL Server Data Tools which can be integrated into your favorite SCM (we use VSTS but we'll switch to Git soon), can have automated deploys (which we don't) and in general can be a part of a modern development lifecycle.
Flyway for Java... For stored procedures you use the repeatable syntax that way it checks the checksum of the file and if it doesn't match what is in the migration table it will run it... Easy and always up to date... That's with Java anyway...
There are several such tools, which are specialized and thus tend to perform a better job than orms (who try to do everything)
But I would recommend writing, reviewing and deploying migrations by hand, esp for critical parts of the schema (automatic tools are almost guaranteed to get something wrong, with locking etc)
Yes, and if all your stored procedure is doing is execute a query and return a cursor to the result set, it's using just as much database cpu as with a regular query.
I disagree. ORMs don't bring value; they dilute it and pile abstractions upon abstractions. Your application becomes hard to maintain, your database hard to refactor, your queries slow.
ORMs seem nice for simplified problems but become a horrible mess for real problems, IMHO...
I've worked on some rails apps, and the ORMs caused more problems than they solved...
A few years ago, I worked on an application which would occasionally run into performance issues. Every time the solution was to rewrite the Entity Framework usage into standard SQL. Sometimes it would generate the strangest but somehow legal queries which would be challenging for SQL Server to optimise.
For my next app, I just started using SQL only and never looked back. Simple queries are easily generated with a nice internal API (SELECT * FROM Table WHERE ID = X, etc).
However, more complex selects are all written using hand-written SQL. New developers sometimes find this a bit strange, especially the younger ones, but nobody can fault the system's speed.
Django's ORM is actually a good example of where identity is an issue. It lacks composite primary keys. If I have a dependent table that has my real ID (say, an order number that is a varchar), I have to join to the parent table to do lookups by the order number. I can't do something like key(order_number, line_number), so I end up hydrating a parent object for operations that only require the dependent objects.
Good question. It would depend what kind of rules they were and how they were configured. Wasn't specified that they were "user-configured" in the original statement, though.
The kind of rules we use are boolean expressions that identify which rows the user's groups can read/write/delete. The ORM automatically combines all the rules as a single expression and applies it to the query.
As best I can tell, Postgres lets you do this albeit constraining your boolean checks to things you can efficiently do in the database. But then I can definitely see how this could get unwieldy very quickly.
- Are maintainable by a team. "Oh, because that seemed faster at the time."
- Are unit tested: eventually we end up creating at least structs or objects anyway, and then that needs to be the same everywhere, and then the abstraction is wrong because "everything should just be functional like SQL" until we need to decide what you called "the_initializer2".
- Can make it very easy to create maintainable test fixtures which raise exceptions when the schema has changed but the test data hasn't.
- Prevent SQL injection errors by consistently parametrizing queries and appropriately quoting for the target SQL dialect. (One of the Top 25 most frequent vulnerabilities.) This is especially important because most apps GRANT UPDATE and DELETE, if not CREATE TABLE and DROP TABLE, to the sole app account.
- Make it much easier to port to a new database; or run tests with SQLite. With raw SQL, you need the table schema in your head and either comprehensive test coverage or to review every single query (and the whole function preceding db.execute(str, *params))
- May be the performance bottleneck for certain queries; which you can identify with code profiling and selectively rewrite by hand if adding an index and hinting a join or lazifying a relation aren't feasible with the non-SQLAlchemy ORM that you must use.
- Should provide a way to generate the query at dev or compile-time.
- Should make it easy to DESCRIBE the query plans that code profiling indicates are worth hand-optimizing (learning SQL is sometimes not the same as learning how a particular database plans a query over tables without indexes)
- Make managing db migrations pretty easy.
- SQLAlchemy really is great. SQLAlchemy has eager loading to solve the N+1 query problem. Django is often more than adequate; and has had prefetch_related() to solve the N+1 query problem since 1.4. Both have an easy way to execute raw queries (that all need to be reviewed for migrations). Both are much better at paging without allocating a ton of RAM for objects and object attributes that are irrelevant now.
- Make denormalizing things from a transactional database with referential integrity into JSON really easy; which webapps and APIs very often need to do.
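On the parametrization point in that list: the difference between interpolating user input into SQL text and binding it is one line of code but a whole vulnerability class. A sketch with stdlib sqlite3 (table and input are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO accounts (name) VALUES ('alice')")

user_input = "alice' OR '1'='1"

# Vulnerable: user input becomes part of the SQL text itself,
# so the injected OR clause matches every row.
unsafe = conn.execute(
    f"SELECT id FROM accounts WHERE name = '{user_input}'").fetchall()

# Safe: the driver binds the value; it can never change the query's shape.
safe = conn.execute(
    "SELECT id FROM accounts WHERE name = ?", (user_input,)).fetchall()
```

An ORM or query builder makes the safe form the path of least resistance, which is the real security argument.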
I'd rather learn the ins and outs, problems and issues, highs and lows, of SQL rather than an ORM.
ORMs require just as much investment in time, and even then you still need to learn SQL to get the ORM to do what you want it to do.
SQLAlchemy on Python is a truly fine piece of software but in the end it was much simpler and felt more powerful for me to write the SQL. And not even hard BTW.
I only have a limited amount of time available for learning and if I can trim out an entire class of technology (i.e. the ORM) then that's a whole bunch of stuff I just don't need to spend time learning.
I prefer SQL to ORMs as well for exactly the reason you state. It's more comfortable and it's less of a mystery.
But you don't always get to pick what code you'll be working with and if you are working with others in a web framework pretty good chance you'll be learning an ORM anyway.
Also really long SQL statements are no fun. Like debugging a whole program written in one line. I think sometimes there is a temptation to get excessively clever with SQL.
This is great advice in general for everyone in a role that even slightly touches on ops.
In my day-to-day work, I frequently observe that knowing some SQL (esp. joins, and aggregate functions like SUM/MAX with GROUP BY and HAVING) turns you into some sort of mighty wizard for most people. They're trying to debug a problem in their service and not making progress for hours, and you just walk straight into psql, take a look at the schema, do a few SELECTs, and zoom in on the problem.
Yet nobody seems to consider SQL a valuable skill. I guess it's not buzzwordy enough.
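For what it's worth, the queries that earn that wizard reputation are often just aggregates with a filter on the groups. A small sketch against a hypothetical request-log table, run through stdlib sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (service TEXT, latency_ms INTEGER)")
conn.executemany("INSERT INTO requests VALUES (?, ?)", [
    ("auth", 12), ("auth", 15), ("billing", 900), ("billing", 950), ("search", 30),
])

# Which services have a worryingly high average latency?
# GROUP BY forms the groups; HAVING filters whole groups, not rows.
slow = conn.execute("""
    SELECT service, AVG(latency_ms) AS avg_ms, MAX(latency_ms) AS max_ms
    FROM requests
    GROUP BY service
    HAVING AVG(latency_ms) > 100
    ORDER BY avg_ms DESC
""").fetchall()
```

A few of these against the production schema frequently zoom in on a problem faster than hours of log-reading.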
The most cringeworthy thing I heard about ORMs actually happened two weeks ago when I explained our use of a query builder rather than an ORM. The new senior developer was talking about speed (??? uhm… k...) and the benefit of being able to switch between PostgreSQL and... MongoDB. I just cringed, and didn't know what to say. Using the same domain model in an RDBMS as in a document store? I really didn't know how to respond to that.
On the other side you have juniors, just starting development and working with an ORM/ODM the first time. It's not uncommon for them to spend years in development while learning very little SQL at all, hardly understanding the database (apart from what they learn in school). I don't blame them. I hardly know TCP/IP at all and still I write technology for it everyday. But ORMs are a more leaky abstraction; a lot of complexity is made easier by just switching to SQL.
More principal points:
- Why do we want to abstract away further from the most important part of the business: data. Data usually outlives the applications built on top of it.
- Best practice is to keep backend API's stateless and requests short-lived. ORM's promote the use of state. In client-side applications state is a much more interesting long-lived thing, but a backend API these days? It has actually become a lot simpler there since I've started doing development; few modern backend servers render views these days.
I understand why the author (re-)embraced stored procedures but I never will. Blame the overzealous PL/SQL Oracle seniors I've met; god forbid any human bestow such complexity on their colleagues. It's also hard to automate tests for them and promote them through DTAP alongside an application, hence I keep putting all logic in the application.
I can imagine if the language and ecosystem are very strongly geared towards ORM you shouldn't try to do anything else. That's just painful. But in most languages other than Java and C# I'd make a good consideration if you really need/want an ORM (coming from Java, I never took non ORM design seriously there, but perhaps I shouldn't have).
Not sure where I'll stand in a few years time on this… but for now just happy with plain old SQL and query builders.
ORM stands for Object-Relational Mapping (Wikipedia). It is just a way to map domain objects to tables. There is no "promoting the use of state" in that definition. It is your choice to start using state in a (mis)designed manner, but please don't blame ORMs for that.
Sorry, I should have been more precise. Not all ORM's are designed the same, far from it.
Many ORMs are built using the Unit of Work / Data Mapping patterns. Such ORM's map your data into a separate domain model and manage this model for you. If your orm has something like an "EntityManager" it has likely implemented this Unit of Work pattern.
A key thing the Unit of Work achieves is to commit changes to the database in a single transaction. You often need to update multiple records in an atomic way within enterprise software.
You might not be faced with such challenges in a simple app, but in monolithic enterprise software it's a core feature of what a good backend server does.
Active Record-based ORMs or query builders aid you only a little in this task; they expose the transaction-handling logic, so much so that it starts to read like a normal SQL database transaction (and might be merely cumbersome to use at worst). Here you, the programmer, manage it, similar to a normal SQL transaction.
The Unit-of-Work based ORM is more intelligent. It manages the database transaction for you and figures out any changes that were made to the managed entities. In my experience all Java-built enterprise software (I used Hibernate, EclipseLink and Toplink) are designed this way and make heavy use of it. I've used it with Doctrine in PHP quite a lot, and my guess is C#'s Entity Framework is also built around such concepts. That is a big slice of the ORM market.
Here is where the state comes in; different parts of the applications contribute to creating a single database transaction until flush-time. You as a programmer should know when "flush time" actually happens and understand which entities were marked dirty. That is a lot of hidden state that is managed for you; it is in fact the core of what such ORM's do; managing state until it's ready to be flushed. To make it really advanced, powerful ORMs (the popular enterprisey ones) do a lot of caching too, at different times and at different scopes.
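As a toy illustration of that hidden state (this is not how Hibernate or Doctrine are implemented, just the shape of the pattern): changes accumulate in memory, and nothing reaches the database until flush time.

```python
class UnitOfWork:
    """Collects changes in memory; nothing reaches the database until flush()."""
    def __init__(self):
        self.dirty = []       # entities modified since the last flush
        self.statements = []  # stands in for the real SQL sent at flush time

    def register_dirty(self, entity):
        if entity not in self.dirty:
            self.dirty.append(entity)

    def flush(self):
        # One transaction: every pending change is written together.
        for entity in self.dirty:
            self.statements.append(
                f"UPDATE {entity['table']} SET ... WHERE id = {entity['id']}")
        self.dirty.clear()

uow = UnitOfWork()
order = {"table": "orders", "id": 1}
customer = {"table": "customers", "id": 7}

# Different parts of the application touch different entities...
uow.register_dirty(order)
uow.register_dirty(customer)
# ...and none of it hits the database until flush time.
assert uow.statements == []
uow.flush()
```

Everything between register_dirty and flush is the invisible state the parent describes; real implementations add identity maps, dirty checking by snapshot comparison, and caching on top.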
When grappling with such tools, the distance to normal SQL becomes very large. I think that is where quite a bit of the hate comes from. It has become very powerful magic.
I've been in places where I had to really understand how this magic works to solve serious performance issues with it. I learned a lot, solving problems that shouldn't have existed in the first place.
I don't like magic. It makes me hide in the corner and cry a little.
My comment was targeted towards the UoW / Data Mapper stuff and much less so to Active Record ORM's.
To me, it hinges on what you mean by "a collection of objects." If it is just a list C-like structs, that's great. If it involves lazy-loaded child collections, inheritance hierarchies, or any kind of behavior at all, that makes me worried.
Yeah, I agree. Lazy loading in ORMs is what happens when people want OO databases. And by OO databases, I mean everything acting like it's in an in-memory collection.
Sounds nice, but in reality, latencies, networks, massive data sets, atomicity of operations, and a whole host of other annoyances get in the way.
... so it ends up being simpler and easier in the long run to be very explicit about your interactions with data stores. Took me a long time to get to this point.
I do agree every one who needs to get data from an SQL database should know SQL, and mostly, should know about indexes and how and why queries can be slow.
But a query builder is extremely useful, and in any complex application, if you don't use one, you end up building one yourself, which may not be a very good idea if you don't understand things like query injection.
So learn SQL, learn ORM, and choose in a case by case basis.
I wish there was a commonly agreed upon name for "query builder" that isn't "ORM". Actually mapping relational data to objects often rubs people the wrong way, for good reasons, but the query-building layer underneath is pretty universally useful.
But in general I agree with you: Why just learn SQL? Learn all the layers!
You can write more efficient C(++) if you know assembly, even if you don't write anything in assembly.
You can write more efficient python if you know C(++) and the various tradeoffs between data structures, even if you don't write any C++ and don't do any data structure by hand in the current project.
...
You can use an ORM more efficiently if you know SQL. Same thing isn't it?
That assumes the ORM doesn't completely get in the way, of course.
Like static vs dynamic typing and frameworks vs libraries this is an endless debate that rarely sheds much light on the topic.
The people who use ORMs daily are likely to favour them and those that think they are the devil's work are unlikely to have intimate experience of a range of different ORMs.
It's "Active Record" style ORMs like Hibernate that are the culprit, along with the way many developers utilize them to avoid any contact with the realities of RDBMSs, which leads to data access antipatterns and poor performance (multiple needless queries per request, etc.).
Another thing people need to really give up on is the pipe dream of switching databases -- you're not going to do it. I've never seen a single case of people actually using an ORM to change the RDBMS they store data in.
Not all ORMs are like that, and the best solutions are ones like Knex/Objection, where the ORM (Objection in this case) is a nice abstraction for single-object access/writing and the underlying SQL builder (Knex) is fully exposed and used for everything else.
I think it's misleading to think about switching databases on an ongoing application as the use case for database independence. More important use cases:
1. Using SQLite for unit testing and a real RDBMS for integration testing, acceptance testing, and production.
2. When you are writing a library that will be used by different projects, not all within a single organization (e.g., a Free Software project).
3. When you are providing a product that the end-user may want to use with different choices of database (e.g. forum software, Nextcloud, etc.)
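Use case 1 is especially cheap to set up. A sketch with Python's stdlib sqlite3 and unittest (schema and test are hypothetical): each test gets a fresh in-memory database, and no server is required.

```python
import sqlite3
import unittest

def create_schema(conn):
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

class UserRepositoryTest(unittest.TestCase):
    def setUp(self):
        # A fresh, throwaway database for every test; no server required.
        self.conn = sqlite3.connect(":memory:")
        create_schema(self.conn)

    def test_insert_and_fetch(self):
        self.conn.execute("INSERT INTO users (name) VALUES ('ada')")
        name = self.conn.execute(
            "SELECT name FROM users WHERE id = 1").fetchone()[0]
        self.assertEqual(name, "ada")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(UserRepositoryTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The caveat is that SQLite's dialect differs from the production RDBMS, which is why this works far better when an ORM or query builder is papering over the dialect differences for you.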
I'd agree switching a current project from one DB server to a different one is a pipe dream, but the couple of use cases that are related but incredibly handy are switching the DB driver mid-project for any number of reasons, or using a new and unfamiliar database in a new project.
I've found ORMs (such as Entity Framework) great for operations that can be described as "find a single thing by PK and update it." For read operations I've favored this strategy: "Imagine the ideal result set for the task at hand. Use SQL to deliver that result set. Do the rest of the work in your app language of choice."
Edit: I guess I should clarify that I would favor using any library that maps result sets to lists of objects. And I would consider that to be part of "do the rest of the work in your app language of choice."
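That strategy can be sketched in a few lines: let SQL produce the ideal result set, then do the row-to-object mapping in the application. A stdlib sketch with hypothetical names:

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class OrderSummary:
    customer: str
    order_count: int

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany("INSERT INTO orders (customer) VALUES (?)",
                 [("ada",), ("ada",), ("grace",)])

# Let SQL deliver the ideal result set for the task at hand...
rows = conn.execute("""
    SELECT customer, COUNT(*) FROM orders GROUP BY customer ORDER BY customer
""").fetchall()

# ...then map rows to objects in the app language of choice.
summaries = [OrderSummary(customer, count) for customer, count in rows]
```

Any thin row-mapper library does the same thing; the point is that the shape of the query drives the shape of the objects, not the other way around.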
Yeah, EF is great for CRUD. The way I distinguish whether EF is going to be used or not is simply whether the workload is OLTP or OLAP. At OLTP EF excels. It's terrible at OLAP (ORMs generally are), so I'll drop to raw ADO.NET and (if lots of data has to go into SQL Server) table-valued parameters.
Regarding the article's point about managing your schema, there are only two real options IMHO:
1) "The database schema is the official definition."
Programmatically generate whatever ORM objects (at build time) in a 1-to-1 fashion from a schema dump. This is the approach DKOs use: https://github.com/keredson/DKO As long as the code generation step is done as part of the build process, you'll have none of the normal code generation headaches, and your build will fail if you've made a code-incompatible schema change.
2) "Your ORM objects are the official definition."
And generate the schema definition automatically. The common problem here is that most "generate schema" functions are stupidly lazy and drop the work of calculating the diff from an existing schema on the developer (forcing them to write migrations). This is unacceptable in my eyes, just as it would be if my version control software wanted me to write my own diffs by hand in order to make a commit. I strongly prefer automatically generated diffs, like in https://github.com/keredson/peewee-db-evolve, so you can do non-destructive schema changes. It's a model I've re-implemented for any new ORM I wind up using.
In my current project (work) I've chosen to use JDBI for communication between model and database. For me it is the perfect level of abstraction whereby it still feels like I'm just writing SQL queries but also avoiding a lot of the boilerplate code that comes with having to manually manage DB connections.
I still have to write the actual code that maps the result set to objects, but I've found this is a very small tax to pay, especially with Kotlin's data classes.
I used an ORM in a previous job (RoR ActiveRecord). I didn't find it an altogether horrible experience, but there were many cases where we would get these 'leaky abstractions' in the wrong direction, from the model -> database schema. There were also a lot of cases where we would realise that some ActiveRecord query we were doing was unintentionally loading entire tables into memory (our fault, not ActiveRecord's fault). This was usually remedied quite easily by just RTFMing the docs, but my feeling is we could have avoided it in the first place if we'd used a lower-level strategy.
Personally, when developing a new feature I always like to start by thinking about the database representation first and then working my way up. I think having this sensitivity to how it should be represented in the database can allow you to avoid many of the pitfalls that may come with using a more opaque ORM.
Another valuable attribute I get from using JDBI is that it's dead easy to mock (unlike heavyweight ORMs), so unit testing stuff that interacts with it is super simple.
We took a tack similar to how PostgREST and PostGraphQL are structured. We use views in the public schema to build our objects. Functions constrain our mutations. Triggers respond to events and maintain consistency.
It makes our web API code simple and hard to introduce errors that invalidate our customers’ data.
Don’t miss having an ORM. Always seemed like more abstraction and complication than was necessary given recent advances in servers like Postgres.
Every extra abstraction layer comes at the cost of complexity, maintenance, and capability, and therefore inherits the burden of proving its worth. ORMs frequently claim to be less tedious than SQL. Failure to deliver on that promise is a reason to cut them out.
A lot of the reasons listed in this article actually made me shift from sophisticated ORMs like Hibernate back to a query based approach. A really nice framework for this (in Java) is jOOQ which gives you the possibility to write typesafe SQL via code generation.
https://www.jooq.org/
(I'm not at all affiliated with jOOQ - just a happy user)
Another happy jOOQ user here! I'm ditching Hibernate wherever I can in favour of it. And using it with Kotlin instead of Java makes it almost a form of poetry.. I've seriously been thinking about writing a blog post, something along the lines of "jOOQ: how I learned to love the database again"
I've been thinking about what makes jOOQ so good and a huge part of it is brilliant engineering: SQL clauses are mapped almost 1:1 into reasonable and understandable code while the author spends huge effort to cover new features as databases introduce them without turning his product into a mess or introducing API breaks in every major new version. That's hard.. but awesome! :)
> But the damn migration issue is a real kick in the teeth: changing the model is no big deal in the application, but a real pain in the database. After all, databases are persistent whereas application data is not. ORMs simply get in the way here because they don't help manage data migration at all.
If your ORM doesn't help you manage data migration, get a better one. Any ORM worth its salt should let you generate migrations; at least Django and Hibernate (+ Liquibase) can. I find the key to making everything work nicely is to let the definition in the application/ORM be the source of truth for what the DDL looks like. If you want a particular SQL table layout, figure out how to tell the ORM to generate it. If you try to retrofit an ORM onto an existing table schema (which it seems is this author's preferred approach), you're in for a world of pain.
> These two things don't really get along because you can really only use database identifiers in the database (the ultimate destination of the data you're working with).
> What this results in is having to manipulate the ORM to get a database identifier by manually flushing the cache or doing a partial commit to get the actual database identifier.
Use UUIDs. Generate them in the application, but use them directly as identifiers (pkeys) in the database.
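A sketch of that approach (hypothetical tables): because the application mints the UUID, the identifier is known before any INSERT or flush, so parent and child rows can be written in one batch with no round trip to ask the database which id it assigned.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id TEXT PRIMARY KEY, total REAL)")
conn.execute("CREATE TABLE invoice_lines (invoice_id TEXT, amount REAL)")

# The application mints the identifier; no flush or partial commit is
# needed to learn it.
invoice_id = str(uuid.uuid4())
conn.execute("INSERT INTO invoices (id, total) VALUES (?, ?)", (invoice_id, 99.0))

# Child rows can reference the parent immediately, in the same batch.
conn.execute("INSERT INTO invoice_lines VALUES (?, ?)", (invoice_id, 99.0))
conn.commit()
```

The trade-off is index size and locality versus sequential integers, which is worth weighing per table.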
> Something that Neward alludes to is the need for developers to handle transactions. Transactions are dynamically scoped, which is a powerful but mostly neglected concept in programming languages due to the confusion they cause if overused. This leads to a lot of boilerplate code with exception handlers and a careful consideration of where transaction boundaries should occur. It also makes you pass session objects around to any function/method that might have to communicate with the database.
> The concept of a transaction translates poorly to applications due to their reliance on context based on time. As mentioned, dynamic scoping is one way to use this in a program, but it is at odds with lexical scoping, the dominant paradigm. Thus, you must take great care to know about the "when" of a transaction when writing code that works with databases and can make modularity tricky ("Here's a useful function that will only work in certain contexts").
Use a monad to represent "this function has to happen in a transaction", then all those problems go away.
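Python has no monads, but a context manager captures much of the same idea: the transaction boundary becomes a visible scope that commits on success and rolls back on any exception. A sketch with stdlib sqlite3 and hypothetical accounts:

```python
import sqlite3
from contextlib import contextmanager

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO accounts (balance) VALUES (?)", [(100.0,), (0.0,)])
conn.commit()

@contextmanager
def transaction(conn):
    """Make the transaction boundary a lexical scope: commit on success,
    roll back on any exception."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise

def transfer(conn, src, dst, amount):
    # "This function only works inside a transaction" -- and that
    # requirement is now visible at every call site.
    conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                 (amount, src))
    if amount > 100:
        raise ValueError("over limit")
    conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                 (amount, dst))

with transaction(conn):
    transfer(conn, 1, 2, 40.0)

try:
    with transaction(conn):
        transfer(conn, 1, 2, 500.0)  # fails midway; the debit is rolled back
except ValueError:
    pass
```

It isn't a type-level guarantee the way a monadic effect is, but it does move the "when" of the transaction from dynamic context into something you can see in the code.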
An ORM is both a more native way to represent data rows in the application and a convenient wrapper to automate much of dealing with SQL and result sets. But only a fool would wish to see an ORM as an object-based abstraction over SQL.
Dynamic languages such as Lisp and Python can often make do without a specific ORM layer because it's easy to stash data into lists and dictionaries. If you can read rows from your result set into a list, make dicts or structs out of each row, and pass around that list or its entries to various functions, you've just implicitly implemented an LRM: lispy relational mapping. But sometimes it just fits to create objects out of rows.
Simple wrappers will do for an ORM: the best kind always makes it clear that you're just using machinery to operate an SQL database, not updating an object and then "saving" it back to disk at the end.
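A sketch of that "LRM" style with stdlib sqlite3 (table and data invented): `sqlite3.Row` already gets you most of the way to plain dicts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become dict-like, keyed by column name
conn.execute("CREATE TABLE tags (name TEXT, count INTEGER)")
conn.executemany("INSERT INTO tags VALUES (?, ?)", [("sql", 3), ("orm", 5)])

# "Lispy relational mapping": plain dicts, no mapper classes involved.
rows = [dict(r) for r in conn.execute("SELECT name, count FROM tags")]
```

From here, `rows` can be passed to any function that understands dicts; no class hierarchy required.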
I really enjoy seeing this debate because I hope more anti-ORMers unite and we see less use of ORMs in future projects. There are other disadvantages of ORMs that I haven't seen mentioned much:
1) Low-level performance. Even if (and it's a big if) you manage to get your ORM to generate somewhere near the optimal query, ORMs in my experience are always significantly slower than hand-written SQL. When I last benchmarked, I couldn't get Hibernate any better than 4x as slow as raw JDBC, and keep in mind that's pure CPU overhead.
Think your service is I/O bound? It's probably not, and it's probably your ORM to blame. This may be less of an issue for a dynamic language like Python, but I see it as a much bigger issue for Java/C# and friends.
2) Debugging/understandability. Did you know that Hibernate maintains a cache of every object you load in a session until you flush it? I didn't, until we had an outage because our service OOM'ed while loading too much data without flushing.
Do you know how exactly your ORM is loading and saving data and when? Depending on your use of the various lazy-loading and storing features of your ORM, it can be very difficult to reason about when and how your ORM is talking to your database.
Do you know how your ORM is integrating with your cache, which is likely memcached? Why is your ORM integrating into itself the concept of a cache in the first place? In my experience, hibernate gets caching wrong, and that's not entirely its fault. It's difficult to get caching right in the general case. But I would rather be forced to think about caching up front and get it right for my use case rather than try to understand how Hibernate is doing it and working around its mistakes and limitations.
The common theme is that the use of an ORM makes it incredibly more difficult to understand, reason about, and debug your application rather than using a simpler library. In my experience, this alone makes an ORM not pull its weight.
Using an ORM saves you from having to implement a lot of code. But you still have to understand how everything works. ORMs make it look easy, but there is a lot of magic involved that you need to understand sooner or later.
This type of article has come up countless times before. Why not both? Using both, you let the ORM handle the regular, boring CRUD, validation, and repeatable exercises, and write the tricky joins, aggregates, and function calls that ORMs don't do well in SQL. Every decent ORM supports dropping to SQL.
Ah yes - the proverbial "ORMs are bad, just learn SQL" post. This is analogous to saying "don't use frameworks". Sound ridiculous? Yes, yes it is.
The law of leaky abstractions applies to many, many things, ORMs included. I would also argue they apply in different degrees, usually related to the design of the ORM (the post mentions SQLAlchemy vs. Hibernate, for example).
But consider the following:
1) Why do people still use ORMs? Exactly.
2) Question 1 but s/ORM/framework_or_widely-used-library
3) ORMs allow you to develop faster, and deliver value
4) Beginners already have a hard time coding, designing, and understanding what they're doing. ORMs provide a nice abstraction over underlying data stores
5) Although fraught with peril, ORMs provide a common interface that'd give _some_ help if you switch data stores
6) Multi-line SQL statements are a huge pain in some languages
In response to (4). I'd posit that beginners have a hard time with this stuff because they use too many leaky abstractions. Like the article says, if you want to use an ORM for anything non-trivial, you need to learn both the ORM and SQL, which is inherently more difficult than just learning SQL.
Providing tons of abstractions to beginners so they can build something in 10 lines of code doesn't help them get better. All it does is encourage them to learn top-down when it's often much easier to learn something from the bottom-up.
> This is analogous to saying "don't use frameworks". Sound ridiculous? Yes, yes it is.
When developing API's in Golang or for microservice / serverless architectures not using a framework might actually make a lot of sense. Also microframeworks (trimmed-down versions compared to opinionated frameworks) are very popular in almost any language.
I'd say that Golang rather comes with a (simple) framework in the standard library, so you don't need a third-party one. After all, http.Handle/HandleFunc clearly follow the Hollywood Principle: http://wiki.c2.com/?HollywoodPrinciple
> ORMs are bad, because objects aren't particularly useful to deal with most data.
Objects are containers of data; that's a crazy statement to make. The relational structure maps pretty cleanly to objects, properties, and collections.
However, objects are not good for reporting. And the author mentioned doing 14 joins and hundreds of columns - that smells like a reporting query.
But why do you need these containers of data, when you could have more direct access to both the data itself and the whole set (not individual objects, nor collections of objects)
I don't like ORMs, but I did struggle for some years trying to use them, which IMHO was a detour. SQL and stored procedures in plpgsql are so much better: easier to maintain, easier to reason about, etc.
I personally would write a simple plpgsql stored procedure, but I would not trust user input and would only allow a defined set of columns from a defined set of tables.
You can have lots of dynamic SQL, but that might become a rabbit hole, just as with an ORM. It sounds like a problem you shouldn't have; throwing an ORM at such a problem... might lead to even more strange issues down the road...
> I personally would write a simple plpgsql stored procedure, but I would not trust user input and would only allow a defined set of columns from a defined set of tables.
Sure, you can limit it to a single table and to a certain set of columns e.g. A B C D E. Can you give me an example of a simple plpgsql stored procedure to do this?
> You can have lots of dynamic SQL, but that might become a rabbit hole, just as with an ORM
That's not my experience. In a language with good introspection and/or where most entities are first-class, you can do this rather easily. In Python that would take two or three short lines.
> It sounds like a problem you shouldn't have
This is a bit of a cop-out :) allowing the generation of simple reports configured by the user is a typical need for us.
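One way the "two or three short lines" claim above could look in Python, with stdlib sqlite3 and an invented `report` table: user-supplied column names are filtered against a whitelist before they ever touch the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE report (a INTEGER, b INTEGER, c INTEGER)")
conn.execute("INSERT INTO report VALUES (1, 2, 3)")

ALLOWED = {"a", "b", "c"}  # the defined set of columns users may request

def report_query(conn, requested):
    # Only whitelisted names survive; anything else is silently dropped.
    cols = [c for c in requested if c in ALLOWED]
    if not cols:
        raise ValueError("no valid columns requested")
    # Column names are interpolated only after the whitelist check;
    # actual values would still go through ? placeholders.
    return conn.execute(f"SELECT {', '.join(cols)} FROM report").fetchall()

rows = report_query(conn, ["b", "c", "evil; DROP TABLE report"])
```

The hostile "column" never reaches the query, and the surface area stays a fixed set of names rather than arbitrary SQL.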
The relational structure maps cleanly to badly designed objects, where most classes have 5-10 fields and most fields are opaque data values with setters and getters. If you wrote classes like that in isolation, you'd have to do some serious work in code review, explaining why a hodgepodge collection of properties is a single unit of responsibility. But I've never worked in an ORM-using system where they weren't endemic.
I don't buy into the idea that those kinds of classes are just unqualified bad design. The majority of applications exist merely to store information entered by users. Using a class to hold that data in a structured and type-checked way is perfectly reasonable.
And they aren't a hodgepodge collection of properties; they're entities that represent a single item in a system whether that be a person, a widget, an order, or a product.
> The relational structure maps cleanly to badly designed objects
That only happens if you are exposing base tables to the application instead of appropriate views, which is as much a violation of good RDBMS design principles as what you describe is a violation of good OO design.
I have had exactly one positive experience with ORMs: NHibernate + Fluent NHibernate config with .NET's IQueryable expression interface. That is the ONLY time I have used an ORM and felt like I had enough control to do what I needed for edge cases and still got a perceived productivity boost; granted, there was what seemed like a large learning curve at the time. That said, I much prefer to just write queries myself and use an object mapper (Dapper, etc.) to eliminate a lot of the boilerplate.
Maybe there are some amazing ORMs for other stacks. But my time with .net was the only time I heavily utilized sql DBs.
Many posts say that an ORM is a good idea for the type safety / vulnerability / composability / portability aspects. But that's actually independent of the object-mapping aspects - e.g. web2py provides all of the above in its DAL without the object mapping. It does sort of abstract over SQL variants with Python query syntax, but no objects are involved.
Nim’s Ormin is not yet fully functional, but it provides the same in a static language with straight-up SQL.
My experience is that the O aspect of ORM is where it doesn’t help (and often gets in the way); if that’s your experience too, consider DAL and Ormin.
This article is valuable since it's raising some interesting pitfalls it's good to know and avoid.
That being said: you need to use the right tool for the job.
It's just hilarious how people expect the new "foo framework/paradigm" to solve ALL the problems... jeez! It's nice to know and understand new paradigms, but you really need to evaluate your case.
ORMs took away the complexity of 90% of web apps. All the "Model X has many model Y". If you step outside of that realm with the ORM then you're officially "fighting the framework", and bad things will happen.
There are ORMs that address partial records, multi-threading troubles brought by lazy loading, uniquing, auto-updating records, and, importantly, put raw SQL on the same level as the query builder. I think of [Diesel](http://diesel.rs) and [GRDB.swift](https://github.com/groue/GRDB.swift/blob/master/Documentatio...).
> ...in order to use ORMs effectively, you still need to know SQL. My contention with ORMs is that, if you need to know SQL, just use SQL since it prevents the need to know how non-SQL gets translated to SQL.
This is presuming that you will never come across ORM being used almost exclusively in any future projects. After all, ORM doesn't seem to be going anywhere (even with all the hate against it). One could make an argument that learning both effectively would provide a better general foundation.
The problem with ORMs is that they're an attempt to hack relational models into languages that lack them. The thing nobody has ever done is add relations as first class citizens to a language. Instead we have only maps, arrays, and lists in our languages and we have to shoehorn richer data into that.
That's why hierarchical document databases have had a resurgence. At least they match the data model of the language.
LINQ isn't an ORM. I assume you mean Entity Framework?
EF definitely has the foreign key issue. We have around a thousand tables, we tried to generate the classes for all of them including foreign keys, problem is that when you create a context that references even only a single table, it will load everything that is foreign keyed including siblings of siblings of siblings, until you run out of memory.
Only way around it is to not set up foreign keys which massively diminishes the value of using EF, so we wound up creating two copies of each table's classes, one with and one without foreign keys. That causes its own issues.
Sounds like you disabled lazy loading and that forced everything to be loaded at once. That's a user error, not an EF fault.
This is a common theme I've seen with people blaming ORMs for being slow: it's the devs not using them appropriately more than the ORMs themselves. Not to say that they don't have their own issues.
With or without lazy loading enabled the result was the same. Generating the structure took seconds and went OOM with enough tables.
LazyLoading impacts what data is retrieved from the database (or more to the point when), this is a structural issue before a query was even sent to the database. It would die while generating the query, not sending the query or populating the result.
You likely should have asked for more information before concluding it was "user error."
> LazyLoading impacts what data is retrieved from the database (or more to the point when)
It impacts more than just that; it will impact query generation as well. If you have a property that is not lazy-loaded, then it will attempt to join the relation or load it via another query in the same round trip. Turning it off tells the ORM that every single time you want A, it needs to go and get B as well. If you have it off universally, it will attempt to load the entire database, or, as may be the case here, crash while trying to generate a query to do so.
> You likely should have asked for more information before concluding it was "user error."
Perhaps, but you've got multiple conflicting accounts of what went wrong, some comments indicate that it returned data, others say it never touched the database.
I have been working with EF for years now. It has its quirks, but this is not something that I have ever experienced, nor have I heard of anything like it before today.
What I have heard of is traversing the entire graph and causing cascading loading of navigational properties. Yes, I have done that. Something like AutoMapper will do that to you, if you are not careful. Been there and done that.
How did you determine that it OOMed while generating the query?
You were doing it wrong then. EF does not eagerly load child collections. You cannot even configure it to do so.
As someone else suggested, you probably had lazy loading enabled, and some of your code tried - e.g. through reflection - to get all properties.
EF has some of its own issues, but you can most certainly create composite primary keys, composite foreign keys, and work with projections right from within the code.
None of the issues the original author had with SQLAlchemy and Hibernate are really a pain in EF.
Some prefer to switch off lazy loading in EF and instead either explicit eagerly load specific child collections or explicitly load them right before use.
In EF you can do
    var customer = Customers.Include(c => c.Orders).Single(c => c.CustomerNo == "1234");
This will load Customer '1234' with the Orders collection eagerly loaded.
Were you doing serialization or something like that that recursively accessed all of the properties in the model? If so, and if lazy loading was turned on (the unfortunate default) then the serializer would indeed keep exploring the object graph until it either loaded your whole database or hit OOM. That's about the only scenario I can think of that would cause such an issue - EF doesn't eagerly load related entities unless you explicitly tell it to.
Sounds like they're actually referencing some of the well-known query-generation issues in EF, though giant queries and long generation times are more common manifestations than OOM, which seems to be an extremely pathological case.
Essentially yes. It created giant object map with millions of objects in it and hit a memory limit (rightfully so, it wouldn't have been usable anyway). It would start at the initial table, get the siblings, then the siblings of the siblings, and so on until it was trying to generate an object for every property of almost every table we had (even if we only referenced the original table).
Some people, when faced with a database problem, choose to use an ORM on top of an RDBMS. Those people now have three problems.
The problems are: the original problem, the expressive and performance disaster that is every ORM ever, and the layered and hidden RDBMS whose peculiarities nonetheless always find a way to leak through the ORM abstraction.
I've been using Django + the Django ORM for a decade, and it has covered the vast majority of database usage: entire applications written with zero SQL, with clean, DRY code describing the data and allowing you to use it in code. And yet other applications written with lots of SQL.
The real question is: why would anyone use an ORM without learning SQL, i.e. how the thing they are mapping to objects actually works?
This topic has been done to death and the rate at which comments have been made since this was posted tells me that we either haven’t learned much of anything as a collective, or we just like to rehash the same talking points because... we can.
Know how and when to use ORMs. Probably learn SQL first. Don’t believe that any shiny bullet is silver.
It's not my quote, but it goes something like: in order to use a layer of abstraction effectively, you need to know one layer deeper.
Honestly what are you doing using an orm without knowing sql? The point isn't to hide sql, it's to automate repetitive tasks. Who goes to learn Angular.js without first knowing JavaScript or html?
What's the current state of object-oriented databases? Any progress since good old Gemstone? Why was Gemstone unsuccessful? Cost? Performance? Tie-in with Smalltalk? Is there any promising OODB being used these days? Because now that would take care of the OO-relational impedance problem.
Yup, take those Repository designs, where you can’t join on two tables unless the stars align very specifically. Of course there are very good technical reasons for that, but isn’t this because the abstractions are leaking onto each other? What’s the added value of the extra layer then?
It's not about avoiding writing SQL; it's about having a standardized API on which all your architecture can count.
Why do you think Django was so successful?
Because it was built in a way that allowed a rich and powerful ecosystem to flourish.
The Django ORM is not the best out there and it's doing plenty of silly things. If you don't know SQL and you use it you will be in a world of pain.
However.
Because Django features this ORM it can:
- provide auto-generated forms from db model, outputting HTML and validating user inputs, saving changes automatically to the DB.
- provide auto-generated CRUD views from the db model, that you can extend at will.
- provide auto-generated admin
- provide tookits and helpers to deal with your data: signals, various forms of getters, native object casting, advanced validation, better error messages...
- provide entry points for extending the data manipulation API, in a generic way (fields, managers, etc)
- provide tooling for migrations
- provide auth and permissions
- provide user input cleaning and escaping
- automatically deals with value normalization: encoding, timezones, text/number formats, currencies... There is one entry point for those where you can put custom code, and you don't need custom code most of the time since somebody did the work for you more often than not.
- ensure all Django projects look the same, so that it's very easy to move from team to team or train people
- formalize the schema, which becomes the documentation and single source of truth for your data, committed to your VCS. Wanna know what a Django project is all about? Check urls.py, settings.py and models.py. Done.
The cherry on this cake is of course the fact that 3rd-party modules (so-called Django "apps") can leverage that, which leads to the amazing ecosystem Django has:
- auto-generate REST views from the model (e.g. django-rest-framework), again tweakable as much as you want.
- dozens of auth backends.
- data manipulation (workflow, filtering, dashboard, analytics) that just work.
- tags, search, comments, registration and all that stuff you always rewrite otherwise.
And because they all use the ORM, they are all compatible with each other. And they all work on MySQL, Oracle, SQLite and Postgres, like the entire rest of the framework, out of the box, for free.
You want to do that in any other framework (except RoR)? You'll get a lib that does half of it and leaves the persistence and API integration work to you. And it will not play well with others. And it will be integrated differently on another project. If you have a lib at all! Oh, and you have to use the proper DB. If you are corporate or a startup, it won't be the same one, and you'd better hope the lib author is in your shoes.
All that stuff is easy in Django because you have a centralized, easy to inspect, standard, shareable definition of each of your model in one place.
That's what ORMs are for. Not "doh, SQL is hard".
Now you could get some of those benefits by creating central models using schemas untied to the DB, such as marshmallow. It would be an interesting take, but my guess is that you will end up interfacing it with your DB through some kind of layer that would look like an ORM anyway.
Disagree on joins and foreign keys: normalized tables make for smaller tables in my experience. Sure, you might denormalize for reporting, but it’s best to normalize and then denormalize as needed, not avoid foreign keys because you have to write a large number of joins.
"I should have learned SQL before dealing with ORMs."
Every single point the author brings up seems to come down to simple database design, lazy development, or not understanding their tools. They really don't seem to have anything to do with ORMs or query languages.
Any screwdriver can make for a bad hammer and some screws may go in with a large enough mallet.
> Perhaps the most subversive issue I've had with ORMs is "attribute creep" or "wide tables"
Normalization of data is required whether you're using a query language or an ORM to access it. The fact that ORMs make it easy to "hide" the fact that you've added 500 columns to a table isn't the ORM's fault.
> Knowing how to write SQL becomes even more important when you attempt to actually write queries using an ORM. This is especially important when efficiency is a concern.
What do they think the ORMs are doing? Magical incantations over the disks? The ORMs are just using queries too. You can write really horrifically bad queries in a query language and also abuse ORMs, but that doesn't make either one bad. Most ORMs can let you see precisely the SQL they are creating. If not, the database will surely log the queries for you and let you know what's going on.
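For instance, with stdlib sqlite3 you can watch every statement that actually reaches the database, whatever layer generated it (a sketch; the table is invented):

```python
import sqlite3

executed = []
conn = sqlite3.connect(":memory:")
conn.set_trace_callback(executed.append)  # log every statement actually run

conn.execute("CREATE TABLE users (id INTEGER)")
conn.execute("INSERT INTO users VALUES (1)")
conn.execute("SELECT id FROM users").fetchall()

# `executed` now holds the exact SQL text the database saw, whether it came
# from raw calls, a query builder, or an ORM sitting on this connection.
```

Server-side query logs (e.g. Postgres's `log_statement`) give you the same visibility at the database end.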
> The problem is that you end up having a data definition in two places: the database and your application.
Welcome to the fact that we have multi-layered technology? There's always going to be discrepancies between the layers that have to be ironed out because no data designs are perfect or future-proof. The author then attempts to bring migrations into the picture as if database migrations are somehow just not a problem if you aren't using ORMs (hint: database migrations have always been tough even in very well-design systems).
> Dealing with entity identities is one of those things that you have to keep in mind at all times when working with ORMs, forcing you to write for two systems while only have the expressivity of one. What this results in is having to manipulate the ORM to get a database identifier by manually flushing the cache or doing a partial commit to get the actual database identifier.
Sounds like a pretty frustrating example, but I've worked with at least 10 different ORMs I can think of off the top of my head, and not a single one of them required "manually flushing a cache" or a "partial commit" to "get the actual database identifier." I wouldn't write this up as an issue with ORMs, or claim this problem would be magically fixed by only writing SQL either.
> Transactions. Something that Neward alludes to is the need for developers to handle transactions. Transactions are dynamically scoped, which is a powerful but mostly neglected concept in programming languages due to the confusion they cause if overused.
Transactions are pretty straightforward and I cannot agree with: "The concept of a transaction translates poorly to applications due to their reliance on context based on time."
Transactions don't care about time at all. They care about order and making sure that things are completed in a certain series of steps. This actually translates very well to applications, especially when you have processes that take a long time, where you don't want something to happen unless another thing happens first.
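A small stdlib sqlite3 sketch of that all-or-nothing ordering (the mid-sequence failure is simulated):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE steps (n INTEGER)")
conn.commit()

try:
    conn.execute("INSERT INTO steps VALUES (1)")     # first step succeeds
    raise RuntimeError("second step failed")          # simulate a failure
except RuntimeError:
    conn.rollback()  # the first step is undone: the sequence is a unit

count = conn.execute("SELECT COUNT(*) FROM steps").fetchone()[0]
```

Nothing here depends on wall-clock time, only on whether the whole ordered sequence ran to completion.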
While a decently-written article, this comes across as someone who learned about ORMs more deeply than databases, discovered the flaws that ORMs have, and decided that query languages must be the only way forward.
This ignores the fact that we created and adopted ORMs after struggling through years of rigid queries smattered throughout code.
Writing bare queries has a time and a place, but ORMs have saved countless hours of development time, and allowed for vastly improved longevity of code.
yes yes yes!
the point is NOT "learn sql"
you already know sql, FINE!
the point is: use sql, not orms!
migrations are best done in pure sql,
i'll contend that model-inflation is best done in pure sql also.
another thing: if the format is json, that's already a nested "joined" blob of usable data! it's what the end result of a sql join would achieve, the client often just has to drill into that data blob and everything needed for the entity in question is already there!
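a sketch of that in python with stdlib sqlite3 (tables invented): one join, nested into the blob the client actually wants:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann');
    INSERT INTO posts VALUES (1, 'first'), (1, 'second');
""")

# One SQL join, folded into the nested shape the client drills into:
blob = {}
for name, title in conn.execute(
    "SELECT a.name, p.title FROM authors a "
    "JOIN posts p ON p.author_id = a.id ORDER BY p.rowid"
):
    blob.setdefault(name, []).append(title)

payload = json.dumps(blob)  # {"ann": ["first", "second"]}
```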
If you are working with Node.js and PostgreSQL, give the tabel ORM (https://github.com/fractaltech/tabel) a shot. It is an unconventional ORM that works with pure JS objects and arrays, instead of "Model" classes, "Collection" classes, and what not. It has quite a few other nifty features too.
How does the ORM detract from SQLAlchemy? I’ve been very happy with it for years, and I’m not clear on where the advantage is in explicitly not using mappers.
You can do both. My preference is to use "Micro ORMs", which provide a thin, lightweight, typed, RDBMS-agnostic API around the most popular CRUD operations but also let you execute custom parameterized SQL when you need to run more complex queries, while still using their fast mapping to populate clean POCOs/POJOs, taking away the tedium of manually extracting the results into a more manageable form.
.NET has particularly nice support for developing typed ORMs by utilizing typed Expressions, which let you parse the syntax tree of the expression (instead of executing it) so you can generate the appropriate SQL that matches the intent of the expression. You can check out a live example of what that looks like for C# in:
Although the development experience is more productive using the rich IntelliSense inside any C# IDE. It's not just the type safety and productivity that typed APIs offer; ORMs also provide built-in conventions for converting RDBMS types into the most appropriate language data type, and their typed abstractions take care of generating the appropriate RDBMS-specific SQL for each supported RDBMS.
A lot of the stigma of using ORMs comes from "Heavy ORMs", which constantly fight the leaky abstraction of mapping a relational data model into a hierarchical object model. I've never seen an implementation of that I liked: they're always inefficient and expose APIs that make it difficult to know what SQL is generated, or which APIs perform hidden perf-killing N+1 queries behind the scenes. Many Heavy ORMs want to maintain entire control over the source code used to interface with your RDBMS. They should be separated from "Micro ORMs", which are loosely coupled: all they need is a DB connection and a Type definition that matches the RDBMS table or schema that's returned, providing a clear 1:1 typed mapping of an RDBMS table to your programming language's Type.
The Types provide a contract your app logic can bind to and given they can map to clean disconnected POCOs/POJOs they can be reused to develop declarative, safe, typed Web Services that can be inferred from the Type's schema saving you the effort from having to implement it: http://docs.servicestack.net/autoquery-rdbms as well as automatically generating the UI to query it: https://github.com/ServiceStack/Admin
Disclaimer: I've developed the above.
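The micro-ORM idea is language-agnostic; a sketch of the 1:1 typed mapping in Python (not the .NET library above) using stdlib dataclasses and sqlite3, with invented table and class names:

```python
import sqlite3
from dataclasses import dataclass, fields

@dataclass
class User:          # the "in code" contract matching the table
    id: int
    name: str

def fetch_all(conn, cls, table):
    """Micro-ORM style: map each row 1:1 onto a plain typed object.
    Sketch only: `table` is trusted here and must not come from user input."""
    cols = ", ".join(f.name for f in fields(cls))
    return [cls(*row) for row in conn.execute(f"SELECT {cols} FROM {table}")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ann')")
users = fetch_all(conn, User, "users")
```

The result is disconnected, plain objects your app logic can bind to, with no session, cache, or lazy loading in sight.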
If your ORM is causing you friction, by all means drop down to custom SQL, but don't use Stored Procedures unless you've identified situations where they provide clear benefits over their trade-offs. They're essentially free-text commands without the support or capability of a proper programming language; they split your logic from your system, making it harder to reason about in isolation, and they don't benefit from the investments around maintaining source code, e.g. development environments, source control, CI, static analysis & compiler feedback, fast unit testing, REPLs, etc.
I wouldn't recommend using an ORM to save you from learning SQL, but rather leverage ORMs to save the effort and boilerplate of interfacing your programming language with your RDBMS; they provide an "in code" contract representation of your RDBMS tables that your app logic can bind to.
I wonder why Git, CircleCI, test runners, and IDEs all have that “reject stored procedure code” logic built into them...
> doesn't benefit from the investments around maintaining source code, e.g. development environments, source control, CI, static analysis & compiler feedback, fast unit testing, REPLs, etc.
ORMs were pretty limited at the time I began writing my first platform, phponpie.com . I remember looking at Propel and Doctrine at the time.
Since then, I've written our own ORM which works in both PHP and Node.js, but we're the only ones that use it. It's been battle tested, though, with millions of users and variations. I would say that many of the issues the author brings up were things we had to face, and we solved them.
1) Schema – the ORM should have a script to regenerate base classes from the database, so that your schema only lives in one place. The nice thing is, after that, your IDE can help you out instead of you writing SQL by hand. It can use your language syntax to catch unbalanced parentheses, and more.
2) Adapters – the ORM should be modular so you can hook in adapters for MySQL, Postgres, SQLite, MongoDB, and various key-value stores.
3) Joins – the ORM is supposed to be smart enough to describe relationships and automatically write the most optimized JOIN queries for you, for example $article->getTags(). You could, of course, implement this stuff yourself manually, but it gets tedious when the code could easily be autogenerated with stuff like $article->hasMany('tags', ...), kind of like this: https://qbix.com/platform/guide/models#relations
4) Insight – using an ORM makes you pass actual values in a structured way, instead of interpolating them into a string. Thus you don't make the catastrophic mistake of forgetting to escape them, allowing SQL injection by Mr. Bobby Tables. Also, our ORM can do SHARDING in the app layer, especially useful in Node.js where it can issue simultaneous queries to several databases and combine the results. Although I recommend using CockroachDB these days :)
5) Flexibility – the ORM should support fetching partial objects, but with the Primary Key so they can be saved back. Recently we even added support for vector-valued lists, something we needed for extra flexibility.
6) Transactions – the ORM should be smart enough to handle transactions, in fact support nested transactions on various shards. Since the database engine usually does not support nested transactions, you need to emulate that in the app layer. For instance, when you start a session, you might want to begin a transaction and lock the session for update.
7) Methods – objects which are fetched can have user-friendly methods added, like $stream->exportToClient() and so on.
It's free and open source. Here are examples of usage:
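As a generic illustration of point (4) above, not of this particular library: passing values as structured parameters rather than interpolated strings is exactly what defeats Bobby Tables, even with no ORM at all (stdlib sqlite3, invented table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")

evil = "Robert'); DROP TABLE students;--"
# The value travels as data, never as SQL text, so it cannot break out:
conn.execute("INSERT INTO students (name) VALUES (?)", (evil,))

names = [r[0] for r in conn.execute("SELECT name FROM students")]
```

The hostile string is stored verbatim as a name, and the table survives.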
fuck ORMs. Here's the thing, the DB queries will make up less than 0.1% of your typical codebase. Spend a week and write a good fucking query instead of relying on your bullshit magic blackbox to do it for you. Even a shitty query is going to perform better than the 100s or (literally!!) 1000s of lines of ORM code which need to happen to generate your "select * from users" query.
seriously. grow up. learn the amount of SQL you need to and napalm any ORM that comes within arm's reach.
and if you're not sure about the SQL you've come up with then just go a few cubicles down and ask your DBA what they think. Chances are they'll write something 1000x better than you came up with and you'll have learned something along the way. win-win.