Hacker News new | past | comments | ask | show | jobs | submit login
Ask HN: What happened to the ORM?
95 points by olalonde on June 29, 2017 | hide | past | favorite | 104 comments
After all those years, I sometimes feel like the whole ORM thing hasn't made progress or been settled. Which ORMs do you like and what do you like about them? What is the sweet spot? I'm asking this because database I/O has always been my biggest frustration when doing application programming. These days, I do more functional programming and tend to side with the ORM haters but I do occasionally wish I could just abstract out the database more. Anyways, I'd just like to hear HN's thoughts on this.

I'm the original author of Sequel [1], an ORM for Ruby. Lately I've been finding that ORM's actually get in the way of accomplishing stuff. I think there's a case to be made for less abstraction in programming in general, and access to data stores is a major part of that.

I believe in most cases the efficiencies that can be gained from using an ORM is quickly offset by a substantial dependency on third-party code, limited control of performance, and inability to express more complex SQL expressions, such as CTE's, lateral expressions, JSONB functions and operators etc.

ORM's also tend to pile on lots of functionality, of which most projects will normally only use a small fraction.

For an in-production system I've been maintaining for the last 10 years, I've recently ripped out the ORM code, replacing it with raw SQL queries, and a bit of DRY glue code. Results: less code, better performing queries, and less dependencies.

[1] https://github.com/jeremyevans/sequel

I don't feel any of those limitations using SQLAlchemy. Most projects just end up implementing a subset of ORM features poorly. I think your experience as an ORM author and contributor may rather uniquely qualify you to write "a bit of DRY glue code".

Good ORM lets you use a small slice of their features without any big performance hit in terms of development or actual. With a nice object model it just seems easier to remember and work with code.

When I need complex SQL I just write a raw query... It is trivial and concise in any ORM to drop down to raw SQL when needed.

I can agree they are no silver bullet, but a good ORM does so many chores for me I can't imagine bothering to do them myself, anymore when the pattern of using objects in my code and storing them in a SQL DB solves the problem at hand.

> I think there's a case to be made for less abstraction in programming in general, and access to data stores is a major part of that.

I'd like to hear more of your thoughts on that.

My experiences with ORMs are less about too much abstraction, and more about them encouraging wrong abstractions. I'm yet to work with an ORM-using project that doesn't employ ORM objects as business model objects. The ones I've worked with always exploded due to the fact that even with ORM, database model doesn't map 1:1 to how the business logic would like to see the world - and the tension of trying to use one set of objects for both purposes at the same time is what messed up the codebase.

> I'd like to hear more of your thoughts on that.

These days it seems that even for the most rudimentary programming tasks everybody reaches straightaway for their favorite framework/lib without too much thinking.

I think in a way all these frameworks and libraries have had a negative impact on the general quality of coding. Programmers end up with limited knowledge of how the underlying technologies work: the HTTP protocol, SQL, the DOM...

And because these frameworks tend to grow over time, in an effort to be comprehensive and serve everyone's needs, they introduce more and more dependencies, more and more tooling, more and more code, which you'll need to get to know sooner or later.

Programmers who use said frameworks and libs become more and more "hooked", start using more features than actually needed (do you really need that auto generated REST API for that 3-page dry cleaning website?), and in the process become, well, dumber.

You can of course take this argument to absurdity by claiming that everybody should just write machine code. That's of course not what I mean. I just think that for many programming tasks, programmers would be better off - and no less productive - just by attacking the problem head on, with a good understanding of the technologies involved.

Sometimes a simple handsaw is more efficient and less hassle to use than that fancy sophisticated table saw you got in the corner.

Sequel was the last one that I used and was the best one. I can't remember the talk..."ORM's are the Vietnam of programming".

You are right, very little of the ORM gets used in the end and one almost always needs to write custom SQL anyway (models don't always map 100% to app layer). Most importanly, as languages change, evolve or dry up it's easier to find or train new talent with SQL experience vs. a language specific ORM.

>I can't remember the talk..."ORM's are the Vietnam of programming".

Had seen that article a while ago. Saw a few other mentions of it already in this thread, but here are a few I googled for and found:

http://blogs.tedneward.com/post/the-vietnam-of-computer-scie... (the original, maybe)

HN thread:

Object-Relational Mapping is the Vietnam of Computer Science (2006) (codinghorror.com) https://news.ycombinator.com/item?id=7310077


I haven't been a big fan of ORM's even before DataMapper in Ruby. I found that it was too easy to make non-performant queries and found myself writing raw SQL anyway.

Some argue that using an ORM means you can switch underlying database technologies on a whim. I think this is an incredibly weak argument. How often do people truly switch database technologies?

I created a small wrapper around the node postgres library to make querying a little easier.

Have a look at https://github.com/joeandaverde/tinypg - It's a no frills library that makes it easy to execute SQL files as prepared statements and pass parameters as keys on objects.

I've been battling a lot of cases where I don't want to duplicate SQL statements in my application. For example I have a query defined which I want to add a WHERE clause to. How do you handle these cases elegantly in code?

Some databases treat the query themselves as components that can be chained together before execution, look at RethinkDB and ReQL. For SQL, there are lots of ways to make dynamic sql (generated by code) and while the patterns are many the implementation in the language often looks bad and breaks most sql injection protections when using db middle ware.

Would love to see the DRY glue code.

A lightweight PostgreSQL-based Ruby object store using JSONB, in ~ 100 LOC:


ORM's are good form, why?

- Avoids mistakes when dealing with writing raw SQL queries (SQL is quite repetitive in practice)

- The declarative nature of classes maps well to types and relationships

- The declarative nature of classes maps out well to tables, even with polymorphism [1]

- Keeping "Models" in an ORM often maps out well to migration utility (Alembic, Django Migrations)

- Object-chaining map very well to queries

- ORM objects can be reused and composed

- They can abstract out intricacies across SQL dialects

- They can potentially make it easier to migrate to different SQL servers if no specialized features were used

- Can help avoid common security vulnerabilities like SQL injections

- When something can't be expressed via ORM relationships, they tend to allow the dev to drop down to raw SQL. In the case of SQLAlchemy, there is a core query language [2], too.

- In the case of Django, QuerySet is used as a standard throughout extensions that power a whole community. Plugins that don't even know each other (e.g. django-filter and django-tables2) can operate on the same django queryset to filter/search and sort/display data.

I mention QuerySet/Django ORM quite a bit in a recent blog post at https://www.git-pull.com/code_explorer/django-vs-flask.html.

[1] http://docs.sqlalchemy.org/en/latest/orm/inheritance.html [2] http://docs.sqlalchemy.org/en/latest/core/

> Object-chaining map very well to queries

This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.

The closer an ORM's API is to a monad, the happier you'll be. Fundamentally, accessing the database is executing remote code; the more you can package up into the query before it goes off and does anything, the better performance you'll see.

IMO trying to shoehorn objects (in the OO sense, with polymorphism, data hiding and behaviour) into a database is wrong-headed. Data hiding in particular is a wrong mental model for thinking about facts in the database, and the more is hidden, the harder it will be to reason about performance and bulk remote operations generally.

> This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.

That isn't true for e.g. Django's ORM, which lazily evaluates the query and only actually accesses the db after that, with a single query, and filtering done in SQL.

This is not the case - with Django's ORM you are still susceptible to the N+1 problem. It can be largely mitigated through `select_related` and `prefetch_related`, but the fundamental issue is there - that simply by accessing an attribute for an object that wasn't already fetched, you can do another database query without it being at all clear in the code that this will happen.

That's true, though select_related is mostly the default, and that behaviour is documented well. In practice, I very seldomly find N+1 cases with Djangos ORM, even when doing relatively complex data analysis.

This didn't use to be the case, I remember a single line of Django orm causing several hundred db requests, but maybe they improved it.

When was that and what was the form of the query?

> Fundamentally, accessing the database is executing remote code; the more you can package up into the query before it goes off and does anything, the better performance you'll see

The queries that most tend to be making just aren't that sophisticated. Relationships tend to be basic.

And I haven't even mentioned stuff that'd really, really hard to express/manage in pure SQL like tree/nested information that has to stay balanced [1]. Thanks django-treebeard/mptt and sqlalchemy-orm-tree.

> The closer an ORM's API is to a monad, the happier you'll be.

Developers using ORM's simply aren't caring about ORM's matching a certain construct. They care that models emit correct representations of their schemas and that the data is retrieved "fast enough".

Take it a different way: the best part about ORM's? They're effective 95% of the time, right out the box, so you end up avoiding time that'd be spent over and prematurely optimizing.

> IMO trying to shoehorn objects (in the OO sense, with polymorphism, data hiding and behaviour) into a database is wrong-headed.

Objects in things like SQLAlchemy declarative and Django Models map perfectly to generated SQL, so they also act as a way to generate migrations. It's that accurate. A lot of the relationships project's need expressed tend to be vanilla joins.

> Data hiding in particular is a wrong mental model for thinking about facts in the database, and the more is hidden, the harder it will be to reason about performance and bulk remote operations generally.

ORM's strive to hit a value sweet-spot in terms of code expressiveness, reducing duplication, handling the bread and butter relationships and types. That covers what most developers really need.

Perhaps there are projects out there not fitting to ORM's. Not all projects are sophisticated data mart projects, but even then, a good share of those still go back to simple joins at the end of the day.

And I've even gone as far as trusting heavy-duty stuff like django-mptt, along with plugins that filter and sort. I don't even look at the queries, all I see is they're running performantly. In all these years, SQL queries have never been a bottleneck. Maybe it's because I'm only storing simple stuff.

[1] https://en.wikipedia.org/wiki/Nested_set_model

An ORM provides easy DB access and various implicit operations. The trade off is that if you don't know what you're doing, your app gonna have bad performance. Seems fair to me.

Also, Django ORM documentation is very clear and you can use Django Debug Toolbar to analyse the raw sql generated.

>This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.

Only if you use naive ways of doing this. In .NET at least your overcomplicated expression can be compiled down to the minimum query needed to pull the bits of data you use.

It's not magic. I've personally removed dozens of N+1 queries from code written by one of my fellow consultants caused by misuse of Entity Framework. Here's a reference: https://msdn.microsoft.com/en-us/library/jj574232(v=vs.113)....

At least with EF it's possible to completely disable lazy loading. I always recommend doing so -- when you're passing entities between methods it's pretty easy to lose track of what's implicitly 'loaded' on the object. Innocent changes to a method that operates on an entity can cause a database round trip...horrible idea.

But they're unnecessary baggage, because your top six reasons all work in reverse.

If you define your schema in your database and derive your data layer from that, you get everything in your list (apart from that thing that totally always happens where you switch your underlying database technology every few months).

But then you don't have your database defined in two places. And if anybody ever does modify the db by hand, your build will break and it will quickly surface itself as an issue at compile time instead of via an obscure error message somewhere 40 levels deep in the call stack.

How do you in practice derive the data layer from the schema? Somehow parsing the schema in the application code?

Personally, I generate base classes for table objects at compile time.

You can look at a table and its keys to determine what sort of thing it represents (entity, lookup, association, etc.) and build out helper stuff as needed. So in addition to basic CRUD anything that looks like an entity gets .Load(), .Save() and accessors for fields as well as .GetByWhateverID() methods for any foreign keys. I base my actual Entity classes off of those auto-generated base classes, so they can get blown away and recreated as often as necessary.

I also wrap any one-off stored procs that are lying around in calling code, so that they can be used in place of the standard-issue Frankenstein SQLBuilder thing that an ORM would have.

It's kinda all the upsides of an ORM, but without any dynamic garbage, schema-as-config-file, mystery auto-SQL, migrations, or (again) Almost-SQL-In-The-Magic-Query-Languague to Not-The-SQL-I-Meant conversion.

Thank you.

Yes, with Django I would definitely stick to ORM. But when working in Golang or some other new language, unfortunately there isn't any ORM library that good. So people usually stick to SQL.

I always liked Django's ORM because it didn't even try to do any complicated stuff, it forces you to drop down to raw SQL instead (I haven't used Django in years, hopefully it's still like that).

Compared to something like Hibernate or SQLAlchemy that tries to support everything under the sun and can result in a lot confusion when trying to understand what exactly it's doing.

They're still alive and well, just not sexy anymore. The most popular one is probably still Hibernate which is helped by Java being the top high level language.

I never understood the ORM hate. Every place I worked we intermingled raw SQL when needed. Hibernate has a way to clear query caches so you can use raw SQL when you need to. You can just write raw SQL exclusively if you want within hibernate so I don't get how you could lose anything :) .

Still, my experience is mostly with Hibernate. It's extremely mature, meaning reliable, feature complete, and only 0-30% slower than raw queries in most cases. It makes adding support for things like multitenency and audited tables a breeze without losing database agnostic behavior. It makes database upgrades a no-brainer 95% of the time too. It has a built in memory caches that help enormously with common queries. Probably the biggest thing is it makes the database strongly typed so it's possible to refactor. Code refactoring in Java is easy but raw stringified SQL is nearly impossible to fix in any language.

I think the biggest counterpoint to ORM is shitty ORM. Things like SQLAlchemy generate downright horrific SQL slowing everything to a crawl and causing deadlocks. Another honest counterpoint to ORM is the learning curve. Everyone is taught SQL but the ORM is more abstract and harder to reason about, not a fun thing to learn.

TBH I think most ORM's are just poorly done. Putting an object abstraction on a relational database is hard. The only ones I've enjoyed for completeness and performance are Hibernate and to some extent Entity Framework. EF being the easiest to use but a bit slower with less features.

I have heard good things about Dapper but never used it. I like the idea of injecting a SQL DSL into the language itself, wish it was more prevalent.

I think your experience reflects that basic gist I've gotten from most professionals: ORMs are complicated and convoluted because they provide a solution for a very complicated problem. If you are somewhat new and don't understand things, it's easy for you to see the few problems introduced by ORMs and think that it means that the whole concept is just unnecessary bloat. With more experience and wisdom, you understand that the few quirks of ORMs are a necessary evil for having a software that solves such a complex and flexible task.

That's a good analysis of things, and on top of that, when you're new you'll treat an ORM like a hammer in search of a nail. So someone who's less experienced and doesn't realize that you should supplement the ORM with raw SQL queries might try to evaluate complicated queries in the ORM, with the usual results. Some of that has changed with e.g. LINQ and IQueryable in .NET, but it's no panacea.

>> Things like SQLAlchemy generate downright horrific SQL slowing everything to a crawl and causing deadlocks.

Not my experience... Would you care to explain ? SQLA works pretty well for me, as long as I stay in rather simple queries. For example, data analysis queries are next to impossible to express with SQLA, but that doesn't matter much.

For the simple update queries, it just works fine for me. It also allows for very good control on relationships, etc.

I've always had success with ORMs when I let them write the 95% of trivial queries (SELECT * FROM table, INSERT INTO table, UPDATE table SET ... WHERE id = x). The complex queries I'd always rather write by hand to take advantage of the expressiveness of SQL, and to make sure that queries hit my indexes correctly.

I totally agree. Mr. Fowler's article lays out a good case for ORMs being a good, leaky abstraction with manholes for diving deeper when necessary. That implies that we can't, as application developers, use it as a black box, because underlying knowledge is important. Until someone comes up with a magic algebra for paving over the details SQL, we're stuck mitigating or dealing with its realities.

One of the things I've recently prioritized is the ability to run arbitrary queries (generally read-only) at some sort of REPL prompt, for debugging and investigative purposes. ORMs and their offshoots offer a great deal of support in that area because you can converse with an ORM in a domain-specific way without dropping into raw SQL. That can be a big advantage.

Maybe things have changed, I haven't used it since school and needed an example :-/

Amazing how perception works, everything you said about Hibernate and SQAlchemy is what I think about them, only switching them around. At least for the one time I had to use Hibernate, which was two years ago.

And not just because Java is an odious, ugly language, but the SQL Hibernate generates I just didn't like.

Mind you, this is with complex queries. For 90% of them, I didn't notice any difference between them, as far as generated SQL goes anyway.

It's now called GraphQL. Let me explain.

First, the need for data has moved from the server to the client side in API-centric architectures. The backend is often just a dumb HTTP layer on top of a (sometimes relational, sometimes not) data persistence. Heck, I even built apps without server, using PostgREST.

Second, on the client side, we don't need an object-oriented interface to the datastore, because JavaScript. But we do need a powerful declarative query builder, and static data types to build up on. GraphQL provides that.

Third, if the purpose of the ORM was ever to abstract the SQL dialect of the database, GraphQL has resolvers, which turn a GraphQL request into a request for the database of your choice. So it does that job, too.

Last, for the data crunching jobs that you decide to do on the backend, you often need specialized data stores (event store, document store, full-text search index), and ORMs don't address that. And since these jobs are often resource intensive, you need the best performance - therefore falling back to raw SQL in most cases.

ORMs were a good solution for the kind of software architectures that were used for the web in the 00s. Since then, we have different needs, which require different solutions.

Disclaimer: I was the maintainer of a popular open-source PHP ORM [1] for 3 years. I no longer use ORMs.

[1] http://propelorm.org/

What have you used PostgREST for?

Public and private web apps for a French media company. Dumb backend, most of the logic was in the frontend (React) and in async workers (Node.js, with direct access to the database without ORM).

Serious question, if you do most logic in the frontend then where do you do the server side input validation and where do you do the business logic? If I fumble around with your client side code and you don't check that then I could theoretically do whatever I want in your database.

What happened to the ORM?

To my opinion, it was a bad solution to the wrong problem.

For one, we're not that enamoured with objects anymore (what with functional programming, immutability, etc).

Second, SQL and DDL, being declarative, is both a higher abstraction that (at least) most ORMs, and offers more fine level control to the DB at the same time!

Third, people don't really switch databases that often, for the abstraction between different SQL syntaxes to matter.

> Third, people don't really switch databases that often, for the abstraction between different SQL syntaxes to matter.

But third-party libraries shouldn't assume any specific SQL database, so if you want libraries that can do database things, ORMs are very useful.

They can always do SQL abstraction without doing the object-relational mapping.

this is what you want: http://www.jooq.org/

An old article, but still somewhat relevant I think:

"ORM is the Vietnam of Computer Science" (2006)


I thought we've been through :-)

"ORMs are a thing of the past" (2009) http://codemonkeyism.com/orms/

"Be careful with magical code" (2010) "[...] ORMs and in particular Hibernate [...]" http://codemonkeyism.com/beware-magical-code/

My experience is strongly flavoured by developing and optimizing on the edge of what the database is able to do; and my performance condition isn't scaling up millions of tiny trivial queries, but getting big hairy dynamically generated queries to execute efficiently within a deadline.

No ORM comes remotely close to being able to solve this kind of problem. The kinds of things I need to consider:

- analyzing select / predicate / order / etc. for tables used and selectively including tables

- forcing query evaluation order by creating a subtable query with conditions, projecting out an id, and rejoining in the next query up

- analyzing predicates and rewriting predicate trees to push predicates down into subtable queries, or convert predicates into joins, or replace references to columns on other tables with denormalized columns

- gathering heuristics about query set size and changing query generation technique

- splitting queries into parallelizable queries and composing result set client-side

An ORM doesn't really help me. Even a library tuned for symbolic manipulation of SQL doesn't help hugely; my life is made much simpler when I can freely extend the SQL AST classes with my own analytic and rewrite methods.

Isn't this the type of thing that 'the' (or a) query optimiser is meant to do?

With a suitably tweaked ORM, you can pass platform-specific hints down to the DB. The platform's query optimiser should be (obviously up for debate!) the one that's taking a query and working out the most efficient way of executing it.

Not saying it always will always pick the best execution, but if you're getting to the point of supplanting the optimiser you're starting to replace the database itself.

At which point you're probably beyonds the realms of what an ORM can and should reasonably doing, you're starting to writing your own custom persistence platform.

No, query planners are generally hopeless; for my use cases, it's trivial to get orders of magnitude better performance with different query structures.

If you have just a predicate, or a single join, most query planners are just about ok; with multiple tables and predicates touching those tables, there's ever increasing risk it will start the join on the wrong table and produce too many rows somewhere in the plan. Engines like PostgreSQL are unpredictably stupid (but sometimes very smart), while MySQL is predictably stupid (but consistently fast when you keep it simple and avoid specific constructs).

When designing for interactive use, you want to design for a maximum run time on the query. If that means using a query which is often not the fastest, but is never slower than X, it's the query to use. Smart planners are more of a hindrance than a help, because they reduce predictability.

The worst case is when a planner starts with a small row set (planners usually like starting with small row sets) and joins against a big table expecting a similarly small joined set, but ends up with an enormous result set because of the cardinality of the join. Estimating join cardinality is a tough problem for database vendors to solve, and it's critical to figuring out the correct join order - particularly since the number of join strategies grows with the factorial of the number of tables to join, so the planner can't spend much time on it. Meanwhile, the application developer may very well know the designed cardinalities and a join sequence that minimizes the chance of a blowup, the most important thing for interactive use.

Well yes, this is exactly what a query optimiser is meant to do.

What you're meant to do is add a few indexes, maybe sprinkle some hints into your query (platform specific), based on your knowledge based on your rough knowledge of the statistics of your data structures and let the optimiser/planner work out the best way to execute it.

Now yes, the optimiser may sometimes get it hilariously wrong, but I guess that's the price you pay for a relatively portable (across DB platforms, with some tweaking of the hints) solution.

The risk of going too far down the rabbit hole of dynamically re-writing the SQL is that you're essentially trying to second-guess any query optimiser/planner and for most of people making software they're going to do a poor job building a 'shadow optimiser' based on 'experience' of a specific platform and how it behaves in observed scenarios. If the people building databases are finding it hard, there's probably a reasonably good reason.

I'd also note that this isn't really a problem with ORMs per-se, but with relational DB platforms and their performance when running arbitrarily complex SQL.

You can concretely measure how much time it takes to scan X million rows of a table; how much slower it is for every join you add; how much slower an index is to scan for every column that's added to it; how much slower it gets for every column you add to the select clause; etc. And determining optimal join order in my case isn't particularly hard (it's a lot like a star schema; the key thing is not to bring in the whole of the big table in the middle).

I was a compiler engineer in my previous job; transforming and optimizing expression trees are my bread and butter. I'd give myself good odds of improving MySQL for my particular use case - I'd know where to look, what stats to collect and how to approach things differently. But likely my optimizations would be pretty specific to my situation, and not everybody would be happy with more predictable, sometimes slower performance.

The query optimizer faces the same limitations as an optimizing compiler - improving the query as much as it can given limited information, in as short a time as possible.

And it's frequently unsuccessful, for the exact same reasons as optimizing compilers - proper query optimizations require more information and time than what is available to those optimizers.

Hand tuned code created by an expert will always be faster than that produced by an optimizing compiler; hand tuned SQL created by an expert will always be faster than that produced by the query optimizer.

They still have their uses, particularly with largish LOB/enterprise apps. For me they are no longer the goto solution for apps I write though, if I started building one today I would start with an "ORM lite" (never liked the term) like dapper. If it ever got big and complicated enough that an ORM was justified then I would introduce one.

I find an ORM really excels when you've got complicated business logic, the sort of thing where there are dozens of business rules that may or may not be involved in any given installation. Something like this is extremely stateful and and ORM does a great job of tracking this state. In fact, this is what I'd define an ORM as, a state tracker for database modifications. If you don't have much state then you probably won't get much out of an ORM.

Even when using an ORM does make sense, it's important to remember that they don't make sense everywhere. For a performance critical or particularly complicated query then SQL should still be a fallback.

Not sure in other platforms, but in the .NET world, ORM(read Entity Framework) usage skyrocketed and improvements are made with each version of .NET released. I use Entity Framework. Both at work and when doing my home projects. Code First. I started with Database First and only recently moved to Code First. It feels so good modeling my database using C#.NET classes instead of doing it in SQL. Don't get me wrong. I do like SQL and I am very good with it, but I prefer C# better.

Speed of development greatly improved since I switched to EF. Speed of development is what matters most for me when building first version of an application.

It could be that most haters are simply using the wrong one? If JPA was my sole experience with ORM I would side with the haters as well, fortunately I built enough successful projects using DBIx::Class and SQLAlchemy to understand the advantages of using an ORM to the point that I refuse working without one.

I still use ActiveRecord a lot (Rails) and I would not write SQL queries manually since the ORM works so well. But I used ORMs in other languages and they were not as convenient so maybe it depends on the ORM itself. The time it saves compared to standard queries is massive.

ORM haters would likely point out that complicated queries are hard to write with an ORM but these complicated queries are not a massive part of the work and can still be abstracted away in a method using SQL if needed.

Another good point of the ORM: validation of data. There are plenty of validation you cannot enforce in SQL (regex, postcode...).

> Another good point of the ORM: validation of data. There are plenty of validation you cannot enforce in SQL (regex, postcode...).

Those particular examples are trivial to enforce in every DBMS I am aware of...

Dbms constraints are a usually a subset of constraints that are supported by ORMs.

The big question is when to use ORMs/raw SQL.

SQL is the most concise and perfect fit for RDBMS.

However, at the application level there are benefits of using ORM.

- The application itself is usually imperative style as against the declarative nature of SQL.

- Chaining is sometimes more readable and concise. One can chain dynamic filters.

- Abstract the underlying data model with higher level names. SQL eq. of table views.

- Hides the underlying relational model. Which can sometimes be helpful in a large code base. And sometimes a curse.

I normally opt for ORM in Rails/Django web apps. But SQL in

- Performance critical - Report generation, where it might be complex and declarative nature of SQL shines.

I have the same opinion now that I did at the beginning. They provide a decent abstraction of the database. But that abstraction leaks.

Just a week ago, I ran into an issue where SQLAlchemy wouldn't support a "in" clause against a list of Model objects. There's a leak that I had to patch by using a list comprehension to generate a list of integer IDs from those objects.

Then I ran into a "you're doing a check against an empty list; we are pretty sure this will come back with no results, but we're going to run the query anyways." Another leak (and it's a valid one), but I have to manually work around it if I don't want that extra query run (I didn't).

Finally, we get the script to the point where it's passing all of the unit tests, and package it up for the integration tests. Those integration tests fail, because SQLAlchemy dynamically imports DB connecters based on the DB being used. The unit tests were running in SQLite in-memory DBs, the production Aurora, requiring a connector to be installed. Another leak.

Ultimately, ORMs can be useful, but they are an abstraction which leaks all over the place. So long as you're OK with the tradeoffs (the rest of my team was; I'm ambivalent, though I was the one who ended up fixing all of these leaks), they're another good tool for your programming toolkit.

I think part of the reason they are not talked about anymore is that people like to talk about the hot and sexy and not the mature and incrementally improving. The hot and sexy now is full of Javascript and noSQL so ORMs aren't as relevant as before. ie: schemaless JSON + dynamic languages have eaten a big piece of the cake.

That said, people seem to still use ORMs in Ruby, Java, C#, etc. and frameworks look fairly mature.

The point of an ORM is not to be some magic 1:1 powerhouse, but to make it easy to abstract your storage layer. If I had the time and resources, I would make storage plugins with testing interfaces and dependency injections that precisely match the query patterns of my application. ORMs are almost never an ideal fit but they give me that for free out of the box.

I was part of the team that developed TopLink, later EclipseLink that became the reference implementation of JPA. After I left Oracle, almost all of the original 30+ Engineers/QA/Managers were moved to a different project. The git-repo is infrequently updated by only a few support Engineers. From my point-of-view this means that the de-facto Java ORM is Hibernate (even though we were first!). It is very mature, well-tested yet still getting some 'love' (c.f. recent changes for Java 8 streams).

Other posters have mentioned the 'Vietnam of CS' post from many years ago - the main message I would like to highlight is that ORM's are deceptive: it appears easy to create one, but once you spend some time with how people actually use databases, the real-world starts to drag you down (the 'leaky abstraction' meme thrown around). An example of this is: how many ways are there to map a One-to-One relationship (hint: greater than 5)

Answer: nine ways to map 1:1

See slides 20, 21 and 22: http://images.slideplayer.com/9/2645248/slides/slide_20.jpg

I currently think ORMs might be an anti-pattern, but I'm interested in what other folks think. It's the same kind of anti-pattern that led us down the path of taking Object-Oriented Analysis and generating an entire stack using MDD.

The thing is, you should really only want your database to do a small number of things, and those things you should pay special attention to. You don't want a database that does everything and all runs on autopilot. Then who the heck is coding the system?

I think most any app has a reasonable number of features which map to an also-reasonable-number of function calls back to persistent storage. Identify those and make them first-class citizens in your coding. After all, aside from UI dazzle, that's where all the magic happens.

Nothing happened. 99% of scenarios are a great fit for a tool that maps and moves data from the object-oriented programming language to relational databases. They decrease errors, increase productivity, and for the most part well.

Most of the complaints seem to be from using exceptionally bad ORMs or using them in the 1% of scenarios where performance is critical or very complex queries must be produced.

There really needs to be less blind hate and more understanding of when and where tools should be used (as well as the general quality of the tool itself). That's one of the basic tenets of programming so it's rather unfortunate to see so many misguided arguments here.

One of the issues is that programming languages are evolving so fast it is hard for ORM libraries to keep up.

A good ORM library needs to be well designed, battle tested and mature. Then it provides great benefits over writing just raw SQL. That takes years of development effort.

The problem is that with new languages (Node, Golang, Rust) there hasn't been enough time and effort put into ORM yet to produce something good enough to be usable.

When developing in Java or Python there are great, mature and battle tested ORMs I would use but with these new languages I had bad experience with ORMs as they still seem experimental and have issues therefor stick to raw SQL when working in newer languages.

there's raw SQL, there are ORM and then there's the mid layer of DSL-ish query builders like sqlalchemy.core, which are amazing if you've ever been dissapointed by not being able to compose queries in a sane way in SQL.

I come from a heavy C# background. I can say that MS has been revising their Entity Framework platform and it's a perfectly good ORM tool to use.

Being stubborn and old-school, I prefer the lower level roll your own approach for speed and efficiency. That said, if dealing with an enterprise application where ease of code mendability/adaptability is key, I'd happily go with something like Entity Framework.

It would be interesting to see if code generation tools have progressed in this domain at all. I remember several data layer class generators which although not exactly an ORM, could provide many of the same benefits.

Using just SQL or using an ORM is not a binary choice (and I'm not talking about the ability to drop down to SQL in an ORM). There is a middle or third option: libraries that wrap SQL (with a thin or thick layer) and provide a somewhat higher level of abstraction. Often called Data Access Layers, or similar names. SQLAlchemy Core is one such, and the pyDAL of web2py is another [1].

[1] I had blogged briefly about pyDAL here: pyDAL, a pure Python Database Abstraction Layer:


I'd worked as the team leader on one such database middleware product, written in C for Windows, some years ago. It was successful, and somewhat widely used in projects for clients at the company where we built it.

Such products can enable you to write less boilerplate SQL code, and also to work at a somewhat higher level of abstraction, thereby providing some productivity.

I agree with some of the other comments here, that 1) such abstractions can sometimes be leaky, and 2) SQL is not hard to learn (for the basics and a bit more - hell, it used to be considered bread-and-butter tech), but for some reason there seems to be an impression that it is tough, from the last few years - sometime after Web 2.0, maybe.

I think the answer is to take a hybrid approach. Use an ORM where it makes sense. You shouldn't need to write SQL to do simple CRUD and elementary queries with a join or two. But when things get more complex, you should be able to easily drop to SQL, and have the results mapped back to an object. Perhaps also a DSL that allows you to write queries in code. I have yet to find an ORM that can elegantly accomplish this, although SQLAlchemy comes close.

I never seen orm as abstracting database out. More like easier to maintain way to access it and have features like caching/consistency available without much work.

Take a look at Ecto (kind-of-orm for elixir), didn't even hear about anything comparable for other languages (which is sad)

* No impicit N+1 and stuff like that (elixir is functional with immutable variables)

* Whole "orm" is really fast, thanks to macros (compile-time metaprogramming)

* Can express very complex queries, allowing raw sql then necessary


I agree, Ecto is a great.

Ive always appreciated ORMs, but disliked how people tend to forget about the database behind them.

Ecto gives you a lot of power and flexibility, and its very easy to remember there is SQL behind there.

Compose-able queries are awesome.

Schemaless queries are awesome.

Not tying your "model" or "context" directly to a table or letting it be a portion of a table is awesome.

Backing "validations" with database constraints is awesome.

That Ecto team kills it.

Not just allowing raw SQL - allowing raw SQL fragments in your code, mixed in with regular ORM usage! Turn those fragments into a function and you have nicely reusable components with which to build advanced queries.

It depends a lot on the project. I think learning SQL and basic database theory are a strict requirement. ORMs are a great tools to help make life easier, but you'll need to reach in sometimes in order to fix performance bugs. I've learned to love both.

ActiveRecord is amazing. It makes it really easy to get stuff up and running, and it works for a wide range of problems. There's nothing better for typical CRUD operations, especially when you models map mostly 1-to-1 to your tables. You start to encounter problems later on as the application grows, especially when you have lots of associations, complicated queries, or your models grow too fat.

Maybe your problem isn't really with ORMs, but with MVC or whatever application architecture you're using. For example, Rails has been criticized because of its fat models and its overall "flavor" of MVC.

Trailblazer [0] was created by a Rails critic as an alternative approach for structuring your web application. I won't go into the details, since the website already does a good job explaining. But he effectively kills the traditional model and shifts the persistence layer around, which is a fairly common pattern.

I remember reading book that discussed ASP.NET's approach to MVC. I don't quite remember all the details, but it mostly amounted to having four primary layers (presentation, application, domain, infrastructure) which were further divided and given specific responsibilities. This approach is overly verbose and complicated for some cases, but the abstractions appear to make more sense as the application increases in complexity. A small example is an application in which you have to break up a model's data between multiple storage services like Redis and the RDBMS.

You'll probably like Ecto [1]. It's an ORM for Elixir. After the 2.0 rewrite it's a lightweight embodiment of some of the concepts mentioned above. One of the key insights is that you'll want to separate business logic from data and presentation. But it really depends on the project! It might help or it might just add more boilerplate.

[0] http://trailblazer.to

[1] https://hexdocs.pm/ecto/getting-started.html

I never understood the ORM hate/bashing. It's like the framework fanboy paradigm. "I love framework X, and framework Y sucks!"

They are tools built to help solve a particular problem.

ORMs are great when dealing with CRUD operations on entities and remove the need to write tedious boilerplate code every time. On the other hand, ORMs can be misused and cause all kinds of performance related problems. Most obvious use-case is developers that don't have much experience with databases, and instead of choosing to perform processing of data in the database run-time, they fetch data through the ORM and write poor performing code riddled with business rules in the application layer.

I think it's good to take a step back and understand that the ORM is a tool, which when used for the purpose it was designed for, works well.

Complex joins and queries can still take advantage of an ORM. One example is using materialized or non-materialized views, which can then map to an Object using the ORM to query it.

That way the bulk of the processing is done in the database runtime, and simply accessing the object is done through the ORM.

There's also the separation of concerns. A seasoned developer will opt for creating a well defined schema, and baking business rules into the database layer, because he knows that a well designed, and highly normalized data model will outlive any application where the business rules are housed in the application layer.

All things considered, I do love the ActiveRecord ORM design-pattern. I've worked with hibernate, JPA, and some other ones in the past, but ActiveRecord as employed by Rails and Django, is still my favorite.

Nothing. They're just as bad an idea today. I think perhaps that message had begun to sink in.

Developers have this unreasonable fear of the SQL language, and I can't really understand why. Sure, it's different. But then so is CoffeeScript and everybody seemed fine learning to draw little arrows in their code for the six weeks that was popular.

For me, SQL is the perfect language for describing and working with databases. "Computer, give me a list of Students who are taking Algebra 101". But developers really seem to prefer taking that whole Student list and spinning through it themselves, looking up every class everybody is taking and building their own list thank you very much.

And instead of using a simple, concise way of defining data tables (that follows that same elegant data retreival and manipulation language), devs would much prefer something with angle brackets.

So they never bother learning the easy way of doing things and spend several years inventing something more complex and opaque.

Eventually, I guess enough people got fed up of dealing with the silliness and picked up a book on SQL.

I'm not sure which ORMs you worked with (any?), but that's not how they work. You don't pull the whole table to filter it in your code. Any non-toy ORM is perfectly capable of filtering on a known attribute or relation of a class on the db side.

There's some irony in the fact you're making fun of devs who don't want to learn SQL, while you don't understand ORM...

Was it ever about avoiding SQL entirely? I thought it was about avoiding fragile SQL that's hardcoded to your specific data structures?

SqlAlchemy and the Django ORM are good - recently worked on a C# project and the devs are very anti-orm.

I think, the REPL makes a huge difference.

In a python project, getting a shell at any point - or experimenting in the shell Jupyter is really straightforward.

The brackety languages have options for this sort of thing now, but it is still just a lot more hassle - the easiest way to get something similar is to just write SQL.

ORM gets in the way of programming.

ORM abstracts which something developers know as his/her fundamentals : SQL statements. For simple operations this abstraction doesn't hurt. But this abstraction becomes obfuscation when we write complex queries or statements.

First a developer has to figure out complex SQL query to solve a business problem. Then has to figure ORM equivalent to it. Then ,ofcourse one should, write original SQL query in comment section above ORM API query to help figure whilst debugging.

Once into debugging, if original SQL query is missing in comment section or hasn't been updated. Spend extra time on what is wrong with both queries. Which one is right/wrong and then how to solve the problem.

Although, I do see some value in ORM in migrating to another DB or abstracted objects of tables-and-properties.

But for simple reason of learning additional set of apis to run SQL statements hurts!

We write science and GIS related interactive web applications. All our backends are Django because every application has some CRUD components somewhere and in general Python and Django have fantastic libraries available.

The Django ORM just works, I love it, it's never in our way and does what we need.

Nowadays I'd say there's something better than ORMs, take Slick (Scala) for instance, it's an FRM (Functional Relational Mapping) or JOOQ (Java). Both of those provide pretty much 1:1 DSL to SQL while at the same time granting a lot of flexibility and type safety.

ORM is a leaky abstraction. It does provide benefits, like avoiding SQL injection, however, it still don't have the level of flexibility as SQL, and probably will never do.

It is still useful though, but hardly stands on its own term.

In Node, I consistently reach for Knex, which is less of an ORM (like Bookshelf) and more like a query builder.

This simplifies binding, mapping, and most queries.

There have been rare situations to use ".raw", but the escape hatch id there.

I made my own ORM: https://code.google.com/archive/p/memory/

It had a pretty nifty graphical editor: http://rupy.se/logic.jar

But now since I need a distributed database I built my own async-to-async HTTP/JSON thing: https://github.com/tinspin/rupy/wiki/Storage

I've used several. Liked some, hated some. My favorite is SQAlchemy, can't say I have any real complaints about it.

The few times I've had to strain the object model, SQAlchemy has degraded nicely and lets me use raw SQL. It's a great compromise.

I feel like, just like using an ORM can turn into an "all-solving" hammer, so can functional programming. If a project one's working on feels like it would benefit from being represented as mapped objects, maybe one should... Do that.

Then again, I don't have much experience nor interest in FP. Haven't needed it, I guess.

There are ORM's in functional languages too, but they too have problems. Mainly the problems arises from SQL being structurally typed, meaning you can JOIN totally unrelated tables as long as they have columns with same types.

Practically there are no programming languages that allow to do all set operations on properties of records. Though Elm comes close. If the language has this peculiar feature, it should be possible to write a "ORM" that just works for all queries (given that the language has similar flexibility).

I mostly use this now: https://pandas.pydata.org/pandas-docs/stable/generated/panda...

Which is basically an easy to use API that turns a SQL query into an efficient columnar table in memory.

Yes, this still requires writing actual SQL. And yes it does not help you insert or update data. But for 95% of uses it works well.

Django user here. I feel like it got settled a while ago.

I think that probably encapsulates the whole issue here. ORM isn't one thing, it's a whole range of implementations and use-cases. For the sorts of annotations and reports I need to generate, Django has really started to come into its own in the past couple of years. Well on top of schema migration, caching and things like that.

So I wouldn't go looking for one answer. It's always going to be contextual.

I work someplace where the primary db schema is very poorly structured, and the one attempt we (or rather a single dev that was under-supervised) made to use an ORM was disastrous - a single http request would generate literally hundreds of queries. I don't know whether the problem was the dev himself or the ORM or the db, but either way we're having to replace everything and I've been put off ORMs for a very long time to come

I used to use an ORM but I discovered in fact that SQL gives me precision, control and performance and indeed I like it. I'll never go back to the ORM.

I sometimes feel like the whole ORM thing hasn't made progress or been settled.

The object and relation impedance mismatch is not a new concept. The feeling of progress in ORM was just the act of bringing ORMs to ecosystems that hadn't had good libraries for it, introducing ORM to a generation of developres that had never used it, and improving the libraries.

Which ORMs do you like and what do you like about them? What is the sweet spot?

I've worked with and liked

- JPA, mostly Hibernate (although we were mostly sticking to the standard parts)

- a few months of Entity Framework which was one of my favourite parts of my .Net experience (2.place I think after Asp.Net MVC which I really liked.)

- SQLAlchemy (a few years ago) at least I think I liked it.

ORMs aren't very good. They make simple things slightly simpler and get in the way the rest of the time.

The problem with ORM? I always forget DQL "like" syntaxes while I never forget raw SQL.

There are a number of things you can do to improve performance of queries generated by ORMs.

* Make sure you are indexing properly

* limit the number of items returned by introducing paging

* use lazy loading where applicable

I'm sure there are other techniques, but these are my gotos.

What I see are often custom ORMs implemented for a specific library or purpose. Many of the APIs I work with daily are in fact a type of ORM.

That said, I never use an ORM myself unless it's one I implemented as stated above.

My point against ORM are that will always, no matter what you try, will limit your domain logic. But it will help you to get things done. What I'm doing is use ORM only as a mapping for database layer.

It is a good tool, people use it, however it turns out it was not a silver bullet (tm). I think it is pretty much settled for what things it makes sense or not.

I'm still using Linq-to-SQL in the .net world. For 99% of what I need it to do, it works golden. Once in a while I've got to go in and profile and rewrite a query when somebody that doesn't know what they are doing has introduced an n+1 query, or done something overly "clever" that's resulted in gnarly SQL being generated.

What's wrong with ORMs? I tend to use Dapper. I've found that it can solve most of my problems, and it's easy to drop back to raw SQL as needed for the last 5% that need finer control. It's also super easy to extend and customize based on my requirements, and it's almost as fast as raw SQL.

Applications are open for YC Winter 2023

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact