"Keep all logic out of the database. It already is the slowest point. And hardest to scale up."
I don't think this is true in all cases. Sure, ORMs are awesome, but sometimes you need to write SQL queries by hand, and those queries necessarily implement some business logic (even if they're just retrieving data).
> Keep all logic out of the database. It already is the slowest point. And hardest to scale up.
That's... weird advice. I think a lot of what happens in Rails projects (based on my very limited experience) is that developers start to rely on this easy syntax that ActiveRecord provides and stop thinking about the queries that the ORM is creating. So you end up with these massive N+1 queries that kill performance.
This is one of the reasons I like to stay away from ORMs at all costs. A majority of what an ORM provides can be solved by a view or a function.
As to Ruby's performance... yeah... it's pretty terrible. We had to completely abandon Docker for Ruby on Rails because the performance was absolutely abysmal, even with VirtioFS enabled.
That's not a good reason to stay away from ActiveRecord. AR has plenty of ways to write optimal queries (all the way down to raw SQL). That said, green developers often won't ever go that far due to not knowing SQL well enough. No matter what, you gotta have some competency with SQL whether you're using an ORM or not.
Things like DB views come with their own problems and constraints, as well as DB level functions (stored procedures).
I would suggest enabling the following two configurations in any Rails project. It should be a must in any new Rails project IMO:
1. Enable `config.active_record.strict_loading_by_default` - this will raise ActiveRecord::StrictLoadingViolationError on almost all cases of N+1 thus forcing the developer to fix it
2. Set `config.active_record.warn_on_records_fetched_greater_than` so that you know when someone will load a lot of records, thus forcing to paginate, load only what is needed ...
I think starting with these two will help mitigate a lot of problems even when using ORM in Rails.
> A majority of what an ORM provides can be solved by a view or a function.
If you want to get a single number or column of numbers - then yes, you can go with an SQL query. But if you need to get data about products in a store, you'll have to fetch the data and create objects for every product. Wow, you have just written an ORM.
For example, Postgres has fantastic "upsert" and data integrity checks that reduce round trips to the database, meaning they're incredibly fast and result in less-complex client code.
There is always an alternative to writing SQL queries by hand, and it's usually a better one IME. Any ORM worth its salt will let you do the query in your blog post via the ORM, as a single query.
> There is always an alternative to writing SQL queries by hand, and it's usually a better one IME.
I spent years writing code using Spring/Hibernate, and I can state with certainty that both of those statements are demonstrably false.
Every application starts with good intentions, a simple CRUD webapp, and an ORM, then at some point the business requirements yield an N+1 problem in ORMs or several non-trivial left joins into records that don't map the shape of the entities. At that point it's far easier to write the query in straight SQL and produce a straightforward mapping into the record structure, which doesn't play well with the ORM because that bypasses its entity cache, which causes another huge set of problems on its own. So now not only is there ORM maintenance and SQL maintenance, there is now a problem with the conjunction of the two technologies.
I agree with you, however, some ORMs handle this fairly elegantly.
At that point it's far easier to write the query in
straight SQL and produce a straightforward mapping
into the record structure, which doesn't play well
with the ORM because that bypasses its entity cache,
which causes another huge set of problems on its own
Rails' ActiveRecord ORM offers at least two ways to handle this.
1. ActiveRecord plays really nicely with views (including materialized views) in my experience. It treats them just like tables, basically, except you can't write to them. (note: there may actually be some cases where you can write to them; not sure)
2. You can supply your own handrolled SQL to ActiveRecord, e.g. `User.find_by_sql("select a,b,c from blahblahblah")`
YMMV obviously but I've been working with Rails since 2014 but this has covered all of my performance needs.
Plain old ActiveRecord default query generation is fine 99% of the time, and it's rather elegant/easy to sidestep it when I wish.
> I spent years writing code using Spring/Hibernate, and I can state with certainty that both of those statements are demonstrably false.
> Every application starts with good intentions, a simple CRUD webapp, and an ORM, then at some point the business requirements yield an N+1 problem in ORMs or several non-trivial left joins into records that don't map the shape of the entities. At that point it's far easier to write the query in straight SQL and produce a straightforward mapping into the record structure, which doesn't play well with the ORM because that bypasses its entity cache, which causes another huge set of problems on its own. So now not only is there ORM maintenance and SQL maintenance, there is now a problem with the conjunction of the two technologies.
I agree that that's often the end result, but in my experience 100% of cases are due to SQL fanboys who are unwilling to spend 5 minutes actually reading the ORM documentation and finding out how to do their N+1 query or complex join properly, which is actually easier than doing it in SQL if you try.
And don't get me started on "Hibernate is slow. The entity cache? Oh, our unnecessary custom SQL query made that inconsistent so we've disabled all caching".
I agree but I think Hibernate is an extreme. There is a middleground, query builders, which enable dynamic query construction (e.g. dynamically appending filter predicates) with reduced cognitive load of something like Hibernate.
I personally was heavily in the camp of "write raw queries ideally with code generation for statically typed/generated code" (as exist in Rust, Go, TypeScript, etc.), but I have since tempered my position since it does become a bit brittle and repetitive. Lately I've been playing with Jooq and it seems great.
There are tradeoffs everywhere though, so with Jooq you still aren't 1-to-1 with raw SQL, there is a bit to learn, but I consider it a worthwhile investment (and a minor one relative to an actual ORM).
A pure query builder is just writing SQL on syntax tree level, more or less. This makes sense for the same reason why you want your macros to operate on ASTs and not raw text. But I would argue that it's still much closer to plain text SQL in the code than to any ORM.
Right, but it lives with your application code and has the same syntax as the application code. That's probably preferable to SQL stored procedures (which often live outside source control).
It’s not hard to get SQL DDL and stored price in source control with Liquibase or Flyway. I’ve even done TDD sproc unit (integration) tests in them. But I’m a webapp turned data engineer guy…
That's probably preferable to SQL stored
procedures (which often live outside source control).
Stored procs definitely have some big pros and big cons, but I don't think this is one of them -- any ORM with a decent set of tools to manage migrations (ActiveRecord is one) makes this objection a non issue IMO.
I explicitly do not want to manage stored procedures in the same way as typical migrations - if I did, I would wind up with many, many versions of the procedure in my code base as it evolved over time. This would make grepping or locating the latest version pretty annoying.
Flyway (migration tool in Java) has a notion of “repeatable” migrations, though, which would do the trick.
This would make grepping or locating the latest version pretty annoying
Wouldn't this be an issue with any database object managed via migrations? Do any of them make this easy for any database object?
In ActiveRecord, you have your migrations folder(s) and then you have your `structure.sql` (essentially the raw output of mysqldump or pgdump) or the equivalent.
If I need to see the literal database definition of any database object I look it up in there. Not the slickest solution but works well enough - really just a few keystrokes in my editor.
I'd be curious how other migration tools handle (or fail to handle) this.
I’ve found that all this does is make the query less readable. SQL is purpose made for writing queries, and avoids unnecessary syntax noise you get when trying to fit the query into a host language based dsl.
That really depends on the language - specifically, on whether it already has constructs that can map nicely (e.g. LINQ in C#), or macros to define them, or syntax that is generally amenable to DSLs even without macros in the picture (e.g. Lisps).
SQL itself is also not a particularly well-designed query language. E.g. the order of the query doesn't reflect the natural data flow (SELECT .. FROM .. is reversed - compare to XQuery's FLWOR, for example), there are warts like WHERE vs HAVING etc. A good DSL can do much better.
SQL is powerful. A DSL that "fixes" things in this area getting all the other language feature interactions right isn't trivial, all the while users have to learn yet another language. Take PRQL for example: https://prql-lang.org. It looks nice, but the examples are very basic. What about window functions, grouping sets, lateral, DML, recursive SQL, pattern matching, pivot/unpivot etc. Might be doable, but perhaps, they've already made a decision that won't enable one of those features without adding new kludges.
Besides, every single "fix" will be a proprietary solution, while SQL is an ISO/IEC standard that's here to stay and universally adopted.
> A good DSL can do much better.
Stonebraker's QUEL was "better", before SQL, and yet, where is QUEL today?
> Take PRQL for example: https://prql-lang.org. It looks nice, but the examples are very basic. What about window functions, grouping sets, lateral, DML, recursive SQL, pattern matching, pivot/unpivot etc.
Window functions are very much supported! Check out the examples on the home page & in the docs.
The others aren't yet, but not because of a policy — we've started with the most frequently used features and adding features as they're needed.
> Besides, every single "fix" will be a proprietary solution, while SQL is an ISO/IEC standard that's here to stay and universally adopted.
And yet in practice the fixes end up more portable. How many of the things on your list of non-basic SQL have consistent syntax across databases, yet alone consistent behaviour?
But the point here isn't just that it can be more regular than SQL. Integrating with the syntax of the host language is also a considerable advantage, ideally with static type checking.
In a statically typed language, what you get from a good query builder is that "malformed SQL statements" blow up at compile-time instead of at run-time.
> There are tradeoffs everywhere though, so with Jooq you still aren't 1-to-1 with raw SQL
You're probably hinting at writing derived tables / CTEs? jOOQ will never keep you from writing views and table valued functions, though. It encourages you do so! Those objects play very well with code generation, and you can keep jOOQ for the dynamic parts, views/functions for the static parts.
I haven't used Rails or ActiveRecord for a long time, but from memory it gives you an escape hatch in the form of Arel that lets you represent parts of your query as SQL while still using the ORM for the rest.
Edit: I think it's something like
Album
.select(:id, :name, "SUM(songs.length) AS total_length")
.left_outer_joins(:songs)
.group(:id)
.order("SUM(songs.length) ASC")
I think you don't even need to explicitly wrap it in Arel.
There are also libraries that enable you to define an annotation or aggregation inside a model, that you can then get with a call like select_properties, similar to the built-in select_related (which you use to get a foreign key in one query) and prefetch_related (which is select_related for many-to-many fields).
Thanks. I haven't used Django for ages, but do recall using annotate() in the past. At that time, I didn't need to worry about performance, so didn't look into what happens under the hood, i.e. whether the call generates more than one SQL query.
I don't know about ActiveRecord or Django specifically, but: the obvious way in their query builder? Adding a joined collection, doing an aggregation on it, and sorting by it, are all easy in any ORM worth bothering with. What did you try and where did you get stuck?
I don't think this is true in all cases. Sure, ORMs are awesome, but sometimes you need to write SQL queries by hand, and those queries necessarily implement some business logic (even if they're just retrieving data).
Nice explanation here:
https://tapoueh.org/blog/2017/06/sql-and-business-logic/