
How to write fast code in Ruby on Rails - feross
https://engineering.shopify.com/blogs/engineering/write-fast-code-ruby-rails
======
pjungwir
One of my favorite tricks is to generate the JSON for your API responses in
Postgres instead of in Ruby.

You can get 100x speedups, but the downside is that you wind up with big nasty
SQL queries that duplicate Ruby logic and are hard to maintain. There was a
nice gem that would automatically produce JSON-generating SQL [0], but it is
abandoned now. It only supports Rails 4 and ActiveModelSerializers 0.8, which
are both quite old. I _just_ published a similar gem [1] that works for Rails
5 and AMS 0.10. Unlike the old gem, mine outputs JSON:API (common in Ember
projects). I hope to add the traditional JSON format too. AMS 0.10 makes this
easier, since it introduced the concept of "adapters". My gem is super new and
doesn't yet cover a lot of serializer functionality, but I'm hopeful about
supporting more & more. Feedback is welcome!
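For flavor, the queries these gems emit lean on Postgres's json_agg/json_build_object. Here is a minimal sketch of building such a query from Ruby (table, columns, and the helper name are all hypothetical, and real JSON:API output is more involved); in a Rails app you'd run it with `ActiveRecord::Base.connection.select_value` and hand the string straight to the client:

```ruby
# Hypothetical sketch: have Postgres build the whole JSON array so Ruby
# never instantiates model objects or serializers for the payload.
def posts_json_sql(blog_id)
  id = Integer(blog_id) # fail fast on non-numeric input, guards injection
  <<~SQL
    SELECT COALESCE(json_agg(
             json_build_object('id', p.id, 'title', p.title, 'body', p.body)
           ), '[]')::text
    FROM posts p
    WHERE p.blog_id = #{id}
  SQL
end
```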

On the writing side, I've found activerecord-import [2] to be very useful. It
batches up INSERTs and UPDATEs and for Postgres it even knows how to use
INSERT ON CONFLICT DO UPDATE.
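What that batching buys you is one statement instead of N. A plain-Ruby sketch of roughly the SQL a batched upsert produces (table and column names hypothetical, quoting simplified):

```ruby
# Collapse many rows into a single INSERT ... ON CONFLICT DO UPDATE,
# instead of one round-trip per record.
def batch_upsert_sql(table, columns, rows)
  tuples = rows.map do |row|
    "(" + row.map { |v| "'#{v.to_s.gsub("'", "''")}'" }.join(", ") + ")"
  end
  update_cols = (columns - ["id"]).map { |c| "#{c} = EXCLUDED.#{c}" }.join(", ")
  <<~SQL
    INSERT INTO #{table} (#{columns.join(', ')})
    VALUES #{tuples.join(", ")}
    ON CONFLICT (id) DO UPDATE SET #{update_cols}
  SQL
end
```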

I also have a handful of Postgres extensions that use arrays to do fast
math/stats calculations. [3, 4, 5] If you are thinking about moving something
into C, it's natural to consider a gem with native extensions, but perhaps a
Postgres C extension will be even better.

[0] [https://github.com/DavyJonesLocker/postgres_ext-serializers](https://github.com/DavyJonesLocker/postgres_ext-serializers)

[1]
[https://github.com/pjungwir/active_model_serializers_pg](https://github.com/pjungwir/active_model_serializers_pg)

[2] [https://github.com/zdennis/activerecord-import](https://github.com/zdennis/activerecord-import)

[3]
[https://github.com/pjungwir/aggs_for_arrays](https://github.com/pjungwir/aggs_for_arrays)

[4]
[https://github.com/pjungwir/aggs_for_vecs](https://github.com/pjungwir/aggs_for_vecs)

[5]
[https://github.com/pjungwir/floatvec](https://github.com/pjungwir/floatvec)

~~~
caseyohara
One way to mitigate spreading logic between SQL and Ruby is by using SQL views
with something like Scenic [0] which can back an ActiveRecord model just like
an ordinary table. This is especially nice for complex reporting because it
centralizes your query logic in a single view as a sort of virtual table that
can be queried from both ActiveRecord and normal Postgres queries.

[0] [https://github.com/scenic-views/scenic](https://github.com/scenic-views/scenic)
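To make that concrete, a hypothetical Scenic setup (all names made up). Scenic keeps the view body as plain SQL (just the SELECT; Scenic wraps it in CREATE VIEW) in a file like db/views/post_summaries_v01.sql:

```ruby
# The view body Scenic would manage in db/views/post_summaries_v01.sql:
POST_SUMMARIES_VIEW = <<~SQL
  SELECT blogs.id AS blog_id,
         count(posts.id) AS post_count,
         max(posts.updated_at) AS last_posted_at
  FROM blogs
  LEFT JOIN posts ON posts.blog_id = blogs.id
  GROUP BY blogs.id
SQL

# A migration then calls `create_view :post_summaries`, and the backing
# model is an ordinary read-only ActiveRecord class:
#
#   class PostSummary < ApplicationRecord
#     def readonly?
#       true
#     end
#   end
#
#   PostSummary.where("post_count > 10")  # queried like any table
```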

~~~
ramraj07
Does anyone know how best to deal with views with sqlalchemy in Python? Looks
like a good approach!

------
gingerlime
One comment about the caching section: Rails is generally really clever about
generating the cache key for you automatically.

Instead of doing

    
    
      Rails.cache.fetch("blog_#{blog.id}_posts_#{blog.posts.maximum(:updated_at)}") do
        blog.posts.as_json
      end
    

You can (and should) actually use something like

    
    
      Rails.cache.fetch(blog.posts) do
        blog.posts.as_json
      end
    

Or even specific scopes, e.g.

    
    
      Rails.cache.fetch(blog.posts.published) do
        blog.posts.published.as_json
      end
    

The cache key will include the query hash, item count in the results and the
max updated_at automatically for you.
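A rough plain-Ruby sketch of how such a key is assembled (the real `ActiveRecord::Relation#cache_key` differs in details: it versions keys, uses higher-precision timestamps, etc.):

```ruby
require 'digest'

CachedPost = Struct.new(:id, :updated_at)

# Sketch: digest of the SQL, plus row count, plus max(updated_at), so
# editing, adding, or removing any record produces a different key.
def relation_cache_key(sql, records)
  newest = records.map(&:updated_at).max
  "posts/query-#{Digest::MD5.hexdigest(sql)}-#{records.size}-#{newest.strftime('%Y%m%d%H%M%S')}"
end
```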

In some rare cases, you might want more control and then probably best to use
something like `ActiveSupport::Cache.expand_cache_key`[0]

e.g.

    
    
      key = ActiveSupport::Cache.expand_cache_key(["blog posts", blog.posts])
      Rails.cache.fetch(key) do ... end
    

[0]
[https://api.rubyonrails.org/classes/ActiveSupport/Cache.html#method-c-expand_cache_key](https://api.rubyonrails.org/classes/ActiveSupport/Cache.html#method-c-expand_cache_key)

~~~
stephen
Fwiw I never really got this approach to caching b/c the keys are always based
on calling the db, and calling the db is usually what you're specifically
trying to avoid with caching, not just the transform of db content to
json/html, which should be cheap?

~~~
ninkendo
The active record queries like User.all are lazily loaded, so it’s totally
possible to pass them around like keys without them being executed.

That said, I wouldn’t use a query as a cache key since there may be more than
one thing I want to cache about that query (GP’s example is their as_json
representation, but what if I wanted something different to be computed? Like
maybe html snippets? I would expect the cache key to mention everything about
the thing it’s caching.)

~~~
gingerlime
> what if I wanted something different to be computed? Like maybe html
> snippets? I would expect the cache key to mention everything about the thing
> it’s caching.

This is exactly where you'd use something like
ActiveSupport::Cache.expand_cache_key(["blog posts", blog.posts]). You can add
bits of info to your cache keys to distinguish them from one another.

It's probably also worth mentioning that the `cache` view helper does parts of
it for you already, and takes into account the digest of the view code itself
among other things. Rails provides a really neat abstraction here that does
make caching easier.

It has some rough edges, and some gotchas, but overall it's pretty smart.

------
BilalBudhani
Ruby on Rails is an amazing piece of software. I love that solving problems
in Ruby is so intuitive & expressive: you get to focus on building the
solution rather than constantly fighting the language.

I wrote a post on why one should still learn Ruby:
[https://bilalbudhani.com/why-you-should-learn-ruby-regardless-of-what-they-say/](https://bilalbudhani.com/why-you-should-learn-ruby-regardless-of-what-they-say/)

------
breatheoften
I’m forced to learn rails at a new job — and I really dislike it. There’s so
much magic and things are built in an obscure/extremely inefficient way to
work around/“take advantage” of the framework's limitations/features.

There is so much code that is not part of the codebase driving the behavior of
the system — and that third party code is very tightly coupled to the actual
behavior that will be observed by running the code. Figuring how anything
actually works is made massively harder than required and developers are
encouraged towards designs that will require ridiculously inefficient database
interactions.

Rails is terrible I think for experienced developers because there are no
mechanisms in the codebase — instead there are layers and layers of
“conventions” which often only exist to try to avoid bugs that would’ve been
much better to catch or prevent with a combination of a type system, high
level tests, and less mutation.

~~~
etaioinshrdlu
This is a decent criticism of Rails, but there are things you can do to make
it smoother: try to avoid doing unusual things.

A big point of using a framework is to leverage an ecosystem of other
people's libraries (and stackoverflow questions!), and they will typically
only work well if your code is relatively "normal".

If your code looks like "beginner" code in the framework, you're doing it
right.

As a point of comparison, I switched to Django for a few years after a few
years of Rails. Django was a lot less magical and was more debuggable. But
Rails' magic actually encourages a better project structure. Django doesn't
care about your structure.

I also found the thoroughness and completeness of the ecosystem surrounding
Rails to be stronger. That is the real selling point to me. For most things
you want to do in Rails, you will find a library that integrates with the
project and DB structure you already have.

~~~
srazzaque
Disclaimer: I haven't used Rails since around version 3, but my assumption is
the DX hasn't changed too much. Plus, if Rails N was marketed at Rails N-1
devs, then new devs coming in late have a lot of learning to do.

> avoid doing unusual things

If by "unusual" you mean "not the Rails way", then this makes sense. But the
problem is, the Rails way doesn't always line up with the way an experienced
programmer might want to do something.

In the specific example of fetching data from a DB - an experienced programmer
might intuitively avoid multiple round-trips to the DB, and want to find the
appropriate hook point(s) in a request handler to optimise this. Making 20
requests for 20 models is "unusual".
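The round-trip cost is easy to make concrete without any Rails at all. A toy datastore that counts queries, comparing per-record fetches ("20 requests for 20 models") against one batched fetch, which is what eager loading (`includes`) buys you:

```ruby
# Fake datastore that counts round-trips, to show N+1 vs batched access.
class FakeDB
  attr_reader :trips

  def initialize(rows)
    @rows = rows
    @trips = 0
  end

  def find(id)          # one round-trip per call
    @trips += 1
    @rows[id]
  end

  def find_all(ids)     # one round-trip for the whole batch
    @trips += 1
    ids.map { |id| @rows[id] }
  end
end

rows = (1..20).map { |i| [i, "record #{i}"] }.to_h

naive = FakeDB.new(rows)
(1..20).each { |i| naive.find(i) }   # 20 trips

batched = FakeDB.new(rows)
batched.find_all((1..20).to_a)       # 1 trip
```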

Rails doesn't make things like this easy. It's quite opinionated on how you
should work. The core assumption in play here is that databases will either be
fast, or can be made faster, essentially externalising the problem. Or if you
want to address this - find a Gem to do it, or dive into the layered guts of
it to figure it out.

Related - if I recall correctly, foreign keys were not deemed an important
enough feature for inclusion in Active Record, and the Rails way is to embed
the logic for joins into your models rather than the database. I'm sure many
of us here could name several scenarios where this is just plain wrong.

~~~
e12e
Activerecord is indeed a major trade-off with rails. For heterogeneous
projects, it can be frustrating that rails tends to favour validations,
constraints and foreign key relationships in the ruby model files.

Eg. You have a dot.net service that needs to write some data to the database,
and doing it via REST endpoints on your rails app is really slow compared to
accessing the database directly. But now you don't have any constraints on
your relationships, and it's easy to shoot yourself in the foot. Extra sad,
since you do have some constraints (belongs_to etc) - they're just not
communicated to your database.
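For what it's worth, Rails can push those constraints down to the database; `add_foreign_key` has been in migrations since 4.2. A sketch (table names hypothetical) of the migration and roughly the DDL it issues, after which the external writer hits the same rule as the Rails app:

```ruby
# What a migration can do to make the database enforce what
# `belongs_to :blog` only implies in Ruby:
#
#   class AddPostsBlogFk < ActiveRecord::Migration[6.0]
#     def change
#       add_foreign_key :posts, :blogs
#     end
#   end
#
# ...which issues DDL along these lines:
POSTS_BLOG_FK_DDL = <<~SQL
  ALTER TABLE posts
    ADD CONSTRAINT fk_posts_blog_id
    FOREIGN KEY (blog_id) REFERENCES blogs (id)
SQL
```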

I have some legacy rails projects I maintain, and my job would've been easier
if previous developers had followed the rails way more closely. Whenever I
manage to refactor my way back, the code becomes shorter, clearer, more
efficient and less error prone.

Things like only half of a relationship being defined, and joins being hand-
rolled in one direction is a "favorite"...

------
dwheeler
This is a really good post. That said, I would have listed caches first.
Caches are the real key to excellent performance in Ruby on Rails.

------
faebi
A long time ago I went through "The Complete Guide to Rails Performance" by
Nate Berkopec. It's not free, but it really taught me a lot about ways to
achieve very good performance with Ruby and Rails. Does anybody have any good
resources on how to achieve better performance with ruby and something like
async, concurrent-ruby, thread pools and forking?

------
peter_retief
I miss RoR, I think I will churn out a project for old times' sake!

~~~
crispyporkbites
Rails 6.0 is excellent. For single developer and small team web apps it's
pretty much the pinnacle of productivity.

~~~
Nextgrid
How does this compare to Django? Would it be worth it for me to learn Rails?

~~~
vinceguidry
Depends on what you're looking for and how much time you want to put into it.
What Rails does, nothing else even holds a candle to. But it's hard-won
mastery.

Rails, when learned properly, enables complete and total dominion over the
entire sphere of web development. Nothing else compares, nothing else even
comes close.

Anything you'd ever want to do with the web, save the kind of scaling that led
Twitter to replatform 5 years in, can be done 10x faster and 10x more reliably
with Rails. The main bottleneck is understanding. There's a zen to Rails that
you have to appreciate before you can unlock its potential.

It greatly saddens me that Ruby and Rails have been falling out of favor.

------
durkie
what are modern practices for scaling rails apps when the database is the
bottleneck? I have a GIS-heavy app that is built with rails, but leans on
PostGIS a lot for the heavy computation. Simply getting bigger and bigger
database instances doesn't seem to scale very well, and moving the GIS
computation code in to ruby is not an option either (RGeo is good for some
things, but can't replace PostGIS for most of my needs)

~~~
malyk
Postgres 10 introduced native table partitioning and it became much easier to
use in postgres 11. This lets you divide large tables into smaller ones that
can be queried more efficiently.
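For the partitioning option, the Postgres 10+ declarative syntax looks roughly like this (table and column names are hypothetical, and the geometry column assumes PostGIS is installed); in a Rails app you could run it via `execute` in a migration:

```ruby
# Declarative range partitioning: split a large table by month so queries
# constrained on taken_at only scan the relevant partitions.
PARTITION_DDL = <<~SQL
  CREATE TABLE measurements (
    id       bigserial,
    geom     geometry,
    taken_at timestamptz NOT NULL
  ) PARTITION BY RANGE (taken_at);

  CREATE TABLE measurements_2020_01 PARTITION OF measurements
    FOR VALUES FROM ('2020-01-01') TO ('2020-02-01');
SQL
```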

Rails 6 introduced multiple databases natively in activerecord. You can more
easily have read only databases or segment off high write workloads.

Foreign tables are a way to allow bringing in data from multiple databases
into one so it “looks” like everything is in a single db. This got
better/faster, particularly with joins in postgres 12.

Also consider things like materialized views or even plain views which can
help bring together just the right data you need ahead of time.

Just a couple options off the top of my head.

~~~
durkie
Thanks! Partitioned and foreign tables look quite interesting -- I'm at the
point where I need to scale the database compute load (with write access)
across many nodes, but coordinating those nodes has seemed challenging with
rails.

And yeah I can probably lean on views a bit more to save some search time for
point/line-in-polygon operations.

------
gray_-_wolf
Hm, or maybe for APIs just replace the whole thing with sinatra and sequel.
Most of the advice in the article still applies, though.

------
samzer
Would these suggestions apply for Django?

I'm assuming Django also faces similar issues

~~~
winrid
Yeah, some of them are just common "best practices".

------
raintrees
I vaguely remember an article about a fatal flaw in Ruby on Rails that was
said to be not fixable? Is that ancient history and has been overcome, or was
it not even a thing and I read bogus information?

I have developed in Django and avoided RoR due to this possibility, but am
always interested in learning more, provided the tools have a decent shelf
life...

~~~
castwide
You might be thinking of an old SQL injection vulnerability that allowed
updating models from unchecked request parameters, aka the mass assignment
problem[0]. That type of thing can be a concern in any web application,
regardless of the framework. Modern Rails does a decent job of discouraging it
out of the box.

[0] [https://arstechnica.com/information-technology/2012/03/hacker-commandeers-github-to-prove-vuln-in-ruby/](https://arstechnica.com/information-technology/2012/03/hacker-commandeers-github-to-prove-vuln-in-ruby/)
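The modern guard against that is strong parameters (the real Rails API is `params.require(:user).permit(:name, :email)`). A minimal plain-Ruby sketch of the allowlist idea, with hypothetical keys:

```ruby
# Toy version of the strong-parameters allowlist: only explicitly
# permitted keys survive, so a forged `admin: true` in the request
# never reaches the model.
def permit(params, *allowed)
  params.select { |key, _| allowed.include?(key) }
end

request_params = { name: "mallory", email: "m@example.com", admin: true }
safe = permit(request_params, :name, :email)
```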

------
jjav
Step 1: Step away from Ruby.

I loove ruby, it's my first choice for scripting wherever performance doesn't
matter. No way around the fact that it's very slow, though.

~~~
pcmoney
Is ruby that slow? I always understood it was comparable to python?

Plus, if it is performant enough for Shopify and Github, it seems like it
would be performant enough for 90%+ of use cases? The speed and flexibility
of development are much greater than with most stacks once you add rails.

~~~
jonny383
If ruby was performant enough for Shopify, this article would not have been
written.

~~~
jrochkind1
Hm, a lot of the article focuses on optimizing your interaction with a
database (rdbms/sql), which has nothing to do with the performance of ruby
(some of it arguably has to do with _Rails_).

Even more of the article has to do with Rails, and not ruby in general.

But the larger point, are you suggesting that in a language that is
"performant enough", developers need _no_ performance advice, they can write
however they want and get adequate performance, for any performance needs?

Maybe, although I'm dubious; perhaps no language will be "performant enough"
to make performance advice unnecessary until we have AI writing code, and if
such a language exists, it is certainly not ruby. Either way, I disagree with
the suggestion: the existence of optimization advice does not mean a language
"is not performant enough". If Shopify is still happily using ruby/rails, I
would say that indeed demonstrates they are performant enough for them.

~~~
jonny383
Nope I'm not. But Shopify is large enough to have a first-class engineering
team that would (presumably) be writing efficient code, which is nothing
special. So if this first-class engineering team is still writing about _fast
code in Ruby_, doesn't that imply Ruby isn't fast?

~~~
jrochkind1
Shopify's engineering team is obviously writing code congruent with the advice
they give here, that's why they give the advice! I'm not following you at all.

