
Rails, callbacks, workers, and the race you never expected to lose - appleton
http://logicalfriday.com/2012/08/21/rails-callbacks-workers-and-the-race-you-never-expected-to-lose/
======
patio11
For this and related predictability/absence-of-surprise reasons, I have
generally moved away from ActiveRecord callbacks in my Rails projects. It's
very easy to forget that you e.g. have an after_save defined for accounts
which made sense when accounts necessarily represented people with credit
cards in the system but doesn't make sense for e.g. people paying via purchase
order. It is also easy to forget an after_save which e.g. takes a
consequential action like "purchases a phone number" which doesn't hurt when
there is only one account.save in the code base but after you add the second
one, say in a Rake task that executes every 5 minutes, blows up terribly.

These days, if I find myself wishing for a callback, that probably means a
related model method or controller needs to be one line longer than it is
right now. (e.g. I moved the welcome email out of the after_save and into the
controller in charge of online signups months ago. This was the right call, as
it means that e.g. my magic administrator create-a-free-account-for-a-
customer-who-pays-via-PO button doesn't actually mail them prior to their
account being configured correctly.)
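The shape of this move can be sketched in plain Ruby. All class and method names below are illustrative stand-ins (not code from any real app): the point is that only the online-signup path sends mail, while other save paths stay side-effect-free.

```ruby
# Illustrative sketch: the welcome email lives in the signup controller,
# not in an after_save callback, so other code paths that save accounts
# (admin tools, rake tasks) never accidentally trigger it.

class Account
  attr_reader :email, :saved

  def initialize(email)
    @email = email
    @saved = false
  end

  # Plain save: no side effects, safe to call from anywhere.
  def save
    @saved = true
  end
end

class WelcomeMailer
  def self.deliveries
    @deliveries ||= []
  end

  def self.deliver_welcome(account)
    deliveries << account.email
  end
end

class SignupsController
  # Online signup: the one place that should send the welcome email.
  def create(email)
    account = Account.new(email)
    account.save
    WelcomeMailer.deliver_welcome(account)  # explicit, one line longer
    account
  end
end

class AdminAccountsController
  # Admin "create a free account" path: saves without mailing.
  def create(email)
    account = Account.new(email)
    account.save
    account
  end
end

SignupsController.new.create("new-user@example.com")
AdminAccountsController.new.create("po-customer@example.com")
puts WelcomeMailer.deliveries.inspect  # only the online signup got mailed
```

The cost of the extra line is paid back in predictability: each save site states its own side effects.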

~~~
danenania
Agreed, I've come to view callbacks as a tool to enforce application-level
data integrity and relationships, not a place that higher level business logic
should live. The line is often fuzzy, but I think the question you alluded to,
"should this happen with _every_ conceivable create/save/etc.?" is a good
general guideline.

One thing to keep in mind when moving logic out of callbacks is that you'll no
longer get the implicit transaction that ActiveRecord wraps around the whole
callback cycle, so you have to be a bit more hands-on with managing
transactions.
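In Rails the explicit grouping is `ActiveRecord::Base.transaction do ... end`. The stand-in below is a plain-Ruby simulation (no database, all names invented) just to make the commit/rollback semantics runnable: once logic moves out of the callback chain, you must decide yourself which writes live or die together.

```ruby
# Plain-Ruby stand-in for transactional semantics: writes inside the
# block either all survive or all roll back if anything raises.
class FakeDB
  attr_reader :rows

  def initialize
    @rows = []
  end

  def transaction
    snapshot = @rows.dup
    yield
  rescue => e
    @rows = snapshot  # rollback: no partial writes survive
    raise e
  end

  def insert(row)
    @rows << row
  end
end

db = FakeDB.new

# Success: both writes commit together.
db.transaction do
  db.insert(user: "alice")
  db.insert(profile: "alice's profile")
end

# Failure: neither write survives.
begin
  db.transaction do
    db.insert(user: "bob")
    raise "profile validation failed"
  end
rescue RuntimeError
end

db.rows.each { |r| puts r.inspect }
```

With callbacks, ActiveRecord wrapped this grouping around the whole save cycle for free; without them, forgetting the `transaction` block means a half-written record on failure.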

~~~
kenkam
And even if it happens for every create, the fact that the author had to do so
much extra work just to make something like this function is a sign that model
callbacks may not be the best solution.

------
chrisacky
This isn't actually an isolated Rails issue (although the example is perfectly
explained with Rails).

To tell a quick little story, I have experienced this issue in a system in a
very unpredictable way. It cost me hours (days?) of repeated bug hunting to
figure out what was happening.

So, our users will save an "Entity". The entity is actually pretty complex,
and has the potential to span 40-50 tables. (Oh, how I wish I had used a
document store rather than a relational database.) Anyway, when the "entity"
gets saved, I need to update Solr with an updated reflection of the model.
There are two flags within this model, "enabled" and "deleted", and I use
these flags to filter the documents within Solr.

All jobs are sent to Solr through a Gearman job server worker process.
However, the worker was receiving the job before the database had persisted
the updated model, so it was always indexing the snapshot from before the
user's request.

But oddly enough, this issue would only ever happen when the server was under
very, very light load. I could not for the life of me track it down, and it
took me months before I realised this was akin to a race condition, especially
since the bug never happened under load. If I artificially added load, the bug
wouldn't appear; it would only happen when the server was running smoothly.

Once I spotted this, I was able to fork the request off to Gearman at the
correct time (rather than as the prePersist lifecycle event I was using
previously).

I definitely learnt a lot from this little issue. It was such a simple issue
that eluded me for months, and it just goes to show that not all code runs the
same: given different conditions (i.e. load), your code can behave
differently.
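The race described above can be made deterministic in a small sketch. Everything here is illustrative (an in-memory "database" and "index", invented names); the worker runs immediately to stand in for "the worker won the race", which is exactly what happened under light load.

```ruby
# Deterministic sketch of the enqueue-before-commit race: if the indexing
# job fires during persistence (prePersist-style), the worker reads the
# row before the new values are committed and indexes a stale snapshot.
DATABASE     = {}  # committed state the worker reads
SEARCH_INDEX = {}  # stands in for Solr

def worker_reindex(entity_id)
  # The worker always sees whatever is committed right now.
  SEARCH_INDEX[entity_id] = DATABASE[entity_id].dup
end

def save_enqueue_too_early(id, attrs)
  worker_reindex(id)    # job handed off before the commit (the bug)
  DATABASE[id] = attrs  # commit happens afterwards
end

def save_enqueue_after_commit(id, attrs)
  DATABASE[id] = attrs  # commit first
  worker_reindex(id)    # then hand the job to the queue
end

DATABASE[1] = { "enabled" => false }
save_enqueue_too_early(1, { "enabled" => true })
puts SEARCH_INDEX[1].inspect  # stale: the pre-save flags

save_enqueue_after_commit(1, { "enabled" => true })
puts SEARCH_INDEX[1].inspect  # fresh: the committed flags
```

Under real load the enqueue-too-early version only fails *sometimes*, because the worker may lose the race to the commit, which is why light load made the bug reliable and heavy load hid it.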

------
dhh
Emails are like async views. The model should not be responsible for rendering
views. That's the responsibility of the controller. Which is why Action Mailer
itself is modelled on Action Controller, uses Action View, and so forth.

~~~
urbanautomaton
He's not using the model to render the view, though, he's using it to queue up
a future asynchronous request to the relevant mail "controller".

~~~
latortuga
Sure, you're right, but only technically. Having an extra collaborator in the
model is asking for trouble - there's nothing about a User that requires it to
know anything about emailing. It violates all kinds of design principles.
Putting your logic for "I should email the User when they sign up" into the
User model is using "design by related words".

In traditional Rails design, the controller is exactly the right place for
this. Another option is a service layer of some kind: e.g. a DCI context, or
just a plain-old-Ruby-object that is in charge of the logic of signing up a
user.

The best part of separating out the logical business sequence of signing up a
user into its own object (I can see this method in my head, something like
save_new_user, sign_in_user, queue_welcome_email) is that it becomes massively
easier to test - you can just mock the various calls to its collaborators'
public methods because they should all have unit tests.
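A sketch of such a service object, with its collaborators injected so each step can be stubbed in a unit test. All class and method names here are hypothetical (built around the save_new_user / sign_in_user / queue_welcome_email sequence mentioned above).

```ruby
# The whole business sequence of signing up lives in one plain object.
class SignupService
  def initialize(user_repo:, session:, mail_queue:)
    @user_repo  = user_repo
    @session    = session
    @mail_queue = mail_queue
  end

  def call(attrs)
    user = @user_repo.save_new_user(attrs)
    @session.sign_in_user(user)
    @mail_queue.queue_welcome_email(user)
    user
  end
end

# In a test, the collaborators can be trivial stand-ins:
class FakeRepo
  attr_reader :saved
  def save_new_user(attrs)
    @saved = attrs
    attrs
  end
end

class FakeSession
  attr_reader :user
  def sign_in_user(user)
    @user = user
  end
end

class FakeMailQueue
  attr_reader :jobs
  def initialize
    @jobs = []
  end
  def queue_welcome_email(user)
    @jobs << user
  end
end

repo, session, queue = FakeRepo.new, FakeSession.new, FakeMailQueue.new
service = SignupService.new(user_repo: repo, session: session, mail_queue: queue)
service.call(email: "a@example.com")
puts queue.jobs.inspect
```

The service knows nothing about ActiveRecord or mail delivery internals, so its test exercises only the sequence itself.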

~~~
urbanautomaton
Ah, technically correct - the best kind of correct. :-)

Anyway, I agree with pretty much everything you say. As I mentioned elsewhere
in this thread, we've adopted a pub-sub event system for handling our
generic post-action tasks. The controller announces a :create_user event, and
any interested listeners can respond appropriately. And yes, one of the
biggest benefits of this has been the ability to test that event-handling
logic in isolation; stubbing and mocking collaborators, and generally having
the sort of testing fun you're normally not supposed to have with Rails.
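A minimal sketch of that arrangement, in plain Ruby so it stays self-contained (in a Rails app, ActiveSupport::Notifications can play the bus role; the event and listener names below are illustrative):

```ruby
# Minimal pub-sub bus: the controller announces events; listeners respond.
class EventBus
  def initialize
    @listeners = Hash.new { |h, k| h[k] = [] }
  end

  def subscribe(event, &handler)
    @listeners[event] << handler
  end

  def announce(event, payload)
    @listeners[event].each { |handler| handler.call(payload) }
  end
end

bus = EventBus.new

# Each listener is small and testable in isolation.
bus.subscribe(:create_user) { |user| puts "queue welcome email for #{user[:email]}" }
bus.subscribe(:create_user) { |user| puts "notify analytics about #{user[:email]}" }

# The controller only announces; it doesn't know who is listening.
bus.announce(:create_user, email: "a@example.com")
```

The win is the same decoupling described above: adding a new post-signup task means subscribing a new listener, not editing the controller or the model.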

------
jmileham
This is a great example of the complexity tradeoff when you choose performance
over atomicity. Delayed::Job has lost favor lately, but the semantics of the
job becoming visible to the queue runner atomically with the related new
record are hard to beat.

Note that even in the proposed solution, after_commit hooks offer no guarantee
that the job will ever be enqueued: if, for example, redis is unavailable, or
your process gets reaped between the database commit and the firing of the
after_commit hook, the commit has already happened. So if it's really
important that your asynchronous task be queued, keeping the queue in your DB
has some advantages.
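The atomicity argument can be shown with a small simulation (an in-memory store with invented names, not any real queue library): the job row commits or rolls back together with the record it refers to, so the runner can never see an orphan job, and a committed record can never silently lack its job.

```ruby
# Commit/rollback stand-in: both "tables" change together or not at all,
# mimicking a DB-backed job queue living in the same database.
class TxStore
  attr_reader :users, :jobs

  def initialize
    @users = []
    @jobs  = []
  end

  def transaction
    u_snap, j_snap = @users.dup, @jobs.dup
    yield
  rescue => e
    @users, @jobs = u_snap, j_snap  # rollback covers the jobs table too
    raise e
  end
end

store = TxStore.new

# The job row becomes visible atomically with the record it refers to.
store.transaction do
  store.users << { id: 1, email: "a@example.com" }
  store.jobs  << { user_id: 1, task: :welcome_email }
end

# If the save fails, no orphan job is left behind for a runner to pick up.
begin
  store.transaction do
    store.users << { id: 2, email: "bad" }
    raise "validation failed"
  end
rescue RuntimeError
end

puts "users=#{store.users.size} jobs=#{store.jobs.size}"
```

An external queue (redis, Gearman) can't join this transaction, which is exactly the gap the comment describes.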

------
davedx
These kind of blog posts are gold for people looking to gain a deeper
understanding of Rails development. Thanks!

------
strictfp
This can be solved by using a transactional messaging system. The message then
enters the queue only when all data sources have committed. This has the
additional benefit that a tx rollback after the entity save will cancel the
message rather than crash the message handler.

------
ruckusing
I encountered this very issue a few years ago: basically, the queue picking up
jobs before the database commits. I solved it by queuing my jobs to a
thread-local queue and then firing the whole spool at the end of the request
(in an around_filter, so you fire the spool after the yield to the action).
This means that you actually fire your jobs after the request has completed
its database commit. This can all be done without leaking abstractions or
making the model layer know about the presence or absence of the fake queue.

Let me know if anyone is interested and I can share some code.
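A rough reconstruction of that approach (not the commenter's actual code; module and method names are invented, and the around_filter is reduced to a plain method that yields):

```ruby
# Jobs enqueue to a thread-local buffer during the request; an
# around-filter-style wrapper flushes the buffer only after the action
# (and its database commit) has finished.
module JobSpool
  def self.enqueue(job)
    (Thread.current[:job_spool] ||= []) << job
  end

  def self.flush!(real_queue)
    (Thread.current[:job_spool] || []).each { |job| real_queue << job }
    Thread.current[:job_spool] = nil
  end
end

# Stand-in for the around_filter: run the action, then flush the spool.
# (A production version would also clear the spool when the action raises.)
def with_job_spool(real_queue)
  yield                        # models call JobSpool.enqueue during this
  # ...the database commit happens as the request unwinds...
  JobSpool.flush!(real_queue)  # only now do jobs reach the real queue
end

real_queue = []
with_job_spool(real_queue) do
  JobSpool.enqueue(reindex: 42)
  # at this point real_queue is still empty: the worker cannot race us
end
puts real_queue.inspect
```

The model layer only ever talks to the spool, so nothing in it needs to know whether jobs are deferred or fired immediately.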

------
gnufied
Good points overall. If you don't want to disable transactional fixtures for
certain specs and still want to use after_commit, I have this little patch -
<https://gist.github.com/3205621> - which works flawlessly.

And I liked the author's solution of tracking attribute changes, but maybe
there is a cleaner way. I had a similar problem where I needed a handle on
after_commit :on => :update, but in an observer.

------
ctrlaltesc
I've been hoping the Rails team would implement two-phase commits since the
1.0 days. Does anybody know if the idea has been considered?

------
pselbert
This is a really common issue that can be tricky to track down for those
unfamiliar with it. There is no need to try to schedule all of your jobs in
the "hopefully-later-than-your-transaction-completes" window!

------
kennystone
This is so much more easily solved by just sending the JSON document of the
user in the message to the worker. A database lookup is not necessary; a
transactional record lookup based on ID is overkill.
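Sketched below with invented field names: the message carries a snapshot of everything the worker needs, so the worker never reads the database and the commit-visibility race disappears.

```ruby
require "json"

# Snapshot the state the worker will need at enqueue time.
def build_message(user)
  { "id"           => user[:id],
    "email"        => user[:email],
    "subscription" => user[:subscription] }.to_json
end

# The worker acts purely on the message contents: no DB lookup, no race.
def worker_handle(message)
  data = JSON.parse(message)
  "welcome #{data["email"]} (#{data["subscription"]} plan)"
end

msg = build_message(id: 7, email: "a@example.com", subscription: "pro")
puts worker_handle(msg)
```

The trade-off raised in the replies is real, though: the snapshot can go stale if the record changes between enqueue and processing, and denormalizing related tables into the payload adds its own complexity.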

~~~
drumdance
Maybe. We send different messages based on a user's subscription level which
is two tables away ( user.account.subscription). I suppose we could dump that
into the json object too, but IMO that just adds complexity.

~~~
kennystone
Looking up state and dealing with race conditions is way more complex than
fire and forget, imo.

------
atomical
I think this problem would be solved with after_create. Is there a reason not
to use this?

------
railswarrior
The site does not open.

------
andyl
I've started using the DCI technique outlined in Jim Gay's book 'Clean Ruby'.
To me, Jim's technique is somewhat similar to the Hexagonal Rails approach
described in another comment.

The DCI technique is to use the model as a persistence layer, and organize
business logic into modules organized by use cases. With this technique my
system has become much easier to understand and test.

It's great to see people experimenting with architecture styles like DCI and
HexRails.

