Fast Rails updates through minimal dependencies (fngtps.com)
37 points by haileys on Jan 11, 2013 | 12 comments


Very interesting. This is kind of the opposite of the standard advice of "don't repeat yourself" and "don't reinvent the wheel". Both those pieces of advice make an awful lot of sense to me.

But this is more advice in the style of an STD prevention campaign. "You aren't just sleeping with that gem, you're sleeping with every gem that gem ever slept with." Which, admittedly the blog author isn't taking quite to abstinence advocacy levels, but also rings kind of true.

Reimplementing basic stuff across projects probably also keeps up your programming chops. The practice would help make the trivial stuff truly trivial for you.

But what worries me about this advice is that there are some things that feel like they're trivial to implement but actually have quite a few subtleties and gotchas. In a language as dynamic as Ruby, you might even end up doing something awful (like letting your xml parser run arbitrary code). If the subtle bugs are located in gems, you only need to fix the gem to fix it all.

But then again, how often do you own the gem? Even if it's open source and you can fork it, the whole point of a gem is encapsulation. You shouldn't have to be thinking about its inner workings. Some gems really can carry big risks with them.

I don't feel like I have answers to these questions right now. Just worth thinking about.


This is going to sound extreme, but I did measure it: for quite some time, roughly 30% of my time was spent just managing dependency updates. That, to me, felt absurdly high. The problem was transitive dependencies and their changing APIs.

E.g., the redis-rb gem decided it was going to change its API. I guess that's fine, but I didn't use the redis-rb gem directly anywhere in my code. I did use a half dozen libraries that used redis-rb, half of which immediately switched to the new API and half of which did not. The kicker was that the new gem also had a hard requirement on a brand new version of redis, so you couldn't update the gem without also updating redis. So, naturally, not all gems wanted to update to the new gem immediately.

This was a manufactured situation that was 100% avoidable. Semantic versioning doesn't help at all in this case. Namespacing your API does. But, that aside, I now found myself having to submit pull requests for libs using redis-rb when I had never used the gem myself. I certainly didn't save any time and the whole thing felt like a gigantic waste of effort.
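One partial mitigation, not something the parent describes doing, is to pin the transitive dependency at the application level so Bundler resolves every consumer against the API you have actually tested. The gem names below are real, but the version numbers are purely illustrative:

    # Gemfile (hypothetical)
    source "https://rubygems.org"

    gem "rails",  "~> 3.2.11"
    gem "resque"                # a consumer of redis-rb; version left floating
    gem "redis",  "~> 2.2"      # app code never requires this directly, but the
                                # top-level pin constrains dependency resolution

Of course, this only helps while the consumers' gemspecs still allow the old version; once they hard-require the new redis, you are back to submitting pull requests.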

More recently, I went through this when multi_json changed its API.

And the other common situation is gemspec's inability to say that gem Y provides the API of gem X, or to override version numbers you know to work (much as Maven allows). So a lot of problems that should be simple to solve become matters of wrangling multiple parties to agree on compromises.

Since actively pruning my dependency graph and only very selectively growing it, my time spent managing dependencies is mostly negligible.


> This is kind of the opposite of the standard advice of "don't repeat yourself" and "don't reinvent the wheel". Both those pieces of advice make an awful lot of sense to me.

It’s not about re-implementing everything per project, it’s a reminder to think about all of the code you pull into your project.

We use some third-party gems and we have our own gems, but as a rule these should all provide the minimum necessary and not try to solve all higher-level use cases. The libs that do come with the proverbial kitchen sink tend to bring in functionality that you won't be using but that is still very much an opportunity for bugs.

> You shouldn't have to be thinking about its inner workings. Some gems really can carry big risks with them.

Some might indeed carry big risks, which is why it's good to know your code; and that is made significantly easier with less code.

However, you should be thinking about their inner workings. This goes especially, but not only, for open-source code, which (afaik) comes with no warranty whatsoever. So when you pull it in, directly or indirectly, you are responsible.


> But what worries me about this advice is that there are some things that feel like they're trivial to implement […]

It's up to the developer to make that call.

For things like OpenSSL or concurrency libraries it seems an easy choice: you will definitely save time and effort using a library. For something like pagination, a few scopes with a view helper might cause fewer problems than a gem in the long run.
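As a rough sketch of what a few scopes with a view helper might look like, assuming an ActiveRecord model named Post and a page query parameter (all names here are made up for illustration):

    # app/models/post.rb
    class Post < ActiveRecord::Base
      PER_PAGE = 25

      # Clamp the page number to at least 1, then offset/limit accordingly.
      scope :page, ->(n) { offset(([n.to_i, 1].max - 1) * PER_PAGE).limit(PER_PAGE) }
    end

    # app/helpers/pagination_helper.rb
    module PaginationHelper
      # Show a "Next page" link as long as the current page came back full.
      def next_page_link(collection)
        return if collection.size < Post::PER_PAGE
        link_to "Next page", url_for(page: (params[:page] || 1).to_i + 1)
      end
    end

    # In a controller action:
    #   @posts = Post.order("created_at DESC").page(params[:page])

It doesn't cover total counts or page-number lists, but for many apps that is all the pagination they need.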


We've been doing the same for the past year or so at Mogotest and the results have been great. To build on the points in the article:

- Exceedingly few gems provide any semblance of a human-readable changelog and even fewer care about targeting multiple APIs at once. Assuming you don't just blindly update your entire Gemfile at once, you're going to end up spending much more time than is reasonable to determine whether you should upgrade that gem you just pulled in.

- There's a huge stigma against libraries ever being "done." So, they keep adding new features or coming up with better APIs. You're going to end up having to go along for the ride.

- Rails doesn't have much of an API per se, or at least not one that's likely to survive a major upgrade intact. After the Rails 2.3 -> 3.0 upgrade, we just flat out avoid anything that's a "rails plugin." I suspect our 3.2 -> 4.0 upgrade will go much more smoothly this time around.

- If you target multiple rubies, it becomes much, much easier with a slimmer dependency graph.

At the end of the day, a lot of this is just the general debate of building it vs using a library that any language or environment has. But I think some of the cultural aspects of Ruby amplify the debate as it's just a much faster moving target, which may be at odds with stable apps.


This comment is gold; you nailed the issues that really hit Ruby harder than most other languages. Minimizing dependencies in Ruby land is an excellent mitigation for rapidly changing APIs, letting you at least approach the best of both worlds: the latest APIs on the libraries you use heavily, while avoiding dependency hell from incidental requirements.


Basically, whenever you decide to use a 3rd party library, you should be prepared to maintain it at some point.


For faster updates in the future, you may want to use this in your Gemfile for the coming weeks:

gem "rails", "~> 3.2.11"

See http://gembundler.com/v1.2/gemfile.html for the meaning of ~>.
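For reference, that pessimistic constraint pins the minor version while allowing patch releases; it is roughly equivalent to:

    gem "rails", ">= 3.2.11", "< 3.3.0"

so running bundle update rails picks up security patches without silently jumping to 3.3.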

Also, I think that "having good test coverage" is quickly becoming about more than plainly "avoiding breaking stuff" or "adding new features faster": it's also about being able to respond quickly to security threats, either via official gem updates, despite occasional regressions (https://github.com/rails/rails/issues/8832), or via quick work-arounds.


It's always good to use Vagrant (or any other way to quickly refresh the dev environment) and routinely do destroy/up. For me this is so far the best precaution for keeping the application light on dependencies.
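A minimal sketch of that setup, assuming Vagrant's 1.x configuration style; the base box name and provisioning script path are just examples:

    # Vagrantfile
    Vagrant::Config.run do |config|
      config.vm.box = "precise64"
      config.vm.forward_port 3000, 3000              # expose the Rails dev server
      config.vm.provision :shell, :path => "script/install"
    end

Routinely rebuilding from scratch (vagrant destroy --force && vagrant up) is what surfaces the dependencies you forgot you had.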


I wholeheartedly agree; it's actually something I wanted to write about. A maintainable application should be up and running with a seeded database in a few commands, e.g.: gem install bundler && ./script/install.
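A sketch of what such a script/install could do; the individual steps and file names are assumptions, not the actual script:

    #!/usr/bin/env ruby
    # script/install -- idempotent project setup

    def run(cmd)
      puts "==> #{cmd}"
      system(cmd) || abort("Command failed: #{cmd}")
    end

    run "bundle install"

    # Only copy the example config the first time; safe to re-run.
    unless File.exist?("config/database.yml")
      run "cp config/database.yml.example config/database.yml"
    end

    run "bundle exec rake db:setup"   # create the database, load the schema, run db/seeds.rb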


That's fine if it's a ToDo list app. But if your app integrates with other services/applications... it needs those running as well.

The more services/external dependencies you cram into single commands... the more maintenance (of that cramming) is required.


That just means you have to try a bit harder (;

I agree that it's not always possible to set up everything automatically but with a shared encrypted filesystem, good instructions, and tools like Foreman you can go a long way.
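For the Foreman part, a Procfile is enough to declare the processes the app expects; the exact entries are of course app-specific:

    web:    bundle exec rails server -p $PORT
    worker: bundle exec rake resque:work QUEUE=*
    redis:  redis-server

and foreman start brings them all up with one command.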

In case something can't be set up automatically we usually have a clear README and the initializer prints a note about where to find instructions about getting it running.

Maintaining the install script isn't much more painful than setting it up by hand, especially if you make sure the script is ‘reentrant’.



