
Kill Your Dependencies - twampss
http://www.mikeperham.com/2016/02/09/kill-your-dependencies/
======
LukeB_UK
I disagree, and this quote I've seen floating around the internet sort of
sums up the idea for me (albeit with a music analogy):

> _I thought using loops was cheating, so I programmed my own using samples. I
> then thought using samples was cheating, so I recorded real drums. I then
> thought that programming it was cheating, so I learned to play drums for
> real. I then thought using bought drums was cheating, so I learned to make
> my own. I then thought using premade skins was cheating, so I killed a goat
> and skinned it. I then thought that that was cheating too, so I grew my own
> goat from a baby goat. I also think that is cheating, but I’m not sure where
> to go from here. I haven’t made any music lately, what with the goat farming
> and all._

~~~
endemic
I don't think that's what he's advocating: he's just saying that every
dependency you have is another thing you have to worry about, so why not try
to limit them as much as possible? Obviously it's impractical sometimes, and
that's OK, as long as you understand the consequences.

~~~
cheapsteak
Every dependency you have is another thing that people besides yourself can
worry about with you or for you

Need to do X? If you write your own library that does X, chances are you'll be
the only one to ever work on it. Need a new feature? You have to stop working
on your actual project and implement that feature. Found a bug? No one else
will fix it for you.

If you depend on a library that thousands of other people also use and that
does X, then when you find a feature you need that it doesn't have, open a
ticket; someone will likely do that work for you. More often than not, the
feature already exists and all you have to do is read the docs. If you find a
bug, report it and wait for it to get fixed, but there's also nothing
stopping you from fixing it yourself and submitting a pull request.

~~~
bluejekyll
A better way to view this is that if libraries are kept small, then you can
pick a more granular set of dependencies than say a larger library that
includes the kitchen sink.

I agree with most comments here that it's not a good idea to reinvent the
wheel everywhere, but when all you want is a wheel, and not the entire car,
it's ideal that we have a way to just include the wheel.

------
mwcampbell
A large number of dependencies is only a problem in environments that aren't
amenable to per-function static linking or tree-shaking. These include
dynamically typed languages like Python, Ruby, and JavaScript (except when
using the Google Closure Compiler in advanced mode), but also platforms like
the JVM and .NET when reflection is allowed. Where static linking or tree-
shaking is feasible, the run-time impact of bringing in a large library but
only using a small part of it is no more than the impact of rewriting the
small part you actually use.

Edit: Dart is an interesting case. It has dynamic typing, but it's static
enough that tree-shaking is feasible. Seth Ladd's blog post about why tree-
shaking is necessary [1] makes the same point that I'm making here.

[1]: [http://blog.sethladd.com/2013/01/minification-is-not-enough-you-need.html](http://blog.sethladd.com/2013/01/minification-is-not-enough-you-need.html)

~~~
nostrademons
Not always. Dependencies were a huge problem at Google, even in C++ (perhaps
especially in C++), because they mean that the linker has to do a lot of extra
work. And unlike compiling, linking can't be parallelized, since it has to
produce one binary. At one point the webserver for Google Search grew big
enough that the linker started running out of RAM on build machines, and then
we had a big problem.

There's still no substitute for good code hygiene and knowing exactly what
you're using and what additional bloat you're bringing in when you add a
library.

~~~
fixermark
That's a pretty significant special case though. I'd be willing to go with the
advice "If you get as big as Google's codebase, be sure to trim the
dependencies on your statically-bound languages too." But you probably have a
ways to go before that's an engineering concern for your project.

(... note that one could make a similar argument for more runtime-dynamic
languages. I won't disagree, other than to observe that as a lone engineer,
I've managed to code myself into a corner with dependencies in Rails ;) ).

~~~
fleitz
^ THIS. I wish I had more upvotes.

The amount of time I've seen wasted trying to scale to Google's size is
insane. People should worry about what Google does when they work for at
least a billion-dollar company.

For most projects, import as many dependencies as you can; you're getting
free labour. Sure, once in a while you'll fuck something up and waste a week
or two, but that pales in comparison to the months you didn't spend
reinventing the wheel.

No one ever seems to notice that it's the companies with boatloads of cash
that have massive technical debt. Even with the example at Google, the first
thing I'd try is jamming more memory into those machines, and keep going
until the linker needs more than 256 GB.

Fuck, Facebook still uses PHP, the stock market doesn't seem to care.

------
diggan
As with everything, I think a bit of balance is needed.

You're doing a quick MVP to demonstrate that your idea is working? Fuck it,
just throw in dependencies for everything, just care about solving the problem
you're trying to solve and proving/disproving your point.

Once you've verified it, then go and kill your dependencies. But don't do it
just because you want to. If, in the end, the users don't benefit from you
optimizing your dependencies, why do it? (Speaking from a product side rather
than an OSS project used by other projects.)

Not sure KILL ALL DEPENDENCIES is helpful, but I'm not sure that MAKE
EVERYTHING A DEPENDENCY is helpful either so...

~~~
st3v3r
That'd be good advice if those MVPs didn't so often become the actual product
themselves. If industry and management understood that these things were
proofs of concept, and realized that the actual product is going to have to
be rewritten, then I'd agree with you.

------
allendoerfer
To me, this seems more like an argument for optimizing beyond your own stack.
Don't kill your own dependencies.

Your app uses too much memory? Improve a dependency; you have now improved
other people's apps, too.

Your app uses too many dependencies in total? Try to get all your first-level
dependencies to standardize on the best HTTP client. (Which he is partially
doing with his post.)

Dependencies may have problems, but shared problems are better than problems
only you have.

~~~
sargas
I agree 100% with this.

I used to bring in dependencies with the "don't reinvent the wheel" mentality.
Then I realized how much trust I'm giving to the authors of all dependencies I
pull in. Now I tend to do my best to understand the dependencies I bring so I
can improve them if I can.

The only problem I find with this decision is when I make an improvement or
fix a bug in a dependency, and the project is either inactive or the authors
don't give a crap about my work.

~~~
allendoerfer

> The only problem I find with this decision is when I make an improvement or
> fix a bug in a dependency, and the project is either inactive or the
> authors don't give a crap about my work.

True, but I think a temporary fork, which will eventually be merged back in,
is still better than your own code with its own bugs.

------
simonw
Another benefit of minimizing your dependencies is security. The fewer
external packages you are using (especially packages without active,
security-conscious maintainers), the less likely you are to suffer a surprise
vulnerability due to something deep down in your dependency hierarchy.

This goes for client-side JavaScript too. XSS holes are one of the worst web
app vulnerabilities out there and could easily be introduced accidentally by a
simple mistake in a library. And this stuff is incredibly hard to audit these
days thanks to the JavaScript community's cultural trend towards deeply nested
dependencies.

~~~
riffraff
but otoh, if you try to reinvent something instead of using a tried & true
library, you may well just add new bugs.

I.e. I'd 100% use libxml to sanitize XML rather than trying to reimplement
XML parsing myself.

As always, trade-offs.

~~~
fixermark
Yep.

OpenSSL has major security issues encountered on a relatively regular basis.

Do _not_ do your users the disservice of rolling your own SSL implementation.
;)

------
ninjakeyboard
I'm not 100% sure I agree with this as stated. Sure if the functionality is in
core lib, use it but... it depends...

Consider these three statements:

- No code runs faster than no code.
- No code has fewer bugs than no code.
- No code is easier to understand than no code.

For a language like Scala, where there is no JSON processing in the standard
lib, if there is a JSON library that is battle-tested, then by removing my
own JSON code and leaning on that well-tried and tested code for
serialization/deserialization, I've removed a whole bunch of code from my own
library. The whole point of having modules as abstractions is to keep
concerns neatly tucked in their own places and to increase re-use. By
subscribing to the idea that my module should implement all of the
functionality it needs, we're losing the benefits of modularization.

I just went through this exercise myself in a library I maintain - I removed
my own json code and put a library in. I removed a bunch of code and made the
whole thing simpler by leaning on that abstraction.

~~~
falcolas
> ... I've removed a whole bunch of code from my own library

You removed a bunch of code you understood, and added a bunch more code you
don't understand, along with whatever technical debt, edge cases, and
performance issues which are lingering in that library.

Adding a library is never removing code from your project, it's adding code
you don't yet understand to your project. It can still be a net win, but it's
not less code for you to maintain.

~~~
cyphar
> > ... I've removed a whole bunch of code from my own library

> You removed a bunch of code you understood, and added a bunch more code you
> don't understand, along with whatever technical debt, edge cases, and
> performance issues which are lingering in that library.

Not all code you've written is good code. Hell, not all code you've written
you actually understand. Libraries and dependencies make sense in many cases.
Don't write yet another JSON parsing library unless you _really_ need to.

> Adding a library is never removing code from your project, it's adding code
> you don't yet understand to your project. It can still be a net win, but
> it's not less code for you to maintain.

It's referencing code that you don't maintain. If the maintainer is bad, use a
different library.

------
rileymat2
"The mime-types gem recently optimized its memory usage and saved megabytes of
RAM. Literally every Rails app in existence can benefit from this optimization
because Rails depends on the mime-types gem transitively: rails ->
actionmailer -> mail -> mime-types."

It seems like this could also be cast as a major success for "semi" standard
dependencies.

------
laumars
This article would be more accurately written as "prefer the standard library
over 3rd party solutions" since all the examples given still required
dependencies, but ones that are shipped as part of the language runtime (Ruby
in this case).

However, when discussing languages with no standard library, or languages
whose standard library is missing feature _y_, it's quite understandable to
use a 3rd-party battle-tested dependency. In fact, I'd go further and say it
would be advisable to use a respected 3rd-party library when dealing with
code which handles security or other complex concepts with high failure
rates.

~~~
50CNT
Matter of fact, sometimes it's better to use a respectable 3rd-party library.
Requests vs. urllib2 in Python springs to mind, and I'm sure there are more
examples.

------
Animats
Avoid shims.

There are lots of libraries that just put one interface on top of another
interface. They don't do much actual work. Pulling in shims, especially if
they pull in lots of other stuff you're not using, should be avoided.

If the dependency does real work you'd otherwise have to code, then use it.
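
A minimal sketch of the anti-pattern (Python used for illustration here;
`json_dump` is a hypothetical shim, not a real package):

```python
import json

# Hypothetical shim: a "library" whose whole job is renaming an
# existing interface. It does no real work, yet pulling it in adds
# one more dependency to audit and keep up to date.
def json_dump(obj):
    return json.dumps(obj)  # one-line passthrough over the stdlib

# A dependency worth taking does work you'd otherwise write yourself
# (a streaming parser, a validator), not a rename like the above.
```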

~~~
dawnerd
_cough_ Mongoose. Been moving away from it on my projects. While it does
provide a nice interface, it just creates more work down the road.

------
peterwwillis
Perl apps have thousands upon thousands of dependencies. It's intentional -
reused code in CPAN means less downloading, more efficient code, and fewer bugs
as the codebase gets refined. An app that relies mostly on dependencies is
essentially an app with free support by hundreds of developers. That's the
case with CPAN anyway; I don't know how Ruby people do things.

Bugs happen, though. If you see a bug in a dependency, _it is your job to
report it_ at the very least, if not make an attempt to fix it. Without this
community of people helping to improve a common codebase, we'd all be writing
everything from scratch, and progress would move a lot slower.

------
adenadel
This reminds me of this article

[http://www.joelonsoftware.com/articles/fog0000000007.html](http://www.joelonsoftware.com/articles/fog0000000007.html)

Apparently Microsoft's Excel team had even written their own C compiler.

------
nickpsecurity
Obligatory essay from PHK on the effect the author describes:

[http://queue.acm.org/detail.cfm?id=2349257](http://queue.acm.org/detail.cfm?id=2349257)

History continues to repeat itself. Fake reuse and proliferation of
unnecessary bloat are two of those recurring themes. Fight it whenever you
can. The old TCL, LISP, Delphi, REBOL, etc clients and servers were tiny by
modern standards. They still got the job done. Strip out bloat wherever you
can. Also, standardize on one tool for each given job plus hide it behind a
good interface to enable swapping it out if its own interface isn't great.

------
vinceguidry
Gems I use fall into three categories.

A lot of my projects are just wrappers around one main gem. Rails, Nokogiri,
Roo, API wrapper gems. These are 'project gems'. If they give me problems,
I'll re-evaluate the scope of the project and perhaps pick another gem to
orient the project around. Once the project reaches maturity, I'll default to
fixing the problem rather than re-engineering it unless the problems run deep.

Sometimes I'll use gems like Phoner to handle datatypes that are too tricky to
do with regular Ruby. I'll call these 'utility gems'. When I include a utility
gem, generally it has one job and one job only, it's invoked in exactly one
place in the code and gets included in that file. I can generally replace a
utility gem with stdlib Ruby code if I really need to.
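
The "one job, invoked in exactly one place" discipline can be sketched like
this (Python for illustration; `normalize_phone` is a hypothetical,
deliberately naive stand-in for what a gem like Phoner would do):

```python
# The rest of the codebase calls normalize_phone(), never a phone
# library directly, so the utility dependency sits behind one seam.
def normalize_phone(raw: str) -> str:
    # Today this body could delegate to a third-party library; if that
    # library goes away, only this one function needs rewriting.
    digits = "".join(ch for ch in raw if ch.isdigit())
    return "+" + digits

print(normalize_phone("(555) 123-4567"))
```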

I also have what I call 'infrastructure gems'. These are gems like pry,
capistrano, and thor that I tend to include in every project where it seems
they would be useful. These are gems that are worth getting to know very well
because they solve really hard problems that you don't want to use stdlib for.
If these give me problems I will do whatever I need to to resolve them and
understand why the problem exists, because the costs of migrating off of them
would be _steep_.

The decision to use a gem should not be taken too lightly, but nor should it
weigh large on the mind. Be quick to try it out, but also quick to take it
out.

------
EGreg
I was just thinking about this today, but from the point of view of growing a
community around a platform!

Would you want to have one namespace for "official" modules and heavily
influence everyone to use them? That's centralization (of governance). But,
it's not centralization of a process that requires high availability. So the
"drawback" is only that you centralize control and can make certain guarantees
to developers on your platform.

When you're starting an ecosystem, you can choose a "main namespace" as yum,
npm etc. do, or you can choose the more "egalitarian" convention of
"Vendor/product" as GitHub and Composer do. I think, in the end, the latter
leads to a lot more proliferation of crap and, as the article said, multiple
versions of everything existing side-by-side.

I have to deal with these issues when designing our company's platform
([http://qbix.com/platform](http://qbix.com/platform)) and I think that having
a central namespace is good. The platform installer etc. will make it super
easy to download and install the "official" plugins. You can distribute your
own "custom" plugins but the preferred way to contribute to the community
would be to check what's already there first and respect that. If you REALLY
want to make an alternative to something, make it good enough that the
community's admins protecting the namespace will allow it into the namespace.
Otherwise, promote it yourself, or fork the whole platform.

------
BinaryIdiot
This is a great read that can be applied to node.js very much. I've seen apps
that include 10, maybe 20 dependencies but when you flatten out the full
dependency tree? Thousands. It's incredible and if one of those dependencies
screws up semantic versioning or just screws up in general it can be a
nightmare to debug and fix.

This is why on every 1.0 product I work on I include every dependency that
speeds up my development. In 2.0 the first thing to do is prune all
unnecessary dependencies and start minor rewrites where a dependency can be
done in-house (yeah, yeah, reinventing the wheel is a problem, but most npm
dependencies are small and many can be recreated internally without issue).

This is even more important if you're creating a library / module. My msngr.js
library uses zero dependencies and yet can make http calls in node and the
browser because it was easy to implement the minimal solution I needed without
bringing in dependencies to support a single way of calling http.

~~~
alexose
The worst offender, IMO, is request. I've seen more than a few projects pull
it in just to make a single HTTP call. Just look at its package.json:

[https://github.com/request/request/blob/master/package.json](https://github.com/request/request/blob/master/package.json)

~~~
jbergknoff
Yes, the npm "request" module is extremely bloated and badly written, to boot.
[https://www.npmjs.com/package/needle](https://www.npmjs.com/package/needle)
is a good alternative.

~~~
voltagex_
Can you elaborate? How do I know Needle is any better?

------
yoz-y
> No code runs faster than no code.
> No code has fewer bugs than no code.
> No code uses less memory than no code.
> No code is easier to understand than no code.

The dependencies you decide to implement yourself in a minimal fashion are
code though. I generally agree with the article, but in the end It Depends™

~~~
bpicolo
And they are generally worse tested and worse supported, and you have to
maintain them on your own.

------
dec0dedab0de
It sounds like this is advocating NIH syndrome. If a library is going to make
my job easier I'm going to use it, unless there is a very specific benefit of
doing it myself.

------
justinator
Perl takes a pragmatic approach to this (among other approaches...) with the
collection of ::Tiny CPAN modules that just do one thing pretty OK. Things
like Try::Tiny help immensely with exception handling - something you don't
really want to roll your own.

It itself does not have any dependencies that aren't in core:

[http://deps.cpantesters.org/?module=Try%3A%3ATiny;perl=latest](http://deps.cpantesters.org/?module=Try%3A%3ATiny;perl=latest)

------
ocdtrekkie
So, I've been writing a home automation system using the .NET Framework (with
Visual Basic, I'll wait until you finish laughing).......... Okay.

I've made a point not to add any third-party references and packages I can
avoid. I went ahead and got a third-party scheduling engine and the SQLite
provider, but beyond that, I'm writing everything else myself so far.

First of all, I'm learning a lot in having to write stuff myself. At the very
least, it's a great educational experience. I've worked with a lot of code
samples, so I'm not going totally from scratch, but they're all at the very
least tailored to my needs.

But for me, the big thing is keeping everything thin. The program loads in
milliseconds. Almost all of the reference data for what it's built on is in
one place (the .NET Framework Reference). And the key is that the features my
program supports are the features I want and need, not the features some
dependency has told me to have.

The biggest dependency I have, Quartz.NET, is actually the most confusing
part. It's not structured like the rest of my program is, its documentation
leaves some things to be desired, and it does a lot more than I need it to.
There's a lot of bloat I could cut out if I wrote my own scheduler, and maybe
someday I will.

------
cdnsteve
Double-edged sword. Deps are _great_! Functionality added quickly. Deps are
_terrible_! They broke my app.

If your app has a long shelf life, the fewer deps you rely on, the easier it
is to manage, from what I've seen.

For some reason Golang _feels_ like it makes sense here. Pretty much
everything you need is in core. *Disclaimer, I don't have any Golang apps in
prod but I'd love to hear from those that do.

------
pcwalton
A lot of apps (old-timey Windows apps, for example) have this philosophy,
leading them to reinvent things like crypto and image decoding. Naturally,
this leads to tons of bugs, including security bugs.

I would revise this to: Don't bring in more code than you need. But if the
choice is between writing something yourself and using someone else's well-
tested, heavily-used library, always go for the latter.

~~~
ktRolster
As an architect, you need to be able to do a cost/benefit analysis of each
option. That is what software architects do, and it's why experience matters. For
example:

    
    
      How much time will it take to implement each option?
      How much time will it take in the future to support it?
      What security risk does each option incur?
      What is the risk of the project being abandoned?
      What is the risk of the project changing in non-backwards compatible ways?
      What are the performance characteristics of each option?
    

NIH is a disease, but so is import-mania. With experience, you can make a good
decision.

~~~
skewart
One thing I've gotten into the habit of doing is looking around the commit
history and issue list for any package I import. Was it something somebody
wrote in a hurry and hasn't really touched since? Is it something that has a
solid set of regular contributors? Are there a lot of outstanding issues
relative to how heavily used it is?

I also spend more time actually reading through specs to see how well they
exercise the code.

That's probably standard procedure for a lot of people, but it's something
that I had to learn to always do.

~~~
ktRolster
That's a good idea

------
gtrubetskoy
Your programs shouldn't do things you do not understand. You do not have to be
an _expert_ in cryptography, memory allocation or b-trees, etc, but if this is
what your app requires, then you should take the time to read up on it and
carefully research what is out there if you suspect it is beyond your
abilities to implement.

If you take the time to do your research, the choice between rolling your own,
copying or adding a dependency will become clear. If it's not becoming clear,
then you haven't finished your homework. Learning is a good thing, yes it
takes time, but it's time well spent, and it's _fun_ above all.

You may discover that this thing that you thought was hard and needed a
dependency is really a few lines of code (a good example is a graph
implementation). It might even change your career path. At least that's been
my experience in the nearly two decades of writing software.
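
As an illustration of how little code such a "dependency" can be, a minimal
adjacency-list graph with breadth-first traversal fits in about a dozen lines
(a sketch, not a replacement for a full graph library):

```python
from collections import deque

# graph: dict mapping each node to a list of its neighbors.
def bfs_order(graph, start):
    """Return nodes reachable from start, in breadth-first order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order

g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
```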

------
fixermark
Incidentally, though the author says this can apply to any ecosystem, finding
it applies to Ruby too often is what pushed me out of developing Rails apps.
At least at the time I was using it heavily, the Rails space just wasn't
stable enough to trust that I wouldn't have to learn an entirely new wheel to
get my work done every time I went in to fix a relatively small problem.

"Can I implement the required minimal functionality myself? Own it" is advice
one gives if one can't trust the libraries one depends upon to stay healthy,
performant, and applicable to your use-case. Nobody'd recommend
re-implementing readline or printf; if you have some heavy-lifting
mathematics to do in Python, use numpy.

------
EternalFury
How long did it take anyone to realize this nightmare? Since we are on the
path to major discoveries, let's talk about runtime, runtime-dependencies and
all that. Every Ruby app ever created is stuck somewhere on the time axis,
before its origin.

------
jeffdavis
Your app/library inherits the technical debt of all its dependencies.

There's a natural tension between code reuse and avoiding dependencies. If you
can avoid a big dependency by writing a couple hundred lines of
low-maintenance code, it's probably worth it.

------
agentgt
I find this argument sort of related to the framework-vs-library and
opinionated-vs-agnostic debates.

Being an old-fart Java developer, I generally prefer things where you can
plug in your own implementation (i.e. agnostic).

That is, killing your dependencies can go to either extreme: copy'n'paste
everything, or have every library offer a plugin SPI (i.e. inversion of
control), or a combination of both.

The problem with the dependency-injection approach above (aka Spring prior to
Boot) is that you get developers doing lots of custom crap, bloated/overly
engineered libraries, increased ramp-up time, and configuration hell.

But I still think this is probably better than ole copy'n paste.. most of the
time. I do hate dependencies though.

~~~
mwcampbell
You're probably more experienced in Java development than I am. But maybe a good
rule of thumb is that most Java libraries shouldn't use reflection or
dynamically loaded classes (i.e. using Class.forName or ClassLoader). That
rules out most dependency injection frameworks.

~~~
agentgt
That is correct that libraries should not have DI, but I should be able to
wire up the library on my own and not let the library do its own static
initialization. What is far worse than Class.forName and other crap is
libraries' self-imposed singletons.

Take Hystrix, for example. I'm just now fixing the fact that it loads up its
own configuration framework (Archaius), which uses static initialization.
Archaius needed like 10 other dependencies. This is all really because
Hystrix uses a static singleton (HystrixPlugins), and many frameworks need
this or else it is incredibly difficult to get an implementation up (i.e.
using a pseudo-singleton to avoid excessive passing of a context).

[https://github.com/Netflix/Hystrix/pull/1083](https://github.com/Netflix/Hystrix/pull/1083)
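
The wire-it-up-yourself alternative to a static singleton can be sketched
like this (Python for illustration; the class names are hypothetical, not
Hystrix's actual API):

```python
# Injection style: the caller constructs the configuration and hands it
# in, so the library needs no static initialization and drags in no
# configuration framework of its own.
class CircuitBreakerConfig:
    def __init__(self, timeout_ms: int, max_failures: int):
        self.timeout_ms = timeout_ms
        self.max_failures = max_failures

class CircuitBreaker:
    def __init__(self, config: CircuitBreakerConfig):
        self.config = config  # explicit dependency, no global lookup

# The singleton style would instead have CircuitBreaker reach into a
# process-wide plugin registry during class initialization.
breaker = CircuitBreaker(CircuitBreakerConfig(timeout_ms=500, max_failures=3))
```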

------
giancarlostoro
One issue I had with Ruby on Rails was getting MySQL drivers to even cooperate
on both Windows and Linux. At the end of the day I wound up sticking to Python
and other languages instead. I don't mind using any language, but if the
language is fighting me due to native dependency hell then I can't really do
much. Even to just use SQLite was a bit of a painful experience, yet on Python
SQLite works out of the box without any effort on my part (on Windows). Oddly
enough.

I'm looking to getting back into Ruby at some point later this year, but I
might ignore Rails altogether so I don't miss out on learning a new fun
language.

------
jcoffland
This is not only an issue for running software but also a huge issue for
compiling/building software. Each dependency adds the potential to break your
builds in new ways. As the software your program depends on evolves the risk
increases that it will change the way your program executes or cause it to
fail to build. Many devs will insist you do not reinvent the wheel by writing
things like JSON parsers but you always have to weigh the cost of adding a
dependency. It's not free.

------
greggman
This reminds me of an example I ran into yesterday. I haven't used webpack
yet, but I saw a question on SO from someone wanting to use a package called
glslify. I thought I'd take a look and maybe learn webpack in the process.

From the description all glslify does is look for files with the extensions
.glsl, .frag, and .vert and lets you get their contents with `content =
require(filename)`.

Sounds like it would be at most 10-30 lines of code. Nope

    
    
        npm install --save glslify-loader
        webpack-glsl-test@1.0.0 /Users/gregg/temp/webpack-glsl-test
        └─┬ glslify-loader@1.0.2 
          └─┬ glslify@2.3.1 
            ├─┬ bl@0.9.5 
            │ └─┬ readable-stream@1.0.33 
            │   ├── core-util-is@1.0.2 
            │   ├── isarray@0.0.1 
            │   └── string_decoder@0.10.31 
            ├─┬ glsl-resolve@0.0.1 
            │ ├── resolve@0.6.3 
            │ └── xtend@2.2.0 
            ├─┬ glslify-bundle@2.0.4 
            │ ├─┬ glsl-inject-defines@1.0.3 
            │ │ └── glsl-token-inject-block@1.0.0 
            │ ├── glsl-token-defines@1.0.0 
            │ ├── glsl-token-depth@1.1.2 
            │ ├─┬ glsl-token-descope@1.0.2 
            │ │ ├── glsl-token-assignments@2.0.1 
            │ │ └── glsl-token-properties@1.0.1 
            │ ├── glsl-token-scope@1.1.2 
            │ ├── glsl-token-string@1.0.1 
            │ └── glsl-tokenizer@2.0.2 
            ├─┬ glslify-deps@1.2.5 
            │ ├── events@1.1.0 
            │ ├─┬ findup@0.1.5 
            │ │ ├── colors@0.6.2 
            │ │ └── commander@2.1.0 
            │ ├── graceful-fs@4.1.3 
            │ ├── inherits@2.0.1 
            │ └─┬ map-limit@0.0.1 
            │   └─┬ once@1.3.3 
            │     └── wrappy@1.0.1 
            ├── minimist@1.2.0 
            ├── resolve@1.1.7 
            ├─┬ static-module@1.3.0 
            │ ├─┬ concat-stream@1.4.10 
            │ │ ├── readable-stream@1.1.13 
            │ │ └── typedarray@0.0.6 
            │ ├─┬ duplexer2@0.0.2 
            │ │ └── readable-stream@1.1.13 
            │ ├─┬ escodegen@1.3.3 
            │ │ ├── esprima@1.1.1 
            │ │ ├── estraverse@1.5.1 
            │ │ ├── esutils@1.0.0 
            │ │ └─┬ source-map@0.1.43 
            │ │   └── amdefine@1.0.0 
            │ ├─┬ falafel@1.2.0 
            │ │ ├── acorn@1.2.2 
            │ │ ├── foreach@2.0.5 
            │ │ └── object-keys@1.0.9 
            │ ├─┬ has@1.0.1 
            │ │ └── function-bind@1.0.2 
            │ ├── object-inspect@0.4.0 
            │ ├─┬ quote-stream@0.0.0 
            │ │ ├── minimist@0.0.8 
            │ │ └─┬ through2@0.4.2 
            │ │   └─┬ xtend@2.1.2 
            │ │     └── object-keys@0.4.0 
            │ ├── shallow-copy@0.0.1 
            │ ├─┬ static-eval@0.2.4 
            │ │ └─┬ escodegen@0.0.28 
            │ │   ├── esprima@1.0.4 
            │ │   └── estraverse@1.3.2 
            │ └─┬ through2@0.4.2 
            │   └─┬ xtend@2.1.2 
            │     └── object-keys@0.4.0 
            ├── through2@0.6.5 
            └── xtend@4.0.1 
    

> 4 meg of source files

---

update: I think maybe I misunderstood the description. glslify actually
parses GLSL and re-writes it in various ways, so maybe this is a bad example.

I've seen others, though. Like 40k+ lines of deps for an ANSI color library,
or 200k+ lines of deps and native node plugins for launching a browser from
node.

~~~
davexunit
NodeJS developers ought to be embarrassed at how absurdly huge their
dependency trees are.

~~~
geodel
I think they are a proud bunch.

------
aidenn0
Slightly related, I was packaging up a webapp in docker, and one thing the
application did was form-fill PDFs. I had been using pdftk to do this, but it
turns out pdftk is written in gcj, and gcj pulls in a _lot_ for its runtime
libraries. I wrote a small program using the mupdf libraries and cut the size
of my docker image by over 400MB.

~~~
EvanPlaice
Congrats, you just violated the GPL. You were already violating the
proprietary license if you were using it for commercial purposes without
paying.

Not all free software is free.

~~~
aidenn0
I just saw the commercial redistribution clause for pdftk, if that's what
you're talking about. It does not affect me, since this was not a commercial
application, but it would seem to me that that clause is itself a violation of
the GPL, since pdftk links to GPL software. It also contradicts a separate
place on the site that claims the same software is licensed under the GPLv2.
Preventing commercial redistribution is not compatible with the GPL.

------
Walkman
There is a counterexample to his reasoning in the Python world. There is an
HTTP client library, "urllib", in the standard library, but nowadays everyone
rather pulls in the external dependency "requests", because the urllib API is
terrible. urllib is mature, well-tested, well-documented code though.
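The API gap the parent describes is easy to see in a short sketch. The URL and token below are placeholders, and the request is only constructed, never sent:

```python
import json
import urllib.request

# An authenticated JSON POST with the stdlib takes several manual steps.
payload = json.dumps({"name": "example"}).encode("utf-8")
req = urllib.request.Request(
    "https://api.example.com/widgets",   # hypothetical endpoint
    data=payload,
    headers={"Content-Type": "application/json",
             "Authorization": "Bearer TOKEN"},
    method="POST",
)
# resp = urllib.request.urlopen(req)          # actually send it
# body = json.loads(resp.read().decode())     # decode the reply by hand

# With the third-party requests library the whole thing is one line:
#   requests.post(url, json={"name": "example"},
#                 headers={"Authorization": "Bearer TOKEN"}).json()
print(req.get_method())  # → POST
```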

~~~
_ZeD_
just as a data point: _I_ don't use requests, I honestly prefer the urllib
(formerly urllib2) API offered by the standard library

------
tarr11
Looks like the mime-types upgrade has some sort of hard dependency on Rails 5?
I seem to be stuck on 2.99

~~~
Rafert
It's the other way around, actually: ActionMailer 4.2 depends on mail ~> 2.5,
>= 2.5.4, so you'll get 2.6.3 now. That version depends on mime-types < 3, >=
1.16, so you'll get 2.99.

The big change in mime-types 3 is using the columnar store by default, which
is where the memory savings come from. It's opt-in from mime-types 2.6 onwards
because it's a breaking change. Mail and afaik most other gems have opted in
already.

------
Rafert
Excellent article. I tried to develop on GitLab once, but the sheer number of
gems it pulls in (~100 directly declared, 350+ including dependencies, if I
remember correctly), combined with a bunch of installation problems, made me
decide it was not worth the hassle.

~~~
sytse
I'm sorry to hear you had installation problems trying to develop for GitLab.
Have you tried the GitLab development kit [https://gitlab.com/gitlab-
org/gitlab-development-kit](https://gitlab.com/gitlab-org/gitlab-development-
kit)? If you are still interested and experience problems, please email
support@gitlab.com and reference this comment for help.

I agree with the article: the fewer dependencies, the better. GitLab's
gemfile.lock [https://gitlab.com/gitlab-org/gitlab-
ce/blob/master/Gemfile....](https://gitlab.com/gitlab-org/gitlab-
ce/blob/master/Gemfile.lock) has over 1000 lines and GitLab uses a lot of
memory. We try to be careful about what we pull in, but if anyone has
suggestions for dependencies which can be removed, please let us know.
Recently we found that we could still remove RedCloth as a dependency; it
will be gone in GitLab 8.5.

------
technoir
If there is an analogue in the standard library you should have a compelling
reason to use an alternative. Wish this could be filed under "common" sense.
Thanks for articulating and presenting this principle, among others. Great
writeup.

------
surfmike
if you're running multiple rails processes on a server like this, couldn't you
somehow do the initialization in one process, then fork off the new processes?
wouldn't that prevent the base libraries from being copied in memory?

~~~
paulannesley
Yes, [http://unicorn.bogomips.org/](http://unicorn.bogomips.org/) popularized
this for ruby / rack / rails with its forking model and preload_app option.
[http://puma.io/](http://puma.io/) does the same thing, but additionally runs
multiple threads in each process.

The garbage collector in Ruby 1.8 / 1.9 negated the benefits of copy-on-write
forking, but that has been fixed since Ruby 2.0.
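The preload-then-fork pattern unicorn popularized can be sketched in a few lines. Python's `os.fork` is used here purely for illustration (POSIX-only), with a big list standing in for the preloaded application:

```python
import os

# Load the expensive, shared state ONCE in the master process.
PRELOADED = {"routes": [f"/path/{i}" for i in range(10_000)]}

def worker():
    # A forked child sees the master's memory for free; the pages stay
    # shared copy-on-write until either process writes to them.
    ok = len(PRELOADED["routes"]) == 10_000
    os._exit(0 if ok else 1)

codes = []
for _ in range(2):
    pid = os.fork()
    if pid == 0:                            # child process
        worker()
    _, status = os.waitpid(pid, 0)          # parent waits for the child
    codes.append(os.WEXITSTATUS(status))

print("worker exit codes:", codes)  # → worker exit codes: [0, 0]
```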

------
spullara
This is a huge mistake if applied without care. Building things from scratch
necessarily will introduce more bugs, more maintenance costs and leave you
with a codebase that suffers from a lack of maturity.

~~~
Chris_Newton
_Building things from scratch necessarily will introduce more bugs, more
maintenance costs and leave you with a codebase that suffers from a lack of
maturity._

Unfortunately, in some programming language ecosystems, where having many
small and transitive dependencies on modules from a non-curated repository is
common, none of those three things is necessarily true.

Code reuse is not a trivial problem, and you always have to weigh the benefits
against the costs and risks to decide whether it’s worth it. If we’re
depending on GitHub repositories with a dozen files and three subdirectories
just to provide some simple functionality that any junior programmer could
implement directly in five lines of code, we’ve probably lost the plot. On the
other hand, if we have a full in-house implementation of encryption algorithms
we use to throw sensitive customer data around between the browser and our
servers, we’ve also probably lost the plot.

------
3minus1
I remember one nasty bug where someone had included one function from the
bootstrap.js library and someone else had included the entire library, so both
copies of the function were running, causing an issue.

------
mschuster91
This is not just true for Ruby but also for the entire npm ecosystem.

I wonder how much traffic could be saved by optimizing npm packages...
probably on terabyte scale at github alone, methinks.

------
gravypod
Yes, this is the sort of thing that scares me away from Ruby.

I'm worried this sort of "screw it just add a library" is going to spread
further in my language of choice: Java.

In my time doing open source programming on the side, I've found that it has
become more common with the advent of things like mvn and gradle to just
slather on layers to your stack even for the simple tasks.

Need a function to turn a byte buffer into a string? Download these 3 Apache
commons libraries and their dependencies.

I understand if you are relying on a large portion of a library and you need
to use it, but why bring an entire library in for one function?
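For the byte-buffer case the standard library is typically already enough. A Python illustration (the Java equivalent, `new String(bytes, StandardCharsets.UTF_8)`, is likewise built in):

```python
# Decoding a byte buffer into a string needs no third-party dependency:
buf = b"\xe4\xbe\x9d\xe5\xad\x98 dependency"  # UTF-8 bytes
s = buf.decode("utf-8")
print(s)  # → 依存 dependency
```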

~~~
mbrock
There are ideas floating around that make it appealing to do just that. For
example, the commons library might be considered "battle-tested," and who
really knows what could happen with your own custom byte-buffer-to-string
function? Maybe you missed something? Maybe there is some "best practice" that
you didn't follow? Maybe the commons library is optimized? And writing your
own thing doesn't add business value. Developer time is more expensive than
dependencies. And so on.

Me, I very often prefer to write things myself, in a way that can get labelled
as NIH. My inclination is based on bad experiences with trying to debug
external libraries. Sometimes I look at open source library code and find
staggering complexity that I have no need for. Yes, maybe the library is
great, but if its combinatorial size is 10,000 times the functionality that we
need, then depending on its correctness becomes scary to me. And when I need
to customize it, due to some requirements alteration, I will find it difficult
and tedious.

Black-box type libraries for isolated complicated tasks like codecs and crypto
I will happily use.

Otherwise, I'm a fan of the "design patterns" approach to reuse, which is all
about learning from others, but without creating reusable formal abstractions
in library form. So if you teach me how to write a URL router, I can then use
your insights without depending on your code base, and I can adapt the idea so
it fits my application perfectly.
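As a sketch of that idea, here is a minimal URL router in Python, roughly what "learning the insight without depending on the code base" might look like; all names are illustrative:

```python
import re

class Router:
    """Tiny router: "/users/<id>" patterns become regexes with named groups."""

    def __init__(self):
        self.routes = []

    def add(self, pattern, handler):
        # "<name>" segments turn into named capture groups.
        regex = re.sub(r"<(\w+)>", r"(?P<\1>[^/]+)", pattern)
        self.routes.append((re.compile(f"^{regex}$"), handler))

    def dispatch(self, path):
        for regex, handler in self.routes:
            m = regex.match(path)
            if m:
                return handler(**m.groupdict())
        raise LookupError(f"no route for {path}")

router = Router()
router.add("/users/<id>", lambda id: f"user {id}")
print(router.dispatch("/users/42"))  # → user 42
```

Because the insight rather than the library is reused, the pattern syntax, the matching rules, and the error handling can all be bent to fit the application.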

~~~
lmm
Urgh. I find design patterns the worst approach. It's just copy-paste at a
slightly higher level. If you really understand a pattern, you should be able
to express it formally - i.e. as code.

~~~
mbrock
Richard P. Gabriel's book _Patterns of Software_ has a chapter about that.
(It's free and out of print.)

Bluntly, it's kind of like: if you really understand a style of house
building, you should be able to deliver it as a prefab. Maybe true in some
way, but also neglects the drawbacks of standardized components.

------
cjhveal
Interesting to note that the Stripe gem removed one of its dependencies
seemingly in reaction to being called out at the end of this article.

------
brightball
This should be one of the perks of Go since the compiler won't let you include
anything that you aren't using.

~~~
jerf
That's an orthogonal issue. You won't accidentally bring in something
completely unrelated because you forgot to remove it from the "include"s, but
nothing technically stops you from having a deep dependency chain. Culturally
the Go community is aware of the issue, though.

Still, it isn't hard to bring in a chain accidentally. I have a program that
needs to do a query against the local LDAP system to extract members of a
specified group. The LDAP library brings in five more libraries for parsing
all the various bits of LDAP. Since this isn't C, I'm a bit less nervous about
pulling in, say, a BER decoding library, because at least Go is generally
memory-safe, but, still, that's a somewhat large stack for such a simple
query. (Traditionally in C, you might as well just expect any library that
decodes anything remotely binary-esque will have buffer overflows. C is a DSL
for writing buffer overflows.)

And yet, I'd be insane to try to implement some sort of just-barely-minimal
LDAP client to do it myself.

Looking at my local godoc instance's full set of packages that have gotten
pulled in one way or another is still sort of intimidating. Some of them are
cases where I'm just pulling in a subdir and got an entire large repo (the
golang experimental repos do that a lot), but, still, I've got a lot of stuff
in there. If you're a go programmer and you haven't run godoc locally and had
a look at the packages page, have a look. You may be surprised.

------
twic
_Does any of this sound familiar:_

 _- Test gems loading in production._

That does not sound familiar! Is this a thing which happens with Rails?

~~~
nona
No, but sometimes people are a bit careless in their Gemfile I guess.

------
liveoneggs
mojolicious does a great job with this, supporting optional dependencies as
progressive enhancement (installing EV will speed you up, but you don't
necessarily need it)

[http://mojolicious.org/](http://mojolicious.org/)

------
beat
I have one thing to say about all of this...

 _Nokogiri_.

------
PaulHoule
They should sell Sonatype for this.

------
justaaron
amen

------
jowiar
I find dependencies to be a very good indicator for how my code should be
modularized. That is, rather than pulling a boatload of dependencies into "the
application", pull a couple dependencies into a module, and then depend on the
module. It makes it very easy for dependencies to be a "well, it gets the job
done for now, and I can reimplement that myself if that changes" sort of
thing.

~~~
edvinbesic
I found this approach to work well for me as well. It has paid off many
times. I try to wrap most of my dependencies so that if I later feel like I
need to pick some low-hanging fruit, I can implement some of the functionality
internally while maintaining the original API.
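A minimal sketch of that wrapping approach, with the stdlib `json` module standing in for a third-party dependency (the module layout and function names are made up):

```python
# serializer.py would be the only file that knows which library does the
# work. Here the stdlib json module plays the role of a third-party gem;
# if it were ever swapped out, only these two functions would change while
# the rest of the codebase keeps calling dump()/load().
import json

def dump(obj) -> str:
    return json.dumps(obj, sort_keys=True)

def load(text: str):
    return json.loads(text)

# Application code depends on the wrapper, never on json directly.
print(load(dump({"b": 2, "a": 1})))  # → {'a': 1, 'b': 2}
```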

~~~
jowiar
Also, this is probably the single biggest difference for me when working with
a statically vs. dynamically typed language. With static typing, there's a
"translate the dependency's types into the application's types" step that
pretty much screams "put a seam here!". With dynamic typing, it's a bit less
loud.

~~~
Aeolos
That's pretty much a C/C++ problem though. I don't think I've ever had to
translate types in C# or F#.

~~~
jowiar
I was referring to the fact that, with Scala, I like to avoid leaking the
innards of a JSON serialization library into the code for an API client,
instead returning domain objects. Whereas with JS or Python, the initial
approach is to sling around the blob of JSON.

------
joesmo
So to summarize: to get rid of an extra 10 (or even 100) megs of memory usage
in Ruby (or 0 megs in some other languages) (and disk usage, don't forget!),
spend weeks rewriting, testing, and integrating your own code instead of using
already written, tested, and integrated code. Now that I've clarified the
article's point, how can anyone not follow this "best practices" advice?
</sarcasm>

------
Eric_WVGG
“Oh, I thought you said ‘dependents’” — Abraham

------
sksixk
what's the point? none of them are realistic.

"no code" - well, it's there for a reason. "own it" - do i really want to
write my own minimal implementation?

i understand that dependency is a pita but this post doesn't provide anything
worthwhile.

------
edejong
The problem with dependencies is that developers approach them top-down. The
question they answer is: I need an HTTP client, a JSON API, a monitoring tool,
a logging framework.

Never do they ask the opposite: what kind of foundations do I need? What
elementary blocks do I need to have or learn in order to make a JSON parser in
5 lines of code? Is it possible to do logging without all the cruft? Can I
write the library in the same amount of time as I can read the docs? Could the
code I write be the docs?

Similar line of reasoning: can I leverage my OS to do
scheduling/IPC/monitoring/security? If it can't, should we lobby for better
OSes (that might scale over multiple machines?) Does Linux/Docker offer the
right fundamentals?

Dijkstra was truly right: the art of programming is the art of managing
complexity.

