
Surviving Software Dependencies - kick
https://queue.acm.org/detail.cfm?id=3344149
======
wwarner
Good read, I like that he expanded on the assumptions we make when we include
a dependency. However, I think his treatment of dependency upgrades is a bit
confused. Of course Equifax should have upgraded their Struts dependency, but
that is not the result of relying on software reuse, as security
vulnerabilities are found in all code. If they had built their own framework,
the team that produced it could just as easily have published a security patch
that was ignored. So to me the point is that important upgrades, whether
proprietary or FOSS, should be made as backward compatible as possible.
Fortunately, they usually are.

A related issue that wasn't mentioned in the article is the problem of forced
upgrades. Dependencies can arbitrarily introduce incompatibilities that don't
align with your own priorities, so that you end up spending a lot of time
keeping current with package releases that your users really don't care about.
Publicly facing services with a broad attack surface should choose their
dependencies carefully, as they'll be forced to upgrade often. For services
behind a secure firewall, upgrades are less urgent.

~~~
zaphar
I think you missed his point on the Equifax story. I think he was trying to
illustrate that their reliance on the Struts dependency came with a
responsibility both to know where that dependency was being used and to
monitor that usage to ensure that they weren't left vulnerable by it.

It's entirely orthogonal to the responsibility of publishers of the
dependency, which you address in your comment. Both are important. But Russ's audience
wasn't publishers. He was talking to the consumers and what they need to
understand when they take on a dependency.

------
donaldihunter
Back in the day we had curated libraries. E.g. Roguewave, Boost, etc. You
chose your libraries relevant to domain needs and stuck with them. Adding a
new library to the project was a big deal.

This may still be true in C & C++ land, and to a small extent in the Java world
with the Spring libraries. But Node, Python, et al. seem to have zero curation
and an explosion of transitive dependencies. By making it easy to reference
arbitrary dependencies, projects haul in random unscrutinised dependencies.

~~~
stevekemp
Off-topic, but reading "Roguewave" gave me a sudden flashback to an old
project.

Genuinely a name I've not heard for 20+ years. Thanks for the memories!

~~~
scriptdevil
Their C++ stdlib docs used to be best in class around 2008... Maybe even later

------
austincheney
I just spent two weeks trying to figure out code at work that required 8 dev
dependencies from npm locked down to certain version numbers not specified in
the package.json.

That ends up being hundreds of total dependencies to run two build steps and 1
http service. What a waste of time and code.

How did we get so addicted to shitty outside code? Had I not been
completely new to the team I could have written a better solution in about an
hour directly with Node.

~~~
vbezhenar
You need one function so you add a library. But that library contains dozens of
functions and uses other libraries to implement those functions. So now you
have all those dependencies even if you don't use them.

~~~
austincheney
Do you really need that one external function though? At what point is it
better to simply write it yourself?

~~~
juangacovas
It's a matter of balance, always. A thin line. And dogmas that people on teams
follow (we all do in some way).

Not invented here syndrome, always create a class, always put a class in its
own file, premature optimization blablah, don't use goto, this and that is bad
practice in that language, don't optimize until you need it (but you later
don't have time to profile, or don't really want to...)

~~~
majewsky
Never-invent-here syndrome is just as bad as not-invented-here syndrome,
though.

~~~
armitron
No, never-invent-here (aka extensive 3rd party library reuse) is orders of
magnitude worse. NIH means you retain full control of your codebase. It's
really determinism that leads to you being able to make
reliability/robustness, availability and security guarantees.

Never-invent-here is a disaster waiting to happen.

~~~
lonelappde
External code is better than internal code because you can reduce it to
internal code just by forking it. If it is even minusculely useful, that's a
win.

------
pknopf
This reminds me of a recent post I made, urging people to not use
EntityFramework.

[https://pknopf.com/post/2019-09-22-the-argument-against-enti...](https://pknopf.com/post/2019-09-22-the-argument-against-entity-framework-and-for-micro-orms/)

It's almost never worth it.

------
tomohawk
This is a great write-up, but it seems to be missing a key ingredient in
deciding to use a dependency. That is, are you willing to become a contributor
to the project, and/or are the maintainers open to your participation?
Example: You may really want to use a dependency, but test coverage is
lacking. Are you willing to contribute the test coverage?

The open source model is about building an alliance with others so that you do
not have to create the whole tech stack yourself. It's the only way to compete
with a monopoly and win.

~~~
fauigerzigerk
That's certainly important for some users for some dependencies. But most
users cannot and should not contribute to most of their dependencies.

The overhead of reviewing all those contributions would be enormous, and the
effort required to make a quality contribution that doesn't waste other
people's time would be impossible to justify in most cases.

~~~
marcus_holmes
Also, getting involved in a project means dealing with its politics and
drama. And the entitlement, criticism and blame from users.

and in a lot of cases, it would be less effort to write the functionality you
actually need (and the tests for it) than writing the tests for an entire OS
project (and then dealing with all the other crap).

~~~
lonelappde
People say this and then you find out it means that they brag about reducing
their dependencies by 1%. How big is the code base for your _entire_ system?
We're not shipping 128kB cartridges anymore.

------
swiley
The way people carelessly added dependencies to their python projects kept me
away from the language for way longer than it should have.

It’s another thing you have to be aware of when you’re programming just like
good naming and problem decomposition.

~~~
commandlinefan
Not just Python. For some reason, Java developers all got together without
consulting me and decided that

    
    
        import org.apache.commons.lang.StringUtils;
    
        if (!StringUtils.isEmpty(s))
    

is better than

    
    
        if (s != null && s.length() != 0)
    

bonus points for:

    
    
        if (s != null && !StringUtils.isEmpty(s))

~~~
Viliam1234
OK, I'll bite. The third option is obviously wrong. The first one should
instead have been "if (StringUtils.isNotEmpty(s))".

But it seems like you prefer to type repetitive code that is more difficult to
read, rather than use a function, if that function is in a library. Did I get
that right? Is that because the repetitive code is relatively short (it gets
longer if the variable has a longer name), or because there is the extra line
required for the import?

Apache Commons is the "things that should have been in Java SE, but are
missing for mysterious reasons" library. I can barely imagine a project that
wouldn't benefit from some of its functionality. (Of course, there is always
the alternative option of reinventing the wheel.)

By the way, with static imports the code can be reduced to "if
(isNotEmpty(s))".

~~~
commandlinefan
Ok, I’ll bite back. Yes, “StringUtils.isEmpty” is materially worse, in every
way. It’s not really any more readable: the first time you see it, you ought
to check to make sure that it is just checking for a null pointer or a
0-length string, and that it’s not doing something else,
like trimming whitespace, which you may not want. You’re adding a few hundred
kilobytes of dependency plus an extra unnecessary function call, just to make
it less clear what your program is doing. If you use static imports, it’s even
less readable because it’s not at all clear where this mystery function came
from: I hope you’re lucky enough to be able to import this code into an IDE
and click-through the function to see what it is and what it does. Maybe if it
was just this one function it wouldn’t be so bad, but this bit of
pointlessness ends up scattered everywhere.
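
For reference, the two behaviours being argued over can be sketched in plain Java. This is only an illustration: the class and method names below are hypothetical stand-ins written for this example, not the Apache Commons source. The "empty" check treats whitespace as content, while the "blank" check does not, which is the difference a reader would want to confirm before trusting either name.

```java
// Hypothetical helper class for illustration; not the Apache Commons code.
final class StringChecks {
    private StringChecks() {}

    // True when s is null or has no characters. Whitespace counts as content.
    static boolean isEmpty(String s) {
        return s == null || s.length() == 0;
    }

    // True when s is null, empty, or made up only of whitespace characters
    // (as defined by Character.isWhitespace).
    static boolean isBlank(String s) {
        if (s == null) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = { null, "", "   ", "abc" };
        for (String s : samples) {
            System.out.println("[" + s + "] empty=" + isEmpty(s) + " blank=" + isBlank(s));
        }
    }
}
```

A whitespace-only string such as "   " is where the two checks disagree, so that is the case worth testing whichever side of the argument you take.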

------
samuell
This is great!

I feel this is the missing guide we should have referred to, in our recent
writeup:

"Software Engineering for Scientific Big Data Analysis" -
[https://doi.org/10.1093/gigascience/giz054](https://doi.org/10.1093/gigascience/giz054)

We were touching on the issue in point 1 in that paper, and did have a
discussion about it internally, but this guide really takes a holistic view on
the problem.

Thinking about all of this, I'm also happy we managed to build our workflow
manager SciPipe ([http://scipipe.org](http://scipipe.org)) completely without
code-level dependencies :)

------
bakul
To me the Equifax story is the Nth such story that points out a crying need
for a better _security model_. In 99%+ cases single teams can no longer
produce 100% of the code that companies rely on. Most teams do not have (or
can not afford to hire) people with expertise to audit or even an ability to
understand the implementation details of the third-party code they use.
Consequently any 3rd party code should only be allowed to access resources or
data structures that it absolutely needs to carry out some work on _behalf_ of
its caller. This is separate from managing the scale and complexity of s/w
dependencies.
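
At the code level, one rough way to picture this is handing outside code a narrow view instead of the whole object. The sketch below is only an illustration under assumptions: every name in it is hypothetical, `thirdPartyGreeting` is a stand-in for library code, and interface narrowing in Java is a design discipline, not a sandbox or a real security boundary.

```java
// Hypothetical sketch; all names are invented for illustration.
final class LeastPrivilegeSketch {
    private LeastPrivilegeSketch() {}

    // Full internal record, including a field the library never needs.
    static final class Customer {
        final String name;
        final String ssn;

        Customer(String name, String ssn) {
            this.name = name;
            this.ssn = ssn;
        }
    }

    // The narrow view handed to outside code: one read-only accessor.
    interface GreetingInput {
        String displayName();
    }

    // Stand-in for a third-party function. Its signature only lets it
    // see what GreetingInput exposes.
    static String thirdPartyGreeting(GreetingInput input) {
        return "Hello, " + input.displayName() + "!";
    }

    // The caller copies out the one value the library needs, so the
    // object passed along does not even hold a reference to the record.
    static String greet(Customer customer) {
        final String name = customer.name;
        return thirdPartyGreeting(() -> name);
    }

    public static void main(String[] args) {
        Customer c = new Customer("Ada", "000-00-0000");
        System.out.println(greet(c));
    }
}
```

A real security model along these lines would need enforcement from the runtime or the operating system; the sketch only shows the shape of the interface.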

~~~
lonelappde
The fundamental problem is that as a group we write too much code, building
dozens of approximately equivalent stacks and ecosystems instead of
compromising on a few things of much higher quality that require a little
customization.

------
panpanna
A highly relevant study:

"Small World with High Risks: A Study of Security Threats in the npm
Ecosystem"

[https://www.usenix.org/conference/usenixsecurity19/presentat...](https://www.usenix.org/conference/usenixsecurity19/presentation/zimmerman)

The paper is full of hard data.

------
fhs
This seems to be a copy of the blog post from January 2019. Previous
discussion:
[https://news.ycombinator.com/item?id=18979596](https://news.ycombinator.com/item?id=18979596)

~~~
lonelappde
Mods, please update the status of link.

The ACM article used a bunch of _dependencies_ that break the text formatting.
Russ's original blog post is well formatted.

