A related issue the article doesn't mention is the problem of forced upgrades. Dependencies can arbitrarily introduce incompatibilities that don't align with your own priorities, so you end up spending a lot of time keeping current with package releases your users really don't care about. Publicly facing services with a broad attack surface should choose their dependencies carefully, as they'll be forced to upgrade often. For services behind a secure firewall, upgrades are less urgent.
The responsibility of dependency publishers, which you address in your comment, is an entirely orthogonal issue. Both are important. But Russ's audience wasn't publishers. He was talking to consumers and what they need to understand when they take on a dependency.
I believe you're just making the inverse assumption from Cox. He seems to assume a monorepo-esque setup where everyone depends on HEAD, and you seem to assume a microrepo-esque setup where everyone declares their own dependency revisions.
This may still be true in C and C++ land, and to a small extent in the Java world with the Spring libraries. But Node, Python, et al. seem to have zero curation and an explosion of transitive dependencies. By making it easy to reference arbitrary dependencies, projects haul in random, unscrutinised code.
Adding a library should be easy, and I don't think we should blame ease of use. I too am worried about the explosion of dependencies in Node, Python, and (in my case specifically) Rust, but at the same time I'm happy that I don't have to waste my time on the boring work of adding a new library, as in C or C++.
What is needed, in my opinion, are three things:
1. Developers should be more suspicious.
I believe the security disasters to come will solve this problem over time.
2. A better way to express trust in package managers.
I would like to be able to grant or revoke trust for people and groups, not for individual packages. For example, in the Rust world there are a few packages written by the folks who mainly write the standard library and work on the compiler. I trust these people, and if I didn't, I probably couldn't use Rust at all. So I don't want to spend a single thought on importing one of these crates. Crates from others will undergo more scrutiny, but ultimately, using a package or not boils down to the fact that I trust certain people and by default don't trust everyone else.
What I want is to be able to express this to my package manager, so that it prevents me from using code from untrusted entities and doesn't bother me otherwise.
3. Curated repos. When I wrote Java for a big corp, we had to use a curated Maven repo; the regular public Maven repo was not allowed. The curated version was supplied by a third party, Sonatype in this case. Back then I thought this kind of service would become a big deal, that we would soon have many companies providing services like Sonatype's, and that they would blossom. It never happened, but maybe the time was not ripe for it...
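The trust model sketched in point 2 could be prototyped outside any package manager. Here is a minimal sketch, assuming a hypothetical map from package names to author identities (no real package manager exposes exactly this metadata; the names used below are illustrative, not real projects):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TrustPolicy {
    // Hypothetical helper: given each package's author and a personally
    // maintained set of trusted authors, return the packages that would
    // need extra scrutiny before being allowed into the build.
    public static List<String> flagUntrusted(Map<String, String> packageAuthors,
                                             Set<String> trustedAuthors) {
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, String> e : packageAuthors.entrySet()) {
            if (!trustedAuthors.contains(e.getValue())) {
                flagged.add(e.getKey());
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Illustrative names only, not real registry metadata.
        Map<String, String> deps = Map.of(
                "some-core-crate", "trusted-team",
                "random-util", "unknown-dev");
        System.out.println(flagUntrusted(deps, Set.of("trusted-team")));
    }
}
```

The point of the sketch is that the policy lives at the level of people, not packages: adding one author to the trusted set silently admits everything they publish, while everything else is surfaced for review.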
That's a great point. I wonder if there is any interest in providing a Boost-esque package bundle for Node or Python, where maintainers adopt a selection of packages and treat their official releases as unstable/bleeding edge releases.
Genuinely a name I've not heard for 20+ years. Thanks for the memories!
That ends up being hundreds of total dependencies to run two build steps and one HTTP service. What a waste of time and code.
How did we get so addicted to shitty outside code? Had I not been completely new to the team, I could have written a better solution in about an hour, directly in Node.
Not-invented-here syndrome, "always create a class", "always put a class in its own file", premature optimization blah blah, "don't use goto", "this and that is bad practice in that language", "don't optimize until you need it" (but later you don't have time to profile, or don't really want to...)
Never-invented-here is a disaster waiting to happen.
Honestly, I think people justify this nonsense to themselves because they are scared to write original code, even if it's incredibly tiny. The primary purpose of writing software is automation; if you aren't automating things with everything you write, you are probably just a cost center. That's the financial way of saying that your misplaced objectives make you unproductive and/or unreliable. It doesn't have to be that way.
Imagine a world where you could import a single function and it would bring in only the dependencies it requires, no more.
But, yeah, currently you have to weigh whether it's better to write the code yourself or bring in external libraries with their attendant burdens.
It's almost never worth it.
The open source model is about building an alliance with others so that you do not have to create the whole tech stack yourself. It's the only way to compete with a monopoly and win.
The overhead of reviewing all those contributions would be enormous, and the effort required to make a quality contribution that doesn't waste other people's time would be impossible to justify in most cases.
and in a lot of cases, it would be less effort to write the functionality you actually need (and the tests for it) than to write the tests for an entire open-source project (and then deal with all the other crap).
It’s another thing you have to be aware of when you’re programming just like good naming and problem decomposition.
if (s != null && s.length() != 0)
if (StringUtils.isNotEmpty(s))
But it seems like you prefer to type repetitive code that is more difficult to read, rather than use a function, if that function is in a library. Did I get that right? Is that because the repetitive code is relatively short (it gets longer if the variable has a longer name), or because there is the extra line required for the import?
Apache Commons is the "things that should have been in Java SE, but are missing for mysterious reasons" library. I can barely imagine a project that wouldn't benefit from some of its functionality. (Of course, there is always the alternative option of reinventing the wheel.)
By the way, with static imports the code can be reduced to "if (isNotEmpty(s))".
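For readers without Commons Lang on the classpath, the null-safe behavior under discussion is easy to see with a self-contained stand-in. This is a sketch mirroring the semantics of `StringUtils.isNotEmpty`, not the library itself; in real code you would use the static import `import static org.apache.commons.lang3.StringUtils.isNotEmpty;` instead:

```java
public class StringChecks {
    // Stand-in with the same semantics as Commons Lang's
    // StringUtils.isNotEmpty: null-safe, true only for non-empty strings.
    public static boolean isNotEmpty(CharSequence s) {
        return s != null && s.length() != 0;
    }

    public static void main(String[] args) {
        System.out.println(isNotEmpty(null));  // false: no NullPointerException
        System.out.println(isNotEmpty(""));    // false
        System.out.println(isNotEmpty("hi"));  // true
    }
}
```

The null check is folded into the helper, which is exactly why the call site shrinks to `if (isNotEmpty(s))`.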
I feel this is the missing guide we should have referred to in our recent writeup:
"Software Engineering for Scientific Big Data Analysis" - https://doi.org/10.1093/gigascience/giz054
We touch on the issue in point 1 of that paper, and did discuss it internally, but this guide takes a truly holistic view of the problem.
Thinking about all of this, I'm also happy we managed to build our workflow manager SciPipe (http://scipipe.org) completely without code-level dependencies :)
"Small World with High Risks: A Study of Security Threats in the npm Ecosystem"
The paper is full of hard data.
The ACM article used a bunch of dependencies that break the text formatting. Russ's original blog post is well formatted.