Dependency Drift: A Metric for Software Aging (nimbleindustries.io)
49 points by colinbartlett on Feb 2, 2020 | hide | past | favorite | 16 comments

At Pivotal (now part of VMware, but it's easier to explain this way) we have an increasing number of teams that track "vulnerability budgets" and "legacy budgets", by way of analogy to "error budgets" from SRE approaches.[0]

Essentially, a vulnerability budget tracks days since last release for a dependency or system; a legacy budget tracks time left until a dependency or system falls out of support. Our CloudOps team and telemetry teams now plot these for their own systems and customer systems, producing a characteristic sawtooth pattern.
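The two budgets described above are simple date arithmetic. A minimal sketch (the function names and fields are my own, not Pivotal's actual tooling):

```python
from datetime import date

def vulnerability_budget_days(last_release: date, today: date) -> int:
    """Days since the last release you've adopted; grows until you upgrade."""
    return (today - last_release).days

def legacy_budget_days(end_of_support: date, today: date) -> int:
    """Days remaining until the dependency falls out of support."""
    return (end_of_support - today).days

# Upgrading resets days-since-release to roughly zero, which is what
# produces the characteristic sawtooth when these are plotted over time.
```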

I think these are good steps forward, but I see some possible evolutions.

One evolution is to combine these into a more general risk framework like FAIR, so that we're able to use a single budget for all these categories of risk.

Another is to consider risk exposure as a volume. At any given instant I have a wide range of risks in play, distributed over a wide range of systems. As time evolves these risks come and go. A single score (like the one sketched on the far side of the link) only gives me one axis of risk. I also need to know what systems are currently expressing that risk by the presence or absence of software, giving me a second axis. Finally, these evolve over time, giving a third axis.

The sum of volume for any time slice is what ought to be budgeted.
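A toy illustration of risk as a volume over the three axes (severity, system, time); the system names and scores are invented for the example:

```python
# time_slice -> {system -> risk score currently expressed by that system}
risk = {
    "2020-01": {"billing": 3.0, "auth": 1.0},
    "2020-02": {"billing": 3.0, "auth": 5.0, "reporting": 2.0},
}

def slice_volume(slice_risks: dict) -> float:
    """Sum of risk across all systems for one time slice --
    the quantity the comment suggests budgeting."""
    return sum(slice_risks.values())
```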

Another thing is to make these problems behave less like "big-bang control", where we switch modes drastically when crossing a threshold value. What would be preferred is a greater degree of smoothness.

And finally, to make all of this tractable: automation. All the metrics and scorecards and reports and graphs in the world won't make this problem easier. They can at best make it visible and demoralising.

[0] This presentation by David Laing and Jim Thomson is a good introduction: https://www.youtube.com/watch?v=lXPmh9Ap114

David & Jim are responsible for an ongoing multi-year systematic investigation of upgrades as a technical and organisational behaviour.

The whole process of upgrading dependencies is annoying, mainly because stuff keeps breaking. I currently try to keep everything up to date, simply because then the breakages are spread out and easier to deal with, and when critical fixes come through, they're easier to apply. Automation does help with this. But ideally, I'd like the "if it ain't broke, don't fix it" approach. It feels like such a waste of time.

I would love to see Dependabot go further and track upgrade breakage. Being notified about vulnerabilities is a great service and takes the load off. But too many times, it still breaks.

Recent example: Checkstyle security issue, Dependabot cuts a PR. Great. Except almost every minor release [0] has a breaking change. Some libraries are much better than this, and in fact I ended up simply ditching Checkstyle. (It's open source, they don't owe me anything. I get it.)

Until we quantify the breakage issue, we don't have enough data to decide whether upgrading dependencies is worth it outside of a security fix. And in those cases, it's hard to argue that keeping up to date is the best choice (other than as personal philosophy).

[0] https://checkstyle.org/releasenotes.html

I believe Dependabot quantifies this to some extent with a "Compatibility score": https://dependabot.com/#how-it-works

But it can only measure what people have test coverage for, presumably.

I did see that, but it's pretty opaque (some percentage). And when I click through the badge, it takes me to GitHub's help page [0]. What would be helpful is a link to other public PRs, so you might be able to find out how other people solved the issues. Is there any way to see historic compatibility scores, to evaluate whether it's worth dropping a certain dependency?

Edit: For normal dependency updates, compatibility is probably more useful. For security fixes, what choice do you have other than to apply them, or spend more time investigating and trying to come up with your own mitigation?

[0] https://help.github.com/articles/configuring-automated-secur...

Automated testing, good source management and relatively short update cycles make all of this much easier...

Import the new version in a branch, and see right away which (if any) tests break. Then fix and commit.

If a dependency breaks with every minor version change, it's probably worth reevaluating whether it's worth the work, or if the time is better spent replacing the dep with a substitute or some local code.

I have all that. But it takes time out of my day. I'd like to quantify that vs putting it off and spending 2 days or maybe 2 weeks doing it all at once.

Also, right now we have to keep track of breakage manually and individually. It might change mentalities and increase empathy in software development if you could see how much breakage a change would cause. I don't believe nothing should ever break, but I would like to see more awareness of the impact of backwards-incompatible changes.

> I'd like to quantify that vs putting it off and spending 2 days or maybe 2 weeks doing it all at once.

That would be possible if the contribution of each change was linear with some amount of fixed overheads, so that batching them was an effective way to improve efficiency.

But there are just so many nonlinearities in practice. For example, changes in dependencies can compound such that time-to-upgrade is superlinear in the number of upgrades yet to be applied. Most of the time you get away with it, until one day you discover that you're so far down the road lashing yourself to v1.1.3 that it's very expensive to move to v1.1.5, which you must consume in order to address a critical bug or security vulnerability.

> Also, right now we have to manually and individually keep track of breakage. It might change mentality and increase empathy in software development when you see how much breakage a change could cause. I don't believe nothing should ever break, but would like to see more awareness of the impact of backwards-incompatible changes.

I agree, and it has been done at some organisations by mandating tools and dependency-management practices which enable downstream testing (eg. Blaze at Google). As a practical matter that is difficult to achieve when crossing a Conway boundary, but I expect Github will start feeding test information back upstream as time goes on.

A variant of this idea that would be more useful to me personally would be some measure of how actively maintained my dependencies are. It would be nice to know if my long-lived project depends on something that has been abandoned.

Details might be tricky, given that some very useful packages simply don't need frequent updates.

I like the drift metric more, because some libraries are simply stable and don't need many updates anymore; that doesn't mean they are bad for your system.

I think you're on to something here. This is a hot topic amongst our teams right now. Dependency maintenance looks to me like one of those things, like code duplication, that correlates strongly with a multitude of sins.

It's also interesting to see someone else building a tool around a new metric. I'm doing something similar (shameless plug -> https://without.fail/estimate-accuracy) for determining how well estimates correlate with actual development times. I could definitely take some lessons from you because Dependency Drift is way more brandable than Estimate Accuracy.

Doesn't this just re-invent the Libyears metric?


I have skimmed through this article; it seems fairly long but light on details. How does this differ from libyears? What actually is your metric?

That site (libyear.com) IMHO makes the concept refreshingly clear, while this "Dependency Drift" site seems vague and markety. It's trying to make a big deal out of something, but what that something is remains unclear after reading it. The libyear site is the opposite: clear, and not a big deal.

The problem is that this is a bunch of bollocks. But at the same time, that might actually resolve my problem with it as the pain gets fed back.

If one is very, very lucky, a library will have accurate change notes explaining what the version increment actually means, distinguishing between security updates and wanking-over-new-subdependency-for-shiny-irrelevant-features updates.

However, if people are penalised for not wasting their time chasing the latest version of leftpad-with-custom-characters-except-in-locations-which-are-prime-or-square.0.4.3-beta.butnotreally, maybe we'll see shallower dependency trees in the important stuff.

Where 'important' ends up being defined as 'the packages which everyone else gravitates to, and therefore can't be avoided'.

Ideally we'd see security updates for previous major versions of things, for those of us without feature addiction, but that would demand more of the devs producing this crap.

I agree this is an important metric and a kind of technical debt that's hard to capture.

How do you differentiate this from RenovateBot or Dependabot? Both of those will not only track the drift but also generate the PR to fix it.

Co-author here, thanks for the input. We use Dependabot on all our projects and love it. I've not seen any kind of numeric metric in Dependabot or any of their competitors.

What I really want is a graph that shows, numerically, the drift over time. You can show this to your stakeholders: "See, look! We haven't dedicated time to stack upgrades, and look how far out of date we're creeping." And then, consequently, I want to see that chart burn down to zero as you stay on top of things with the help of Dependabot or similar.

This is the libyears metric ( https://libyear.com/ ) tracked over time.

Or are you using a different metric that you haven't defined for us?
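For reference, the libyear metric as described on that site can be sketched in a few lines; the release dates below are invented for illustration:

```python
from datetime import date

def libyears(deps):
    """deps: list of (release date of version in use,
                      release date of latest available version).
    Returns the total lag behind latest, in years."""
    return sum((latest - used).days for used, latest in deps) / 365.25

# Sampling this sum periodically gives the drift-over-time chart
# the co-author describes above.
```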

As a freelancer, this sounds really interesting to me. It's always hard to sell people maintenance.
