
Machine Learning: The High Interest Credit Card of Technical Debt (2014) - amzans
https://ai.google/research/pubs/pub43146
======
tacostakohashi
I recently came across this article:

[https://en.wikipedia.org/wiki/Overfitting](https://en.wikipedia.org/wiki/Overfitting)

Although it describes the issue as it pertains to statistics + machine
learning, this is also _exactly_ what ends up happening with a large codebase
without clear requirements or test cases, where people just make incremental,
piecemeal changes over time. You end up with an application that has been
trained (overfitted) on historical data and use cases, but that breaks easily
on slightly new variations: inputs that differ from anything the system has
ever handled in some trivial way that a better designed, cleaner, more
abstract system would cope with.
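
For anyone who hasn't seen the statistical version concretely, here's a
minimal numpy sketch of the effect (the polynomial degrees, noise level, and
sample sizes are arbitrary, chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Historical data: a simple underlying pattern plus noise.
x = np.linspace(0, 1, 15)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, x.shape)

# "Slightly new variations": fresh draws from the same process.
x_new = np.linspace(0.01, 0.99, 200)
y_new = np.sin(2 * np.pi * x_new) + rng.normal(0, 0.2, x_new.shape)

for degree in (3, 12):
    coeffs = np.polyfit(x, y, degree)  # fit only to historical data
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    new_mse = np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.4f}, new-data MSE {new_mse:.4f}")

# The degree-12 fit threads nearly every historical point (tiny train
# error) but does much worse on new data. The codebase analogue is an
# app patched case-by-case until it only handles the cases it has seen.
```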

Given how much poor coding practice resembles machine learning (albeit in slow
motion), it's hard to hold out much hope for what happens when you automate
the process.

~~~
gautamnarula
I really like this extension of the concept of overfitting to codebases in
general.

I especially noticed this in libraries/packages that were "community owned" at
a company. Instead of one team owning the package, acting as the authority on
the long-term roadmap, and communicating with other teams about feature
requests, deprecations, documentation, bug fixes, etc., the package was
collectively owned by the community at large, where "community" was very
broadly defined as any team that for whatever reason had an interest in using,
maintaining, or adding onto the package.

Naturally, the result was exactly the scenario you described. Each team hacked
on their own bit of functionality for their specific purpose, while doing
their best to not affect or break the increasingly precarious tightrope of
backwards compatibility. There was no long term architectural vision, so there
was a definite need for refactoring--and yet no team had the incentive to
invest the amount of time needed to do that. The documentation was woefully
incomplete as well, and few people understood how the entire thing worked
since each team would only interact with their small fraction of the code.

~~~
tomelders
Two principles I live by (much to the annoyance of my bosses):

1\. Don't fear the refactor.

2\. If you don't want to rebuild your entire application from scratch, don't
worry: a competitor will do it for you.

There's nothing wrong with creating something in increments. It's the fear of
revisiting something that destroys a code base.

~~~
vincentmarle
Your bosses might be right.

Technical debt, much like regular debt, can also be used as leverage to
quickly gain a competitive advantage. While your competitors are busy
refactoring/rebuilding perfect applications that create hardly any new
customer value, the scrappy startup that writes piles of spaghetti code might
be building exactly what customers want.

Code quality != business value.

~~~
guiriduro
> Code quality != business value

I don't think that's a given: in some circumstances code quality absolutely is
business value. It might be better to say code quality can be, but isn't
always, business value. As ever, context is the deciding factor.

~~~
blauditore
Well, I would say technical debt is similar to the classic kind of debt: it
may give you a short-term advantage (liquidity), but in the long term, there's
interest on it. If not paid off, it grows exponentially.

So yeah, technical debt can be used as a tool, but it doesn't come for free.
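
As a toy illustration of the compounding (the numbers are entirely made up,
this is not a real cost model):

```python
# Purely illustrative: assumed figures, not measurements.
hours_saved = 10.0   # up-front saving from taking the shortcut
rate = 0.15          # assumed per-quarter "interest" on the unpaid debt

debt = hours_saved
for quarter in range(1, 9):
    debt *= 1 + rate  # each quarter the workaround gets costlier to unwind
    print(f"after quarter {quarter}: ~{debt:.1f} hours owed")

# After two years the debt (~30 hours) is triple the original saving:
# pay it off early, or the interest ends up dominating.
```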

------
denzil_correa
Publication Year: 2014

Previous HN Discussions

[https://news.ycombinator.com/item?id=10006293](https://news.ycombinator.com/item?id=10006293)

[https://news.ycombinator.com/item?id=10338575](https://news.ycombinator.com/item?id=10338575)

[https://news.ycombinator.com/item?id=8775772](https://news.ycombinator.com/item?id=8775772)

"The Morning Paper" has a nice summary

[https://blog.acolyer.org/2016/02/29/machine-learning-the-hig...](https://blog.acolyer.org/2016/02/29/machine-learning-the-high-interest-credit-card-of-technical-debt/)

~~~
glup
Which means it must be good, right?

I feel like there are a few of these frequent flyers... is there any way to
figure out what they are?

~~~
chucksmash
"Show HN: Hacker News Classics" might fit the bill.

Discussion:
[https://news.ycombinator.com/item?id=16442888](https://news.ycombinator.com/item?id=16442888)

App: [http://jsomers.net/hn/](http://jsomers.net/hn/)

Source: [https://github.com/jsomers/hacker-classics](https://github.com/jsomers/hacker-classics)

------
israrkhan
Software Engineering Daily has a podcast with D. Sculley, the author of this
paper. It is quite interesting to listen to:

[https://www.softwaredaily.com/post/5913c0e74ee01db33cacd027/...](https://www.softwaredaily.com/post/5913c0e74ee01db33cacd027/Machine-Learning-and-Technical-Debt-with-D-Sculley)

------
rasmi
If you're interested in this, be sure to read "The ML Test Score: A Rubric for
ML Production Readiness and Technical Debt Reduction" [1] and the Rules of
Machine Learning [2].

[1]
[https://ai.google/research/pubs/pub46555](https://ai.google/research/pubs/pub46555)

[2] [https://developers.google.com/machine-learning/rules-of-ml/](https://developers.google.com/machine-learning/rules-of-ml/)

------
imh
I love this paper. The further I get in my ML/stats career, the more relevant
these lessons become. I would recommend that anyone interested in building
long-lived ML products read it.

------
joker3
This paper will eventually be seen as a landmark in the field of machine
learning systems. Read it, learn, and write up what you discover along the
way. This literature needs to grow.

------
dj-wonk
I enjoy the paper as a starting point for discussions. However, given the
varying definitions of technical debt, I think it is important to see
additional elaborations and examples of real-world trade-offs in production
systems.

Perhaps the best overall wisdom this paper tries to impart is this: build
awareness, culture, and tooling around your ML systems, both upstream and
downstream. Never stop exploring and improving. Relentlessly try to slim down
your models, simplify your pipelines, and bring people together to talk about
all kinds of dependencies.

------
arendtio
2014 feels like decades ago in AI. Looks like that paper is already a classic.

------
prabhatjha
This is a great paper. It helped us think more rationally and realize that we
were not alone in having difficulties running ML in production.

------
thelastidiot
Is this still relevant?

~~~
Xorlev
I work at Google on a product driven by ML doing ranking and regression tasks.
Can confirm: very relevant. That said, ML is usually superior to the
rules-and-heuristics systems we've been able to come up with, so we take on
the debt once we stop being able to improve our heuristics, but only after
we've tried really hard at the heuristics, so that we have a baseline quality
bar to beat. That justifies the effort, but it's still a lot of work to stay
vigilant: keeping an eye on shifts in signals, unintended dependencies,
metrics that actually mean something, etc.
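
In sketch form, that gate looks something like this (everything here is a
made-up toy, with invented signals and thresholds; nothing product-specific):

```python
import random

random.seed(0)

# Toy eval set: items with two signals and a noisy label.
items = [(random.random(), random.random()) for _ in range(1000)]
labels = [int(0.7 * a + 0.3 * b + random.gauss(0, 0.15) > 0.5)
          for a, b in items]

def accuracy(predict):
    hits = sum(predict(item) == label for item, label in zip(items, labels))
    return hits / len(items)

# Step 1: tune the heuristic as hard as possible; it sets the quality bar.
def heuristic(item):
    return int(item[0] > 0.55)  # stand-in for hand-tuned rules

bar = accuracy(heuristic)

# Step 2: the learned model (faked here with fixed weights) must clear
# the bar before we accept the ML system's ongoing monitoring costs.
def model(item):
    return int(0.7 * item[0] + 0.3 * item[1] > 0.5)

if accuracy(model) > bar:
    print(f"ship the model: {accuracy(model):.1%} > heuristic bar {bar:.1%}")
else:
    print(f"keep the heuristics: bar {bar:.1%} stands")
```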

~~~
jacquesm
What really bugged me for a while is how unbelievably easy it was to beat a
very large amount of hand-tuned code using ML. Going from 92.x% accuracy to
97% accuracy, even without any tweaking at all, feels like cheating.

~~~
brlewis
Are you at liberty to share, accuracy of what?

~~~
Radim
Can't speak for OP, but such accuracy numbers often hide a 20/20 hindsight
bias.

After building and running a rule-based system for a while, you always gain
tremendous subject-matter expertise: a feel for what works.

Any rewrite of the system at that point will lead to much improved accuracy.
The clarity is reflected in a better choice of the input signals, features,
data preprocessing, metrics, workflows…

A "magic ML" (without domain understanding) beating well-tuned SME rules is a
dangerous fantasy, in any non-trivial endeavour. In other words, without that
clarity, you're better off gaining it first through simple iterations of
rules, figuring out what matters.

