
A Codebase Is an Organism (2014) - Kortaggio
https://meltingasphalt.com/a-codebase-is-an-organism/
======
code_biologist
Catabolism is another aspect of thinking about code as an organic entity that
I find useful. Anabolic (growth) processes and catabolic (teardown) processes
always accompany each other in biological systems. Too often we neglect to
continually tear down and remove unused aspects of our systems.

Both anabolism and catabolism are always operating, though at different rates
depending on the environment and needs of the organism. The funny thing is
that catabolic processes will happily tear down important parts of the system,
forcing them to be anabolically rebuilt. Things like bone and muscle mass. This
seems wasteful, but it has an extremely important purpose: uncontrolled
anabolism is cancer.

I think the analogy to code is obvious. We've all seen codebases that have
gotten out of hand (gotten cancer), and part of the reason they got that way
is because the growth pressures had no countervailing teardown pressure. The
market provides selective pressure in nasty ways: codebases and systems with
cancer slow and die, and are replaced by younger codebases without
uncontrolled growth.

How do we break that cycle? By encouraging catabolic activity culturally
within our companies and codebases. Cleanup is incredibly valuable.
Functionality that isn't being heavily used and is complicating implementation
of new work should absolutely be on the chopping block. We don't have to be
quite as aggressive as biological systems, but we shouldn't take their example
lightly either.

This also provides insight into when cleanup doesn't matter: when the survival
horizon of the organism is sufficiently short that cancer frankly doesn't
matter. Startups shouldn't worry too much until they have product market fit.
Once it's clear they'll be around for years more though, catabolic work will
help ensure future health.

~~~
rojobuffalo
username checks out. that's a really thought-provoking analogy. there's
definitely something to be gained in the process of tearing down and building
back up.

------
taneq
I've been waiting for a long time for people to start using the title
"robopsychologist" in earnest. Especially for people who are called out to
debug an unfamiliar system in 'the wild' (i.e. in production). You have to
calm the system down, find out what it thinks is going on, gently correct any
misconceptions, and guide it towards a better understanding of the world that
lets it behave the way you need it to.

~~~
andy_ppp
It’s commentary like this that makes me come back here. I also wish Hacker
News had a few more features, like being able to follow people who make
interesting comments.

~~~
grawprog
You could always start a bookmarks folder for 'HNers I find Interesting' and
save their profiles to that. I bet there's a way to write a script that goes
through them, checks for posts since the last update, and displays them.

------
kazlock
> two modules will tend to grow ever more dependent on each other unless
> separated by hard ('physical') boundaries

This is especially true in monorepos. Code reuse has its benefits, but left
unchecked the dependency graph can turn into a huge, highly-connected glob.
Modules end up accidentally importing code they have no business importing
because of sneaky transitive dependencies. The blast radius of even small
simple changes becomes enormous, and testing/debugging becomes more and more
complex.

A great way to keep this in check is to apply code isolation in the test
environment. When you check out the entire repo for a build or test, it's easy
for these kinds of dependencies to grow unnoticed. But if you require
build/test targets to explicitly declare what code they depend on (and only
make that code present when running them), changes to the dependency structure
must be explicitly acknowledged in code review. This is one of the core
principles behind build tools like Bazel.
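As a toy illustration of that principle (a hypothetical sketch, not Bazel's actual mechanism; Bazel enforces this via per-target `deps` declarations in BUILD files), a checker that surfaces any import a target never declared might look like:

```python
# Hypothetical sketch of explicit dependency declaration: a build/test
# target declares the modules it may use, and a checker flags anything
# imported beyond that list.
import ast

def undeclared_imports(source: str, declared_deps: set[str]) -> set[str]:
    """Return top-level modules imported by `source` but not declared."""
    imported = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imported.add(node.module.split(".")[0])
    return imported - declared_deps

target_source = "import billing\nfrom search.index import build_index\n"
print(undeclared_imports(target_source, declared_deps={"billing"}))
# prints {'search'}: a sneaky transitive dependency made visible in review
```

Any change to the declared set then shows up as a diff that reviewers have to acknowledge.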

~~~
sitkack
Even in monorepos, calls into a module should go through a versioned
interface.
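A minimal sketch of what that can look like (hypothetical names; real systems often version whole packages or API schemas rather than individual functions):

```python
# Hypothetical sketch of a versioned module interface: callers pick an
# explicit version, so internals can change without breaking everyone at
# once and migrations can happen gradually.
def _lookup(user_id):
    # Stand-in for the module's private storage layer.
    return ("ada", "ada@example.com")

def get_user_v1(user_id):
    """Original contract: returns a (name, email) tuple."""
    return _lookup(user_id)

def get_user_v2(user_id):
    """Newer contract: returns a dict; v1 callers are untouched."""
    name, email = _lookup(user_id)
    return {"id": user_id, "name": name, "email": email}
```

Callers elsewhere in the monorepo then opt into `get_user_v2` deliberately, and `get_user_v1` can be deleted once its last caller migrates.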

~~~
Groxx
In general I agree, because it allows gradual migration of users... but in
essentially every case I've seen, removing versioned interfaces is billed as
the _primary feature_ of monorepos.

~~~
sitkack
I just realized monorepos are analogous to a flat network, where everything
can reach everything else, or to the idea that main memory is flat/linear/same
cost for each read/write (not true, but it is the abstraction we believe).

I now believe that monorepos can only work well when there is a mechanical
tool for ensuring correctness and that refactorings can be done atomically
across the whole tree. In fact, it might be necessary to commit _only_ the
refactoring operations rather than the source itself, so that the whole tree
is a blockchain of tree-edit operations.

~~~
KajMagnus
> monorepos can only work well when there is a mechanical tool for ensuring
> correctness and that refactorings can be done atomically across the whole
> tree

Isn't a compiler such a tool? I use a statically typed language, and if I
tried to do this:

> everything can reach everything else

then there would be compilation errors.

------
js8
A codebase is an orgasm. People working on it get excited when it is starting,
and when it's done, they become very tired of it.

~~~
andy_ppp
Unless they really feel love for their users?

Ugh, too far :-/

------
preston4tw
Two things: first, mods, can we get the title updated to reflect the publish
year of 2014?

The second is that the codeswarm link in the article is dead. This project on
GitHub is the closest remnant of it I can find:
[https://github.com/rictic/code_swarm](https://github.com/rictic/code_swarm)

~~~
ivan_ah
Check out gource, which does something similar
[https://gource.io/](https://gource.io/) (git history visualization)

------
gandalfgeek
Not just codebases: it's also useful to think of large, planet-scale systems
in organic terms.

[https://blog.vivekhaldar.com/post/6972614229/large-computer-systems-are-organic](https://blog.vivekhaldar.com/post/6972614229/large-computer-systems-are-organic)

~~~
goldenkey
Might as well just ask the question: What is Life? And then realize that scale
of space or time is irrelevant.

A galaxy could be thinking like a brain, except thoughts travel light years,
are mainly dictated by GR (Light and Gravity), and take eons to actually
occur.

I think it's clear that the only meaningful scale of time and space for life
is the one we artificially impose, because of our own conscious experience.

LOTR and other fantasy works explore this with the idea of trees that have
gained much wisdom from living so long.

~~~
dwaltrip
I once saw a comment similar to yours on HN, and there was a very good reply
that critiqued the idea. I'll do my best to replicate something along the same
lines.

If we look at the total potential lifetime "clock cycles" of a system, and
compare that number against systems we know are intelligent and capable of
thinking, it might shed some light on the plausibility.

It takes information 100,000 years to cross the galaxy, and the universe is 14
billion years old (the galaxy is younger, but not by enough to matter here),
so at most roughly 100,000 round trips could have occurred in this "galactic
brain".

Let's use a very conservative metric, such as how long it takes a human to
blink in response to stimuli (about 100 ms), for our "human clock cycle". If a
human lives for 80 years, that is about 2.5 billion seconds, or 25 billion
clock cycles.

This is roughly five orders of magnitude (hundreds of thousands of times) more
than in the galactic brain. This gives the very loose impression that such a
system would probably not have enough information flow or feedback-loop cycles
to "think" anything interesting.
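Plugging in those rounded assumptions (a 100,000-light-year galaxy, a 14-billion-year-old universe, a 100 ms blink, an 80-year lifespan), the arithmetic can be checked directly; the ratio works out to a few hundred thousand:

```python
# Back-of-envelope check of the comparison above, using the stated
# rounded assumptions (all figures are approximations).
GALAXY_DIAMETER_LY = 100_000          # light years
UNIVERSE_AGE_YR = 14e9                # years
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# One galactic "cycle" = a light-speed round trip across the galaxy.
galactic_cycles = UNIVERSE_AGE_YR / (2 * GALAXY_DIAMETER_LY)

# One human "cycle" = a 100 ms blink, over an 80-year lifespan.
human_cycles = (80 * SECONDS_PER_YEAR) / 0.1

print(f"galactic cycles: {galactic_cycles:,.0f}")                 # ~70,000
print(f"human cycles:    {human_cycles:,.0f}")                    # ~25 billion
print(f"ratio:           {human_cycles / galactic_cycles:,.0f}")  # ~360,000
```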

Of course, this is a very rough heuristic, but I think it is interesting and
useful. The idea of unexpected or strange "thinking systems" is very cool and
we should continue to explore it, but there are certainly some hard
constraints defining the space of such systems.

~~~
goldenkey
That's a good point. I don't have time to double-check your numbers but I will
assume they are correct.

My only rebuttal (if I'm to take an opposing position) would be that
different organisms have different clock cycles; in organic life, compare
hummingbird metabolism with elephant or whale metabolism. I don't know if
their brains are also faster, but since their reaction speeds do seem to be
connected to their metabolic rate, it's probably the case.

Considering architecture, the way intelligence is being created through
machine learning is through cellular units that carry large amounts of
information digitally instead of through oscillating analog signalling. In
fact, I found that when building cellular automata, black-and-white automata
are elegant in theory, but when it comes to what will deliver the most
complex/intelligent dynamics on a GPU, it is using each cell/pixel as a
complex number, floating-point value, or vector of such types. The GPU can do
so much with single units, so it makes sense to use its full potential.

It's possible that the encoding/signalling of galaxies is similar. Clock speed
only has to be as fast as things you _can_ react to. Can a galaxy move fast
enough to dodge an asteroid? Likely not.

It'd probably be enlightening to know more about how the brains of larger,
slower animals such as elephants or whales operate. Their "clock speeds" might
shed some light on what kind of parameter ranges life operates within.

EDIT: one more thing I thought of is that it's very possible that the living
stage of the larger structures of our universe is simply young, adolescent in
nature. Perhaps we are living at a time when the intelligent behemoths among
us, or rather, that we are a part of, are in their first generation. I agree
though, it's more unlikely to be seeing the head or tail of a distribution
than somewhere in the middle. But we should not rule out young, large
intelligences.

~~~
dwaltrip
> My only rebuttal (if I'm to take an opposing position) would be that
> different organisms have different clock cycles; in organic life, compare
> hummingbird metabolism with elephant or whale metabolism. I don't know if
> their brains are also faster, but since their reaction speeds do seem to be
> connected to their metabolic rate, it's probably the case.

That's a fair point. I think the comparison would have a somewhat similar
conclusion, perhaps not as dramatic though.

Either way, "clock speed" / "number of clock cycles" is only one way to
analyze such a system.

Another important aspect of an intelligent system is the computational
complexity of each node in the system and their ability to send "messages" back
and forth. For a brain, we could look at the neuron. My limited understanding
is that neurons are actually pretty complex from an information processing
standpoint, despite the implied simplicity we see with artificial analogs such
as the neural nets used in machine learning.

I can't imagine what sort of naturally occurring entity would be able to
function as an analogous "node" of a galactic brain, or even what sort of
messages could be used to do complex information processing. Stars don't
look like they could fulfill the role. In terms of "information processing" I
don't think they really do much.

Anyways, it is all very interesting. It is probably possible to have a much
larger and very different "thinking system" than what we find on Earth,
although I'm not sure what it would look like. But I am skeptical that such a
system could span the enormous distances between stars.

By the way, you should check out the book The Black Cloud [1]! It is a fun
science
fiction story that explores this idea.

[1]
[https://en.wikipedia.org/wiki/The_Black_Cloud](https://en.wikipedia.org/wiki/The_Black_Cloud)

------
gfodor
framing failing fast or not as a conflict of interest between computer and
codebase is an interesting one. I’ve often seen it as a conflict of interest
between the user and the developer: users want the code to muddle through
(“PHP style”), but devs want highly visible obvious failures early so they are
easily identified and fixed.

How do people balance this? The “state of the art” still seems to be things
like “have asserts turned on in the debug build”, but a holistic approach,
where the application has different runtime contracts for failures depending
on who is using the app, seems somewhat underutilized. I’ve done it piecemeal
with application feature flags, but has anything like this been done at the
platform/language/framework level?
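One piecemeal way to express such a contract (a hypothetical sketch; the `APP_AUDIENCE` variable and `check` helper are made-up names, not any existing platform's API) is to key failure behavior off an audience flag:

```python
# Hypothetical sketch: the same invariant check fails loudly for
# developers but degrades gracefully for end users.
import os

def check(condition, message, fallback=None,
          strict=os.environ.get("APP_AUDIENCE") == "developer"):
    """Fail fast in developer contexts; muddle through for users."""
    if condition:
        return None
    if strict:
        raise AssertionError(message)
    print(f"warning: {message}")  # user-facing path: log and carry on
    return fallback

# A developer build crashes here; a user build logs and shows $0.00.
price = check(False, "price missing for SKU 123", fallback=0.0)
```

Feature flags or per-request context could drive the same switch at finer granularity than a process-wide environment variable.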

------
carapace
A codebase is a mechanical approximation of a set of human intentions. It's a
model of an organic phenomenon.

~~~
lonelappde
A codebase _that manages a data store_ is not a model; it is a phenomenon.

------
euske
> In this way, building software isn't at all like assembling a car. In terms
> of managing growth, it's more like raising a child or tending a garden.

I knew it! I always thought maintaining my project was like doing a bonsai
tree!

(Disclaimer: I have zero experience with bonsai.)

~~~
lonelappde
Bonsai stay small as they age. Sadly, software programs do not.

~~~
inimino
because the gardener trims them both above and below ground.

------
qes
Not even just the source code. Our system, even with the source code static,
has constantly changing characteristics.

We integrate with dozens of 3rd party APIs, making over 10M external HTTP
calls per day. That leads to a lot of variability in runtime characteristics.

~~~
longcommonname
This is true for us too. The behavior of ours is due to how airline fare rules
are filed, which creates all sorts of silly things.

------
qntmfred
Always liked the garden metaphor:
[https://blog.codinghorror.com/tending-your-software-garden/](https://blog.codinghorror.com/tending-your-software-garden/)

------
lonelappde
A production system is like the TARDIS.

You don't program the system, you negotiate with it.

