
What the Four Color Theorem Can Teach Us About Writing Software - magoghm
http://alexkudlick.com/blog/what-the-four-color-theorem-can-teach-us-about-writing-software/
======
Jtsummers
Interesting article, and good analogy. I've recently gone to making the
comparison to physical systems. A physical interconnect can only connect so
many things (think your steering column in the car) due to the laws of physics
(there's only so much physical space). Software allows us to act without this
constraint: each object is a point and can be connected to and connect to an
arbitrary number of other objects directly. This potential complexity must be
actively fought against in order to keep the system maintainable and
comprehensible.

I actually thought this was going to go in another direction. When the proof
for the four color theorem was published there was a good deal of controversy
about it. It had not been fully verified by a person; the bulk of it was
generated by computers. The method of generating the elements of the proof was
vetted, but not the final output. This has some relation to the issues we have
now with machine learning and systems being built based on the generated
models that we don't fully understand. It also relates to the more complex chains
of compilation systems we use these days, where each layer of translation
introduces the potential for subtle errors or malicious action.

~~~
prerok
Well, the four colors might have been controversial but five colors is
something that can be proven by hand. Still, four or five doesn't matter for
the points in the article. Keeping fewer connections between objects/modules
is something we should all strive for.

------
onan_barbarian
I think the 4-color theorem teaches us that sometimes software _has_ to be shit. A lot
of people held out for an elegant proof but what they got was auto-generated
spaghetti. I cannot think of a better demonstration of the fact that sometimes
you just have to write a lot of boring code and solve all the cases than the
fact that an important theorem in mathematics was solved this way.

~~~
adrianratnapala
It's exactly that kind of code that most benefits from discipline about
interfaces. Writing a few thousand lines of code to do something hairy is
quite feasible and indeed fun -- as long as the result exports a simple,
non-leaky interface.

But if the interface leaks (e.g. if it alters some state, if some edge cases
aren't covered, or if some of its internal dependencies have to be visible on
the outside), then the mess ramifies through your whole system and is a burden
on any code you write, change or read.

Mathematical code often fits this model well. Implementations are frequently
very hairy because of the calculations involved plus mathematically inelegant
bells and whistles required by the real world. But since these things tend to
be _calculations_, their interfaces are often no more than "call this
function, get a result or an error, with no side effects". Very clean.
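
A minimal sketch of that kind of interface, using a hypothetical `solve_quadratic` function that is not from the thread. The body can be as fiddly as the numerics demand, but a caller only ever sees arguments in and either a result or an exception out:

```python
import math


def solve_quadratic(a: float, b: float, c: float) -> tuple[float, float]:
    """Return the real roots of a*x^2 + b*x + c = 0, smaller root first.

    Raises ValueError if a is zero or if there are no real roots.
    Reads no global state and mutates nothing.
    """
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("no real roots")
    # Numerically stable form: avoids subtracting two nearly equal numbers.
    q = -0.5 * (b + math.copysign(math.sqrt(disc), b))
    first = q / a
    # q is zero only when b and c are both zero, i.e. a double root at 0.
    second = c / q if q != 0 else first
    return (min(first, second), max(first, second))


print(solve_quadratic(1, -3, 2))  # (1.0, 2.0)
```

The stable-formula trick is exactly the sort of "inelegant bell and whistle" that stays hidden behind the signature.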

------
tomsmeding
Nice analogy, I wouldn't have thought of that. But I'm unconvinced that this
is a good measure of software complexity; for one, there are only three
nontrivial complexity levels (2, 3, 4).

But it's also unstable: a circle with an odd number of nodes (e.g. a triangle)
requires three colours, while a circle with an even number of nodes needs only
two, even though both models seem similarly complex.
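
The parity effect is easy to check by brute force. This is only an illustrative sketch (the helper names are made up here, and the exhaustive search is only practical for tiny graphs):

```python
from itertools import product


def cycle_graph(n: int) -> list[tuple[int, int]]:
    """Edges of a ring of n nodes: 0-1, 1-2, ..., (n-1)-0."""
    return [(i, (i + 1) % n) for i in range(n)]


def chromatic_number(n: int, edges: list[tuple[int, int]]) -> int:
    """Smallest k for which some k-colouring has no same-coloured edge.

    Tries every colouring, so the cost grows as k**n.
    """
    for k in range(1, n + 1):
        for colouring in product(range(k), repeat=n):
            if all(colouring[u] != colouring[v] for u, v in edges):
                return k
    return n


for n in range(3, 8):
    print(n, chromatic_number(n, cycle_graph(n)))
# 3 3
# 4 2
# 5 3
# 6 2
# 7 3
```

Adding or removing a single node flips the answer between 2 and 3, which is the instability described above.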

~~~
rbalicki
This analogy has some strength, but one of the most obvious complaints is as
follows. Having everything go through one central bus (e.g. `bus.setItem` and
`bus.getItem`) requires only two colors, one for the bus and one for
everything else. But of course, that just hides all of the complexity.
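
To make the bus example concrete, here is a rough sketch with made-up module names. A breadth-first two-colouring succeeds on any star-shaped graph, no matter how many modules hang off the hub:

```python
from collections import deque
from typing import Optional


def two_colour(adjacency: dict[str, set[str]]) -> Optional[dict[str, int]]:
    """Return a 0/1 colouring if the graph is bipartite, else None."""
    colour: dict[str, int] = {}
    for start in adjacency:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in colour:
                    # Neighbours always get the opposite colour.
                    colour[neighbour] = 1 - colour[node]
                    queue.append(neighbour)
                elif colour[neighbour] == colour[node]:
                    return None  # odd cycle: two colours are not enough
    return colour


# Hypothetical modules that only ever talk to the bus.
modules = ["auth", "billing", "search", "ui"]
star = {"bus": set(modules), **{m: {"bus"} for m in modules}}
print(two_colour(star))
```

The colouring says nothing about which modules read the keys that other modules write, which is where the real coupling lives.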

Nonetheless, focusing on reducing complexity and simplifying dependency graphs
is good, mmkay.

~~~
Someone
You can even get it down to one color by writing a big ball of mud
([https://en.m.wikipedia.org/wiki/Big_ball_of_mud](https://en.m.wikipedia.org/wiki/Big_ball_of_mud))

------
j2kun
I understand and appreciate the analogy the author made. I wish more people
thought about code complexity from first principles.

That being said, I doubt chromatic number is the right measure for software
dependency graphs. Especially when there are hundreds of graph complexity
metrics (tree-width, centrality measures, measures based on cuts and flows,
etc.). Even just now for the first time I saw this thing called cyclomatic
complexity:
[https://en.wikipedia.org/wiki/Cyclomatic_complexity](https://en.wikipedia.org/wiki/Cyclomatic_complexity)

I'd be interested to hear from someone who has experience applying graph
metrics to static code analysis in practice. What's useful and informative?

~~~
Pamar
Cyclomatic Complexity is pretty old: as an Analyst on a large system (99.99%
COBOL) I had to test "each linearly independent path through the program" in
1989, and it had already achieved "old standard" status by then.
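
For reference, McCabe's measure for a control-flow graph is M = E - N + 2P (edges, nodes, connected components). A tiny sketch, with the graph counts for the example worked out by hand rather than extracted from real code:

```python
def cyclomatic_complexity(edges: int, nodes: int, components: int = 1) -> int:
    """McCabe's measure M = E - N + 2P for a control-flow graph."""
    return edges - nodes + 2 * components


# A single if/else: nodes are condition, then-branch, else-branch, exit;
# edges are condition->then, condition->else, then->exit, else->exit.
print(cyclomatic_complexity(edges=4, nodes=4))  # 2
```

The result of 2 matches the two linearly independent paths (one through each branch) that would need a test.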

------
dy
Excellent thought piece - wish HN was filled with more original analogies like
this one.

This reminds me a bit of domain driven contexts (something just added to
Elixir/Phoenix 1.3) - reducing the surface area of sub nodes from each other
by exposing that single point interface.

~~~
andy_ppp
I’ve been wondering if I can do away with the web layer altogether and
provide RPC directly to certain contexts fairly trivially over web sockets.
REST feels really tired when working with state containers IMO.

~~~
dy
This seemed very promising: protobufs to auto-generated TypeScript over gRPC.

[https://github.com/improbable-eng/grpc-web](https://github.com/improbable-eng/grpc-web)

------
jmblpati
Quick addendum: although the four color theorem proof always seems to require
a case analysis at some point, proving that planar graphs admit 5-colorings
can be done with a very short proof. Not particularly relevant to the analogy
in the post, but if you want proof that planar graphs admit constant-sized
colorings that's the one for you.

~~~
ocfnash
Just to advertise this addendum a bit more!

The Kempe-chain proof of the 5-colouring of any planar graph is easy and
beautiful.

------
EGreg
I thought this article was going to be about Appel and Haken's proof of the
Four Color Theorem, as it was the first to rely on software rather than a purely
analytical proof. As a result mathematicians were split: while this proof may
have shown the theorem was _true_, it didn't explain _why_ it was true,
because the argument was reduced to a brute-force search.

[https://math.stackexchange.com/questions/23409/did-the-appel...](https://math.stackexchange.com/questions/23409/did-the-appel-haken-graph-colouring-four-colour-map-proof-really-not-contribut)

~~~
ocfnash
For anyone interested, there's a really excellent book on the history of the
problem as well as the AH proof, going into quite a lot of detail. Very highly
recommend.

"Four Colors Suffice: How the Map Problem Was Solved" by Robin Wilson.

[https://www.goodreads.com/book/show/450635.Four_Colors_Suffi...](https://www.goodreads.com/book/show/450635.Four_Colors_Suffice)

------
rossdavidh
I can't help thinking that it should then be possible to write a tool that
ingests a codebase and shows you a graph similar to the ones in this article.
Perhaps that would be a good way to figure out where to start working on
reducing the code complexity. But I have to think on this for a while to see
if I still believe it after mulling it over.
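
As a starting point, something like the following sketch could work for Python codebases. It is an assumption-laden toy rather than an existing tool: it uses the standard `ast` module to collect top-level imports per file, ignores relative imports, and will raise `SyntaxError` on files it cannot parse. The function names are invented for this example.

```python
import ast
from pathlib import Path


def import_graph(root: str) -> dict[str, set[str]]:
    """Map each module under root to the top-level package names it imports."""
    graph: dict[str, set[str]] = {}
    for path in Path(root).rglob("*.py"):
        module = path.relative_to(root).with_suffix("").as_posix().replace("/", ".")
        tree = ast.parse(path.read_text(encoding="utf-8"))
        deps: set[str] = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                deps.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                deps.add(node.module.split(".")[0])
        graph[module] = deps
    return graph


def to_dot(graph: dict[str, set[str]]) -> str:
    """Render the graph as Graphviz DOT text, one edge per import."""
    lines = ["digraph deps {"]
    for module, deps in sorted(graph.items()):
        for dep in sorted(deps):
            lines.append(f'    "{module}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)
```

Feeding the output of `to_dot(import_graph("path/to/project"))` to Graphviz would give a picture of module-level dependencies, which tends to be far less noisy than a full call graph.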

~~~
disconnected
Ages ago (back when Python 2.5 was still a thing) I used a lib that would
generate a call graph for Python programs. I think it was this one:

    http://pycallgraph.slowchop.com/en/master/

I don't remember much, but I think this wasn't terribly useful at the time,
because it became very messy very quickly. Any moderately complicated
application will be calling a dozen functions, and the graph then becomes a
gigantic mess of interconnected nodes which is very hard to read.

But like I said, it was ages ago, so I don't know how good/bad it is now.

------
mayoff
It's not obvious, but the graphs on this page are interactive. You can drag
the nodes around, and click them to change their colors.

~~~
hughes
I guess that would explain the insanity surrounding the node labels. Each
number has position:fixed and a coordinate relative to the scroll position of
the viewport. They update their position slightly out of sync with the
scrolling motion, which is extremely unsettling to watch. Seems almost
designed to induce motion sickness.

------
gweinberg
I think if you want to represent dependencies by a graph, it should be a
directed graph, not an undirected graph.

I think once you put in direction, you will see that cycles are much more
important than overlaps in edges in determining complexity. If you can keep
the graph acyclic, life is pretty easy. I think big cycles are worse than
small cycles, and cycles that overlap each other make things much more complex.
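
Checking for cycles in a directed dependency graph is cheap. A small sketch using the standard library's `graphlib` (Python 3.9+), where the module names are invented and each key maps to the set of things it depends on:

```python
from graphlib import CycleError, TopologicalSorter


def build_order(deps: dict[str, set[str]]) -> list[str]:
    """Return modules ordered so each comes after everything it depends on.

    Raises graphlib.CycleError if the dependency graph contains a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


acyclic = {
    "app": {"db", "http"},
    "db": {"config"},
    "http": {"config"},
    "config": set(),
}
print(build_order(acyclic))

cyclic = {"orders": {"billing"}, "billing": {"users"}, "users": {"orders"}}
try:
    build_order(cyclic)
except CycleError as err:
    # The second argument of the exception lists the nodes on one cycle.
    print("cycle found:", err.args[1])
```

An acyclic graph always has a valid build order; as soon as a cycle appears, no ordering exists and every module on the cycle effectively has to be understood together.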

~~~
jpfed
I wonder whether it would be possible to look at dependencies the way McCabe
looked at flow control.

[http://www.literateprogramming.com/mccabe.pdf](http://www.literateprogramming.com/mccabe.pdf)

That is, we use a strategy for substituting away subgraphs; we can classify
programs by what results from that process. It may turn out that there are
particular problematic subgraphs to avoid.

------
crb002
In the context of HOTT? That most code in the real world can be embedded in
low dimensions, and when you embed something on a surface of small genus you
can do a bunch of optimizations.

Also, that you can use the finite set of minimal graph minors for a surface of
genus X to do case analysis instead of having an infinite amount of cases to
test with.

------
wallflower
Interesting analogy.

When I was told about this many years ago, it was described as the 'n
factorial problem'. This problem assumes that directionality of the
communication is important. If you have two nodes, there are 2! ways to
communicate. If you have three things, there are 3! or 6 ways to communicate
between the nodes. The suggested solution was similar: group components
together behind a black box facade so that 30 components can be reduced to 3!
main communication paths or fewer.
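
One way to make the count concrete, under the assumption that a "way to communicate" means an ordered (sender, receiver) pair: that gives n(n-1) directed channels, which coincides with n! for n = 2 and n = 3 but grows much more slowly after that. A quick sketch:

```python
from itertools import permutations
from math import factorial


def directed_channels(n: int) -> int:
    """Number of ordered (sender, receiver) pairs among n components."""
    return sum(1 for _ in permutations(range(n), 2))


for n in (2, 3, 4, 30):
    print(n, directed_channels(n), factorial(n))
```

Under this reading, 30 components have 870 possible directed channels, while three facades have only 6 between them, so the black-box grouping still pays off handsomely.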

------
debacle
> The first thing that struck me, as a software engineer, is that four colors
> is not very many.

I really like the visualization. It's a happy medium between something like
UML and something that's actually useful, and creates useful constraints
around communication channels between classes.

I'm sure there's some design pattern that breaks the rule, but overall trying
to force a "4 color" programming pattern seems like it can't do anything but
help.

------
aryehof
This reminds me of the Complexity Principle, as stated by Dan Ingalls who was
responsible for much of Smalltalk ...

"If any part of a system depends on the internals of another part, then
complexity increases as the square of the size of the system."

A link to a slide from him showing n^2 dependencies -
[http://bit.ly/2t0xukK](http://bit.ly/2t0xukK)

------
gameguy43
If you were weirdly interested in the four color theorem, you might like this
common coding interview question about graph coloring:
[https://www.interviewcake.com/question/graph-coloring](https://www.interviewcake.com/question/graph-coloring)

~~~
crote
That's a rather nasty interview question, though. If you've never heard of the
problem before, writing the greedy algorithm isn't too bad. But it is actually
more challenging if you are familiar with graph colouring, as knowing that
it's NP-complete might be enough to stop one from even considering this to be
an edge case...

If you really want your interviewees to hate you, ask them to solve it using D
colours!

~~~
Someone
The problem description doesn’t specify “planar”, and K4 has degree 3 for each
of its four vertices, but (obviously) requires 4 colors (and K5 has maximum
degree 4, but requires 5 colors, etc.)

Also, K3, K2 and K1 are planar with maximum degree 2, 1, respectively 0, but
require 3, 2, 1 colors.

So, in general, D colors isn’t sufficient.
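
A sketch of the greedy approach shows both halves of this: it never needs more than D + 1 colours, and on a complete graph it genuinely uses all of them. The function names are made up for this example, and the node visiting order is simply dictionary order:

```python
def greedy_colouring(adjacency: dict[int, set[int]]) -> dict[int, int]:
    """Give each node the smallest colour unused by its coloured neighbours.

    A node with d neighbours sees at most d taken colours, so one of the
    colours 0..d is always free; overall at most D + 1 colours are used.
    """
    colour: dict[int, int] = {}
    for node, neighbours in adjacency.items():
        taken = {colour[n] for n in neighbours if n in colour}
        colour[node] = next(c for c in range(len(neighbours) + 1) if c not in taken)
    return colour


def complete_graph(n: int) -> dict[int, set[int]]:
    """K_n: every node is adjacent to every other node."""
    return {i: set(range(n)) - {i} for i in range(n)}


k4 = complete_graph(4)
colours = greedy_colouring(k4)
print(max(len(nbrs) for nbrs in k4.values()))  # 3 (maximum degree D)
print(len(set(colours.values())))              # 4 (colours used: D + 1)
```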

------
lobo_tuerto
Did he miss a connection between the top red blob and the green one?

Also, what's the name for the process for transforming a "map" into a
connected network, just like he did at the beginning?

------
he0001
How do you color things like a method returning an (un)ordered list which you
are depending on? There’s no code to "color"?

------
evanspa
Shouldn't there be a connection between the top-right red circle and green
circle in that very first graph?

------
0xdeadbeefbabe
Also known as tight vs loose coupling.

------
MattRix
This is a nice analogy, but these days I feel like abstraction is usually a
bad thing. By its very nature, making things more abstract makes it less clear
what you are actually doing, you end up going through all these layers of
indirection. If I want to use the data from some node (aka object) way over
there, I should just be able to use it, without it getting pulled through all
of these other things first.

To use the same node graph analogy, when you focus too much on reducing the
number of colors, you end up needing to create tons of nodes between the nodes
that actually DO SOMETHING. Your software ends up more complex and difficult
to manage.

~~~
pdpi
By its very nature, abstraction separates the incidental from the fundamental
(where "incidental" and "fundamental" are defined on a per-abstraction basis).

If the problem you're trying to solve focuses on the fundamentals of the
abstraction, you're most certainly gaining clarity (possibly at the expense of
performance).

If you find the abstraction's making you lose clarity, it might very well be
that you're really working on incidental complexity, but there's also a
distinct chance you just chose the wrong abstraction.

~~~
MattRix
That makes sense in a perfect world where the abstractions are perfect, but
they rarely are. We make assumptions about the abstractions that often turn
out to be false. Obviously a certain level of abstraction is necessary, but in
general modern software development goes way too far with it.

