
Software Entropy - 1penny42cents
https://camhashemi.com/2020/06/23/software-entropy/
======
__MatrixMan__
I think the intro description introducing entropy is missing something. If the
shoe closet sorted by color and the chaotic one are both composed of the same
set of shoes then they have equal entropy. You can transpose any two shoes of
the same color to achieve an additional microstate whether or not they're
arranged by color.

I suppose it's technically correct if what you're interested in is the ways
shoes can be positioned in a closet, but why bring up color then? The example
sets up a comparison between some small finite number of ways to arrange shoes
on a shelf and potentially infinite ways to arrange them in a pile. Better,
I'd say, to stick to a grid in both cases and compare the number of
microstates in the "all black" macrostate against the number of microstates in
the "half brown half black" macrostate.
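The grid comparison suggested above is easy to make concrete. A minimal Python sketch of the counting (slot count of 8 is an arbitrary choice; shoes of the same color are treated as interchangeable, so a microstate is just a color pattern on the grid):

```python
from math import comb

SLOTS = 8  # fixed grid of shelf positions

def microstates(num_black: int) -> int:
    # A color pattern is determined by choosing which slots hold black shoes.
    return comb(SLOTS, num_black)

print(microstates(SLOTS))      # "all black" macrostate: 1 pattern
print(microstates(SLOTS // 2)) # "half brown half black" macrostate: 70 patterns
```

The "half brown half black" macrostate has 70 times as many microstates as "all black", which is the apples-to-apples entropy comparison the comment is asking for.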

~~~
raiflip
That's not quite right. The author is reaching for a valid definition but is
also not quite getting it right (entropy is hard).

Entropy is defined as: "It quantifies the number of microscopic configurations
that are consistent with the macroscopic quantities that characterize the
system."

There are two levels. Microscopic - which in this case doesn't necessarily
mean invisible to the eye; it means the individual units of the system, in
this case shoes. And macroscopic - the system as a whole, or in this case,
the set of all shoes. A single microscopic state is the set of shoes in some
particular (maybe random) order.

Finally, "macroscopic quantities" refers to properties of the macroscopic
system. In this case the author is using the macroscopic quantity of "how
fast can I find a shoe of a particular color".

In this case, entropy measures, for a given macroscopic quantity, how many
microscopic configurations exist. For example, let's say the given macroscopic
quantity is "I find a given color shoe in O(n) time." Then every configuration
of shoes that does not attempt to order the shoes is a possible microscopic
state. There are lots of those, so high entropy.

Another possible macroscopic quantity is "I find a given color shoe in O(logn)
time". In this case the shoes need to be ordered, leading to far fewer states.
Hence lower entropy.

If this is not clear, it is easier to see if we constrain ourselves to having
one pair of shoes for each color. Then the O(logn) configuration has 1 state,
while O(n) clearly has many more, so obviously the O(logn) state has lower
entropy.

However, the same logic applies when we have >1 pair of shoes of each color.
In the O(logn) case, we can only swap shoes of the same color and keep the
same macro state. In the O(n) case, we can swap any shoes of any color, as
long as the swap isn't the one that produces the ordered arrangement.
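The argument can be checked by brute force. A small Python sketch (two colors with two distinguishable shoes each is an arbitrary choice; "grouped by color" stands in for the binary-searchable O(logn) arrangement):

```python
from itertools import permutations

# Two distinguishable shoes of each of two colors.
shoes = [("black", 1), ("black", 2), ("brown", 1), ("brown", 2)]

def grouped_by_color(arrangement):
    # True if all shoes of each color sit contiguously on the shelf,
    # i.e. no color reappears after a different color has followed it.
    finished = set()
    prev = None
    for color, _ in arrangement:
        if color != prev:
            if color in finished:
                return False
            if prev is not None:
                finished.add(prev)
            prev = color
    return True

ordered = sum(grouped_by_color(p) for p in permutations(shoes))
total = len(list(permutations(shoes)))
print(ordered, total)  # ordered microstates vs. all microstates
```

Of the 24 total arrangements, only 8 are grouped by color (2 color orders times the within-color swaps), so the searchable macrostate has far fewer microstates: lower entropy, exactly as described.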

------
fredguth
This is a description of Kolmogorov entropy.

------
gandalfgeek
This view is too simplistic. I've come to the realization that ultimately
software is complex because it models some very complex real-world situation.
E.g. how complex can tax software be? Answer: as complex as the tax code,
which is not just complex, but also contains tons of unspecified behavior.
Yes, sometimes you can take a simple problem and write spaghetti code for it,
but IME that is often not the real problem.

~~~
1penny42cents
Right, there is the concept of essential complexity. This is addressed in the
post:

> Sometimes, our problems are essentially complex. In these cases, our
> solutions need some essential complexity to match. But when does essential
> complexity become unnecessary?

We can measure this when the number of total states dominates the number of
valid states. The number of invalid states is a measure of accidental
complexity.
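The Enum vs String case makes the "total states dominate valid states" point vivid. A toy Python sketch (the `Status` field and its three values are hypothetical, chosen only for illustration):

```python
from enum import Enum

# Hypothetical field: only three values are ever meaningful.
class Status(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    CLOSED = "closed"

VALID = 3  # valid states under either representation

# As a free-form lowercase ASCII string of length 1..7, the total
# state space dwarfs the valid one; every extra state is accidental.
total_strings = sum(26 ** n for n in range(1, 8))
print(VALID / total_strings)  # stringly typed: vanishingly small
print(VALID / len(Status))    # enum: every representable state is valid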

~~~
loopz
There's also a concept of unnecessary complexity. Business processes
especially are vulnerable to this, and it may hint at policy or design issues.

------
wellpast
This reasoning here is severely mistaken.

Static typing does not change the complexity dynamics of your programming
model. It's a mere tool.

If I statically type to DateTime I can still be wildly baffled to see a
transaction take place in the future. Or even more baffled when at first
glance the DateTime looks correct (i.e. in the _past_ and recent), only to
find out later, when I get the support phone call, that we incorrectly
tallied the balance.

Static typing just restricts where I can put my shoes in my closet. If I find
a need to sort all the "righties" together -- for example, to figure out how
many lefties I'm short -- static typing makes things _more_ complex -- because
it will _disallow_ this innovation.

Complexity is a count of _concepts_ in your model, not a count of _instances_,
and not a count of state possibilities, which in any _real world_ system are
going to be endless no matter how much time you waste on static proofs.

~~~
1penny42cents
If we take `cleanliness = number of valid states / number of total states`,
then using DateTime vs String is strictly cleaner.

Yes, there are still many invalid states within a strong type and static
typing is just one form of reducing complexity. I'm actually much more
interested in cleanliness of design and architecture, but the String vs
DateTime or Enum vs String examples are easy ways to illustrate the idea.
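The String vs DateTime comparison can be made concrete with a few lines of Python (the candidate strings below are made up for illustration; `date.fromisoformat` is used as a stand-in validity check):

```python
from datetime import date

def is_valid_iso_date(s: str) -> bool:
    # Only a tiny fraction of all strings denote a real calendar date.
    try:
        date.fromisoformat(s)
        return True
    except ValueError:
        return False

candidates = ["2020-06-23", "2020-13-40", "yesterday", "06/23/2020"]
valid = [s for s in candidates if is_valid_iso_date(s)]
print(valid)
```

A `date` value, by contrast, cannot represent "2020-13-40" or "yesterday" at all: the invalid states are ruled out by construction, which is exactly the cleanliness-ratio improvement being claimed.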

~~~
wellpast
To be clear I’m saying that static typing makes things _more_ complex.

If the original simple argument were true, then everyone would introduce
static typing; it would be a no-brainer.

~~~
hderms
There's a certain size threshold for applications where a statically typed
language becomes extremely valuable. Static proofs that things are "connected"
properly become very valuable, even if, as you say, the benefits aren't
universal. Constraints sometimes simplify things, sometimes they increase
complexity, but it's hard to argue that large systems with many authors would
be universally better if they were unable to use static types to constrain
programs.

------
nestorD
"You should call it entropy [...] nobody knows what entropy really is, so in a
debate you will always have the advantage." John Von Neumann to Claude Shannon

(also why there is clearly a market for the concept of dynamic entropy:
[https://xkcd.com/2318/](https://xkcd.com/2318/) )

~~~
sqrt17
"Software Entropy" sounds so much more scientific and general than "avoid
stringly typed code"

BTW, "stringly typed" has been in use as a name for exactly this behavior to
avoid for maybe 10 years now:
[https://wiki.c2.com/?StringlyTyped](https://wiki.c2.com/?StringlyTyped)

------
7532yahoogmail
Using TLA+, Spin, and other model checkers will help compute actual numbers
for reachable states, etc.
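Model checkers do this exhaustively for real specifications; as a toy illustration only, here is a breadth-first sketch of explicit-state counting for a made-up two-variable system (a lamp toggled by a push button):

```python
from collections import deque

# State: (light_on, button_pressed) for a hypothetical lamp protocol.
def successors(state):
    light, pressed = state
    nxt = {(light, True)}            # the button can always be pressed
    if pressed:
        nxt.add((not light, False))  # releasing it toggles the lamp
    return nxt

def reachable(initial):
    # Standard breadth-first exploration of the state graph.
    seen, queue = {initial}, deque([initial])
    while queue:
        for s in successors(queue.popleft()):
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return seen

print(len(reachable((False, False))))
```

TLA+ (via TLC) and Spin apply the same idea at scale, with specs instead of hand-written successor functions, which is what yields the "actual numbers" for reachable states.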

