
Code Inflation [pdf] - chilgart
https://www.computer.org/cms/Computer.org/ComputingNow/issues/2015/04/mso2015020010.pdf
======
userbinator
Those who don't see anything unusual or wasteful about an 8KB+ executable
whose only action is to exit and return a value are recommended to watch this
demo generated in realtime by a 4KB executable:

[http://www.youtube.com/watch?v=jB0vBmiTr6o](http://www.youtube.com/watch?v=jB0vBmiTr6o)

There's more here:
[http://www.pouet.net/prodlist.php?type[]=4k](http://www.pouet.net/prodlist.php?type\[\]=4k)

The demoscene is basically the exact opposite of mainstream software culture,
and although it's focused on multimedia shows, I think some of their
techniques and underlying motivation could be applied more generally...

I wonder if /bin/true and /bin/false in their simplest forms even meet
the minimum "requirement for creativity" to be copyrightable, and if the
reason they have bloated so much is so that they could be.

~~~
morenoh149
Here's an awesome one I found
[http://koti.kapsi.fi/rimina/living_in_a_box_720p.html](http://koti.kapsi.fi/rimina/living_in_a_box_720p.html)
Sad buildings

------
PythonicAlpha
The problem with code inflation is not the space needed on disk or in
memory. The real problem lies in the quality assurance area. With every byte a
program grows (in source code), the complexity of the system also grows with
it. It is not only inherent complexity, but also extrinsic complexity -- for
example dependencies on dynamic libraries and also dependencies on the
operating system.

With the growing complexity, the systems become more and more fragile and
difficult to maintain. You can see that when software fails on one computer
and runs on another computer with the same operating system, because some
little, weird dependency (e.g. on the graphics card) makes the program
misbehave on that particular computer.

Sometimes I am just worried that all the computer scientists of the world
are making the world of computers more and more complex, and that one day the
software will become unmanageable. Even today, as a normal computer user, I
sometimes get the impression that computers take more time (installing updates
and updates of the updates, worrying about threats, getting the best virus
scanner ...) than they save us.

------
matthewbauer
Here's a link to the coreutils source for true if anyone's interested:

[http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/true...](http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/true.c)

I think the argument can be made that command-line tools shouldn't provide
versioning and help text at all - leave it up to the manual pages.

~~~
wyldfire
> I think the argument can be made that command-line tools shouldn't provide
> versioning and help text at all - leave it up to the manual pages.

I'd have a real hard time agreeing with that. But I could make an exception
for /bin/true.

~~~
userbinator
I think that the _code_ to handle versioning and help text (except a brief
usage message, which is often useful) shouldn't be included in each binary.

I agree that help text should be the purpose of the manpages.

A version number (or pointer to version string) could be one of the fields in
the file header, to be read by a tool ("version"?) and displayed that way.

This way you still get versioning and documentation, without a massive
duplication of functionality across every single binary.

------
greggyb
So, I found this very interesting to read.

I created an empty file, a la the original true, named truth and placed it at
the beginning of my $PATH.

    
    
    $ time truth

    real 0m0.002s
    user 0m0.000s
    sys  0m0.003s

I then realized that since true is a builtin, I'd need to call the true binary
with its full path. I'd hate to give one an unfair advantage by having to
search the path vs being called by name. So....

    
    
    $ time /usr/bin/true

    real 0m0.001s
    user 0m0.000s
    sys  0m0.000s

    $ time /usr/local/sbin/truth

    real 0m0.003s
    user 0m0.000s
    sys  0m0.000s

    $ time true # The builtin

    real 0m0.000s
    user 0m0.000s
    sys  0m0.000s

Clearly my 2KB true binary is superior to an empty file. The builtin
(obviously) outperforms both, by at least one order of magnitude (estimated).

This post is not intended to be a serious performance comparison.
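
A single run of time on a process this short sits at the resolution of the timer, so the figures above are mostly noise. A rough sketch for averaging many launches, assuming a POSIX system with Python 3 and a true binary on the PATH (the helper name mean_spawn_seconds is made up for illustration):

```python
import shutil
import subprocess
import time


def mean_spawn_seconds(argv, runs=200):
    """Return the mean wall-clock time to launch argv and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(runs):
        # check=False: only the launch cost matters here, not the exit status.
        subprocess.run(argv, check=False)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    true_path = shutil.which("true")  # assumption: a true binary is on PATH
    per_launch_ms = mean_spawn_seconds([true_path]) * 1e3
    print(f"{true_path}: {per_launch_ms:.3f} ms per launch")
```

The result includes Python's own fork/exec overhead, so only the difference between two commands measured this way means anything, not the absolute numbers.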

~~~
function_seven
I just repeated your test, but got opposite results:

    
    
    $ time /usr/bin/true

    real 0m0.003s
    user 0m0.001s
    sys  0m0.002s

    $ time /usr/bin/truth

    real 0m0.001s
    user 0m0.000s
    sys  0m0.001s

I can't imagine for the life of me how an empty file could perform worse than
an actual binary, but there's gotta be a reason.

~~~
cpwright
If you rely on an empty file, or a file containing just exit 0;, instead of
an actual binary, a shell process has to be started to evaluate it. The binary
doesn't need to load a shell; it can just do a bit of setup, then make a
single exit(0) system call.

~~~
function_seven
Thank you. That makes sense. So basically the empty file is seen by the shell
as a "script" that has to be evaluated, so a separate shell process is spun up
to handle it? And the difference between my output and the parent's has to do
with some difference in our environments?
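
The shell fallback described here can be observed directly. The following is a minimal sketch, assuming a POSIX system with Python 3, an sh on the PATH, and a temporary directory that allows execution; run_empty_file is a made-up helper name. Launching the empty file without a shell should fail with ENOEXEC, while launching it from sh should succeed with status 0 because the shell interprets the file as an empty script.

```python
import errno
import os
import subprocess
import tempfile


def run_empty_file():
    """Return (errno from a direct exec, exit status when launched via sh)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "truth")  # same name as in the thread
        open(path, "w").close()            # zero bytes, like the original true
        os.chmod(path, 0o755)

        try:
            # No shell involved: subprocess calls exec on the file itself.
            subprocess.run([path])
            direct_errno = 0
        except OSError as exc:
            direct_errno = exc.errno

        # Launched from a shell: sh sees the exec fail and falls back to
        # interpreting the file as a script with no commands in it.
        via_shell = subprocess.run(["sh", "-c", '"$0"', path]).returncode
        return direct_errno, via_shell


if __name__ == "__main__":
    direct_errno, via_shell = run_empty_file()
    print("direct exec errno:", errno.errorcode.get(direct_errno, direct_errno))
    print("exit status via sh:", via_shell)
```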

------
roneesh
A good and informative read. While I'm new to programming, I have began to
judge my productivity in terms of lines of code I remove, not add to a
project.

~~~
proksoup
If all programmers did what you are doing, we would not have the problems we
do.

I hope someone gives you a medal and merit badge and a cookie.

[1] have begun _

~~~
icebraining
Agreed; I now always run a minifier before committing.

~~~
hueving
Not sure if sarcastic...

Reducing lines doesn't matter. Reducing logic is what matters.

------
legulere
The same is true for standards. There are often so many non-features that make
the standard harder to implement yet bring no real advantage. With standards
the problem is that you have to implement everything to be standard-compliant.
And then you have a reluctance to remove the cruft that nobody uses in newer
versions of the standard.

I'm currently working a bit with SVG paths. There are some features that
aren't really used that much in the wild. For instance quadratic bezier
curves, arcs, a shorthand syntax for successive horizontal/vertical lines in a
subpath. Those things could be debated but they are okay.

Then we also have things that are really just unneeded. You can also write
numbers in the form xxEyy or xx.yyEzz. Scientific notation has limited uses,
and computer graphics is not one of them. You can use a comma in addition to
optional whitespace, but only at specific locations and at most one. Also
there's exactly one place in the grammar where the whitespace is not marked
optional.
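
To make the number syntax concrete, here is a rough sketch of a tokenizer for the numbers in path data. The regular expression is a simplified reading of the rules described above (optional sign, optional fraction, optional exponent), not the SVG grammar verbatim, and path_numbers is a hypothetical helper name. It ignores the comma placement rules entirely and treats anything that is not part of a number as a separator.

```python
import re

# Simplified pattern for one number in SVG path data: optional sign,
# digits with an optional fraction (or a bare fraction), optional exponent.
SVG_NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def path_numbers(data):
    """Extract every number from a path-data string as a float."""
    return [float(tok) for tok in SVG_NUMBER.findall(data)]


if __name__ == "__main__":
    # Commas, whitespace, command letters, and signs all end a number here.
    print(path_numbers("M10-2.5e1,.5 L1E2 3"))
```

Even this loose version has to handle a sign that doubles as a separator and an exponent that can carry its own sign, which is the kind of implementation cost the comment is pointing at.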

------
raverbashing
This article is the best response to the other article on the front page about
how optimizations are "useless"

------
al2o3cr
"OMG, /bin/sh increased in size by 191x from 1974 to 2015!" seems like the
byte-counting equivalent of people who lose their minds about "YOUR FOOD HAS
CHEMICALS IN IT WITH COMPLICATED NAMES!". Both sound impressive and might make
you worry - and both omit important facts that would greatly reduce that
effect.

For instance, 191x growth since 1974 seems steep until you realize that the
corresponding storage has grown from, say, 10.5MB (capacity of an RL02 for the
PDP11) to ~1TB; a scaling of 100000x over the same period. That's not even
really comparing apples-to-apples; the RL02 was not the sort of thing you'd
have on your desk.
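
The arithmetic, using only the figures quoted above and decimal units (an assumption, since the comment does not say which units it means), comes out at roughly 95,000x for storage:

```python
# Figures quoted in the comment, in decimal units.
rl02_bytes = 10.5e6    # 10.5 MB
modern_bytes = 1e12    # ~1 TB
sh_growth = 191        # growth of /bin/sh from 1974 to 2015

storage_growth = modern_bytes / rl02_bytes
relative = storage_growth / sh_growth

print(f"storage grew about {storage_growth:,.0f}x")
print(f"that is about {relative:,.0f} times the growth of /bin/sh")
```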

~~~
abecedarius
If we'd gotten 191 times better at avoiding bugs and security issues (and
those problems scaled linearly with code size), this would be to the point.

------
ignoramous
Mandatory folklore link:
[http://www.folklore.org/StoryView.py?story=Negative_2000_Lin...](http://www.folklore.org/StoryView.py?story=Negative_2000_Lines_Of_Code.txt)

------
logicallee
Ironically, this article looks like it grew organically from the 14 words

    
    
       "Software tends to grow over time, whether or not there’s a need for it"
    

that it highlights, until it contained an introduction, a First Law, a table
of data about the Unix "true" command, the same data as a figure, a logarithmic
chart of bash, and a foray into "dark code" (which can be present at any
size.)

------
peterfirefly
I'm a big fan of Gerard Holzmann -- his work is well worth checking out.

[http://en.wikipedia.org/wiki/Gerard_J._Holzmann](http://en.wikipedia.org/wiki/Gerard_J._Holzmann)

------
ExpiredLink
>> _So, why does software grow? The answer seems to be: because it can._

Not really a meaningful explanation. Nor is the example.

