
The size of programs, in terms of disk, memory, CPU time, and network usage, is bloated by multiple orders of magnitude by all the confused people who think the only thing that matters is "developer productivity". Maybe 20% is worth sacrificing, maybe 50%, but 100x? 1000x? It all adds up.

One really easy and relevant example, sizes of docker images for running memcached:

  vagrant@dockerdev:~$ sudo docker images | grep memcached
  memcached                     latest              0868b36194d3        2 weeks ago         132.2 MB
  sylvainlasnier/memcached      latest              97a88c3744ef        13 months ago       297.4 MB
  ploxiln/memcached             2015-07-08          aa4a87ee2c05        5 months ago        7.453 MB
(that last one is my own, the other two are the two most popular on docker hub).

As another example, a co-worker was recently working with some (out-of-tree) gstreamer plugins, and the most convenient way to do so was with a docker image in which all the major gstreamer dependencies, the latest version of gstreamer, and the out-of-tree plugins were built from source. The offered image was over 10GB and 30 layers, took quite a while to download, and a surprising number of seconds to run. With just a few tweaks it was reduced to 1.1GB and a handful of layers that runs in less than a second. It was just a total lack of care for efficiency that made it 10x less efficient in every way, enough to actually reduce developer productivity.

Size matters.




> the confused people who think the only thing that matters is "developer productivity".

Developers, especially good developers (or hell, even just competent ones), are more than worth the effort put into improving their productivity, and the good ones will usually intuitively have a grasp of the XKCD time trade-off graph and reduce or eliminate delays themselves given the chance.

That being said, even in this day and age of extremely cheap cycles, non-volatile and volatile storage, and insane throughput, making something like VM/chroot images smaller can lead to higher productivity in that you can spin them up faster, or spin up tons more in parallel than you would normally think of. Having the option to do such can help shape alternate modes of development and open up possibilities previously undreamt of ("spin up 1000 docker images? Can't do that because they each need 200MB RAM and I only have 32GB of RAM").


Size of cruft aside, there's value in discussing whether such cruft should exist.

It's normal for common tools to be SUID root - it's necessary for operation on a normal machine. Do you really need 30+ SUID binaries inside your Docker container built for one thing?

Docker seems to present an ideal situation for stripping such potential exploit vectors back.
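To see how big that attack surface actually is, you can enumerate the SUID/SGID files inside an image and strip the bits from anything the service doesn't need. The `find` and `chmod` invocations below are standard; whether any given binary is safe to strip is something you have to verify per image.

```shell
# In a running container, enumerate SUID/SGID binaries like so:
#   find / -xdev -type f -perm /6000 2>/dev/null
# In a general-purpose base image that list is often surprisingly long.

# Demonstration on a throwaway file: set the SUID bit, then strip it
# the same way a Dockerfile cleanup step would.
tmp=$(mktemp)
chmod 4755 "$tmp"
stat -c %a "$tmp"    # 4755: SUID bit set
find "$tmp" -perm -4000 -exec chmod a-s {} +
stat -c %a "$tmp"    # 755: SUID bit stripped
rm -f "$tmp"
```

In a Dockerfile this would typically be a single `RUN find / -xdev -perm /6000 -type f -exec chmod a-s {} +` near the end of the build, assuming nothing in the container relies on privilege escalation.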


Are you able to share any of the tweaks that were used?


One really easy one: write a shell script to do most of the image building (run by the Dockerfile), instead of adding a bunch of RUN directives in the Dockerfile, especially if you clean up intermediate files with a "make clean" or something. Each directive in the Dockerfile adds a layer, which adds container setup overhead, and also "locks in" all filesystem space usage at that point: files deleted by a later RUN still occupy space in the earlier layer.
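A minimal sketch of that pattern (the base image and script name are placeholders, not from the thread):

```dockerfile
FROM debian:stable-slim

COPY build.sh /usr/local/src/build.sh

# One RUN directive, so the whole build is a single layer. build.sh would
# fetch sources, compile, install, and then delete build trees and package
# caches; because the cleanup happens in the same layer as the build, the
# intermediate files never get locked into the image.
RUN /usr/local/src/build.sh && rm -f /usr/local/src/build.sh
```

Contrast that with a Dockerfile where the fetch, compile, install, and cleanup are each their own RUN line: the cleanup layer can't reclaim space already committed by the earlier ones.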



