Reverse Engineering a Docker Image (theartofmachinery.com)
176 points by ingve on March 18, 2021 | 20 comments

I use the tool dive for that, and also to debug my Docker builds. You can even run it inside Docker (if you expose the Docker socket).


In addition to Dive, there's also Whaler https://github.com/P3GLEG/Whaler which will print out a Dockerfile from the image, based on the metadata in the image.

You can also use Portainer https://www.portainer.io/ which will show the image layer details in the images section.

+1 for Dive.

Dive will even show you the exact command (directive?) that resulted in a layer being added. It also lets you browse the file system at every step.

It was always surprising to me that Docker doesn’t just slurp up the Dockerfile just in case so you could get it back easily if you really needed to.

With the right format string (--format), I think docker history --no-trunc might do this.

    docker history --no-trunc <image_id> --format '{{ .CreatedBy }}' | tail -r
This is pretty close. (Note that tail -r is the BSD/macOS way to reverse lines; on GNU/Linux, tac does the same job.)
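
If you want to go one step further than reversing the lines, here is a rough Python sketch that turns the CreatedBy lines into something Dockerfile-shaped. It assumes the classic builder's "/bin/sh -c #(nop)" marker on non-RUN instructions, as seen in the history output elsewhere in this thread; the function name history_to_dockerfile is made up for illustration, and the result is only an approximation of the original Dockerfile.

```python
import re

# Marker the classic builder records for instructions that do not run a command.
NOP_PREFIX = re.compile(r"^/bin/sh -c #\(nop\)\s*")
RUN_PREFIX = "/bin/sh -c "


def history_to_dockerfile(created_by_lines):
    """Turn CreatedBy lines (newest first) into approximate
    Dockerfile lines (oldest first)."""
    result = []
    for line in reversed(created_by_lines):
        line = line.strip()
        if not line:
            continue
        if NOP_PREFIX.match(line):
            # Metadata instructions (CMD, WORKDIR, ADD, ...) follow "#(nop)".
            result.append(NOP_PREFIX.sub("", line))
        elif line.startswith(RUN_PREFIX):
            # Anything else run through the shell is treated as RUN.
            result.append("RUN " + line[len(RUN_PREFIX):])
        else:
            # Unrecognized shape (e.g. build-arg prefixes like "|6 ..."):
            # keep it verbatim as a comment rather than guess.
            result.append("# " + line)
    return result
```

Feed it the output lines of the docker history command above (without the tail/tac step, since it reverses the list itself).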

Google's forensics team also maintains a Docker introspection tool: https://github.com/google/Docker-Explorer

There are some subtle differences between tools that generate Docker images.

As seen in this Dive issue[1], Google's Kaniko uses a different naming convention for the config files. Docker config files are called <hash>.json, Kaniko uses "sha256:<hash>".

[1] https://github.com/wagoodman/dive/issues/318
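
One way a tool could sidestep this, sketched in Python under the assumption that the image is a `docker save`-style tarball: read the config path from the "Config" field of manifest.json instead of guessing the file name. The helper names read_image_config and config_hash are hypothetical, and this is not a description of how Dive handles it.

```python
import io
import json
import tarfile


def read_image_config(tar_bytes):
    """Return (config_name, config_dict) for the first image in a
    `docker save`-style tarball, whatever the config file is called."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        manifest = json.load(tar.extractfile("manifest.json"))
        # e.g. "<hash>.json" (Docker) or "sha256:<hash>" (Kaniko)
        config_name = manifest[0]["Config"]
        config = json.load(tar.extractfile(config_name))
    return config_name, config


def config_hash(config_name):
    """Normalize either naming convention to the bare hex digest."""
    name = config_name.rsplit("/", 1)[-1]
    if name.startswith("sha256:"):
        name = name[len("sha256:"):]
    if name.endswith(".json"):
        name = name[: -len(".json")]
    return name
```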

The tool lazydocker will also construct a pseudo-Dockerfile from the history. It's good enough that most of the time you can just copy it and get a working Dockerfile out of it.

It's also super awesome in general.

Where can I find this project? I only manage to find a GUI client for Docker when searching for that name.

Homebrew: https://formulae.brew.sh/formula/lazydocker

GitHub Repo: https://github.com/jesseduffield/lazydocker

It’s a great tool! Docker Desktop is now shipping with a lot of the same features built into its GUI, too.

This is awesome. I’ve always wanted to know how Docker images work under the hood. Now I have a much better understanding.

Working on CI/CD systems, I’ve wanted a fast way to bundle a new layer of copied files without needing the Docker engine.

Does anyone know if there is a tool available for this use case? I know Bazel's Docker rules have some magic for fast file-only layers.

The "hard part" is generating the diff layer itself, the rest is json manipulation: add the layer sha to the manifest and image config.
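
A minimal sketch of both halves in Python, standard library only. It assumes a registry-style (Docker v2 schema 2) manifest and an image config that are already parsed into dicts; build_layer and append_layer are hypothetical names, not any existing tool's API.

```python
import gzip
import hashlib
import io
import json
import tarfile

LAYER_MEDIA_TYPE = "application/vnd.docker.image.rootfs.diff.tar.gzip"


def build_layer(files):
    """Build a layer from {path: bytes}. Returns (gzipped_blob, diff_id)."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tar:
        for path in sorted(files):
            info = tarfile.TarInfo(path.lstrip("/"))
            info.size = len(files[path])
            info.mode = 0o644
            info.mtime = 0  # fixed timestamp keeps the digest reproducible
            tar.addfile(info, io.BytesIO(files[path]))
    tar_bytes = raw.getvalue()
    # The config's diff_id is the digest of the *uncompressed* tar.
    diff_id = "sha256:" + hashlib.sha256(tar_bytes).hexdigest()
    return gzip.compress(tar_bytes, mtime=0), diff_id


def append_layer(manifest, config, blob, diff_id):
    """Return updated copies of (manifest, config) plus the new config bytes."""
    config = json.loads(json.dumps(config))      # cheap deep copy
    manifest = json.loads(json.dumps(manifest))
    config["rootfs"]["diff_ids"].append(diff_id)
    manifest["layers"].append({
        "mediaType": LAYER_MEDIA_TYPE,
        "size": len(blob),
        # The manifest refers to the digest of the *compressed* blob.
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
    })
    # The config changed, so the manifest must point at its new digest.
    config_bytes = json.dumps(config, sort_keys=True).encode()
    manifest["config"]["size"] = len(config_bytes)
    manifest["config"]["digest"] = "sha256:" + hashlib.sha256(config_bytes).hexdigest()
    return manifest, config, config_bytes
```

Uploading the blob, the config, and the manifest to a registry is left out here. A real tool would probably also append a matching entry to the config's history list.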

Would be nice if the post didn't stop abruptly.

I think that last layer is the tarball with the source of the application they wanted.

Would be nice if that /work line was followed by a "and I found everything I needed in /work"

As it is ... maybe /work is everything, maybe not.

Maybe he went to answer the door, where he was met by several Mr Smiths from Agency C, and was taken away.

I've heard it said that Docker should win an award for "most constructive use of tar".

I don't think it was a big secret.

Welcome to Linux, everything is a file.

docker history does a lot of this.

    $ docker history tmknom/prettier:2.0.5 --no-trunc

    sha256:88f38be28f05f38dba94ce0c1328ebe2b963b65848ab96594f8172a9c3b0f25b   10 months ago       /bin/sh -c #(nop)  CMD ["--help
    <missing>                                                                 10 months ago       /bin/sh -c #(nop)  ENTRYPOINT ["/usr/bin/prettier
    <missing>                                                                 10 months ago       /bin/sh -c #(nop) WORKDIR /work
    <missing>                                                                 10 months ago       |6 BUILD_DATE=2020-04-29T06:34:01Z NODEJS_VERSION=12.15.0-r1 PRETTIER_VERSION=2.0.5 REPO_NAME=tmknom/prettier VCS_REF=35d2587 VERSION=2.0.5 /bin/sh -c set -x &&     apk add --no-cache nodejs=${NODEJS_VERSION} nodejs-npm=${NODEJS_VERSION} &&     npm install -g prettier@${PRETTIER_VERSION} &&     npm cache clean --force &&     apk del nodejs-npm                                                                                                                                                                                                                                    42.5MB
    <missing>                                                                 10 months ago       /bin/sh -c #(nop
    <missing>                                                                 10 months ago       /bin/sh -c #(nop)  ARG NODEJS_VERSION=12.15.0-r
    <missing>                                                                 10 months ago       /bin/sh -c #(nop)  LABEL org.label-schema.vendor=tmknom org.label-schema.name=tmknom/prettier org.label-schema.description=Prettier is an opinionated code formatter. org.label-schema.build-date=2020-04-29T06:34:01Z org.label-schema.version=2.0.5 org.label-schema.vcs-ref=35d2587 org.label-schema.vcs-url=https://github.com/tmknom/prettier org.label-schema.usage=https://github.com/tmknom/prettier/blob/master/README.md#usage org.label-schema.docker.cmd=docker run --rm -v $PWD:/work tmknom/prettier --parser=markdown --write '**/*.md' org.label-schema.schema-version=1.0   0B
    <missing>                                                                 10 months ago       /bin/sh -c #(nop
    <missing>                                                                 10 months ago       /bin/sh -c #(nop
    <missing>                                                                 10 months ago       /bin/sh -c #(nop
    <missing>                                                                 10 months ago       /bin/sh -c #(nop
    <missing>                                                                 10 months ago       /bin/sh -c #(nop)  CMD ["/bin/sh
    <missing>                                                                 10 months ago       /bin/sh -c #(nop) ADD file:b91adb67b670d3a6ff9463e48b7def903ed516be66fc4282d22c53e41512be49 in /                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             5.61MB

Obligatory self promotion: https://github.com/nanoscopic/mtsc

It is a tool I created myself for figuring out the contents of layered docker images.

It does these things:

1. Compares two docker images to see exact file differences

2. Diffs the contents of two directory structures

3. Views contents of a docker image without mounting it or using docker itself

4. Generates a standalone index of the contents of a docker image

5. Determines the resultant layer composition of a directory within a docker image

6. Extracts a specific pathed file from a layered docker image

It is extremely helpful when trying to determine the exact differences between two different builds of the same Dockerfile.
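
To give a rough idea of what the first item involves (this is an illustrative Python sketch, not how mtsc is implemented), two flattened filesystem tarballs, such as the output of `docker export`, can be compared by hashing every regular file. The names tar_file_hashes and diff_tars are hypothetical.

```python
import hashlib
import io
import tarfile


def tar_file_hashes(tar_bytes):
    """Map each regular file path in a tar archive to its SHA-256 hex digest."""
    hashes = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                hashes[member.name] = hashlib.sha256(data).hexdigest()
    return hashes


def diff_tars(old_bytes, new_bytes):
    """Return sorted (added, removed, changed) path lists between two tars."""
    old, new = tar_file_hashes(old_bytes), tar_file_hashes(new_bytes)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, removed, changed
```

This only looks at file contents; permissions, ownership, symlinks, and per-layer whiteout files are ignored.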
