
Tell HN: Docker just ate 19GB of production data - fhackenberger
Be _very_ careful with the live-restore feature of docker. Running 'docker volume prune' just removed _all_ my named volumes, which were used by running containers.

See https://github.com/moby/moby/issues/38883
======
gervu
Automation of any sort will sometimes accidentally your data, whether due to
periodic hiccups, system instabilities and bugs, operator misunderstandings or
errors, or random cosmic ray strikes.

The exact reason it blows up isn't even necessarily all that important, other
than in its effect on what you should be doing to reduce the probability of
downtime. Well-engineered systems are routinely developed from less than
completely reliable parts. Stuff fails, we design for it.

It's certainly not a reason not to _use_ it, if it's resulting in a net positive
gain in your ability to get things done and maintain control and transparency
over your deployed systems.

But it's certainly a good reason (among a long list of good reasons) to make
sure you have a good backup routine in place, including regular testing of
both the backups' integrity and your ability to restore a working prod system
from them quickly.
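
As a rough illustration of "testing integrity and the ability to restore", here is a minimal Python sketch that archives a directory, restores it into a scratch location, and compares digests. The helper names `tree_digest` and `backup_and_verify` are made up for this example, and it assumes the data is a plain directory of files that is not being written to while the backup runs.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def tree_digest(root):
    """Hash every file under `root` (relative path plus contents)."""
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def backup_and_verify(data_dir, archive_path):
    """Archive `data_dir`, restore it to a scratch dir, compare digests."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(data_dir, arcname=".")
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive_path, "r:gz") as tar:
            tar.extractall(scratch)
        # True only if the restored tree matches the source byte for byte.
        return tree_digest(data_dir) == tree_digest(scratch)


# Usage with throwaway data; keep the archive outside the data directory.
with tempfile.TemporaryDirectory() as work:
    data = Path(work, "data")
    data.mkdir()
    (data / "rows.csv").write_text("id,name\n1,example\n")
    print(backup_and_verify(data, Path(work, "backup.tar.gz")))  # True
```

A real routine would also restore into a fresh environment and start the application against it, which this sketch does not attempt.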

~~~
boobsbr
Accidentally what? =)

~~~
TrinaryWorksToo
Delete, but the word delete is deleted. Yo dawg.

------
jtchang
This definitely sounds like a bug.

docker volume prune says:

"Remove all unused local volumes. Unused local volumes are those which are not
referenced by any containers"

If it removed a local volume that was being used by a container that is kinda
bad.
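
To make that definition concrete, here is a toy Python sketch of what "unused" means under the quoted description. It is not Docker's code; the `unused_volumes` helper and the data shapes are hypothetical, picked only to illustrate the rule.

```python
def unused_volumes(volumes, containers):
    """Return the volumes that no container references.

    `volumes` is an iterable of volume names; `containers` maps a
    container name to the list of volume names it mounts. Both shapes
    are hypothetical, chosen only to illustrate the documented rule.
    """
    referenced = set()
    for mounts in containers.values():
        referenced.update(mounts)
    # Preserve the input order so the result is predictable.
    return [name for name in volumes if name not in referenced]


volumes = ["pgdata", "uploads", "scratch"]
containers = {"db": ["pgdata"], "web": ["uploads"]}
print(unused_volumes(volumes, containers))  # ['scratch']
```

By this rule a volume mounted by any container, running or stopped, should never show up in the prune list.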

------
orf
1. Why are you running docker volume prune in production?

2. Why are you running docker on ad-hoc machines you need to prune?

3. Why do you even need root access on production machines to fiddle around
with docker commands?

While this is obviously a bad bug (and there are many with Docker), it seems
more of an operational procedures failure than anything else. You could be
saying:

“Beware of rm -rf /, it just deleted 20gb of production data”

Ok. Sure. But why are your tools and procedures putting you in a position
to make that mistake?

~~~
reaperducer
One of the most bothersome parts of HN is when someone tells us about something
that happened, and out come a ream of second-guessing replies. "Why didn't you
just do _this_?" and "Why didn't you just do _that_?" and any number of "It's
so easy to just _thing_ instead!"

We don't know his environment. We don't know his company's policies. We don't
know his hardware, connectivity, or budget issues. These kinds of
passive-aggressive responses are almost never helpful.

~~~
orf
When you reduce it down the title here is “giving people access to running
arbitrary, manual and presumably unrestricted maintenance commands in
production leads to issues”.

That’s not a surprise, and maybe the issue at the core here is not really
Docker. That’s all.

~~~
q_queue
Sure, but regardless of people doing dumb things, it's still worth asking "why
did docker delete non-orphaned named volumes?" -- though you could also
question whether someone was actually mistaken about them not being "orphaned"
- you could probably arrange an unfortunate timing collision between someone
running prune and a container being respawned.

~~~
uponcoffee
Right, that's what raising an issue with the software maintainers is for.

Aside from anecdotes, there's little value in further discussion beyond the
PSA that is the original post; save for prevention/recovery of such events.

~~~
windexh8er
It almost sounds as if the daemon was in the process of starting the
containers and the prune command was issued. If it were run with `-f` and the
container wasn't running those volumes would be deleted. I tried this on a
test system and didn't get the results in the issue.

------
sz4kerto
I really-really hope you are not relying on Docker only when protecting 19G of
data. Docker volume operations are the equivalent of playing with sudo rm -rf,
shit's going to happen once in a while.

~~~
scarface74
I am a Docker newbie but I thought it wasn’t considered best practice to use
Docker for anything where you care about the data in the container. I’ve only
used it for APIs, batch jobs, etc.

~~~
cheez
Docker volumes are persistent, unless they're not :-)

~~~
q_queue
I've never been willing to consider docker volumes persistent. In the big
picture, a requirement of "posix filesystem semantics" and "persistent" is a
pretty inconvenient and/or expensive requirement.

~~~
cheez
I am super paranoid about docker volumes. But I have application-level backup
which is tested daily.

------
praseodym
In the Moby issue you mention that you are using live restore
(https://docs.docker.com/config/containers/live-restore/) which is
most likely where the problem is. Docker daemon restarts, existing containers
are kept alive, but the restarted Docker daemon doesn’t know about those
existing containers yet and thus thinks their volumes are unused.
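
A toy model of that suspected failure mode, sketched in Python. This is not how Docker is implemented; `ToyDaemon` and its methods are invented for illustration, and the only assumption is the one above: after a restart the daemon's view of containers is briefly empty while the containers themselves keep running.

```python
class ToyDaemon:
    """Toy model of a daemon that tracks containers in memory.

    This is NOT Docker's implementation; it only illustrates the
    failure mode described above.
    """

    def __init__(self, volumes):
        self.volumes = set(volumes)   # volumes that exist on disk
        self.known_containers = {}    # container name -> mounted volumes

    def register(self, name, mounts):
        self.known_containers[name] = list(mounts)

    def restart(self):
        # Containers keep running (live restore), but the fresh daemon
        # process starts with an empty view until it reloads them.
        self.known_containers = {}

    def prune(self):
        referenced = {v for mounts in self.known_containers.values() for v in mounts}
        removed = sorted(self.volumes - referenced)
        self.volumes &= referenced
        return removed


daemon = ToyDaemon(["pgdata", "scratch"])
daemon.register("db", ["pgdata"])
print(daemon.prune())    # ['scratch']: only the unreferenced volume goes

daemon.restart()         # "db" is still running, but the daemon forgot it
print(daemon.prune())    # ['pgdata']: an in-use volume is removed
```

In this model the prune logic itself is correct; the damage comes from running it against an incomplete picture of which containers exist.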

------
RocketSyntax
Not sure what kind of company you work at, but I'd export a copy of your logs
so you don't get canned

------
stcredzero
This makes it sound like it's quite common to use docker containers operating
in a heavily stateful fashion. Is that indeed common nowadays? (Though, the
state in this case is only counted on to persist in the named volumes.)

~~~
johnchristopher
Well, you are supposed to be able to delete all your volumes, containers and
networks and then regenerate it by running the recettes (edit:I mean.. recipes
:D) (Dockerfile, docker-compose, kubernetes, volume backup, etc.).

~~~
stcredzero
_Well, you are supposed to be able to delete all your volumes, containers and
networks and then regenerate_

So then Docker is designed to treat all of those as disposable.

I just searched "recette" and only came up with French cooking references.

~~~
ashtonbaker
> I just searched "recette" and only came up with French cooking references.

Perhaps a phonetic spelling of "resets" by a French person :)

~~~
johnchristopher
Something like that :D. I was thinking "recette" as in "recipe" but somehow
only the word "receipt" came to my mind so I thought "oh, it must be one of
those words that are the same in both languages".

I am a bit tired.

------
wiredfool
You just won the “I dropped the production db” achievement.

It’s surprisingly easy with docker, especially when dealing with .... legacy
systems.

------
acid303
My browser ate 16GB of RAM while I've been reading this. The system crashed
but the tabs were still there after a reboot. I'm not even mad anymore.

------
LiamPa
> please assign this bug to an engineer.

The joys of open source users..

~~~
ironmagma
Seems like a reasonable ask.. I’m not sure what the problem with this is?

~~~
LiamPa
Maintain an open source project and you will understand the problem of users
having the ‘just fix it’ attitude.

~~~
ironmagma
I have maintained open source projects before. Docker is a venture-backed
company trying to sell its product which is mostly just hosting and support
for their free product... and in addition, the user did ask nicely, they did
not make demands. There's a big difference between asking that an engineer be
assigned to a task and demanding something be fixed immediately for no bounty.

------
DannyB2
Computers are wonderful. They can do the same work that would require a
thousand people to accomplish in the same amount of time.

Flip side . . .

Computers are terrible. They can screw things up so bad it would require a
thousand people to accomplish in the same amount of time.

------
frenchman99
`docker volume prune` is specifically there to remove volumes, so backing up
before using it seems to be mandatory, just in case. But yeah, if this is a
bug, it's a nasty one.

------
clinta
This bug specifically says it affects anonymous volumes. If you had it delete
a named volume that sounds like a new issue.

