
DigitalOcean not destroying droplets securely, data is completely recoverable - nixgeek
https://gist.github.com/agh/d0e2b115de77b1bcb902
======
powera
I think the moral of the story is that if you are so concerned about
IMMEDIATELY deleting your data that a 48-hour recovery window is
unacceptable, you should definitely run your own servers.

~~~
ceejayoz
Or run on an encrypted volume that you scrub prior to killing off the
instance.

~~~
nikcub
You shouldn't rely on your hosting provider to secure/shred your data; do it
yourself.

encfs is really easy to set up and use; everybody should be using it,
especially on VPSes:

    $ sudo apt-get install encfs

or:

    $ sudo yum -y install fuse-encfs

then:

    $ sudo encfs ~/.encrypted /home/private

The /home/private folder is where you place your files (web sites, etc.);
~/.encrypted is where the encrypted version is stored. When prompted, hit 'p'
for paranoid mode, enter a password, and you're done. Read more about it on
the homepage[0].

When you are done with a server and need to securely delete, unmount the
volume and then use shred -- which is installed on both RHEL- and
Debian-based distributions -- on the encrypted files (shred works on files,
not directories, hence the find):

    $ sudo umount /home/private
    $ sudo find ~/.encrypted -type f -exec shred -fz {} +

[0] [http://www.arg0.net/encfs](http://www.arg0.net/encfs)

~~~
coolj
[https://defuse.ca/audits/encfs.htm](https://defuse.ca/audits/encfs.htm)
[https://defuse.ca/audits/ecryptfs.htm](https://defuse.ca/audits/ecryptfs.htm)

Probably better to use ecryptfs or dm-crypt.
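
For the dm-crypt route, a rough sketch of a file-backed LUKS volume might
look like the following. This is illustrative only: the paths, size, and
mount point are made up, and everything here needs root plus the cryptsetup
package.

```shell
# Create and mount a file-backed LUKS container (illustrative sketch; needs root).
dd if=/dev/zero of=/root/private.img bs=1M count=512
cryptsetup luksFormat /root/private.img      # prompts for a passphrase
cryptsetup open /root/private.img private    # exposes /dev/mapper/private
mkfs.ext4 /dev/mapper/private
mount /dev/mapper/private /home/private

# At decommission time: unmount, close the mapping, destroy the container.
umount /home/private
cryptsetup close private
shred -fz /root/private.img
```

Since everything in the container is ciphertext, even just destroying the
LUKS header and keyslots at the start of the file leaves the rest
unrecoverable without the passphrase.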

------
spindritf
Both retaining the IP and being able to recover a destroyed droplet are
strictly features. It would be a problem if someone _else_ could create a
droplet and recover your data, not when you can do it.

~~~
userbinator
> It would be a problem if someone _else_ could create a droplet and recover
> your data, not when you can do it.

This. Sometimes it really is better if it's not truly gone _forever_ - in
other words, you still have the option of recovery. I'm speaking from the
experience of someone who has accidentally lost important, irreplaceable data
that had yet to be backed up.

Especially after reading this:

 _There was also an incident where a third party provisioning service which
integrates with our cloud as well as others was compromised and all of the
servers for those customers were destroyed_

~~~
devhen
Exactly. The default behavior is just as it should be -- scrub the VM itself
but keep a snapshot for 48 hours in case it was deleted accidentally.

There is no security concern here, but if some users are still upset with
this default behavior then they should manually scrub their sensitive data
prior to deleting the droplet or, better yet, use their own hardware.

Frankly, I'm shocked at a lot of the responses here. When I first came across
this article it was immediately obvious to me that this was not an actual
security concern but rather a mixture of misunderstanding and skepticism. So
I came to Hacker News fully confident that the article would have been
efficiently debunked, and instead I find all manner of ill-informed
arguments. People who don't understand the original article, its fallacies,
and the reality of how DigitalOcean handles droplet destroys should not be
running servers, IMO. This stuff should be obvious, and furthermore, if your
data is particularly sensitive, you should already have been operating under
the assumption that the host could not be trusted and that data scrubbing
must be done manually.

------
derefr
DigitalOcean instances run with a machine-shared storage pool (think EC2
ephemeral storage), which is why not securely erasing them was a problem.

The "destroyed instance" you see in the "spawn instance from a template" UI,
on the other hand, is a _snapshot_ of the destroyed instance, taken upon the
instance's destruction. Snapshots are stored in a separate network-object-
storage pool (think S3), and raw-reading your ephemeral storage won't turn up
deleted snapshots.

Securely erasing an instance means erasing its data from shared ephemeral
storage. It doesn't mean erasing any snapshots of it, because snapshots aren't
_located on_ shared ephemeral storage.

~~~
nixgeek
The issue I have with your statement is that, whilst you are likely entirely
correct, this was not the expectation DigitalOcean set when they built the
interface.

The interface clearly states:

"This is irreversible. We will destroy your droplet and all associated
backups"

It is in fact quite reversible, so that statement is plainly untrue.

There are also nasty, very unexpected corner cases in the resuscitation of
droplets, whether or not you had 'Scrub Data' checked, wherein DigitalOcean
tampers with the snapshot contents and replaces the SSH host keys. I totally
understand the point Moisey made about needing to poke at
~/.ssh/authorized_keys so a customer can regain access, but that's very
different from the host keys in /etc/ssh, which are being replaced entirely
unnecessarily, and to significant security detriment, since you now have a
harder job verifying that this is indeed the instance you think it is.

~~~
nwh
I agree. I would have expected data to be nuked, not nuked on one volume and
retained on another.

~~~
derefr
Would you expect deleting a Docker container to delete a Docker image made
from it?

~~~
nixgeek
That's not necessarily a good analogy, since you're talking about a COW
(copy-on-write) situation -- not always, granted, but in this case it seems
to be true.

DigitalOcean admitted that in some cases they are using qcow, so I wouldn't
expect the base image (e.g. Ubuntu 13.10 x64) to disappear when I hit
'Destroy!' (with the appropriate bits set for secure destruction), but any
changes I have made should certainly be gone.

Single layer:
[http://i.stack.imgur.com/Riw7y.png](http://i.stack.imgur.com/Riw7y.png)

Multi layer:
[http://docs.docker.io/en/latest/_images/docker-filesystems-multilayer.png](http://docs.docker.io/en/latest/_images/docker-filesystems-multilayer.png)

In Docker parlance, I would not expect the base image to disappear forever --
that's a separate layer -- but were DigitalOcean serving up containers with a
'Scrub Data' option, I would expect the contents of my writable container
layer to be securely erased.

------
zagi
Thanks for pointing this out. Nearly a year ago we implemented a backup
mechanism which stores a destroyed machine for 24 hours. This is only enabled
for users with a valid paying account. We have sometimes used this mechanism
to return a droplet to a customer who accidentally deleted it; at other
times, when an integrated third party had a security problem, it let us
recover customer droplets.

We will take the necessary steps so that if users enable 'scrub all data
permanently' then we will not store this temporary image, and destroys will
therefore be immediate and permanent.

------
raiyu
Hi Folks,

Just wanted to clarify the issue for anyone who didn't have the time to read
the full gist.

When we first started DigitalOcean we occasionally received tickets from
customers about recovering a droplet that they had destroyed. Unfortunately,
once a droplet was destroyed it was gone from the system and it wasn't
possible to recover it. To help our customers, we decided it was a good idea
to take a temporary snapshot of the droplet after the destroy was issued, one
that would automatically expire. This way, if someone mistakenly destroyed a
droplet they could still recreate it.

This proved to be a lifesaver for many customers of DigitalOcean when a third
party company that provided a provisioning service that integrated with DO,
AWS, Rackspace, etc. was compromised and the attacker issued a delete to all
customers and all instances. Because this mechanism was in place we were able
to recover almost everyone's droplets.

We ran into an issue with securely scrubbing data, which was publicized on
HN, and we implemented a fix immediately with a scrub flag. Unfortunately we
made a mistake and set the default to false. Most customers simply accept the
default -- I do the same thing myself, since I assume the default is the best
course of action -- and this led to the issue resurfacing. This too was
posted to HN, and we immediately decided that the default behavior should be
to scrub.

Prior to this, when a customer selected secure scrub they were taking two
actions -- issuing a destroy and setting a flag -- so it was safe to assume
that they indeed wanted the data completely destroyed. Once we reversed the
default, however, we could no longer use the secure-destroy flag as the
indicator for whether or not a temporary snapshot should be created.

Since we implemented the temporary snapshot feature, we have had 1154
droplets restored after a destroy, across 752 different customers.

That's 752 customers who were elated to find out that they could recover a
droplet that was mistakenly destroyed, so this is obviously a very beneficial
feature; each time one of those customers recovered a droplet, it was a huge
win for them.

We assumed that since the temporary snapshots are automatically destroyed
this would not be an issue. In fact, in the control panel we provide an
additional feature to make the snapshot permanent; otherwise the snapshot is
deleted.

I think the issue brought up here is definitely worth a discussion, and we
take security very seriously. Since the prior HN post regarding our default
behavior, we have been working behind the scenes to make scrubbing the
default, so that the scrub flag could be removed entirely and all destroys,
however issued, would be secure.

That behind-the-scenes work is almost entirely done, so this discussion of
the temporary snapshot is great because it allows us to revisit the issue
once again.

We have not had any other customer complaints that, during a secure destroy,
the droplet, the backups, and the snapshots were not all immediately
destroyed. So it was great to engage in a conversation with the customer to
understand their view on how they wanted these commands to function.

We'll be engaging with engineering tomorrow to see if it's time to begin
phasing out the scrub-data flag and instead perhaps introduce a new flag
which would create a temporary snapshot.

For the UX/UI of the control panel we would make the default behavior of the
destroy create a temporary snapshot and we would have to discuss to see if the
API should behave the same way.

Often API customers are creating and destroying many servers so it may be safe
to assume that they do not want a temporary snapshot, though having default
behaviors differ between the control panel and the API is generally not a good
idea.

I think in general this highlights an issue that all startups deal with: as
the product grows and matures and features are added, there are often
unintended cascading consequences.

In this case we have done our best to do right by the largest number of
customers to ensure that data is safely and securely destroyed while still
providing a default behavior that would protect customers against accidental
destroys whether they be self-initiated or otherwise.

If anyone has any questions regarding this issue or anything else, please
always feel free to email me directly: my first name, Moisey, at DO (expand
that) . com.

Thanks,

Moisey
Cofounder, DigitalOcean

~~~
dkural
Your approach seems very sensible, and the data you provide supports it. I do
hope you stick with this -- and, more importantly, with your data-driven
meta-methodology -- and don't let HN posters who don't face these use cases
drive you to abandon it.

~~~
raiyu
We love to hear feedback from customers because it allows us to refine the
product. From our internal metrics we know that providing an automatic
snapshot recovery function is essential, given the number of times it's been
used, but we can also understand this customer's issue with how data is
destroyed.

The data is destroyed securely, but not as immediately as they would have
liked since the default behavior is to preserve a snapshot until it
automatically expires.

These discussions are great for us because they allow us to refine the user
experience and ensure that we provide a great service. As we've been doing a
tremendous amount of work on the backend code, this is a great time for this
customer's concerns to be discussed: we are knee-deep in large rewrites at
the moment, and it's the perfect time to address this, rather than deploy to
production and then have to go back and retool.

Given that we've been working to make secure destroy happen every time,
having it be an option doesn't make sense; instead we can use that UX/UI real
estate to expose the temporary snapshot option, though again we would love to
hear more feedback on whether that's necessary.

We would of course love to please every customer, but then we would have a
platform with 1,000 features; sometimes you have to politely decline a
request because it doesn't fit the general use cases. That doesn't mean we
shouldn't engage in a discussion, though, as that is always helpful.

Even if action isn't immediately taken, these threads often lead us to ask
numerous questions, which can in turn lead to new product and feature ideas
-- ones not necessarily tied to the original question, but which still move
the product forward in other areas.

Thanks!

------
unclesaamm
This is a total non-issue. The OP was wrong in his interpretation of the
recoverable droplet, then pursued the argument with the same high-pitched
rhetoric about what amounts to a UI issue. Sorry, not news.

------
bluesix
Surely this whole mess can be fixed with a five-minute UI change: reword the
copy and add a checkbox to take/don't take a snapshot (defaulting to checked,
i.e. take a snapshot).

------
michaelmior
Kudos to raiyu and nixgeek for one of the most civilized discussions among
disagreeing hackers I've seen in quite some time! :)

------
eof
The notion of having 'secure' data on someone else's hardware is just a bit
silly.

I think the OP here definitely points at something, but primarily that the
`scrubbing` checkbox is essentially a placebo button.

Getting a little meta: a 'no matter what, delete this in a fully, 100%,
absolutely, totally unrecoverable-forever fashion' checkbox is just begging
for a generic law-enforcement ping, which DO would be forced to comply with
covertly.

I appreciate the OP's side, and I definitely see DO's viewpoint of customer
happiness >> accurate UI, but the lesson here is definitely to own your own
necessarily-secure data.

------
unreal37
You have to hand it to DigitalOcean for actually listening to customers,
explaining themselves thoroughly, and taking the issue to the community (on
HN, several times) for discussion. Totally civil all around. Thanks to
nixgeek for raising the potential issue, and thanks to raiyu for engaging in
a meaningful back-and-forth with everyone.

The issue itself? I have accidentally "terminated" a few AWS instances that I
instantly wished I hadn't, and so I can see the benefit of it sticking around
for 24 hours. This would have saved me a few times if I was using DO instead.

------
ivan_gammel
Apparently, this is a UX issue. The checkbox is not the right way to let
customers make this decision.

I can suggest a couple of solutions:

Option 1. Explicitly state during the signup process that, for
recoverability purposes, all customer data is removed N hours after the
request. At this point DO may lose some customers who have the wrong
expectations of the service, but the interface will be simpler.

Option 2. Replace the checkbox with an additional confirmation page that asks
the customer about the data-removal strategy ('trash can' or 'scrub'). There
should be no default selection here. Additional safety measures can be
implemented to avoid accidental selection of 'scrub': confirmation by e-mail,
an SMS security code, or some other 'two-factor' approval.

------
osteele
The behavior sounds considered; the web site doesn't describe it to the user.

How about if the Destroy dialog read something like:

    This is irreversible. We will destroy your droplet and all associated backups immediately. We will keep a snapshot that you can use to recover your droplet; you can disable this below.

    [v] Scrub data - [etc.]

    [v] Temporary snapshot - this will keep a snapshot that you can use to recreate your droplet. This snapshot will be destroyed in 24 hours.

and the Select Image list showed something like:

    Destroyed Droplets

    chef.nl-haa1.infr.as f… — automatically deleted at 2014-04-01 09:25 UTC

------
lazyant
In my experience, the people who want to restore a destroyed instance
outnumber the ones who want it scrubbed right away by about 10 to 1 (if not
more), so basically we (at another ISP) would decommission the instance
(which could not be allocated to another customer) and leave a grace period,
after which it would be effectively scrubbed. If a user wanted an immediate
scrub they could send a ticket and we would do it right away (this was noted
in the "power down" email to the user); we saw very few of those.

------
sigil
Question for fellow paranoid HNers: what do you use to decommission a server?
Do you run shred(1) on all "interesting" files? Do you write over the block
device itself with random data?
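
E.g., a minimal sketch with GNU coreutils (the block-device path below is a
placeholder and deliberately left commented out):

```shell
# Shred an individual file: overwrite its contents, then remove it.
# -z adds a final pass of zeros; -u deallocates and deletes the file.
f=$(mktemp)
echo "secret material" > "$f"
shred -z -u "$f"

# For a whole block device (placeholder name -- triple-check before
# running anything like this for real):
# dd if=/dev/urandom of=/dev/sdX bs=1M conv=fsync
```

Worth noting shred's own caveat: on journaling or copy-on-write filesystems
it cannot guarantee that the old data blocks are actually overwritten in
place.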

~~~
rdl
A quick pass of bcwipe/shred/etc. just to make them less sensitive during
transport, then physical destruction of the drives. I pretty much just store
drives until I have enough of them, and then it's angle-grinder fun -- or, if
someone is paying for it, a commercial drive-destruction service (and if they
let me, I throw personal drives in at the same time, since there's usually no
marginal additional cost).

I haven't owned SSDs long enough to need to destroy any, but some physical
destruction is the only way.

I do use full disk encryption on drives and then repurpose machines by
changing the full disk encryption keys, but those machines haven't left my
control -- it's just for changing e.g. a photo drive to a movie drive.

I'd be a bit conflicted if I were buying FusionIO or high end SSDs, but I
generally buy 1) fast/big consumer SSDs and 2) big spinning drives, and keep
both in service until they're essentially valueless.

IMO, degaussing is probably a good early step in a high-volume environment,
but it's not as good as physical destruction, and it wrecks the drive anyway,
so you still need physical destruction.

I dream of having an office with wet lab, machine shop, private SCIF/VTRs and
a destruction facility with soundproofing.

------
pearjuice
When I used DigitalOcean earlier, I also noticed that .bash_history contained
a wget for a script hosted on the website of a DigitalOcean employee, with
all kinds of cleanup instructions.

------
Demiurge
Anyone else feel like hiring Moisey immediately?

~~~
nixgeek
I suspect he's waiting for the $1B+ exit so he can buy his own island complete
with secret volcano lair, and that any attempts to hire him would be
unsuccessful.

~~~
Demiurge
Well, you can't suspect that when you're just reading that convo without prior
information!

------
bakhy
This headline is misleading click-bait. The customer recovered only his/her
own droplet, not someone else's, and DigitalOcean's explanation is perfectly
reasonable. Perhaps they should improve their UX so as not to surprise users
with this.

All in all, someone is venting frustrations.

------
solomone
This headline is a bit sensational given that the OP was incorrect in their
assumption. It should probably be changed to: "DigitalOcean leaves your
droplet around 24 hours after you destroy it. If you care, destroy your own
data."
------
kakashi19
destroy: put an end to the existence of (something) by damaging or attacking
it: the room had been destroyed by fire.

--- Oxford Dictionary

------
good_guy
Previous discussion,
[https://news.ycombinator.com/item?id=6983097](https://news.ycombinator.com/item?id=6983097)

------
nixgeek
Also putting the word out via Twitter:

[https://twitter.com/nixgeek/status/450438984574193665](https://twitter.com/nixgeek/status/450438984574193665)

I think awareness is key with these types of issues; infrastructure providers
are very opaque beasts, and the underlying platform behaviour varies with
each of them.

Knowing that you may need to erase sensitive data yourself before initiating
the destroy, so that it is not captured in the snapshot, is probably half the
battle.

~~~
derefr
The snapshot is on object storage. Just delete the snapshot after it's made,
and everything will be okay.

For DO staff: also add an option (orthogonal to the secure-erase checkbox) to
not make a snapshot.

~~~
nixgeek
It's actually impossible to delete the snapshot as a single operation; first
you have to 'Restore' it. Oh, the irony.

The entire motivation behind this submission was to be educational and
informative: the fact is you either need to erase sensitive data from within
your droplet before initiating the destroy, or take additional actions
(including jumping through the 'restore before destroy' hoops) to eliminate
the snapshot after the fact.

Note also that DigitalOcean said that in most cases the snapshot is just
deleted via `rm` and is not overwritten with zeroes or random data, so you're
still vulnerable to your instance contents potentially showing up down the
road when a hard drive ends up in a dumpster somewhere.

~~~
derefr
> It's actually impossible to delete the snapshot as a single operation, first
> you have to 'Restore' it. Oh the irony.

Now _that 's_ a bug! DO staff, fix this!

> so you're still vulnerable to your instance contents potentially showing up
> down the road when a hard drive ends up in a dumpster somewhere

From my experience in the computer refurbishing industry, any US corporation
with a legal department has data-disposal-related asset-liquidation
procedures. If the company is sensible, this results in giant magnets or DBAN;
more often, though, it just results in a concrete warehouse floor and a
sledgehammer. Either way, client data isn't getting out of the building. (see
the sibling comment at
[https://news.ycombinator.com/item?id=7499125](https://news.ycombinator.com/item?id=7499125)
for more details.)

It's true that someone who hacked into DO's live snapshot servers could dump
and examine the disks and possibly find your data[1]. But they could, equally
easily, hack into DO's live compute servers and dump your keys from your VM's
memory. Until we've got homomorphic machine-emulation software, instance
memory, not snapshots, is the weakest link in your security.

[1] If the user-provided-snapshot servers are themselves a cluster of DO
droplets--running, say, OpenStack Swift--then those instances would certainly
get secure-erased. This is probably the way I'd set up the system myself,
though I have no idea whether DO does.

