Hacker News
dd – Destroyer of Disks (2008) (noah.org)
82 points by opensourcedude on Dec 1, 2015 | 58 comments

About drive wiping: you're probably better off using the ATA Secure Erase command, which is very quick and covers the entire disc. dd and other tools risk missing blocks that have been remapped as bad, for example.

He's right that a single overwrite with zeros is probably enough to make sure the data is gone, but it's probably not enough to persuade other people that it's gone. A few passes of pseudorandom data is probably better if you need to persuade other people that the data has gone.

But if it's really important, drive wiping is what you do to protect the drives until you get a chance to grind them.

There is also a cryptographic erase option on secure erase if the drive supports it. It is nearly instantaneous and you can follow up with other slower methods if desired.

Also, for SSDs, using the secure erase method is important because of overprovisioning and garbage collection. If that is not available, then on most SSD controllers two full-pass writes (with random data, in case the drive compresses on the fly) will get you as close to wiping out all contents as possible.
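For reference, here's a hedged sketch of the ATA Secure Erase sequence via hdparm on Linux. The device name is a placeholder, and DRY_RUN=1 only prints what would be run; drop it (on a drive whose security state reports "not frozen") at your own risk.

```shell
# Sketch of the ATA Secure Erase sequence via hdparm (GNU/Linux).
# DEV is a placeholder; with DRY_RUN=1 the commands are only echoed.
DEV=/dev/sdX
DRY_RUN=1
run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "would run: $*"; else "$@"; fi; }

run hdparm -I "$DEV"                                     # check the Security section says "not frozen"
run hdparm --user-master u --security-set-pass p "$DEV"  # set a temporary password
run hdparm --user-master u --security-erase p "$DEV"     # issue the erase
```

For drives that support it, `--security-erase-enhanced` requests the enhanced variant instead.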

The OPAL standard has that cryptographic erase function. I've used it before, but did not deeply verify if any data was recoverable. At least the partition table was gone. In theory, the command destroys the old key and creates a new one that the drive uses to read and write data. A different key means everything is noise. You do need access to the printed label on the drive itself to do it (for the PSID).


This is what I am doing currently

    # openssl enc -aes-256-ctr -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt </dev/zero | pv -bartpes <DISK_SIZE> | dd bs=64K of=/dev/sd"X"

To randomize the drive/partition using a randomly-seeded AES cipher from OpenSSL (displaying the optional progress meter with pv): https://wiki.archlinux.org/index.php/Securely_wipe_disk/Tips...

Then I take out the drill press and make a bunch of holes.

>but it's probably not enough to persuade other people that it's gone.

I believe there is a long-standing bounty for anyone who can retrieve useful data from a drive that has been zeroed once. No one has managed it thus far.

A lot of the disk-wiping "culture" stems from a much earlier time when disk technology was less reliable, especially with regard to writes. Peter Gutmann himself says that the Gutmann method is long antiquated and only applied to MFM/RLL-encoded disks from the 80s and early 90s.

Perhaps instead of humoring these people, we should be educating them. A zero'd out disk is a wiped disk until someone proves otherwise.

This reminds me of assertions we used to take for granted about DRAM. We used to assume that the contents are lost when you cut the power, but then someone turned a can of cold air on a DIMM. We usually assume that bits are completely independent of each other, but then someone discovered the row hammer. The latter is especially interesting because it only works on newer DIMM technology. Technology details change, and it's hard to predict what the ramifications will be. A little extra caution isn't necessarily a bad thing.

I agree, but redoing a wipe isn't extra caution; it's just literally repeating the same thing. If that thing is wrong, you're not helping the situation, just wasting time and resources.

Extra caution would be shredding the drive or some other non-wipe method. At work for example, we zero out drives and then those drives get physically destroyed by a vendor.

Gutmann's paper dates from a time when you didn't know which drive encoding was in use, so he created a set of patterns targeting each encoding scheme. That comes to (about) 35 different patterns.

This gets misunderstood as "you need to do 35 passes of these patterns". You don't: Gutmann recommends a couple of overwrites of random data, and a single overwrite of zeros is probably enough.

You make a good point, but see countless Dilbert cartoons about the futility of trying to persuade your boss that what he wants is stupid.

This may be my paranoia talking, but is there a way short of an examination with a scanning electron microscope to ensure that the erase command is actually doing what it's supposed to do?

Not so much that the drive manufacturers are engaging in malfeasance (though that's certainly not off the table), but it's not unheard of for certain agencies in certain governments to subvert low-level system components (intercepting them in shipping and so forth) so they work against the user.

...or just plain ignorance. A study indicated that back in 2011, half of the major drive vendors weren't implementing the erase correctly. https://www.usenix.org/legacy/events/fast11/tech/full_papers...

You could just read the disk again and check for nonzero blocks. If you don't trust the disk to read or write to itself either, then you might as well just toss it.
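A minimal sketch of that check, run against a scratch file here rather than a real device (the file path is made up); `cmp -n` bounds the comparison, since /dev/zero never ends:

```shell
# Make a 4 MiB scratch "disk" and zero it (stand-in for /dev/sdX).
dd if=/dev/zero of=/tmp/wiped.img bs=1M count=4 2>/dev/null

# Compare every byte against /dev/zero; -n limits cmp to the image size.
size=$(stat -c %s /tmp/wiped.img)
if cmp -n "$size" /tmp/wiped.img /dev/zero; then
    echo "all blocks zero"
else
    echo "nonzero blocks found"
fi
```

On a real device you'd point the comparison at /dev/sdX and use the size reported by blockdev --getsize64.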

It wouldn't be that technically complex for modified firmware to watch for the secure erase command (not something that's ever invoked except deliberately by the user) and then return random data for all subsequent reads, without ever actually overwriting the underlying data.

But yeah, you're right. If you ever have reason to invoke the secure erase command, you're probably in a position where the next step is throwing the drive in a shredder.

>A few passes of pseudo random data is probably better if you need to persuade other people that the data has gone.

You only need one.


It's not about what is needed. It's about what the person ultimately responsible (the client, your boss, etc) perceives is necessary.

Then don't tell them how many passes you did :-)

> Unfortunately, there is no command-line option to have `dd` print progress

How difficult could it be to write a dd replacement from scratch that does include progress reporting? I mean, dd simply reads blocks of data from one file descriptor and writes them to another.
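Not very, as a sketch shows. This toy loop (temp file paths made up; no handling of partial reads or odd trailing sizes) copies block by block and prints a running byte count:

```shell
# Toy dd-with-progress: copy $src to $dst one block at a time.
src=/tmp/src.bin
dst=/tmp/dst.bin
bs=1048576
dd if=/dev/urandom of="$src" bs=$bs count=8 2>/dev/null   # something to copy

total=$(stat -c %s "$src")
blocks=$((total / bs))
i=0
while [ "$i" -lt "$blocks" ]; do
    # Copy one block at offset $i, leaving the rest of $dst untouched.
    dd if="$src" of="$dst" bs=$bs count=1 skip=$i seek=$i conv=notrunc 2>/dev/null
    i=$((i + 1))
    printf '\r%d/%d bytes copied' $((i * bs)) "$total"
done
echo
cmp -s "$src" "$dst" && echo "copy verified"
```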

`pv` (pipeviewer) is usually very useful for tracking progress.

dd if=/dev/zero count=10 bs=1M | pv > file.bin

The other way to see progress on `dd` is to send SIGUSR1 to the dd process (note: signal 3 is SIGQUIT, which would kill it): kill -USR1 <dd pid>

> The other way to see progress on `dd` is to send SIGUSR1 to the dd process: kill -USR1 <dd pid>

Be careful with this on some distributions and compilations of DD. Purely anecdotal evidence, but in college I had a friend imaging a very large (5400RPM) drive and about 10 hours into the process he lamented that he wished he could see how far along it was.

I popped open a terminal, ps -A |grep dd, kill -USR1 $PID, and it just exited.

He was rather pissed that I lost him 10 hours.

The aptly named "Progress" also works on dd.


Actually, you can send signal USR1 to a GNU `dd` process to see the progress:

    $ pgrep dd
    22230
    $ kill -USR1 22230
and `dd` prints its progress.

I guess you could pkill directly.

Anyway, this "trick" is mentioned on the man page, which isn't that long. No additional tool required.

Huh, on my laptop (running OSX) it's SIGINFO. Read the man pages!

If you're on OS X, it's SIGINFO instead. SIGUSR1 will kill it D:

(This has caught me out before. Oh how I wish these things were standardised...)

It seems to be a difference between GNU and BSD variants of `dd`.

Yeah, OS X doesn't use the GNU one. It's irritating that there's a difference.

The dd on my machine (from GNU coreutils 8.24) does just that with the 'status=progress' option.

    dd if=/dev/zero of=/dev/null status=progress
    4814691328 bytes (4,8 GB) copied, 4,000000 s, 1,2 GB/s

It's only the size copied and the speed but it's usually enough.

I tend to run `watch kill -USR1 $(pidof dd)` in a second terminal. watch executes your command repeatedly (by default every two seconds), so you then get regular dd status updates.

ddrescue has built-in progress reporting, plus you don't have to bother with if= and of=


Does the third command really work as intended?

    sudo cat ubuntu-14.04-desktop-amd64.dmg >> /dev/sda1
I believe this will attempt to append data after the end of the block device, which almost by definition will fail.

However, I often do the following, which works pretty well:

    sudo cat ubuntu-14.04-desktop-amd64.dmg > /dev/sda1

'>>' will cause O_APPEND to be specified in the flags when opening "/dev/sda". I'm pretty sure this flag is ignored on block devices, as it's obviously useless there.

    fd = open("/dev/sda", O_WRONLY|O_CREAT|O_NOCTTY|O_APPEND, 0644);
    pos = lseek(fd, 0, SEEK_CUR);
    -> pos = 0

    # tmp  sudo ./open_sda
    Current position after open: 0
    Current position after seek to end: 128035676160

The redirection will happen in your shell, but the command will be called inside the shell invoked by sudo. So that won't work either unless you have write privs to the block device, but if that were the case, you probably didn't need sudo to read the dmg file. So this will likely fail for a reason other than the one you pointed out.

Correct. In these kinds of situations I usually do

  sudo sh -c 'blah blah blah >> file'
but I don't like it when I have to do that.

A better way to write files as root is to pipe the output into `tee`. eg:

   blah blah blah | sudo tee -a file
This will do the appending as per your example. If you want to write to the file like people are doing with the Ubuntu image then just drop the append (-a) flag:

   cat ubuntu-14.04-desktop-amd64.dmg | sudo tee /dev/sda1
Though obviously the `dd` utility is still a better way of writing disk images than any of the above.


    cat ubuntu-14.04-desktop-amd64.dmg | sudo tee /dev/sda1

You'd better redirect tee's output to /dev/null, otherwise your terminal is going to print out that whole file.
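For example (demonstrated on a scratch file rather than a device, so sudo is dropped; the path is made up):

```shell
# tee still writes the data; > /dev/null just keeps it off the terminal.
printf 'disk image bytes' | tee /tmp/target.img > /dev/null
cat /tmp/target.img
```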


    sudo tee /dev/sda < ubuntu-14.04-desktop-amd64.dmg

I know you meant to use the pipe instead of redirection, but it might be worth updating your comment for the benefit of others who are less command line literate :)

Whoops, fixed.

If you used cat, could something potentially escape and run in your userspace?

It's actually 'copy and convert' but 'cc' was taken.

You can 'dd' from Unix to the cloud ... well, some clouds ...

  pg_dump -U postgres db | ssh user@rsync.net "dd of=db_dump"

  mysqldump -u mysql db | ssh user@rsync.net "dd of=db_dump"
... although these days, now that we support attic and borg[1], nobody does things like this anymore.

[1] http://www.rsync.net/products/attic.html

That has only one minor advantage compared to:

  mysqldump -u mysql db | ssh user@rsync.net "cat > db_dump"
Namely, the syntax is one character shorter. (But only because I used whitespace around >).

With dd, you can control the transfer units (the size of the read and write system calls which are performed) whereas cat chooses its own buffering. However, this doesn't matter on regular files and block devices. The transfer sizes only matter on raw devices where the block size must be observed. E.g. traditional tape devices on Unix where if you do a short read, or oversized write, you get truncation.

> If you want to erase a drive fast then use the following command (where sdXXX is the device to erase):

    dd if=/dev/zero of=/dev/sdXXX bs=1048576
Question: is there a disadvantage to using a higher blocksize? Is the read/write speed of the device the only real limit?

> is there a disadvantage to using a higher blocksize?

Maybe, depending on the details. Imagine reading 4 GB from one disk then writing it all to another, all at 1 MB/sec. If your block size is 4 GB, it'll take 4000 seconds to read, then another 4000 seconds to write... and will also use 4 GB of memory.

If your block size is 1 MB instead, then the system has the opportunity to run things in parallel, so it'll take 4001 seconds, because every read beyond the first happens at the same time as a write.

And if your block size is 1 byte, then in theory the transfer would take almost exactly 4000 seconds... except that now the system is running in circles ferrying a single byte at a time, so your throughput drops to something much less than 1 MB/sec.

In practice, a 1 MB block size works fine on modern systems, and there's not much to be gained by fine-tuning.
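A rough way to feel this out yourself (scratch file path made up; absolute timings are machine-dependent):

```shell
# Same 16 MiB written with large vs small blocks; `time` exposes the
# per-syscall overhead of tiny block sizes.
time dd if=/dev/zero of=/tmp/bs-test bs=1M count=16 2>/dev/null
time dd if=/dev/zero of=/tmp/bs-test bs=512 count=32768 2>/dev/null
stat -c %s /tmp/bs-test
```

Both commands write the same 16777216 bytes; shrink bs further and the gap widens dramatically.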

It is worth noting that the shred program mentioned is more or less useless on modern filesystems for a variety of reasons; the man page has a list of filesystems on which it will fail to work correctly (btrfs, ext3, NFS).

It may well be that the only usable filesystem for it is FAT32 (and possibly NTFS, though I'm not sure about that).

This usually messes stuff up pretty good:

  perl -e '$file = "/dev/sda";
  $s = -s $file;
  $i = $s / 2;
  while (--$i > 0) {
      $r = int rand($s / 512);  # random block offset (dd default bs is 512)
      system("dd if=/dev/urandom of=$file seek=$r count=1");  # seek, not skip: write at a random offset
  }'

The "Unable to install GRUB" recommended fix of removing the GPT is wrong. The proper fix is to create a 1 MiB partition of the BIOS boot partition type (gdisk internal code 0xEF02); grub-install will then find it and install core.img there automatically.

I prefer `pv` (pipe viewer) for watching dd's progress


CTRL+T on BSD platforms works brilliantly. I will never understand why Linux refuses to adopt CTRL+T (SIGINFO)

The point of using random instead of zero is that it's harder to see which parts have been overwritten and which parts haven't been.

Also, some (flash) drives compress data on the fly. Writing zeroes to such a drive might not actually write anything to the flash cells except some run-length descriptors (numbers).
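The effect is easy to illustrate with a general-purpose compressor standing in for the drive's firmware (paths made up):

```shell
# A zero-filled block compresses to almost nothing; random data doesn't.
dd if=/dev/zero of=/tmp/zeros.bin bs=1M count=4 2>/dev/null
dd if=/dev/urandom of=/tmp/rand.bin bs=1M count=4 2>/dev/null
gzip -kf /tmp/zeros.bin /tmp/rand.bin
stat -c %s /tmp/zeros.bin.gz /tmp/rand.bin.gz
```

The zero file shrinks to a few KB while the random one stays at roughly its original 4 MiB, which is why a compressing drive may barely touch the flash when fed zeroes.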

Since the worry seems to be speed, here's a tool I wrote to generate random bytes about as fast as drives can accept them (well, at least spinning disk drives; some flash drives might be faster):


I'm expecting you to figure out where this tool goes wrong and produces predictable bytes yourself. Also, please tell me if you do :)

Anyway, even if it isn't secure, it will be enough to foil compression algorithms.

On the other hand, that makes it indistinguishable from encrypted data. Which is worse?

"Give us the encryption keys" "It's not encrypted, I used /dev/urandom" vs "Yes, I wiped it"

You're supposed to overwrite the entire disc.

And random data is very easy to see.

The people who propose using multiple overwrites are clear that they're talking about making it harder to recover bits. (They're wrong, but it's what they say.)

When is that useful?

In the case of a partial erasure (eg, maybe someone disconnected the power during the write, to stop the write from completing), I guess it would make it harder to prove someone had tried to erase (and therefore possibly hide/destroy) information.

I don't believe this is the actual name. Are you serious?
