

Recovering deleted files using grep - atomicobject
http://spin.atomicobject.com/2010/08/18/undelete

======
sp332
Make sure your output file is on a different filesystem! Otherwise, it might
be saved in the newly-freed blocks of the file you're trying to recover.
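The article's approach reduces to a single grep over the raw block device. A sketch of that kind of invocation (device path, search phrase, and output location are placeholders — adjust for your system), followed by the same flags demonstrated safely on an ordinary file:

```shell
# Grep the raw device for a phrase you remember from the deleted file.
# -a treats the binary stream as text; -B/-A pull in surrounding context
# lines. The redirect goes to a DIFFERENT filesystem (/mnt/usb here),
# so the output can't land in the freed blocks you're trying to recover.
#
#   grep -a -B 25 -A 100 'some unique phrase' /dev/sda1 > /mnt/usb/out.txt

# The same flags demonstrated on a throwaway file:
demo=$(mktemp)
printf 'before\nTARGET LINE\nafter\n' > "$demo"
grep -a -B 1 -A 1 'TARGET' "$demo"
```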

~~~
Wilfred
The author's intent is to write enough of the surrounding context that you
recover it the first time.

Still, it raises two questions:

What about fragmentation?

Why don't we have a GNU safe-rm yet that moves files to the
(freedesktop.org-specified) trash location to avoid this?

~~~
someone_here
Because you told it to remove the file instead of moving the file? I don't see
why we need a safe-rm on the command line.

File managers already implement a trash function as per the freedesktop.org
spec: <http://www.ramendik.ru/docs/trashspec.html>

~~~
thaumaturgy
> I don't see why we need a safe-rm on the command line.

I think this is hilarious. :-) Throughout Unix/Linux/BSD history, there is a
steady series of essays, lamentations, wails, and gnashings-of-teeth regarding
the recovery, attempted recovery, or irretrievable loss of really important
data that got somehow mistakenly rm'd by some admin.

...and, _every single time_ , someone says, "Shouldn't this be made safer?",
and _every single time_ someone else says, "Nope, rm is doing exactly what
it's supposed to! Just be more careful!"

As if the huge volume of arcane commands and various scripting languages
disguised as configuration files weren't proof enough that the mass of
Unix/Linux/BSD admins and developers all share a common streak of masochism,
we also seem hell-bent on ensuring that we have tools which can -- and
eventually will -- bite us in the ass.

For my part, I think that having some form of undelete option standard in
every file system is as obvious as keeping backups.

~~~
BrandonM
The problem is not Unix as much as it is the work habits that Unix users have
developed. The rm command is hard core, and yet everyone (including me) uses
it regularly. It would be much smarter to create a command named "trash" or
"del" or whatever to instead move files to a trash folder. Then "empty-trash"
could actually use rm.
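A minimal sketch of that idea in plain shell. The names `trash` and `empty_trash` are made up for this sketch, and it deliberately skips the freedesktop.org spec's `.trashinfo` metadata and cross-device handling — it only shows the shape of the workflow:

```shell
# Hypothetical "trash" command: move targets into a per-user trash
# directory instead of unlinking them.
TRASH_DIR="${TRASH_DIR:-$HOME/.local/share/Trash/files}"

trash() {
    mkdir -p "$TRASH_DIR" || return 1
    for f in "$@"; do
        # Suffix a timestamp so trashing two files named "notes.txt"
        # doesn't clobber the first one (date +%s%N is GNU-specific).
        mv -- "$f" "$TRASH_DIR/$(basename -- "$f").$(date +%s%N)" || return 1
    done
}

# "empty-trash" is the only place rm actually runs.
empty_trash() {
    rm -rf -- "${TRASH_DIR:?}"
}
```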

Alternatively, just slow down a little bit before using rm, especially when
operating as root. Understand that it's (intended to be) permanent. Use echo
first when using rm with a splat in order to ensure you're actually deleting
what you expect to delete.
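The echo trick in concrete form (throwaway directory, made-up filenames):

```shell
demo=$(mktemp -d)
cd "$demo"
touch keep.txt old1.log old2.log

# echo expands the glob exactly as rm would see it, so you can review
# the doomed file list before committing to it:
echo rm *.log    # prints: rm old1.log old2.log

# Looks right -- now run it for real:
rm *.log
```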

The question, "Shouldn't this be made safer?" is irrelevant. At some level,
you have to have an rm command. If users decide to use it regularly, then it's
up to them to "Just be more careful!" The smarter thing would be to create a
workflow that doesn't rely on using rm at all. Why whine and complain (not
you, I mean users in general) about an operation that can be easily changed?

------
fragmede
If the file was on an ext3 filesystem, you can use ext3grep, written by Carlo
Wood (<http://www.xs4all.nl/~carlo17/howto/undelete_ext3.html>)

(Grepping your hard drive for file fragments is suggested in the ext3 FAQ -
<http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html>)

------
cryptoz
> To help prevent this problem from happening in the first place, many people
> elect to alias the rm command to a script which will move files to a
> temporary location, like a trash bin, instead of actually deleting them.

Whatever happened to backups?

To help prevent this problem...

KEEP A BACKUP.

~~~
derefr
The kinds of files I most often regret rm-ing are the temporary files I have
created myself as a step in a process, then deleted after I had moved onto the
next step, not realizing an error had crept into the processor and that I
would have to run it again on the source files (which are now, conveniently,
gone.) Backups don't solve this problem, because the files themselves are
never more than an hour old. A "trash" folder, however, fixes this perfectly:
the semantic is that the file no longer has any place it "belongs," and may be
purged if you successfully complete the project, but may be needed again if
the project must be "rewound" to that step.

However, you're right that making rm(1) express move semantics isn't the right
solution. Maybe if the filesystem had a "BEGIN TRANSACTION" command that you
could ROLLBACK...

~~~
BrandonM
Storage is cheap.... Why remove intermediate files at all? If you don't want
to permanently remove files, then why use rm? Just create a "del" command that
moves deleted files to a trash folder. You can then make part of your backup
routine be to empty the trash after performing the backup (since that file
would now be available in the backups).

~~~
derefr
You're right—it's more of a "these files are in the way, and I'm _sure_ I'm
done with them... so it shouldn't hurt to just type those two little letters
and reclaim the storage..."

Actually, that sounds like exactly the cognitive dissonance people had when
they first started using Gmail. Perhaps filesystems need an "Archive" folder
as well? Not even a Trash folder—because people want to _empty_ a Trash
folder—but rather just an enforced (and shell-supported) directory where
things go when you don't have any reason to keep them, and therefore have no
place to put them?

------
csummers
Been there, done that. I rm -rf'd a bunch of important files once, and at the
time grep was giving me "memory exhausted" errors. I was able to use strings
to grab all of the text of the disk, and then wade through the results with
vim.

I guess this is a pretty common problem. The blog post I wrote about it in
2005 continues to be the most searched-for entry point on my site:
<http://csummers.com/2005/12/20/undelete-text-files-on-linux-ext3-partition/>

------
tbrownaw

        cat /dev/mem | strings | grep -i llama

~~~
ramidarigaz
Hmm... I'm getting an error on that one.

    
    
        cat: /dev/mem: Operation not permitted
    

Edit: even as root

~~~
someone_here
If I recall correctly, that's a bug that was present in a particular kernel
from 6-9 months ago.

~~~
miles
It's not a bug:

x86: introduce /dev/mem restrictions with a config option
<http://lwn.net/Articles/267427/> "This patch introduces a restriction on
/dev/mem: Only non-memory can be read or written unless the newly introduced
config option is set."

Command-line access to /dev/mem in Ubuntu
<http://superuser.com/questions/39583/command-line-access-to-dev-mem-in-ubuntu>

~~~
someone_here
Oh cool. Thanks.

------
auxbuss
Where the author says conservative, he means liberal.

(From afar, I understand my Colonial cousins' struggle with these two words.)

~~~
thaumaturgy
Well, I appreciated your joke, anyway.

------
omrisiri
I've been using this method since I first learned about raw disk access (dev
files) and grep.

I think it should be mentioned that this will work properly only if the file
was not fragmented, which will usually be the case on ext3 unless you are
using almost all of the space on the drive, but fragmentation may happen
frequently if you are using a FAT filesystem (which is common on USB disks).

Also, if you just deleted a binary file, this method will be problematic as
well; in that case you can use a tool like photorec to scan the disk. You can
even limit it to only the free space on the drive, which reduces the time it
takes to go over a disk, and it can detect all kinds of binary file types (it
uses the file's magic number to detect the type).

Like other people mentioned here before, you should recover all the data to a
different partition/disk than the one you are trying to recover a file from.

With that said, recovering data is a tedious and error-prone process, so if
the data is worth enough (and for some silly reason you don't have a backup)
you should:

A. Turn off the computer immediately after you've discovered the loss of data
(to reduce the chances of overwriting anything important)

B. Give the computer/disk to a professional to recover (because you obviously
aren't one, since you don't keep backups)

~~~
moell
Fortunately point A on Linux can be substituted with mount -o remount,ro /

------
naturalized
Or, if you want to really delete a file, use the `shred filename` command.
From `man shred`:

    NAME
           shred - overwrite a file to hide its contents, and optionally delete it

I especially like the -n option!
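For reference, a throwaway demo of those flags (GNU coreutils shred; `-n` sets the number of overwrite passes, `-u` removes the file afterwards — filename is a temp-file placeholder):

```shell
f=$(mktemp)
printf 'sensitive data\n' > "$f"

# Overwrite the file's blocks three times with random data, then
# truncate and unlink it.
shred -n 3 -u "$f"

ls "$f" 2>/dev/null || echo "gone"
```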

~~~
moobot
Except that shred is not guaranteed to work on many (most?) modern
filesystems. From `man shred`:

    
    
           CAUTION: Note that shred relies on a very  important  assumption:  that
           the  file system overwrites data in place.  This is the traditional way
           to do things, but many modern file system designs do not  satisfy  this
           assumption.   The following are examples of file systems on which shred
           is not effective, or is not guaranteed to be effective in all file sys‐
           tem modes:
    
           * log-structured or journaled file systems, such as those supplied with
           AIX and Solaris (and JFS, ReiserFS, XFS, Ext3, etc.)
    
           * file systems that write redundant data and  carry  on  even  if  some
           writes fail, such as RAID-based file systems
    
           *  file  systems  that  make snapshots, such as Network Appliance's NFS
           server
    
           * file systems that cache in temporary locations, such as NFS version 3
           clients

~~~
16s
It works fine on default ext3. The only thing journaled is metadata. You
snipped that part out. More from man shred:

In the case of ext3 file systems, the above disclaimer applies (and shred is
thus of limited effectiveness) only in data=journal mode, which journals file
data in addition to just metadata.

_In both the data=ordered (default) and data=writeback modes, shred works as
usual._

------
albertzeyer
Via `reiserfsck --rebuild-tree`, you can also do that for ReiserFS partitions.
It has worked _very_ reliably for me. The only problem is that it doesn't
always recover the filename and/or the directory structure (depending on how
long ago you deleted it).

~~~
kentnl
Just don't do this if you've at some stage backed up another reiserfs
filesystem inside your reiserfs filesystem with 'dd'.

The rebuild-tree trick mistakenly sees entries in the dd'd copy as being
files in the parent filesystem, and then sprays them all over your drive.

~~~
jrockway
Better than burying them off in the woods somewhere.

------
datums
One of my most memorable cluster fucks was recovering a database using
strings on the disk. The customer ran repair table and ended up with a very
small table :) . It was tedious but felt awesome actually getting a large part
of the data back.

------
pixelbeat
I've also used this technique. I even wrote a script with progress bar to do
it, which is linked to at the end of:

<http://www.pixelbeat.org/docs/disk_grep.html>

------
kajecounterhack
I used this method once...the file created gets pretty huge but you can even
_manually_ sift through it for lost code if you know roughly where it ended
up!

------
retroafroman
Excellent Linux hack. I hadn't ever heard this before.

~~~
albertzeyer
It works on all systems where you have raw access to the disk. And it isn't
really that fancy if you think about how it works and how file systems work.

~~~
pcora
Yup, it's not very fancy, but in the end it serves its purpose and can really
save your work.

The last part, about using an alias for rm, is something that I've never
thought about, and now I'm going to use it always on my servers.

------
koevet
Actually, I think that the really great hack here is aliasing the rm command
to a trashbin script (as suggested at the end of the article)

~~~
telemachos
The danger of aliasing the command itself (the bare 'rm') is that you come to
count on the safety of the alias. Then you work one day on a friend's or
coworker's machine and...BOOM.

What I do instead is make a nearby (and simple) alias. For example:

    
    
        rmi='rm -i'

------
mkramlich
frequent automatic backups and version control are your friend

------
freerobby
Clever stuff, thanks for sharing.

