
A very important lesson regarding the Unix principles of "everything is a file" and programs that "do one job well".



It's kind of flawed where he says " If you pick x to be big enough, you should get the entire file, plus a bit of junk around the edges." - because this is simply not how filesystems work - if you manage to get the full file, it's by luck that it was small enough for the filesystem to allocate its contents contiguously on the block device.

Anything bigger and you're gonna quickly need something more sophisticated which can understand the filesystem it's dealing with, as it will need to collect the many pieces of your deleted file scattered across the block device and merge them. I'm sure that would be mountains of fun to do with bash.

And in this case, the "do one job well" program that you're gonna need is a program which specifically recovers deleted files from a specific filesystem.
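For ext2 at least, one example of such a filesystem-specific tool is debugfs (a sketch only; the device and inode number are placeholders):

    # list recently deleted inodes on an ext2 filesystem
    sudo debugfs -R 'lsdel' /dev/sda1
    # dump a deleted inode's contents to a file on a DIFFERENT filesystem
    sudo debugfs -R 'dump <12345> /mnt/other/recovered.txt' /dev/sda1

(This works far better on ext2 than on ext3/ext4, which zero out the block pointers when a file is deleted.)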


I'm the author of the post -- that's good to know, thanks! :) I know embarrassingly little about filesystems. I'm glad you pointed this out.

EDIT: though, I'd point out that if you really wanted to recover the file you should probably try to use /proc or something (at the time I didn't know about this). This approach requires crawling the disk which is obv pretty slow. :) It's less of a "here's a useful thing" and more of an excited "HEY DID YOU KNOW THAT YOU CAN DO X".

EDIT 2: I updated the blog to link to your comment, because it's baller.


If processes which hold your file open are still running, then you can access the file via the /proc/<pid>/fd/ entries. Run an 'ls -l' in the proc directory of the process to see those.

You can simply copy the (proc) file to a new location to recover it.
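For example (the PID and descriptor number here are made up):

    # list open descriptors that still point at deleted files
    ls -l /proc/*/fd 2>/dev/null | grep '(deleted)'

    # say PID 1234 still has the file open on descriptor 5
    cp /proc/1234/fd/5 ~/recovered-file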

Remember: open files keep the contents present on disk until the controlling process exits or closes the filehandle.

Since you're actually accessing the file on disk, your issues of storage contiguity don't come into play -- it's all read back to you in proper file order.

But yes, files (and virtual files) on Linux are pretty slick.

I also remember being really excited learning about disk image files and the ways in which they can be manipulated, including creating filesystems on virtual disks, partitioning them, and then mounting those partitions, etc. First tried with bootable floppy distros, but I've played around with them in a bunch of contexts since.
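For anyone who hasn't played with this, the basic loop-device dance goes something like (image name and sizes are arbitrary):

    # create a 64 MB image, put a filesystem on it, and mount it
    dd if=/dev/zero of=disk.img bs=1M count=64
    mkfs.ext2 disk.img                # mkfs warns that the target isn't a block device
    sudo mount -o loop disk.img /mnt

    # for partitioned images, attach the image to a loop device first
    sudo losetup -fP disk.img         # -P exposes partitions as /dev/loopNpM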


Last time I ran fsck on my ext2 partition the fragmentation ratio was pretty low, and I tend to fill up my disks. Fortunately, homework assignments tend to be shorter, and more likely to fit in a contiguous spot. Anyway, what else can you do?

From a different perspective, hopefully /tmp is on a different filesystem from /home, otherwise reading the man pages might overwrite the blocks you need to recover with the temporary files they produce (and so might less, more, sort, etc.). Also, doing Google/StackOverflow searches is probably unwise, since the browser will write to its 50 MB disk cache (the Firefox default, anyway) on the very filesystem you want to recover from. Probably step 1 should be "remount the partition read-only". Or better yet, "find another computer to use for research" :)
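i.e. something like:

    # stop everything else from scribbling over the blocks you want back
    sudo mount -o remount,ro /home
    # (this will fail if anything still has files open for writing there)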


Also a very important lesson regarding the dangers of poor user interface design. It's a little bit crazy that in 2014 we still have a significant amount of serious work being done on systems where a slip of the finger or a one-character typo in a script can literally destroy whole systems with no confirmation and no reliable recovery mechanism.


This is why every system I administer has 'rm' aliased to 'rm -i' (along with 'cp' and 'mv' just in case). I believe this is the default on RHEL/CentOS boxes. Certainly for root, but should be for every user. Sure, it can be a pain sometimes to have to confirm, but at least you get the chance....unless you add '-f'.
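For anyone who wants the same, it's just a few lines in ~/.bashrc (or a file under /etc/profile.d/ to make it system-wide):

    alias rm='rm -i'
    alias cp='cp -i'
    alias mv='mv -i'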


This is why every system I administer has 'rm' aliased to 'rm -i' (along with 'cp' and 'mv' just in case).

Glad I'm not the only one. :-)

However, that is rather a specific case, albeit a common one. I have lost count of how many times I've seen even very experienced sysadmins do something disastrous by accident that is entirely due to the poor usability of some Linux shell or other command line-driven software with a similar design style and culture.

I have seen someone nuke an entire system, with a shell script that failed at string interpolation and literally did an 'rm -rf /', after I explicitly warned them of the danger and they thought they'd guarded against it. That person was a very capable sysadmin with many years of experience, but expecting anyone to never make a mistake with that kind of system is like expecting a similarly experienced programmer to write bug-free code with nothing but an 80x25 terminal window and a line editor.
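The classic shape of that failure, reconstructed here with a made-up variable name:

    # if BUILD_DIR is unset or empty, this expands to 'rm -rf /'
    # (and the sneakier 'rm -rf "$BUILD_DIR/"*' becomes 'rm -rf /*')
    rm -rf "$BUILD_DIR/"

    # two cheap guards:
    set -u                                  # error out on any unset variable
    rm -rf "${BUILD_DIR:?is not set}/"      # abort loudly if unset or empty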


Nothing makes you appreciate "don't miss" like deleting /etc on a live system. For a good few weeks after that I nearly introduced a peer review process to my own shell.

That being said, there's certainly something to that one event doing more to reform my being fast and loose with destructive commands than years of being told/telling myself to do so. (Something likely being that I'm apparently a slow learner.)


Looking at the man page now, there is now a -I option, which prompts just once before removing more than three files (or before removing recursively), instead of once per file.
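For example:

    # one prompt up front instead of one per file
    rm -I *.o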


We are moving in the right direction with copy-on-write snapshots. What would be neat is an 'immutable' filesystem, where nothing is erased (up to garbage collection). This is likely too extreme to be practical, as we don't want to copy an entire block to change one bit, or read through a journal for every fs action. Even in theory, we don't want to spend the disk space to record the precise state at every point in time.

Now that I think about it, it shouldn't be too hard to turn this into a workable product for general use. Automatically take a snapshot every 5 minutes, and present the user with a program that browses the filesystem at time X, probably with integration into the file manager to restore files/folders. Practically speaking, it needs some form of pruning, probably along the lines of: keep every 5-minute snapshot from the past hour, etc. My only concern with this is how well optimized Btrfs is for frequent snapshots. Either way, I know what I am doing this weekend.
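A bare-bones version of that with Btrfs and cron could be as simple as the following (the subvolume and snapshot paths are made up; pruning would be a second job using 'btrfs subvolume delete'):

    # crontab entry: read-only snapshot of /home every 5 minutes
    # (% characters must be escaped inside a crontab)
    */5 * * * * btrfs subvolume snapshot -r /home /home/.snapshots/$(date +\%F-\%H\%M)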


There are certainly some very grand schemes we could adopt to improve this specific problem, but let's not overlook the simple things. Every popular OS GUI has included some sort of "recycle bin" concept for a long time. There is no reason at all that a text shell shouldn't provide the same safety net for its delete command.
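Even a tiny shell function gets you most of the way there (directory name arbitrary, name collisions ignored):

    # "soft delete": move things into a trash directory instead of unlinking them
    trash() {
        mkdir -p ~/.trash && mv -- "$@" ~/.trash/
    }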


> Automatically take a snapshot every 5 minutes, and present the user with a program that browses the filesystem at time X, probably with integration into the file manager to restore files/folders.

Something like this? http://java.dzone.com/news/killer-feature-opensolaris-200


That sounds like it is doing something fancier than periodic snapshotting, such as using an immutable filesystem, where every fs operation is inherently lossless (up to garbage collection).

Of course, I might be reading too much into the continuous nature of a slider. Does anyone have experience with that feature?


(Sorry for the late reply)

In OpenSolaris it just used cronjobs to create zfs snapshots.
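So roughly the equivalent of (pool/dataset names are just examples):

    # take a named snapshot (a cron job would run this on a schedule)
    zfs snapshot rpool/export/home@$(date +%F-%H%M)
    # list snapshots, and copy old file versions straight out of the
    # hidden .zfs/snapshot/ directory
    zfs list -t snapshot
    cp /export/home/.zfs/snapshot/2014-01-01-1200/thesis.tex ~/thesis.tex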


go find solace with plan9





