Hacker News
Undeleting a file overwritten with mv (pretix.eu)
268 points by todsacerdoti on Nov 29, 2020 | 85 comments

> alias mv='mv -i'

While --interactive mode is certainly useful, a better option for this problem that I don't see mentioned often is the --no-clobber option:

    -n, --no-clobber
              do not overwrite an existing file
I often add --no-clobber to scripts to make automated file movement safer. An alternate "safe mv" alias can be useful, e.g.:

    alias smv='mv --no-clobber'
(I don't recommend changing the behavior of normal 'mv' directly; getting used to assuming 'mv' means the safer 'mv --no-clobber' can be dangerous when you use a different computer without your custom environment.)
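
For anyone who wants to see the behaviour before relying on it, here is a minimal sketch in a throwaway directory. The file names are made up, and it assumes GNU mv (which is where the long --no-clobber spelling comes from).

```shell
# Scratch directory so nothing real is touched; file names are hypothetical.
mvdemo=$(mktemp -d)
printf 'old recording\n' > "$mvdemo/existing.flv"
printf 'new recording\n' > "$mvdemo/incoming.flv"

# With --no-clobber the move is skipped because the destination exists.
# The exit status of a skipped move may differ between coreutils releases,
# so it is ignored here and the files are inspected instead.
mv --no-clobber "$mvdemo/incoming.flv" "$mvdemo/existing.flv" || true

cat "$mvdemo/existing.flv"   # still the old content
ls "$mvdemo"                 # both files are still present
```

Since the exit status is not a reliable signal everywhere, a script that needs to know whether the move happened is probably better off checking for the source file afterwards.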

bash also has a noclobber option, IIRC. It is to prevent overwriting existing files with the > redirection operator.
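
A short sketch of that option, run in a scratch directory (the file name is made up). With noclobber set, a plain `>` onto an existing file fails, and `>|` is the explicit override.

```shell
ncdemo=$(mktemp -d)
printf 'precious\n' > "$ncdemo/data.txt"

set -o noclobber                      # same as: set -C
# A plain > redirection onto an existing file now fails instead of truncating.
if ! echo 'oops' > "$ncdemo/data.txt"; then
  echo "redirect refused"
fi
cat "$ncdemo/data.txt"                # still: precious

# >| is the explicit override when overwriting really is intended.
echo 'replaced on purpose' >| "$ncdemo/data.txt"
set +o noclobber
```

Note that this only guards shell redirections; it has no effect on mv or cp.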


bash set noclobber option


bash set command

and see


Is there a convenient way to make this type of alias work when using with _sudo_ too?

I guess one can define the alias for the root user too, but it's annoying.

  alias sudo='sudo '
If the last character of the alias value is a blank, then the next command word following the alias is also checked for alias expansion.
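
A small sketch of that rule, using echo as a harmless stand-in for sudo. The alias names are invented for the demo, and `shopt -s expand_aliases` is only needed because non-interactive bash does not expand aliases by default.

```shell
# Non-interactive bash only expands aliases when this option is on.
shopt -s expand_aliases

alias run='echo '        # trailing blank: the next word is alias-checked too
alias run_plain='echo'   # no trailing blank
alias greet='hello from an alias'

with_blank=$(run greet)
without_blank=$(run_plain greet)
echo "$with_blank"       # hello from an alias
echo "$without_blank"    # greet
```

With the trailing blank, the second word is expanded as an alias as well, which is exactly what makes `sudo mv` pick up an `mv` alias.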


goodness. I never knew of this black magic.

Sort of like how putting a space before a command keeps it out of bash's history (if HISTCONTROL includes ignorespace or ignoreboth).

Strange that sudo can be made to use your aliases (outside of sudo's control) but makes an effort to ignore your $PATH.

It isn't sudo that's expanding the alias, it's your shell. In other words, your shell is expanding sudo mv to sudo mv -i before sudo is involved.

Right. The 'alias' word is a shell built-in[1], so is interpreted by the shell, which is anyway the first process to parse any command line that you type at the shell prompt. It is only after the shell has done its processing of the command line that the processed line is passed to the actual command being invoked (such as sudo or mv or ls or ...), which is done in the argc/argv pair of arguments (using C terminology) to the invoked program. (In Python it is the sys.argv list.)

[1] Section 6.6 in https://www.gnu.org/software/bash/manual/html_node/Aliases.h...

And the above is also why the shell metacharacters (such as dollar, asterisk, square brackets) that you may use in any Unix command invocation get expanded by the shell, not by any individual command[2], which is why these metacharacters work for any command, old or new, built-in, external or user-written (in any language, as long as it supports argc/argv-like conventions). This is unlike MS-DOS, for example, where some commands supported wildcards but others did not. There, the support was programmed into (only some of) the individual commands. (CMD.EXE of later Windows versions may be different, and IIRC, is more Unix-like.)
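
To make that concrete, here is a sketch where a tiny shell function stands in for "any command" and simply reports the arguments it was handed. The function name and file names are invented; an external program would see the same argument list in argv.

```shell
globdir=$(mktemp -d)
touch "$globdir/a.txt" "$globdir/b.txt"

# Stand-in for an arbitrary command: prints how many arguments it received.
show_args() { printf '%d args:' "$#"; printf ' <%s>' "$@"; printf '\n'; }

# The shell replaces *.txt with the matching names before the call happens.
expanded=$(cd "$globdir" && show_args *.txt)
# Quoting suppresses the expansion, so the command sees the literal pattern.
quoted=$(cd "$globdir" && show_args '*.txt')

echo "$expanded"   # 2 args: <a.txt> <b.txt>
echo "$quoted"     # 1 args: <*.txt>
```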

[2] That may be a less-known fact.

An executable can’t control aliases for itself AFAIK; that’s all up to your shell.

The aliases are expanded before calling sudo. It is equivalent to just typing out the full command.

IIUC the reason sudo ignores the path is because if you use per-command sudo rules you often depend on the path being correct.

Thank you!

Just put it in a script.

    #!/bin/sh
    # call this file 'smv' or whatever, and make it executable
    exec mv --no-clobber "$@"
You only have to pay the 'annoying' cost once to add /some/util/script/dir/ to both the user's and root's $PATH.

> You only have to pay the 'annoying' cost once to add /some/util/script/dir/ to both the user's and root's $PATH

Put it in /.../sbin. It’ll be accessible to both root and non-root users.

This works too, but I personally prefer etcet's way as it doesn't require changing anything for the root user.

Here's a summary of how he did it:

- Search and list all FLV files on the disk.

- Search for all FLV file signatures on raw disk.

- For each known file on the disk, compute md5 of the first 512 bytes.

- For each FLV file signature found on raw disk exclude those that match any of the known files using those md5 values.

- That leads to only 5 files remaining.

- The original file was known to be 1.6 GB. Read 1.8 GB serially from raw disk starting from file signature and save those.

- One of these is your file.
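
The steps above can be sketched on a throwaway image file instead of a real disk. Everything here is an assumption for illustration: the image, the fake 605-byte "FLV" files, and the use of `grep -F` plus `tail`/`head` rather than the exact commands from the article.

```shell
# Throwaway image standing in for the raw device; all names are hypothetical.
work=$(mktemp -d)
img="$work/disk.img"
sig=$(printf 'FLV\001\005')          # the signature bytes searched for

# Fake FLV: signature followed by 600 filler bytes.
make_flv() { printf '%s' "$sig"; head -c 600 /dev/zero | tr '\0' "$1"; }
make_flv k > "$work/known.flv"
make_flv l > "$work/lost.flv"
lost_sum=$(md5sum < "$work/lost.flv" | cut -d' ' -f1)
{ head -c 1000 /dev/zero; cat "$work/known.flv"
  head -c 1000 /dev/zero; cat "$work/lost.flv"
  head -c 1000 /dev/zero; } > "$img"
rm "$work/lost.flv"                  # "overwritten": data only on the image now

# md5 of the first 512 bytes of every file still known to the filesystem.
known_hashes=$(for f in "$work"/*.flv; do head -c 512 "$f" | md5sum | cut -d' ' -f1; done)

# Byte offset of every signature occurrence on the image.
offsets=$(LC_ALL=C grep -obaF "$sig" "$img" | cut -d: -f1)

candidates=""
for off in $offsets; do
  first512=$(tail -c +"$((off + 1))" "$img" | head -c 512 | md5sum | cut -d' ' -f1)
  if ! printf '%s\n' "$known_hashes" | grep -qxF "$first512"; then
    candidates="$candidates $off"
    # Carve a fixed length from the offset (605 bytes is this demo's file size).
    tail -c +"$((off + 1))" "$img" | head -c 605 > "$work/carved-$off.flv"
  fi
done
echo "unmatched signature offsets:$candidates"
```

On a real disk the carve length would be an over-estimate of the file size, as in the summary, and the result is only intact if the file was stored contiguously.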

That will work if your file isn't fragmented on the disk, I guess.

If anyone would like to read more, this technique is called "file carving".

> That will work if your file isn't fragmented on the disk, I guess.

Even 15 years ago when I was doing digital forensics it was actually pretty uncommon to have to deal with fragmentation. After about 2000 filesystem allocation implementations started avoiding it like the plague.

It is hard to avoid it when streaming a video of unknown length to disk though.

And as soon as you start to fill up your volume fragmentation will spread quickly.

Any reason the data couldn't be carved out with photorec[0]? It even has flv built in, but even if it didn't, creating new types isn't hard. This type of datacarving seems exactly what that software was designed for, and it does so quickly with a wizard.

Taking the 500K bytes for the hash does seem like a smart way to differentiate between the undeleted files and the deleted ones. I'll keep that in mind if I am ever in such a crazy situation and it would be helpful.

[0] https://www.cgsecurity.org/wiki/PhotoRec#How_PhotoRec_works

Given that they mentioned there was a small bit of corruption at one point in the video, I think photorec might even do a better job.

My guess as to the cause of that corruption is that the file was slightly fragmented, but the fragments were pretty close together, and streaming video formats are resilient enough to tolerate some garbage in the middle of the stream.

IIRC photorec should be able to handle this.

The thing that tolerates the garbage is the decoder. The format says what’s valid and how to read/write it.

Depending on your platform and what codec is used during playback, the corruption may be handled differently. The player might crash.

Best to scrub the file and properly reformat to remove corruption.

Another crazy thing - if the corruption is indeed due to fragmentation, the garbage bytes may in fact be another customer’s data! Or your company’s data, or whatever, you just don’t know.

You’re handing out random blocks of your disk to a customer, embedded in a video.

Depending on the type of business you run, you might not want to do this.

> The thing that tolerates the garbage is the decoder. The format says what’s valid and how to read/write it.

The format also gives the decoder ways to re-synchronize with the stream. That isn't guaranteed to be possible for arbitrary formats.

Would be good to have that installed on the disk in future, but he mentioned he immediately put the disk in read only mode, so couldn't install new software

You don't need to have it installed on the disk, just run it from wherever.

It probably could've, and PhotoRec is iirc designed to ignore not-deleted files. (This also means that PhotoRec is useless if you formatted your drive, but scalpel still works.)

I'm surprised that the lost file was a simple contiguous area on the raw device, and would've expected the pieces to be intermingled with data from other files.

Maybe if the disk is being filled for the first time, space for the file is allocated in one go (as opposed to gradually appending to the end of the file), and there's a large enough unfragmented area of free space, and the disk isn't very busy, the file would be allocated like this on the device. But I sure wouldn't count on it.

Not sure about the flv format, and how it recovers from errors, but in the worst case it might look like the correct file but might have pieces from other videos mixed in.

Modern file systems try fairly hard to keep files contiguous for performance, and if recordings are always written and deleted semi-sequentially, you'd expect the disk to end up fairly unfragmented. That said, two or three big fragments would be pretty common in this situation.

We recently performed a large study on fragmentation on the NTFS file system [1]. On NTFS, we found that slightly over a quarter (26.8%) of FLV files were fragmented.

So if EXT4 performs similarly, it would be far more common not to have fragments.

[1] http://www.open.ou.nl/hjo/papers/di20.pdf (Paper accepted but conference and therefore publication delayed due to corona)

NTFS is known to allow more fragmentation than ext4 (as long as there is plenty of disk space left; full ext4 volumes will fragment). This is one reason why running Windows from a hard drive requires occasional defragmentation (done automatically in the background without you noticing) while Linux will happily run without any defragmentation at all in most cases.

Fragmentation will still happen, of course, but instead of preferring write performance and accepting fragmentation as a consequence, like NTFS, the ext4 system tries to avoid fragmentation where it can.

Now, in the age of SSDs and flash storage, fragmentation isn't as big a problem as it used to be, but it's nice to see the optimizations done for running an OS from a hard drive make it easier to recover byte streams from ext4 than it would be to recover them from NTFS.

I too was surprised that the article mentioned nothing about blocks or fragmentation. The deleted file has an inode, yes, but that inode also describes all the blocks that contain the data for that inode, and there is no guarantee that the blocks are contiguous. It seems that was completely overlooked, but in this case it turns out everything apart from one part in the middle of the video was recovered, leading me to believe that a few of the blocks were in fact fragmented.

pv is much smarter than you think. If instead of `cat|pv` you just use `pv filename` or even `pv <file`, it'll work out the file or block device size on its own.

If you forget to use pv and want a nice progress bar for an already executing process, use `pv -d <pid>`. That'll display progress bars for every open file. Works even for things like installers and servers, where you wouldn't be able to use a pv pipeline anyway.

By the way there is a tool called "progress" which will scan all running processes of (supported) tools like gzip, cat, grep etc., and report progress of their file operations.

If you want to emulate this by hand, first get the fd number of the file of interest by `ls -l /proc/PID/fd/` and then `cat /proc/PID/fdinfo/NUMBER`. There is a line called "pos", which is the position in the file.
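
A rough sketch of doing that by hand. It is Linux-only, the function name is made up, and the demo watches the shell's own file descriptor rather than a long-running job.

```shell
# Print "path position size" for each regular file a process has open.
# Linux-only: relies on /proc/PID/fd and /proc/PID/fdinfo.
fd_progress() {
  pid=$1
  for link in /proc/"$pid"/fd/*; do
    target=$(readlink "$link") || continue
    [ -f "$target" ] || continue
    pos=$(awk '/^pos:/ {print $2}' "/proc/$pid/fdinfo/${link##*/}")
    echo "$target $pos $(stat -c %s "$target")"
  done
}

# Demo: open a 10-byte file on fd 9 and consume 4 bytes from it.
sample=$(readlink -f "$(mktemp)")
printf '0123456789' > "$sample"
exec 9< "$sample"
dd bs=1 count=4 <&9 > /dev/null 2>&1

# BASHPID is the pid of the shell running this line; it shares fd 9.
progress_line=$(fd_progress "${BASHPID:-$$}" | grep -F "$sample")
echo "$progress_line"
exec 9<&-
```

For a real job you would pass the pid of the running cat, gzip, etc., and compare the position against the size to estimate progress.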

I did not know about `pv -d`, that's a great feature!

Great article!

I acknowledge that this is an honest mistake that might happen to me anytime.

I guess human error is the likeliest cause of data loss nowadays... :(

Anyways for my ext4 ubuntu desktop I immediately made some changes in response to this post:

* installed ext4magic and extundelete so I don't have to do it after the accident, potentially overwriting the deleted file

* changed my 'll' alias to 'ls -lisah' to include the inode. I guess it's very likely that one does a file listing before moving files around, and this can be a life saver.

Of course. Obvious.

Actually, er, what?!

One of the nice things about HN is how it reminds me that I know nothing and others know a lot. ;)

Also, kudos for being brave enough to write this up. Even had I figured out the technique, I would have been so afraid that I missed some simple trick or tool, or that I would look like an imbecile for letting this mistake happen in the first place, that I would have passed on a public blog post.

What irritates me is that he was willing to put so much effort into restoring the video file. Usually you would only do that if the files are important. But if the files are important, then you would definitely have a second copy of it (backup). Always have a backup is the real learning here.

It is also recommended to automate data-related tasks as much as possible. If you have a human doing mv over ssh regularly or semi-regularly, then there is always the risk of a typo or some other kind of human error. Over a long enough time period, I would expect such an error to happen rather than not.

In this case, they explicitly called out that they decided not to make backups of these files, and maybe you’re right that they chose the wrong trade offs and the amount of engineering time they spent recovering cost more than just keeping a backup.

But let’s say they were taking backups. “Always have a backup” turns out not to always be enough.

Perhaps the overwritten file was new enough that it hadn’t been backed up yet.

Perhaps they didn’t realize their mistake until after the backup process had run, and the backup no longer contained the file they had overwritten.

Perhaps they attempted to restore the overwritten file from backup and discovered that the backup process had actually been failing but they had insufficient testing or notifications.

Point is, backups have an engineering, hardware, and complexity cost, too. I don’t know enough about their tradeoffs to judge them for making the wrong decision here.

That said, I do agree that in general, the default choice should be automated backups, with multiple sets for different time intervals, in a mixture of on- and off-site storage, with regular automated restoration tests.

I do agree 100% with your last sentence.

As for the other (non-default) cases, I think one has to make it very clear and be precise about why a backup is not needed or why the company decided against it.

I have observed that basically no one thinks about backups until they've had at least one incident where they've lost something important to them and were unable to recover it. The actual number varies from person to person, but I've never seen less than 1, though I have seen more than 5 on several occasions.

I'm pretty sure anyone who has used a computer for a while has deleted a file by accident before having made a backup...

If that happens on your home computer - fine, lesson learned. But we are talking about production in a company. I'm pretty sure that not everyone has had to tell the boss: Sorry, I just deleted an important file from production without a backup. Therefore, to avoid such a conversation, we should take measures to prevent the deletion.

I'm pretty sure it also happens quite often, but of course instant automatic backups should be set up in that case, because companies are more important than people.

Depends on the definition of quite often, but I very much agree with the middle part of your sentence.

I was kidding with the last part of that sentence but that is just how the world is right now

You could have done this much faster with the sleuth kit. Since you already knew the inode number of the deleted file I think you could just run `icat -f ext4 /dev/md2 <inode#> > recovered_file.flv`

> In the long term, we’ll of course work on preventing this from possibly happening again. Leaving very specific solutions like alias mv='mv -i' aside..

A bare minimum first step would be to stop using mv directly and wrap it in a shell script with appropriate error checking/environment setup/etc. This will take 5 minutes to develop and immediately prevents a whole class of operator errors from happening again. It also makes for just one place to put all the logic needed, so when the process to expose a file becomes mv and something else the operator’s interface remains unchanged.
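
As a rough idea of what such a wrapper might look like (the function name, directory layout, and messages are all assumptions, not the author's setup):

```shell
# Hypothetical wrapper; directory layout, names and messages are assumptions.
publish_recording() {
  src_dir=$1; dst_dir=$2; name=$3
  case $name in
    ''|*/*) echo "refusing odd file name: '$name'" >&2; return 2 ;;
  esac
  if [ ! -f "$src_dir/$name" ]; then
    echo "no such recording: $src_dir/$name" >&2; return 1
  fi
  if [ -e "$dst_dir/$name" ]; then
    echo "already published, not overwriting: $dst_dir/$name" >&2; return 1
  fi
  # -n is a backstop in case the destination appears after the check above.
  mv -n -- "$src_dir/$name" "$dst_dir/$name"
}

# Demo in a scratch directory.
pubroot=$(mktemp -d); mkdir "$pubroot/recordings" "$pubroot/public"
echo first > "$pubroot/recordings/talk.flv"
publish_recording "$pubroot/recordings" "$pubroot/public" talk.flv
echo second > "$pubroot/recordings/talk.flv"
publish_recording "$pubroot/recordings" "$pubroot/public" talk.flv || echo "second publish refused"
```

The operator only ever types a file name, so the class of mistake where source and destination get swapped or mistyped mostly disappears.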

Simple (but possibly incomplete) answer: store files you don't want to delete on a ZFS filesystem with snapshots.

Better answer: have backups. Fancy filesystems may save you if you use their features properly, but are also hell to do deep data recovery on when something does go wrong. And they are also buggier simply by being more complex.

Using ZFS is no replacement for backups.

Still, this would have been a non-issue on a ZFS system:

  /var/recordings $ ll .zfs/snapshot/
  total 72
  drwxr-x---+ 7 root  root  7 Apr  9  2019 0003/
  drwxr-x---+ 7 root  root  7 Apr  9  2019 auto-2020-11-26_00-00/
  drwxr-x---+ 7 root  root  7 Apr  9  2019 auto-2020-11-27_00-00/
  drwxr-x---+ 7 root  root  7 Apr  9  2019 auto-2020-11-28_00-00/
  drwxr-x---+ 7 root  root  7 Apr  9  2019 auto-2020-11-29_00-00/
  drwxr-x---+ 7 root  root  7 Apr  9  2019 manual-2020-10-05/

  /var/recordings $ cp .zfs/snapshot/auto-2020-11-28_00-00/recording-16679.flv recording-16679.flv

(every directory/dataset contains an invisible .zfs directory "mounting" the snapshots of the dataset)

The file he overwrote was the backup, haha.

>Given that we already consider it to just be a backup, we currently don’t make any further backups of this data.

Never mount a backup writeable.

How do you make the backups then?

Well, you can make the backup, but then set the read-only flag on the tarball and never touch it again.

You should always have backups, but for this specific issue I think that snapshots are much better since you can run them more often (I do a snapshot of my system every hour) and it is probably easier/faster to recover a file from a snapshot than a backup

You might be right in the general case, but ZFS is extremely mature and reliable; frankly I trust it more than ext4 at this point.

Yes, ZFS makes this very easy. It's no problem to snapshot an entire filesystem with billions of files every 5 minutes from cron.

Then the OP could have done:

  zfs-restore-file recording-16679.flv
With `zfs-restore-file` as the following script (for example only, I hacked it up in a few minutes) :

  #!/usr/bin/env bash
  # The FILE and SNAPSHOT_FILE assignments and the loop's closing lines were
  # missing from the script as posted; they are reconstructed here as assumptions.
  FILE="$1"
  FULL_PATH=$(realpath "$FILE")
  DATASET=$(findmnt --target="${FULL_PATH}" --output=SOURCE --noheadings)
  MOUNT_POINT=$(findmnt --source="${DATASET}" --output=TARGET --noheadings | head -n1)
  CURRENT_INODE="$(stat -c %i "${FULL_PATH}")"
  RELATIVE_PATH="$(echo "$FULL_PATH" | sed "s|^${MOUNT_POINT}/||")"
  # iterate all snapshots of the dataset containing the file, most recent first
  for SNAPSHOT in $( \
    zfs list -t snapshot -H -p -o creation,name "${DATASET}" \
    | sort -rn | awk '{print $2}' | cut -d@ -f2 \
  ) ; do
    echo "snapshot $DATASET @ $SNAPSHOT"
    SNAPSHOT_FILE="${MOUNT_POINT}/.zfs/snapshot/${SNAPSHOT}/${RELATIVE_PATH}"
    SNAPSHOT_FILE_INODE="$(stat -c %i "${SNAPSHOT_FILE}" 2>/dev/null)"
    # skip snapshots where the file is missing or still has the current inode
    if [ -z "${SNAPSHOT_FILE_INODE}" ] || [ "${SNAPSHOT_FILE_INODE}" = "${CURRENT_INODE}" ] ; then
      continue
    fi
    echo "found the same named file with a different inode:"
    ls -l "${SNAPSHOT_FILE}"
    cp -i "${SNAPSHOT_FILE}" "${FILE}"
    break
  done
If OP didn't change the inode (overwritten with new content) then you could make another script that compares size/hash of the file, or manually specify a time of a snapshot to restore.

Question: would creating a symlink via ln -s also work instead of doing mv? That would be less risky and more performant compared to moving across filesystems.

Actually, creating a hard link "ln" (no -s) would be the best choice. With the hard link, there are two directory entries pointing to the same data on the disk. At that point, removing the original with a "rm" unlinks the original but the new directory entry for the file remains.

As a bonus, ln will not overwrite the destination file if you mistakenly try to "ln" it to an existing file.
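
A quick sketch of both points in a scratch directory (file names are made up; the files must be on the same filesystem for a hard link to work):

```shell
linkdir=$(mktemp -d)
echo 'recording data' > "$linkdir/original.flv"
echo 'something else' > "$linkdir/taken.flv"

# ln refuses to replace an existing destination (unless -f is given).
ln "$linkdir/original.flv" "$linkdir/taken.flv" 2>/dev/null || echo "ln refused to overwrite"

# Create a second directory entry for the same data, then drop the first one.
ln "$linkdir/original.flv" "$linkdir/public.flv"
rm "$linkdir/original.flv"
cat "$linkdir/public.flv"    # recording data
```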

To best of my knowledge, hardlinks do not work across filesystems; softlinks do not have any such issues. Orphan links can be detected and cleared separately.

Thanks for teaching me I can use "ln" without "-s". I used it for years and never thought about it.

I'd say that depends on the retention requirements of the recording folder and the public folder.

If you symlink "./public/recording" to "./recording", the public file only exists as long as the original file exists. Some automated cleanup could result in unexpected file deletions from "public/" (or, more specifically, the creation of dangling symlinks). However, this might be a use case for a hard link if you need the file in both places and both directories are on the same filesystem. Though I haven't thought about the implications of a hard link in this case enough so far.

The httpd should not follow the symlink across filesystems (or arguably, follow symlinks at all) and hence the file would be inaccessible.

Brilliant write up. Thank you for posting this. It makes me want to do a fire drill to see if I can use the same tools and techniques.

The remote-control magic sysrq trick is also fantastic.

All modern software has an undo option. Why doesn't the filesystem?

All modern software applications have an undo option. Windows Explorer has had a recycle bin since the '90s, the MacOS Finder has had a trash can since the '80s, and various Unix equivalents have the same. Those are interesting because they've had to put a lot of work into solving this seemingly simple problem.

I think file systems run into technical and psychological issues.

The major obvious technical issue is simply running out of space, and how you deal with that informs many other aspects of such a system. The other issue is how the user figures out what action they need to undo. High level applications have an integrated interface, so the user is directly issuing commands into the application's event loop, and the undo feature is also integrated into that event loop.

But you don't directly make calls to the filesystem; deletions or updates are always issued by a process acting on your behalf. Many processes are generating a ton of temporary data, so while these actions shouldn't be undoable, there's no general way for the filesystem to know this. A user attempting to undo an action would have to sift through a flood of irrelevant history.

The first psychological issue is the user's intent. Software applications tend to make this a two-step process: you make revocable changes, then when you hit "save" your actions become irrevocable. You move a file to the trash, and you can take it back out, but when you empty the trash it's irrevocable.

For a filesystem, again, because applications are acting on the user's behalf, this connection is largely lost. There's no clear "save point" where changes should be lost, so maybe you could add a "force" flag to make changes permanent. But if you start adding a "force" flag to actions, you'll change user behavior; if they have to force actions to make them permanent, they may start to do that routinely.

And there's a moral hazard produced by ensuring that actions are revocable; if users get used to having an "undo," they will naturally begin to rely on it. If the system has to automatically make changes irrevocable (running low on disk space), then you'll get situations where users are screwed because they assumed they'd have undo to fall back on.

And, of course, users can be screwed when they thought they had deleted (or changed) something and it wasn't really gone. This is already something forensics experts can do, but a generic undo feature lowers that bar to the nosey middle manager.

> The major obvious technical issue is simply running out of space, and how you deal with that informs many other aspects of such a system.

The same problem exists with any other kind of undo option. For example, your text editor's undo option might run out of memory. Yet, we have undo in text editors.

> But you don't directly make calls to the filesystem; deletions or updates are always issued by a process acting on your behalf.

Yes, so ideally, any system command (such as "mv") should write a small journal to some undo system, saying what actions are necessary to undo the command. When those actions are logically grouped, then the undo system's journal should reflect that.

Of course, it would be even better if the filesystem just worked with nestable transactions, like a database.

It does. If you care about your data, you're running ZFS

Yes. But can you undo single operations even as non-root user?

You don't have to be root to read snapshots. It would be nice if every fs operation was a micro-snapshot (like an undo tree), but in practice frequent scheduled snapshots work pretty well. (If it didn't exist 15 minutes ago, you can probably get it back)

They may have been able to make this command they used:

    cat /dev/md2 \
    | pv -s 1888127576000 \
    | grep -P --byte-offset --text 'FLV\x01\x05' \
    | tee -a /mnt/storagebox/grep-log.txt

somewhat (or a lot) faster by adding a strings command (maybe with an appropriate length arg matching the grep pattern length) to the pipeline after the cat, and removing the pv call, be ause then only the printable ASCII characters woyls be passed to the grep, potentially heavily reducing the work it has to do, depending on the breakup between binary and text data (bytes) on disk.


That would invalidate the byte offset. The intention here isn't so much printing the data as its location on disk.

Oh yes, of course. My mistake. I did look up all the grep options used, including --byte-offset, before commenting, but somehow your point did not occur to me. Yes, the byte offsets would be meaningless after the data was piped through strings.

Sorry, typos above:

>be ause then only the printable ASCII characters woyls

should be

because then only the printable ASCII characters would

I have used automatic daily and weekly LVM snapshots. They slow down your write speed (especially the second one IIRC), but in software development use I haven't found it an issue. If you are writing huge videos all day long, that might be different.

This is why writing scripts/tools for ops work is helpful.

You can do `alias mv='mv -n'` or similar, but then you have to hope everyone is using the same shell configuration, etc.

Even if your tiny script is just:

    #!/usr/bin/env bash
    # Move one recording into public/ without overwriting an existing file.
    mv -n "recordings/$1" "public/$1"

you’ve removed some of the human tendency towards occasional typos and written something more defensive.

In my experience as an SRE, people set way too high a bar for what should be tooling or “automation”. As soon as you turn something into software, you can iterate on it, or not, as makes sense.

If you keep on typing one off commands then the humans need to be correct every time.

> I used the big hammer to remount everything read-only immediately:

> # echo u > /proc/sysrq-trigger

> Uhm, okay, this worked, but how do I install any data recovery tools now?

Yesterday there was a discussion around an article that talked about how desktop OSs were simpler (read: better) in the 90s. One of the things mentioned was that applications in many of them were single files (or folders) that could be located anywhere, requiring no special installation or uninstallation steps. This scenario highlights one of the many reasons that is useful.

If the recovery system allows you to write to the filesystem and has network access, what prevents one from using the package manager? If not, then you couldn't place a single executable anywhere on the system either (if I'm not mistaken?) so what difference does it make in this case?

You don't need to place it on the system, since it can run from whatever medium you have it on. Network disk, USB thumb drive, whatever.

Scalpel was made to do this. It works great. Has for years.


On Btrfs, there's no overwrite of metadata. The super keeps a record of current plus three backup sets of trees. You want to stop this file system quickly though, they don't stick around very long.

Those root tree addresses can be plugged into 'btrfs restore' (offline scrape tool) to search for and extract the files you want.

Pretty disappointing that it's 2020 and there isn't better tooling for this.

There's actually really good tooling for this, see: Sleuthkit. The author just didn't seem to know about it.

If we applied content based hashing on logical sectors and kept a list of hash -> file, offset, recovery would have been so much easier.

Any idea why the journal approach failed?

Should he have done a hard poweroff instead of a read-only remount?

Thank you for this article!

I wasn’t aware that grep could search for offsets in hex like this.

This will come in very handy.

I love the detailed timeline
