It should be possible to cross-file de-dup more than just images.

Put your files on ZFS with dedup enabled and it will do that for any identical blocks of data. It doesn't use filetype-specific methods like the post describes, but it's file-agnostic dedup nonetheless.
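
To make the block-level idea concrete, here is a rough Python sketch of file-agnostic dedup: split each file's bytes into fixed-size blocks, hash each block, and keep only one copy per distinct hash. This is an illustration only, not how ZFS is implemented internally. The names `dedup_blocks` and `rebuild` are made up for this example, and the 128 KiB block size is an assumption chosen to mirror ZFS's default recordsize.

```python
import hashlib

# Assumption: 128 KiB, chosen to mirror ZFS's default recordsize.
BLOCK_SIZE = 128 * 1024


def dedup_blocks(blobs, block_size=BLOCK_SIZE):
    """Store each distinct fixed-size block once, keyed by its SHA-256 digest.

    `blobs` is a list of bytes objects standing in for file contents.
    Returns (store, recipes): the unique blocks, plus one list of digests
    per blob describing how to rebuild it.
    """
    store = {}    # digest -> block bytes (each unique block kept once)
    recipes = []  # per-blob ordered list of digests
    for data in blobs:
        recipe = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            digest = hashlib.sha256(block).digest()
            store.setdefault(digest, block)  # no-op if block already stored
            recipe.append(digest)
        recipes.append(recipe)
    return store, recipes


def rebuild(store, recipe):
    """Reassemble a blob from its list of block digests."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    # Two hypothetical "files" that share their first block.
    file_a = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
    file_b = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE
    store, recipes = dedup_blocks([file_a, file_b])
    print(len(store), "unique blocks stored for 4 logical blocks")
```

Note the limitation this shares with any fixed-block scheme: blocks only match when they are byte-identical and land on the same block boundaries, so the same data shifted by a few bytes inside another file would not dedupe.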

