
De-duplicating Primary Storage - briansmith
http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/
======
briansmith
The next wave of file systems should have block-level compression (like gzip)
plus automatic block-level de-duplication.

ZFS and NTFS can already compress your data transparently; on my computer, I
am getting about 50% compression with no noticeable reduction in performance.

Windows Server 2003 and later have "Single Instance Storage," which de-duplicates
entire files.

ZFS has an open issue in its issue tracker for this feature, and the
foundation work for it (block checksumming) is already in ZFS:
[http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6...](http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6677093).
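To make the connection concrete: block checksumming is the foundation because a checksum lets the file system recognize that a block it is about to write already exists, and store a reference instead of a second copy. Here is a minimal in-memory sketch of that idea in Python (my own illustration, with a hypothetical 4 KB block size; this is not how ZFS itself is implemented):

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size for the sketch

class DedupStore:
    """Content-addressed block store: each unique block is kept once."""

    def __init__(self):
        self.blocks = {}  # checksum -> block data (stored once)
        self.files = {}   # filename -> ordered list of block checksums

    def write(self, name, data):
        refs = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if this checksum has never been seen;
            # otherwise the existing copy is shared.
            self.blocks.setdefault(digest, block)
            refs.append(digest)
        self.files[name] = refs

    def read(self, name):
        # Reassemble the file by following its block references.
        return b"".join(self.blocks[d] for d in self.files[name])

store = DedupStore()
store.write("a.txt", b"x" * 8192)  # two identical 4 KB blocks
store.write("b.txt", b"x" * 8192)  # same content, different file
print(len(store.blocks))           # only one unique block is stored
```

Despite four logical blocks being written across two files, only one physical block survives, which is exactly the saving block-level de-dup is after.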

I am a Mozy user, and their tech support told me that they already use de-
duplication to cut storage costs. A client only has to send each unique block
(across all files on all computers on the account that share the same
encryption key) once to the server, and that block is then shared by every
file on every computer on the account. The result: backups are faster because
less data is sent, and costs are lower because less storage space and less
bandwidth are used. (Mozy told me they don't do compression.)

