Copy-on-Write on APFS (wadetregaskis.com)
48 points by ingve 21 days ago | 12 comments



I was researching this topic some time ago too, when I wanted to find ways to dedupe files on my Mac: i.e., show me which files are actual duplicates, which are cheap clones, and convert any real duplicates to clones.

Anyone know if something reliable like this exists? I swear about a year ago I found ‘fdupes’ on GitHub (seems gone now) and it had that feature, but it wasn’t clear from the GitHub issues whether it was free of bugs.
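
For what it's worth, the "which files are actual duplicates" half of this can be sketched in a few lines of Python: group files by size, then by SHA-256 of their contents. This is only a rough sketch under my own assumptions. The helper names (file_digest, find_duplicates) are made up for illustration, it cannot tell existing clones apart from real duplicates, it reports hard links to the same inode as duplicates too, and it does not convert anything to clones.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted groups of paths under root with identical contents.

    Hypothetical helper: files are bucketed by size first, so only
    files that share a size with another file are ever hashed.
    """
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling find_duplicates on a directory returns a list of groups, each group being the paths that share the same contents. Turning a group into clones would be a separate, filesystem-specific step that this sketch leaves out.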


fdupes is still available on GitHub.


Oops, seems like I misremembered and it was actually jdupes.

The author is hosting the software somewhere else now, but this is the feature and issue I was referencing: https://web.archive.org/web/20210506130542/https://github.co...


The article completely overlooked one of the greatest benefits of a copy-on-write file system, which is snapshots. APFS can create snapshots of entire volumes near instantaneously, and Time Machine uses them when performing backups.


I wish they opened it up more and gave us the tools to manage snapshots. Time Machine gives you no guarantees; it'll prune them whenever it sees fit, so I don't trust it. I wish I could give up my clunkier setup for APFS snapshots.


By far the biggest takeaway for me was to alias my cp command to cp -c.

95% of the copies I make are from my terminal so I wasn’t really benefiting from CoW before.


Wonder if the homebrew version of cp has it, too… afk now.


Depends on what you mean by the Homebrew version of cp. cp isn’t a package on Homebrew, but I suspect you’re talking about GNU coreutils, which contains GNU cp.

That one does not have the flag.

However, I think you were still using the stock macOS cp without knowing it, because a long time ago, when I played with this package, all of its commands were prefixed with a g, like gcp instead of cp.


Yeah, I meant the cp provided by Homebrew, which is, as you point out, in coreutils.

I use the GNU versions over the local versions in all cases. I have a script which links them into /usr/local/bin/.

I guess I can use /bin/cp if I need this.


I wrote https://crates.io/crates/possum-db (https://github.com/anacrolix/possum) specifically to take advantage of this feature. APFS was the originally supported FS.


Why aren't content-addressable inodes a thing? Address them by hash instead of by ID and avoid creating duplicate inodes. So, for example, two 1TB files containing all zeroes would take up one inode's worth of disk space, while the fs table still records 1TB worth of inodes. The fs inode allocation table would use a lot more disk space; what other downsides are there?

Also, how is Linux support for APFS these days?


That sounds like basic in-band deduplication. APFS just doesn't support it. You can find that on ZFS, though.
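
To make the idea concrete, here is a toy content-addressed block store in Python: each block is keyed by its SHA-256 digest, so identical blocks are stored once while every file still records a full list of block references. It is only a sketch under my own assumptions. BlockStore and its methods are invented for illustration, there is no reference counting or garbage collection, and it says nothing about how APFS or ZFS actually lay out data.

```python
import hashlib


class BlockStore:
    """Toy in-memory store that deduplicates fixed-size blocks by hash."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes (stored once per content)
        self.files = {}   # file name -> ordered list of block digests

    def write(self, name, data):
        """Split data into blocks and store each distinct block once."""
        digests = []
        for off in range(0, len(data), self.block_size):
            block = data[off:off + self.block_size]
            digest = hashlib.sha256(block).digest()
            # setdefault keeps the existing block if the digest is known.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        """Reassemble a file from its list of block digests."""
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self):
        """Bytes of block data actually held, after deduplication."""
        return sum(len(b) for b in self.blocks.values())
```

Writing two all-zero 1 MiB files into this store keeps a single 4096-byte block, but each file still carries 256 digest entries. That mirrors the trade-off in the question: the data collapses, the per-file table does not, and every write now pays for a hash and a table lookup.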



