Apple’s new APFS file system (mondaynote.com)
305 points by robin_reala on April 17, 2017 | 168 comments

> the “iOS First” release once again speaks to Apple’s priorities, to its vision of its future

I actually disagree about the reasoning behind this strategy. iOS may be more important to Apple at the moment, in terms of market value, but there's a better reason to target it first. iOS is deployed to constrained systems (iPhone, iPad, Apple TV...), which are easier to test because there is less variance to deal with than on macOS deployments. It makes a ton of sense to start there for technical reasons so that they can validate it in more consistent environments.

I'd also like to mention that it is extremely impressive that Apple was able to transparently swap out the filesystem, with no data loss, on millions of devices. I've not heard of anyone who had serious issues with the update. It was an incredible technical achievement.

Yeah that's amazing.

More generally, Apple was providing working OS upgrades for years while the notion of upgrading Windows between major versions was basically laughable. Wasn't Windows 8 the first version of Windows that you could actually upgrade to and have the system boot successfully? Maybe Vista to 7 worked for people. I never bothered trying. But I remember upgrading from OS X 10.3 to 10.4 and being floored when everything just worked.

You might want to look at http://rasteri.blogspot.co.uk/2011/03/chain-of-fools-upgradi... to see an upgrade from Windows 1 all the way through to Windows 7, with most things still working.

That's a cool little project, but it runs counter to almost every Windows upgrade I've ever performed. The natural bit rot of long running windows installations has always made a fresh install a much stronger guarantee of success than an upgrade.

My guess is that it's the keeping around of third-party drivers made for the previous OS that causes the majority of upgrade "rot."

If older editions of Windows just forcibly purged all the drivers when you upgraded, and told you to reinstall them afresh from the installation CD/OEM site (where you'd then get a version that's maybe for your current OS instead), that could have saved everyone quite a few headaches.

Then again, with major peripherals (displays, keyboards/mice, Ethernet cards, USB controllers) being much less standard than today, ripping out the OEM's driver could wedge your computer.

Certainly. I honestly never even expected a Windows update to be guaranteed to succeed simply because the surface area for them to test is unimaginably large - the same was not true for Macs (at least once OS X got good), so when a Mac OS upgrade failed I was generally more surprised.

i'm really confused about this.

i have upgraded (or my dad had anyways) from win 3.11 to win 95 and win 95 to 98. i think i clean installed win xp when i got a new computer but eventually i upgraded that to win 7.

all of them ran fine. or was that just not the norm?

Windows 3.x, 95, 98 and ME all used the DOS FAT filesystem (or the extended FAT32 variant). Windows XP supported both FAT and NTFS so if you did a clean install you were probably using NTFS.

What iOS 10.3 did was basically the equivalent of having a Windows upgrade which converted FAT/FAT32 to NTFS in-place without the user having to manage the process.

Windows XP did actually include a built in FAT32 to NTFS upgrade tool. Of course, you did have to run it manually from a command line, but I never had a failure from it. Vista would convert automatically if it saw a FAT boot drive, and I think Windows 2000 would upgrade FAT16 to 32.

I wonder if Apple will take the same approach for macOS (convert manually if you want for now, or we'll use it if you reinstall). Seems like a far greater risk for Apple if upgrading the OS breaks something for a lot of people, when they want users to be completely comfortable about updates.

I was trying to remember if CONVERT was added in W2k or XP, but if memory serves you couldn't run it on the boot partition.

Seems it was 2000: http://www.cs.toronto.edu/~simon/howto/win2kcommands.html

It definitely worked on the boot partition at least on XP, not sure about 2k. If it can’t lock the drive it schedules it to be done at boot time (similar to on-boot chkdsk).

If you, personally, were reasonably good about general computer hygiene, your odds of success in upgrading would go up considerably. The average computer user, however (sample size of the hundreds of people I interacted with during the years when I did IT consulting for individuals and small businesses) did not have the faintest idea what they were doing, and it was likely that an upgrade would fail for one reason or another.

I think my failure rate got to about 30% before I stopped even trying to run the upgrade and just did a fresh install after backing everything up.

I agree. I updated iOS and it was actually noticeably faster, too. Amazing!

Also, they _know_ that many of those devices are backed up regularly to iCloud (they may know which ones were recently sync’ed with Mac, too). That diminishes the risks even further.

I'm not so sure.. if you look at Apple's direction over the last ten years or so, you see a lot of convergence happening. Off the top of my head:

    * Siri on the desktop
    * Gatekeeper getting more restrictive
    * SIP
    * Neglect of pro users
    * New stuff showing up on iOS first

I was speaking specifically of this filesystem released to the wild, not about other new features outside of this context.

Yes, if anything because they know exactly what apps are installed, and where they come from. No custom configurations, scripts, or anything.

Possibility to help the user make a backup via iCloud (harder to do on Mac).

Also, you might argue that if you had to lose it, you'd rather lose data on your phone than your computer. I back up my computer every hour with Time Machine, but if I lost my phone I wouldn't care in the slightest if I had to install again from a 10-day-old backup.

Addressing data integrity is a slippery slope of comparing hard drive bit read errors and ECC ram bit flips. The reality is, for things like video and pictures it probably won't be a problem. However, if the data is sensitive, you should be using a raid 6 with 2 disk redundancy, coupled with constant data scrubbing to ensure that the disks remain bit rot free. The probability of bit rot occurring at the same place is probably less than 1 in 10^16, possibly even less if data scrubbing is constant.

Here's a good article addressing bit rot: https://lockss.org/locksswiki/files/ACM2010.pdf

> The reality is, for things like video and pictures it probably won't be a problem. However, if the data is sensitive, you should be using a raid 6 with 2 disk redundancy, coupled with constant data scrubbing to ensure that the disks remain bit rot free. The probability of bit rot occurring at the same place is probably less than 1 in 10^16, possibly even less if data scrubbing is constant.

The first assertion is simply wrong: the internet is littered with people who have corrupt media files which cause user-visible errors. Beyond that, however, RAID can be part of a strategy but is far from sufficient. The paper you linked provides a good discussion of the reasons why but the root mistake is assuming that failures are perfectly distributed random events at a single point.

End-to-end strong integrity checks are so important because e.g. in that scenario with a two-drive RAID array you'd still be subject to errors introduced while transferring the data from the host to the RAID controller, on the controller itself, in communication to the drives (one reason why it's important to mix vendors & models), and on the path back when reading data. You'd also have to worry about corruption in memory and about errors caused by e.g. power fluctuation or other hardware events. Those may be grossly underestimated because they're silently detected and retried until you get "lucky" and one of the 100,000 errors happens to generate a valid checksum.

Generating a cryptographic hash as early as possible and checking it at every stage is key, both because it catches almost all of those scenarios and because it allows you to confidently report a positive validation result rather than just the absence of detected errors.
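To make that concrete, here is a minimal Python sketch, assuming SHA-256 from the standard library. The helper names (`digest`, `verify_stage`, `flip_bit`) are made up for illustration and the "stages" are just function calls, not a real storage stack. The hash is computed once at the source and re-checked after every hop, so a pass is a positive result and a single flipped bit is caught at the next check.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def verify_stage(stage: str, data: bytes, expected: str) -> bytes:
    """Re-check the digest after a hop; raise if the bytes changed."""
    actual = digest(data)
    if actual != expected:
        raise ValueError(
            f"{stage}: digest mismatch ({actual[:12]} != {expected[:12]})"
        )
    return data


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate silent corruption by flipping exactly one bit."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    original = b"holiday photo bytes" * 1000
    expected = digest(original)  # computed as early as possible

    # A clean hop passes, which is a positive confirmation.
    verify_stage("after copy", original, expected)

    # A hop that flips one bit is caught at the next check.
    try:
        verify_stage("after flaky hop", flip_bit(original, 12345), expected)
    except ValueError as err:
        print("caught:", err)
```

The same digest can travel with the file (in a sidecar file or a database) so that any later reader can repeat the check.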

I agree with most of your points, and I agree that end-to-end integrity is important where information is critical. But my first assertion is not wrong. You are assuming that the flip of a single bit from a photo or video is catastrophic, both in the rendering of the file, as well as the resulting harm done to the user/individual. The reality is that most people don't value >99.999999% accuracy, and are fine with the current state of data storage. Out of 1,000,000 photos, will the user notice/care if 1 jpeg has a 16x16 block that is a different colour?

There are other places bits flip which break photo or video decoding outright.

More important, however, is that this isn't a reliably random process but often a highly-correlated one. Many users won't have a problem but the people who do often have many files affected. I've seen multiple cases where that wasn't noticed until after their backup software, cloud service, etc. had copied the corrupted bits, not to mention cases where data was corrupted on a RAID array which wasn't regularly scrubbed and so the corruption wasn't noticed until the other disk failed, long after the invalid data had been written to tape as well.

Flipping a single bit of an encrypted file renders the whole thing useless.

No it does not. Errors shouldn't propagate more than a block

Yes it does.

For photos, sure, it's not that big of a deal. But not all files are independent like that. A single bit flip in an executable can be a huge deal. I'm sure you can think of other examples.

People keep forgetting that this FS was designed by Apple for use in their appliances.

Apple devices, save for their least sold model (Mac Pro), in their least sold product line (Mac), are consumer devices with no ECC RAM. Most people don't know what ECC RAM is, most don't care, and won't pay more for it.

For Apple, it doesn't make sense to switch to ECC for system memory. However, if they implement storage ECC at block level in their SSD controller (Apple designs their own SSD controllers), they can obtain the same functionality at much lower cost (only flash controller RAM needs to be ECC).

> Apple designs their own SSD controllers

Huh! I assumed they used off the shelf stuff. Do you have any articles about this? :)

The first time I heard of this was when Anandtech did their review of the iPhone 7, which has Apple's custom SSD controller: http://www.anandtech.com/show/10685/the-iphone-7-and-iphone-...

They mentioned the '15 MacBook was the first to use Apple's custom SSD controller, which iFixit took a picture of here: https://www.ifixit.com/Teardown/Retina+Macbook+2015+Teardown...

They most likely started off from Apple's purchase of Anobit, the custom flash controller designer, back in 2011: https://en.wikipedia.org/wiki/Anobit

Except that not all storage devices APFS will be used with have those SSD controllers. If nothing else, Apple still sells Macs with hard drives (not sure if all the SSDs they sell have their controllers). There's also external disks, which Apple will presumably support formatting as APFS.

Since when is Apple a server company?

All their SSDs use their controllers.

As for external disks, Apple couldn't care less, for them, the future is cloud storage.

The fun thing is that Apple never used "parity" much even back when it was common in PCs

Parity was never common in PCs.

Also interesting is Facebook's handling of cold storage (2015).

From their blog post:

> The easiest way to protect against hardware failure is to keep multiple copies of the data in different hardware failure domains. While that does work, it is fairly resource-intensive, and we knew we could do better. We wondered, “Can we store fewer than two copies of the same data and still protect against loss?”


Really interesting to think about how these guys model out reliability estimates and come up with such crazy numbers. Thanks for the share.

Parity raid does not protect against bit rot.

You should have redundancy even for data in memory, if data is that sensitive.

Unfortunately bit rot is not something APFS addresses as it only checksums metadata. Wonder if this is something Apple can add to APFS in the future?

I hope. It wouldn't be such a big issue if Intel stopped stripping ECC from desktop CPUs. Traditionally bad RAM causes more bit rot than damaged drives. RAM gets less reliable with every die shrink; everything should be ECC by now.

Why not just go with AMD? ECC is supported on all Ryzen SKUs.

But that does not mean that all the boards support it, even if the chip works at all. What good does ECC RAM do if the errors are not logged?

I was under the impression that without ECC RAM, you shouldn't be using an auto-repairing file system anyway because it will cause the bit rot to spread when it incorrectly calculates checksums.

This is incorrect. The point of ECC RAM is that if you are worried about bit-rot, you'll still get it without ECC RAM. Strictly speaking something like ZFS improves the situation - since a bit-flip which hits the checksum and not the data at the very least will raise an alert that something went wrong. But this will also happen with any other filesystem - it'll just be silent.

But the point is no filesystem can protect you from unreliable RAM.

Sure probability of errors increases, but RAM reliability seems to have improved

CPU ECC is for RAM, not storage - checksums/parity is not the concern of the CPU at all.

Though if we're talking about Optane, then that's different. In the future (with solid OS support), Optane will be both RAM and non-volatile storage, and you really don't want data corruption there. Has Intel announced anything regarding ECC, parity or other systems in place to protect against bitrot in Optane?

I don't think you get what I'm saying. The bit rot that happens to your data usually happens when it's in the RAM. The CPU likes it there when you're doing anything with it and once you're done editing a file it will stream happily corrupted data right back to disk. Hard drive and SSD error detection and correction is quite good; in regular RAM it's nonexistent.

This is a major danger with database servers, as regular bit rot can be detected or corrected above the filesystem level, but no systems exist that can guarantee protection to data in RAM. Since databases try to do most/all operations in RAM before flushing to disk there's a huge risk of silent data corruption.

Adding bitrot protection to the filesystem only gives you more protection if the drive has crappy error correction. Adding protection to RAM adds checks to a system that usually has none.

> The bit rot that happens to your data usually happens when it's in the RAM.

Would love to read a source on this.

Just check your disk's SMART indicators for unrecoverable errors. Drives are using pretty high quality checksums on data at rest; the chance of corruption silently slipping past both the BCH and LDPC codes used by a typical modern SSD is far smaller than the chance of your non-ECC RAM getting corrupted by cosmic rays.

Your statement is correct, but doesn't tell the rest of the story.

The disk has an onboard data buffer, usually implemented in DRAM. That DRAM chip is likely sourced by the lowest bidder, and it is likely not to have either parity or ECC. Also the CPU used in the drive can introduce its own data errors and corruption. It doesn't even have to be hardware. A firmware bug can also corrupt data.

There are numerous buffers and storage elements involved in transferring the data to/from the disk's onboard memory to/from the computer's RAM. Many/most of those elements are not checked for parity. Data is subject to corruption in transit.

That's what makes a filesystem like ZFS so interesting. It checks the entire data path. It doesn't care where the corruption occurs.

I never considered DRAM errors on the drive... Maybe that's one of the big cost differentiators of those super expensive commercial grade SSDs.

Disks have their own block checksumming so they can catch errors.

Bit rot isn't addressed by any filesystem without device redundancy, which no Apple device has had as an option since the Mac Mini server was discontinued.

You can tell zfs to store N copies of the files (you do it at file system level, but it doesn't have to be for the whole disk) instead of one, so you can recover from bit rot even with a single disk. At least in theory, I have never tried it.

ZFS can tell you which files are corrupt, and you can then manually restore those files from an offline backup. No online device redundancy required.

Unless the backup files are also corrupt.

Sure, that's a small possibility, but are we trying to get to 0% chance of data corruption?

There is no point in pushing the probability of data corruption below the probability of an event that makes the stored data irrelevant. Like, for example, Earth being hit by an asteroid, global nuclear war or, for personal data, the chance of death.

Everything is backed up in the Akashic records anyway, it's fine.

I assume bit rot isn't an issue on devices running APFS. They're designed to be continuously backed up. EDIT: never mind; didn't realize iCloud didn't keep historical copies for long.

Backup doesn't do anything about bitrot. Worse, it just propagates and compounds it over time.

Backup will not fix your bitrotted files by itself. It will just propagate silently into the future where you don't have access to historically correct files anymore.
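One way to avoid feeding rotted files into a backup is to check them against a hash manifest first. The following is only a rough Python sketch under stated assumptions: the manifest layout and the names `build_manifest` and `suspect_files` are hypothetical, and "digest changed while mtime did not" is a heuristic hint, not proof of corruption.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks to keep memory use flat."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict:
    """Record (mtime_ns, digest) for every regular file under root."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = str(path.relative_to(root))
            manifest[rel] = (path.stat().st_mtime_ns, file_digest(path))
    return manifest


def suspect_files(root: Path, manifest: dict) -> list:
    """List files whose bytes changed although their mtime did not.

    A legitimate edit normally bumps mtime, so a changed digest with an
    unchanged mtime is worth a look before the backup overwrites a good copy.
    """
    suspects = []
    for rel, (mtime_ns, old_digest) in manifest.items():
        path = root / rel
        if not path.is_file():
            continue
        if path.stat().st_mtime_ns == mtime_ns and file_digest(path) != old_digest:
            suspects.append(rel)
    return suspects


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        photo = root / "photo.jpg"
        photo.write_bytes(b"\xff\xd8" + b"pixels" * 100)
        manifest = build_manifest(root)

        # Simulate silent corruption: change the bytes, then put the old
        # timestamps back so nothing looks edited.
        before = photo.stat()
        photo.write_bytes(b"\xff\xd8" + b"pixels" * 99 + b"pixelz")
        os.utime(photo, ns=(before.st_atime_ns, before.st_mtime_ns))

        print(suspect_files(root, manifest))
```

This only detects the problem; you still need an older, known-good copy to restore from, which is exactly why short backup retention hurts.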

APFS doesn't do proper checksumming on data, and won't mirror or RAID, so it doesn't have the data recovery properties implied by the article.


Isn't it possible that Apple developed this filesystem primarily to be used with their SSD-based devices, which use their custom controllers, which simply take care of this low-level protection? SSD controllers are said to be very complicated pieces of proprietary technology, and I believe they use sophisticated algorithms for integrity protection.

Apple users use external (USB, Thunderbolt) drives too, where they would not get this special hypothetical protection.

Clearly telegraphing Apple's intent to release new "Apple approved (and licensed)" storage "accessories" for their systems.

That'd just be going back to their historical roots. You used to only be able to buy Apple RAM and Apple drives. All parts were Apple parts, there wasn't anything else.

And look, we're already back to, you can only get RAM from Apple, because so many of their products come with it soldered on the logic board with no upgrade path.

Maybe, but isn't the recent trend at Apple to discontinue desktop-only accessories (like monitors)?



At the recent Mac Pro announcement, they also said there would be a new Apple-branded monitor, so they're not out of that game completely.

It's courage!

Lock-in all the things!

Indeed, sure would be a shame for anything to happen to your data because you chose not to use drives equipped with iStoreSafe(tm) technology by Apple(tm).

Brings back the days when Mac users were paying a 3x markup for standard SCSI drives.

This is bad logic. The benefits derived from additional data integrity assurances would be additive.

A special controller isn't a valid substitute for other measures. Apple hasn't historically been any good at designing filesystems and they aren't any good now.

Don't buy the hype or fall into the magical thinking. Disks fail. Apple tried to handwave it away.

And yet, neither Windows, nor Linux, nor macOS does anything special about it for consumer/90% of professional use, and the sky hasn't fallen.

Earlier cars were terrible death traps and we were told crashes at highway speeds were unsurvivable.

People died in mangled messes but the sky didn't fall then either.

The status quo is rarely a sufficient argument because the human race is pretty much terrible at everything, improving things only slowly, incrementally accruing useful strategies and procedures.

We build bridges well thanks to centuries of practice; software is less mature.

>The status quo is rarely a sufficient argument

And I didn't say it is. My answer was to the parent who singled-out Apple as somehow special in neglecting this.

They are writing a new filesystem today and neglecting this. They are special, and not in a good way.

Nitpick: I can think of one thing NTFS does proactively about this problem. Just not anything good. Googled and found: https://blogs.technet.microsoft.com/doxley/2008/10/29/self-h...

In my time at Microsoft I did see a number of workarounds for bad disks in Windows source, in ntfs.sys and elsewhere.

However I agree with your overall assessment, it's not as if anyone is running zfs or similar as a default.

To be honest, I have a long list of friends, especially in creative professions where people struggle with setting up reliable backups. There is a reason Dropbox and other cloud drives were so successful. Any improvement on the data integrity front is very welcome.

Well, that's disappointing to hear. I've run into multiple MP3s bit-rotting on my MB Air over several years. Perhaps solely the fault of HFS+, but trusting the underlying hardware just doesn't seem like a good idea.

Damaged files (that can still be read from disk without error) are more likely than not a result of bad memory. It is incredibly unlikely to have bits mutated on disk without triggering a CRC error: approximately 1 in 4 billion random errors would go undetected by a CRC-32, and the error has to be spread over more than 32 bits to slip through at all. The fact that you had multiple damaged files basically guarantees that it was a faulty memory (or bus, or controller) issue, not actual magnetic media corruption.
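The "1 in 4 billion" figure is just 2^-32, the chance that random garbage matches a given 32-bit check value. As a hedged illustration in Python, using `zlib.crc32` as a stand-in (real drives use their own, usually stronger, codes, and the 512-byte "sector" here is an assumption), the sketch below checks every possible single-bit flip in a block and counts how many go unnoticed.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def undetected_single_bit_flips(block: bytes) -> int:
    """Count single-bit flips that leave the CRC-32 of the block unchanged."""
    good = zlib.crc32(block)
    return sum(
        1
        for i in range(len(block) * 8)
        if zlib.crc32(flip_bit(block, i)) == good
    )


# Chance that a random corruption happens to match a given 32-bit check
# value: 2**-32, i.e. roughly 1 in 4.29 billion.
RANDOM_MISS_PROBABILITY = 2.0 ** -32

if __name__ == "__main__":
    block = bytes(range(256)) * 2  # a 512-byte stand-in for a sector
    print("undetected single-bit flips:", undetected_single_bit_flips(block))
    print(f"random miss probability: {RANDOM_MISS_PROBABILITY:.3e}")
```

A CRC catches every single-bit error by construction, so the count is zero; only wider, unlucky error patterns can collide.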

ZFS can't really help you with that as the data was likely damaged in transit rather than on disk; though with its own 256 bit hash, it is likely to detect those faulty system components earlier than later.

All my memory is ECC (other than my barely-used laptop). Been there, bought the t-shirt, decided non-parity can go jump in a lake more than a decade ago.


I encountered this going through a copy of my photos stored on a pair of WD Greens using NTFS 3-4 years ago. The original copy on a ZFS machine was fine. I found a few others, and promptly stopped using those drives.

Two years ago I had repeated bursts of ZFS checksum errors from a pair of SanDisk SSDs. Evidently TRIM didn't quite work perfectly 100% of the time, and caused data corruption - luckily ZFS was always able to repair it, and it being detected meant I could do something about it early - I updated firmware and the issue went away. Last year it came back after an OS update, and I just turned TRIM off completely (I guess it was sensitive to TRIM patterns and those changed).

Last year I also had a Toshiba HDD forget how to IO properly, and got a constant stream of ZFS checksum errors from it until I yanked it from the hot-swap bay and reinserted it. It resilvered and scrubbed fine.

These aren't the only times I've seen checksum errors and silent corruption, they're just the most recent. ZFS lost a file once, and was very noisy about it - the status message for the lost metadata stayed until I recreated the pool. NTFS, UFS2, ext2, all were completely silent on the fact that they were showing me data that was clearly wrong.

I don't trust disks, or IO controllers, and I don't trust filesystems that do. Neither should you.

I also disable TRIM on all my devices. Rather I set aside 7% of disk space that I never touch. It eliminates the need for TRIM without performance degradation.

To mitigate the likelihood of damage during transit, it's recommended to use ECC memory with ZFS. I had FreeNAS running on a system with a faulty memory module. ZFS correctly discovered damage in random files I had just transferred over. Luckily, it wasn't too late to replace the bad module and repeat the transfer.

Lessons learned:

- Run MemTest86 on hardware before using it as storage

- Use an FS that does checksums

Unfortunately, the latter doesn't seem to apply to APFS.

Can you always just buy ECC memory (I have DDR3, I guess it's more affordable now since it's a little bit older), or does something else (cpu, motherboard) need to support it as well?

You need a compatible motherboard and CPU.

Which is why Intel kinda sucks. It would cost them basically nothing to have ECC enabled on all of their hardware, but they insist on using it as a differentiator between server and desktop parts.

AMD (at least in the past), includes it on all parts so that it's up to the consumer to choose.

While I agree that it'd be nice if Intel didn't use ECC as a server/desktop delimiter, AMD's stance (at least for AM3 and AM4 parts) appears to have been "the CPUs support it, but we haven't run any validation tests on it; if the motherboard manufacturers want to validate it and turn it on, great".[1][2] Which is not quite the same as it being enabled on all parts.

[1] - http://www.hardwarecanucks.com/forum/hardware-canucks-review...

[2] - https://www.reddit.com/r/Amd/comments/5x4hxu/we_are_amd_crea...

That absolutely does mean it's enabled on all parts. They also don't validate the chips against FreeBSD - does that mean you can't run FreeBSD on their chips? Or do you think it would be ridiculous to expect them to test scenarios outside of the market they're targeting?

Do you have any idea the cost of running validation tests? I'm not the least bit concerned that they haven't "validated" the ECC functionality. It's enabled, they know it works, it's the same ECC they use on server class chips, and if someone found a bug I have no doubt they'd issue microcode to fix it.

I should have perhaps included the article which tried _enabling and using_ ECC on a Ryzen CPU+MB. [1]

Page 5 is perhaps the most important one, where it observes that neither Windows nor Linux appears to react to a UE (uncorrectable error) by halting, and Windows can't quite figure out that ECC is enabled on the platform and parse the notifications it gets as such.

So, sure, I should concede that it is "enabled" on all parts, I was wrong. But that doesn't mean it should be trusted on any of them.

[1] - http://www.hardwarecanucks.com/forum/hardware-canucks-review...

I guess we can agree to disagree. AMD implementing it, motherboard mfgs implementing it, but Windows not having an updated driver to handle it in all situations isn't on AMD. And it doesn't mean it's not there - it means that Windows is lagging slightly behind on a brand new platform. Something that's been fairly common with AMD for decades now. There's a reason the acronym Wintel became a thing.

Precisely because it is a market segmentation tactic, they do allow ECC on lower end chips which they do not see stealing marketshare from Xeon.

My home media center PC / NAS runs an i3-4370 CPU with ECC RAM

The same is true for AMD's new Ryzen CPUs, ECC support is enabled. You still need a suitable motherboard though.

On Linux you can force enable ECC on an unsupported motherboard with

> ecc_enable_override=1

Most motherboards have the extra memory traces in place even if they don't enable ECC.

"cost them basically nothing"

"using it as a differentiator between server and desktop"

so it would cost them SOMETHING...

Something related to artificial milking, as opposed to tangible production costs.

Sure, but as a consumer it is kind of nice that my chips are probably a little less expensive because server companies are paying more for their chips.

Who said we aren't milked as consumers too?

Those are two segmented markets. If Intel's revenue is above their R&D and other expenses, then whatever profit they make milking enterprises/server companies is independent of what they make milking us.

Why is this any different from tesla selling you software locked batteries?

Well, it could be, who said it isn't? Though i'm not familiar with the Tesla case.

But selling "locked batteries" in a product where the batteries are 80% of the innovation/feature set, and where after-market batteries could cause all kinds of issues, is one thing.

Whereas selling memory at triple or more the price just because you switched on some feature (ECC) that would have cost nothing to switch on for everybody is another thing.

With Tesla, they literally sell everyone the 75 kWh battery. Some people pay 80k for the car, some people pay 75k for the car, but the battery is software limited to 60 kWh. You can later pay 6k to software unlock the extra part of the battery sitting in your car.

Product market segmentation is a very reasonable thing to do. Why do people make it out like a bad thing? If you ever run a business, you will want to find a way to get big enterprise to pay X, and small business to pay X/4. ECC is something businesses care way more about than gamers, so why not charge more for it?

>Product market segmentation is a very reasonable thing to do. Why do people make it out like a bad thing?

Because most of us would rather pay a price that mostly reflects costs + some reasonable profit, not some artificially created segment, not fuel extravagant profits, not pay for future research, not pay for the company to have cash reserves, etc etc.

As a consumer, your only choice is binary. Vote yes and buy, vote no and don't buy.

If it offends you that your device has some enterprise feature that you don't really need turned off unless you pay $x... sorry? But you don't really have a right to the feature for some margin % that you deem fair.

Eh, I've seen an Intel SSD 320 eat a bunch of pages and an HGST 5K4000 4TB direct thousands of writes to the wrong sectors. There are a lot of things that can go wrong with storage devices, not just the optimistic case of a bit getting flipped after being written correctly.

Aside from transferring from disk to ever larger disk, how would memory corrupt an mp3? Given Apple's long-time tradition of putting metadata (playcounts/star ratings/etc) in a separate file, an mp3 on a macOS computer is pretty much write once, read many.

Static data on a flash device (e.g. SSD) is subject to wear leveling, which is a regular re-writing process. This is counterintuitive, but makes sense.

If your flash device never moved the static data, the only flash blocks that would get wear cycles would be the flash blocks that contained dynamic (normally changed and rewritten) data. The result would be the blocks of flash that were not static would quickly wear out and the blocks of flash that were static would have a lot of unused write cycles available.

In order to use all of the wear cycles of all of the blocks, the static data has to be moved regularly so the blocks all have a (roughly) equal number of wear cycles[1]. Every time the data is moved, there is an opportunity for data corruption.
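A toy simulation makes the effect visible. This Python sketch is not how any real controller works: the block counts, the threshold and the `simulate` function are all assumptions chosen for illustration. It compares a drive that never relocates static data with one that moves the coldest static block onto the most-worn block whenever the wear gap exceeds a threshold.

```python
def simulate(num_blocks, static_blocks, writes, static_leveling, threshold=50):
    """Return (per-block erase counts, number of static-data relocations).

    Toy model: `static_blocks` blocks hold data the host never rewrites;
    every host write lands on the least-worn remaining (dynamic) block.
    """
    if not 0 < static_blocks < num_blocks:
        raise ValueError("need at least one static and one dynamic block")
    erases = [0] * num_blocks
    static = set(range(static_blocks))
    moves = 0
    for _ in range(writes):
        dynamic = [b for b in range(num_blocks) if b not in static]
        # Ordinary (dynamic) wear leveling: write to the least-worn free block.
        target = min(dynamic, key=erases.__getitem__)
        erases[target] += 1
        if static_leveling:
            coldest = min(static, key=erases.__getitem__)
            hottest = max(dynamic, key=erases.__getitem__)
            if erases[hottest] - erases[coldest] > threshold:
                # Relocate the static data onto the most-worn block, which
                # costs one more erase there, and free the barely-used block
                # for future host writes. Each relocation is also a chance
                # for the data to be corrupted in flight.
                erases[hottest] += 1
                static.remove(coldest)
                static.add(hottest)
                moves += 1
    return erases, moves


if __name__ == "__main__":
    for leveling in (False, True):
        counts, moves = simulate(8, 6, 10_000, static_leveling=leveling)
        print(f"static leveling={leveling}: max erases={max(counts)}, "
              f"min erases={min(counts)}, relocations={moves}")
```

Without static leveling, the two dynamic blocks absorb every write while six blocks stay untouched; with it, the wear is spread across all eight at the price of extra copies of the "write once" data.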

The flash data blocks (typically) have ECC (error checking and correction) which is designed to prevent data corruption. There are limitations to ECC:

* ECC can only correct a limited number of errors.

* Flash memory is not a perfect storage medium, it can "bit rot" too - the primary reason for ECC with flash is to "hide" the inherent bit rotting of flash. "MLC"[2] flash chips aggravate the problem because their margins are smaller.

* If a memory controller does a wear leveling move and the source data is bad, beyond the ability of the ECC to correct, it has no way to correct that error and (generally) has no way to inform the user that their file (system) has suffered corruption.

In Jean-Louis Gassée's anecdote (which is typical), his notification that his wife's files were corrupt was a backup failure notification. The backup failure was telling him that it could not read files, but it was not clear to him (and would not be clear to most users) that the root cause was file corruption, not a backup problem per se.

[1] https://en.wikipedia.org/wiki/Wear_leveling#Static_wear_leve...

[2] https://en.wikipedia.org/wiki/Multi-level_cell

I add parity archives to add some redundancy to my photographs. I've done it for years, but I think it's useful not to rely on a filesystem to handle this.

For example:

  par2create -r5 -n2 example.par2 *.jpg
creates two files, between them giving 5% redundancy. I think that should be more than enough to repair bitrot within a file, but depending how many photos there are, losing a whole file could be more than 5%.

  par2verify example.par2
will verify, and par2repair will repair corrupted or missing files.
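For anyone wondering how a small recovery file can rebuild lost data, here is a minimal sketch of the idea using plain XOR parity. This is not what par2 does internally (par2 uses Reed-Solomon coding, which can survive several lost blocks); the function names are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(blocks, parity):
    """Rebuild the single block marked None from the survivors plus the parity block."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    survivors = [block for block in blocks if block is not None]
    repaired = list(blocks)
    repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired


blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(blocks)  # one extra block of redundancy
print(recover_missing([b"AAAA", None, b"CCCC"], parity) == blocks)
```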


Ha, I thought of writing a similar ad-hoc checksumming tool so many times. I should have checked. I now wonder how feasible it is to embed them invisibly in metadata fields.

It would be neat if there was a parity scheme fast enough to preserve 2% of all files on disk. It could even be tucked away behind savings from file-level compression.

> it's useful not to rely on a filesystem to handle this

In what way?

Relatively few filesystems offer thorough data checksumming. Hardly any offer erasure coding. RAID at the filesystem layer is a bit more common, but also more inconvenient. Doing erasure coding at the file archive level rather than in the filesystem gives you the freedom to move your archives onto standard everyday filesystems and devices without silently losing the protection.

You don't need _a lot_ of filesystems to offer it - just the one you use - and if this is important to you, just use ZFS.

What if I use multiple computers and multiple operating systems, and want to be able to work on this data on all of them? Par2 would allow me to create the needed data on any computer and test it on any computer.

The other alternative is having a server that's up and running all the time, exposed to the internet (or complicate the setup with a VPN), so I can sync. Operations would take a long time (via the internet) or I would anyway need to transfer the data to my computer, work on it, sync it back. During this time, any protections that ZFS offers are null since anything could happen in my computer and I can't test for it locally.

ZFS is great. But it's not the answer to everything.

>You don't need _a lot_ of filesystems to offer it - just the one you use

Only if you use it everywhere. Including on your laptop.

How often have you had to, and been able to, correct data using your parity archive?

I think I've needed it once, and I was able to correct the error.

I have a cron job which runs every month and verifies each photo album; that picked up the error.

Yep, I had a similar thing with audio files corrupting on my old MacBook. I was using Time Machine to back up and guess what... the copy in the backup was fried too!

Switched back to vinyl for my music; my 20-year-old Technics 1210 will probably outlast my current laptop and the next!

The physical vinyl won't outlast your digital, it'll just fail more gracefully, slowly losing fidelity every time you scrape it with a needle. You're just replacing a small chance of catastrophic failure with a guaranteed gradual failure.

Wikipedia says

"Record wear can be reduced to virtual insignificance, however, by the use of a high-quality, correctly adjusted turntable and tonearm, a high-compliance magnetic cartridge with a high-end stylus in good condition, and careful record handling, with non-abrasive removal of dust before playing and other cleaning if necessary."

In other words, record wear is negligible if you buy good equipment and take good care of it. Just like the risk of bit rot, catastrophic crashes etc. is negligible if you buy good storage equipment and take good care of your data (good backups etc.)

Perhaps the one big advantage of vinyl over digital is that the shelf life of unused vinyl is larger than a human lifetime. A disk stored on a shelf, however, can suffer damage in many ways and is guaranteed to be very hard to connect to your computer in ten-twenty years.

With a little calibration of the TT once in a while it will easily outlast me!

You switched to a playback means that is inherently destructive to the media as a means to protect against data loss?

Nitpick: You can, in fact, create new folders in the iCloud Drive app on iOS. For some reason it's hidden under the Select option.

For those who don't know of him, Jean-Louis Gassée (the author here) is great. Check out some of his quotes: http://www.azquotes.com/author/38975-Jean_Louis_Gassee my favorite probably being http://www.azquotes.com/quote/754436 about Apple Computer.

He also invented the legendary BeOS.

Isn't the file system guy from BeOS the guy behind the Apple FS? Yeah, he's referenced in the article: it's Dominic Giampaolo. Not sure how much coding he did, but it says he's one of the architects.

BeFS was way way ahead of its time. Really a shame that OS didn't get picked up by more people.

I think he's also credited as the father of the add-on filesystem journaling that was enabled on consumer OS X 10.2.2 back in fall 2002, and many of the shims and patches that have kept HFS+ going to this day.

The author links to a much more technically informative article:


He is joking with the title. Money quote: "encryption will be easier to use, disk space will be better utilized, backups will be more reliable"

Both my iPhone 7 and my iPad Pro only have 32G of storage and the new file system gave me back extra storage due to better utilization. It was interesting how smoothly the transition went.

Bit rot is absolutely not about SSD degradation.

How did the backup software detect it then? Or is the author falsely calling SSD degradation bit-rot?

Hard to say without dmesg, but it could be metadata corruption and in the ensuing confusion backupd gives up; or it could be an explicit read error report by the drive which would result in an I/O error to backupd and it'd likewise give up.

Most likely, the bit flipped in RAM and was bad in transit to the SSD.

The most interesting part of this article (for me at least) are those early Mac screenshots - The Flounder? and that next one with not-quite-Chicago/Geneva fonts in the Finder? Where did those come from? I realize JLG had been with Apple since the early days of Mac development but I've never seen anything like those before.

Jean-Louis kind of speaks incorrectly here when he says "Very old-timers will remember the pre-Mac File system called The Flounder", unless by old-timers he means Apple employees that were on the Mac team.

There's a little more about the early Mac software history at Folklore.org, including another screen shot of what eventually became Finder:



Funny... I converted Cream 12 for use in Apple II programs.

For the old-timers, it was a shape table and multiple draw commands of numbered (ascii value - 32) shapes would allow you to write proportionally spaced text to the graphical screen. It was the same format Take-1 used (we used their programmer's toolkit).

So, Cream 12 came down from Altos all the way to our Apple II+'s.

You can see a lot of this stuff if you read through the early Mac team stories on http://www.folklore.org.

What's the best way to protect myself against bit rot, as someone with extremely slow upload speeds?

The best way is using ZFS with parity.

Since there seem to be experts here in the discussion: I keep thinking about a little toy open-source project to address bitrot in Time Machine backups. A utility that compares checksums of the original file against its backup. Apparently Apple started using hashes internally in Time Machine and it would probably be simplest to use those. However, I cannot find any documentation on the internals of Time Machine. Does anyone know where to start looking? Would such a tool make sense at all?
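A rough sketch of what such a utility could look like, independent of Time Machine's internals (which are not documented here): it hashes each file and its backup copy itself, and uses a matching mtime as a crude hint that a mismatch is corruption rather than an edit. The function names and verdict strings are made up for the example, and the mtime heuristic is an assumption, not a reliable test.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_with_backup(source_root, backup_root):
    """Yield (relative_path, verdict) for every file under source_root."""
    for dirpath, _dirs, files in os.walk(source_root):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(backup_root, rel)
            if not os.path.isfile(dst):
                yield rel, "missing in backup"
            elif sha256_of(src) == sha256_of(dst):
                yield rel, "ok"
            elif os.stat(src).st_mtime_ns == os.stat(dst).st_mtime_ns:
                # Same mtime but different content: one copy may have rotted.
                yield rel, "suspect"
            else:
                yield rel, "modified"
```

It cannot tell which of the two copies is the bad one; for that you would need a hash recorded when the file was known to be good.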

I really like this idea. It's a relatively simple approach that could work with any fs/backup strategy. It would be great if it wasn't dependent on Time Machine.

ZFS/RAID + ECC RAM are complicated and expensive and are not even a 100% guarantee against bit rot.

I've found and been using this. https://github.com/rselph/sumcheck

Does it make sense? No idea.

tmutil verifychecksums should already do this.

As far as I understand it, this will verify checksums within the Time Machine backup by comparing hashes with the ones that were created during backup. However, it will not compare the files against the ones on the system. So it protects against bitrot on the backup disk only.

Well, you'll need some kind of heuristic to figure out if a mismatch is due to bit-rot, or because the working file's changed. Maybe modification date, but a) what if a file's restored from an older backup, b) mtime itself is corrupted (APFS would guard against this I suppose).

In any case, the "hash" appears to be CRC32, stored in extended attributes:

  $ xattr .inputrc
  $ xattr -px 'com.apple.finder.copy.source.checksum#N' .inputrc
  26 E5 4A AB
  $ cksum .inputrc
  2873812262 65 .inputrc
  $ printf '%x\n' "$(cksum .inputrc | cut -d ' ' -f 1)"
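The attribute bytes and the `cksum` output above look like the same 32-bit value in different byte orders. A quick check, assuming (from this one sample only) that the xattr stores the checksum little-endian:

```python
def xattr_matches_cksum(xattr_hex, cksum_value):
    """Interpret the hex dump from `xattr -px` as a little-endian integer."""
    return int.from_bytes(bytes.fromhex(xattr_hex), "little") == cksum_value


print(xattr_matches_cksum("26 E5 4A AB", 2873812262))
print(format(2873812262, "x"))
```

Note that POSIX `cksum` is a CRC over the file contents plus its length, so it will not match what zlib-style CRC32 tools report for the same file.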

What on earth is nanosecond data resolution? Does it just mean file create/modify timestamps are saved with that level of precision?

If so, who cares about that?

What they missed with nanosecond timestamps is that it's a feature which is essentially free - What you're actually looking at is 64bit timestamps.

With the 'unix epoch' way of doing things (counting seconds since 1970), a (signed) 32bit int will run out of values in 2038. If they'd counted milliseconds since 1970, they would have exhausted int32_t in less than 36 minutes, and nanoseconds would have exhausted same in less than 2 seconds.

So the move to 64bit timestamps solves the '2038 problem' (for the filesystem, at least) - but at some point someone has decided this filesystem does not need to support the year 292,277,026,596AD, and has chosen to use some of the space for granularity instead. (Given 292 billion years is 20 times the estimated age of the universe, they were probably correct in their assumption that we'll have a new filesystem before then).

If the number of people who need to differentiate between two files created in the same second, is more than the number of people who expect their Watch to still work after our sun is a cold shrivelled mass - they've made the more logical use of the valuespace.

> someone has decided this filesystem does not need to support the year 292,277,026,596AD, and has chosen to use some of the space for granularity instead.

I don't know where you came up with that number. Is it from Apple documentation?

A true nanosecond timestamp wouldn't get you anywhere near the number you suggest. Here's a back of the envelope calculation:

A 32-bit timestamp with one-second resolution is good for, more or less, 68 years. Or 136 years if unsigned. That's the traditional "unix epoch" setup.

Now add 32 more bits to the LSBs of the timestamp. There are 1 billion nanoseconds in a second. A 32 bit integer can hold over 4 billion different values. So the 32 LSBs, if they represent nanoseconds, will overflow in just over 4 seconds.

What that means is the 32 MSBs now have a range of just over 4x the traditional Unix timestamp. If it was 68 or 136 years before, it becomes 291 or 582 years.

That's what I would have done if I were Apple. I would have set the LSB of a 64 bit unsigned counter to represent 1 nanosecond. I would keep year 0 as 1970. So, problem solved until (roughly) 1970 + 582 = 2552.

292 billion AD is where 2^63 seconds from 1970 would get us, i.e. what we'd get if we just converted the existing epoch to 64bit.

This is why I call nanoseconds a 'free' feature - it's clear it's time to solve the '2038 problem' now - and simply moving to a 64bit timestamp would solve that. But rather than solving this problem for the next 292 billion years, we can make better use of the timestamp today. So, as you say, nanosecond granularity for the next 500 years is more useful than one-second granularity for the next 292 billion.
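The figures in this sub-thread are easy to check. A small sketch, assuming an average Gregorian year of 365.2425 days; nothing here says which layout APFS actually uses on disk.

```python
SECONDS_PER_YEAR = 365.2425 * 24 * 3600  # average Gregorian year


def years_of_range(bits, ticks_per_second):
    """How many years a counter with `bits` usable bits lasts at a given tick rate."""
    return 2 ** bits / ticks_per_second / SECONDS_PER_YEAR


print("signed 32-bit seconds:       ", years_of_range(31, 1))
print("unsigned 32-bit seconds:     ", years_of_range(32, 1))
print("signed 64-bit seconds:       ", years_of_range(63, 1))
print("signed 64-bit nanoseconds:   ", years_of_range(63, 10 ** 9))
print("unsigned 64-bit nanoseconds: ", years_of_range(64, 10 ** 9))
```

The nanosecond rows come out at roughly 292 and 584 years, a touch above the 291 and 582 of the back-of-the-envelope estimate, which rounded 4.29 down to "just over 4x".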

> If they'd counted milliseconds since 1970, they would have exhausted int32_t in less than 36 minutes, and nanoseconds would have exhausted same in less than 2 seconds.

Milliseconds would have lasted 24 days. Microseconds would have been 36 mins.
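For the record, the arithmetic behind the correction, as a quick sketch (signed 32-bit counter, so 2^31 ticks):

```python
def int32_lifetime_seconds(ticks_per_second):
    """Seconds until a signed 32-bit tick counter overflows."""
    return 2 ** 31 / ticks_per_second


print("milliseconds:", int32_lifetime_seconds(10 ** 3) / 86400, "days")
print("microseconds:", int32_lifetime_seconds(10 ** 6) / 60, "minutes")
print("nanoseconds: ", int32_lifetime_seconds(10 ** 9), "seconds")
```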

You're right, of course. I wish I could say this was the first time I've confused milli~ with millionth.

On the bright side, your way can make you a millionaire!

Yes, exactly that. I had to follow a few links, but Ars Technica has a better story. To quote:

"It also massively increases the granularity of object time-stamping: APFS supports nanosecond time stamp granularity rather than the 1-second time stamp granularity in HFS+. Nanosecond timestamps are important in a modern file system because they help with atomicity—in a file system that keeps a record of writes, having nanosecond-level granularity is important in tracking the order of operations."

Full thing is at: https://arstechnica.com/apple/2016/06/digging-into-the-dev-d...

High precision timestamps allow you to more reliably determine if a file has changed. If a writer uses the rewrite-then-move approach for atomic updates, and the reader polls with stat() to detect a change, it's possible that the reader will fail to observe a change if the timestamps are low resolution.
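A small sketch of that failure mode. The helper names are made up; `resolution_ns` stands in for whatever granularity the filesystem stores (1 for nanoseconds, 1,000,000,000 for HFS+-style whole seconds).

```python
import os

NS_PER_SECOND = 1_000_000_000


def observed_mtime(path, resolution_ns=1):
    """The file's mtime as seen on a filesystem with the given timestamp resolution."""
    return os.stat(path).st_mtime_ns // resolution_ns * resolution_ns


def reader_sees_change(old_mtime_ns, new_mtime_ns, resolution_ns):
    """True if two writes remain distinguishable after both mtimes are truncated."""
    return old_mtime_ns // resolution_ns != new_mtime_ns // resolution_ns


first_write = 1_492_387_200 * NS_PER_SECOND
second_write = first_write + 250_000_000  # a quarter of a second later

print(reader_sees_change(first_write, second_write, NS_PER_SECOND))  # 1 s resolution
print(reader_sees_change(first_write, second_write, 1))              # 1 ns resolution
```

A polling reader that also compares `st_ino` and `st_size` would catch many of these cases even with coarse timestamps, since rewrite-then-move replaces the inode.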

Apparently it's important for tracking the order of operations. I don't know why a ledger couldn't be kept.


If it's anything like systems I'm familiar with, it is a ledger: if two things happen at exactly the same time, one happens a nanosecond later.
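In code, that trick is tiny. A sketch of the general technique, not a claim about how APFS does it: hand out timestamps that never repeat by bumping ties forward one nanosecond.

```python
import time


class MonotonicStamper:
    """Issue strictly increasing nanosecond timestamps."""

    def __init__(self, clock=time.time_ns):
        self._clock = clock
        self._last = 0

    def next(self):
        now = self._clock()
        # If the clock has not advanced, pretend one nanosecond has passed.
        self._last = max(now, self._last + 1)
        return self._last


stamper = MonotonicStamper(clock=lambda: 100)  # a clock stuck at 100 ns
print([stamper.next() for _ in range(3)])
```

The resulting order is the order in which operations were stamped, which is exactly what a ledger would record.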

A ledger is a separate write that slows down I/O and uses up disk space. Anyway, the question should be why a ledger wasn't used, not why it couldn't be. Of course it could, but it's not necessarily desirable.

Hah, you were too quick for me! Must type faster!

Well, for the MiFID II regulations coming out this year, you need to support at least microsecond precision for audit purposes.

And if you're doing microseconds, why not go for the full nanoseconds?

Worth noting that iOS HFS+ already had some non-standard features [hacks] that the Mac did not use, like encrypting files with different keys based on the data protection level, which APFS provides natively. So it makes sense to upgrade iOS first, since it can readily leverage such functionality, and this is not necessarily a direct result of iOS being the "vision for the future".

What is the release date for APFS on macOS?

> A director of the largest PC OS licensing company once told me they spent more money on driver development and fixes than on OS code proper.

Funny way to refer to Microsoft.

Isn't this true of GNU/Linux as well though? A lot of patches are drivers and fixes? For example, one could argue TRIM for solid state storage is a driver issue?

Of course it is. Drivers will be the majority of the work in any OS that is not locked onto a very limited set of hardware.

I think this is the case with both Windows and the Linux-based distros. It's a necessity when you need to support a wide array of hardware, but might be different with Apple's ecosystem.

16GB iPhone users care!

This article is so boring I considered killing myself

Can we please change the title? The current one sounds like an inflammatory anti-Apple article, and the only reason I even clicked it was in expectation of a good laugh.

It's a rule not to change the titles when posting, unless there's a very good reason to do so.

The current guideline is "please use the original title, unless it is misleading or linkbait."

I probably would have left the original title alone, personally, but you could make a case that it was misleading. HN moderators apparently agreed and have since changed it.
