Apple’s big test of data integrity (eclecticlight.co)
72 points by headalgorithm on May 30, 2023 | 41 comments


> In the event that they don’t match perfectly: “the startup process halts and the user is prompted to reinstall macOS”.

This feels weird. So they have a very low number of possible systems to support. The OS is immutable. Why would they go with a reinstall at that point instead of "we know which part failed, so we'll re-download that one block from Apple's servers"?

(I guess that one ties in with: so you've got a low number of systems and an immutable OS - why does it take 15min to apply an update)


Easier said than done.

Well, it means they have to keep track of every build, and keep every version of every OS live, and that's not something they really do. Especially in cases of security fixes, they'll take down an old version from the update servers.

As for the 15 min, they're actually testing fast patching right now, although it's currently limited to security updates (Rapid Security Response).


FWIW, I do think they might have changed things very recently--but only for some legal argument (maybe as a result of Corellium...) as opposed to any technical one--but they really did just leave every single variant of every single update on their servers... I even made a tool for jailbroken iOS devices that would get the original update files from Apple's servers and implemented exactly this idea: it checksummed every file against the copy listed in the update's bill of materials and downloaded only the files that were different (by carving them out of the remote zip file using HTTP Range requests).
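For anyone curious, a rough sketch of that delta-download idea (not the actual tool: the URL, manifest format, offsets and paths below are made up, and a real implementation would derive offsets from the zip's central directory and decompress the carved-out entries per their local headers):

    import hashlib
    import os
    import urllib.request

    UPDATE_URL = "https://example.com/update.zip"   # hypothetical update archive

    # Hypothetical bill of materials: path -> (expected sha256, offset in zip, stored length)
    manifest = {
        "usr/lib/libfoo.dylib": ("e3b0c44298fc1c14...", 1_048_576, 204_800),
    }

    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def fetch_range(url, offset, length):
        # Ask the server for only the bytes of this one entry.
        rng = f"bytes={offset}-{offset + length - 1}"
        req = urllib.request.Request(url, headers={"Range": rng})
        with urllib.request.urlopen(req) as resp:
            return resp.read()

    for path, (expected, offset, length) in manifest.items():
        local = sha256_of(path) if os.path.exists(path) else None
        if local != expected:
            data = fetch_range(UPDATE_URL, offset, length)
            print(f"{path}: mismatch, fetched {len(data)} bytes to repair it")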


I really hope you're not relying on that, because it's not every original variant ;) although probably most of them are.


> it means they have to keep track of every build, and keep every version of every OS live

That's not a lot of work. I would be extremely surprised if they were not already doing it. If someone reports an issue going back to a specific version, they need to be able to install that specific one.

I think you're overestimating the effort and underestimating Apple's resources here. Ubuntu, for example, keeps all released packages going back to 2004 online (and ISOs, even for the effectively dead PowerPC port). Microsoft's symbol server is likely also a far bigger project than simply preserving images of every version.

> and that's not something they really do

Are you actually speaking for Apple engineering here? The version not being advertised on the update servers does not mean it's not available.

Also, there's the question of supply chain audit. I doubt their security would accept not being able to say "this is/isn't what we shipped at the time".


> During the boot process, unless boot security has been downgraded from Full Security, the contents of that SSV are verified against its tree of hashes. In the event that they don’t match perfectly: “the startup process halts and the user is prompted to reinstall macOS”.

This is written a bit confusingly if you don't already know how it works (even though a later reply to a comment tries to clarify it), because it sounds like the disk is fully and instantly validated at every boot, and "the startup process halts" if there is a mismatch.

Instead, depending a little on the actual implementation, most hashes will very likely only be validated when the corresponding block is actually accessed (up the entire path to the root, where the higher level hashes may be validated now or may have been validated already; this again is an implementation detail).

The result is also that some blocks may never be validated during normal operation, simply because some files are never accessed, unless there is some explicit whole-disk validation at some point, e.g. during installation. Even then, any corruption that happens after that last "full disk check" will lie dormant until the block is accessed or the next full check runs.

But since this is a tree of hashes, it still provides all the security and integrity benefits to anything that does get accessed. What never gets accessed does not matter, by definition.
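A toy illustration of that access pattern (not Apple's implementation; it assumes a power-of-two block count and simply treats the stored root as the trusted "seal"): blocks are verified only when read, walking up the hash path and skipping any node already verified by an earlier read.

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    class VerifyingStore:
        def __init__(self, blocks):
            self.blocks = blocks
            self.levels = [[h(b) for b in blocks]]          # leaf hashes
            while len(self.levels[-1]) > 1:                 # interior levels up to the root
                prev = self.levels[-1]
                self.levels.append([h(prev[i] + prev[i + 1])
                                    for i in range(0, len(prev), 2)])
            self.verified = {(len(self.levels) - 1, 0)}     # root ("seal") trusted from mount

        def read(self, i: int) -> bytes:
            if h(self.blocks[i]) != self.levels[0][i]:
                raise IOError(f"block {i}: hash mismatch")  # corrupted or tampered data
            idx, level = i, 0
            while (level + 1, idx // 2) not in self.verified:
                parent = idx // 2
                left = self.levels[level][2 * parent]
                right = self.levels[level][2 * parent + 1]
                if h(left + right) != self.levels[level + 1][parent]:
                    raise IOError("hash-tree metadata mismatch")
                self.verified.add((level + 1, parent))
                idx, level = parent, level + 1
            return self.blocks[i]

    store = VerifyingStore([b"block-%d" % i for i in range(8)])
    store.blocks[3] = b"tampered"        # simulate later corruption
    print(store.read(0))                 # fine; verifies its path to the root once
    try:
        store.read(3)
    except IOError as e:
        print("read failed:", e)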


What's the failure mode if you try to access modified data post boot? Something akin to a kernel panic?


I don't know. I/O error and panic both make sense to me, but I'd say the latter only in specific situations. It seems unwarranted to me to panic the system when, say, a README.txt somewhere in the fs has a non-validating block.


From https://support.apple.com/sr-rs/guide/security/secd698747c9/...

>In case of mismatch, the system assumes the data has been tampered with and won’t return it to the requesting software.

Whether this then leads to a kernel panic later on probably depends on whether it's the kernel that needs the file or a userspace process.


Yeah, that's basically what I was getting at. For things like the root hash, it probably makes sense to panic (or otherwise halt boot) so you get an early, more descriptive failure, rather than only implicitly touching the root hash later, when validating the first accessed page of the init process (launchd, on macOS), and panicking with a less obvious I/O error reading launchd.

In most situations, you likely just want the regular I/O error and then whatever happens downstream of that.


Are there any distros that offer similar integrity checks? I understand Chromebooks provide similar assurances using dm-verity but what about more general-purpose distros?


Anything that uses ZFS gets it by default, as well as most configurations of Btrfs. Not secured by a TPM key (unless you go to great lengths), but for accidents it's plenty.


This contrasts with the ZFS team who said they started seeing disk errors as soon as they switched to ZFS.


Apple also controls the hardware, including their own storage controller, which may implement more ECC under the hood (and I'm 95% sure I remember that part of the reasoning for the lack of full data checksumming in APFS was "well, our hardware doesn't have those errors"). 9 GB also isn't actually that much, even across tens of millions of systems. ZFS, by contrast, is used with an absolutely massive variety of hardware, from expensive to cheap, with petabytes of data (or more at this point) in single installs. I've seen some real weirdness with APFS on non-Apple storage devices, and corruption crept into some of the data stored on my Macs at some point in the decade before I started using ZFS (images I know were fine in the '00s turned up with data errors much later; there is probably other affected data, but of course the whole problem is that I don't know and have no good automated way to find out).

This isn't dunking on Apple putting together a reliable product in their own walled garden using their own full array of vertically integrated options at all, but at the same time the ZFS team can be entirely correct given a very different problem space.


Apple system volumes are stored on the internal SSDs sold by Apple as part of the system. It's reasonable to believe that these SSDs are higher quality and more reliable than many disks used with Linux machines.


I run a decently large zfs setup (in the single digit petabyte range) and I have never actually had a spinning disk trigger a zfs checksum issue that wasn't shortly followed up by the disk failing completely in the next week or so. And this is multiple years of this pattern.

While it has to happen occasionally, it does seem to be a pretty rare event in my experience. Drives failing outright are way more likely, it seems.


Cool, the OpenBSD tedu, thanks for doas.. etc.

Also, is there a post about this zfs on apple silicon error thing?


SSV is more a verification technology than a data integrity one. It will detect bit flips, but it's really designed to provide an immutable root image for the OS, so you can wipe the data partition easily and have a clean, untampered OS image to boot from.


> So while none of us can rule out data corruption due to cosmic rays and similar causes, the chance of that happening appears extremely remote

I don't buy that.

- the process is opaque to us

- it doesn't seem to be failing at scale often enough to make the press

- therefore the chance of [random data corruption] is remote

versus, e.g., [in 2016](https://physicsworld.com/a/cosmic-challenge-protecting-super...), when a couple of UK PhDs laboriously charted 55,000 bit-flip errors on a 100-node cluster.


Isn't your link about RAM while this article is about storage?


Lol! I missed that. Thank you :)


Thank you for putting this all in perspective. While I am one of the more vocal users calling for Apple to implement user-data integrity checks, the numbers you’ve put forth in this article imply that the potential risks are indeed quite minimal.


Common in mobile and some of the IoT world. BlackBerry was doing this ages ago; it was why you never wanted to have to power-cycle them. Took ages to boot.


How do they checksum 9GB of data and still boot promptly?


Merkle trees! Simplified: you hash the data at the leaves of the tree, combine adjacent leaf hashes (by concatenation, for example) and hash those, and repeat until you're left with just the Merkle root, a single hash representing the entirety of your data. Verifying against the root is easier than loading all the data into memory and hashing it in one pass, since you can verify it piecemeal, without holding everything in memory at once.
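A minimal sketch of that construction (the chunk size and data here are arbitrary):

    import hashlib

    def merkle_root(chunks):
        level = [hashlib.sha256(c).digest() for c in chunks]      # leaf hashes
        while len(level) > 1:
            if len(level) % 2:                                     # duplicate the last node if odd
                level.append(level[-1])
            level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                     for i in range(0, len(level), 2)]
        return level[0]

    data = b"pretend this is a large system volume image".ljust(64, b".")
    root = merkle_root([data[i:i + 16] for i in range(0, len(data), 16)])
    print(root.hex())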


But regardless of the order in which you read, don't you ultimately still need to read every bit once?


Each level of the tree contains a hash of the hashes from the level below, so to verify the top hash you need only hash all its children (which are hashes themselves). Then you can explore a child, continuing recursively. At the bottom you find a hash of an actual file (or part of a file), then hash the file to check its validity.

This allows you to only hash those parts of the tree you actually want to read.
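The same property viewed bottom-up, as a sketch: a single chunk can be checked against a trusted root using only the sibling hashes along its path, i.e. log2(n) hashes in total. The function and its inputs here are illustrative, not any particular API.

    import hashlib

    def verify_leaf(chunk, index, sibling_hashes, trusted_root):
        # Recompute the path from this leaf to the root, one sibling per level.
        node = hashlib.sha256(chunk).digest()
        for sibling in sibling_hashes:
            if index % 2 == 0:                       # this node is a left child
                node = hashlib.sha256(node + sibling).digest()
            else:                                    # this node is a right child
                node = hashlib.sha256(sibling + node).digest()
            index //= 2
        return node == trusted_root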


If you want to verify the entire image, I don’t think you can get around reading the entire image. Because any part you didn’t read to verify the hash is a part that could contain corruption.


> Big Sur, Monterey or Ventura in default Full Security mode has verified every last bit of their 9 GB SSV.

In this case they're not selective, so the tree approach doesn't save the reads. (Just makes it easier to identify which part failed)


That's around 3s, for their slowest offering.


FTA: "This is the ingenious part: verification carries little overhead, as it runs as a rolling process once the top-level ‘seal’ has been verified. So verification is a continuous process as the SSV is mounted and accessed."


Which presumably means Merkle trees.


Don’t these machines have read speeds of like 5000 MB/s? So just read the 9 GB blockwise and apply a super simple hash. Should be done in two seconds.
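A quick sanity check of that estimate (the read speed is taken from the comment; the hash throughput is a rough assumption for a fast non-cryptographic hash):

    size_gb, read_gbps, hash_gbps = 9, 5.0, 10.0     # 5000 MB/s reads; assumed hash speed
    io_time = size_gb / read_gbps                    # ~1.8 s of sequential reads
    hash_time = size_gb / hash_gbps                  # ~0.9 s of hashing, overlappable with I/O
    print(f"~{max(io_time, hash_time):.1f} s if reads and hashing overlap")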


If I include the fs overhead, I can shasum a single 23GB file (likely not fragmented) in 46s.

That's uncached, so it's not that quick at boot; 9 GB would take ~19 s at that rate. Raw block reads would go faster, but not by an order of magnitude.


Single threaded or in parallel?

It's a Merkle tree, apparently, so parallelizing the checksum should be trivial.
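A sketch of what that could look like, assuming hashing rather than disk I/O is the bottleneck: hashlib releases the GIL for large buffers, so a thread pool gives a real speedup on the leaf hashes. The path below is hypothetical.

    import hashlib
    from concurrent.futures import ThreadPoolExecutor

    CHUNK = 8 * 1024 * 1024

    def chunks(path):
        with open(path, "rb") as f:
            for c in iter(lambda: f.read(CHUNK), b""):
                yield c

    def parallel_leaf_hashes(path, workers=4):
        # Note: executor.map submits everything up front; a real tool would bound
        # the number of in-flight chunks to cap memory use.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(lambda c: hashlib.sha256(c).digest(), chunks(path)))

    leaves = parallel_leaf_hashes("/path/to/9GB.image")   # feed these into a Merkle tree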


It's I/O bound, not CPU bound.


I'm also curious if Apple SoCs' storage controllers have some fancy error-checking, along with AES.


As far as I know, Apple has the NAND controller integrated into the SoC. The NAND controller needs to have ECC to keep the error rates on flash down to something acceptable.


That was sort of what I was wondering, if the NAND controller has some extra Apple Stuff(tm) beyond standard ECC.


The standard for SSDs goes far beyond the ECC used on DRAM. Every SSD controller designer will have their own proprietary scheme, often a simple and fast BCH that falls back on LDPC for higher error rates. The exact codeword sizes and other parameters of the codes are vendor-specific.


I'm pretty sure that is the reason why support for Fusion Drives was abruptly ganked and punted out by a year.



