Hash collision in Apple NeuralHash model (github.com/asuharietygvar)
1389 points by sohkamyung on Aug 18, 2021 | 696 comments

Ongoing related threads:

Apple defends anti-child abuse imagery tech after claims of ‘hash collisions’ - https://news.ycombinator.com/item?id=28225706

Convert Apple NeuralHash model for CSAM Detection to ONNX - https://news.ycombinator.com/item?id=28218391

Second preimage attacks are trivial because of how the algorithm works. The image goes through a neural network (one to which everyone has access), the output vector is put through a linear transformation, and that vector is binarized, then cryptographically hashed. It's trivial to perturb any image you might wish so as to be close to the original output vector. This will result in it having the same binarization, hence the same hash. I believe the neural network is a pretty conventional convolutional one, so adversarial perturbations will exist that are invisible to the naked eye.
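To make the "same binarization, hence the same hash" step concrete, here is a minimal sketch of the last two stages only. It assumes sign-based binarization and SHA-256 as the final hash; the vectors are made-up numbers standing in for the output of the linear transformation, and the function names are hypothetical, not taken from Apple's model.

```python
import hashlib

def binarize(vec):
    """Map each component to one bit by its sign (assumed binarization rule)."""
    return bytes(1 if x >= 0 else 0 for x in vec)

def neural_style_hash(vec):
    """Hash the bit pattern; SHA-256 is an assumption for illustration."""
    return hashlib.sha256(binarize(vec)).hexdigest()

# Made-up stand-in for the vector produced by the linear transformation.
original = [0.80, -1.30, 0.20, -0.05, 2.10, -0.70]

# A small perturbation: no component crosses zero, so every bit is unchanged.
nearby = [x + 0.01 for x in original]

# A larger shift: the third component (0.20) becomes negative and its bit flips.
far = [x - 0.5 for x in original]

print(neural_style_hash(original) == neural_style_hash(nearby))  # True
print(neural_style_hash(original) == neural_style_hash(far))     # False
```

The cryptographic hash at the end adds nothing against this kind of attack: it only sees the bits, and the bits are stable under any perturbation that keeps each component on the same side of zero.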

This is useful for two purposes I can think of. One, you can randomize all the vectors on all of your images. Two, you can make problems for others by giving them harmless-looking images that have been cooked to give particular hashes. I'm not sure how bad those problems would be – at some point a police officer does have to look at the image in order to get probable cause. Perhaps it could lead to your Apple account being suspended, however.

A police raid on a person's home, or even a gentler thorough search, can be enough to quite seriously disrupt a person's life. Certainly having the police walk away with all your electronics in evidence bags will complicate trying to work remotely.

Of course, this is assuming everything works as intended and they don't find anything else they can use to charge you with something as they search your home. If you smoke cannabis while being in the wrong state, you're now in several more kinds of trouble.

The police do not show up until a human has compared the matched image to the actual.

Just stop.

The more collisions, the more chances of a false positive by a (tired, underpaid) human. I don't envy the innocent person whose home gets raided by a SWAT team convinced they're busting a child sex trafficker.

The database is known to contain non-csam images like porn. I doubt the reviewer will be qualified to discern which is which.

Known by whom to contain legal images? The "reviewer" might well be a retired Pediatrician who can identify the age of the subject by looking at the photographs and will be extremely accurate.

Not that I doubt your motives, but can I get your source on this? It seems like a huge blunder if so.

This could happen with a perturbed image, but I doubt it. Apple will send the suspicious images to the relevant authorities. Those authorities will then look at the images. The chances are low that they will then seek a search, even though the images are innocent upon visual inspection. But maybe in some places a ping from Apple is good enough for a search and seizure.

FWIW, they won't send the images. Even in the pursuit of knocking back CSAM, there are strict restrictions on the transmission and viewing of CSAM - in some cases even the defendant's lawyers don't usually see the images themselves in preparation for a trial, just a description of the contents. Apple employees or contractors will likely not look at the images themselves, only visual hashes.

They will instead contact the police and say "Person X has Y images that are on list Z," and let the police get a warrant based off that information and execute it to check for actual CSAM.

On reflection, yes, there must be warrants involved. I'm raising my estimate of how likely it is that innocent people get raided due to this. The warrant would properly only be to search iCloud, not some guy's house, but I can easily see overly-broad warrants being issued.

> The warrant would properly only be to search iCloud,

iCloud is encrypted, so that warrant is useless.

They need to unlock and search the device.

The data is encrypted, but Apple has the keys. If they get a warrant, they'll decrypt your data and hand it over. See page 11 of Apple's law enforcement process guidelines[1]:

> iCloud content may include email, stored photos, documents, contacts, calendars, bookmarks, Safari Browsing History, Maps Search History, Messages and iOS device backups. iOS device backups may include photos and videos in the Camera Roll, device settings, app data, iMessage, Business Chat, SMS, and MMS messages and voicemail. All iCloud content data stored by Apple is encrypted at the location of the server. When third-party vendors are used to store data, Apple never gives them the encryption keys. Apple retains the encryption keys in its U.S. data centers. iCloud content, as it exists in the customer’s account, may be provided in response to a search warrant issued upon a showing of probable cause, or customer consent.

1. https://www.apple.com/legal/privacy/law-enforcement-guidelin...

Yes, it's encrypted, but part of this anti-CSAM strategy is a threshold encryption scheme that allows Apple to decrypt photos if a certain number of them have suspicious hashes.

Apple having any kind of ability to decrypt user contents is disconcerting. It means they can be subpoenaed for that information.

> It means they can be subpoenaed for that information.

They regularly are, and they regularly give up customer data in order to comply with subpoenas[1]. They give up customer data in response to government requests for 150,000 users/accounts a year[1].

[1] https://www.apple.com/legal/transparency/us.html

No, the threshold encryption only allows for the 30+ cryptographic “vouchers” to be unlocked, which contain details about the hash matching as well as a “visual derivative” of the image. We don’t know any details about the visual derivative.
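For anyone unfamiliar with threshold schemes, here is a minimal sketch of Shamir secret sharing, one standard way to get the "nothing opens until enough pieces exist" property. It is an illustration under stated assumptions, not Apple's implementation: the prime, the names `make_shares` and `recover`, and the toy secret are all made up, and `random` is seeded only so the example is reproducible (a real system would use a cryptographic RNG).

```python
import random

PRIME = 2**127 - 1  # a Mersenne prime; an arbitrary choice for this toy

def make_shares(secret, threshold, count, rng):
    """Split `secret` into `count` shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, count + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover(shares):
    """Lagrange interpolation at x = 0 over the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

rng = random.Random(2021)
secret = 123456789  # stands in for the key that protects the vouchers
shares = make_shares(secret, threshold=30, count=40, rng=rng)

print(recover(shares[:30]) == secret)  # True: threshold reached
print(recover(shares[:29]) == secret)  # False: one share short
```

In this picture each matching image would contribute one share, so below the threshold the server holds data it cannot interpret. Whether the real system attaches shares exactly this way is not something this sketch claims.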

I'm guessing the visual derivative is a difference of sorts between the image and the CSAM? Not sure, of course.

No, since Apple _definitely_ isn't sending CSAM photos to your phone so they can be diffed. Most likely, the visual derivative is a thumbnail or blurred version of the image.

It's "encrypted", but Apple holds the keys and they regularly give up customers' data in response to government requests for it[1].

[1] https://www.apple.com/legal/transparency/us.html

Didn't the FBI stop Apple from encrypting iCloud, and only Messenger is e2e?

> in some cases even the defendant's lawyers don't usually see the images themselves in preparation for a trial, just a description of the contents

Man that seems horrible. So you just have to trust the description is accurate? You’d think there’d at least be a “private viewing room” type thing (I get the obvious concern of not giving them a file to take home)

In broad strokes, I agree with you. I think you're absolutely correct that most trained, educated, technologically sophisticated law enforcement bodies will do exactly as you suggest and decide that there's not enough to investigate.

That said, I'm not willing to say it won't happen. There are too many law enforcement entities of wildly varying levels of professionalism, staffing, and technical sophistication. Someone innocent, somewhere, is likely to have a multi-year legal drama because their local PD got an email from Apple.

And we haven't even gotten to subjects like how some LEOs will happily plant evidence once they decide you're guilty.

Wouldn’t it also just be possible to turn a jailbroken iDevice into a CSAM cleaner/hider?

You could take actual CSAM, check if it matches the hashes and keep modifying the material until it doesn’t (adding borders, watermarking, changing dimensions etc.). Then just save it as usual without any risk.

I can’t tell what would prevent a jailbroken device from pairing an illegal CSAM image with the safety voucher from a known-innocuous image, since the whole system depends on the trustworthiness of the safety voucher and that it truly represents the image in question. If the device is compromised I would think all bets would be off. To me, one potential flaw of this system may be that Apple inherently trusts the device. The existence of a jailbreak seems like a massive risk to the system.

In fact, Apple themselves generate fake/meaningless safety vouchers a certain percentage of the time (see synthetic safety vouchers.) If a jailbroken phone could trigger that code path for all images in the pipeline, apple’s system would be completely broken.

On the other hand, this may be just the excuse apple needs to lock down the phone further, to “protect the integrity of the CSAM detection system.” Perhaps they could persuade congress to make jailbreaking a federal crime. Perhaps they’re more clever than I ever imagined. Or perhaps they can fend off alternate App Store talk for sake of protecting the integrity of the system. Or perhaps staying up too late makes me excessively conspiratorial.

No it’s not. The client hashes against a blinded set, and doesn’t know whether or not the hash is a hit or miss.

Neuralhashes are far from my area of expertise, but I've been following Apple closely ever since its foundation and have probably watched every public video of Craig since the NeXT take over and here is my take: I've never seen him so off balance before as in his latest interview with Joanna Stern. Not even in the infamous “shaking mouse hand close up” of the early days.

Whatever you say about Apple, they are an extremely well oiled communication machine. Every C-level phrase has a well thought out message to deliver.

This interview was a train wreck. Joanna kept asking a hesitant and inarticulate Craig to explain it "please, in simple terms." It was so bad that she had to produce infographics to fill the communication void left by Apple.

They usually do their best to “take control” of the narrative. They were clearly caught way off guard here. And that's revealing.

Can you link the interview please?

I assume it is this one: https://www.youtube.com/watch?v=OQUO1DSwYN0

This was painful to watch.

That's the one, yes. Thank you.

I think they clearly didn't anticipate that people would perceive it as anything but a breach of trust, that their device was working against them (even for a good cause, against the worst people).

And because of this they calibrated their communication completely wrong, focusing on the on device part as being more private. Using the same line of thinking they use for putting Siri on device.

And the follow up was an uncoordinated mess that didn't help either (as you rightly pointed out with Craig's interview). In the Neuenschwander interview [1], he stated this :

> The hash list is built into the operating system, we have one global operating system and don’t have the ability to target updates to individual users and so hash lists will be shared by all users when the system is enabled.

This still has me confused. Here's my understanding so far (please feel free to correct me):

- Apple is shipping a neural network trained on the dataset that generates NeuralHashes

- Apple also ships (where?) a "blinded" (by an elliptic curve algorithm) lookup table that matches (all possible?!) NeuralHashes to a key

- This key is used to encrypt the NeuralHash and the derivative image (that would be used by the manual review) and this bundle is called the voucher

- A final check is done on server using the secret used to generate the elliptic curve to reverse the NeuralHash and check it server side against the known database

- If 30 or more are detected, decrypt all vouchers and send the derivative images to manual review.

I think I'm missing something regarding the blinded table as I don't see what it brings to the table in that scenario, apart from adding a complex key generation for the vouchers. If that table only contained the NeuralHashes of known CSAM images as keys, that would be as good as giving the list to people knowing the model is easily extracted. And if it's not a table lookup but just a cryptographic function, I don't see where the blinded table is coming from in Apple's documentation [2].
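One way to see what a blinded table could bring is a toy Diffie-Hellman-style sketch. Everything here is an assumption for illustration: it uses integers modulo a prime instead of an elliptic curve, the names (`to_group`, `position`, `build_table`, `make_voucher`, `server_opens`) and the table size are hypothetical, it ignores position collisions, and a tag comparison stands in for "the server can decrypt the voucher". It is not Apple's protocol, only a sketch of the idea that the client derives a key from a table entry it cannot interpret.

```python
import hashlib
import random

P = 2**127 - 1      # toy prime modulus; a real design would use an elliptic curve
TABLE_SIZE = 64     # hypothetical table size

def to_group(h: bytes) -> int:
    """Map a hash to a nonzero element mod P."""
    return 1 + int.from_bytes(hashlib.sha256(b"grp" + h).digest(), "big") % (P - 1)

def position(h: bytes) -> int:
    """Table slot for a hash; the client can compute this for any image."""
    return int.from_bytes(hashlib.sha256(b"pos" + h).digest(), "big") % TABLE_SIZE

def tag(element: int) -> str:
    """Stand-in for a key derived from a group element."""
    return hashlib.sha256(element.to_bytes(16, "big")).hexdigest()

def build_table(db_hashes, alpha, rng):
    """Server side: every slot looks random; real entries are H(h)^alpha."""
    table = [rng.randrange(1, P) for _ in range(TABLE_SIZE)]  # filler entries
    for h in db_hashes:
        table[position(h)] = pow(to_group(h), alpha, P)
    return table

def make_voucher(image_hash, table, rng):
    """Client side: derive a key from the slot without learning if it is a hit."""
    r = rng.randrange(1, P - 1)
    q = pow(to_group(image_hash), r, P)
    key_tag = tag(pow(table[position(image_hash)], r, P))
    return q, key_tag

def server_opens(voucher, alpha):
    """Server side: the derived key agrees only if the slot held H(image_hash)^alpha."""
    q, key_tag = voucher
    return tag(pow(q, alpha, P)) == key_tag

rng = random.Random(7)
alpha = rng.randrange(1, P - 1)                 # server secret
table = build_table([b"known-hash"], alpha, rng)

print(server_opens(make_voucher(b"known-hash", table, rng), alpha))     # True
print(server_opens(make_voucher(b"holiday-photo", table, rng), alpha))  # False
```

In this toy the table reveals nothing usable to the client: without `alpha`, a real entry and a filler entry are both just large numbers, so shipping the table is not the same as shipping the hash list, and the client cannot tell a hit from a miss.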

Assuming the above is correct, I'm paradoxically feeling a tiny bit better about that system on a technical level (I still think doing anything client side is a very bad precedent), but what a mess they got themselves into.

Had they done this purely server side (and to be frank there's not much difference, the significant part seems to be done server side) this would have been a complete non-event.

[1] : https://daringfireball.net/linked/2021/08/11/panzarino-neuen...

[2] This is my understanding based on the repository and what's written page 6-7 : https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...

Thanks for the description.

That's a *huge* amount of crypto mumbo-jumbo for a system to scan your data on your own device and send it to the authorities.

They must really care about children!!

If only this system was in place while Trump, Jeffrey Epstein, and Prince Andrew were raping children, surely none of that would have happened!! /s

How can you use it for targeted attacks?

This is what would need to happen:

1. Attacker generates images that collide with known CSAM material in the database (the NeuralHashes of which, unless I'm mistaken, are not available)

2. Attacker sends that to innocent person

3. Innocent person accepts and stores the picture

4. Actually, need to run steps 1-3 at least 30 times

5. Innocent person has iCloud syncing enabled

6. Apple's CSAM detection then flags these, and they're manually reviewed

7. Apple reviewer confuses a featureless blob of gray with CSAM material, several times

Note that other cloud providers have been scanning uploaded photos for years. What has changed wrt targeted attacks against innocent people?

"How can you use it for targeted attacks?"

Just insert a known CSAM image on target's device. Done.

I presume this could be used against a rival political party to ruin their reputation - insert a bunch of CSAM images on their devices. "Party X is revealed as an abuse ring". This goes oh-so-very-nicely with Qanon conspiracy theories, which don't even require any evidence to propagate widely.

Wait for Apple to find the images. When police investigation is opened, make it very public. Start a social media campaign at the same time.

It's enough to fabricate evidence only for a while - the public perception of the individual or the group will be perpetually altered, even if it surfaces later that the CSAM material was inserted by a hostile third party.

You have to think about what nation state entities that are now clients of Pegasus and so on could do with this. Not how safe the individual component is.

FL Rep Randy Fine filed a report with the Florida Department of Law Enforcement that the sheriff was going to plant CSAM on his computer and arrest him for it.

They are even in the same political party.


Indeed this is not new and has probably been happening for many years already. There are services advertising on dark net markets to “ruin someone’s life” which means you pay some Ukrainian guy $300 in Bitcoin and he plants CSAM on a target’s computer.

> Just insert a known CSAM image on target's device.

Or maybe thirty. You have to surpass the threshold.

Also, if Twitter, Google, Microsoft are already deploying CSAM scanning in their services .... why are we not hearing about all the "swatting"?

Their implementations are not on-device and thus it's actually significantly more difficult to reverse engineer. Apple's unique implementation of on-device scanning is much easier to reverse engineer and thus exploit.

Now Apple is in this crappy situation where they can't claim their software is secure because it's open source and auditable, but they also can't claim it's secure because it's closed source and they fixed the problems in some later version, because this entire debacle has likely destroyed all faith in their competence. If Apple is in the position of having to boast "Trust us bro, your iPhone won't be exploited to get you SWATTED over CSAM anymore, we patched it," the big question is: why is Apple voluntarily adding something to their devices where the failure mode is violent imprisonment and severe loss of reputation when they are not completely competent?

This entire debacle reminds me of this video: https://www.youtube.com/watch?v=tVq1wgIN62E

>Also, if Twitter, Google, Microsoft are already deploying CSAM scanning in their services .... why are we not hearing about all the "swatting"?

>their services

>T H E I R S E R V I C E S

Because it's on their SERVICES, not on their user's DEVICES, for one.

Also, regardless of swatting, that's why we have an issue with Apple.

It only happens on Apple devices right before the content is uploaded to the service.

How is that a meaningful difference for the stated end goals, one that can explain the lack of precedent?

I think this is where a disconnect is occurring.

In this specific case yes. That is what is supposed to happen.

But Apple also sets the standard that this is just the beginning, not the end. They say as much on page 3 in bold, differentiated color ink


And there’s nothing to stop them from scanning all images on a device. Or scanning all content for keywords or whatever. iCloud being used as a qualifier is a red herring to what this change is capable of.

Maybe someone shooting guns is now unacceptable, kids have been kicked from schools for posting them on Facebook or having them in their rooms on zoom. What if it’s kids shooting guns? There are so many possibilities of how this could be misused, abused or even just an oopsie, sorry I upended your life to solve a problem that is so very rare.

Add to that their messaging has been muddy at best. And it incited a flame war. A big part of that is iCloud is not a single thing. It’s a service, it can sync and hold iMessages, it can sync backups, or in my case we have shared iCloud albums that we use to share images with family. Others are free to upload and share. In fact that’s our only use of iCloud other than find my. They say iCloud photos as if that’s just a single thing but it’s easy to extrapolate that to images in iMessages, backups etc.

And the non profit that hosts this database is not publicly accountable. They have public employees on their payroll but really they can put whatever they want in that database. They have no accountability or public disclosure requirements.

So even I, when their main page was like 3 articles, was a bit perturbed and put off. I’m not going to ditch my iPhone, mainly because it’s work assigned, but I have been keeping a keen eye on what’s happening, how it’s happening and will keep an eye out for the changes they are promising. I’m also going to guess they won’t nearly be as high profile in the future.

>And there’s nothing to stop them from scanning all images on a device.

All images on your device have been scanned for years by ML models to detect all sorts of things and make your photo library searchable, regardless of whether you use an Android or Apple device. That's how you can go and search "dog", "revolver", "wife", etc and get relevant photos popping up.

I don't think this is accurate. I don't use the Google Photos cloud service, and searching in the Photos app on my Android phone returns zero results for any search term.

My impression was Google had been doing it for ages.

Ex. This article from 2013 where they talk about searching for [my photos of flowers] https://search.googleblog.com/2013/05/finding-your-photos-mo...

I'm an iOS guy, and don't have an Android device to confirm it. I've got a few photos visible on photos.google.com and they're able to detect "beard", at least. Which, to be fair, it's just a few selfies.

iOS does this pretty well. I searched my phone and it was able to recognize and classify a gun as a revolver from a meme I'd saved years ago. That's not this CSAM technology, just something they've been doing for years with ML.

Right, the Google Photos service does this, but it's a cloud service, it's not on-device.

I looked on our iPhones. And it's possible I am an edge case. With the restrictions I have on Siri, iCloud (the only use for iCloud is some shared albums with family, in lieu of Facebook, and find my) etc., my phone doesn't categorize photos by person or do those montages others routinely get within the photos app.

And the only reason I know about them is because my wife asked about them and why our iPhones don't do them.

But we don't put stuff on Facebook. Our photos are backed up to our NAS. Phones backup to a macmini only. Siri and search are basically disabled as much as possible (we have to somewhat enable it for carplay) but definitely no voice or anything.

> Because it's on their SERVICES, not on their user's DEVICES, for one.

Effectively the same for Apple. It’s only when uploading the photo. Doing it on device means the server side gets less information.

A server-side action is triggering a local action. The action is still local.

Who cares that Bad Company XYZ, already well known for not caring about customer privacy, does it? Wouldn't you want to push back against even more increasing surveillance? Apple was beating the drum of privacy while it was convenient, wouldn't you want to hold their feet to the fire now that they seemed to do a U-turn?

Their point is that the attack vector being described isn’t new, as CSAM could already be weaponized against folks, and we never really hear of that happening. So the OP is simply saying that perhaps it’s not an issue we need to worry about. I happen to agree with them.

So in your mind, because so far we've seen no evidence that this has been abused, it's nothing to worry about going forward? And that making an existing situation even more widespread is also completely OK?

> So in your mind, because so far we've seen no evidence that this has been abused, it's nothing to worry about going forward?

Yeah, basically. It doesn't seem like people actually use CSAM to screw over innocent folks, so I don't think we need to worry about it. What Apple is doing doesn't really make that any easier, so it's either already a problem, or not a problem.

> And that making an existing situation even more widespread is also completely OK?

I don't know if I'd say any of this is "completely OK", as I don't think I've fully formed my opinion on this whole Apple CSAM debate, but I at least agree with OP that I don't think we need to suddenly worry about people weaponizing CSAM all of a sudden when it's been an option for years now with no real stories of anyone actually being victimized.

And if it is a problem, not doing this doesn't resolve the problem.

If it were to become a problem in the future, it could become a problem regardless of whether or not the scanning is done at the time of upload on device or at the time of upload on server.

It’s that the situation can’t be a “slippery slope” if there’s no evidence of there being one prior

> I presume this could be used against a rival political party to ruin their reputation - insert bunch of CSAM images on their devices.

Okay, and then what? You think people will just look at this cute picture of a dog and be like "welp, the computer says it's a photo of child abuse, so we're taking you to jail anyway"?

You don't have to add a real picture, just add the hash of a benign, (new) cat picture to the db, then put the cat picture on the phone, then release a statement saying the person has popped up on your list. By the time the truth comes out the damage is done.

> "How can you use it for targeted attacks?" > Just insert a known CSAM image on target's device. Done.

Yes, but then the hash collision (topic of this article) is irrelevant.

> Just insert a known CSAM image on target's device. Done.

You don't even need to go that far. You just need to generate 31 false positive images and send them to an innocent user.

But how will you do that without access to the non-blinded hash table? Then you need access to that first.

Also, “sending” them to a user isn’t enough; they need to be stored in the photo library, and iCloud Photo Library needs to be enabled.

How is this different from server side scanning, the policy du jour that Apple was trying to move away from?

> Just insert a known CSAM image on target's device. Done.

What do you mean “just”? That’s not usually very simple. It needs to go into the actual photo library. Also, you need like 30 of them inserted.

> I presume this could be used against a rival political party

Yes, but it’s not much different from now, since most cloud photo providers scan for this cloud-side. So that’s more an argument against scanning altogether.

WhatsApp for example automatically stores images you receive in your Photos library, so that removes a step, and those will thus be automatically uploaded to iCloud.

The one failsafe would be Apple's manual reviewers, but we haven't heard much about that process yet.

> It needs to go into the actual photo library.

iMessage photos received are automatically synced so no. Finding 30 photos takes zero time at all on Tor. Hell, finding a .onion site that doesn't have CP randomly spammed is harder.....

iMessage photos do not automatically add photos to your photo library. Yes they’re synced between devices but afaik Apple isn’t deploying this hashing technology on iMessages between devices. Only for iCloud photos in the photo library.

Cross-posting from another thread [1]:

1. Obtain known CSAM that is likely in the database and generate its NeuralHash.

2. Use an image-scaling attack [2] together with adversarial collisions to generate a perturbed image such that its NeuralHash is in the database and its image derivative looks like CSAM.

A difference compared to server-side CSAM detection could be that they verify the entire image, and not just the image derivative, before notifying the authorities.

[1] https://news.ycombinator.com/item?id=28218922

[2] https://bdtechtalks.com/2020/08/03/machine-learning-adversar...

Right. So, sending actual CSAM would also work as an attack, but would be detected by the victim and could be corrected (delete images).

But a conceivable novel avenue of attack would be to find an image that:

1. Does not look like CSAM to the innocent victim in the original

2. Does match known CSAM by NeuralHash

3. Does look like CSAM in the "visual derivative" reviewed by Apple, as you highlight.

Reading the image scaling attack article, it looks like it’s pretty easy to manufacture an image that:

1. Looks like an innocuous image, indeed even an image the victim is expecting to receive.

2. Downscales in such a way to produce a CSAM match.

3. Downscales for the derivative image to create actual CSAM for the review process.

Which is a pretty scary attack vector.
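The reason a downscaled image can differ so much from the original comes down to which pixels the scaler reads. A minimal sketch, assuming a plain nearest-neighbour downscaler that keeps every k-th pixel; the tiny 4x4 grayscale grid and the function names are made up for illustration, and nothing here reflects how Apple actually produces its visual derivative.

```python
def downscale_nearest(img, k):
    """Keep every k-th pixel in each direction and ignore everything else."""
    return [row[::k] for row in img[::k]]

def downscale_area(img, k):
    """Average each k-by-k block, so every source pixel contributes."""
    out = []
    for r in range(0, len(img), k):
        out_row = []
        for c in range(0, len(img[0]), k):
            block = [img[r + dr][c + dc] for dr in range(k) for dc in range(k)]
            out_row.append(sum(block) // len(block))
        out.append(out_row)
    return out

# A mostly bright 4x4 image with dark pixels placed exactly where a
# factor-2 nearest-neighbour scaler samples.
full = [[200] * 4 for _ in range(4)]
for r in range(0, 4, 2):
    for c in range(0, 4, 2):
        full[r][c] = 0

print(downscale_nearest(full, 2))  # [[0, 0], [0, 0]]
print(downscale_area(full, 2))     # [[150, 150], [150, 150]]
```

With nearest-neighbour sampling, a quarter of the pixels fully determine the small image, while the remaining three quarters determine what a person sees at full size. An area-averaging scaler is much harder to fool this way, which is why the choice of scaler matters for this attack.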

Where does it say anything that indicates #1 and #3 are both possible?

Depends very much on the process Apple uses to make the "visual derivative", though. Also, defence by producing the original innocuous image (and showing that it triggers both parts of Apple's process, NeuralHash and human review of the visual derivative) should be possible, though a lot of damage might've been done by then.

> Also, defence by producing the original innocuous image

At this point you’re already inside the guts of the justice system, and have been accused of distributing CSAM. Indeed depending on how diligent the prosecutor is, you might need to wait till trial before you can defend yourself.

At that point your life as you know it is already fucked. The only thing proving your innocence (and the need to do so is itself a complete miscarriage of justice) will save you from is a prison sentence.

And now you will be accused of trying to hide illegal material in innocuous images.

This isn’t true at all.

If the creation of fakes is as easy as claimed, Neuralhash evidence alone will become inadmissible.

There are plenty of lawyers and money waiting to establish this.

> This isn’t true at all.

> If the creation of fakes is as easy as claimed, Neuralhash evidence alone will become inadmissible.

Okay. https://github.com/anishathalye/neural-hash-collider

Uh? So his if statement is true?

Please read what is written right before that... You are taking something out of context.

Why do you keep posting links to this collider as though it means something?

As has been already pointed out the system is designed to handle attacks like this.

Here is the relevant paragraph from Apple’s documentation:

“as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.”
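To illustrate why a second, independent hash helps, here is a toy sketch with two simple perceptual hashes. Neither is NeuralHash nor Apple's undisclosed second hash: an average hash and a difference hash are used only as stand-ins, and the two tiny grayscale grids are made-up numbers chosen so that they collide under the first hash but not the second.

```python
def average_hash(img):
    """One bit per pixel: is it brighter than the image mean?"""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)

def difference_hash(img):
    """One bit per horizontal neighbour pair: is the left pixel brighter?"""
    return tuple(1 if row[i] > row[i + 1] else 0
                 for row in img for i in range(len(row) - 1))

def both_match(candidate, reference):
    """Count a match only when both independent hashes agree."""
    return (average_hash(candidate) == average_hash(reference)
            and difference_hash(candidate) == difference_hash(reference))

reference = [[10, 50, 200, 220],
             [20, 60, 210, 230]]

# Same pixels with neighbours swapped: identical average hash,
# different difference hash.
crafted = [[50, 10, 220, 200],
           [60, 20, 230, 210]]

print(average_hash(crafted) == average_hash(reference))        # True
print(difference_hash(crafted) == difference_hash(reference))  # False
print(both_match(crafted, reference))                          # False
```

An image tuned to collide under one hash gets no guarantee under the other, so an attacker would have to defeat both at once, and the second one is not shipped to the device for them to experiment against.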


> So, sending actual CSAM would also work as an attack, but would be detected by the victim and could be corrected (delete images).

What if they are placed on the iDevice covertly? Say you want to remove politician X from office. If you've got the money or influence you could use a tool like Pegasus (or whatever else there is out there that we don't know of) to place actual CSAM images on their iDevice, preferably with an older timestamp so that it doesn't appear as the newest image on their timeline. iCloud notices unsynced images and syncs them while performing the CSAM check, it comes back positive with human review (because it was actual CSAM) and voilà, X has the FBI knocking on their door. Even if X can somehow later prove innocence, by this time they'll likely have been removed from office over the allegations.

Thinking about it now it's probably even easier: Messaging apps like WhatsApp allow you to save received images directly to camera roll which then auto-syncs with iCloud (if enabled). So you can just blast 30+ (or whatever the requirement was) CSAM images to your victim while they are asleep and by the time they check their phone in the morning the images will already have been processed and an investigation started.

If you are placing images covertly, you can just use real CSAM or other compromat.

> but would be detected by the victim and could be corrected (delete images).

I doubt deleting them (assuming the victim sees them) works once the image has been scanned. And, given that this probably comes with a sufficient smear campaign, deleting them will be portrayed as evidence of guilt.

Why would someone do that? Why not just send the original if both are flagged as the original?

The victim needs to store the image in their iCloud, so it needs to not look like CSAM to them.

Because having actual CSAM images is illegal.

Doesn’t that make step 1 more dangerous for the attacker than the intended victim? And following this through to its logical conclusion; the intended victim would have images that upon manual review by law enforcement would be found to be not CSAM.

> 7. Apple reviewer....

This part IMO makes Apple itself the most likely "target", but for a different kind of attack.

Just wait until someone who wasn't supposed to, somewhere, somehow gets their hands on some of the actual hashes (IMO bound to happen eventually). Also remember that with Apple, we now have an oracle that can tell us. And with all the media attention around the issue, this might further incentivize people to try.

From that I can picture a chain of events something like this:

1. Somebody writes a script that generates pre-image collisions like in the post, but for actual hashes Apple uses.

2. The script ends up on the Internet. News reporting picks it up and it spreads around a little. This also means trolls get their hands on it.

3. Tons of colliding images are created by people all over the planet and sent around to even more people. Not for targeted attacks, but simply for the lulz.

4. Newer scripts show up eventually, e.g. for perturbing existing images or similar stunts. More news reporting follows, accelerating the effect and possibly also spreading perturbed images around themselves. Perturbed images (cat pictures, animated gifs, etc...) get uploaded to places like 9gag, reaching large audiences.

5. Repeat steps 1-4 until the Internet and the news grow bored with it.

During that entire process, potentially each of those images that ends up on an iDevice will have to be manually reviewed...

Do you think Apple might perhaps halt the system if the script gets wide publication?

I've only seen Apple admit defeat once, and that was regarding the trashcan MacPro. Otherwise, it's "you're holding it wrong" type of victim blaming as they quietly revise the issue on the next version.

Can anyone else think of times where Apple has admitted to something bad on their end and then reversed/walked away from whatever it was?

The Apple AirPower mat comes to mind although there are rumors they haven't abandoned the effort completely. Butterfly keyboard seems to finally be acknowledged as a bad idea and took several years to get there.

The next Macbook refresh will be interesting as there are rumors they are bringing back several I/O ports that were removed when switching to all USB-C.

I agree with your overall point, just some things that came to mind when reading your question.

ah yes, the butterfly keyboard. i must have blocked that from my mind after the horror it was. although, they didn't admit anything on that one. that was just another "you're holding it wrong" silent revision that was then touted as a new feature (rather than oops we fucked up).

The trashcan MacPro is still the only mea culpa I am aware of them actually owning the mistake.

The Airpower whatever was never really released as a product though, so it is a strange category. New question, is the Airpower whatever the only product officially announced on the big stage to never be released?

Do you really care what they "admit"? I thought you were worried about innocent people being framed. Obviously if a way to frame people gets widespread, Apple will stop it. They don't want that publicity.

You clearly have me confused with someone else, as I never mentioned anything about innocent people being framed.

With Apple, nothing is "obvious".

The comment above that I responded to seemed to talk about that. But in any case, I for one don't care what Apple admits.

But I am certain they will not want all the bad publicity that would come if the system was widely abused, if you worry about that. That much is actually "obvious", they are not stupid.

> 7. Apple reviewer confuses a featureless blob of gray with CSAM material, several times

A better collision won't be a grey blob, it'll take some photoshopped and downscaled picture of a kid and massage the least significant bits until it is a collision.


So the person would have to accept and save an image that looks enough like CSAM to confuse a reviewer…

Yes, the diligent review performed by the lowest-bidding subcontractor is an excellent defense against career-ending criminal accusations. Nothing can go wrong, this is fine.

I would think there are way easier ways to frame someone with CSAM than this. Like dump a thumbdrive of the stuff on them and report them to the police.

The police will not investigate every hint and a thumbdrive still has some plausible deniability. Evidence on your phone looks far worse and, thanks to this new process, law enforcement will receive actual evidence instead of just a hint.

My WhatsApp automatically saves all images to my photo roll. It has to be explicitly turned off. When the default is on, it's enough that the image is received and the victim has CP on their phone. After the initial shock they delete it, but the image has already been sent to Apple, where a reviewer marked it as CP. Since the user already gave them their full address data in order to be able to use the app store, Apple can automatically send a report to the police.

> [...] Apple can automatically send a report to the police.

Just to clarify, Apple doesn't report anyone to the police. They report to NCMEC, who presumably contacts law enforcement.

FBI agents work for NCMEC. NCMEC is created by legislation. They ARE law enforcement, disguised as a non-profit.

> but the image has already been sent to Apple, where a reviewer marked it as CP

No, the images are only decryptable after a threshold (which appears to be about 30) is breached. If you've received 30 pieces of CSAM from WhatsApp contacts without blocking them and/or stopping WhatsApp from automatically saving to iCloud, I gotta say, it's on you at that point.

Just a side point, a single WhatsApp message can contain up to 30 images. 30 is the literal max of a single message. So ONE MESSAGE could theoretically contain enough images to trip this threshold.

> a single WhatsApp message can contain up to 30 images

A fair point, yes, and somewhat scuppering towards my argument.

You’re aware that people sleep at night, and phones for the most part don’t, right?

Victim blaming because of failure to meet some seemingly arbitrary limit, ok.

Not necessarily.

If you know the method used by Apple to scale down flagged images before they are sent for review, you can make it so the scaled down version of the image shows a different, potentially misleading one instead:


At the end of the day:

- You can trick the user into saving an innocent looking image

- You can trick Apple NN hashing function with a purposely generated hash

- You can trick the reviewer with an explicit thumbnail

There is no limit to how devilish one can be.

The reviewer may not be looking at the original image. But rather the visual derivative created during the hashing process and sent as part of the safety voucher.

In this scenario you could create an image that looks like anything, but where its visual derivative is CSAM material.

Currently iCloud isn’t encrypted, so Apple could just look at the original image. But in future, if iCloud becomes encrypted, then the reporting will be done entirely based on the visual derivative.

Although Apple could change this by including a unique crypto key for each uploaded image within their inner safety voucher, allowing them to decrypt images that match for the review process.

Depending on what algorithm apple uses to generate the "sample" that's shown to the reviewer it may be possible to generate a large image that looks innocent unless downscaled with that specific algorithm and to a specific resolution

So here's something I find interesting about this whole discussion: Everyone seems to assume the reviewers are honest actors.

It occurs to me that compromising an already-hired reviewer (either through blackmail or bribery) or even just planting your own insider on the review team might not be that difficult.

In fact, if your threat model includes nation-state adversaries, it seems crazy not to consider compromised reviewers. How hard would it really be for the CIA or NSA to get a few of their (under cover) people on the review team?

I don't see how a perfectly legal and normal explicit photograph of someone's 20-year-old wife would be indistinguishable to an Apple reviewer from CSAM, especially since some people look much younger or much older than their chronological age. So first, there would be the horrendous breach of privacy for an Apple goon to be looking at this picture in the first place, which the person in the photograph never consented to, and second, could put the couple in legal hot water for absolutely no reason.

The personal photo is unlikely to match a photo in the CSAM database though, or at least that's what is claimed by Apple with no way to verify if it's true or not.

Not one image. Ten, or maybe 50, who knows what the threshold is.


I would suggest not clicking this link on a work device.

Remains to be shown whether that is possible, though.

Just yesterday, here on HN there was an article [1] about adversarial attacks that could make road signs get misread by ML recognition systems

I'd be astonished if it wasn't possible to do the same thing here.

[1] https://news.ycombinator.com/item?id=28204077

But the remarkable thing there (and with all other adversarial attacks I've seen) is that the ML classifier is fooled, while for us humans it is obvious that it is still the original image (if maybe slightly perturbed).

But in the case of Apple's CSAM detection, the collision would first have to fool the victim into seeing an innocent picture and storing it (presumably, they would not accept and store actual CSAM [^]), then fool the NeuralHash into thinking it was CSAM (ok, maybe possible, though classifiers <> perceptual hash), then fool the human reviewer into also seeing CSAM (unlike the innocent victim).

[^] If the premise is that the "innocent victim" would accept CSAM, then you might as well just send CSAM as an unscrupulous attacker.

Hmm, not quite:

step 1 - As others have pointed out, there are plenty of ways of getting an image onto someone's phone without their explicit permission. WhatsApp (and I believe Messenger) do this by default; if someone sends you an image, it goes onto your phone and gets uploaded to iCloud.

step 2 - TFA proves that hash collision works, and fooling perceptual algorithms is already a known thing. This whole automatic screening process is known to be vulnerable already.

step 3 - Humans are harder to fool, but tech giants are not great at scaling human intervention; their tendency is to only use humans for exceptions because humans are expensive and unreliable. This is going to be a lowest-cost-bidder developing-country thing where the screeners are targeted on screening X images per hour, for a value of X that allows very little diligence. And the consequences of a false positive are probably going to be minimal - the screeners will be monitored for individual positive/negative rates, but that's about it. We've seen how this plays out for YouTube copyright claims, Google account cancellations, App store delistings, etc.

People's lives are going to be ruined because of this tech. I understand that children's lives are already being ruined because of abuse, but I don't see that this tech is going to reduce that problem. If anything it will increase it (because new pictures of child abuse won't be on the hash database).

> then fool the human reviewer into also seeing CSAM (unlike the innocent victim).

Or just blackmail and/or bribe the reviewers. Presumably you could add some sort of 'watermark' that would be obvious to compromised reviewers. "There's $1000 in it for you if you click 'yes' any time you see this watermark. Be a shame if something happened to your mum."

Yes but the reviewers are not going to be viewing the original image, they are going to be viewing a 100x100 greyscale.

>If the premise is that the "innocent victim" would accept CSAM, then you might as well just send CSAM as an unscrupulous attacker.

This adds trojan horses embedded in .jpg files as an attack vector, which while maybe not overly practical, I could certainly imagine some malicious troll uploading "CSAM" to some pornsite.

NN classifiers work differently than perceptual hashes and the mechanism to do this sort of attack is entirely different, though they seem superficially similar.

Unfortunately, it is very likely to be possible. Adversarial ML is extremely effective. I won't be surprised if this is achieved within the day, if not sooner tbh.

It's been done for image classification.

the issue is step 6 - review and action

Every single tech company is getting rid of manual human review towards an AI based approach. Human-ops they call it - they dont want their employees to be doing this harmful work, plus computers are cheaper and better at

We hear about failures of inhuman ops all the time on HN: people being banned, falsely accused, cancelled, accounts locked, credit denied. All because the decisions which were once made by humans are now made by machine. This will happen eventually here too.

It's the very reason why they have the neuralhash model. To remove the human reviewer.

> 7. Apple reviewer confuses a featureless blob of gray with CSAM material, several times

Just because the PoC used a meaningless blob doesn't mean that collisions have to be those. Plenty of examples of adversarial attacks on image recognition perturb real images to get the network to misidentify them, but to a human eye the image is unchanged.

The whole point flew over your head. If it's unchanged to the human eye then surely the human reviewer will see that it's a false positive?

No, it's important to point that out lest people think collisions can only be generated with contrived examples. I haven't studied neural hashes in particular, but for CNNs it's extremely trivial to come up with adversarial examples for arbitrary images.

Anyway, as for human reviewers, it depends on what the image being perturbed is. Computer repair employees have called the police on people who've had pictures of their children in the bath. My understanding is that Apple does not have the source images, only NCMEC, so Apple's employees wouldn't necessarily see that such a case is a false positive. One would hope that when it gets sent to NCMEC, their employees would compare to the source image and see that it is a false positive, though.

Which would still be a privacy violation, since an actual human is looking at a photo you haven't consented to share with them.

That will be clearly laid out on page 1174 of Apple's ToS that you had to click to be able to use your $1200 phone for anything but a paperweight.

For #4, I know for a fact that my wife’s WhatsApp automatically stores pictures you send her to her iCloud. So the grey blob would definitely be there unless she actively deleted it.

I don't know why you'd even go through this trouble. At least a few years ago finding actual CP on TOR was trivial, not sure if the situation has changed or not. If you're going to blackmail someone, just send actual illegal data, not something that might trigger detection scanners.

>> What has changed wrt targeted attacks against innocent people?

Anecdote: every single iphone user I know has iCloud sync enabled by default. Every single Android user I know doesn't have google photos sync enabled by default.

> Anecdote: every single iphone user I know has iCloud sync enabled by default.

Yeah, but a lot of them have long ago maxed out their 5 GB iCloud account.

> the NeuralHashes of which, unless I'm mistaken, are not available

Given the scanning is client-side wouldn't the client need a list of those hashes to check against? If so it's just a matter of time before those are extracted and used in these attacks.

I think there's some crypto mumbo-jumbo to make it so you can't know if an image matched or not.

Don’t imessage and whatsapp automatically store all images received in the iphone’s photo library?

iMessage no, WA by default yes, but can be disabled.

So no need of hash collisions then. One can simply directly send the child porn images to that person via WhatsApp and send her to jail.

But then you're searching and finding child porn images to send.

I'd be surprised if there won't be a darknet service that does exactly that (send CSAM via $POPULAR_MESSENGER) the moment Apple activates scanning.

> 7. Apple reviewer confuses a featureless blob of gray with CSAM material, several times

I find it hard to believe that anyone has faith in any purported manual review by a modern tech giant. Assume the worst and you'll still probably not go far enough.

How can we know that the CSAM database is not already poisoned with adversarial images that actually target other kinds of content for different purposes? It would look like CSAM to the naked eye, and nobody can tell the images have been doctored.

When reports come in the images would not match, so they need to intercept them before they are discarded by Apple, maybe by having a mole in the team. But it's so much easier than other ways to have an iOS platform scanner for any purpose. Just let them find the doctored images and add them to the database and recruit a person in the Apple team.

I don't think this can be used to harm an innocent person. It can raise a red flag but it would be quickly unraised and perhaps an investigation into the source of the fakeout images because THAT person had to have had the real images in possession.

If anything, this gives weapons to people against the scanner as we can now bomb the system with false positives, rendering it impossible to use. I don't know enough about cryptography but I wonder if there are any ramifications of the hash being broken.

Maybe they could install malware that alters every image the camera takes, using a technique like steganography, to cause false positive matches for all the photos taken by the device. Maybe they could share one photo album where all the images are hash collisions.

Actually, need to run step 1-3 at least 30 times

You can do steps 2-3 all in one step "Hey Bob, here's a zip file of those funny cat pictures I was telling you about. Some of the files got corrupted and are grayed out for some reason".

What makes CSAM database private?

It's my understanding that many tech companies (Microsoft? Dropbox? Google? Apple? Other?) (and many people in those companies) have access to the CSAM database, which essentially makes it public.

Well, the actual hash table on the device is blinded so the device doesn’t know if an image is a match or not. The server doesn’t learn the actual hash either, unless the threshold of 30 images is reached.

Are you being serious? #7 is literally "Apple reviewer confuses a featureless blob of gray with CSAM material, several times"

30 times.

30 times a human confused a blob with CSAM?

If you're in close physical contact with a person (like at a job) you just wait for them to put their phone down while unlocked, and do all this.

Then, with all due respect, the attacker could just download actual CSAM.

> If your adversary is the Mossad, YOU’RE GONNA DIE AND THERE’S NOTHING THAT YOU CAN DO ABOUT IT. The Mossad is not intimidated by the fact that you employ https://. If the Mossad wants your data, they’re going to use a drone to replace your cellphone with a piece of uranium that’s shaped like a cellphone, and when you die of tumors filled with tumors, they’re going to hold a press conference and say “It wasn’t us” as they wear t-shirts that say “IT WAS DEFINITELY US,” and then they’re going to buy all of your stuff at your estate sale so that they can directly look at the photos of your vacation instead of reading your insipid emails about them.


"Then, with all due respect, the attacker could just download actual CSAM."

If you didn't have Apple scanning your drive trying to find a new way for you to go to prison then it wouldn't be a problem.

It’s misleading to say that they scan your “drive”. They scan pictures as they are uploaded to iCloud Photo Library.

Also, most other major cloud photo providers scan images server side, leading to the same effect (but with them accessing more data).

Yes. But this whole discussion is about potential problems/exploits with hash collisions (see title).

Also, this XKCD:


People are getting nerd-sniped about hash collisions. It's completely irrelevant.

The real-world vector is that an attacker sends CSAM through one of the channels that will trigger a scan. Through iMessage, this should be possible in an unsolicited fashion (correct me if I'm wrong). Otherwise, it's possible through a hacked device. Of course there's plausible deniability here, but like with swatting, it's not a situation you want to be in.

Plausible deniability or not, it could have real impact if Apple decides to implement the policy of locking your account after tripping the threshold, which you then have to wait or fight to get unlocked. Or now you have police records against you for an investigation that led nowhere. It's not a zero impact game if I can spam a bunch of grey blobs to people and potentially have a chain of human failures that leads police to knock down your door.

> Through iMessage, this should be possible in an unsolicited fashion

Sure, but those don’t go into your photo library, so it won’t trigger any scanning. Presumably people wouldn’t actively save CSAM into their library.

FWIW, this sort of argument may not trigger a change in policy, but a technical failure in the hashing algorithm might.

Love the relevant xkcd! And to reply to your point, simply sending unsolicited CSAM via iMessage doesn’t trigger anything. That message has to be saved to your phone then uploaded to iCloud. Someone else above said repeat this process 20-30 times so I presume it can’t be a single incident of CSAM. Seems really really hard to trigger this thing by accident or maliciously

People are saying that, by default, WhatsApp will save images directly to your camera roll without any interaction. That would be an easy way to trigger the CSAM detection remotely. There are many people who use WhatsApp so it's a reasonable concern.

Can you send images to people that are not on your friend list?

Have you not worked a minimum wage job in the US? It's incredibly easy to gain phone access to semi-trusting people.

If you don't like someone (which happens very often in this line of work) you could potentially screw someone over with this.

Great article!

one vector you can use to skip step 3 is to send on WhatsApp. I believe images sent via WhatsApp are auto saved by default last I recalled.

> 4. Actually, need to run step 1-3 at least 30 times

Depending on how the secret sharing is used in Apple PSI, it may be possible that duplicating the same image 30 times would be enough.

I'm sure the reviewers will definitely be able to give each reported image enough time and attention they need, much like the people youtube employs to review videos discussing and exposing animal abuse, holocaust denial and other controversial topics. </sarcasm>

Difference in volume. Images that trip CSAM hash are a lot rarer than the content you just described.

I personally am not aware of how the perceptual hash values are distributed in its keyspace. Perceptual hashes can have issues with uniform distribution, since they are somewhat semantic and most content out there is too. As such, I wouldn't make statements about how often collisions would occur.

We detached this subthread from https://news.ycombinator.com/item?id=28219296 (it had become massive and this one can stand on its own).

> 6. Apple's CSAM detection then flags these, and they're manually reviewed

Is the process actually documented anywhere? Afaik they are just saying that they are verifying a match. This could of course just be a person looking at the hash itself.

They look at the contents of the "safety voucher", which contains the neural hash and a "visual derivative" of the original image (but not the original image itself).


If it’s a visual derivative, whatever that means, then how does the reviewer know it matches the source image? Sounds like there’s a lot of non determinism in there.

Apple's scheme includes operators manually verifying a low-res version of each image matching CSAM databases before any intervention. Of course, grey noise will never pass for CSAM and will fail that step.

The fact that you can randomly manipulate random noise until it matches the hash of an arbitrary image is not surprising. The real challenge is generating a real image that could be mistaken for CSAM at low res + is actually benign (or else just send CSAM directly) + matches the hash of real CSAM.

This is why SHAttered [1] was such a big deal, but daily random SHA collisions aren't.

[1] https://shattered.io/

But you can essentially perform a DoS attack on the human checkers, effectively making the entire system grind to a halt. The entire system is too reliant on the performance of NeuralHash, which can be defeated in many ways. [1]

(Added later:) I should note that the DoS attack is only possible with the preimage attack and not the second preimage attack as the issue seemingly suggests, because you need the original CSAM to perform the second preimage attack. But given the second preimage attack is this easy, I don't have any hope for the preimage resistance anyway.

(Added much later:) And I realized that Apple did think of this possibility and only stores blinded hashes in the device, so the preimage attack doesn't really work as is. But it seems that the hash output is only 96 bits long according to the repository, so this attack might still be possible albeit with much higher computational cost.

[1] To be fair, I don't think that Apple's claim of 1/1,000,000,000,000 false positive rate refers to that of the algorithm. Apple probably tweaked the threshold for manual checking to match that target rate, knowing NeuralHash's false positive rate under normal circumstances. Of course we know that there is no such thing as normal circumstances.
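
To make the threshold argument in footnote [1] concrete, here is a rough back-of-the-envelope sketch. Only the threshold of about 30 comes from this thread; the per-image false-positive rate `p`, the library size `n`, and the assumption that images match independently are hypothetical placeholders, not figures published by Apple. Adversarially crafted images break the independence assumption entirely.

```python
import math

def log_binom_pmf(n: int, k: int, p: float) -> float:
    """Natural log of P(X = k) for X ~ Binomial(n, p)."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log1p(-p)

def prob_at_least(n: int, t: int, p: float) -> float:
    """P(X >= t): chance that at least t of n benign images falsely match."""
    total = 0.0
    for k in range(t, n + 1):
        term = math.exp(log_binom_pmf(n, k, p))
        total += term
        # Upper-tail terms shrink quickly; stop once they no longer contribute.
        if term <= total * 1e-18:
            break
    return total

if __name__ == "__main__":
    n = 10_000   # hypothetical photo library size
    p = 1e-6     # hypothetical per-image false-positive rate
    for threshold in (1, 30):
        print(f"threshold={threshold}: {prob_at_least(n, threshold, p):.3e}")
```

With these made-up inputs a single false match is roughly a one-percent event per account, while 30 independent false matches is astronomically unlikely, which is why the account-level rate can be far lower than the per-image rate.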

I have seen it suggested that everyone should flood the system with flagged images to overwhelm it in protest to this move by apple.

Sounds pretty stupid to me to fill your phone with kiddie porn in protest, but you do you internet people.

You don't need to do that, just use images that collide with the hashes.

How will you know something collides?

> Apple’s method of detecting known CSAM is designed with user privacy in mind. Instead of scanning images in the cloud, the system performs on-device matching using a database of known CSAM image hashes provided by NCMEC and other child-safety organizations. Apple further transforms this database into an unreadable set of hashes, which is securely stored on users’ devices.


Yeah, so how would you know something would collide with the hash?

Because the hashes are stored on the user's device?

No, encrypted and blinded hashes are stored. You can’t extract them.

Right, but your device is checking against the hashes, no?

So at some point you generate an image which triggers a subroutine and you know that image collides.

This also means that the device doesn’t know if a given image is a match.

Correct, only the server does.

Presumably, someone on the internet will download a dump of _actual_ CP and run it through NeuralHash, then post the resulting hashes to GitHub. Doing so will be illegal for them because they are downloading CP, but it should be perfectly legal for you to then download the hashes they posted, randomly select one, and generate an innocuous image that matches it. The fact that Apple isn't posting hashes doesn't actually make it all that hard to get your hands on them.

Because the algorithm and list will be on your phone, and can (has, per TFA) be extracted.

You cannot extract or reverse the CSAM hashes. They've been encrypted and blinded using server-side-only keys. If TFA said that, it's lying.

After more reading of the whitepaper I think you are right. As I understand it, given the image hash H0 and CSAM hashes H[1]...H[n] (some might be duplicates in disguise) the algorithm proceeds like this:

- The device generates a secret X and divides it into X[1]...X[m] with the secret sharing algorithm. m is some large number and any k (but no less) copies out of X[i] are enough to reconstruct X.

- The device stores blinded hashes f(H[1])...f(H[n]). The function f itself is not known to the client.

- The image hash H0 is compressed with another function g to the range between 1 and n.

- The downscaled image data (for the human check) is encrypted with X and appended with (probably) random X[i].

- The result is then encrypted again with a key derived from f(H0) and sent to the server with an associated data g(H0).

- The server tries to decrypt it with a key derived from f(H[g(H0)]). This is only possible when H[g(H0)] = H0, i.e. H0 represents some known CSAM.

- You can only decrypt the second layer with at least k copies of X[i] then.
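
The threshold step in the list above (any k of the X[i] reconstruct X, fewer do not) is the property a threshold secret-sharing scheme provides. Below is a minimal sketch using Shamir's scheme over a prime field. The prime, the function names, and the parameters are illustrative assumptions; the whitepaper's actual construction may differ.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; field size chosen for illustration only

def split_secret(secret: int, k: int, m: int) -> list[tuple[int, int]]:
    """Split `secret` into m shares so that any k of them reconstruct it."""
    assert 0 <= secret < PRIME and 1 <= k <= m < PRIME
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, m + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def reconstruct(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

if __name__ == "__main__":
    shares = split_secret(123456789, k=30, m=100)
    print(reconstruct(shares[:30]) == 123456789)  # any 30 distinct shares work
    print(reconstruct(shares[:29]) == 123456789)  # 29 shares give an unrelated value
```

In this sketch, repeating one share does not help: interpolation needs k distinct points, and a duplicated point makes the denominator zero.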

At this stage Apple can still learn the number of CSAM images less than k. The fix is described in an overly technical document and I can't exactly follow, but supposedly the client can inject an appropriate amount of synthetic data where only the first layer can be always decrypted and the second layer is bogus (including the presumed X[i]).


Assuming this scheme is correctly implemented, the only attack I can imagine is the timing attack. As I understand a malicious client can choose not to send false data. This will affect the number of items that pass the first layer of encryption, so the client can possibly learn the number of actual matches by adjusting the number of synthetic data since the server can only proceed to the next step with at least k such items.

This attack seems technically possible, but is probably infeasible to perform (remember that we already need 2^95 oracle operations, which is only vaguely possible even in the local device). Maybe the technical report actually has a solution for this, but for now I can only guess.
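
For a sense of scale on the 2^95 figure quoted above, a quick arithmetic sketch. The evaluation rates are made-up assumptions, not measurements of NeuralHash or of any oracle.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_evaluate(num_ops: int, ops_per_second: float) -> float:
    """Wall-clock years needed to perform num_ops at a constant rate."""
    return num_ops / ops_per_second / SECONDS_PER_YEAR

if __name__ == "__main__":
    ops = 2**95
    for rate in (1e6, 1e9, 1e12):  # hypothetical evaluations per second
        print(f"{rate:.0e} ops/s -> {years_to_evaluate(ops, rate):.2e} years")
```

Even at a hypothetical 10^12 evaluations per second this comes out at over a billion years, so a blind brute-force search is out of reach; any practical attack would have to exploit structure in the hash instead.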

That synopsis disagrees with Apple's own descriptions - or rather it goes into the secondary checks, which confuses the issue that the initial hash checks are indeed performed on-device:

> Apple’s method of detecting known CSAM is designed with user privacy in mind. Instead of scanning images in the cloud, the system performs on-device matching using a database of known CSAM image hashes provided by NCMEC and other child-safety organizations. Apple further transforms this database into an unreadable set of hashes, which is securely stored on users’ devices.


One does not need to reverse the CSAM hashes to find a collision with a hash. If the evaluation is being done on the phone, including identifying a hash match, the hashes must also be on the phone.

No, matches are not verified on the phone. On the phone, your image hash is used to look up an encrypted/blinded (via the server's secret key) CSAM hash. Then your image data (the hash and visual derivative) is encrypted with that encrypted/blinded hash. This encrypted payload, along with a part of your image's hash, is sent to Apple. Then on the server, Apple uses that part of your image's hash and their secret key to create a decryption key for the payload. If your image hash matches the CSAM hash, the decryption key would unlock the payload.

In addition, the payload is protected at another layer by your user key. Only with enough hash matches can Apple put together the user decryption key and open the very innards of your image's payload containing the full hash and visual derivative.
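
The first layer described above (the device derives a key from a blinded entry, and the server can re-derive it only on a match) can be illustrated with a toy Diffie-Hellman-style exchange. This is a sketch of the general idea only: the group, the hash-to-group mapping, and all function names are assumptions for illustration, the table lookup by part of the hash is omitted, and the real system uses a different and more careful construction.

```python
import hashlib
import secrets

P = 2**255 - 19  # a well-known prime; integers mod P are a toy stand-in group

def hash_to_group(image_hash: bytes) -> int:
    """Toy mapping of a hash to a nonzero element mod P."""
    h = int.from_bytes(hashlib.sha256(image_hash).digest(), "big")
    return h % (P - 1) + 1

def kdf(element: int) -> bytes:
    """Derive a symmetric key from a group element."""
    return hashlib.sha256(element.to_bytes(32, "big")).digest()

def server_blind(known_hash: bytes, alpha: int) -> int:
    """Server: blind a known hash with the server-only secret alpha."""
    return pow(hash_to_group(known_hash), alpha, P)

def device_make_voucher(image_hash: bytes, blinded_entry: int) -> tuple[bytes, int]:
    """Device: derive an encryption key from the blinded entry plus a header.

    The device cannot tell from these values whether its image matched.
    """
    beta = secrets.randbelow(P - 2) + 1  # fresh per-image randomness
    key = kdf(pow(blinded_entry, beta, P))
    header = pow(hash_to_group(image_hash), beta, P)
    return key, header

def server_derive_key(header: int, alpha: int) -> bytes:
    """Server: the derived key equals the device's key only on a hash match."""
    return kdf(pow(header, alpha, P))

if __name__ == "__main__":
    alpha = secrets.randbelow(P - 2) + 1
    entry = server_blind(b"known-hash", alpha)
    for candidate in (b"known-hash", b"some-other-hash"):
        key, header = device_make_voucher(candidate, entry)
        print(candidate, server_derive_key(header, alpha) == key)
```

Both sides arrive at the same key only when the device's hash equals the blinded one, because both compute the hashed element raised to alpha times beta. Otherwise the server's key is unrelated and the payload stays opaque.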

To quote a sibling comment, who looked into the horse's mouth:

> Apple’s method of detecting known CSAM is designed with user privacy in mind. Instead of scanning images in the cloud, the system performs on-device matching using a database of known CSAM image hashes provided by NCMEC and other child-safety organizations. Apple further transforms this database into an unreadable set of hashes, which is securely stored on users’ devices.


I believe the hash comparisons are made on Apple's end. Then the only way to get hashes will be a data breach on Apple's end (unlikely but not impossible) or generating it from known CSAM material.

That's not what Apple's plans state. The comparisons are done on phone, and are only escalated to Apple if there are more than N hash matches, at which point they are supposedly reviewed by Apple employees/contractors.

Otherwise, they'd just keep doing it on the material that's actually uploaded.

Ah, never mind, you're right:

> Apple’s method of detecting known CSAM is designed with user privacy in mind. Instead of scanning images in the cloud, the system performs on-device matching using a database of known CSAM image hashes provided by NCMEC and other child-safety organizations. Apple further transforms this database into an unreadable set of hashes, which is securely stored on users’ devices.


He is not right, though. The system used will not reveal matches to the device, only to the server and only if the threshold is reached.
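The "only if the threshold is reached" part is usually explained with threshold secret sharing. Here is a minimal Shamir-style sketch, assuming a toy prime field and hypothetical helper names (`make_shares`, `recover`); it is not Apple's implementation, just an illustration of how a key can stay unrecoverable until enough shares (one per matching image, say) have been collected.

```python
import secrets

PRIME = 2**127 - 1  # toy field modulus (assumption)

def make_shares(secret: int, threshold: int, count: int):
    """Split secret into count shares; any threshold of them recover it."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        # Horner evaluation of the random polynomial modulo PRIME.
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, count + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

user_key = 424242
shares = make_shares(user_key, threshold=3, count=5)
recovered = recover(shares[:3])   # three shares are enough
too_few = recover(shares[:2])     # two shares yield an unrelated value
```

With fewer shares than the threshold, interpolation returns a value that is, for all practical purposes, unrelated to the key.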

> That's not what Apple's plans state. The comparisons are done on phone

Yes but as stated in the technical description, this match is against a blinded table, so the device doesn’t learn if it’s a match or not.

It’s incredibly stupid because your Apple ID will get terminated for abusing Apple services.

There are applications which automatically save images sent to you to your camera roll (such as Whatsapp, IIRC). How can Apple prove you put them there intentionally?

Granted, they most likely won't care, but it's a legitimate attack vector.

You’re right that it’s a valid attack upon the people Apple pays to review matched images before invoking law enforcement, but no harm comes to the recipient in that model, unless they receive real legitimate CSAM and don’t report it to the authorities themselves.

Attempted entrapment and abuse of computing systems, which is an uncomfortable way to phrase the WhatsApp scenario, would be quite sufficient cause for a discovery warrant to have WhatsApp reveal the sender’s identity to Apple. Doesn’t mean they’d be found guilty, but WhatsApp will fold a lot sooner than Apple, especially if the warrant is sealed by the court to prevent the sender from deleting any CSAM in their possession.

A hacker would say that’s all contrived nonsense and anyways it’s just SWATting, that’s no big deal. A judge would say that’s a reasonable balance of protecting the sender from being dragged through the mud in the press before being indicted and permitting the abused party (Apple) to pursue a conviction and damages.

I am not your lawyer, this is not legal advice, etc.

> The fact that you can randomly manipulate random noise until it matches the hash of an arbitrary image is not surprising.

It is, actually. Remember that hashes are supposed to be many-bit digests of the original; it should take O(2^256) work to find a message with a chosen 256-bit hash and O(2^128) work to find a "birthday attack" collision. Finding any collision at all with NeuralHash so soon after its release is very surprising, suggesting the algorithm is not very strong.
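To make the birthday figure concrete, here is a small sketch that truncates SHA-256 to a few bits and searches for a collision. The helper names and the 24-bit truncation are assumptions chosen so it runs in well under a second; the point is that an n-bit hash needs on the order of 2^(n/2) attempts for a collision, which is why 2^128 is the figure quoted for a 256-bit digest.

```python
import hashlib

def truncated_hash(message: bytes, bits: int) -> int:
    """Keep only the top `bits` bits of SHA-256 (illustrative weakening)."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return digest >> (256 - bits)

def birthday_collision(bits: int):
    """Hash successive counters until two of them share a truncated hash."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode()
        h = truncated_hash(message, bits)
        if h in seen:
            return seen[h], message, counter + 1
        seen[h] = message
        counter += 1

# For 24 bits the search typically ends after a few thousand attempts,
# i.e. around 2**12, not 2**24.
first, second, tries = birthday_collision(24)
```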

SHAttered is a big deal because it is a fully working attack model, but the writing was on the wall for SHA-1 after the collisions were found in reduced-round variations of the hash. Attacks against an algorithm only get better with time, never worse.

Moreover, the break of NeuralHash may be even stronger than the SHAttered attack. The latter modifies two documents to produce a collision, but the NeuralHash collision here may be a preimage attack. It's not clear if the attacker crafted both images to produce the collision or just the second one.

NeuralHash is a perceptual hash, not a cryptographically secure hash. Perceptual hashes have trivially findable second preimages by design, as the entire point is for two different images which appear visually similar to return the same result.

It's not particularly surprising to me that a perceptual hash might also have collisions that don't look similar to the human eye, though if Apple ever claimed otherwise this counterexample is solid proof that they're wrong.
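A tiny illustration of the difference, using a classic "average hash" as a stand-in (this is not NeuralHash, and the 4x4 pixel grid is made up): brightening every pixel slightly leaves the perceptual hash untouched while the cryptographic hash changes completely.

```python
import hashlib

def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def crypto_hash(pixels):
    """SHA-256 over the raw pixel bytes."""
    data = bytes(p for row in pixels for p in row)
    return hashlib.sha256(data).hexdigest()

# Hypothetical 4x4 grayscale image and a uniformly brightened copy.
image = [
    [10, 200, 30, 180],
    [220, 15, 190, 25],
    [12, 210, 35, 170],
    [205, 20, 185, 40],
]
brighter = [[p + 5 for p in row] for row in image]

same_perceptual = average_hash(image) == average_hash(brighter)
same_crypto = crypto_hash(image) == crypto_hash(brighter)
```

Here `same_perceptual` comes out true and `same_crypto` false, which is the intended behaviour of a perceptual hash rather than a defect.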

The problem is that you’d need the original NeuralHash, which isn’t stored on the device. The device only has a blinded version.

Are you pinning your hopes on a false positive like this being caught and flagged immediately by an army of faceless, low-wage workers who stare at CSAM cases all day?

Apple has pretty deep pockets. Think how much that judgement is going to be when they find themselves in court for letting someone get raided over gray images.

Not that it's going to happen, since it would also require NCMEC to think the images match, but whatever. Attack me! Attack me! I want to retire.

> Apple has pretty deep pockets.

For now, sure. What happens when their money runs short? What about the other tech companies that will inevitably be forced to deploy this shit? Will they also have Apple's pretty deep pockets?

Blind faith in this system will not magically fix how flawed it is nor the abuse and harm it will allow. This is going to hurt a lot of innocent people.

>Attack me! Attack me! I want to retire.

If you post your whatsapp address, I'm sure someone will oblige.

> Of course, grey noise will never pass for CSAM and will fail that step.

Never? You sure that one or more human operators will never make this mistake, dooming someone's life / causing them immense pain?

I can guarantee nobody will see the inside of a courtroom, on charges of possession and distribution of child porn for possessing multiple images of grey noise (unless there is some steganography going on).

One does not need to go to court to have their life ruined by accusations. Ironically, there's quite a few examples of this over the years for alleged CSAM.

One example that sticks out in my mind is a pair of grandparents who photographed their grandchildren playing in the back yard. A photo tech flagged their photo, they were arrested, and it took their lawyer going through the hoops to get a review of the photo for the charges to be dropped.

Sure, there are always cases - but was their photo of a grey blob that matched a hash but is clearly a grey blob or of a naked child?

If the photo was a grey blob and they had to go through a judicial review for someone to look at the photo and confirm 'yes that is a grey blob' then color me wrong.

I'd view a "grey blob" to be the MVP of hash collisions. I doubt that this will end with grey blobs - I see it ending with common images (memes would be great for this) being invisibly altered to collide with a CSAM hash.

If you need a judicial review to confirm that a slightly altered Bernie in Coat and Gloves meme is not the same image as the picture of a child being raped that they have on file then we have way bigger problems.

Here's the thing with CSAM - it's illegal to view and transmit. So nobody, until the police have confiscated your devices, will actually be able to verify that it is a "child being raped."

They'll view visual hashes, look at descriptions, and so forth, but nobody from Apple will actually be looking at them, because then they are guilty of viewing and transmitting CSAM.

I noted in another comment, even the prosecutors and defense lawyers in the case typically only get a description of the content, they don't see it themselves.

This is just not true. Human review is conducted. Apple will conduct human review, Facebook conducts human review, NCMEC will conduct human review, law enforcement will conduct human review, lawyers and judges will conduct human review.

Over the years there have been countless articles etc about how fucked up being a reviewer of content flagged at all the tech companies is. https://www.vice.com/en/article/a35xk5/facebook-moderators-a...

Where did you get this idea, scooby doo?

It is not illegal to be an unwilling recipient of illegal material. If a package shows up at your door with a bomb, you're not gonna be thrown in jail for having a bomb.

In theory, sure.

At the very least, you'd be one of the primary suspects, and if you somehow got a bad lawyer, all bets are off.


Okay, and when a cursory look at the bomb actually reveals it to be a fisher-price toy, what then?

What is the scenario where a grey blob gets on your phone that sets off CSAM alerts, an investigator looks at it and sees only a grey blob, and then still decides to alert the authorities even though it's just a grey blob, and the authorities still decide to arrest you even though it's just a grey blob, and the DA still decides to prosecute you even though it's just a grey blob, and a jury still decides to convict you, even though it's still just a grey blob?

You're the one who's off in theory-land imagining that every person in the entire justice system is just as stupid as this algorithm is.

Possession of CSAM is a strict liability crime in most jurisdictions.

That is simply not true. There is no American jurisdiction where child pornography is a strict liability crime.

On this topic, the Supreme Court has ruled in Dickerson v US that, in all cases, to avoid First Amendment conflicts, all child pornography statutes must be interpreted with at least a "reckless disregard" standard.

Here is a typical criminal definition, from Minnesota, where a defendant recently tried to argue that the statute was strict liability and therefore unconstitutional, and that argument was rejected by the courts because it is clearly written to require knowledge and intent:

> Subd. 4. Possession prohibited. (a) A person who possesses a pornographic work or a computer disk or computer or other electronic, magnetic, or optical storage system ․ containing a pornographic work, knowing or with reason to know its content and character, is guilty of a felony․

I tried to look up [Antonio] Dickerson v. US, but I don't see any SCOTUS decision on it, only a certiorari petition. Do you have a reference for the decision?

Yep, sorry, Dickerson was an appellant that cited the relevant case law, which is New York v Ferber.

Now, Dickerson rightfully lost and it's appropriate that SCOTUS rejected his case because he was involved in child porn production, not possession, so he can't rely on the Ferber precedent. He had the opportunity to ask the underage person in question their age, and chose not to, which would meet the reckless disregard standard anyway.

I don't see any mention of the reckless disregard standard in the New York v. Ferber decision, either? So far as I can see, it just says that CSAM is outside of the scope of 1A, so long as it's "adequately defined by the applicable state law". Am I missing something in the opinion?

Many people never see the inside of a courtroom when false or unproven rape accusations are made against them, but their lives still get ruined because of the negative publicity.

Those cases are not comparable, because the whole reason they have that impact is that the accusations are usually made publicly (because the whole point is to harm the reputation of one's rapist and warn others, should a conviction prove to be impossible), while CSAM review goes through a neural hash privately on your phone, then privately and anonymously through an Apple reviewer, then is privately reviewed at NCMEC (who - I think - have access to the full size image), and only then is turned over to law enforcement (which should also have access to the full image).

It only becomes public knowledge if law enforcement then chooses to charge you - and if all that happens on the basis of an obvious adversarial net image, the result is a publicity shitshow for Apple and you become a civil rights hero after your lawyer (even an underpaid overworked public defender should be able to handle this one) demonstrates this.

As others have stated in this thread, I think the real failure case is not someone's life getting ruined by claims of CSAM possession somehow resulting from a bad hash match, but the fact that planted material (or sent via message) can now easily ruin your life because it gets automatically reported; you can't simply delete it and move on any more.

We give way, way too much weight in the legal system to eye witness and victim statements, in absence of any corroborating evidence. That's a problem.

But not really comparable, IMO. You won't even know you got investigated until after the original images have been shipped off to NCMEC for verification.

Are you suggesting that perhaps fewer people should report rape accusations, because it might be awkward for the accused to get negative publicity? That's messed up.

What if it is legal pornography of 21 year olds but perturbed to collide with CSAM? You are aware even defence lawyers are not allowed to look at alleged CSAM material in court, right?

The process would not trigger any action. The NCMEC, who can look at the material, and are the people to whom the matter is reported, would compare the flagged image with the source material and reject it as not matching known CSAM.

What if the legal porn of a 21 year old that triggered the collision match looked really really really close? So close that a human can not distinguish between the image of a 12 year old being raped that they have in their database and your image? Well then you might have a problem, legal and otherwise.

> defence lawyers are not allowed to look at alleged CSAM material in court right

I know this is not true in many countries, but can't speak for your country.

You are aware that a lot of CSAM are close ups of say pussies for example, and human anatomy can look very similar?

I'm not talking about images of rape here. I'm talking about images that you'd see on a regular porn site, of adults and their body parts.

You are also aware that CSAM covers anywhere from 0 to 17.99 years of age, and the legal obligation to report exists equally for the whole spectrum?

So let's say I download a close up pussy collection of 31 images of what I believe to be consenting 20 year olds, and what are consenting 20 year olds.

But they are actually planted by an attacker (let's say an oppressive regime who doesn't like me) and perturbed to match CSAM, that is, pussy close ups of 17 year olds. They are all just pussy pics. They will look the same.

Should I go to jail?

Do I have a non zero chance of going to jail? Yes.

If you have content that has a matching hash value and is identical by all computational and human inspection to content that has been identified as CSAM from which the hash was generated, then you have a problem.

Without getting into the metaphysics of what is an image, at that point, you basically have a large collection of child porn.

Your hypothetical oppressive regime has gone to a lot of trouble planting not illegal evidence on your device. It would be much more effective to just put actual child porn on your device, which you would need to have to conduct the attack in the first place.

> You are aware that a lot of CSAM are close ups of say pussies for example, and human anatomy can look very similar?

I doubt images that look quite generic will make it into those hash sets, though.

Nearly impossible to verify, though, by construction.

What is the factual basis for you to doubt that?

> I can guarantee nobody will see the inside of a courtroom

This wasn't the question I asked.

> grey noise will never pass for CSAM

> You sure that one or more human operators will never make this mistake.

Yes I can say 100% that no human operator will ever classify a grey image for a child being raped. Happy to put money on it.

It's possible that the operator accidentally clicks the wrong button.

When dealing with a monotonous task that the operator is probably getting PTSD from, I think the chance is greater than 0%.

Articles about content moderators and PTSD:




> The real challenge is generating a real image that could be mistaken for CSAM at low res + is actually benign (or else just send CSAM directly) + matches the hash of real CSAM.

Why do you think the image has to be benign? Almost everyone watches porn, and it will be much easier to find collisions by manipulating actual porn images that are not CSAM.

Also, this way you're more likely to trigger a false positive from Apple staff, since they aren't supposed to know what actual CSAM looks like.

> The fact that you can randomly manipulate random noise until it matches the hash of an arbitrary image is not surprising.

Strongly disagree. (1) The primary feature of any decent hash function is that this should not happen. (2) Any preimage attack opens the way for further manipulations like you describe.

cryptographic hashes are different from image fingerprints

That's true, one way to put it is that traditionally non-cryptographic hashes are supposed to prevent accidental collisions, while cryptographic ones should prevent even collisions on purpose.

But hashing is used in many places that could be vulnerable to an attack, so I think the distinction is blurry. People used MD5 for lots of things but are moving away for this reason, even though they're not in cryptographic settings.

> Apple's scheme includes operators manually verifying a low-res version of each image

The reviewer, likely on minimum wage, will report images just in case. Nobody wants to be dragged through the mud because they didn't report something they thought was innocent.

I don't think that is far away either. I won't be surprised if that is achieved within the day, if not sooner.

Also, generating images that look the same as the original and yet produce a different hash.

I think I am in dire need of some education here and so I have questions:

* Is this a problem with Apple's CSAM discriminator engine or with the fact that it's happening on-device?

* Would this attack not be possible if scanning was instead happening in the cloud, using the same model?

* Are other services (Google Photos, Facebook, etc.) that store photos in the cloud not doing something similar to uploaded photos, with models that may be similarly vulnerable to this attack?

I know that an argument against on-device scanning is that people don't like to feel like the device that they own is acting against them - like it's snitching on them. I can understand and actually sympathise with that argument, it feels wrong.

But we have known for a long time that computer vision can be fooled with adversarial images. What is special about this particular example? Is it only because it's specifically tricking the Apple CSAM system, which is currently a hotly-debated topic, or is there something particularly bad here, something that is not true with other CSAM "detectors"?

I genuinely don't know enough about this subject to comment with anything other than questions.

Not complete answers but background: Apple’s system works by having your device create a hash of each image you have. The hash (a short hexadecimal string) is compared to a list of known CP image hashes, and if it matches, then your image is uploaded to Apple for further investigation.

A devastating scenario for such a system is if an attacker knows how to look at a hash and generate some image that matches the hash, allowing them to trigger false positives any time. That appears to be what we are witnessing.
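For intuition only, here is a toy version of that scenario. The "hash" below is a made-up stand-in (a fixed random linear projection followed by a sign test), not NeuralHash, and `forge` is a hypothetical helper that nudges a noise vector until every output bit agrees with a target hash. An attack on a real neural hash would have to go through the network itself; this sketch only shows why a binarized projection is easy to hit on purpose.

```python
import random

BITS, DIM = 16, 64
rng = random.Random(0)

# Fixed pseudo-random projection playing the role of the hashing model (assumption).
PROJECTION = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def toy_hash(vector):
    """Project the input and keep only the sign of each output as a bit."""
    return [
        1 if sum(w * v for w, v in zip(row, vector)) > 0 else 0
        for row in PROJECTION
    ]

def forge(target_bits, start, max_updates=10_000):
    """Perceptron-style updates: push the input along the row of one wrong bit at a time."""
    vector = list(start)
    for _ in range(max_updates):
        current = toy_hash(vector)
        wrong = [i for i in range(BITS) if current[i] != target_bits[i]]
        if not wrong:
            return vector
        i = wrong[0]
        sign = 1 if target_bits[i] == 1 else -1
        vector = [v + sign * w for v, w in zip(vector, PROJECTION[i])]
    raise RuntimeError("no collision found within the update budget")

original = [rng.uniform(0, 1) for _ in range(DIM)]  # stands in for a real image
noise = [rng.uniform(0, 1) for _ in range(DIM)]     # unrelated starting point
forged = forge(toy_hash(original), noise)
```

The forged vector shares the target's hash while being a different input, and the attacker needed only the target bits and the projection, not the original vector.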

> A devastating scenario for such a system is if an attacker knows how to look at a hash and generate some image that matches the hash, allowing them to trigger false positives any time.

This is my understanding too. But is this not also true for other (cloud-based) CSAM scanning systems? Why is Apple's special in this regard?

They aren't.

Apple could have saved themselves so much backlash and not have caused the outrage to be focused exclusively on them if they hadn't tried to be novel with their method of hashing, and had just announced that they were about to do exactly what all the other tech companies had already been doing for years - server side scanning.

Apple would still be accused of walking back on its claims of protecting users' privacy, but for a different reason - by trying to conform. Instead of wasting all the debate on how Apple and only Apple is violating everyone's privacy with its on-device scanning mechanism, which was without precedent, this could have been an educational experience for many people about how little privacy is valued in the cloud in general, no matter who you choose to give your data to, because there is precedent for such privacy violations that take place on the server.

Apple could have been just one of the companies in a long line of others whose data management policies would have received significant renewed attention as a result of this. Instead, everyone is focused on criticizing Apple.

There is a significant problem with people's perception of "privacy" in tech if merely moving the scan on-device causes this much backlash while those same people stayed silent during the times that Google and Facebook and the rest adopted the very same technique on the server in the past decade. Maybe if Apple had done the same, they would have been able to get away with it.

Perception aside, Apple’s system is somewhat better for privacy, since Apple needs to access much less data server side.

There’s a lot of overlap between Apple product buyers and “fight the thoughtcrime slippery slope” hackers. The CSAM abusers are presumably (if they’re not stupid) also fanning the flames on that slippery slope perception, because it’s to their benefit if the hackers defeat Apple.

I don't know. I don't know what other systems use, I just know what I've read recently about Apple's.

This pretty much sums this entire drama up, I think.

But how does this exact attack and scenario not also apply to Google, Facebook, Microsoft etc... who are also doing the same thing on their cloud servers?

I don't know what those companies do, hopefully someone who does know will chime in and answer.

They essentially do the same, just in the cloud meaning they access all images directly.

That’s an oversimplified and misleading description of how the system works, but ok. I recommend reading the technical description, or even the paper linked from that: https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...

No, I don't think that's what we're witnessing. No one has yet demonstrated that they can take a hash alone and produce an image that matches the hash. It's true that that would be bad, but right now you still need the original colliding image.

But how would the attacker get the generated image on a person's phone?

I didn't mean to suggest a targeted attack. If the goal is to just overwhelm Apple's system, the attacker doesn't need to target a particular phone, they just need to distribute lots of colliding images to some phones. Even their own phones, since the colliding images wouldn't be illegal.

E.g. send them a whatsapp message that looks innocent

They can see the image - why would they import a random image into their library from someone they don’t know?

WhatsApp has a really weird default behavior: it imports all images you're sent into your photo library.

This is a smart thing to disable, even outside this recent discussion of CSAM.


In that scenario this attack is unnecessary. Someone could just send you legitimate child pornography and then immediately tell the authorities that you possess said things.

Yup! It's a legitimately crazy default.

Edit: Though, to be fair, the specific hash-collision scenario would be that someone could send you something that doesn't look like CSAM and so you wouldn't reflexively delete it.

If it doesn’t look like it, wouldn’t a human reviewer disregard it once it gets to that point?

Personally I don’t really see the issue.

We don't know how the human review is going to work. Countries with fewer resources are going to find it easier to just arrest/detain any suspects, instead of spending time and money on figuring out which reports are true.

All you have to be is accused of CP for your life to be destroyed. It doesn't matter if you did it or not.

What you’re describing is possible even if you didn’t receive anything.

If the government wants to get you they don’t need this Apple scanning tech, or anything at all really

Yeah, you'd need a very specific set of processes for it to really be a problem.

Whatsapp imports received images into camera roll by default, and icloud sync is on by default. So you just need to get into a WA group the victim is a member of, and have the victim use default settings.

Auto import could be turned on?

Basically you are right. There's nothing that special about hash collisions with image recognition.

I think this is blowing up because it's cathartic to see a technology you disagree with get undermined and basically broken by the community...
