How Good Is Monterey’s Visual Look Up?

jchw · on March 17, 2022

Based on the article, I’d expect Apple was retooling their CSAM scanner to try to catch art thieves.

Jokes aside, I would like to use this opportunity to express something I really want: I really wish I could search Wayback Machine with perceptual hashes. Google Images has had search by image for a long time, but it seems to get rid of content after a while once it’s offline. Meanwhile, Internet Archive has a ton of images you basically can’t find elsewhere anymore, and depending on how it was archived, it may be very difficult to find it if you don’t already know the URL. For sake of preservation, that would be genuinely amazing. You could go from a single thumbnail or image and potentially find more images or better versions.

It’s not like being able to identify common objects and artifacts with a phone camera isn’t super cool, but its far from perfect and in some of its more novel use cases (such as helping blind people navigate) that can be troublesome. Nothing technically stops the aforementioned Internet Archive phash index except for the fact that there will probably never be enough resources to create or maintain such an index.

dessant · on March 17, 2022

Internet Archive has an experimental API to perform reverse image searches.

https://archive.readme.io/docs/reverse-image-search-api

There is also RootAbout: http://rootabout.com/

You may have a better chance of finding the image by searching on a couple dozen search engines using my extension.

https://github.com/dessant/search-by-image#readme

jchw · on March 17, 2022

Thanks, this is awesome stuff. I’ll have to give it a shot sometime.

I’m also seeing a full text search API, which is again, incredible, especially if these indices are relatively complete.

gregsadetsky · on March 17, 2022

Agreed that it'd be great to have that phash image index. A full text search of the Wayback Machine's archives would be amazing to have as well..!

I've been putting away the idea of starting a server that would request archives from the Wayback Machine, parse text from the html documents, and create the world's-simplest-search-index i.e. just the location (document id) of every encountered word. There's a ton of problems with this "plan", but... having any search would be better than nothing?

jchw · on March 17, 2022

Honestly, I think possibly the biggest problem with indexing Wayback Machine is simply the size. I’m pretty sure it’s growing far faster than anyone can pull WARCs out for indexing, especially because well, it’s not exactly high throughput on the download side. I don’t blame anyone for that, but it does make the prospect of externally indexing feel a bit bleak.

At this point, I’d like it if there were just tools to index huge WARCs on their own. Maybe it’s time to write that.

gregsadetsky · on March 17, 2022

Right, the download speed is definitely an issue (and like you say, it's quite understandable considering the volume/traffic they deal with), and the continual growth is one of many factors I didn't consider.

I wonder if the IA would allow someone to interconnect directly with their storage datacenter, if one were to submit a well articulated plan to create this search index/capability.

Also, what do you mean by tools to index WARCs? Specifically, the gzip + WARC parsing + html parsing steps? Would the (CLI?) result be text extracted from the original html pages, i.e. something along the lines of running `strings` or beautifulsoup?

jchw · on March 17, 2022

Yeah, pretty much. Though being able to directly load data into a search cluster like Elastic would be nice.

KarlKemp · on March 17, 2022

For as long as it lasts, yandex is my go-to reverse image search favorite, by a rather large margin.

Try searching with a portrait… it is unlikely to find the person, unless there are images of that person in Russian social media. But it will find your identical twin behind the ironic curtain.

mistersquid · on March 17, 2022

Visual Lookup performs poorly on non-Western historical art such as the Fart Battle Scroll (including non-bowdlerized versions). [0] [1]

Hopefully Visual Lookup's data set will improve with time and usage.

[0] https://archive.wul.waseda.ac.jp/kosho/chi04/chi04_01029/chi...

[1] https://www.tofugu.com/japan/fart-scrolls/

KarlKemp · on March 17, 2022

The feature isn't (yet) available outside their larger western markets: https://www.apple.com/ios/feature-availability/#visual-look-... It will presumably improve over time.

reayn · on March 17, 2022

I don't know what I expected to see after clicking on to that article but I can't say I'm disappointed lmao.

acdha · on March 17, 2022

I had a ton of pictures from a trip to Portugal in 2003 which I had roughly located (the dates told me which city I was in) but the integrated lookups in Photos made it pretty easy to see exact names and locations for most of the historic buildings, artwork, etc.

The big thing I wish they had was a workflow optimization: it'd be great if there was a way to copy the locations with a single click and copy them to temporally adjacent photos since if you took a picture of, say, a famous church you could safely assume that the closeup details of stonework 3 minutes later were in the same place.

KarlKemp · on March 17, 2022

Of note, Photos.app has long had some image recognition features, and they have worked quite well for me in the past.

As one example, searching for „paper“ brings up dozens of hits in my library of thousands, including a fair number where it took me a while to find the paper. It somehow manages to find two portraits where the person is wearing paper-in-plastic-sleeve ID tags, but not any of the almost identical portraits with all-plastic IDs.

WoodenChair · on March 17, 2022

You can do this on iOS Safari too in the latest version with a long press.

dymk · on March 17, 2022

Just for famous paintings, apparently. Surely the article could have showed off more than just that?

I guess there was a single mention of a Havanese dog.

ajmurmann · on March 17, 2022

For what it's worth, I just tried this with a contemporary painting I own that's not famous but sat least came through the gallery system and it was recognized

Edit: it just gives me a link to https://www.artsy.net/artwork/mark-andrew-bailey-ingess It does not present any information like artist, size etc in the iOS interface when showing this painting, just the link. Still pretty cool.

felixthehat · on March 17, 2022

Just tried it on my pup, a triumph! https://imgur.com/a/bpkSTuu

Rebelgecko · on March 17, 2022

Those results are certainly interesting from a geopolitical perspective. It looks like there's an internal mapping of Tibet -> PRC?

felixthehat · on March 17, 2022

'What have I told you about chewing my shoes and getting embroiled in geopolitics?'

recursive · on March 17, 2022

It doesn't seem to say it explicitly anywhere. For anyone as confused as I was trying to make sense of what this article is actually about, it appears to be a new macos feature.

morpheuskafka · on March 17, 2022

> Confusion between the Pissarro and Gallen-Kallela paintings above resulted from a ‘collision’, in which their Neural Hashes are sufficiently similar to result in misidentification, one of the reasons that Apple was dissuaded from using this technique to detect CSAM.

So if two literal paintings made centuries ago can cause a hash collision, there's no way this ever should have been considered for matching against files that no one else can see or research with for the most serious crime/reputation damage imaginable. It would not even be remotely hard to make up some collisions, and it could probably be done even without the original dataset.

outworlder · on March 17, 2022

> So if two literal paintings made centuries ago can cause a hash collision, there's no way this ever should have been considered for matching against files that no one else can see or research with for the most serious crime/reputation damage imaginable.

It simply does not follow that the classifier for CSAM would have the same rate of false positives. There isn't enough information to infer that.

KarlKemp · on March 17, 2022

As apple states at the time, any action would require more than one match. For any hash value of a given size, it is trivially easy to calculate the probability of collisions and to adjust required thresholds for any arbitrary rate of false positives.

If the FP rate is 1/1000, requiring three „hits“ makes it 1/1,000,000,000, or essentially zero.

Someone · on March 17, 2022

The false positive rate has to be a lot lower for this to work.

If it is 1/1000, it is only 1/1,000,000,000 if they have only 3 of images from a customer. They typically have thousands, though. A 1:1000 false positive rate would mean several ones in many iCloud photo databases.

On the plus side, in case of multiple hits, they would have a human look at the images.

The whole thing was intended as a way to make that human check economically viable. Instead of having people look at every picture uploaded to iCloud, they would filter out almost all of them, and only let humans look at the few remaining (where, I guess, ‘few’ still could be a lot, given their number of users)

IfOnlyYouKnew · on March 17, 2022

They wouldn't have a human look at it. They don't have the images, only the hashes–that's the point.

And it's somewhat irrelevant how the probability of collisions is specifically calulated (1/1000 already assumed 1:n comparisons), as long as we agree it's easy to calculate for a given user. The algorithm does know about the sizes of the respective image libraries, for example, and could adjust the threshold with precision.

Someone · on March 17, 2022

They don’t have the images, but they do have “visual derivatives”. https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...:

“The device creates a cryptographic safety voucher that encodes the match result. It also encrypts the image’s NeuralHash and a visual derivative. This voucher is uploaded to iCloud Photos along with the image.

[…]

Once more than a threshold number of matches has occurred, Apple has enough shares that the server can combine the shares it has retrieved, and reconstruct the decryption key for the ciphertexts it has collected, thereby revealing the NeuralHash and visual derivative for the known CSAM matches.”

https://www.apple.com/child-safety/pdf/Security_Threat_Model... is even clearer:

“The decrypted vouchers allow Apple servers to access a visual derivative – such as a low-resolution version – of each matching image.

These visual derivatives are then examined by human reviewers”

giantrobot · on March 17, 2022

That assumes that only "bad" images exist in CSAM corpus. There's no guarantee of that and it's not something that can be audited in a meaningful way. In the US the only place CSAM images can legally exist is NCMEC. Even someone wanting to generate a detection system can't access the corpus directly and has to rely on some convoluted system of hashes.

_abox · on March 17, 2022

I have mixed feelings about this..

It sounds like a useful feature but I don't want to help Apple to train their algorithms which they're still planning to use to snoop on our computers. Their plans are only 'on hold', not cancelled. Which sounds a lot like they're waiting for the upheaval to blow over, or for some other vendor to introduce this so they can point the finger at them and say they're not doing anything unprecedented.

So probably I won't use it, I've already stopped using Apple's built in photo app and most of iCloud anyway. Not that I have anything to hide, I just don't want big tech looking over my shoulder. It was great when Apple was one of the last to take a stand on privacy and I'm sad at the ease with which they threw it out the window.

KarlKemp · on March 17, 2022

They are not asking for users to correct or provide any information, so you really aren't helping. Plus, it isn't even obvious how tags on your sightseeing photos would be used to improve CSAM detection.

_abox · on March 18, 2022

Fair enough, but the investment sure helps to serve both goals. I wasn't aware there was no training going on from the user side though.

I have to admit the text OCR in photos could be handy. But I hate it when their algorithms poke around my photos trying to identify people and locations and auto-categorise them (which it's already done for a few years now). It makes me feel spied upon and there didn't seem to be a way to switch this off on iOS.

That was one of the many more things (both on the privacy and vendor lock-in side) I have issues with on Mac and iOS now and I've moved away completely in my personal life. I only use it for work now. My personal mobile is Android but degoogled and with mostly open source apps and tracking blockers on the ones that aren't.

So this was not the only reason. I did move all my old content out of iCloud recently but there was also little need to still have it there especially because of its closed nature.

_abox · on March 21, 2022

By the way, another article from the same author came out today where he mentions the same thing: https://eclecticlight.co/2022/03/20/last-week-on-my-mac-is-l...

So I'm not the only one who sees a connection here :)