
Fingerprinting Images for Near-Duplicate Detection – Real Python - nepsilon
https://realpython.com/blog/python/fingerprinting-images-for-near-duplicate-detection#.VlBBDRBn_0s.hackernews
======
dalke
On a bit of a tangent, can someone help me identify the history behind the use
of image similarity based on fingerprint similarity?

I've been exploring the history of a certain type of chemical similarity
search. The core idea is to take the molecular graph and turn it into a
fingerprint, then use fingerprint similarity as a proxy for molecular
similarity.
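
To make the idea concrete, here is a minimal Python sketch, not any particular toolkit's method. It assumes the molecule (or image) has already been reduced to a set of feature strings; the `fingerprint` and `tanimoto` helpers, the 64-bit width, and the SHA-1 folding are all illustrative choices of mine. Tanimoto (Jaccard) similarity is one common way to compare bit fingerprints, used here as an example.

```python
import hashlib


def fingerprint(features, n_bits=64):
    """Fold feature strings into an n_bits-wide bit vector, stored as an int.

    Hypothetical helper: each feature sets one bit chosen by hashing it.
    """
    bits = 0
    for feature in features:
        digest = hashlib.sha1(feature.encode("utf-8")).digest()
        position = int.from_bytes(digest[:4], "big") % n_bits
        bits |= 1 << position
    return bits


def tanimoto(a, b):
    """Number of shared on-bits divided by the number of on-bits in either."""
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # assumption: two empty fingerprints count as identical
    return bin(a & b).count("1") / union
```

Two inputs that share most of their features set mostly the same bits and score close to 1.0, while unrelated inputs score near 0. That graded score is what separates this kind of fingerprint from a hash meant only for exact matching.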

I think the parallels to the image search described here are obvious.

I've been trying to figure out where the term 'fingerprint' comes from.
Historically, my field didn't use the term in this meaning until around 1990.
While I have come across earlier references to 'fingerprint' (in infrared
spectroscopy from the 1970s), they were hash values designed to find identical
matches, not similar matches. They could not be used for similarity.

Similarly, Rabin's early work on fingerprinting / hashing, in 1981, available
from
[http://www.xmailserver.org/rabin.pdf](http://www.xmailserver.org/rabin.pdf),
wasn't designed for similarity.

I'm curious to know the evolution of how other fields handled this approach to
similarity search. Searching just now, I can't find references to this
approach being used for image or sound search until the 1990s, which is well
after it was established in chemical information.

