
Does there exist some kind of hash that preserves some notion of distance but which prevents recovery of the original data?

So f(X) is close to f(Y) if X and Y are themselves close, but without telling you about X or Y. Is that mathematically impossible?




Yes, that is possible. Fully homomorphic encryption schemes, for example, allow functions to be evaluated on encrypted data without decrypting the data at any point. The problem is that these techniques are very slow, which limits their practical use.


The problem is that then the distance is also encrypted, which doesn't work here.


> Does there exist some kind of hash that preserves some notion of distance but which prevents recovery of the original data?

I don't think that's possible. If you can hash a test image, and you can evaluate its "distance" from other images, an attacker can use that distance as a metric for a hill-climbing or simulated annealing algorithm.
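
To make the hill-climbing idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `toy_hash` is a made-up distance-preserving hash (set-bit counts per block), not any real perceptual hash, and the attacker is only allowed to compare hash distances.

```python
import random

BLOCK = 4  # hypothetical block size for the toy hash


def toy_hash(bits):
    """Toy distance-preserving hash: number of set bits in each block."""
    return [sum(bits[i:i + BLOCK]) for i in range(0, len(bits), BLOCK)]


def distance(h1, h2):
    """L1 distance between two toy hashes."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def hill_climb(target_hash, n_bits, rng):
    """Search for an input whose hash matches target_hash, guided only by distance."""
    guess = [rng.randint(0, 1) for _ in range(n_bits)]
    best = distance(toy_hash(guess), target_hash)
    improved = True
    while improved and best > 0:
        improved = False
        for i in range(n_bits):
            guess[i] ^= 1  # try flipping one bit
            d = distance(toy_hash(guess), target_hash)
            if d < best:
                best, improved = d, True  # keep the flip, it moved us closer
            else:
                guess[i] ^= 1  # revert the flip
    return guess, best


rng = random.Random(0)
secret = [rng.randint(0, 1) for _ in range(32)]
found, final_distance = hill_climb(toy_hash(secret), len(secret), rng)
print(final_distance)
```

For this toy hash the search always gets the distance down to zero. Note that `found` only matches `secret` up to whatever information the hash keeps, so the attacker recovers an input that looks the same to the hash, not necessarily the exact original.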


What you say is true for one-way hashes, a.k.a. "secure" hashes.

Hashing functions in general only reduce a long variable-length input string to a short fixed-length output, while preserving as much information as possible within these constraints, so that input strings that differ in some manner that is considered important for the intended application are hashed into distinct values.

So if one wants a non-secure hash that preserves some kind of distance defined on the input strings, that is possible in many cases.

https://en.wikipedia.org/wiki/Perceptual_hashing

Non-secure hashes are not useful for detecting data modifications, but they are useful for searching through large amounts of data. When the search is for identical items, secure hashes can be used as well, but when the search is for similar items, secure hashes are useless.
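
As a rough illustration of such a non-secure, distance-preserving hash, here is a sketch of an "average hash", one of the simplest perceptual hashing schemes. It assumes the image has already been reduced to an 8x8 grayscale grid given as a flat list; the sample images and function names are made up for the example.

```python
def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(h1, h2):
    """Number of bit positions in which two hashes differ."""
    return sum(a != b for a, b in zip(h1, h2))


# Hypothetical 8x8 test images, stored as flat lists of grayscale values.
original = [(x + y) * 16 for y in range(8) for x in range(8)]  # diagonal gradient
brighter = [min(255, p + 10) for p in original]                # small uniform change
inverted = [255 - p for p in original]                         # very different image

h = average_hash(original)
print(hamming(h, average_hash(brighter)))
print(hamming(h, average_hash(inverted)))
```

The slightly brightened image hashes to the same bits as the original, while the inverted one differs in most positions, so the Hamming distance between hashes can serve as a cheap similarity measure. A secure hash of the same three inputs would give three unrelated values.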



