Hacker News new | past | comments | ask | show | jobs | submit login
The many flavors of hashing (volution.ro)
100 points by ciprian_craciun on July 31, 2022 | hide | past | favorite | 13 comments



Since the OP is the author, I have to point out the following mistake:

> Related to the "looks random enough" aspect, because the input size is almost always larger than the output (which is fixed size), there are usually an infinite number of inputs for which the output is the same, thus these functions are not surjective.

The word you are looking for here is "injective", not "surjective".

Edit: As long as I'm pointing out wrong-word errors like this, I should probably also point out that the phrase is "bear with me", not "bare with me".


I hate the terms "injective" and "surjective". It's impossible for me to remember which is which.


The way I remember it is that all three names are based on the image and codomain since the arguments/domain are total.

Sur- means 'upon' (sorta) as in sur-la-table, so we know the surjective is upon the entire codomain, hence surjective is the 'at least once' class, leaving injective as the 'at most once' class.

Bijective is the invertible one but I've never had problems remembering that and I note that you don't mention it either.


well, they might also not be surjective :) but I guess we'll probably never know


Homework for Dantzig: show that [cryptographic hash] is surjective?


Thanks for both corrections, indeed you are right on both accounts.


This article dances around the problem of encrypting small values a bunch. Here's a Rogaway paper on how to do it:

https://www.cs.ucdavis.edu/~rogaway/papers/thorp.pdf

(You do not need to limit yourself to inputs that are multiples of the AES block size).

There's another one like this, I forget the name.

SHA-2 is just fine, and is usually a saner choice than SHA-3 (if you want to be a crypto-hipster, use Blake2, not SHA-3; it's usually an easier dep to pull in, and more things use it).


The section about signature hashes could use some improvement. Cryptographic hashes have multiple use cases. The main one that comes to mind is basically just secure checksum/integrity hashes where you want to guarantee nobody could force a collision. For this case, all the scary warnings about "don't use these algorithms directly" makes no sense.


The section on "signature hashes" was kept short on purpose because anything that relates to real cryptography or security use-cases should require at least using a high-quality (as in audited) high-level library (like `libsodium`), and / or doing your real in-depth research, not reading a few paragraphs on a random person's blog. :)

For example even in the case you've described "where you want to guarantee nobody could force a collision" my suggestion is to use a keyed algorithm (like Blake3) or a HMAC and choose a different key (if not for each "instance" algorithmically tied to it, at least for each "deployment"). It doesn't solve the brute force, but it might save you from other mistakes related to how you use those hashes (like for example when the attacker might swap or replace two payloads with their hashes).


I once had a need for permutation hashing as part of a 2D sampling algorithm. My approach was to compose the hash using only reversible steps (to ensure that it's a permutation), and then use cycle walking to reduce the domain size from a power-of-two down to an arbitrary non-power-of-two size if needed.

I published my implementation in the Pseudorandom Permutations section of my "Correlated Multi-Jittered Sampling" tech report: https://graphics.pixar.com/library/MultiJitteredSampling/pap...



Thanks for the nice words!


i like it




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: