Hacker News new | comments | show | ask | jobs | submit login

Good posted, upvoted. One clarification:

> That's 25 users per hash

Password choices are probably Zipf-distributed, so averages don't make a ton of sense.




It does if you're trying to estimate the size of the corpus based on the number of users.

The arithmetic mean is specifically the value you'd want. n users times m users/password == total passwords (unduplicated) in the LinkedIn database.

Zipf distribution would suggest that the pattern of reuse among passwords isn't normal, and that the median and mode are probably higher than the arithmetic mean.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact

Search: