
Any plans to add a Paranoid Mode that lets you search for a hash of a phone number (or email address)? I'd imagine that could be more successful on here, heh.



The search space of phone numbers is far too small to offer any type of anonymity. Submitting a phone number's hash is no more secure than submitting the phone number itself.
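To make the point above concrete, here is a minimal sketch of reversing a phone-number hash by enumeration. It assumes an unsalted SHA-256 (the thread does not name a hash function), and the number and the helper names sha256_hex and reverse_hash are made up for illustration. Only the last five digits are treated as unknown so it finishes quickly; the full 10-digit space runs through the same loop, just with more candidates.

```python
import hashlib
from typing import Optional


def sha256_hex(number: str) -> str:
    """Hash a phone number string with unsalted SHA-256 (an assumption)."""
    return hashlib.sha256(number.encode("ascii")).hexdigest()


def reverse_hash(target_hex: str, prefix: str, unknown_digits: int) -> Optional[str]:
    """Try every combination of the unknown trailing digits until one matches."""
    for i in range(10 ** unknown_digits):
        candidate = prefix + str(i).zfill(unknown_digits)
        if sha256_hex(candidate) == target_hex:
            return candidate
    return None  # no candidate with this prefix produces the hash


# Hypothetical number, used only to demonstrate the loop.
submitted_hash = sha256_hex("5550104273")
recovered = reverse_hash(submitted_hash, prefix="55501", unknown_digits=5)
print(recovered)
```

A deliberately slow hash raises the cost per candidate, but the number of candidates stays fixed and small.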


That is true in principle, and I didn't take it into account when I thought about it. In theory, though, it would still be possible to use a hash function that is cheap enough to hash all the leaked numbers once, but expensive enough that brute-forcing the whole number space would take a long time.

Since there are about 32 million leaked US numbers, but up to 10 billion could potentially exist, any hash function that takes a day to process the leak would still require over 300 days to brute-force the whole space.

Granted, any number in the leaked set could still be trivially reversed when submitted -- but those were known already anyway, they are just associated with more metadata now.
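The estimate above can be checked with a few lines of arithmetic. This is only a sketch using the two figures quoted in the comment (32 million leaked numbers, 10 billion possible numbers); the function name brute_force_days is made up for illustration.

```python
LEAKED_US_NUMBERS = 32_000_000        # figure quoted in the comment
TOTAL_NUMBER_SPACE = 10_000_000_000   # "up to 10 billion", also from the comment


def brute_force_days(days_to_hash_leak: float) -> float:
    """Days needed to hash the whole number space at the same hashing rate."""
    hashes_per_day = LEAKED_US_NUMBERS / days_to_hash_leak
    return TOTAL_NUMBER_SPACE / hashes_per_day


print(brute_force_days(1.0))  # 312.5
```

The ratio between the two figures is all that matters: whatever the hash costs, the full space takes about 312 times as long as the leak.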


Many methods could be used to reduce the search space, such as not allowing the seventh-from-last and tenth-from-last digit to equal 0. Now I've reduced the search space by two orders of magnitude, so those 300 days just became 3 days.


To be honest, I don't really know the rules governing US phone numbers and just did a cursory Google search, which came up with the simple answer of "10 billion".

That is very likely quite an overestimation. But if I'm not mistaken, the limitations you describe only reduce the search space by 19% (since it's two 90%-steps).


  100 * 0.1 = 10  // First  90% Reduction
  10  * 0.1 = 1   // Second 90% Reduction


Not allowing a single digit to be 0 is a 10% reduction though, is it not?


Oh, you're right! Thank you.


No worries :)

I checked the math so many times because I was starting to suspect my mind was glitching, lol.
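For anyone who wants to re-check the numbers in this sub-thread, here is a small sketch. It assumes 10-digit numbers and takes the rule exactly as stated above (two digit positions may not be 0); the actual US numbering rules may differ, and the function name remaining_numbers is made up.

```python
TOTAL_DIGITS = 10                 # assumed length of a US number
LEAKED_US_NUMBERS = 32_000_000    # figure quoted earlier in the thread


def remaining_numbers(restricted_positions: int, allowed_values: int = 9) -> int:
    """Count numbers when some positions allow only `allowed_values` digits."""
    free_positions = TOTAL_DIGITS - restricted_positions
    return allowed_values ** restricted_positions * 10 ** free_positions


full_space = remaining_numbers(0)      # 10_000_000_000
reduced_space = remaining_numbers(2)   # 8_100_000_000
reduction_percent = 100 * (full_space - reduced_space) // full_space

print(reduction_percent)                  # 19
print(reduced_space / LEAKED_US_NUMBERS)  # 253.125
```

So with a hash that takes one day to process the leak, the two restrictions bring the brute-force estimate from about 312 days down to about 253 days, not 3.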


> I'd imagine that could be more successful on here, heh.

You're right about that!

If I were to make an HN-friendly version, I'd probably make static JSON files that list all the numbers, indexed by the first four or so digits. When you enter a number, the first digits are sent to the server, and the appropriate JSON file is returned. That list is then searched client-side for the full number and the result displayed. The code should be simple enough to verify that the full number doesn't leave the client, while keeping the same simple user interface I already have. Variations of this idea could be more secure (e.g., only enter the start of the number and search for your number yourself in a long list) but less user-friendly.

I don't actually have any plans to implement this though. I feel satisfied enough with what I have.

(I don't think hashing would work because the address space is too small and reversing is too easy. There aren't any email addresses.)
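The static-JSON idea described above could look roughly like the sketch below. Everything here is hypothetical: the function names, the four-digit prefix length, and the sample numbers are made up, and fetch_bucket stands in for an HTTP GET of a static file so the example stays self-contained.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable

PREFIX_LEN = 4  # "the first four or so digits"


def build_buckets(leaked_numbers: Iterable[str]) -> Dict[str, str]:
    """Server side, done once: one JSON document per number prefix."""
    buckets = defaultdict(list)
    for number in leaked_numbers:
        buckets[number[:PREFIX_LEN]].append(number)
    return {prefix: json.dumps(sorted(nums)) for prefix, nums in buckets.items()}


def fetch_bucket(static_files: Dict[str, str], prefix: str) -> str:
    """Stand-in for downloading a static file; only the prefix is sent."""
    return static_files.get(prefix, "[]")


def is_leaked(number: str, static_files: Dict[str, str]) -> bool:
    """Client side: download the bucket, then search it locally."""
    bucket = json.loads(fetch_bucket(static_files, number[:PREFIX_LEN]))
    return number in bucket


# Made-up sample data.
files = build_buckets(["5550100001", "5550100002", "5551230000"])
print(is_leaked("5550100002", files))
print(is_leaked("5550100003", files))
```

With a four-digit prefix on a 10-digit number, the server learns only that the number is one of a million candidates. A shorter prefix leaks less but makes each download larger.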


Just release a CSV containing just the numbers as a zstd compressed file. We can search it ourselves.


True, hashes would be completely trivial to reverse, I didn't think that through :D

And you're right, the only way to build a HN-friendly version would probably be to basically do the checking client-side, since any additional information you send to the server could be directly used to narrow the search space.

I think I read that there are some email addresses in the leak though; wasn't HaveIBeenPwned searching only for those, but not for numbers?


Oh, you're right, there are some email addresses, but not many. In the first 10,000 rows of Australian data, there are 62. I could be wrong, but I think the extra data about users (e.g., location, email address, relationship status, workplace) was scraped from Facebook, so it is only included when it was already publicly visible.


Just brute-force his website’s form with a polite delay between requests and enumerate your own list of numbers!


I laughed, then had another idea: Rather than send the server one number to check, generate another 99 random numbers in the client and send them all to the server. The client receives the status of all of them and shows the status of the entered number. The server never knows the actual number, and the phone number address space is saturated enough that many or most of the random numbers are also real numbers.
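A minimal sketch of the decoy idea above, with hypothetical names throughout. server_check stands in for the real endpoint, and the decoys are drawn uniformly at random, so unlike the comment's assumption they are not guaranteed to look like assigned numbers.

```python
import secrets
from typing import Dict, List, Set


def random_number(digits: int) -> str:
    """Random digit string of the given length."""
    return "".join(str(secrets.randbelow(10)) for _ in range(digits))


def build_query(real_number: str, decoys: int = 99) -> List[str]:
    """Mix the real number with distinct random decoys."""
    batch = {real_number}
    while len(batch) < decoys + 1:
        batch.add(random_number(len(real_number)))
    return sorted(batch)  # sorted so position does not reveal the real number


def server_check(batch: List[str], leaked: Set[str]) -> Dict[str, bool]:
    """Stand-in for the server: report the status of every submitted number."""
    return {number: number in leaked for number in batch}


def lookup(real_number: str, leaked: Set[str]) -> bool:
    """Client side: query the whole batch, keep only the status that matters."""
    statuses = server_check(build_query(real_number), leaked)
    return statuses[real_number]


leaked_numbers = {"5550100001"}  # made-up sample data
print(lookup("5550100001", leaked_numbers))
print(lookup("5550100002", leaked_numbers))
```

One caveat: if the same number is checked repeatedly with fresh decoys each time, the server can intersect the batches and single out the number that keeps reappearing.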


That is similar to how I checked, actually.

;)



Why bother? People here would have the dump and just grep it, I think.



