

Ask YC: Searchable, obfuscated location-based data? - cmars232

I'd like to be able to encode geospatial coordinates in such a way that:

1. The encoding is computationally hard to reverse (one-way function).

2. Encoded locations are searchable by proximity in an index without revealing or using the plaintext location data. This rules out a simple shared-secret-offset type of solution.

I have a perhaps naive solution in mind, but I don't want to reinvent the wheel either. Does anyone know of something that does this, or have any ideas? Is it even possible?
======
jwp
My intuition says it's not possible to get 1 without destroying the ability to
get 2, or vice versa. But I'm no expert. The solution that leaps to my mind is
to use a locality-sensitive hash function. It would make search stochastic,
and since nearby inputs collide by design, reversing the encoding would be
easier. It seems like any hash-based solution will involve a tradeoff between
1 and 2.
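
To make the tradeoff concrete, here is a minimal sketch of one simple locality-sensitive scheme: snap each coordinate to a grid cell and hash the cell. This is a grid-quantization hash chosen for illustration, not necessarily the scheme meant above. The cell size, the public salt, and all function names are assumptions.

```python
import hashlib
import math

CELL_DEG = 0.002  # assumed cell size in degrees (about 220 m of latitude)
SALT = b"demo-salt"  # assumed public salt shared by all users

def cell_of(lat, lon, cell_deg=CELL_DEG):
    """Quantize a coordinate to integer grid-cell indices."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def cell_token(cell, salt=SALT):
    """One-way token for a grid cell: SHA-256 over salt plus cell indices."""
    data = salt + f"{cell[0]}:{cell[1]}".encode()
    return hashlib.sha256(data).hexdigest()

def tokens_near(lat, lon):
    """Tokens for the point's cell and its 8 neighbours, so that points
    just across a cell edge still match."""
    r, c = cell_of(lat, lon)
    return {cell_token((r + dr, c + dc))
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)}

def maybe_near(token, lat, lon):
    """True if the token belongs to a cell adjacent to (lat, lon)."""
    return token in tokens_near(lat, lon)

def brute_force(token, lat_range, lon_range):
    """Recover a token's cell by hashing every cell in a bounding box."""
    r0, c0 = cell_of(lat_range[0], lon_range[0])
    r1, c1 = cell_of(lat_range[1], lon_range[1])
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            if cell_token((r, c)) == token:
                return (r, c)
    return None

# Illustrative points: two in midtown Manhattan, one in Tokyo.
alice = (40.7549, -73.9841)
bob = (40.7553, -73.9846)
charlie = (35.6762, 139.6503)

bob_token = cell_token(cell_of(*bob))
print(maybe_near(bob_token, *alice))
print(maybe_near(cell_token(cell_of(*charlie)), *alice))
print(brute_force(bob_token, (40.70, 40.80), (-74.02, -73.93)) == cell_of(*bob))
```

Matching works on tokens alone, but `brute_force` shows the cost: with a known salt, anyone can enumerate the cells of a city in a few thousand hashes, so the token is only as one-way as the grid is large.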

What did you have in mind?

~~~
cmars232
I still haven't worked out the protocols for matching encoded locations --
hopefully it is possible. I fear you're right: I may not be able to get beyond
a simple challenge-response type of comparison without disclosing some
information.

Cool thing is, since I got interested in solving this problem, I've learned
how R-Trees work! :)

<http://en.wikipedia.org/wiki/Hilbert_R-tree>

------
neilk
1 and 2 seem to be contradictory.

If I know a location, and then I hash the location, and then I can find (with
some precision) the true locations of the nearby hashed places, this
constitutes an algorithm for reversing the hash of such nearby places. Or at
least narrowing the search space down significantly. So we just contradicted
the first property you want.
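
A minimal sketch of that argument, assuming a flat plane measured in metres and a hypothetical index that answers only "is the hidden point within `radius` of this probe?". Starting from one known nearby position, bisecting for the edge of the "near" region in four directions recovers the hidden point, because the midpoint of any chord of a circle lines up with its centre. All names here are illustrative.

```python
import math

def make_oracle(hidden, radius):
    """Hypothetical proximity index: reveals only a yes/no answer."""
    hx, hy = hidden
    def near(x, y):
        return math.hypot(x - hx, y - hy) <= radius
    return near

def edge(near, x, y, dx, dy, reach, steps=50):
    """Bisect along direction (dx, dy) from a point that is 'near' to find
    how far the 'near' region extends. Assumes the probe at distance
    `reach` is not near (reach greater than twice the radius suffices)."""
    lo, hi = 0.0, reach
    for _ in range(steps):
        mid = (lo + hi) / 2
        if near(x + dx * mid, y + dy * mid):
            lo = mid
        else:
            hi = mid
    return lo

def locate(near, x, y, reach):
    """Estimate the hidden point from a starting probe that is 'near'."""
    east = edge(near, x, y, 1, 0, reach)
    west = edge(near, x, y, -1, 0, reach)
    north = edge(near, x, y, 0, 1, reach)
    south = edge(near, x, y, 0, -1, reach)
    # The midpoint of the horizontal chord gives the centre's x;
    # the midpoint of the vertical chord gives the centre's y.
    return (x + (east - west) / 2, y + (north - south) / 2)

near = make_oracle(hidden=(120.0, -45.0), radius=200.0)
print(locate(near, 200.0, 50.0, reach=500.0))
```

The attacker here spends 200 yes/no queries and never sees a coordinate from the index, which is the sense in which an exact proximity search narrows the location down.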

What are you really trying to do?

~~~
cmars232
Here's a story that illustrates how I'd like this to work, at least with
regard to the encoding and the searching:

Alice gives Trent a list of identities that she'd like to meet in real life.
One of them is Bob, whom she's seen online, but never in real life.
Trent agrees to send her a notification when one of them is nearby. At that
point, they can chat online and mutually decide if they'd like to meet up or
not.

Trent runs the public location database. He's a well-meaning do-no-harm kind
of corporation, but deep down, neither Alice nor Bob nor anyone else in their
right mind really trusts him with all their location data. Even a company that
is never evil could have its share of bad seeds and pitfalls, like a lurking
serial-killer on the payroll or a secret court order.

Periodically, Alice encodes her location and uploads it to Trent. Bob also has
an account with Trent and does the same. Both happen to be in midtown
Manhattan this afternoon...

How can Trent tell Alice that Bob's within two city blocks (and vice versa)
while knowing as little as possible of their whereabouts, ideally not much
more than "Planet Earth", so that he couldn't tell you where they are even if
he had to?

~~~
neilk
There's another requirement there; there have to be no false positives either.
Charlie, who lives halfway around the world, can't be falsely reported to be
nearby.

I don't get what pixcavator is driving at with the x sin(10x) notion but it
seems to fail the Charlie test to me.

If this function has to preserve both proportionate nearness and farness,
without false positives, it seems to me that it's just an ordinary geometric
transform and therefore easy to reverse.

IANAM (I am not a mathematician).

------
void_star
You might want to look into secure indices. It's still a very new area, but it
seems to address your problem.

