Hacker News new | past | comments | ask | show | jobs | submit login
The 10:10 Code (jgc.org)
127 points by jgrahamc on June 14, 2010 | hide | past | web | favorite | 31 comments

See also the GeoHash, which lets you remove characters from the end at a gradual loss of precision:


In my mind the big issue with these is that while they're easy to transmit, you don't actually want to store them because then you lose the ability to do proximity or point in polygon searches.

It looks like this algorithm will give you better and better accuracy as you get farther from the equator. There must be some way to compress the location information with that in mind and shave a couple of digits off the result?

This has always bothered me, as well. I don't like the solution the military grid reference system came up with: http://en.wikipedia.org/wiki/File:MGRSgridSouthPole.png

I'd much prefer a solution that removed the square grid requirement and the dissimilar angular coordinates, divided the earth into roughly equally sized and shaped sectors, and ideally was hierarchical so that you could achieve a naming scheme similar to the geohash. A bit of googling led me to this picture, which looks promising: http://icon.enes.org/swm/icoswp/node3.html

Yes, the military grid reference system looks ugly. Icosahedron based coordinates should be much nicer and more symmetrical.

There's an interesting observation about normal distributions that may prove helpful--especially if you find a way to use randomness to generate a coordinate system.

If you have n pairwise independent identically normal distributed random variables X_i, and normalize them to the unit sphere Z := X * 1/length(X) then Z will be distributed equally on the surface of the unit sphere (in the sense that the distribution is symmetric under any rotation around the origin).

If you add some ideas from arithmetic coding and/or the theory for de-randomization of algorithms, you might be able to get a working coordinate system out of this.

No, that's just because P(X=x)*P(Y=y) = P(Z²=(x²+y²)).

I think you could do a reasonable job by dividing first into two hemispheres, then into six(?) triangular sectors each, then recursively into four subsectors each. This would give log4(510 Mkm²/2/6/100 m^2) = 19.3 recursions, or 43 bits to reach a 100m² accuracy, assuming it really does divide nicely. Won't be square, though. Then you'd have to decide whether to use the remaining 5 bits on a checksum, or to allow arbitrary precision. Or I suppose you could add a checksum there, and interpret any further bytes as unchecked precision.

I wonder if six subsectors are required in the first phase to construct the voronoi cells. The triangular scheme should work fine with four, which seems more elegant.

After a bit of further thought, I realized that the triangular sectors aren't equilateral after just one iteration. The angles at the equator are 90°, and the polar angles are 180°/n - works when n=4. But then the four nested ones will be 3x(90° 60° 60°) and one equilateral, and so no longer identical (and probably not of equal areas).

For area-equality, shape-inequality, a recursive three-division becomes also worthy of consideration.

Might work.

Here's what I thought of: Use a Cartesian coordinate system with the centre of the sphere as its origin and the radius of the sphere normalized to 1. Each point on the square has coordinates x_1,x_2,x_3. The new coordinates will live in [0,1]^3 (i.e. three real numbers between 0 and 1).

To convert, do the following: Generate a random number r from a N(0,1) normal distribution. For each coordinate determine the quantile of q_i := rx_i in the N(0,1) normal distribution. (This results in 0 <= q_i <= 1.)

Take the triple (q_1,q_2,q_3) and express it as a binary number. Bonus points for using octal numbers like this: Q(j) = q_1(j) + 2q_2(j) + 4*q_3(j) where Q(j) is the j-th octal digit of the number Q after the comma and q_i(j) is the j-th digital digit of the number q_i after comma.

(This octal scheme has the property that you can cut off at any time for getting a less accurate description of the location.)

This coordinate scheme is redundant. Also a uniform distribution of random points on the surface of the sphere, will lead to a uniform distribution of coordinates in this scheme.

> No, that's just because P(X=x)*P(Y=y) = P(Z²=(x²+y²)).

I found a way to make my idea work, anyway.

Earth's surface area = 5.1e14 square meters. Give each of 5.1e12 same-sized patches an integer identifier (one way or another): that's 8.3 base-34 digits (disallowing 'I' and 'O'). If you detect all single-character substitution errors, that rounds up to 10 characters -- oh, well.

(This kind of scheme would get you better worst-case resolution with your 10 characters. I'm not sure what it is for the posted code.)

I wonder how this'd work: Take the XYZ coordinates of a vector on an unit sphere with North Pole being at Z=1 and lat/long 0,0 at X=1. Drop the smallest of XYZ, keep the other two. Use 3 bits to store which of XYZ was dropped and whether it was positive or negative. The point can then be reconstructed since the vector must always have length 1, and the encoding scheme will always drop that coordinate which would contribute most to loss of precision when going from 3 coordinates to 2.

If anything you'd want less resolution near the poles, since they are almost uniformly less populated and hence there is less potential ambiguity between, say, individual dwellings.

as far as i can tell, the "10:10 code" is just a linear rescaling of the longitude and lattitude then encoding in base 32 (with a checksum for the last four bits).

if the above is correct, why not explain it thus?

if you do this, it becomes obvious what happens to errors; we know you can gracefully degrade the accuracy (since it's a linear encoding) by dropping the bits before the checksum.

what are the coding properties of the checksum? it doesn't look like crc32 or adler. what is it? why did you choose it?

Actually, it's base 29, but no one has won the prize by checking my code. Funny how that works.

Want to fix it? ^32^29 in the code.

Update: I changed it.

i'm pretty sure everywhere you put "base", was "32" before. the only 29 in the code when i read it, was on the checksum.

why did you pick this checksum? what are its coding properties?

One problem is letter differences; so it's not quite universal. For example a surprising number of languages do not have the letter Q (slavic languages for example)

Well, there's no real universal script, so it's not like a better encoding basis (besides numbers) could be found. And dropping down to Latin script is done for other reasons, too. Probably even on the same page.

Encoding in Kanji would result into fewer characters, but would probably result in some funny phrases. ("Turn left at Ephemeral Outhouse!"). Not quite sure whether that's good or bad...

Ditto for X, Y, W.

I guess they'll just have to use their convenient systems for entering ASCII then huh.

Completely tangential, but I was very excited many years ago when Swatch came up with beats, a decimal time system, intended to replace our current sexagesimal system. Even though the latter has many advantages, I suppose the real reason beats never became commonplace is the overwhelming inertia that prevents wholesale shifts in measurement systems that are used earth-wide (even though the Swatch Group is the biggest watch maker)

Good point, however time is used daily by almost everyone, whereas lat and long are used (directly) by far fewer people.

The main problem this would have is people figuring out what the hell it means! Most people have some grasp on latitude / longitude and although they may not know how to read/use it, they could probably find out.

A random code without obvious meaning is going to be a whole lot less appetizing... At least until it gets popular - I mean look at what twitter did for url shorteners... maybe that's the way to integrate this. Or get google maps to implement it, and use it in their location code (it is shorter than lat/long coords and they already use these in their queries, and they are concerned about bits transmitted...)

It didn't catch on in France during The Revolution either.

You want to provide a postal code for any location? I've been thinking it would be better if we had virtual postal codes. In other words, you use my name and postal code (they have to match) and the post office or FedEx connects it to a physical location without you needing to know where I live.

You'd be able to figure out roughly where it so you know how much it will cost to ship, but otherwise where I live would be private.

I'd be able to change what the code maps to and all my mail would automatically follow me when I move. Or my identity theft, depending on how competantly this is managed :-)

This service already exists; PO boxes can do pretty much what you want.

The problem is that this is a service "worth" something so the postal services can charge for it. You'd have to remove that incentive to convince them to offer it as a global service.

its kind of backwards. they charge you NOT to deliver. you pay for the storage of course. but delivery isn't cheap. its mandated by law though so they can't change it easily or even stop Saturday delivery without an act of Congress (for real)

But, even so, the PO box is tied to a city. people can't keep using it and have mail go to the right place automatically. the best they can do is forward it. so something could be shipped from nyc to a PO box in LA and then get forwarded back

PO boxes provide stability should your physical address change, so there will always be a market for them.

I've heard of mail getting to well known people in Ireland with an address consisting of little more than their name and the town.

depending on how competantly this is managed :-) Is this a play on the "Qualaty Initiative" from Dilbert?

While it's certainly not guaranteed to work, a name and postal code will work fairly often in the US. As a lark, I once sent mail to my father (in a small city in Georgia) with only his name and zip code on it, and it did arrive.

I can send mail to my cousin in a small town in Iowa, with her name and town name only.

That's true. My uncle's address is just Quartermire, County Clare and he gets his mail. Its basically the road he's on.

Something like this already exists: it’s called MGRS, the Military Grid Reference System.


> PS. Many people have pointed out that there are existing systems like this, and existing patents. As far as I am aware, none of them include a check digit. For example, there's the Military Grid Reference System, the Natural Area Code, this Microsoft patent and Geohash. The check digit is critical because it reduces operator error when entering a location on a GPS device.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact