

Ask HN: Ensuring unique account logins across the globe - pfarrell

Question for the community. My company's application is starting to grow from a single production deployment to multiple production stacks that run independently in different datacenters.

How have other applications ensured uniqueness of data (specifically login ids) across multiple distributed deployments of an application on the global stage, while letting this data be driven by the public?

I keep coming back to this: Google and Yahoo let people create accounts where (presumably) the email address is the id for the account. These ids must be globally unique, yet they are registered very quickly: the account is created while a user sits at a browser waiting on a form post, and by the end of that request the account is registered (i.e. synchronous from the user's perspective and complete within the default timeout for a browser request).

We're batting around ideas of using async messaging, highly available databases of hashes of in-use userids, LDAP with replication between datacenters, and some other strategies. Nothing feels completely solid, nor have I been able to locate much on this topic online.

Do you have any thoughts on this type of distributed problem? In 2010, this feels like it should already be a solved problem and that I'm just ignorant of the solution. Does anybody know of papers or essays related to dealing with this type of issue, or have thoughts on how they've seen this approached at other big companies?
======
Travis
Can't find the link, but Flickr recently had a post somewhere (maybe
highscalability.com? I cannot find it for the life of me...) about how they
handle unique IDs. In essence, they have one "ticket server" that only runs a
single database, with a single table, with two fields -- `id`, and `a`. The
`a` column doesn't really do anything, but it allows them to issue a MySQL
REPLACE INTO, which just increments the auto-increment counter without
growing the table. Then they have a mechanism to retrieve this next ticket.

Found the link -- does this concept work for you?
[http://code.flickr.com/blog/2010/02/08/ticket-servers-distri...](http://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/)
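
For anyone who wants to poke at the idea, here is a minimal sketch of the ticket scheme described above. It is not Flickr's code: it swaps MySQL for Python's built-in `sqlite3` so it runs anywhere, and the table name `tickets`, the helper names `open_ticket_db` and `next_ticket`, and the dummy value `'x'` are all made up for illustration.

```python
import sqlite3


def open_ticket_db(path=":memory:"):
    """Open (or create) the single-table ticket database.

    Hypothetical helper; SQLite stands in for MySQL here.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tickets ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # the counter we care about
        " a TEXT NOT NULL UNIQUE)"                # dummy column, only there to collide
    )
    return conn


def next_ticket(conn):
    """Issue the next id.

    REPLACE INTO deletes the existing row that collides on `a` and inserts
    a fresh one, so the counter advances while the table stays at one row.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("REPLACE INTO tickets (a) VALUES ('x')")
        return cur.lastrowid
```

Calling `next_ticket(conn)` repeatedly should hand back increasing ids while `SELECT COUNT(*) FROM tickets` stays at 1. Note this gives you unique numeric keys, not unique user-chosen names, so it only covers part of the original question.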

~~~
pfarrell
Thanks Travis. That link is great. It doesn't totally address what we're
dealing with, but does give me a lot of ideas (which is probably the best that
can be expected).

------
icey
Eh, what's the real likelihood of collision anyways?

There can only be so many people that want a specific username out there that
are going to be registering in the same few seconds or few minutes it takes
before your data syncs up.

Unless you really have a compelling reason to do all this stuff, it kind of
sounds like you're over-engineering it to me.

Why can't you just have 1 registration server and call out to it
asynchronously? Even across the globe you're talking about a delay of less
than a second.
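
As a rough sketch of what that one registration server could do, assuming it owns a single table with a uniqueness constraint on the login: the database rejects the second claim, so two sign-ups racing for the same name cannot both win. `sqlite3` is used only to keep the example self-contained, and `open_registry`, `claim_login`, the `logins` table, and the lowercase normalization are assumptions for illustration, not anything from the thread.

```python
import sqlite3


def open_registry(path=":memory:"):
    """Open the central registry of claimed logins (hypothetical helper)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS logins (login TEXT PRIMARY KEY)")
    return conn


def claim_login(conn, login):
    """Try to claim a login; return True if it was free, False if taken.

    The PRIMARY KEY constraint does the real work: the insert either
    succeeds atomically or raises IntegrityError.
    """
    key = login.strip().lower()  # assumed normalization rule
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO logins (login) VALUES (?)", (key,))
        return True
    except sqlite3.IntegrityError:
        return False
```

Each datacenter would call out to this one service during sign-up and only create the local account once `claim_login` returns True. The obvious catch is that the service is a single point of failure unless it is replicated somehow.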

~~~
pfarrell
We've had issues with single points of failure and are trying to plan for
failover and load balancing. I won't argue that this might be overengineered,
but this is a fully transactional application with 99.9% uptime guaranteed to
our clients and with millions of transactions (mostly API) per hour. I agree
this is a write-seldom, read-often piece of data. We route based on userid, so
we cannot tolerate any chance of a data collision.

