
When operating "at web scale", users often simply don't have "real names" - saurik
https://plus.google.com/116098411511850876544/posts/4t8sFLLK4hK
======
patio11
This title is 5 words longer than it could be, because people don't have real
names.

Apologies in advance for the self-citation:

[http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-b...](http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/)

My name+ is confusing gibberish where I live. It is highly likely that
when I have children, they will go through life having multiple sets of
gibberish so that they can pick the non-gibberish option when asked "What is
your _real_ name?"

Does your startup really want to get into the is-that-gibberish-gibberish-or-
am-I-just-ignorant-of-the-way-they-do-things-in-weird-places-like-the-United-
States adjudication business? If so, try validating names. You will have
_loads_ of fun.

\+ : Well, a name I go by, at any rate. There's _at least_ eight different
ways to write my "real name" _correctly_ , not even counting nicknames, online
handles, or the like.

~~~
gaius
Well, I know people who IRL insist on being called by made-up names, mostly
goths, you know, something like Darkraven Bloodmisery. Whatever they called
themselves in Livejournal. Very few of them follow through tho' and fill in a
deed poll form! For me that's the acid test: whatever it says on your passport
is your real name. And government IT - the least efficient kind! - has managed
to solve this. Why can't the mighty Google?

~~~
StrawberryFrog
I know a couple of LJ'ing goths who now have Google+. Thus I can see people
with stated first names like "Persephone", "Ariadne" etc. The rule of thumb is
that if someone says that's their name, that's what you call them. There are
also some clearly pseudonymous accounts, e.g. last names like "jumpycat",
"Purple" etc.

It's impolite to ask what their "real name" is. I've seen someone (call her X)
get upset because someone pressed the point of finding out what X's mother
used to call her (if that's a "real name" in any meaningful sense). And it
was pointless too, what could you do with that information?

There's lots of inaccurate information there anyway. I have two "Jo"s in my
friends list. One is male, one female. In both cases "Jo" is a contraction, in
one case of a middle name. They also chose to be called "Jo".

I don't really see why Google+ needs to know or care. What people answer to is
what people know to look for, so that's what should be shown.

~~~
rmc
_e.g. last names like "jumpycat", "Purple" etc._

I'm reminded of Caterina Fake, a Flickr co-founder.

~~~
StrawberryFrog
As far as I know "Fake" is the surname that she was born with.

~~~
rmc
Yes that's my point. You have to be careful about _"clearly pseudonymous
accounts"_

~~~
StrawberryFrog
All right then. I'll give Tallulah von Strumpet-Hausen and Martin Soulstealer
the benefit of the doubt ;)

if your point is "you don't actually know" then mine is "you don't need to
even care".

------
blauwbilgorgel
Having a Dutch surname like "Spring in 't Veld" (roughly translates to "Jump
in th' Field") causes all kinds of problems on form entries and editors that
add smart quotes.

\- Many non-sanitized SQL queries fail,

\- buggy URL or HTML parsers create code like:

    
    
      /spring-in/
      <meta content='Spring in 't Veld ...
    

\- Pagetitles like: Spring in /'t Veld

\- Added smart quotes "Spring in ’t Veld" from Word or Rich Text Editors cause
problems with sorting and identity consolidation.

\- Stripping the quote character trips a 2-letter requirement.

\- etc.
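
The smart-quote item above can be sketched in a few lines. This is only an illustration, not a complete answer: the list of apostrophe look-alikes and the `comparison_key` helper are assumptions made up for this example. The idea is to store the name exactly as entered and derive a separate key for sorting and matching.

```python
import unicodedata

# Characters that often stand in for an apostrophe after an editor "smartens" text.
# This list is an assumption for the example and is certainly not exhaustive.
APOSTROPHE_VARIANTS = "\u2019\u2018\u02bc\u0060\u00b4"

_TRANSLATION = {ord(ch): "'" for ch in APOSTROPHE_VARIANTS}


def comparison_key(name: str) -> str:
    """Return a key for matching and sorting; the stored name is left untouched."""
    normalized = unicodedata.normalize("NFC", name)
    return normalized.translate(_TRANSLATION).casefold()


typed = "Spring in 't Veld"
pasted = "Spring in \u2019t Veld"  # same surname after a smart quote was added

print(typed == pasted)                                  # False
print(comparison_key(typed) == comparison_key(pasted))  # True
```

The key is only used behind the scenes; whatever the user typed is still what gets displayed.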

It is unlikely that people will name their son

    
    
      Robert'); DROP TABLE Students;--
    

But you should at least prepare your db queries for a quote. :)

<http://xkcd.com/327/>
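
For what "prepare your db queries for a quote" can look like in practice, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names are made up for the example. The point is the `?` placeholder: the name travels as data, so an apostrophe in it never gets parsed as SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")  # hypothetical table

names = ["Spring in 't Veld", "Robert'); DROP TABLE Students;--"]
for name in names:
    # The ? placeholder passes the value separately from the SQL text,
    # so quotes inside the name are stored as-is instead of ending the string.
    conn.execute("INSERT INTO students (name) VALUES (?)", (name,))

stored = [row[0] for row in conn.execute("SELECT name FROM students ORDER BY rowid")]
print(stored == names)  # True
```

Other database drivers use different placeholder syntax, but the same approach of binding values instead of concatenating strings applies.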

~~~
tintin
Your roughly translated HN name is also funny: Blue buttock gargle ;)

But yes. Programmers have a hard time using unicode. It's a shame not all
programming languages use unicode.

I also think it has to do with bad research. It isn't hard to check different
types of family names and naming order.

There are a lot of countries where they place the family name first followed
by their personal name.

I'm still trying to figure the best way to store names. Maybe 2 fields will
fit all:

    
    
      family_name (Spring in 't Veld)
      name (Robert Spring in 't Veld)
    

Family name can be used for grouping and sorting. But you need to store the
name as the user entered it. Don't try to split it in parts, just leave it as
is.
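
A rough sketch of the two-field idea, assuming both fields are typed by the user and never derived from each other. The `StoredName` class and the sample records are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredName:
    name: str         # full name exactly as the user entered it
    family_name: str  # optional grouping/sorting key, also user-supplied; may be ""


people = [
    StoredName(name="Robert Spring in 't Veld", family_name="Spring in 't Veld"),
    StoredName(name="Madonna", family_name=""),
    StoredName(name="Anna Kowalska", family_name="Kowalska"),
]

# Sort on the family name when there is one, otherwise fall back to the full name.
ordered = sorted(people, key=lambda p: (p.family_name or p.name).casefold())
for person in ordered:
    print(person.name)  # display always uses the name as entered
```

Nothing here splits or reassembles the full name; the family name is just an extra key the user may leave empty.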

~~~
pornel
You can't (simply) use family names to group Polish names (and I think Slavic
in general), because spelling of names often (but not always) depends on
gender, e.g. the wife of Mr. Kowalski is Mrs. Kowalska.

~~~
saurik
Yes: Slavic in general (my other example is Russian; wife of Makarov is
Makarova).

------
jchrisa
One of the stupidest things we've done as a culture in the last generation, is
bend our ideas about what is and isn't "well-formed", so that we don't offend
computers. Fixing this is a big part of why I am motivated to work on
schemaless databases. When your database doesn't ask you to predefine the data
structures, it's one less opportunity for an ignorant programmer to blunt the
human spirit.

~~~
vdm
The same thing applies to postal addresses. I look forward to the day when I
can make my own shipping label when ordering goods online. This just needs to
be a free text, multi line field.

The theme you touch on is covered at length in the book 'Computer Power and
Human Reason'.

~~~
protomyth
Shipping address has to be one of the most shoehorned pieces of data in any
database in the world. The only thing that seems to have put some structure on
it in the US is the 911 implementations.

For most of high school years, our house was listed as "311 Behind the School"
or some other variation. There were no actual street names (although I am told
the electric company and only the electric company had a map with street
names). When 911 service hit the area, we got actual addresses, but they
really weren't useful for shipping things. The local UPS guy knew where
everyone was and where they worked so he could do his deliveries. His
replacement required some training time.

~~~
mason55
May I ask where this was? Crazy to me that there would be a place with UPS
service but not E911 service.

~~~
benologist
We have Fedex/UPS/DHL and use directions in Nicaragua and surrounding
countries - I live 50 meters south of a doctor's clinic.

~~~
protomyth
It's amazing when you get in rural environments how directions like "left after
the old Benson house" (never mind Benson has been dead for 40 years) or "first
house on the left in the 52 housing" (52 was the year the houses were built).

// never mind the "Easter Egg housing" and how they got their name - cheap*$$
government and their "bright" paints

------
lmkg
<tinfoil hat> Remember, Google is getting into the Social scene because it's a
rich data set that they don't currently have access to, and Google likes data.
Google wants your "real" name so they can tie all your information back
together. The good news is, they'll probably ease the restrictions once they
figure out how to normalize identities even with non-normalized names.</hat>

As a side note, I'm in a similar boat to Patrick with respect to my real name:
at least 6 variations have been used on legal forms alone, due to length and
special character restrictions, and excessive use of spaces. Fortunately it's
only my middle name that gives problems so it can often be ignored, but I
really like it so it's still annoying when I have to distort it.

~~~
divtxt
If that was the reason, they could just have asked for your real name without
requiring that it be your public name.

------
FaceKicker
As much as I understand the privacy concerns etc., am I the only one who
REALLY hates being on Facebook and seeing a fake name with all sorts of odd
punctuation and random capitalization in a list that consists otherwise of
clean "Firstname Lastname"s? Even seeing "Equality" as someone's middle name
bothers me to an extent.

Maybe I have OCD, but I'm personally glad I don't have to see this on Google+
for now (but I wouldn't be surprised if they changed their policy on this
within a week or two given how fast they acted in response to the "gender must
be public" feedback).

~~~
SeoxyS
I totally agree. I can't stand people who put random crap in their Facebook
names (and actually tend to unfriend them.)

Similarly, I religiously give everybody on my IM buddy list their real name. I
would go crazy if every day I only saw pseudonyms and had no idea who was who.

~~~
nitrogen
Clearly you never used IRC. When I got started with computers, BBSes, etc.,
using your real name was considered a monumentally bad idea.

~~~
mike_esspe
I still don't understand why the fetish for the names from government ID
originated. Was it because of Facebook's policy? I'd like to see this trend
reversed.

------
ajb
I guess google+ won't like Filipinos then:
[http://news.bbc.co.uk/1/hi/programmes/from_our_own_correspon...](http://news.bbc.co.uk/1/hi/programmes/from_our_own_correspondent/9435751.stm)

------
nradov
I am disappointed by how software developers continuously reinvent the wheel
_badly_. There have been comprehensive data models for human names available
for years. For example see the HL7 V3 EntityName data type.
[http://www.hl7.org/v3ballot/html/infrastructure/datatypes_r2...](http://www.hl7.org/v3ballot/html/infrastructure/datatypes_r2/datatypes_r2.html#dt-EN)
Any Entity can have multiple names, each of which is tagged with zero or
more use codes such as "official", "pseudonym", "maiden", "tribal", etc. Each
EntityName has one or more ordered parts which can optionally be tagged with a
type: prefix, suffix, family name, given name. This data model isn't fully
complete since it's missing some naming concepts such as patronymics but at
least it's a better start than the mess that Google came up with.
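
To make the shape of that model concrete, here is a loose sketch based only on the description above. It is not the actual HL7 schema: the class names, field names, and the plain-string use codes are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NamePart:
    text: str
    part_type: Optional[str] = None  # e.g. "prefix", "given", "family", "suffix"; None if untyped


@dataclass
class EntityName:
    parts: List[NamePart]                          # ordered; order is kept for display
    uses: List[str] = field(default_factory=list)  # zero or more use codes

    def display(self) -> str:
        return " ".join(part.text for part in self.parts)


@dataclass
class Entity:
    names: List[EntityName] = field(default_factory=list)

    def names_for(self, use: str) -> List[str]:
        """Return the display form of every name tagged with the given use code."""
        return [n.display() for n in self.names if use in n.uses]


author = Entity(names=[
    EntityName([NamePart("Stephen", "given"), NamePart("King", "family")], ["official"]),
    EntityName([NamePart("Richard", "given"), NamePart("Bachman", "family")], ["pseudonym"]),
    EntityName([NamePart("John Swithen")], ["pseudonym"]),  # a single untyped part is allowed
])

print(author.names_for("official"))   # ['Stephen King']
print(author.names_for("pseudonym"))  # ['Richard Bachman', 'John Swithen']
```

The part types are optional on purpose, so a name that does not split into given/family pieces can still be stored as one untyped part.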

~~~
daemin
Though what would be better and easier to fill out: (a) multiple text boxes
with one for each important kind of name, (b) multiple text boxes together
with some sort of code selection for each text box, or (c) a single text box
that just says Name.

~~~
nradov
The usual solution is to provide users with a choice. Default to showing a
simplified field or set of fields that suit the common case for that locale.
And provide an "advanced mode" with more fields and options to handle unusual
cases.

~~~
daemin
Well you could do that, but only if you must collect such information for
governmental purposes, otherwise I would stick with just the single text box
and allow them to put in whatever name they wanted.

I would especially do that for any sort of web startup for both simplicity of
the interface and ease of implementation.

~~~
nradov
It's not just for government purposes. If your hypothetical web startup has to
integrate with any third-party systems (like anything in healthcare or
financial services in the US) then you need the name broken into parts.

~~~
daemin
Well I would consider that to be for government (mandated) purposes, since the
service you're contacting with has a government mandated schema for the name
it must use, and hence you must also use.

Of course you can always have different fields for billing information.

~~~
nradov
In most cases those requirements for name format are not government mandated at
all.

------
Hyena
I still think that the most bothersome thing is that there are lots of people,
especially in my generation, who grew up with the Internet and with identities
tied to ancient AOL e-mail addresses, screen names, Usenet handles or MUD
characters that are sometimes both more cherished and more important than
their actual name.

What's odd is that my rule for social networks is that if you know my name
online, I might want to connect with you. If you know my real name
exclusively, I probably wish I didn't even know you.

------
hoopadoop
I was kicked off Quora yesterday because they didn't like my 'Real Name'
(which is actually my completely inoffensive real name)

------
JeffffreyF
Everybody get over it: preventing anonymity and privacy (as each individual
personally feels it) is a fool's game. In the long run there will be no (as in
absolute zero) acceptance of corporate need to determine appropriate levels of
privacy. This is because people are not fools. Facebook is going down because
they misunderstood how people feel about privacy, and a poor replacement
(privacy wise) will meet the same fate. As the saying goes, you can fool some
of the people some of the time yada-yada.... The shame is just in how much
they leave on the table by trying to take too much, but nothing new, and on we
go.

------
waterlesscloud
Requiring real names. Acquiring a facial recognition company. Profits entirely
driven by advertising.

Well, hello Minority Report ads!

------
badclient
Let's call it for what it is: facebook's process/technology to determine real
people is many many notches above google's.

This is as much tech failure as it is a policy failure.

------
ilkandi
As other posters have noted, many Chinese have the Chinese-lettered name eg
"strong army", the pinyin version and if they move to a Western country they
pick their own English name like "John" or something. In writing, Choi/Choy
(and I think Chua and Chow) are the same. I will skip the philosophical
discussion of what reality is to the viewer.

I don't understand the linking to a real name. Give everyone a private number,
and then the user links whatever names they're known by to that number. When a
searcher finds the name Skud and adds Skud, they will forever see Skud and
whatever other names Skud has chosen to have visible to that circle. The link
is the number, and the name is just a display. I wonder how they would deal
with a woman who changed her last name when she was married, and reverted to
her old name after the divorce. And she's also an author writing under a pen
name (like Stephen King/Richard Bachman/John Swithen). It seems obvious to me
but nobody's done it so there must be some unique flaw. Can any commenters
enlighten me on why the unique private number idea is a bad one?
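
A minimal sketch of what that could look like, assuming an opaque ID per account and a per-name set of circles allowed to see it. The `Account` class, the circle names, and the use of a random UUID as the "private number" are all made up for the example.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Account:
    # Opaque private identifier; the link between names, never shown to other users.
    account_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    # Maps each name the user goes by to the circles allowed to see it.
    names: Dict[str, Set[str]] = field(default_factory=dict)

    def add_name(self, name: str, circles: Set[str]) -> None:
        self.names.setdefault(name, set()).update(circles)

    def visible_names(self, circle: str) -> List[str]:
        return sorted(n for n, allowed in self.names.items() if circle in allowed)


account = Account()
account.add_name("Stephen King", {"family", "readers"})
account.add_name("Richard Bachman", {"readers"})
account.add_name("John Swithen", {"readers"})

print(account.visible_names("family"))   # ['Stephen King']
print(account.visible_names("readers"))  # ['John Swithen', 'Richard Bachman', 'Stephen King']
```

Adding, dropping, or changing a name only touches the `names` mapping; `account_id` stays the same, which is what would let a marriage or divorce pass without breaking existing connections.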

~~~
dredmorbius
Why make it a unique number?

As Google+ has realized with circles, we have different associations we make
in life.

Some of those are widely separate identities.

I see _no_ reason why any identity I have on one site should be associated
with one I have on another, if I deem that it not be.

For some purposes (voting, financial transactions, long-lived financial
accounts such as SSI or a life insurance policy), you'd want to tie one
identity to another. Beyond that, it's simply a control and surveillance
front.

Even in the cases I've mentioned, weak authentication has long been the rule.
Strong authentication in voting is often tied to poll taxes and other means of
restricting the electorate. Public corporations (literally "anonymous
societies" in French) are highly pseudonymous. Cash (and digital equivalents)
are untraceable. And numbered bank accounts are the stuff of legend in both
finance and noir literature.

Identity is a very, very deep, and fraught, question. Curiously, G+ is turning
into quite the discussion of it, from circles to gender to names to multiple
identities.

One of the most classic instances of pseudonymity is among revolutionaries. It
played a large role in the American Revolution, particularly among
pamphleteers (the 18th century analog of bloggers):
[http://www.magic-city-news.com/Editor_s_Desk_34/A_Climate_of...](http://www.magic-city-news.com/Editor_s_Desk_34/A_Climate_of_Fear_34683468.shtml)

What of Mark Twain, Lewis Carroll, George Eliot, George Sand, Ellery Queen,
Frank Dixon, and Carolyn Keene?

A particular usage is among revolutionaries: Lenin, Stalin, Golda Meir, Moshe
Dayan, Subcomandante Marcos, Carlos the Jackal.

Or stage names: Madonna, Lady Gaga, Huey Lewis, John Wayne, Marilyn Monroe,
Bono, Cat Stevens, Yusuf Islam.

You're making an extraordinary proposal. Support your position.

~~~
ilkandi
Oh sorry, I wasn't intentionally arguing against being pseudonymous. I was
thinking of ways to manage the mutability, multi-mapping and non-uniqueness of
common names behind the scenes. If circles can be kept separate, you could
have a single login containing different circles for your pseudonyms. As an
aside, I hate how current FB and G+ policy turns us non-celebrities into
second class citizens. Disgusting.

Thanks to nradov above for info on HL7. Surely these major international orgs
should already have been aware of it?

------
sixtofour
"I am not a profile, I am a free man!"

[http://www.youtube.com/watch?v=zalndXdxriI&NR=1#t=41s](http://www.youtube.com/watch?v=zalndXdxriI&NR=1#t=41s)

------
tantalor
What does this have to do with "web scale"? I don't see this phrase in either
the article or the reference in the article.

~~~
politician
The title is a reference to the fact that G+'s identity system fails to cope
with names considered normal in their respective cultural contexts. In fact,
it couldn't be more to do with "web scale" since it refers to how literally
everyone in the world reveals their (sometimes multiple) identities through
Google's world wide social network.

~~~
tantalor
It might be a poorly defined term, but I believe "web scale" refers to the
performance and availability, not features, of web applications. To say "They
messed up feature X for population Y" doesn't speak to their ability to scale.

No matter how much effort Google put into supporting "real names", they would
still disappoint somebody. Does Facebook do this any better? I believe they
also require users to provide their "real name".

~~~
dlss
"Web scale" refers to dealing with the problems and issues that occur when you
have hundreds of millions of global users. This includes performance and
availability, but also security issues, social/cultural issues, and
business/tax/legal issues. (Not that "web scale" is in Websters -- just google
around and look at the other uses)

~~~
innes
The naming problem is no greater _proportionally_ 'at web scale' than it is at
lower numbers of users. Used in this context, it's a cringeworthy bit of chin-
stroking jargon.

At global scale would be more sensible. 'At web scale' doesn't mean, 'With
users from outside the US'.

~~~
dlss
re: proportions: If Google+ was only for Chinese plumbers, the percentage of
"problem" names (and the kinds of problems those names posed to the G+ DB
schema) would obviously be different -- hence the usefulness of the term "web
scale" when talking about this sort of problem.

re: "web scale" not implying/talking about global users... Do you know of
anyone operating a web scale business that doesn't have global users? Even
sites that are supposedly for US customers only (netflix, etc -- caused by
regional licensing), still have non-US customers using proxies to bypass their
geo filter...

Web scale /is/ global scale. :p

~~~
innes
Hmm, actually global scale is wrong too - I retract that :) The problem arises
regardless of 'scale' (number of users).

To reiterate: the number of users (scale) is irrelevant to the issue of
problem names.

I can see why someone sprinkled buzzwords on the HN title that didn't exist
anywhere in the article, and weren't applicable though. Makes the issue sound
more 'techie'.

~~~
saurik
I wrote that article, and consider the title I assigned to it here on Hacker
News to be its official title; this is also the title that I used when I
linked to it on my Facebook Page. Why, therefore, it matters at all that these
words "didn't exist anywhere in the article" is beyond me.

As for "buzzwords", if you deal at all with Google employees, this becomes an
irritating mantra: systems that have less than a hundred million users are
often considered "toys" (and, to be clear, I can totally understand why this
would be this way to these people).

And yes: I think that the point that "this actually happens to everyone
everywhere" is an interesting point, and was the first comment on this post by
patio11. I found that a very interesting insight, as many of my examples are
more "those cases that come up when dealing with a global user community full
of interesting edge cases", which to me /defines/ "web scale".

~~~
tantalor

      Why, therefore, it matters at all that
      these words "didn't exist anywhere in
      the article" is beyond me.
    

What is the purpose of the quotation marks then? Irony?

~~~
saurik
Yes. Again, that is the official title of the work: imagine that it had been
at the top of the linked to page, ironic quotation marks and all.

------
dredmorbius
You should be so lucky.

My entire planet is forbidden.

------
dreamdu5t
Can somebody explain the problem with allowing people to choose their own
names? They claim it is to foster "an environment." What kind of environment?
Surely not an honest one, since using real names doesn't cause people to act
honestly.

Forcing real names fosters an environment of rigidity and conformity. But I
guess people like that about FB and Google+. At least MySpace allowed you to
pick your colors.

The Internet could be its own place, its own unique environment, but instead
people like FB or Google+ want some poor attempt at a one-to-one mapping of
the real world.

------
innes
NB: the nonsense phrase _when operating "at web scale"_ doesn't appear
anywhere on the linked page.

~~~
saurik
Given that I wrote both the article in question (on a medium, Google+, that
has no title field) and the title of this post, I think it is fully in my
right to assign the title the way I want to ;P.

For context, though, that phrase is there to point out that when you are
dealing with a worldwide audience of tens to hundreds of millions of people,
you cannot make silly assumptions (see patio11's set of "myths") regarding
names.

As for using that /specific/ wording, that is how Google themselves often
describes the state of operating with that large, and that diverse, of a
userbase, making it anything but "nonsense" and in fact fully "apropos".

(Also... what is "NB" supposed to mean?)

~~~
innes
Implication of "NB" in this context: don't blame the author of the piece for
the cheesy buzzword in the HN posting.

I twig now that the author and the HN poster are one and the same internet
celeb. So... sorry about that :)

------
SeoxyS
For the most part, I think that Google's crackdown on pseudonyms and anonymity
is actually a good thing. In _most_ cases, anonymity online brings out the
worst of us—just look at 4chan, littered with mobs of minions proliferating
senseless hacking and child pornography[1]. You need but look at the news
lately to see the damages of unchecked anonymity: "LulzSec this," "Anonymous
that" and so on.

Enforcing real names is a good thing. It means people finally start to take
responsibility for their actions, and there is accountability. People behave
much better when the threat of embarrassment is in the balance.

I only see one legitimate reason for someone to be allowed a pseudonym: if
they are more widely known by that name than their real name. This applies to
authors, artists as well as web community members. The solution is easy: allow
a nickname field in addition to your real name: [First] "[Nickname]" [Last].
Some already do it. Day9, for example, goes by Sean "day9" Plott online.
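
As a rough illustration of that display format, here is a tiny hypothetical helper. Note that it inherits the first/last split that other comments in this thread argue does not fit every name.

```python
def display_name(first: str, last: str, nickname: str = "") -> str:
    """Format as First "Nickname" Last, leaving out any empty piece."""
    pieces = [first, f'"{nickname}"' if nickname else "", last]
    return " ".join(piece for piece in pieces if piece)


print(display_name("Sean", "Plott", "day9"))   # Sean "day9" Plott
print(display_name("Kenneth", "Ballenegger"))  # Kenneth Ballenegger
```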

Lastly, I realize the hypocrisy of posting to Hacker News without my name
visibly attached, so for the record, I am Kenneth Ballenegger from kswizz.com.

[1]: I have browsed thru /b/ many times, and the behavior of people there
truly is the worst I've seen ever. I was in the middle of the SF Giants riots
last year, and the people setting fire to cars and breaking windows seemed
more civil by comparison.

~~~
_delirium
I guess I haven't seen much correlation between real names and quality; more
seems to be between the forum type and quality. I'd be interested in a study
of that, though.

Two random examples: HN is mostly pseudonyms and has pretty good quality
discussion; my local newspaper now uses Facebook Connect to have users post
under their real names, and its readers dutifully post a bunch of ignorant
trash under their real names.

I do like having semi-stable identities, whether pseudonymous or real. It
makes more of a sense of community if you actually recognize people and
ascribe viewpoints/personality/etc. to them, rather than every post in effect
being written by RandomUser92736 with no continuity between discussions, or
any ability to form within-community reputation.

~~~
dredmorbius
I think you're on to something here.

There are good communities and bad communities.

Norms play a huge role. Expectation plays a role. Participants' culture plays
a huge role. Some communities work well based on in-real-life associations.
Some work based on technical means (strong moderation tools). Some have weak
moderation tools but an aggressive enforcement community and back-channel
discussions for dealing with egregious abuse, though from my experience those
back channels are used far more rarely than might generally be expected, and
may even take on an inside-joke status.

In the same sense, real-world communities may or may not be well-behaved, and
often have similar foundations. The better-behaved ones tend to be well-
established (but often hard for outsiders to break into), highly policed,
based on cultural norms (common religion, ethnicity, purpose, social class,
interests). It's where multiple structures break down that problems become
most severe.

Even short-lived communities can be very well behaved. There are few
communities more transitory than college campuses, yet they have very few
management issues.

The biggest hazard in either online or physical cultures is creation of a mob
(root word: mobility), in which an antisocial mindset spreads through a
population. What are needed aren't strong authentications ("papers, please"),
but circuit breakers and dampers to dissipate these energies.

If it is in fact civility and community which are valued.

