Sorry what? That seems pretty unnecessary. A third party system to dictate how a third party system handles its local alias system for emails? I can't see any benefit to that.
Whether a mail server handles '+' in a standard way is not guaranteed, and surely it is up to the user how they use that feature if enabled.
From RFC 2822: "The local-part portion is a domain dependent string."
Headaches await you if your code is making a lot of assumptions about how the originating domain manages that local-part portion.
I understand why they do it, but I can't condone it, of course.
The service should just allow it, and make sure the appropriate subscription fee is charged.
I can always create multiple email accounts, you know. So I don't see how spending development efforts to parse and detect + and . is effective.
It is extremely easy to set up extra email accounts, so preventing people from doing so with slightly less work is pointless.
It is stupid because it could prevent someone from using their real, valid email address because it matches another, different, valid email address.
The author has announced they believe them to be intentionally indistinct and has announced an intent to break handling for any mail servers that consider them unique. All on account of gmail ignoring them?
This coupled with their author's intent around + makes me loathe this behavior.
So please don't do that. Respect the bytes that be!
You’re fighting the wrong battle.
A lot of sites I've come across lately are going back to the old-school way of only whitelisting email addresses from .edu domains or ISP accounts ("@comcast.net").
Gmail itself is allowed, but the addresses are normalized. I'm well aware of its potential for abuse.
I have an email address for every site I sign up for.
Helps you figure out who sells your email.
How do you handle this? Or do you not?
I guess you could do a DNS lookup of the mail exchange records and see if it points back to gmail and compile a list of domains to allow one account from... but then that would break many companies emails. That’s no fun.
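A rough sketch of that idea, for what it's worth. It assumes the MX host names have already been fetched by some DNS library (no lookup happens here), and the suffix list and function name are illustrative assumptions rather than any official way to detect Google-hosted mail:

```python
# Hypothetical helper: classify a domain by the MX host names it publishes.
# The suffixes below are an assumption about how Google-hosted MX records look.
GOOGLE_MX_SUFFIXES = (".google.com", ".googlemail.com")

def looks_google_hosted(mx_hosts):
    """Return True if any MX host name ends with one of the assumed suffixes."""
    for host in mx_hosts:
        name = host.rstrip(".").lower()   # drop the trailing root dot, ignore case
        if name.endswith(GOOGLE_MX_SUFFIXES):
            return True
    return False

print(looks_google_hosted(["ASPMX.L.GOOGLE.COM."]))   # True
print(looks_google_hosted(["mx1.example.org."]))      # False
```

Even if the check is right, it only says where the mail is delivered, not how the company behind the domain assigns addresses, which is exactly why it would break things for many companies.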
Services need a better way to figure out fraudulent or abusive behaviour than guessing based on the email account's domain name.
I personally own over 100 from a couple years back when reCAPTCHA was easily automatable and the phone-number requirement wasn't there.
I haven't seen ISPs give out email addresses like that in years (I know my last couple of ISPs have no such thing).
So you're basically limiting yourself to people from universities, or who've had an old-school ISP for a while.
Mobile carriers even still seem to, though they are optional and require an additional setup step.
It's all too much for me to keep track of, but for some people it's no big deal to create new e-mail addresses every month.
Obviously it never works, as I get the "I see you're trying to create a new account" email, but one of these days he's going to figure out a way to take over one of those accounts and then I'll really be fked.
(1.) What are you imagining is the attack vector exactly?
(2.) Are you asserting that all website owners should build to Google’s (non-standard) behavior?
Can me and my wife both sign up to HN and use my email but hers be email@example.com and mine be firstname.lastname@example.org?
That's a strange usecase, isn't it?
Having said that, in development, it's super nice to be able to create addresses with +'s in them.
On top of that, it's just as easy to set up a catchall email address -- an email box that accepts all mail for a domain, literally email@example.com. So a malicious actor could sidestep this security attempt with minimal effort, but it still inconveniences legitimate users despite being worthless from a security perspective.
It's just as easy to write a script to use ephemeral hosts that you don't need to sign up for. Things like Mailinator.
All it does is irritate people like me who use +words as prefilters for email (and to see which companies are selling my email/user data).
Please don't "normalize" email addresses like this. Not all mail systems are Gmail, and many do treat "firstname.lastname@example.org" and "email@example.com" as different identities. And even if we are talking about Gmail - it's not your identity system's job to deduplicate different logical addresses for the same physical inbox.
Lots of weird email systems exist. Don't assume that everybody works like gmail. And do test that things work right with uppercase letters in email addresses: I've been locked out of systems before because I use an uppercase letter in my email address and one half of the system was trying to match the lower cased version to the actual text.
The local-part of a mailbox MUST BE treated as case sensitive.
Therefore, SMTP implementations MUST take care to preserve the case
of mailbox local-parts. In particular, for some hosts, the user
"smith" is different from the user "Smith".
However, exploiting the case sensitivity of mailbox local-parts impedes
interoperability and is discouraged.
And when that happens, trying to patiently pull a "well, technically" and explain to them about RFC this and the specs say that is a way to lose users.
(I actually have extremely strong feelings about email, email addresses and the whole associated mess of specs, but had to tone it down for this article since it was mostly about the various traps you can wander into from naïvely thinking that you can just read a spec or implement something obvious and get away with it)
Not in my experience. Showing compassion and agreeing with them that what happened is terrible and you wish things were different but you didn't call these shots back in the days and if they want improvement you can both go together and complain to google, the service which is actually broken.
Most people just want to be listened to. If you can do that, you'll earn a loyal fan, even if you don't do exactly what they tell you to when agitated. Some will even appreciate learning more about the email systems after the fact. They may even get the feeling you went above and beyond by offering to help them with matters outside of your site.
Many people, even after being listened to, and even after having things patiently explained to them, still continue to enter someone else's email address into forms which will send sensitive information to that email address, and complain that they never got their important email, or that some "hacker" has "hacked" "their" email, etc.
In a perfect world this would not happen. We don't live in a perfect world and are unlikely to live in a perfect world any time soon, so we should not be asking "how can we be pedantic and tell users it's their fault for not reading the RFCs", we should be asking "how can we protect users from their ignorance of the RFCs".
(when I wrote this article, I did not expect that this would be the single most controversial line in it from HN's perspective, but I guess by this point I should have anticipated it)
Has it? In my experience, even if people know dots and the part after the + don't matter for their Gmail account (and most don't) they know it doesn't matter to Gmail, a peculiarity they can use to create multiple accounts on a single website without creating new email accounts.
(This is different from case insensitivity, which the majority of popular email providers seem to implement.)
On the one hand, that's a (presumably extremely) rare corner case - on the other hand, some applications must handle those corner cases.
Do you want to run a support system where when you ask for people's username or email address, you also have to ask them for their casing (and in the case that their problem is that they didn't know it mattered, they may not know what they signed up for)?
I worked at an ISP for many years. Even though it was all backed by Linux, usernames (which included email addresses) were considered case insensitive for the purpose of the service (all usernames were lowercase). It solves so many problems and the downsides are so small all the sane email providers did it.
The flip side of this is that it was a simpler time. Usernames and email addresses were ASCII, not Unicode. These days with Unicode, you can't even be sure that uppercasing and then lowercasing an already lowercased string yields the same characters.
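A small illustration of that last point, as a sketch in Python 3. The dotless "ı" is already a lowercase letter, yet upper-then-lower turns it into a different character:

```python
# Case mapping in Unicode is not always reversible.
dotless_i = "\u0131"                      # "ı", already lowercase
round_trip = dotless_i.upper().lower()    # "ı" -> "I" -> "i"

print(round_trip == dotless_i)            # False
print(round_trip == "i")                  # True: a plain ASCII "i"
```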
It's not an issue if they're the same, the issue is if they're different. E.g. if I'm storing bitcoin on an exchange (I know) behind the firstname.lastname@example.org, then I don't want someone else to be able to register Email@domain.com and then start looking for bugs in the service or start trying to socially engineer customer support. (The local part of the email address is case sensitive as per the spec.)
People can hem and haw about the specs, but at the end of the day Gmail trained most of the world to believe email addresses are case-insensitive, dots don't matter, etc., and now we have to live with the consequences. If that means somebody can't sign up for twelve accounts using case and dot variations of their Gmail, well, so be it. And if that means they come to HN to rant about how that awful site didn't follow the RFCs, then they come to HN to rant about that, but their account will be safer in spite of it.
I've only once had my email address rejected by a website (for ending with an underscore), but I never bothered setting up another email just for them.
No matter what, there are reasonable hypotheticals where you get angry and take your business elsewhere. The difference is my approach has you leave because you're angry at the signup page, and your approach has you leave because someone stole your stuff. I'll take my approach any day of the week.
For something as important as the credentials for a bitcoin exchange account, as Alex gave as his example, there should be policies specifying the reasons why account credentials can be changed and what evidence must be presented to do so. Front-line customer service reps shouldn't be flying by the seat of their pants when making difficult decisions with potentially hundreds of thousands of dollars on the line.
The point of social engineering attacks is that they’re innocuous requests that don’t raise suspicion, and are hard to train people against.
- Functions to get the account based on the email address
- Internal tools
- Stored procedures and other SQL stuff that happens outside the main code base
- Third-party integrations (Mailgun, Sailthru, ZenDesk, SalesForce, etc.)
That’s a huge attack surface where if there is even a minor mistake by a junior dev that no one noticed then everyone is going to lose their assets under protection.
At least here in the UK, gmail has high reach among techies but is very much a minority provider for normal people. Looking through the 3000ish registered accounts on the local community website I run, Hotmail, Yahoo, and particularly ISPs (btinternet.com, plus.net, sky.com) are all more common than gmail.
Who cares if they do? It's not like that person couldn't create multiple email addresses anyway. E.g. email@example.com and firstname.lastname@example.org can be the same person. You can't validate for that, other than other "gamification" systems (e.g. how HN treats new users - disincentivizes creating multiple accounts because why be limited on your second one when you have a good first one).
By the way, I have used the + trick on Google to sign up for a service (and pay for it!) that wouldn't let me reuse my old account for some reason. So their relaxed validation made them money.
Say you run an online store. You offer USD $5 in first-time user credit.
A lot of people (even some nontechies I know) know about the "+" trick for gmail. Assuming your signup flow is easy and fast, it's very easy for those people to sign up for multiple accounts and get multiple $5 credits.
If a lot of people do that, it might significantly impact your bottom line. Not just because you have to give away a lot of inventory, but secondary effects also suck: you stop the $5-free promotional, and then all of the legitimate users who signed up during the campaign who told their friends to sign up and get some free money now have their friends bad-mouthing the site to them because "it didn't give me free shit the way you told me I should expect it to!". You might see a drop in sign-ups to below pre-promotional levels, or, worse, you might see people who signed up during the promotional trust and use your site less. I know trends/behaviors like this seem trivial--and they definitely would only affect a minority of users--but past a certain scale effects like those can have a real financial impact.
Now let's say your site is "smart" about the "+" trick and doesn't let people with gmail (or google-federated emails--boy, is figuring that out a bastard) accounts sign up multiple times. You'll lose the dubious potential business of folks who like gaming promotionals. You'll still be vulnerable to people creating second email accounts and signing up using those--but the difficulty asymmetry now favors you, the vendor: it's work for a user to make a second email account; work they probably won't do, especially if you blacklist typical temp email services like guerillamail. If the promotional is large enough to entice first-time users but small enough to deter people from doing this, you have succeeded in minimizing your loss. If the user already has a fleet of accounts for this purpose they're probably going to just take your money anyway.
Of course, there's another more annoying scenario which I'll mention because sites should never do it: sites that think they're being smart about the "+" trick by not storing the part of the address between the plus and the domain part. This is usually done to get email campaigns (read: almost entirely spam) to show up in the user's main mailbox rather than some filter-purgatory. It will drive users away in two main ways: first, if I'm technical enough to use the "+" trick and a filter to route mail, I probably have enough obsessive annoyance with spam to immediately either junk-flag yours or delete my account, compared to a small chance I would have actually read it otherwise. Second, more than half of sites that do this which I've audited parse and strip the content of the email address wrongly (parsing emails is very hard, after all) in such a way that what they end up storing could be a totally different person's email, or an invalid one. That means signups just won't work. Whatever you do, don't do this.
This doesn't only apply to first-time-user promotionals, either. It also applies to:
- Referral bonus programs.
- Services that give hand-customized products to users (think a "one per user" etsy store with one person knitting cat dolls or something): multiple similar contacts from the same user would mean you spend a lot of time making their products--time which might be wasted, even if they paid for their products, compared with time spent making them for lots of different users and increasing your recognition/exposure.
- The same applies to tech-support contact-us forms: one user can "bogart" your support staff, clogging the queue with (legitimate or not) requests and defeat your rate limiters by using the "+" trick, making other users wait a long time for replies.
- Others I haven't thought of.
What I've noticed several (large) services do these days is ask for your credit card number on signup (even if they promise not to bill you).
If they are really giving away credit for free that can be used wholly then they probably need to verify your identity e.g. a $0 credit card authorisation or something.
Denying registration to a suspected-duplicate seems a lot safer than mailing to a different person.
For instance, on my email domain, email@example.com, firstname.lastname@example.org, and email@example.com are all the same ('+', '.' and '_' all act like you'd expect '+' to), but firstname.lastname@example.org is not (it gets picked off and aliased somewhere else). Applying gmail conventions to other domains is silly and wrong.
The problem is that then you're adding potential attack vectors to every single web app just to cater to the .01% of email clients that insist on implementing the email RFC exactly to spec. Not deduplicating email addresses creates an attack surface for both hard-to-spot technical issues and also social engineering attacks. E.g. what if your ESP at some point adds deduplication on their end, either mistakenly or on purpose, then suddenly you're sending password reset requests to the wrong users.
I think you should normalize email addresses to enforce account uniqueness, both for security purposes and usability, as long as you also store a second copy of the email address exactly as the user entered it and only send email to the latter version.
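To make that concrete, here is a minimal sketch. The names (`uniqueness_key`, `Accounts`) are made up, and the key applies Gmail-style rules to every domain, which is an assumption that other commenters here object to. The normalized key is only used for the duplicate check, and mail goes to the address exactly as entered:

```python
class DuplicateAccount(Exception):
    pass

def uniqueness_key(address):
    """Hypothetical, deliberately aggressive key used ONLY for duplicate checks."""
    local, _, domain = address.rpartition("@")
    local = local.split("+", 1)[0].replace(".", "")   # drop "+tag" and dots
    return f"{local.lower()}@{domain.lower()}"

class Accounts:
    def __init__(self):
        self._by_key = {}   # uniqueness key -> address exactly as entered

    def register(self, address):
        key = uniqueness_key(address)
        if key in self._by_key:
            raise DuplicateAccount(address)
        self._by_key[key] = address

    def address_for_sending(self, address):
        """Return the stored original for the account this address maps to."""
        return self._by_key[uniqueness_key(address)]

accounts = Accounts()
accounts.register("Jane.Doe+shop@example.com")
print(accounts.address_for_sending("janedoe@example.com"))  # Jane.Doe+shop@example.com
```

A second registration as `janedoe@example.com` would raise `DuplicateAccount`, while password resets would still go to `Jane.Doe+shop@example.com`.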
Thank you, so much.
I just wanted to highlight that for anyone who looked at the comments to decide whether or not to read the article.
Send email to it
That is literally the only way to validate an email address. There is no regular expression or algorithm that can validate and/or deduplicate an email address.
You must simply treat every email as unique until you send an email to it and that person proves otherwise.
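A bare-bones sketch of what "send email to it" means in practice. The class name and the in-memory storage are assumptions for illustration, and actual mail delivery and token expiry are left out:

```python
import secrets

class PendingVerifications:
    """Hypothetical in-memory store; a real service would persist and expire tokens."""

    def __init__(self):
        self._pending = {}     # token -> address exactly as entered
        self.verified = set()

    def start(self, address):
        token = secrets.token_urlsafe(32)
        self._pending[token] = address
        return token           # would be mailed to `address`, never shown to the requester

    def confirm(self, token):
        address = self._pending.pop(token, None)
        if address is None:
            return False       # unknown or already-used token
        self.verified.add(address)
        return True

pending = PendingVerifications()
token = pending.start("someone@example.com")
print(pending.confirm(token))   # True
print(pending.confirm(token))   # False: a token works only once
```

Only whoever can read the mailbox sees the token, so a successful `confirm` is the only evidence that the address is real and belongs to the person signing up.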
That being said, this article brings up a lot of important things about confusables that everyone should definitely be aware of, especially if you're going to have public identities.
A better approach might be to limit the effect that a new account can have in the system. Hacker News, Reddit, and Stack Overflow all do this through reputation. A new account on these systems is unable to achieve much until time is spent proving the account is being used, which reduces the incentive for an individual to create multiple accounts.
Which is impossible; a number of people run their own email or use a small friend-run email server, and you can't possibly discern the delivery rules from the outside.
Yes. A small number is a number.
If we could go back in time and force every single implementer to follow every relevant RFC to utter perfection (and make sure all the RFCs were perfectly unambiguous), I'd be more sympathetic.
But email is fucked. The sheer number of oddball things, hacks, workarounds, deviations and other bits of mess that implementers have engaged in over the years means the RFCs should be treated as at best a loose hopeful outline of how email might in theory work.
Isn't this a really dangerous game to play? Just because some major MTAs assign a class of addresses to each user doesn't make each member of that class not a unique identifier in general. Is it worth the headache to maintain a list of various email systems' policies rather than just treating them all as unique?
You can generally get away with treating the names as case-preserving (as distinct from case-insensitivity), and you are probably safe in rejecting quoted localparts. But beyond that, even forcibly lowercasing email addresses, is likely to cause problems.
Also, it’s useful to make use of +foo or varied usages of dots to create a unique email address for each site: for one thing, it’ll help if one site leaks your email address, then it’ll let you trace the origin of the leak if that email address gets unwanted email.
Finally, attempting to deduplicate email addresses before authentication is almost as bad as lowercasing the password before checking if it matches.
I'd be fairly surprised if your average user of gmail knew this: I know it and I use it in part because it lets me _distinguish_ different accounts on the same site. Second-guessing someone who's taking advantage of this feature is more likely to generate tech support requests than not.
Non-anecdotally, articles with large numbers of views/comments about the trick can be found with a quick Google search on non-techie sites like NYT/HuffPost/BusinessInsider/Buzzfeed/Pinterest/etc. Not that those are definitive, but I think knowledge of this is more widespread than you think.
If you actually strip the dots and plusses from my email, and start sending stuff to my main address, then I will mark your messages as spam. You need to store the normalized and non-normalized versions of the address. Actually, you need to do this for normalizing on usernames anyway, to make sure you don't mutilate people's Arabic names or anything (Unicode-normalized cursive looks really bad; you need to preserve the original version, while keeping the normalized version around specifically for uniqueness checking).
I think it's not even close. You have to transmit the content of the email address to the server, since you might need to email the person. Whether you validate/sanitize/perform voodoo on it there is up to you.
You don't have to transmit the password (because one way hashing), and should never do so.
A harder question is what you should assume about people who run a "+"-tricky email service on their own domain (e.g. federated gmail) and who later switch to using a service that isn't "+"-tricky (e.g. federated gmail user switches to running their own mail server). What's your default policy: default-allow or default-deny? I suspect the answer will have to do more with the amount of potential revenue lost due to such users' likelihood to abuse the plus trick, and less about the technicals of how to address it.
What system with any non-trivial level of use uses the text username as (1) the FK in the database, as opposed to the generated or auto-incremented ID in the db; (2) the login name; and (3) the publicly-displayed "name" of the user for others to see?
Plenty of forums etc use the login name for #2 and #3, and I'm not convinced by this article that that's the wrong way to do it. I haven't ever seen a single professional product that uses the text username that a user logs in with as the actual DB-level foreign key. That's grade school level database design.
There is also the security issue that having the login name also be the publicly displayed name lowers the bar for attempting a targeted attack on the site, as well as on other sites where the attacker suspects the victim may be using a similar login name. This can be particularly true in cases of harassment across platforms, which, while not a computer science security issue, is a personal psychological security issue.
That's exactly the point though. If you join on the username then allowing emails/usernames or whatever that identifier is to be edited is very hard. How you identify the row to auth against is literally the point of a username.
Discord has a very interesting solution to this. They have user names and user ids. User IDs are tied to emails and the user's name seems to just be a random text identity for displaying to users. I assume most of their backend code used a unique, sequential or random, integer ID to identify and talk about users while their frontend just maps the ID to a "user name". As long as you slap account creation behind a verification email and don't mind one user being able to sign up for multiple accounts you side step many of the larger problems that come from choosing user names because, in effect, you are choosing the "Real" username and you can make any guarantees that make writing all of your other software easy.
In Blizzard's implementation, I can't add a friend by just knowing their name, I need their id number as well, and the process for finding it isn't exactly front-and-center.
The edge cases discussed don't pop up that often unless you have lots of folks using your software or are really diligent about fuzzing and testing edge cases. If you roll your own, say, username system, you probably aren't going to fall into either of those two cases. Which means you're vulnerable.
Like storing sensitive data in the authn's session system because you don't understand encryption vs signing nor how to find out -- maybe it's time to just sit down and credentialize as a craftsman.
The authn/z systems I've used that were the biggest headaches in my life were kitchen sink frameworks trying to generalize over everyone's creature features, and they were often tied to a company/community culture of not-gonna-touch-it that only hurt users and security.
My comment was stating that you should default to these types of libraries and only roll your own if you can't do what you need to, simply because they're more likely to handle edge cases that can have serious implications.
Do you do unicode normalization on your usernames? I freely admit that I don't, and wasn't aware it was needed until I read this post.
Please don't do this, lots of people (including myself) use the '+' hack to separate accounts for different contexts (business/personal, different projects/clients, etc).
Checking for the existence of any 'email@example.com'-like accounts would mean I have to register an entirely separate email account or set up (another) email forwarder/alias.
You should ideally also store a second copy of the username in the original casing and normalized as NFC for display purposes, as some users care a lot about seeing their username exactly as they entered it. (And in fact not allowing this may be seen as culturally insensitive in some cases, much like not supporting unicode.) The same applies to the user's first and last name, which you can store in NFC for display purposes and casefolded into NFKC for string comparison (e.g. search) purposes.
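Roughly what that looks like with Python 3's standard `unicodedata` module. The function names are made up for illustration; the second normalization pass is there because case folding can leave a string that is no longer in normalized form:

```python
import unicodedata

def comparison_key(name):
    """Key for uniqueness checks and search only; never shown to anyone."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    # Case folding can undo normalization, so normalize once more.
    return unicodedata.normalize("NFKC", folded)

def display_form(name):
    """What gets shown: original casing, composed (NFC) form."""
    return unicodedata.normalize("NFC", name)

fullwidth = "\uff2a\uff4f\uff48\uff4e"                       # "John" in fullwidth letters
print(comparison_key(fullwidth) == comparison_key("john"))   # True
print(display_form("Rene\u0301") == "Ren\u00e9")             # True
```

You would store both values per account: `display_form(...)` for rendering and `comparison_key(...)` behind the unique constraint.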
That said, most sites limit usernames to ASCII characters so that they can be (easily) used in URLs. In this case you don't need to casefold or normalize, just converting to lowercase is enough.
I wanted to stay out of the Python 2 vs. 3 quagmire in this article, but it's worth knowing that in Python 3.3+, strings have a 'casefold()' method:
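A quick illustration of how it differs from `lower()` (my own example, not taken from the article):

```python
# str.casefold() exists in Python 3.3 and later.
a = "STRASSE"
b = "stra\u00dfe"                        # "straße"

print(a.lower() == b.lower())            # False: lower() keeps the ß
print(a.casefold() == b.casefold())      # True: casefold() maps ß to "ss"
```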
Unfortunately, since Python 2 still has around two years of upstream support before EOL, I can't universally recommend people just use 'casefold()', no matter how much I'd like to.
Could be as simple as publishing a set of regular expression substitution rules, specifying (for example):
* render to lower case (because this particular domain is case insensitive)
* drop periods (because this domain treats them like gmail does)
* drop '+' and any subsequent characters (because this domain treats them like gmail does)
* ASCII only (because mail software is old, and doesn't support unicode)
Each domain could then publish their own rule, perhaps in a DNS txt record, and anyone needing to check if two email addresses alias to the same could run the correct checks.
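Purely as a thought experiment, here is how a consumer of such rules might apply them. No such DNS-published format exists as far as I know, so the rule structure, the domain name and the function name are all hypothetical:

```python
import re

# Hypothetical rule sets, as if each domain had published its own.
RULES_BY_DOMAIN = {
    "gmail-like.example": {
        "lowercase": True,            # this domain is case insensitive
        "ascii_only": True,           # this domain accepts ASCII local parts only
        "substitutions": [
            (r"\+.*$", ""),           # drop '+' and any subsequent characters
            (r"\.", ""),              # drop periods
        ],
    },
}

def canonicalize(address, rules_by_domain=RULES_BY_DOMAIN):
    local, _, domain = address.rpartition("@")
    rules = rules_by_domain.get(domain.lower())
    if rules is None:
        return address                # no published rules: treat the address as unique
    if rules.get("ascii_only") and not local.isascii():
        raise ValueError("this domain accepts ASCII local parts only")
    if rules.get("lowercase"):
        local = local.lower()
    for pattern, replacement in rules.get("substitutions", []):
        local = re.sub(pattern, replacement, local)
    return f"{local}@{domain.lower()}"

print(canonicalize("J.Doe+news@gmail-like.example"))   # jdoe@gmail-like.example
print(canonicalize("J.Doe+news@other.example"))        # J.Doe+news@other.example
```

Two addresses would alias to the same mailbox exactly when their canonical forms match, and a domain that publishes nothing keeps every address distinct.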
I think a better solution would be to use a case insensitive collation on the database for the email column.
If the user changes the capitalization of their email, treat it like any other email change (validate the new email via email token)
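For example, with SQLite from the Python standard library (a sketch only; the table layout is an assumption, and SQLite's built-in `NOCASE` collation folds ASCII letters only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " email TEXT COLLATE NOCASE NOT NULL UNIQUE)"
)
conn.execute("INSERT INTO users (email) VALUES (?)", ("Jane@Example.com",))

# A second sign-up that differs only in case violates the unique constraint.
try:
    conn.execute("INSERT INTO users (email) VALUES (?)", ("jane@example.com",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Lookups ignore case, but the stored value keeps the user's own casing.
stored = conn.execute(
    "SELECT email FROM users WHERE email = ?", ("JANE@EXAMPLE.COM",)
).fetchone()[0]

print(duplicate_rejected)   # True
print(stored)               # Jane@Example.com
```

The nice part is that mail still goes to the address as the user typed it, so a case-sensitive mail server on the other end is unaffected.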
Excellent read by the way. Many things I have never considered or even worried about before.
Many sites-- like HN-- may not even need that. If you have a system identity and a login identity you can just display "dingus" as the name of every single user and the system should still work the same.
>Well, it’s easy until we start thinking about case. If you’re registered as john_doe, what happens if I register as JOHN_DOE? It’s a different username, but could I cause people to think I’m you? Could I get people to accept friend requests or share sensitive information with me because they don’t realize case matters to a computer?
Just this month we fixed this issue by using a citext column in postgres. So yes, it is easy. Maybe I'm missing an edge case here?
If so, the rest of the article covers in great detail all the other edge cases :-)
Now, how did you solve the other problems mentioned?
The reality is that there are almost ten billion people on this planet and they live for upwards of a century. You are simply deluding yourself if you think it is reasonable to build a system with unique, permanent usernames. Nothing in the real world works like that, including trademarks. And it just helps enforce the very problem that people try to trust usernames and then get tricked by people who sniped usernames that are tied to other peoples' well-known identities (leading to abused "verified" badge systems and legal challenges and expensive hostage scenarios... it just sucks).
And for what? To make it easier to hand-type a URL? Does anyone even do that? I am super technical and I barely even do that in 2018, as if nothing else there are too many websites in existence to remember all of their one-off URL schemes. Like almost everyone, I either use the site's built-in search feature or I do a search on Google to find people, and let a combination of page rank and personalized results guide me to the right destination. Some web browsers don't even show URLs anymore!
Here is a great example of where it is completely insane: Facebook. There is absolutely no good reason for that website to have usernames for regular users, and they frankly shouldn't have usernames for businesses either. It isn't even clear to me that the app--which most users are using, not the website--even has a way to show people's usernames, which means this is an identifier which somehow everyone knows must be chosen and must be unique and is nigh-unto permanent but which somehow is also simultaneously meaningless but is also a horrible point of contention? What?
I am lucky. I spent a bunch of time in 1994 to select a username, and despite being 13, I was mature enough to come up with something that wouldn't ever come to cause me complex problems. People ask me what it means, and it essentially doesn't mean anything: it has only a positive connotation to me when I hear it, it is entirely neutral, and it had no existing usage I could find. Yet, I also still got screwed, as I am semi-famous, and everyone knows me as this username. I have kids who look up to me enough to want to take my name as a show of support and I have to essentially be the big bad asshole about it because in a world of unique and permanent usernames, people then assume the kid is really me. On the other side, I have been asked to rename myself by moderators of various forums as they couldn't believe the real saurik got an account on their site, and it was "confusing" people.
And so in the end we all have to deal with the worst-case scenario anyway: unless you do nothing but sign up for random sites rumored to be interesting constantly (which I seriously tried to do), you eventually will succumb to needing a way to prove who you are on multiple sites and tie together those identities. And for most users... as in virtually all "normal users", that moment comes when they are using only two websites, as their username was probably something like jay.freeman.178 as everything that was even remotely interesting to them was taken a decade earlier by literally a different generation of humans, so they let the website automatically generate one.
In a world where everyone is having to solve the worst-case problem anyway, every site should just have numbers as unique identifiers, at most have some kind of trust score for degrees of separation on the site (so you can get a feeling for "is this the saurik that I met?"), and everyone should be trained "names don't matter and if you see someone with that name it doesn't even slightly mean that they are the same person you met last week".
So that you can be identified? (I'm not talking identified in a mathematical sense, but in an informal conversational sense (you know, what usernames are actually used for)). The whole point of a username is that it is the most humanly convenient way to represent a user in text in the context of a certain site.
Because the alternative would be to have thirteen saurik in the same thread debating a topic and you would have no way of distinguishing them. Avatars are an attempt to fix that but it sucks and is bloated for many scenarios.
Sites that do allow you to change username break conversations where people refer to each other using the username (stackoverflow comments are a really common and annoying issue)
Yes, it is annoying when you don't get your first pick but it truly is not a big deal and it solves a real problem.
jay.freeman.178 is an excellent username. You are not your username.
The problem with breaking continuity in forums and other similar networks can be mitigated via dynamic user name lookups (eg how Facebook does `@` mentions - however I have also seen some forums do this as well), supporting in line quoting (like how message boards often work), nested replies (HN, reddit, etc). Granted there will still be occasions when references slip through the net but us humans have a remarkable ability to deduce the context of the written word even when it doesn't always read perfectly.
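For what it's worth, the dynamic lookup idea can be sketched in a few lines. This is only an illustration, assuming the site stores each mention as a token carrying a stable numeric ID; the `<@id>` token format, the `display_names` table, and the `render` function are all made up for this example and are not how Facebook or any particular forum does it.

```python
import re

# Hypothetical user table: stable numeric ID -> current display name.
display_names = {101: "saurik", 102: "jay.freeman.178"}

# Hypothetical storage format for a mention: "<@101>".
MENTION = re.compile(r"<@(\d+)>")

def render(stored_text: str, names: dict[int, str]) -> str:
    """Replace each <@id> token with that user's current display name."""
    def substitute(match: re.Match) -> str:
        user_id = int(match.group(1))
        # Fall back to a neutral label if the ID is unknown.
        return "@" + names.get(user_id, f"user{user_id}")
    return MENTION.sub(substitute, stored_text)

comment = "I agree with <@101> here."
before = render(comment, display_names)   # "I agree with @saurik here."
display_names[101] = "jfreeman"           # the user renames themselves
after = render(comment, display_names)    # "I agree with @jfreeman here."
```

Because the stored comment never contains the name itself, a rename shows up in old threads without rewriting them. Plain-text references typed by hand would still slip through, as noted above.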
I'd argue that a username does not have the same function as a name. My name is not an identifier and most have not chosen their name - which is also why (I believe) the first name change is free of charge where I live. In most cases you can also create a different account. Depending on the type of service this might not be desirable but in others it is exactly what you would want.
A possible implementation would be to allow the user to give a "nickname" to add to usernames he wants to identify uniquely that would be visible only for them. For example since I talk now with you I could add to your username "user with whom I discussed identities and usernames" and this (or a short version of it) would be shown next to your username from now on.
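A rough sketch of what that could look like, assuming accounts have stable numeric IDs and that notes are keyed by the pair (viewer, target) so only the viewer ever sees them. The class and method names here are invented for illustration.

```python
class PrivateNotes:
    """Per-viewer notes attached to other accounts, visible only to the viewer."""

    def __init__(self) -> None:
        # (viewer_id, target_id) -> note text
        self._notes: dict[tuple[int, int], str] = {}

    def set_note(self, viewer_id: int, target_id: int, note: str) -> None:
        self._notes[(viewer_id, target_id)] = note

    def label(self, viewer_id: int, target_id: int, display_name: str) -> str:
        """Return the name as this viewer should see it."""
        note = self._notes.get((viewer_id, target_id))
        return f"{display_name} ({note})" if note else display_name

notes = PrivateNotes()
notes.set_note(viewer_id=1, target_id=2, note="discussed usernames")
mine = notes.label(1, 2, "saurik")      # "saurik (discussed usernames)"
theirs = notes.label(3, 2, "saurik")    # "saurik"
```

Since the note hangs off the target's ID rather than their name, it would survive the other user renaming themselves.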
A more automated way to do this is to create a unique image for the user based on the content they have posted or on when they created their account. It is less personal but requires less effort from the user, and I'm sure many sites already use such systems to create avatars. With today's machine learning we could probably do something much better.
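As a minimal sketch of the generated-image idea (no machine learning involved), one could hash something stable about the account and turn the hash bits into a small symmetric grid. The choice of SHA-256, the account ID as input, and the text rendering are all assumptions made for this example.

```python
import hashlib

def identicon(user_id: int, size: int = 5) -> list[str]:
    """Derive a small left-right symmetric pattern from a hash of the account ID."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    half = (size + 1) // 2
    rows = []
    for r in range(size):
        # One hash byte per cell in the left half; use its lowest bit.
        left = [digest[(r * half + c) % len(digest)] & 1 for c in range(half)]
        # Mirror the left half; for odd sizes the middle column is not repeated.
        right = left[-2::-1] if size % 2 else left[::-1]
        rows.append("".join("#" if bit else "." for bit in left + right))
    return rows

for row in identicon(42):
    print(row)
```

The same ID always gives the same pattern, so the image stays put when the display name changes. Two accounts can still collide on such a small grid, so this is a visual hint and not an identifier.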
There's one "account name" which you pick at sign up, use to log in, and can never change (as far as I know), but you can then pick a "user name". The "user name" is what gets shown in any interaction with other users (forums, profile page, chat, in games, etc) and IIRC it doesn't even need to be unique.
Tagging people on your friends list is a great feature to match, since the tags remain when the tagged user changes their user name, which some people do quite often (whether for a joke or just because they got bored of the old one).
It really seems like a great system and it'd be nice if other systems offered the same degree of flexibility.
1. Username, used to login, not visible elsewhere, unique and security related.
2. SteamID, id number which is visible elsewhere and used in their api fairly often; not too different from the username other than being public
3. UrlID - by default a "steamcommunity" link to the user's page is their SteamID, but it can optionally be edited by the user to a custom url. This ID is a globally unique namespace. For example, https://steamcommunity.com/id/gaben
4. Display name ... Yup, the display name. Arbitrarily editable.
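To make the four-way split concrete, here is a sketch of a record with the same roles. The field names and the `/profiles/` path are assumptions for illustration; only the `/id/` form appears in the example URL above, and none of this is Steam's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Account:
    login_name: str                   # 1. private, unique, used only to sign in
    account_id: int                   # 2. public, unique, permanent
    display_name: str                 # 4. public, freely editable, not unique
    custom_url: Optional[str] = None  # 3. public, unique when set, editable

    def profile_path(self) -> str:
        """Prefer the custom URL component; fall back to the numeric ID."""
        if self.custom_url:
            return f"/id/{self.custom_url}"
        return f"/profiles/{self.account_id}"

acct = Account(login_name="secret_login", account_id=7656, display_name="Mrguyorama")
default_path = acct.profile_path()   # "/profiles/7656"
acct.custom_url = "gaben"
custom_path = acct.profile_path()    # "/id/gaben"
```

Only `account_id` never changes, so anything that must keep pointing at the same account (friend lists, tags, purchases) would reference that field.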
Having that many things is dumb. Having an editable url component is dumb (it shoulda just been the steamid forever).
All in all, they're not a good example.
The UrlID can be beneficial because it lets the user choose an easily spellable identifier for the URL, in case they often need to share the link by voice.
Seems like an optimal system, except that the UrlID probably only covers an uncommon use case. But it still won't really hurt anyone. If the URL didn't use the UrlID it would use the SteamID, which does nothing to help anyone remember the URL. So why is it poorly thought out if it benefits some use cases without making anyone else worse off?
"I want to add you as a friend so we can play games together, what's your name"
Mrguyorama + random junk I didn't choose
It's user unfriendly
And I don't really see how this is functionally any different to adding digits to the end of your normal username.
This function is separate from the "tagging" feature, which is helpful for keeping track of those friends who frequently change their display names and avatars. It's also possible to view the list of previous display names a given user has had, at least if you're friends with them.
I wish I was making this up. Maybe they're trying to get their chat app acquired by Google.
That sounds a bit like SDSI & SPKI's nicknaming functionality. Entities were identified by keyhashes, but you could (and in practice would) give them nicknames or use others' nicknames for them.
In the real world, people almost always go by their first name, and we don't have this problem. When two people in a social circle have the same first name, we don't turn and say "well everyone has to use their whole name, always, now." Rather, we adjust our names (usually someone gets a nickname, or goes by their last name).
The steam system allows multiple people to have the same display name and it works just fine. Sure, people can troll with it when they join your tf2 server (and then you kick them off).
The blizzard system also works great. The unique identifier is there, if all other forms of attempting to add a friend fail, but mostly you end up working contextually.
Your examples are games, which greatly limit interactions both in number of people and in time. I don't jump in and offer help in a game that ended 4 years ago. Everything is already in a very specific context.
It is often desirable for the username to be consistent across the entire site, if you recognize the username you remember past interactions and conversations which gives you a better context. This is a valuable part of a community.
Sure, there are places where you deliberately want to maintain pseudo-anonymity, where you'd get a new username in each discussion. But that's something else.
If your app's users report problems with identifying people, just allow users to add more specificity to their username. "People always get me confused with this other user 'chairdude', can I change my display name to 'armchairdude'?"
If thirteen 'saurik's want to have a fun time and create a confusing discussion thread together, so be it.
If I know him in person and ask for his LinkedIn, he can just send me a link to his profile. If Mohammad is advertising his LinkedIn presence, again, he can provide a link to his profile. Adding in unique usernames doesn't help much in this case.
I was also recently annoyed at being forced to switch to a new system for my credit card, and it's a unified system with all their other cards and banking customers, and still uses "username" (instead of email) as a login, so of course my name was taken. I decided to just append some random characters, and then realized I could just generate my entire username and have been doing that since, when I don't care about identifying myself to others. My password manager saves it, so it really doesn't matter to me.
Even better, if they have your email and use it for password recovery, you can basically turn it into a two step authentication by not saving the password and using password recovery every new time you need to log in. Though, that can get annoying if their password recovery takes a while to send.
This is not a great idea if you have a public profile connecting your username to your email because someone can hack your email.
But you not knowing your password doesn’t hurt your security as far as I can tell.
Your login information can be obtained via keyloggers, network sniffing, phishing scams, malware, malicious employees, and all sorts of other methods.
This is why two-factor authentication is so important: it helps prevent your account from being compromised in the event that your username and password are.
The way I see it, not knowing your password removes some potential threats around managing that password incorrectly, at the cost of increasing the risk of losing access to your account if the recovery mechanism doesn’t work.
It offers some extra security, but very little. It's the digital equivalent of locking your back door but not bothering with the front door.
The comment I originally responded to seemed to think so.
The username isn't supposed to be a secret, social engineering will most likely be very easy.
A sufficiently large random unknown password is actually significantly less likely to be brute forced than the service itself is to be exploited.
So you would have addresses like:
It would only be for receiving.
I know that there are lots of options already: 20 minutes mail, firstname.lastname@example.org or just a catch-all. But this could at least provide somewhat end to end encryption.
Plus it can allow sending and receiving of email and files, if you so choose.
I'm serious but my todo list is quite long.
I was thinking about a dedicated application on the phone for it instead of a regular mail client, mainly to provide an easy interface for managing a large number of accounts, banning hosts, etc. The app could then use a service like Google's Firebase Cloud Messaging, which would wake my app so it could fetch the message. I hope that it is fast enough that a sending SMTP server would not time out.
One downside has traditionally been that backup MX servers were generally much less stringent in their connection level spam filtering/blocking (since the downstream server is generally responsible for that, and they may be a backup server for multiple downstream servers), so it became common for spammers to send directly to lower priority mail servers to take advantage of this and bypass a lot of that active filtering at the eventual destination. Expect a lot of spam to queue up.
In your case, you actually would control the backup MX and the eventual destination (if it indeed is a separate SMTP server), so that's less of a problem. You could just put a pretty harsh timer on the queued mail, and throw it away after 24 or 48 hours. Then again, you could probably do all this almost identically by replacing the SMTP server run on the client device with an IMAP client, and just have delivery end at your server.
Nowhere near as inconvenient as it sounds - It is the sort of service that you would rarely log in to. Mainly when setting up a new server or adding a third party service to your list of environment variables on your server.
I can see that for a service that you would use several times a day (Twitter etc.) then it would be a major PITA.
Even EnvKey that I mentioned above has a session cookie of some sort - I can usually use the app for several days after logging in, even if I close the app - but after that I am prompted to instigate the email with my unique login key.
It works really well for most users although it does have some quirks.
If your account name and associated email is known, it's not really better than a username and password (except that it's delegated to what should be one of your strongest accounts that you protect more diligently), but if the email is not generally known for that account name then it's extra identifying information that must also be known to access the service account.
Also, if you have an a, b, c, or 2 in your first character, you can use any of them in place of that character index to log in.
There's no way to merge the online accounts, even though the banking accounts are merged and I can see all of the financial information from each no matter which login I use.
I found a way to change it and it still worked last time I tried.
You must install Facebook Messenger. I am using iOS, don't know if the Android version is the same.
Keep in mind that your old username will no longer lead to your profile once changed. For me this was exactly what I wanted but for some they might want to not change their username after all due to this.
In the Facebook Messenger app, tap your profile picture in the top left corner. This brings you to a screen with the title "me". Right under your picture it will say "username m.me/yourusername" where yourusername is your actual username. Tap on your username and select "edit username".
Once you've changed your username in the Facebook Messenger app your identifier on Facebook itself will change also so now when people go to your Facebook profile on facebook.com in their web browser they will see your new username in the address bar.
Figuring this out was actually very difficult, as most information online claimed that your username could not be changed.
Because it somewhat seems to me that Facebook also don't want people to change usernames I ask that everyone who reads this keep that secret. HN pages usually don't rank highly on Google so mentioning it here shouldn't matter too much.
If any Facebook employees read this, please either
a) Make it easy for others to find out by updating official documentation, or
b) Make it easy to change from the main facebook.com application, or
c) Forget that you saw my comment.
As I'd like to be able to change my username in the future as I have done in the past.
For example when I send a photo by Messenger it is attributed to my ( n-2 ) name.
One could probably infer something about the state of Facebook's architecture from further study.
ICQ did that. Though it still led to interesting results, because lower numbers were thought to be more valuable, and people were buying/selling those.
Perhaps random numbers with the same number of digits, or UUIDs, may work without such issues. :)
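A minimal sketch of the fixed-width random number idea, assuming nine digits and ignoring the collision check a real signup flow would need. The function name and the default width are made up here.

```python
import secrets

def new_numeric_id(digits: int = 9) -> str:
    """Return a random ID with exactly `digits` digits and no leading zero."""
    low = 10 ** (digits - 1)   # smallest number with that many digits
    span = 9 * low             # how many numbers have exactly that many digits
    return str(low + secrets.randbelow(span))

sample = new_numeric_id()
```

Every ID has the same length and the order of issue is not visible, so there are no "early" low numbers to trade. Nothing stops people from prizing repeating digits, though.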
However my slashdot number, which I've had nearly 20 years, I know nothing more than it begins with 2.
The modern numbers I remember are my mobile phone number, my wife's, and my passport numbers (phone numbers as we've had them over a decade and all of them because I have to write them on forms so much). The only other numbers that spring to mind are my staff number at work (used in various forms, had since 2003) and my bank numbers (needed to log on)
If you use a number a lot, you learn it. If you don't (like usernames which are saved) you forget it. I can barely remember my credit card pin as I use contactless so much, but muscle memory seems to work there.
I was (am?) 72167,3530. Does that make my ID older or newer than yours?
Then people will be buying/selling UUIDs which are easier to pronounce or memorize. People will always consider patterns more valuable.
1) Short number.
2) Repeating digits like say only containing 2 or 3 numbers.
One of these was great, but both? Jackpot.
You could add a third factor: keypad pattern. It never occurred to me to use the keypad to remember a number, TBH, but IIRC one of my friends did care about that. I'm actually frightened by that option in Android, I kid you not; I'm frightened I'll forget the pattern!
Of my own numbers discounting the starting 1 (I personally did not care about that one but I know others did) one ended with 0's and the other one only contained 2 different digits with one being twice the other one. Extremely easy to remember.
UUIDs were also my first idea, but I have the feeling that sharing them (i.e. to invite a new friend) would be cumbersome. I wonder if a new system akin to what3words.com could help there.
Oldschool Battle.net had a limit of 16 alphanumeric characters, with underscores allowed, and that was that. At least for Warcraft 3 that was the case from 2002 for a very long time (until quite recently, last year, when they allowed fancy symbols in usernames).
You also had to login at least once every 3 months or Blizzard purged your account.
Of course, yes. I doubt this problem came up with the original Battle.net accounts :-)
The weakness lies partly in ICQ: they made it easy to find all the people using an @hotmail.com e-mail address and even showed this information. Sure, you could opt out of this feature (IIRC it was called "yellow pages" or something akin to it), but still.
The other part of the weakness is exactly the issue of domain squatting, username squatting, e-mail squatting, or whatever you want to call it. I understand Microsoft wanted to save space on their e-mail servers back in the early '00s, but the former username should be frozen, and its e-mail either bounced or silently dropped to /dev/null or whatever the Windows equivalent is.
Blizzard's WoW has the rule that you can only get a username from an inactive account. An inactive account is an account which did not play the previous expansion. That's their compromise. To be fair, it is not like people use WoW usernames for password recovery.
As for using numbers as usernames: that is what UNIX does under the hood, it is what Facebook does under the hood, it is what Blizzard's WoW does under the hood, it is what T9 converts to, and it is what ICQ did, in contrast to MSN. Turns out people are lousy at remembering a bunch of numbers. So they resort to a 26-character system of letters, or a 36-character system of letters plus numbers. (Some services are more or less strict.) So, no, using numbers as a human-usable UUID is not a solution, but using them under the hood is totally OK.
The real problem is that unique and permanent usernames serve as tattoos (which people later may find to be humiliating or depressing), disadvantage late-comer non-technical users (who will almost never have a good username and almost never will have the same username on two websites), and lead to weird problems with assumptions people make about what usernames even mean (that they are a signal for identity) that are simply not true.
Expiring hotmail addresses have been problematic in many cases, and any unique lookup string will eventually be stored somewhere by someone and assumed still valid later on.
It's not even solved in full for phone numbers, despite everybody knowing they can expire and be reassigned - since long before our own lifetimes.
Do you realize how impractical it is for users to remember these numbers for every site? Until we get to the stage where every non-English-speaking user and their grandma finds a password manager convenient, this proposal won't even pass the laugh test.
Case in point, I operate a service that uses numeric identifiers. Going by the helpdesk queries, our users are more likely to get their email address wrong than their membership number.
And how often do they forget their numbers (rather than getting them wrong)?
It's amazing the results you get from treating your users like competent human beings rather than idiot cattle.
All this snark just to dodge obvious questions about your approach?
Speculation on the potential scalability of this approach seems absurd given that membership numbers have been successfully used by organisations of every scale for centuries.
The concern is not with the scale of the organization, but with the number of organizations a user is a member of, which has exploded since website accounts appeared.
While I don't disagree that there's a lot of useless account creation, I'm still a member of at least an order of magnitude more useful online accounts than I or my parents ever were offline.
I know that whenever I have to contact a service for which I only have a long numeric number I have some reference handy to make sure that I don't fuck it up.
That's really not an issue. It isn't much easier to remember that I'm "John28161" on a busy site. Websites have been offering "I forgot my username" functionality for ages, so as long as you remember your e-mail address, you're fine. Also, just let users bookmark their personal profile page (foo.site.com/user/83755567565) easily and it's solved even if cookies are deleted. Apps won't have a problem either way.
Digits are much easier to spell over the phone than a mixture of letters and digits.
People are used to identifying themselves with a sequence of digits. If you're dealing with the tax office, the water company, the electricity company, or whatever, you get asked for your customer number, your meter number, your reference number, and so on, and usually these consist mostly of digits. Sometimes these identifiers are way too long, or several different identifiers are unnecessarily used, or the same sequence of digits is confusingly referred to by several different names ("customer reference", "account number", whatever), but those are separate problems.
It's not a level comparison though. You don't need those numbers often, and when you do, you have all the resources you need with you, and it's not urgent. You generally need those when you call customer service, which you only do when you're at home and have time to spare. But if every email, messaging/social networking app, bank, credit card company, shopping site, etc. required me to go hunting for a 10+-digit number, I wouldn't have any trouble understanding it, but I would get fed up pretty fast.
You have to remember your UUID? Probably it would be more like a file than a string.
100595964940551841549 isn’t exactly usable for that. Even phone numbers, such as 01573 0677867 are much shorter, and they follow a pattern.
And in either case, someone will end up with huge, unwieldy numbers. By using ASCII identifiers, they’ll be 3 times shorter, though. Facebook’s number is just "WvN1+8Yd" in base64.
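For anyone who wants to check the arithmetic: the sketch below assumes the number is packed into the fewest big-endian bytes that hold it and then encoded with standard base64. The helper names are made up. With the 15-digit ID 100001702987293 it does produce the 8-character string quoted above, which is about half as long as the decimal form.

```python
import base64

def to_base64(number: int) -> str:
    """Encode a non-negative integer as base64 of its minimal big-endian bytes."""
    size = max(1, (number.bit_length() + 7) // 8)
    return base64.b64encode(number.to_bytes(size, "big")).decode("ascii")

def from_base64(text: str) -> int:
    """Inverse of to_base64."""
    return int.from_bytes(base64.b64decode(text), "big")

numeric_id = 100001702987293
short_id = to_base64(numeric_id)   # "WvN1+8Yd"
```

Note that standard base64 uses `+` and `/`, which are awkward in URLs; `base64.urlsafe_b64encode` swaps them for `-` and `_`.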
And at that point, why not allow people to choose identifiers? John.Doe.123 is still much more readable than 100001702987293
This gives every 'username' an effectively unlimited space of identifiers via the number suffix, and it trains users to realize that 'person who controls the account with username X' is not necessarily the same person as 'person who I know on other sites as username X'.
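A toy version of the name-plus-suffix scheme, assuming the suffix is handed out sequentially per name and joined with `#`. Both choices, and the `NameRegistry` class itself, are invented for this sketch.

```python
from collections import defaultdict

class NameRegistry:
    """Hands out 'name#NNNN' handles; the name alone is never unique."""

    def __init__(self) -> None:
        # name -> next free suffix for that name
        self._next: defaultdict[str, int] = defaultdict(lambda: 1)

    def register(self, name: str) -> str:
        suffix = self._next[name]
        self._next[name] = suffix + 1
        return f"{name}#{suffix:04d}"

registry = NameRegistry()
first = registry.register("saurik")    # "saurik#0001"
second = registry.register("saurik")   # "saurik#0002"
other = registry.register("jay")       # "jay#0001"
```

Sequential suffixes would bring back the "low number is valuable" effect mentioned for ICQ, so a real system might prefer to pick the suffix at random.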
Though yes, I agree that once the number of accounts starts going above 1 or 2, a password manager will be required.
It’s not as good as, say, 1Password but it’s more likely to get used. Combine it with the browser or OS level password manager. It’s good enough for grandma, definitely better than “kitten4” that she’s currently using everywhere.
On a tangent, stereotyping this as “grandma” is a bit unfair. Most of my colleagues are college educated males in their 20s, some of them developers. And their passwords are rubbish, with no password manager, and no 2fa.
(And re: the grandma thing: it's nothing specific to grandmas, it's because the moment you suggest your audience is "college educated developers in their twenties" as in your case, people throw the notion of UI/UX out the window and recommend you suggest they compile their own kernel first. It seems you just can't win.)
For most people, writing (good and unique) passwords down in a notepad is a way more secure system than having the same bad password for every account.
Having a botnet guess the random "kitten4" password for a random user account is about as likely as having your purse stolen for the passwords on that note. FWIW, "m" is almost a secure password on a root account with an SSH server that allows password authentication, even if you allow brute-force attacks. Empirically speaking, obviously it's going to fail in the end, but I hope you get my drift.
This is very counter-intuitive. Is the idea that guessing both the username and the password together is much harder than guessing the password when you already know the username?
In the kitten4 example, I would guess most botnets are working from a list of usernames/email addresses that they got from leaks.
> Is the idea that guessing both the username and the password together is much harder than guessing the password when you already know the username?
No, to be clearer no one in the last 6 years has ever tried "m" as a password on my root accounts.
I feel very strongly that there is too much stigma around passwords, kitten4 is a nice password if you use it only once.
That's what my grandpa does. After failing to find his gmail address in it, he went through the "forgotten password" process. Then, after needing it the third time, we found the old password in the notebook, which was now wrong...
I can assure you that the average user wouldn't get above 15 - 20 bits with self selected words. That's often worse than most current passwords.
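For reference, the usual back-of-the-envelope formula is bits = length × log2(choices), and it only holds when every element is picked uniformly at random. The word-list size of 7776 below is an assumption taken from the common dice-based lists; the function name is made up.

```python
import math

def entropy_bits(choices: int, length: int) -> float:
    """Entropy of `length` independent, uniformly random picks from `choices` options."""
    return length * math.log2(choices)

four_random_words = entropy_bits(7776, 4)   # about 51.7 bits
eight_random_chars = entropy_bits(62, 8)    # about 47.6 bits
```

Words a person picks themselves are nowhere near uniform, so the formula overstates their strength, which fits the 15 - 20 bit estimate above.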
Is she a Visual Studio Code developer? Does she need to manage Docker containers?
Security does require new hardware because iOS is leaps-and-bounds better than any other system.
There is no other option. Nothing else comes close.
Were you expecting security in broken systems like Android? Instead of forcing security onto a broken system, just avoid that system?
The first step of security: stop using Android.
And you haven't explained why your grandma is tied to an ecosystem. I'm honestly asking if she's a developer or not?
What is her use case? Why does she need to be on a specific platform?
(Wouldn't the question only make much sense in just one of those cases...? Not sure if I'm missing anything.)
I tried using Google Drive to sync it up, but Drive is useless for this - it doesn't open the file using the right intent on Android ("file type not recognised" or something similar it says, this used to work as well) and the Drive website makes it a pain to upload an updated file even from the desktop using Chrome.
In my case, I use KeePass 2 and KeePass2Android with Google Sync and it works decently well (I would recommend you try this). I would never recommend it to non-technically-minded folks though.
Sync looks to be for Google-domains/business only. In fact Wikipedia says it has been discontinued! I used to sync over owncloud and that worked pretty well, but the provider shut down and I haven't gotten round to setting another up.
If you're on Android, Keepass2Android is an excellent app that implements the input with a special keyboard. This avoids risking your password via the clipboard. It even comes with a no-network-permission version!
During my freshman year of college a particular sandwich shop hired a spokesperson who shared my first name. One thing led to another, and the name of that shop became a lasting nickname.
Unfortunately that spokesperson turned out to be quite a monster, leaving me in a bit of an awkward position on sites that don't allow username changes.
This dilutes your otherwise excellent point. URLs are great when done right and lots of people prefer them to the site's search functionality. But that is entirely orthogonal to identities, which barely ever need to show up in a URL. (Unless treated as permanent and uniquely attached to physical people, which we agree they should not.)
Uh, Facebook doesn't have usernames, and hasn't had usernames for as long as I've been able to be a member.
There's an option to grab a unique identifier for your personal page, so that you become https://www.facebook.com/identifier, but it's completely optional, it's just a vanity thing.
Same for groups, they can grab a unique identifier, or stick to their auto-generated id.
Same for businesses.
Actual usernames typically do all of the above.
First two results are:
Which is the "real one" based just off this information? How do you know that when you message one of them you're getting the actual bank and not a fraudster?
It's not just vanity, people do check these things, not all of them savvy enough to continue researching. Just peek at "safe browsing tips" you'll see from tech rags online and it's pretty clear we do a poor job of educating people about proper vetting online, so you get people instilled with dogmatic understandings of security. ("The URL clearly says Mybank, so it's the real one." "Google actively removes fraudulent websites from the top hit, so it must be the real one", etc)
It sucks, but it is important to try to control for such errors.
They are shown in the address bar of your profile page once set.
Incorrect, I can use my username in the email field and log in.
There are probably people out there with spreadsheets full of service types, account names and passwords of accounts they control that include all the two letter to four or five letter company names, and many celebrity names, just in the case that someone wants to pay for it. The domain name game, just evolved for the current climate.
I mean, it would take me less than $25k worth of my time to build something to automate this, even if I had to get rotating IPs, mobile accounts, and have mechanical turk to solve CAPTCHAs (although with all those features it might be close), and you were offered that for one account.
Also, I don't want to be out of it if they report to Instagram that it's for sale etc.
I joined a Big Dinosaur Company early enough that they were still using mainframe RACF as the system-of-record for authentication, flowing downstream to LDAP. So indeed I received, for example, dlg28 as my ID and stem for e-mail address.
However, after several years the SoR was migrated to Windows LDAP (can't remember its brand name) and it generated 'sensible' IDs for all the newer staff. So someone received JimSmith as ID & e-mail address.
We oldies felt old and uncool! So a project introduced self-selected e-mail aliases for the oldies, which then led to interpersonal conflicts because jsm22 wanted JimSmith@, but the 'new' Jim Smith already had that... But jsm22 felt he had title to it since he had worked there 40 years, etc. etc. So he was given JimBSmith@, which of course led to misdirected e-mail. Hilarity ensued.
I'd rather they had never introduced the long-form IDs at all!
Having had a major fall-out with them fifteen years ago, it always stings when I have to manually enter this user name.
No need to go that far... just look at the URL bar right here. :-)
I have no idea who you are and don’t recognise your username. Was that whole rant just a humblebrag that you’re “internet famous”? Because here at least, no one cares.
Seems like a real anxiety, not just a humblebrag. I’ve felt the same way and my response is to just make up new names all the time.
Yeah so what’s your alternative again?
If this is your belief why have you stuck to a specific name, to the point of signing up to random sites just to take possession of it?
In business you have customer account numbers, bank account numbers, membership numbers, invoice numbers etc. There is no conflation with identity - you are not your account with your bank, gym or stationery supplier. It is only because of internet forums and login usernames that we have even gotten to this state of affairs in the first place.
Usernames (or internet aliases in general) were a part of internet culture routinely mocked by mainstream culture. People these days use services in spite of usernames rather than because of them. The president of the United States has 'real' in front of his name. Think about that. In no other medium do we have people asserting that they are the genuine person they are claiming to be. It's tautological.
People weren't just tapping random key combos either. They were coming up with their own identities because that's what people like to do.
I can't believe these comments that think people are itching to be assigned a #Reference ID on the internet. For example, you also think the mainstream used internet message boards.
Everything else you've listed is part of an identification procedure that happens mostly between you and a machine. Not between two humans (taking out the call center semi-human who needs to type that into the machine first).
Depends what's associated with the account, no?
If it's something like Steam, your username might be tied to hundreds of dollars of purchases.
If it's Slashdot, you'll lose your sweet low user ID.
If it's StackOverflow, you'll lose your various reputation scores.
It's also likely that they used the login username as a primary key, which means it's unlikely to be able to be changed anytime soon.
To be honest, given user names are just an arbitrary reference, you could probably also include phone numbers, IP addresses, social security, national insurance and house numbers into the list of prior art as well.
Not that I'm advocating the use of numbers instead of names. I think Twitter gets it right: they give everyone a fixed number, but you can assign yourself a name, which can change. Most of the time people choose not to, but the option is still there.
> The reality is that a relatively small handful of privileged early adopters get good usernames that match their identities, and everyone else gets screwed. These identifiers then act like tattoos that you got a long time ago and are stuck with for the rest of your life: people end up reminded every day of a sport they can no longer play due to an injury ("hockeystar") or loves lost ("iheartjessie"), attached to a joke that is no longer funny or to a thought that they found adorable as a 13 year old (when you are legally asked to "choose a username": a modern era coming of age scenario) but which adults find inane, or to a nickname that means something different than you realized to some people and now can't change.
The best happy medium is a username that can change. But so many places make them static (sometimes for reasons no better than that they just made "username" the foreign key in their users table).
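To make the key point concrete, here's a minimal Python sketch of the alternative: key everything on a fixed numeric ID and treat the username as an ordinary, unique, editable attribute. `UserStore` and its methods are names I made up for illustration; this is not any real site's schema, just an in-memory toy under the assumption that nothing else stores the username as a reference.

```python
from itertools import count


class UserStore:
    """Toy in-memory store: the numeric id is the key, the username is just data."""

    def __init__(self):
        self._next_id = count(1)   # hypothetical id generator: 1, 2, 3, ...
        self._name_by_id = {}      # user_id -> current username
        self._id_by_name = {}      # username -> user_id (uniqueness index)

    def create(self, username):
        if username in self._id_by_name:
            raise ValueError("username taken")
        user_id = next(self._next_id)
        self._name_by_id[user_id] = username
        self._id_by_name[username] = user_id
        return user_id

    def rename(self, user_id, new_username):
        # Only the two indexes change; anything that stored user_id is untouched.
        if new_username in self._id_by_name:
            raise ValueError("username taken")
        old_username = self._name_by_id[user_id]
        del self._id_by_name[old_username]
        self._name_by_id[user_id] = new_username
        self._id_by_name[new_username] = user_id

    def username(self, user_id):
        return self._name_by_id[user_id]

    def lookup(self, username):
        # Returns None when nobody currently holds that username.
        return self._id_by_name.get(username)
```

Purchases, reputation, posts and so on would reference the ID, so a rename from "hockeystar" to something else costs nothing and frees the old name for someone else. Whether a freed name should be immediately reusable is a separate policy question that the sketch doesn't address.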
> I come from the ICQ time
Likewise; that's why I used it as an example ;) I think comparing ICQ to IRC is a bit disingenuous as they occupied slightly different use cases.
Plus, at the risk of repeating myself, I'm not arguing against usernames on the whole; just systems which have usernames that cannot be changed.
Most users have at least two email addresses because most of their mail is routed through email addresses like "PartyChick88@hotmail.com" but they don't feel comfortable putting that on job applications and medical forms.
It is a statistical certainty that people have missed job opportunities and subsequently defaulted on mortgages because they sent off a bunch of job applications on their "business" email address and then forgot to check it because they don't use it much.
One of the more common office security failures is to have your email client auto-fill to someone's personal account instead of their company-issued account, resulting in sensitive documents leaving the auditable environment of the office email server.
Now for sure, it's not exactly up there with global warming and North Korea, but I'm not sure I'd call it a "no-problem". It's a fundamental UX failure that we're only just now starting to see get fixed with email address aliases becoming a more widespread feature, and even that is just a patch. We've all gotten used to it, but that doesn't mean it's not a problem.
I do (and have done) quite a lot of IT support for friends, family, their friends and so on, and have never heard of anything like that. I'm also sure that the domain @hotmail.com would be enough to not get you a job at certain businesses.
> One of the more common office security failures is to have your email client auto-fill to someone's personal account instead of their company-issued account
So... you are sending private emails from your business account? Again, that's not a problem with nicknames or usernames. The problem is your behavior. That is the root cause here, and cosmetic changes won't solve it.