On Sept 30, 2014 I sent two emails to Dr. Blum explaining what I believed was the weakness with the approach he was advocating. He never responded (or somehow I never saw a response).
Here is a snip from the first email:
Begin ---%<------------%<---------------------------------
As I understand it, the algorithm, expressed in Python is:
#########################
import sys
from string import ascii_uppercase as alphabet
# ABCDEFGHIJKLMNOPQRSTUVWXYZ
LETTER = "31415926535897932384626433"
NUMBER = [0,2,4,6,8,1,3,5,7,9]
def f(ch):
assert ch in (alphabet + "0123456789")
if ch in alphabet:
return int(LETTER[alphabet.index(ch)])
if ch in "0123456789":
return int(ch)
def g(n):
return NUMBER[(NUMBER.index(n) + 1) % 10]
def pw(s):
digit = g((f(s[0]) + f(s[-1])) % 10)
result = [digit]
for c in s[1:]:
digit = g((digit + f(c)) % 10)
result.append(digit)
return result
print(sys.argv[1], pw(sys.argv[1]))
#########################
Consider a few results from encryption and what it presents to the adversary:
pw(“ABC”) == 928
pw(“ABCABC”) == 928362
If “ABC” is a seed to the algorithm, then any seed that shares a prefix and a final character will have information leaked, sometimes enough to reveal the entire generated password for a different seed.
It’s actually worse than this. For example, if the adversary knows that:
then the adversary knows that the mapping from the character C to an integer is the same as the mapping from character T. Using the terminology presented in the lecture this is
f(“C”) == f(“T”)
and from this adversary can determine information about the result of the password algorithm on other seeds.
pw(“BBT”) == 717
pw(“B.*T”) == 7.*
Because the algorithm uses a recurrence that generates one ciphertext character from the result of preceding ciphertext character, the adversary can make further inferences:
pw(“BAT”) == 728
which implies that if the preceding ciphertext is 7 and the current seed character is A that the resulting ciphertext will be 2. Consider
End ---%<------------%<---------------------------------
My second email on Sept 30, 2014 contained the solution to a challenge he proposed in the video of a lecture on the method he gave:
Begin ---%<------------%<---------------------------------
On one slide during your recent lecture, you present a bit of a challenge, and I noticed that by making use of just the four plaintext/ciphertext pairs:
One can conclude that the permutation of [0,1,2,3,4,5,6,7,8,9] that controls the mapping g() must be one of the cycles:
6159073428
8106279354 <- this turns out to be the one
In fact, with a bit more work one can deduce that it is the second by making use of the additional plaintext/ciphertext pair (which appears on the same slide):
The details of this decryption aren't very interesting, so I won’t go into detail. I didn't need to use a computer, just paper and pencil. The important observation was that from BRAIN -> 06076 one knows
g(0 + f(R)) -> 6
and from TRAIN -> 27732 one knows
g(2 + f(R)) -> 7
thus if g(k) -> 6, g(k+2) -> 7.
This means that map(g, [0,1,2,3,4,5,6,7,8,9]) is some rotation of the list [_,_,_,_6,_,7,_,_,_,_] where 6 and 7 are at two locations apart.
Every letter, say 'A', which appears in more than two places in any of the plaintext/ciphertext pairs reveals information about g(). So BRAIN -> 06076
and TRAIN -> 27732 also reveals that
g(6 + f(A)) -> 0 and g(7 + f(A)) -> 7
Therefore, if g(k) -> 0 then g(k+1) -> 7. Thus, we can now conclude that map(g, [0,...,9]) is some rotation of [_,_,_,_,6,0,7,_,_,_].
In this fashion I concluded that map(g,[0,...,9]) was some rotation of
[2,9,1,3,6,0,7,5,8,4]. I knew that g()'s corresponding permutation was a circular permutation with a single cycle because that was a part of the system that makes it easier to memorize.
In general, of course, there could be ten possible mappings, one for each rotation. However, in practice some of these rotations
won't produce a permutation with a single cycle. This isn't really a problem because ten possible mappings for g() are still easy to validate in the next phase where we derive the mapping f(). In this particular case, there were only two possible circular permutations making it easy to decrypt the system with just paper and pencil.
The next step is to try out each of the possible g()'s determined above on the plaintext/ciphertext pairs. For example, BRAIN -> 06076 implies that
g(0 + f(R)) = 6
applying the inverse map of g() to both sides
0 + f(R) = 0
so
f(R) -> 0
In this manner the entire decryption can be performed.
End ---%<------------%<---------------------------------
This method fails as soon as you have to change a password:
- One of the sites is compromised
- One of your devices is stolen/lost and you have to change some passwords
- One of the sites has a password expiration policy
Pretty soon you end up with multiple password schemes and you're in precisely the same situation as before, wondering which password goes with which site, only this time you have to perform algorithmic dances in addition to memory feats.
Even if none of the sites are compromised; even if your device is not stolen or lost; even if the sites don't use password expiration it doesn't work very well because some sites are just plain stupid with their password restrictions. Some of the things you'll face:
• Passwords that prevent double characters within the password: not ideal when using a scheme.
• Passwords with a minimum/maximum length: I've seen sites with a minimum of six characters and other sites with a maximum of eight characters. That means, to cover them all, you really have to choose a password of exactly six, seven or eight characters and even then it's no guarantee.
• Sites that prevent certain punctuation and other sites that require punctuation. That means you have to remember which sites require which.
• Likewise, sites that require/prohibit capital letters or require a mix of case.
It's a nightmare doing anything other than recording them somewhere. And recording them somewhere is not great either as you always have to carry that thing with you and, if it's electronic, remember to charge it.
Maximum length tops my list of favorite restriction, because it strongly suggests that they're keeping my password in the clear somewhere - or they used to, and just never got around to changing it.
The concept of illegal characters is tied for second. If you're handling it right, there should be no such thing - yet so many sites continue to set arbitrary limitations for reasons that aren't clear even to them.
Kind of wish I could just generate private keys for each site and not think about it anymore. (Yes, I know there are some convoluted ways to do this, but they're not particularly usable.)
THe best part is that even so many new and/or modern sites enforce the same arbitrary requirements.
I agree and was just complaining about this to a coworker. I tried to use a passphrase somewhere, I think paypal? and it made me make my password between 6-10 characters...
A few months ago I set up an account with American Express. I was able to log in to their website from a desktop, but the Android app wouldn't let me log in. Eventually I would up talking with their tech support, having been elevated twice, and I learned that their passwords are limited to 20 characters, as well as NO UPPERCASE CHARACTERS! For some reason, I had been able to make my long, uppercase-filled password on the desktop site, but the app was lower()ing and truncating my password.
To add insult to injury, when I tried to tell the gentleman on the line what an egregious security lapse this was, he cut me off with a terrifying and patronizing explanation: "strong passwords don't matter anymore, because nowadays we have _encryption_".
I totally get what you mean, but setting an upper limit on fields is generally a good idea. Setting the limit too low sucks, but you don't really want to accept, e.g. 1MB passwords.
Some browsers might not want to send the password in a POST request. Effect could be that you can change your password from browser X, but cannot login later from a different browser or after you upgraded that browser. If you are really unlucky, the browser you change it from chops off characters from the password.
Also, chances are users will not type their 10MB password into an input field, but try and paste it in. Again, the browser may silently discard a few MB of that string, or beep while you have your sound muted.
That's no reason to set max length at something small, but setting it to 1024 or so shouldn't limit anybody as long as it allows for more entropy in the plaintext password than will be in the hash you store.
Or worse, if they ask you to type your password into a phone keypad (eg. the 9 button stands in for [9w-zW-Z] and 0 for all punctuation). Not only does this mean they're storing your password in plaintext, but reducing the character set to just 0-9.
A major financial company does this, of all the terrible places.
It doesn't guarantee it, but it does imply it pretty heavily.
The alternative is that they're smart enough to come up with a clever scheme like hashing your password in both its raw form and when converted to 0-9, but not smart enough to realize how converting your password to 0-9 makes it vastly less secure.
No, it's easy for them to convert your password to 0-9 when you enter it on a regular keyboard, before hashing it. It does make it vastly less secure, yes, but it doesn't imply plaintext storage.
Isn't that what I said in my second paragraph? As I said, for that to happen, you need people smart enough to come up with that scheme but not smart enough to realize how insecure it is. Possible but unlikely. "Imply" does not require a complete guarantee.
Does it actually make it less secure though? Consider the situations where each password is used:
1. Your password is required online where the full password must be entered and the numerically reduced password is not accepted. This results in no loss of difficulty in the password.
2. Your password is required over the phone where the password is reduced to a T9 password. Despite the fact that the solution space has been cut down significantly (by 33%^n), the password cannot be broken any quicker because the bottleneck for this case is not computational power but rather the phone system. You can brute force all of the passwords in X seconds, but this is irrelevant because you can't try more than 1 password per call and you can't call more than Y times per second.
Good points. I would still say yes, although it's not a particularly huge problem. One scenario where it could be problematic is if the password database is stolen and nobody notices. They could then potentially brute force your numeric password and then access your account by phone.
In the universe of banking this is pretty small potatoes, though. My bank uses a case insensitive comparison for passwords (!), limits them to 8 characters (!!), and silently truncates anything over the limit (!!!!!!!!!!!!!!!!!!!!!!!!!!).
They say they're fixing it. But it sure is taking a long time.
It's not all that bad, seriously. I've been using an algorithm-based scheme across probably hundreds of sites for probably 10 years.
There are 3 'tricks':
1. First, use several different initial functions for different levels of security, each with different levels of complexity (i.e. f1(user,domain), f2(user,domain), etc).
2. Then, use a function for password requirement rules (i.e. g(f(user,domain),rules)).
3. Finally, use a final function for rotating passwords (i.e. h(g(f(user,domain),rules),rotation)).
So, all together, maybe 5 different algorithms, each of different levels of complexity.
You'd think that this would be very difficult to manage, but it hasn't been in practice. Very, very few times do I pick the wrong initial algorithm when trying to derive my password. And, in that case, I can quickly iterate to the correct password within 3-4 tries.
Most of the time I get it on the first try. Sometimes, I get it on the 2nd-3rd try. Very, very rarely (< 1%), I cannot derive it within 5 tries. And, at that point, I just do a password reset (and rotate the password using the h function).
Previously on HN there was a MasterPassword[1] app that solves the issues changing passwords, too short site names as well as doing computation in head.
"“But James,” you protest, “there are many best practices for choosing passwords!” Yes, I am aware of the “use a vivid image” technique, and if I lived in a sensory deprivation tank and I had never used the Internet, I could easily remember a password phrase like “Gigantic Martian Insect Party.” Unfortunately, I have used the Internet, and this means that I have seen, heard, and occasionally paid money for every thing that could ever be imagined. I have seen a video called “Gigantic Martian Insect Party,” and I have seen another video called “Gigantic Martian Insect Party 2: Don’t Tell Mom,” and I hated both videos, but this did not stop me from directing the sequel “Gigantic Martian Insect Party Into Darkness.” Thus, it is extremely difficult for me to generate a memorable image that can distinguish itself from the seething ocean of absurdities that I store as a result of consuming 31 hours of media in each 24-hour period."
Yes, they have a single point of failure -- but, more importantly, this is the only point of failure.
Write your password on a piece of paper and store it safely (what constitutes 'safely' may vary. For most people, an envelope in the bottom of a drawer is plenty safe.). Put the actual password database on Dropbox and make sure it's replicated in a couple of locations.
> they're not the 'ultimate solution'
Nobody said that, so your quote-marks are out of place. The GP said that there is no better solution today.
Negative statements have less information than positive ones.
I don't need a backup, because my password manager syncs with my phone and tablet. Loosing those two and my computer simultaneously would put me in really big trouble, but statistically I'm much more secure than transmitting non-truly-random passwords across the Internet.
Using two-factor authentication with your password manager helps mitigate the risk. Online managers such as lastpass only keep encrypted versions of your passwords, and they do not know your master pw to unlock that vault, so even if they get breached (which has happened), the attacker just has a bunch of encrypted passwords.
Any scheme I'm going to replace my current password scheme with (whatever that might be) must be both secure and convenient. A scheme which I must invest time into 'practicing' is not convenient.
What's the central point of failure for my 1Password vault?
It's stored on four of my devices. It's stored on Dropbox as well, but a compromise of Dropbox won't give the attacker anything because it uses secure crypto and I have a strong master password.
Share your master password with your friends. You can make it so that, say, 20 out of your 50 friends are required to piece together the master password.
If you are afraid of breaches you can use an offline password manager. Data being lost remains a problem, but this is the case for all of your files so it should be backuped the same.
You could always just pick a number that you want to use and repeat/trim the name? So if you picked 9 you'd use AMAZONAMA as the base for amazon, but HACKERNEW for hacker news?
Hmm, did I use "hacker news" or did I use "ycombinator" or did I use "news.ycombinator" or....?
You have to be specific when you pick the algorithm if it's to work the way the author suggests, and preferably something that is not easily shifted, such as the domain name and not the site's title.
How long is "a long time"? Because over the year or so I gave sitename-passwords a try it popped up as an issue way more than 3 times, even after I started ignoring subdomains and instituted rules about always trying to use the main domain for big companies. There are tons of systems out there that use cross-domain (notably both of my banks and both of my schools), hidden-domain (i.e. log into an app or device where the parent domain isn't immediately obvious), or changing domains. Of course, these issues were nothing compared to trying to remember all of the variations I added to sate the enormous variety of conflicting password requirements (especially novelty requirements and passwords you don't get to choose).
I was in complete denial about how bad the situation was until I encountered some light teasing from acquaintances who didn't even know me that well -- just well enough to know I couldn't ever remember my passwords. I started keeping tick marks on my calendar and found a ~75% success rate, although the 75% was composed of the handful of logins I used every day and the 25% was "everything else" so the reality was that the password derivation system was failing for a strong majority of passwords.
I gave up and started using lastpass. No regrets, but many positive surprises: that time their network was breached but it didn't matter because they didn't use shit encryption, the automated password reset feature for big-profile leaks (they give you a "todo" list which is often as simple as clicking a single button next to each item), the ability to easily store serial numbers, "verification questions", and other nonsense, a general lack of guessing passwords several times before success, and an ability to dramatically up the cross-entropy of my passwords.
I might move from lastpass to a physical repository some time. But I am never, ever going back to a password derivation scheme.
I wish you luck. Or an eidetic memory. You'll need it.
I've used it for probably 6-8 years now, first memory of the base password is well around 10 years.
It is not very strong, I don't have some lookup table or complicated stuff just some small variations for each site.
I've started using 1Password as of late though, but that's mainly because I sometimes want to store notes and whatnot for websites and this makes that convenient.
Yeah I did mull over whether to use news.ycombinator, but I call it hacker news so that's what I went for - I can't really think of a website that I have multiple names for, so not sure that'd confusion over the name would really be much of an issue
This suffers from the same problem as the article's algorithm: it is fine until you need to change one of your passwords. Say a site gets compromised, or for whatever reason you need to change your password. Now what? You need to change all your passwords in order to be able to use this method, or you need to remember one different password, or remember that you need to do <website>.2 or something. Either way, you're back to relying on memory.
> it is fine until you need to change one of your passwords
For me, this happens less often than once a year. And then it is a good idea to change all passwords anyway.
The alternative is to store passwords somewhere in a password manager. However if this storage gets somehow lost/compromised ALL of your passwords get lost/compromised.
To deal with changed passwords and password restrictions,
I use a page with of mapping for each issue to an updated base url or changes to be applied to the forged password.
Because your bank probably limits you to 10 characters or something, and your insurance company requires at least one special character, and now you have to remember a bunch of special cases and you start wanting a password manager again.
Schemes involving lookups into some reasonably sized random table/matrix are probably the most secure thing that is practical with only your head or only pen and paper. Similar system is/was used by various armed forces for both authentication and encryption of short messages (with the matrices being larger and rotated pretty often and thus not memorized).
I've often wondered why login systems don't simply rely on the same system they use for password recovery. Instead of logging in with a username and password, why not request a login token that is emailed to me. I then use that link to access the site and never really have to think about passwords.
It replaced all my alternatives and I don't have to think anymore about passwords. It saved me a lot overseas and doing a new fresh install in my computers is not painful anymore.
If you're security-conscious enough to be taking part in this discussion, you should probably be wary about typing your passwords into someone else's computer in the first place.
Article with bad advice (not controversial - just plain bad), agreed by every single person in this conversation; yet in the front page of hacker news.
It's much better to use
reasonable_passphrase + "-" (or other special character permitted in most passwords like '.' or ' ' or ',') + site_name.
So "Superdonkey11_amazon" and "Superdonkey11_dropbox" would be strong passwords, where compromising one to a password database leak would only jeopardize other passwords if a human would pick out your password and think about how it applies to other services you use.
If you have to change your password with the site just cycle through a couple root passphrases. You now have salted your password per site in a human-memorizable way without some weird algo ritual to access every password.
"Superdonkey11_amazon" and "Superdonkey11_dropbox" would be strong passwords, where compromising one to a password database leak would only jeopardize other passwords if a human would pick out your password and think about how it applies to other services you use.
It's easy to write a script that looks for "dropbox," "Dr0pbox," etc and replaces them with "twitter" and "Tw1tter" respectively.
This is not the easiest way off having secure passwords that you don't have to store. Yes, there are password managers but if you don't want to store/save your passwords anywhere you could use pwdhash which is based off MD5.
Friend of mine actually built one off SHA1 and it's all open at https://github.com/simontabor/pw/ or www.pwapp.io. It's 40 chars so much much better than pwdhash (but that's the original I guess).
Something I have always pondered,...would it be more secure to request a password after the user has been identified by the system?
For example,...
1. User clicks login
2. Webcam uses facial recognition to identify the user.
3. The identified user is requested to enter their password.
In this case, I think, it is harder to impersonate the real user. I am no expert but would interested to know if anyone can see any obvious flaws or if something similar exists?
A face is a username, not a password. I can take a picture of you and hold it up in front of the webcam. And my picture will work every time unless you decide to get surgery.
Exactly,...the user is identified and then requested to enter a password. You are identifying the user, then you proceed to ask them a password. I.e. Identifying the user is something you are, then asking them for a password is something you know.
I'm confused. I never stated that a users face should be used as a password. I was suggesting that their face (or other bio-info) be their username. Am I missing something?
I do something similar, where i build passwords out of words or ideas and if i forget it I have a way of going back to it, albeit with some work. I don't suggest tying it to the website name because changing your password becomes difficult.
Here is a snip from the first email:
Begin ---%<------------%<---------------------------------
As I understand it, the algorithm, expressed in Python is:
Consider a few results from encryption and what it presents to the adversary: If “ABC” is a seed to the algorithm, then any seed that shares a prefix and a final character will have information leaked, sometimes enough to reveal the entire generated password for a different seed.It’s actually worse than this. For example, if the adversary knows that:
then the adversary knows that the mapping from the character C to an integer is the same as the mapping from character T. Using the terminology presented in the lecture this is and from this adversary can determine information about the result of the password algorithm on other seeds. Because the algorithm uses a recurrence that generates one ciphertext character from the result of preceding ciphertext character, the adversary can make further inferences: which implies that if the preceding ciphertext is 7 and the current seed character is A that the resulting ciphertext will be 2. Consider End ---%<------------%<---------------------------------My second email on Sept 30, 2014 contained the solution to a challenge he proposed in the video of a lecture on the method he gave:
Begin ---%<------------%<---------------------------------
On one slide during your recent lecture, you present a bit of a challenge, and I noticed that by making use of just the four plaintext/ciphertext pairs:
One can conclude that the permutation of [0,1,2,3,4,5,6,7,8,9] that controls the mapping g() must be one of the cycles: In fact, with a bit more work one can deduce that it is the second by making use of the additional plaintext/ciphertext pair (which appears on the same slide): So now we know that With g() in hand, it is short work to build up the mapping of f(). For these five words, the letters involved are A, B, D, G, I, N, R, and T. Notes on decryption ===================The details of this decryption aren't very interesting, so I won’t go into detail. I didn't need to use a computer, just paper and pencil. The important observation was that from BRAIN -> 06076 one knows
g(0 + f(R)) -> 6
and from TRAIN -> 27732 one knows
g(2 + f(R)) -> 7
thus if g(k) -> 6, g(k+2) -> 7.
This means that map(g, [0,1,2,3,4,5,6,7,8,9]) is some rotation of the list [_,_,_,_6,_,7,_,_,_,_] where 6 and 7 are at two locations apart.
Every letter, say 'A', which appears in more than two places in any of the plaintext/ciphertext pairs reveals information about g(). So BRAIN -> 06076 and TRAIN -> 27732 also reveals that
g(6 + f(A)) -> 0 and g(7 + f(A)) -> 7
Therefore, if g(k) -> 0 then g(k+1) -> 7. Thus, we can now conclude that map(g, [0,...,9]) is some rotation of [_,_,_,_,6,0,7,_,_,_].
In this fashion I concluded that map(g,[0,...,9]) was some rotation of [2,9,1,3,6,0,7,5,8,4]. I knew that g()'s corresponding permutation was a circular permutation with a single cycle because that was a part of the system that makes it easier to memorize.
In general, of course, there could be ten possible mappings, one for each rotation. However, in practice some of these rotations won't produce a permutation with a single cycle. This isn't really a problem because ten possible mappings for g() are still easy to validate in the next phase where we derive the mapping f(). In this particular case, there were only two possible circular permutations making it easy to decrypt the system with just paper and pencil.
The next step is to try out each of the possible g()'s determined above on the plaintext/ciphertext pairs. For example, BRAIN -> 06076 implies that
g(0 + f(R)) = 6
applying the inverse map of g() to both sides
0 + f(R) = 0
so
f(R) -> 0
In this manner the entire decryption can be performed.
End ---%<------------%<---------------------------------