

Storing Your Data Securely: A Primer - ShaneWilton
https://www.tinfoilsecurity.com/blog/store-data-securely-encryption-hashing-how-to-guide

======
ctz
Not related to the article, but the source:

Having my email address on my blog is a vulnerability. Another vulnerability
is named 'Rudimentary Scan', meaning 'we looked, and your site has no client-
side vulnerabilities'. These issues mean my blog is 'borderline unsafe'. Wut.

Neither of these is a vulnerability; to call them such does a disservice to
your customers' understanding of technical language.

~~~
borski
Sorry, perhaps it's a little unclear.

Rudimentary Scan means you ran a scan from our homepage, pre-verification, and
we couldn't run our full suite of modules. All that is saying is that now
since you're verified you should re-run a scan which will look for quite a bit
more. We figured putting it there would be the best way of not giving a false
sense of security from our 'rudimentary scan.' We make it 'borderline unsafe'
so as to prevent that false sense of security. I can totally understand your
confusion, and we should make this clearer. Ideas / feedback are always
appreciated. :)

The email address is an informational vulnerability, which is dismissible.

Does that clarify things? (cofounder of Tinfoil)

~~~
ctz
If I were you, I would put the full scan at the end of the setup workflow
(straight after verification), so that by the time the user gets to their
report, you have full results for them.

~~~
borski
Feedback appreciated. We're rethinking the whole flow, and we'll definitely
try to make sure it's clearer. The intentions were good, I promise. :)

------
atmosx
I'm not a security guru, but the most common scenario (even today, when big
data and NoSQL/graph DBs thrive) is a website running on a SQL database (with
some additional memcached/Redis or Mongo?).

Say I encrypt the entries... Since the database runs in local host, if the
host is compromised and subsequently 'rooted', what exactly does database
encryption offer?

The intruder will sniff/find the encryption key(s), since the key must be
somewhere inside the application or in memory (don't know how, but I'm sure
it's possible) in order to decrypt data on the fly. The way I see it, it's
just added complexity for the admin with no gain.

Is there any way of defending an SQL database even when the host is
compromised?

~~~
chacham15
Simplest solution to fix this is to change "Since the database runs in local
host". Otherwise, it depends on the extent of the compromise. If the attacker
got root access, then all of the data is compromised. Note: you can minimize
the effects of this compromise by doing one of a few things: storing the hash
of data (e.g. with passwords) so that the original data is still (relatively)
safe, or you could also have the data encrypted with another key (e.g. if it
is a credit card number) that does not reside on the local machine. If the
attacker has not gotten root access, you can use different users to separate
what potential data the attacker has access to.

~~~
pyre
I think that it's probably safer to assume that if the attacker controls a
local user account, escalating that to root privs "isn't that difficult." I
see such comments all of the time here by people that should know about
security.

Separating concerns by user account has its uses (e.g. sandboxing an app from
accidentally screwing with files it doesn't need access to), but I don't think
you necessarily want it as the linchpin of your encryption strategy.

~~~
chacham15
That is not a fair assumption. The reason being that you know nothing about
the type of vulnerability that they are exploiting. For example, if the site
has a SQL injection attack, user separation in the database would prove quite
useful.

> but I don't think you necessarily want it as the linchpin of your encryption
> strategy.

It's not about having one thing as "the linchpin". In fact, you should not rely
on any single thing, but rather separate out pieces of information so that
no part has more than it needs. The test user should not have access to the
production database and vice versa. In sum, you should use multiple things to
defend yourself, not just one.

------
dopamean
A semi-interesting, completely off topic note on something I just learned
after walking the earth for 28 years. The word "primer" as it is used here is
actually pronounced primmer and not PRIMEr. Apparently the origins of both
words are pretty unrelated.

~~~
borski
OED accepts both pronunciations. Apparently "primmer" is the British way of
pronouncing it.

------
itistoday2
> _These functions belong to a family of functions known as key derivation
> functions: they use hashing to produce a digest which is suitable for use as
> a password hash._

KDFs are not for storing password hashes; they are for _key derivation._ This
is a subtle point: the main thing is to use KDFs as KDFs (to derive keys that
are then used to encrypt data). Details are in the first link here (the second
is about common mistakes when using scrypt):

\- [http://blog.ircmaxell.com/2014/03/why-i-dont-recommend-scrypt.html](http://blog.ircmaxell.com/2014/03/why-i-dont-recommend-scrypt.html)
(bad title, he's referring to pass storage, not KDF usage)

\- [http://vnhacker.blogspot.com/2014/04/fairy-tales-in-password-hashing-with.html](http://vnhacker.blogspot.com/2014/04/fairy-tales-in-password-hashing-with.html)

Another consideration is plausible deniability (PD) for situations where
you're compelled to disclose your password(s). Our company writes Mac
encryption software that specializes in this (and it uses scrypt). Here's a
list of tools that offer PD (ours is called Espionage):

[https://en.wikipedia.org/wiki/Deniable_encryption#Software](https://en.wikipedia.org/wiki/Deniable_encryption#Software)

I compared Espionage's PD to TrueCrypt's on reddit:

[http://www.reddit.com/r/security/comments/2b5icu/major_advancements_in_deniable_encryption_arrive/cj24a1n](http://www.reddit.com/r/security/comments/2b5icu/major_advancements_in_deniable_encryption_arrive/cj24a1n)

~~~
tptacek
The first of these two sources suggests that with typical settings, due to a
quirk in the way scrypt works, it's less secure than bcrypt.

The second criticizes an API design with the scrypt code.

Neither makes a case that password-based KDFs are "not for storing password
hashes".

Using a good password KDF as a password authenticator is a fine decision.

~~~
itistoday2
> _The second criticizes an API design with the scrypt code._

Yes, I didn't claim otherwise.

> _Neither makes a case that password-based KDFs are "not for storing password
> hashes"._

The first one did quite explicitly for scrypt: "And that's why I don't
recommend it for password storage."

> _Using a good password KDF as a password authenticator is a fine decision._

Apparently not so for scrypt, and scrypt is the best KDF I'm aware of (heard
something about yescrypt being better but haven't had a chance to look into
it).

I don't see why a KDF is needed for password storage. What's wrong with SHA256
+ random salt? If for some reason you want a slow hash function, use the salt
MOD some constant to do several rounds of SHA256 (though I don't see why that
would be necessary).

~~~
pbsd
You mention yescrypt as an example of a good KDF, but its main design goal is
to be a password hashing function!

The main difference between a password-based KDF and a password hashing
function is that the KDF usually needs to support variable-length output.
Otherwise they perform the same task: take a secret string with low entropy
(s bits) and stretch the cost of breaking it to 2^(s + w) for some work factor w. In
other words, a KDF is a superset of a password hashing function, and it's OK
to use it as one.
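
An illustrative sketch of the "superset" point (not from the original comment): the same password-based KDF can derive a key of any requested length or produce a fixed-length password authenticator. PBKDF2 from Python's standard library is used here only because it is readily available; the helper names, the iteration count, and the output lengths are assumptions made for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware


def derive_key(password: bytes, salt: bytes, length: int) -> bytes:
    """Use PBKDF2 as a KDF: the caller chooses the output length."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS, dklen=length)


def hash_password(password: bytes, salt: bytes) -> bytes:
    """Use the same KDF as a password hash: fixed 32-byte output."""
    return derive_key(password, salt, 32)


def verify_password(password: bytes, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash and compare it in constant time."""
    return hmac.compare_digest(hash_password(password, salt), expected)


salt = os.urandom(16)
stored = hash_password(b"correct horse", salt)   # password authenticator
key = derive_key(b"correct horse", salt, 64)     # variable-length key material
```

In a real system the salt used for stored authenticators and the salt used for encryption keys would normally be kept separate; they share one here only to keep the sketch short.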

~~~
itistoday2
> _You mention yescrypt as an example of a good KDF, but its main design goal
> is to be a password hashing function!_

I mentioned I'd heard it was a good KDF (not a good password storage
function). I didn't know that was its main purpose. Thanks, will need to read
more about it.

Still, there are diminishing returns on how much w can be increased while keeping
the KDF useful, right? There's only so much magic they can do for a weak
password.

The article on scrypt being a poor candidate for password storage stuck out in
my memory. I haven't seen or heard of the practice of strong hash function +
random salt for password storage falling out of favor, so I didn't look into
using other KDFs for that purpose.

EDIT: it does seem like a trade-off. This article seems to have a good
explanation of it:
[https://crackstation.net/hashing-security.htm](https://crackstation.net/hashing-security.htm)

    
    
        If you use a key stretching hash in a web application, be aware that you
        will need extra computational resources to process large volumes of
        authentication requests, and that key stretching may make it easier to run
        a Denial of Service (DoS) attack on your website. I still recommend using
        key stretching, but with a lower iteration count. You should calculate the
        iteration count based on your computational resources and the expected
        maximum authentication request rate. The denial of service threat can be
        eliminated by making the user solve a CAPTCHA every time they log in.
        Always design your system so that the iteration count can be increased or
        decreased in the future.
    

I do stand corrected about KDFs not being OK for password storage. Seems like
they can be overkill in some situations, but in general do a good job of
protecting passwords.
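
One way to follow the quoted advice about keeping the iteration count adjustable is to store the count next to each hash. The sketch below is an assumption-laden illustration, not the quoted article's code: the `iterations$salt$hash` record layout, the function names, and the use of PBKDF2-HMAC-SHA256 are all choices made for the example.

```python
import hashlib
import hmac
import os


def make_record(password: bytes, iterations: int) -> str:
    """Return 'iterations$salt_hex$hash_hex' so the cost travels with the hash."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def check_record(password: bytes, record: str) -> bool:
    """Verify using whatever iteration count the record was created with."""
    iterations_s, salt_hex, digest_hex = record.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password, bytes.fromhex(salt_hex), int(iterations_s)
    )
    return hmac.compare_digest(digest, bytes.fromhex(digest_hex))


def needs_rehash(record: str, current_iterations: int) -> bool:
    """True if the record was made with a count other than the current policy."""
    return int(record.split("$")[0]) != current_iterations


old = make_record(b"hunter2", 10_000)  # created under an older, cheaper policy
```

After a successful login, a record for which `needs_rehash` is true can be regenerated with the new count while the plaintext password is still in hand.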

~~~
pbsd
Yes, realistically you can only add at most ~30 bits of security to a
password. This does not save very poor passwords, but can save decent ones.

'Strong' means different things in different contexts. SHA-256, for example,
is a strong hash function: it has good preimage and collision resistance.
However, a function is evaluated by its weakest point: when given a low-
entropy secret (which is _not_ the assumption in a strong hash function) and a
strong hash function, the easiest attack is to bruteforce it by sampling its
distribution (e.g., the most common password patterns).

Since SHA-256 is fast, bruteforce is naturally also fast: the average time-to-
break for a string of entropy s is 2^(s-1) times the cost of a SHA-256
evaluation. You can build on this by, say, iterating on SHA-256:

    
    
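        from hashlib import sha256

        def pwdhash(password: bytes, salt: bytes, w: int) -> bytes:
            # Hash salt + password once, then re-hash the digest 2**w times.
            h = sha256(salt + password).digest()
            for _ in range(2**w):
                h = sha256(h).digest()
            return h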
    

But once you get into these ad hoc schemes to make the password hash less
amenable to bruteforce, you're basically reinventing PBKDF2. scrypt (and
bcrypt, to some extent) is like the above, but also uses random memory
accesses on a large buffer to make things harder for specialized hardware
(think GPUs, FPGAs, etc).
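
To put rough numbers on the 2^(s-1) estimate, here is a small back-of-the-envelope sketch (not part of the original comment). It assumes each guess against the iterated scheme costs about 2^w SHA-256 calls, and the attacker's hash rate is a made-up figure chosen purely for illustration.

```python
def expected_hashes(s: int, w: int) -> int:
    """Average SHA-256 calls to find a secret of entropy s bits
    when each guess costs about 2**w hash evaluations."""
    return 2 ** (s - 1) * 2 ** w


def expected_seconds(s: int, w: int, hashes_per_second: float) -> float:
    """Average time to break, given an attacker's hash rate."""
    return expected_hashes(s, w) / hashes_per_second


RATE = 1e9  # assumed attacker speed: one billion SHA-256 calls per second

fast = expected_seconds(30, 0, RATE)   # plain salted SHA-256: roughly half a second
slow = expected_seconds(30, 20, RATE)  # with w = 20: roughly 6.5 days
```

Under these assumptions a 30-bit password goes from trivially breakable to merely inconvenient, which matches the point that stretching helps decent passwords but cannot rescue very poor ones.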

~~~
chacham15
> Yes, realistically you can only add at most ~30 bits of security to a
> password.

This sentence makes no sense to me. tptacek, correct me if I'm wrong, but I'm
pretty sure you can increase w to be as large as you want (i.e. the keyspace
is large enough that you'll be waiting years for a single hash to finish
before you wrap the whole keyspace). The only problem is that you force
legitimate users to wait as well.

Let's make up an extreme example to illustrate: you set w so that a hash takes
1 year minus 1 day to compute on the client's machine. Also, they change
passwords every year. (Yes, this is absurd because it means that the client can
only work one day a year.) Now, let's assume that he uses one of five hundred
passwords. If the attacker has 100x the compute power of the client, he will
only have a 20% chance of getting the correct password. And that is with only
roughly 9 bits of entropy.
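
The arithmetic behind that example can be checked with a few lines (a sketch added for illustration; the 365-day year and the variable names are assumptions):

```python
import math

DAYS_PER_YEAR = 365
client_days_per_hash = DAYS_PER_YEAR - 1  # "1 year minus 1 day"
attacker_speedup = 100                    # attacker has 100x the compute
candidates = 500                          # size of the password list

# Time the attacker needs for one guess, and how many whole guesses
# fit into the year before the password is changed.
attacker_days_per_hash = client_days_per_hash / attacker_speedup
guesses_per_year = math.floor(DAYS_PER_YEAR / attacker_days_per_hash)

success_chance = guesses_per_year / candidates  # fraction of the list covered
entropy_bits = math.log2(candidates)            # entropy of a uniform choice

print(guesses_per_year, success_chance, round(entropy_bits, 2))
```

The attacker manages 100 guesses in the year, which covers 20% of the 500 candidates, and log2(500) is just under 9 bits.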

~~~
tptacek
Take anything 'pbsd writes to the bank. The problem here is that adding enough
cost to add significantly more difficulty than that to a password quickly
takes you to places where the real-world cost is infeasible.

(He's not arguing against using KDF-like constructions to store password
authenticators).

~~~
chacham15
I've actually wanted to ask you about:
[https://news.ycombinator.com/item?id=8088253](https://news.ycombinator.com/item?id=8088253)
but I can't reply yet. If that ever actually became an issue, could it be
solved by instead having the client compute (2 ^ w) - 1 rounds of the hash and
having the server do the last round? My reasoning is that the client would
have to brute force the full hash space if they wanted to reverse only the
last round instead of trying to brute force the password.
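
A minimal sketch of the split being proposed, built on the iterated SHA-256 scheme from earlier in the thread. It only shows that the client and server halves compose to the same digest; it makes no claim about whether the split is a good idea. The names `client_part` and `server_part` are hypothetical.

```python
from hashlib import sha256


def pwdhash(password: bytes, salt: bytes, w: int) -> bytes:
    """Reference scheme: one initial hash, then 2**w further rounds."""
    h = sha256(salt + password).digest()
    for _ in range(2 ** w):
        h = sha256(h).digest()
    return h


def client_part(password: bytes, salt: bytes, w: int) -> bytes:
    """Client side: the initial hash plus (2**w - 1) rounds."""
    h = sha256(salt + password).digest()
    for _ in range(2 ** w - 1):
        h = sha256(h).digest()
    return h


def server_part(client_digest: bytes) -> bytes:
    """Server side: the single final round before storing or comparing."""
    return sha256(client_digest).digest()


sent = client_part(b"hunter2", b"some-salt", 10)
stored = server_part(sent)
```

Here `stored` equals `pwdhash(b"hunter2", b"some-salt", 10)`, so the server ends up with the same value while doing only one hash itself.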

~~~
nialo
One possible reason this doesn't work: a limit on the number of iterations for
a password hash is the amount of latency the user is willing to accept, which
is (approximately) constant regardless of whether those iterations are
computed on the client or the server.

