
Crypto Anchors: Exfiltration Resistant Infrastructure - diogomonicapt
https://diogomonica.com/2017/10/08/crypto-anchors-exfiltration-resistant-infrastructure/
======
zaroth
Another form of "crypto anchor" is Blind Hashing which uses a large pool of
random data to defend the hashes. An attacker would need to exfiltrate over
90% of the data before they could run an offline attack on hashes blinded by
the data pool. The bigger the data pool, the more data an attacker would have
to steal, and the more hashes/sec you can run.

So while iterative/computational hashing is only secure if it is slow and if
the password is strong, Blind Hashing prevents offline attacks even against
weak passwords and actually runs faster as you increase the cost factor.

In this case it's more like an actual anchor -- technically we call this the
Bounded Retrieval Model -- the idea that we size the data pool against network
bandwidth so that stealing it over the network would take 300 days at full
line rate. So it's a physical limitation, rather than trusting a black box
like an HSM to protect 256 bits.

If you're interested here's an intro [0], a tech spec [1], and an academic
paper [2] by Moses Liskov at MITRE.

Disclaimer: I'm Founder/CTO of BlindHash.com, which is basically Data Pool as
a Service -- we provide an API into a geo-replicated 16TB (and growing) data
pool.

[0] -
[https://s3.amazonaws.com/blindhash/BlindHash+Architecture+Gu...](https://s3.amazonaws.com/blindhash/BlindHash+Architecture+Guide.pdf)

[1] -
[https://docs.wixstatic.com/ugd/005c1c_5996c661899e4d09a28b9a...](https://docs.wixstatic.com/ugd/005c1c_5996c661899e4d09a28b9aa373c090c8.pdf)

[2] -
[https://eprint.iacr.org/2017/917.pdf](https://eprint.iacr.org/2017/917.pdf)

~~~
flipp3r
This looks like a pretty good technique, and that's coming from someone who
has collected 240GB+ of user:password dumps.

I certainly wouldn't get 16TB of disks just for that if it were ever leaked.

Bummer (not for me :p) that you guys went the route of patenting it and
keeping it proprietary & only available through an API.

I think it would be adopted in no time if it were open source, and I'd
definitely like to see something like this available as a service on clouds
like GCP/AWS/Azure/etc for my day job.

~~~
unabridged
This can't be too hard to build:

1. Generate 16TB of random data, backup/replicate many times

2. Think of the data as 16 billion 1k pieces

3. Generate 64 random piece addresses using hashA(key) as seed

4. Concatenate the 64 pieces into one 64k chunk, and store hashB(chunk)
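
A minimal sketch of those steps in Python, with a toy 1 MiB pool standing in for the 16TB one and SHA-256 standing in for hashA/hashB. The probe-address derivation here is an illustrative assumption, not BlindHash's actual scheme:

```python
import hashlib
import secrets

BLOCK_SIZE = 1024   # 1k pieces (step 2)
NUM_PROBES = 64     # pieces read per hash (step 3)

def blind_hash(password: bytes, salt: bytes, pool: bytes) -> bytes:
    """Mix data-pool content into the hash so offline attacks need the pool."""
    # Step 3: derive probe addresses deterministically from hashA(key).
    seed = hashlib.sha256(salt + password).digest()
    num_pieces = len(pool) // BLOCK_SIZE
    state = seed
    chunks = []
    for i in range(NUM_PROBES):
        state = hashlib.sha256(state + bytes([i])).digest()
        addr = int.from_bytes(state[:8], "big") % num_pieces
        chunks.append(pool[addr * BLOCK_SIZE:(addr + 1) * BLOCK_SIZE])
    # Step 4: hashB over the concatenated 64k chunk.
    return hashlib.sha256(seed + b"".join(chunks)).digest()

# Step 1, scaled down: a 1 MiB random pool instead of 16TB.
pool = secrets.token_bytes(1024 * 1024)
salt = secrets.token_bytes(16)
digest = blind_hash(b"hunter2", salt, pool)
```

An attacker who steals the digests but not the pool cannot even evaluate the hash function, which is the point of the construction.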

~~~
candiodari
That is an excellent idea. But why 16TB of random data? Why not encrypt some
high-entropy value (digits of pi, whatever) with a 100-character password and
generate 16TB that way? You then use the 16TB as the password, but you could
regenerate and recover it from a scrap of paper.

~~~
zaroth
You can do either. But if you generate the data pool from a seed that you
retain, then you're back to trying to protect a 256-bit value from leaking.

Generating the data pool with constantly cycled and discarded keys (i.e.
/dev/urandom) means the only way to have the pool is to go and get every
single bit of it.

We went the second route because I like sleeping at night and it just felt
like retaining a seed would defeat the whole purpose of _bounded retrieval_.

~~~
candiodari
Sure, but that's a 256-bit value that does not have to be present at the point
of use. So it's a lightweight anchor! It's extremely heavy when someone else
tries to move it, and yet when you move it yourself, it easily fits in your
wallet on the tiniest of SD cards, or even on a scrap of paper.

------
tptacek
I am ambivalent about using HSMs to protect password databases, because if
you're going to do that, you might as well simply introduce a minimal
authentication server (i.e. something with an <AUTHENTICATE(user, password)
bool> interface). It'll have approximately the same attack surface, it
actually helps your architecture in other ways, and it prevents attackers
from getting hashes in the first place (at least, the same way an HSM does
with the HMAC key).

A Go, Rust, or (minimal, non-framework) Java authentication server speaking
HTTPS to solely that AuthN interface and sharing no database with anything
else is extremely unlikely to be part of any realistic "kill chain"; it'll be
among the last things on your network compromised.
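
A sketch of that minimal surface using only Python's standard library (illustrative only: the endpoint shape, JSON fields, and scrypt parameters are assumptions, and the comment suggests Go/Rust/Java for the real thing):

```python
import hashlib
import hmac
import json
import secrets
from http.server import BaseHTTPRequestHandler, HTTPServer

# Toy credential store: user -> (salt, scrypt hash). In a real deployment this
# is the one database the auth service owns and shares with nothing else.
USERS = {}

def register(user: str, password: str) -> None:
    salt = secrets.token_bytes(16)
    USERS[user] = (salt, hashlib.scrypt(password.encode(), salt=salt,
                                        n=2**14, r=8, p=1))

def authenticate(user: str, password: str) -> bool:
    """The entire public interface: AUTHENTICATE(user, password) -> bool."""
    if user not in USERS:
        return False
    salt, stored = USERS[user]
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, stored)

class AuthHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        ok = authenticate(body.get("user", ""), body.get("password", ""))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"authenticated": ok}).encode())

# Serve behind TLS, e.g.:
# HTTPServer(("127.0.0.1", 8443), AuthHandler).serve_forever()
```

Because the only thing the service can be asked to do is answer yes/no, compromising a frontend yields an online-guessing oracle rather than the hash database.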

Meanwhile: you get to stick with technology you fully understand and can
manage (simple HTTP application servers and a decent password hash) and
monitor.

HSMs have a lot of uses elsewhere in secure architectures, but the password
storage use case is overblown.

~~~
bigmac
We discuss exactly this architecture in the talk we gave back in 2014. See
here for the part where we discuss it:
[https://youtu.be/lrGbK6fE7bI?t=16m31s](https://youtu.be/lrGbK6fE7bI?t=16m31s)

Basically we 100% agree with you that an authentication service should do this
job. The HSM is extra credit. Although it does help in cases where the auth
service's DB is leaked through some other means (e.g. backups).

I will say that I'd depart from you on the return value of that service. It
shouldn't be a bool. It's better to return a token that downstream services
can use to independently verify that the authentication service verified the
user. It's better for your infrastructure if you aren't passing around raw
user IDs but rather a cryptographically signed, short-lived token that is only
valid for the life of a specific request.
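
A toy sketch of that token shape, using an HMAC over serialized claims. The names and claim fields are assumptions, and a real deployment would likely use asymmetric signatures (as JWT or macaroon libraries support) so downstream services can verify without holding the signing key:

```python
import base64
import hashlib
import hmac
import json
import time

def issue_token(user: str, key: bytes, ttl_seconds: int = 30) -> bytes:
    """Short-lived token scoped to roughly one request window."""
    claims = {"sub": user, "exp": time.time() + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + b"." + base64.urlsafe_b64encode(sig)

def verify_token(token: bytes, key: bytes):
    """Return the user ID if the token is authentic and unexpired, else None."""
    payload, sig = token.rsplit(b".", 1)
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(base64.urlsafe_b64decode(sig), expected):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] < time.time():
        return None
    return claims["sub"]
```

Downstream services call `verify_token` instead of trusting a bare user ID, so a forged or replayed-after-expiry credential is rejected.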

~~~
tptacek
I agree with you, but I'm apples/applesing that service with an HSM and
deliberately keeping the interface minimal, just for the sake of argument. The
subtext is my worry that normal developers on HN don't really understand why
HSMs are operationally secure --- minimal attack surface, not magic hardware.

~~~
bigmac
FWIW I was concerned folks would get caught up on the password storage use
case since so many are familiar with that problem. The crux of the idea of
crypto-anchoring is to segment crypto operations into dedicated microservices
and use those minimal microservices to do per-record encryption, decryption,
or signing. HSMs are a natural extension to those microservices if you have
the budget.

------
Cieplak
If you're serious about preventing exfiltration, you'll do what the military
does and use data diodes

[https://en.wikipedia.org/wiki/Unidirectional_network](https://en.wikipedia.org/wiki/Unidirectional_network)

On a slight tangent: people with physical access to a server can extract
encryption keys from RAM by plugging into a PCIe slot:

[https://github.com/ufrisk/pcileech](https://github.com/ufrisk/pcileech)

~~~
jlgaddis
I think what will (should, perhaps?) ultimately happen -- and this is probably
still years off -- is that we will stop using default routes on (most) hosts.

Publicly accessible servers and such will, of course, still have them, but
things like, say, internal database servers or the PC belonging to Debbie in
Payroll, won't.

Access to things outside of the "local network" (i.e., a company's entire
network, not just the directly connected subnet) will go through an
intermediary (e.g., an HTTP(S) proxy) that performs per-connection
authorization with a default deny.

It may end up looking a little different from this -- a default deny on
_all_ outgoing IP traffic, for example, with only specific traffic permitted
-- but I believe that, eventually, this is how we'll keep random hosts from
being used to exfiltrate massive amounts of data.

TL;DR: Companies need to start filtering _outgoing_ traffic and not letting
any random host on the internal network connect out to any other random,
arbitrary host in the world. This will be inconvenient and expensive (to
manage), however, so we'll need a few more Equifaxes before it begins to catch
on.
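
The per-connection, default-deny check described above can be reduced to a tiny sketch (hostnames here are hypothetical):

```python
# Default-deny egress: a connection leaves the network only if it matches an
# explicit (source host, destination host, port) entry; everything else is
# refused -- including Debbie's PC in Payroll reaching arbitrary hosts.
EGRESS_ALLOWLIST = {
    ("backup-01", "s3.amazonaws.com", 443),
    ("mail-relay", "smtp.example.com", 587),
}

def connection_allowed(src: str, dest: str, port: int) -> bool:
    """Authorize one outbound connection; absence from the list means deny."""
    return (src, dest, port) in EGRESS_ALLOWLIST
```

The management cost lives entirely in curating that allowlist, which is exactly the "inconvenient and expensive" part.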

------
bigmac
One of the great things that helps when building crypto-anchored
infrastructure is having mutual TLS between all applications/containers. This
allows you to authn/authz and only permit connections from specifically
allowed apps/containers/microservices.

Mutual TLS can be a bit of work to get set up but leads to huge security wins
over time as every RPC within your infrastructure is mediated by an
authorization layer. We've helped out a bit with the SPIFFE project which is
looking to make mutual TLS easy: [https://spiffe.io/](https://spiffe.io/)
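
For a sense of what the server side of mutual TLS requires, here is a sketch using Python's standard `ssl` module (file paths are placeholders; SPIFFE/SPIRE automate issuing and rotating these credentials):

```python
import ssl

def mtls_server_context(certfile: str, keyfile: str,
                        client_ca: str) -> ssl.SSLContext:
    """TLS server context that refuses any client without a cert from our CA."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED          # the "mutual" in mutual TLS
    ctx.load_cert_chain(certfile, keyfile)       # this service's own identity
    ctx.load_verify_locations(cafile=client_ca)  # CA vouching for allowed peers
    return ctx
```

With `CERT_REQUIRED`, the handshake itself fails for any peer that can't present a certificate from the trusted CA, so authorization happens before a single application byte is exchanged.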

~~~
suniljames
SPIFFE's lucky to have Docker, Google, and others helping drive forward the
idea of consumable service authentication frameworks like SPIRE. It was
open-sourced a little more than a week ago ([https://blog.scytale.io/say-hello-to-spire-7e133fad72ca](https://blog.scytale.io/say-hello-to-spire-7e133fad72ca)).

------
jondubois
The thing about security is that there is a point where you end up locking
yourself out.

Locking your data to your hardware raises the question of what happens if the
hardware fails. At first glance it also seems to introduce difficulties with
scaling across multiple machines, and it might make it harder to switch
between infrastructure providers.

The cost of this approach should at least be mentioned as a footnote.

Maybe the better solution is for society to support more small tech companies
with smaller user bases that have fewer dissatisfied rogue employees to leak
hashed passwords in the first place.

The root of the problem is not technical, it's political.

~~~
cimnine
Commercial HSMs have ways of exporting the keys they hold onto smartcards.
Usually a key is split across a number of smartcards, let's say 3. For
increased robustness, each third is written to two smartcards (so we now have
6 cards). Each smartcard belongs to one person, who is the only one who knows
the PIN protecting the third of the key it holds. Each smartcard is then
brought to a different bank vault, sealed in a tamper-proof bag.

To restore the key, you need to bring 3 of the 6 people -- one for each third
-- to the so-called key ceremony, and each has to bring their smartcard and
their PIN.
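
The split itself can be as simple as an n-of-n XOR scheme -- a sketch, not any vendor's actual mechanism (production HSMs often use threshold schemes like Shamir's instead). Writing each share to two cards gives the 6-card, one-person-per-third setup described:

```python
import secrets

def split_key(key: bytes, n: int = 3) -> list[bytes]:
    """XOR split: all n shares are required to reconstruct the key."""
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    last = key
    for s in shares:                      # last share = key XOR all others
        last = bytes(a ^ b for a, b in zip(last, s))
    shares.append(last)
    return shares

def combine_key(shares: list[bytes]) -> bytes:
    """XOR all shares back together to recover the key."""
    key = bytes(len(shares[0]))
    for s in shares:
        key = bytes(a ^ b for a, b in zip(key, s))
    return key
```

Any subset smaller than n reveals nothing about the key, which is why each cardholder can safely store their share in a separate vault.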

The same mechanism can be used to provision multiple HSMs with the same key
material, but there are other ways to do this. As soon as two HSMs share a
common secret, also known as a Key Sharing Key, they can exchange all the key
material they possess in a secure manner.

Some HSMs don't even bother to store the keys they generate within the bounds
of their own hardware. Only the master key is stored in the hardware; every
other key is encrypted with the master key and stored on a shared filesystem.

If this sounds contrived to you, let me assure you that such procedures are
in place at various companies that deal with raw credit card data, at least in
Europe. The EMV committee, the PCI organization, and each credit card issuer
mandate such procedures.

And they are very strict. We once had to ship HSMs back to the vendor, because
at some point they were not supervised by at least two persons. (At least the
documentation thereof was missing.)

~~~
cimnine
... because at some point they were not supervised by at least two persons
_before being taken into operation_, that is. (Afterwards, they have to be
locked into a rack that requires two different badges to open its doors and
which must have a CCTV system recording it at all times.)

------
bigmac
Folks shouldn't necessarily be scared off by the use of HSMs in this model --
HSMs are an add-on that adds an additional layer of security. That said, there
are still significant wins to segmenting the applications that hold keys,
particularly if they are on hosts separate from your front-end or application
logic hosts. This architecture still forces attackers to only have access to
data within your infrastructure, which allows your detection systems to have a
chance to catch people before they leave with all the data.

------
justincormack
So can you do any of this in a public cloud setup or is this an argument for
having infrastructure that you control directly?

(I think AWS might have launched some sort of HSM service, but I haven't
looked at the details and not clear if it could provide the right sort of
guarantee)

~~~
diogomonicapt
You can definitely do this in public cloud HSMs.

Azure: [https://azure.microsoft.com/en-us/pricing/details/key-vault/](https://azure.microsoft.com/en-us/pricing/details/key-vault/)

AWS: [https://aws.amazon.com/cloudhsm/](https://aws.amazon.com/cloudhsm/)

The only thing you're not able to do in a public cloud is run these in Secure
Execution mode—where you get to actually execute arbitrary code inside of the
enclave instead of just doing operations with keys that are protected by the
HSMs.

------
eeZah7Ux
Delegating security-sensitive operations to a dedicated service (e.g. an
authentication service) is security 101. It's just an application of the
principle of least privilege.

Giving it a fancy name makes it look like it's a new idea.

------
mindslight
A promising title, but a disappointingly banal application. When surveillance
companies already have "our" data, we've lost the game. Equifax et al _should_
leak, so society will push back against "voluntary" commercial surveillance.
Meanwhile, exfiltration from end-user networks by negligent or malicious
embedded software is a growing threat to privacy.

------
matt_wulfeck
Personally, I think our current trend is very useful and should be pursued to
the most extreme level:

1. Assume that an attacker will get data X.

2. Make what you keep in data X as useless and uninteresting as possible.

3. Hash data X with the most expensive and safest hash possible.

4. If you really can't do steps #2 and #3, warn your customers about what you
are keeping and encrypt the heck out of everything.
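
Step 3 can be done with the standard library's scrypt, a memory-hard hash (parameters here are illustrative; Argon2id via a dedicated library is the more common recommendation today):

```python
import hashlib
import hmac
import secrets

# scrypt with n=2**14, r=8 costs ~16 MiB of memory per guess, which is what
# makes large-scale offline cracking expensive.
def hash_secret(secret: bytes) -> tuple[bytes, bytes]:
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(secret, salt=salt, n=2**14, r=8, p=1)
    return salt, digest

def verify_secret(secret: bytes, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.scrypt(secret, salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, digest)
```

Per-record random salts mean identical passwords yield different digests, so one crack doesn't unlock every matching account.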

------
cbisnett
Out of curiosity I went to look at the pricing for Amazon’s CloudHSM service
for AWS and nearly spit out my drink. $5,000 up front cost per device and
$1.88/hr and they suggest running two for high availability. At those prices I
don’t think you’ll see this catch on anytime soon.

~~~
jlgaddis
I bought a "real" HSM off of eBay (used, of course) a couple weeks ago but,
unfortunately, it's apparently broken and so I need to return it. The price of
these things is huuuuuge, and that puts them out of reach for all but large
companies. I think Amazon is trying to solve that problem but, yeah, if I
had that much to spend I would just buy a couple outright (they'd pay for
themselves in a few months, compared to CloudHSM).

------
jnwatson
If you have an HSM in the loop for all authentications, why bother with
hashing? Just encrypt the password database with the HSM and be done with it.

There are cheaper ways of keeping secrets secret. Using a TPM on the server
would be one way. SGX would be another.

------
ris
How do people cope with the threat of HSM hardware failures? Would be awfully
bad to lose the ability to read all of your sensitive fields including
backups.

