
Securing PostgreSQL [pdf] - okket
http://thebuild.com/presentations/pgconfeu-2016-securing-postgresql.pdf
======
floatboth
> Use a well-known secure algorithm (AES256 is considered the standard).

> Never roll your own crypto.

Directly using a primitive like AES is pretty much rolling your own crypto.
(Designing your own cipher and using it is just beyond ridiculous.)

What you should use is secretbox from NaCl (libsodium or tweetnacl). Or
Keyczar, I guess that's still maintained (last commit on github is 21 days
ago). Or TripleSec for long term things that you're really really paranoid
about.

The point is, use a library that does _authenticated_ encryption for you.

~~~
Cyph0n
Why is using AES-256 directly considered "bad"? Is the issue key management?
RNG? Padding? Or what exactly? Genuine question.

Or do you mean that _re-implementing_ AES-256 in your own library instead of
using a well-tested implementation is "bad"?

~~~
technion
You can get a good example of this by simply hitting this link and reviewing
the first few hits:

[https://github.com/search?utf8=%E2%9C%93&q=aes+encryption](https://github.com/search?utf8=%E2%9C%93&q=aes+encryption)

I'm not going to call out any project by name, but on the first few pages, I
see:

\- A project which implements unauthenticated CBC mode AES

\- A project that does not, in any way, document its mode or implementation
beyond "using AES 256"

\- A project with <10 commits, the most recent more than two years ago, yet over 50 stars

\- A project that uses mcrypt

Overall, it is easier to find a project with horrible deficiencies than one
that doesn't immediately look concerning. And every one of these boasts
"AES 256 encryption".
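To make the first bullet concrete, here's a small demo (Python with the `cryptography` package; the message and the attack are invented for illustration) of why unauthenticated CBC is a genuine deficiency: the ciphertext is malleable, and decryption gives no indication of tampering.

```python
# Bit-flipping attack on unauthenticated AES-CBC: in CBC decryption the
# IV is XOR-ed into the first plaintext block, so an attacker who never
# sees the key can make targeted edits to the plaintext.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key, iv = os.urandom(32), os.urandom(16)
msg = b"pay 0100 dollars"  # exactly one 16-byte AES block, no padding needed

enc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
ct = enc.update(msg) + enc.finalize()

# Attacker flips bits in the IV so that "0100" becomes "9100":
evil_iv = bytearray(iv)
evil_iv[4] ^= ord("0") ^ ord("9")

dec = Cipher(algorithms.AES(key), modes.CBC(bytes(evil_iv))).decryptor()
forged = dec.update(ct) + dec.finalize()
assert forged == b"pay 9100 dollars"  # decrypts "fine", silently altered
```

With an authenticated mode (GCM, or secretbox as mentioned above), this decryption would simply fail.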

~~~
rascul
What is wrong with mcrypt?

~~~
technion
I started writing an answer, but this blog is more comprehensive than anything
I could put together:

[https://paragonie.com/blog/2015/05/if-you-re-typing-word-mcrypt-into-your-code-you-re-doing-it-wrong](https://paragonie.com/blog/2015/05/if-you-re-typing-word-mcrypt-into-your-code-you-re-doing-it-wrong)

Problem #2 is something I've hit in multiple real-world codebases.

------
srd
I have a question about storing the encryption keys. How would one actually
store them securely and distribute them among the application servers in a
cloud environment?

I don't buy into the 12-factor-application way of storing sensitive data in an
environment variable ("ps ae" and a local intruder has the data).

Storing it in a secured file on the server requires the file to be distributed
by the provisioning service, thus implying the key is stored in a repository
(one that just isn't your application code, but a repository nonetheless).

Using an API service with client certificates doesn't really help either,
because if the application code can access the required configuration and
certificates, so can an intruder with shell access (since the intruder most
likely has the user permissions of the running application before attempting
privilege escalation).

I haven't seen an answer to that question that really satisfied me in the
past. Does anyone have a battle tested method for storing database encryption
keys?

~~~
koffiezet
> Using an API service with client certificates doesn't really help either,
> because if the application code can access the required configuration and
> certificates, so can an intruder with shell access (since the intruder most
> likely has the user permissions of the running application before attempting
> privilege escalation).

The problem here is that if your application requires the use of encryption
keys, and the user which runs this application gets compromised, you have a
problem anyway - since like most applications, yours probably also doesn't
care about the key's security once it's in memory. If they get that far, the
only thing you can do is replace/revoke those keys and take the hit.

In more secure non-cloud setups, encryption/decryption is done by something
like an HSM, a box with only one interface (usually PKCS#11) that can be used
to encrypt, decrypt, and sign things, and you never see the keys. There are
software HSMs, and you could try to apply the same principle in the cloud,
with an isolated box running only the very well protected and audited soft
HSM.

But I'm not sure the cost of learning and maintaining such a system is worth
it in most situations, where I would use an API service like HashiCorp's
Vault. Most compromises of keys and secrets don't happen on your servers, but
on your own or some developer/user's work machine. How much junk is exchanged
over email, Dropbox links, Slack, Skype, ...? Keeping keys out of users' hands
and eliminating the need for them to have keys at all is higher on my priority
list.

------
altendo
While there are some PostgreSQL-specific things in this, the vast majority of
it is pretty solid advice regardless of which database you run.

------
knorker
The advice on FDE is misleading if not flat out wrong. It's _NOT_ just to
protect against media theft. It's also about "shit, the drive crashed. How do
I destroy the data on it before throwing it in the trash?"

Same with replacing a smaller drive with a bigger drive. Yes, DBAN it, but
that won't take care of remapped sectors.

It's easier to destroy an encryption key than it is to destroy the data.

~~~
rzzzt
To the point that more recent hard drives already come with hardware
encryption enabled, and the secure erase/sanitize command can finish in mere
seconds, because it only needs to throw away the existing key:
[https://en.wikipedia.org/wiki/Hardware-based_full_disk_encryption#Disk_sanitization](https://en.wikipedia.org/wiki/Hardware-based_full_disk_encryption#Disk_sanitization)

~~~
fulafel
That requires you to trust the disk to implement many finicky crypto details
correctly in its proprietary black box, and you can't get any assurance about
that at reasonable cost. SSDs have a horrible track record in this area, with
many implementing SECURE ERASE in a way that leaves your data on the flash
chips. And if you do choose a disk model, do your investigation, and find that
it passes your sniff test, disks are no longer a commodity part for you.

------
tzaman
So true, which is why I find it kind of awkward when I tell young
developers/business owners they should pay for professional PostgreSQL
hosting and they argue that it's open source and they can host their DBs on a
$25 (or less) DigitalOcean instance.

Until it's too late, I guess.

~~~
pjlegato
We (databaselabs.io) are the first Postgres as a Service on DigitalOcean. We
fight this sales battle on a daily basis.

The problem is that younger devs/owners are much less likely to have ever
experienced any significant outage, so it's a non-issue for them. They can't
see why they should pay good money to prevent it.

We've had much better results with slightly older and more experienced people,
who realize that our service is an absolute bargain compared to the cost and
risk of attempting to run continuous backups, etc. themselves.

~~~
aa_memon
I am curious: you are commenting on a post about securing Postgres, but after
signing up for your service I see that the database I created is publicly
available, on the standard port 5432, with the default postgres username. The
only thing stopping me is a password; no certificate required. I am not a
Postgres expert, but how is this not extremely insecure?

~~~
pjlegato
"Extremely" insecure is a bit of an exaggeration. It's not as though the
password is "password" on all the databases.

We do support certificate-based logins and firewalls with IP whitelists.
They're just not on by default. Frankly, the reason is that almost all
customers don't care about having a _very_ high level of security if it
implies doing more work (setting up certificates, whitelisting IPs). Moreover,
they actively prefer "simpler and less secure" over "more complex and
more secure."

We've done experiments with certificates and firewall IP whitelists and so on.
Almost all customers and potential customers reject these things. They say
they want a simple password that just works from anywhere.

We had to choose between being slightly less secure and having customers,
versus being highly secure and having no customers. Since the business can't
survive without any customers, that choice was easy.

That said, if you do want any of those things turned on for your database,
just write support@databaselabs.io.

~~~
jlgaddis
Out of curiosity, can one change these settings for their own databases
automatically via some "control panel" or such, or does it require
intervention from your staff?

Fortunately, I think there have been fewer "unauthenticated remote access"
vulnerabilities with PostgreSQL than MySQL so this (being accessible from 0/0)
probably isn't a huge deal. That said, I'd look for ways to restrict who can
actually connect to 5432/TCP that won't negatively affect the majority of your
customers (e.g., if your databases are running on DigitalOcean, can you
restrict connections to that particular DO datacenter by default and provide
an option to loosen those restrictions in increments -- "this datacenter",
"all DO datacenters", "the world", etc.?).

~~~
pjlegato
These settings are currently manually operated by staff, via
support@databaselabs.io.

That will get added to the control panel eventually, but right now as
approximately zero percent of customers want those things, it's not a good use
of our limited engineering time to even automate that, versus other things
that engineers could be doing with their time.

While it would be nice in theory to restrict them by default, in practice
there's just no restriction that's close enough to universally applicable to
be workable (i.e. one that won't disrupt a large number of users' use of the
database if it's applied everywhere.)

And you are correct, there are essentially zero unauthenticated remote access
vulnerabilities that come out in Postgres. Combine that with:

* All connections require SSL

* The password is a long string of randomly generated characters

* We actively monitor the network for unusual traffic patterns

and it's actually not so bad. Not ideal of course, but very much not
"extremely" insecure, as the above post said.

------
cygned
Awesome! Is the actual talk somewhere as a video?

~~~
craigkerstiens
The same talk is actually being given at PGConf Silicon Valley in just under 2
weeks ([http://www.pgconfsv.com/program](http://www.pgconfsv.com/program)). We
will be recording the audio and slides there and that video will be online
some weeks after. You could also always come and see it live if you're in the
bay area :)

~~~
cygned
Too bad, Germany is a bit too far away. Anyway, thanks in advance for
recording; looking forward to watching it!

~~~
smnplk
Me too. I would like to see the talk. Thanks for recording it.

------
jtchang
This is solid advice. While you may not be able to do everything in the slides
there are some really good practices.

------
jve
"For critical passwords, use split passwords with dual custody."

Could anyone comment on this practice? Does it mean that I have the first half
of the password and another person has the other half? We can only log in by
combining them? How do we type the password in without sharing a physical
computer?

~~~
rwilsonperkin
Yep! Dual custody can really be as simple as two owners providing the two
halves of the password. I've implemented something similar before. Both owners
having physical access tends to be the safest way, as whenever one is remote
you have to think about the password in transit. Something like tmate could
help.
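If literal halves feel too crude (each custodian then knows half the secret outright), an XOR split is barely more work and gives each holder a share that is useless on its own. A hypothetical sketch in Python:

```python
# XOR secret splitting for dual custody: share1 is uniformly random,
# so neither share alone reveals anything about the secret; XOR-ing
# both shares reconstructs it exactly.
import os

def split(secret: bytes) -> tuple[bytes, bytes]:
    share1 = os.urandom(len(secret))
    share2 = bytes(a ^ b for a, b in zip(secret, share1))
    return share1, share2

def combine(share1: bytes, share2: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(share1, share2))

s1, s2 = split(b"correct horse battery staple")
assert combine(s1, s2) == b"correct horse battery staple"
```

For more than two custodians, or k-of-n policies, Shamir's secret sharing is the standard tool.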

------
esseti
So the problem with pgcrypto is that the password ends up in the logs? That's
it? I would like to see the talk to get a better understanding of the whole
message; is it available somewhere?

~~~
mmerickel
The fundamental issue is that you should not be exposing the encryption keys
to the database. If you're using pgcrypto then you're issuing SQL statements
_in the database_ with the key. You should do your encryption client-side so
that the key is never passed over the wire at all.

~~~
okket
That said, pgcrypto is still useful, you can compute hashes or generate UUIDs
with it.

~~~
dom0
The point was that you probably shouldn't use it for data encryption in the DB
_and think the server doesn't have the keys_.

------
smkdtr
One of the constraints I am working with is that a human is not necessarily in
the loop all the time. What is the best way, then, to bootstrap the encryption
process?

~~~
tptacek
You basically don't. You're adding a lot of complexity chasing after a very
marginal (say, 5%†) fraction of the security application-layer crypto gets
you.

You can get that fraction closer to 50% without a human in the loop by
segregating your crypto code from the application in a virtual HSM, using
something like a TLS client certificate to authenticate your application to
the HSM, having humans in the loop for restarts/bringups of the HSM itself,
and doing pretty aggressive monitoring of the request patterns between the app
and the HSM.

The threat model here is that someone owns up your app server --- if that
wasn't in your model, you wouldn't need application layer crypto. The idea is
that owning up the app server will get an attacker enough access to make
requests to the HSM, but not control of the HSM itself. If the HSM can detect
abnormal volume of requests, you can use it as a circuit breaker to prevent
bulk exfiltration of your secure data. You also get accountability, so that
when your server is compromised you might know exactly what records were
accessed.

Typically, the app server itself will handle all of the database operations,
and the virtual HSM just provides a "seal" and "unseal" operation --- convert
plaintext to ciphertext, convert ciphertext to plaintext.

† _The 5% you're getting in the dumb online crypto scenario is that if you
store the root crypto secrets in a file, an attacker can't necessarily recover
them from an SQL injection attack --- but in reality the percentage is
probably lower, since most of the time SQLi will equate to RCE._

~~~
smkdtr
Thanks for the response. Although it's not ideal, I am stuck with using crypto
at the DB level. Based on what you said, running something like Vault (as a
separate user from the DB) and accessing it via the HTTP API + TLS client cert
seems like the way to go.

------
rietta
68 slides! This must have been an epic or fast-paced conference talk. Good
points. I think the comments on FDE are spot on; it's almost always about
compliance. And I'm reasonably comfortable with SSH access if, and only if, it
uses SSH key authentication (no password auth) and PermitRootLogin is turned
off. A bastion host is a good idea, but "never" is a bit strong.

------
Puts
I think this talk misses one of the most important security patterns of
PostgreSQL and SQL databases in general. If you have, for example, a table
with hashed passwords, why would any user except admin need to be able to run
a SELECT on that table? Make a function to validate the user and grant
permission only to execute that function.
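For reference, a sketch of that pattern in Postgres terms (names are illustrative, and `crypt()` assumes the pgcrypto extension is installed):

```sql
-- The app role never gets SELECT on users; it may only call the
-- validation function, which runs with its definer's privileges.
CREATE FUNCTION check_password(uname text, pass text)
RETURNS boolean AS $$
  SELECT hashed = crypt(pass, hashed) FROM users WHERE username = uname;
$$ LANGUAGE sql SECURITY DEFINER STABLE;

REVOKE ALL ON TABLE users FROM app_role;
GRANT EXECUTE ON FUNCTION check_password(text, text) TO app_role;
```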

~~~
tptacek
That's a bad idea, because it implies that your password hash has to be
expressible inside of Postgres.

If you're worried about a SQL dump exposing password hashes, segregate
password validation into its own microservice. This comes with other benefits:
for instance, you can ratchet up the work factor on your password hash,
because the service will very easily scale horizontally.

~~~
Puts
First of all, this pattern is not exclusive to password hashes. There are lots
of situations when handling customer data where you simply don't need the
client to be able to query the whole data set, and if that's the case,
allowing it is just bad hygiene.

Now, if you make a good set of prepared statements as an interface for your
database, that could be viewed as a "microservice" in itself.

~~~
tptacek
Yes, and that's a good pattern. Just not for password hashes.

------
agentgt
Perhaps to help others extract info: most of this I already knew or is sort of
common sense at this point, with the one exception of page 39: "do the
encryption in your app".

I'm not saying I have ever used pgcrypto, but it is important to be mindful
that pg is logging stuff.

