
Database constraints: The last line of defense - anarazel
https://www.citusdata.com/blog/2018/03/19/postgres-database-constraints/
======
rosser
Every single time I've seen someone say something like, "We don't need
constraints; we have validations", they've had to clean up their data — when
it turned out validations aren't enough.

Please do future-you a favor: use constraints.

~~~
jack9
I would never use constraints ever again. Period. Billions of rows of data to
POCs. It's interesting to note that AMZ data largely runs without constraints
(the largest systems). Working in Southern California, where tenures run a few
years before you end up somewhere else, everyone comes to a consensus in a
decade. It's obvious to me that embedding business logic in SQL along with the
server app and whatever presents data to third parties, is just making
everything harder. Then you run into the inevitable of data changes and
fighting the existing constraints. It's a waste of effort, but an obvious
path-to-job-security for DBs.

> And of course you can get quite clever and fun here....create a function
> that checks if a function is a fibonacci number

The fact this is presented as a reasonable (trivial!) thing to embed in the DB
screams, this is a terrible place to work.

~~~
ams6110
A lot of things that are really, really good ideas for most applications might
not apply at Amazon scale. Most of us don't work at Amazon scale.

I agree on the fib constraint, though. Fun, but nothing you'd ever really do.

> embedding business logic in SQL along with the server app and whatever
> presents data to third parties, is just making everything harder

Not in many cases. In the past 15-20 years, think about how many application
stacks have come and gone. We've had Postgres that whole time, and plsql. Core
business logic in the DB never needs to be rewritten, no matter how many
different front-end rewrites you go through.

~~~
falcolas
With your last point, you're conflating language longevity with code
longevity. Code in the DB changes as often as any other code: as often as the
business requires it. The last company I worked for who used stored procedures
had a change for those stored procedures with every frontend release.

And logic in the DB is inherently capable of only scaling as large a your DB
cluster (i.e. not very), and can cause your DB cluster to scale prematurely
(an event which incurs a lot of potential technical debt and additional
monitoring).

------
ams6110
Check constraints and foreign key constraints will save your bacon.

I don't like triggers though, as a rule. Not that they are always the wrong
thing, but its easy to kill performance with them, and just generally having
"behind the scenes" side-effects triggered by other operations can be
surprising, to say the least.

~~~
mdpopescu
You might want to look at [1]; you can send a message to a queue and process
that independently. It breaks transactional boundaries though (if you abort
the transaction the message is still in the queue). Using transactional queues
will probably not help because then the transaction has to wait for the
message to be processed so you're back to square one.

[1] [http://www.madeiradata.com/service-broker-asynchronous-
trigg...](http://www.madeiradata.com/service-broker-asynchronous-triggers/)

------
esaym
These are good, I've forgotten about some of them as well. As to:

    
    
      EXCLUDE USING gist (
       id WITH =,
       period WITH &&
      );"
    

What the heck?

~~~
waffle_ss
It's so you can't define two billings for the same period (say, the same
month). To prevent you from double-billing someone.

&& is the range operator that checks for overlap.[1]

You have to use gist indexes to do this. And in fact this page doesn't mention
it but you have to enable the btree_gist extension with:

    
    
        CREATE EXTENSION btree_gist;
    

before the snippet above will work.

[1]: [https://www.postgresql.org/docs/current/static/functions-
ran...](https://www.postgresql.org/docs/current/static/functions-
range.html#RANGE-OPERATORS-TABLE)

~~~
tokenizerrr
They're very nifty, as you can use this to easily prevent all kinds of date
range overlaps.

------
peterburkimsher
"Want to ensure tweets aren’t longer than 140 characters, make sure you set
your tweet column to varchar(140) and not to text."

This is fine if all your users are content with ASCII. As soon as double-width
characters from Chinese or emoji start interfering, the database will limit
some users to 70 characters. It's ok to make restrictions, but be sure to
fully understand how they'll affect your users.

~~~
chronid
In PostgreSQL varchar uses actual characters (as specified by the database
character encoding), not bytes, IIRC. The same thing should be true for (at
least recent versions of) MySQL.

------
rbranson
Errors in memory are temporary, errors on disk are permanent. Don't approach
them both the same way.

------
combatentropy
One thing I like about SQL constraints is that they're so much shorter than in
a procedural language.

Has anyone developed an elegant system for bubbling them up to the user
interface? It seems you either have to write the same constraints again in
JavaScript or build some massive list that maps ugly database errors to
friendly user errors.

~~~
mdpopescu
C# MVC has an attempt at creating some validation and automagically promoting
that to the UI but it's (of course) limited to stuff like "this field is
numeric" or "this field is required". Anything more complex needs duplication.

I would assume that something like node.js or meteor have it easier, since
they're using the same language in both layers; I believe there are also
attempts at tools that automatically translate other languages to JS but I
haven't used any of them.

------
krylon
Yup. I still remember the day I found out about constraints, and I became kind
of angry nobody had told me about this before.

I can imagine one can get carried away and shoot themselves in the foot when
being careless, but all in all, I think constraints are a valuable tool to
protect the integrity of my data.

------
siscia
Once I was building a small application, I didn't want to use a whole database
such as postgres that is amazing but does require some maintenance, so I
settled for using just Redis.

It was fine but the lack of constrains and structure was killing me making
everything harder, but I liked the zero maintenance things...

So I build RediSQL[1] a module to provide SQL capabilities to Redis to get the
best of both worlds: stability, easy maintenance, velocity, constrains and
structure in data.

[1]:
[https://github.com/RedBeardLab/rediSQL](https://github.com/RedBeardLab/rediSQL)

------
Slansitartop
Does MySQL support CHECK constraints yet? Last time I used it it silently
ignored them if they were in the DDL.

~~~
moolcool
Good heavens, then why even have them?

~~~
jimktrains2
Everytime I deal with mysql I ask why there is mysql. It's terrible when
compared to postgres at consistency, correctness, functionality (e.g. check
constraints, cte, and window functions, pg_tgrm, full text search, index
flexibility and that's without thinking about PostGIS, pg-routing and many of
the other excellent 3rd party extensions) and error handling and nearly
equivalent when it comes for basic queries that mysql even supports.

Edit: forgot transactional ddls, real ones, not the ones mysql has in the
pipeline.

~~~
swsieber
Mysql has better replication (at this point). Postgres does seem to be
catching up in that area.

~~~
jimktrains2
Better by what metric, and is it worth the hardship in every other area?

~~~
Something1234
I think mysql supports multimaster, and other replication formats. Postgres
just supports master and slave.

~~~
jimktrains2
Not true.
[https://wiki.postgresql.org/wiki/Replication,_Clustering,_an...](https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling)
postgres has a variety of ways to do replication, including multimaster
(mainly asynchronous). Core is also getting improvements that will make this
easier to do without extensions, mainly through the infrastructure pglogicql
brings.

Until 5.7 I believe mysql required extensions to do multi master as well.

~~~
chronid
BDR is available in the open (aka for free :D) only up to PostgreSQL 9.4
though, and that's really the only somewhat-equivalent solution to MySQL
multimaster IIRC.

~~~
jimktrains2
Also not true.

9.6 is supported and the next release (scheduled soon) will support 10. Those
changes will eventually end up in 12.

[https://www.2ndquadrant.com/en/resources/bdr/](https://www.2ndquadrant.com/en/resources/bdr/)
[https://blog.2ndquadrant.com/news-and-roadmap-for-bdr-
multi-...](https://blog.2ndquadrant.com/news-and-roadmap-for-bdr-multi-master-
postgresql/)

I'm not sure what you mean by somewhat equivalent. Both are primarily
asynchronous multimaster approaches.

~~~
chronid
> (NEW! Postgres-BDR 2.0 on Server Version 9.6 Now Available!) 2ndQuadrant is
> proud to announce the availability of BDR 2.0 running as an extension to
> community PostgreSQL 9.6 for its support customers. You can now make full
> use of the additional features offered with PostgreSQL 9.6 while using
> Multi-Master Replication from BDR, fully backed by 2ndQuadrant’s world
> renowned 24/7 Production Support. Fill out the contact form below to get in
> touch with us.

This suggests to me it's not really open to the public. And there is no
release 2.0 on github
([https://github.com/2ndQuadrant/bdr/releases](https://github.com/2ndQuadrant/bdr/releases)).

Feel free to correct me if I'm wrong and the 2.0 release is accessible without
a 2ndquadrant support contract (or reverse engineering the changes from the
github repo branch, which pretty much ensures you're never going to production
with it), I'm genuinely interested.

