New in PostgreSQL 10 (postgresql.org)
962 points by okket on Sept 21, 2017 | 249 comments

If anyone even remotely involved with the maintenance and development of pg reads this thread - thank you! - for all your efforts in building and improving a first-class product that keeps me amazed at the strides it takes with each major release.

Truly. We've recently moved from Oracle (after using it for 15 years) to PostgreSQL. It's like a breath of fresh air. The documentation for Postgres is unbelievably superior to Oracle's. So far its performance is equal to or better than Oracle's. We had to go through and rewrite thousands of queries, but the SQL syntax of Postgres was always simpler and more logical than the equivalent in Oracle (I think Oracle has too much baggage from being around too long). All in all, I'm so impressed by Postgres. I'm sure there are features in Oracle that Postgres doesn't have that keep people on Oracle, but I would imagine that the vast majority of Oracle installs could be moved to Postgres.

Do you plan to write more about the migration, like a blog post? That would be very interesting to read.

This needs to become a thing, where people produce writeups on Oracle-to-Postgres migrations and how much better it is, under something like #RunsMuchBetterWithPostgres ...

It is strange that there aren't more writeups on Postgresql migrations.

Yandex had a good one. Posted here a long while ago: https://news.ycombinator.com/item?id=12489055

> It is strange that there aren't more writeups on Postgresql migrations.

It's probably because switching databases is very painful and rare.

I would think that not having to pay the Oracle/MS/Sybase/Teradata licenses is a good enough incentive...

Speaking of economic incentives.

That could be a thesis topic for my hitherto unwritten PhD in Behavioral Economics:

"Choosing to pay: What are the behavioral aspects of top leadership, and why do people still choose proprietary? A quantitative case study."

If I had to wager, it's support. Oracle/MS generally provide much better support than open-source software.

Oracle is definitely not known for their good support, and there are plenty of companies which provide really good PostgreSQL support.

This. It was worth it, but it still took us the better part of a year.

It would take a lot more than a single blog post. It would probably take dozens of blog posts (or a book). It took us the better part of a year. There are just so many things that have to be taken into consideration and learned.

I'd be very interested in this after years of working with Oracle (and SQL Server); I know a little Postgres and would like to learn more.

+1 to this.

I think most Oracle installs are from the days when RAM was limited and incredibly expensive.

Right now anyone can afford a DB server with 128 GB of RAM, enough to solve most problems. #YourDataFitsInRam

>> I think most Oracle installs are from the days ram was limited

Could you explain?

Personally I use Oracle (and hopefully one day PG) because it gives me guarantees on data integrity and these guarantees help me to think about my programs more easily.

Oracle started before microcomputers (PCs) were anything but toys. In 1983, they were multiplatform: mainframes and minicomputers. A 1983 minicomputer might have 2 CPUs, each good for 2 MIPS, 8 MB of RAM, and primary storage of as much as a gigabyte or two of disks.

In the early 2000s, a serious database machine -- say, a Sun E10000 -- had up to 64 CPUs running at 400-500 MHz each, 64 GB of RAM, and a huge cabinet full of disks.

Now you can call up a SuperMicro VAR and order a 64-core AMD EPYC server with 2TB of RAM and 4 TB of NVMe drives, that talks to other machines over a 100Gb/s ethernet, for a price around the same as a used Tesla S85.

If your database is under 2TB, it all fits in RAM on commodity hardware, where commodity is defined as "anyone with a credit card can order it without even talking to sales".

>>> Oracle started before microcomputers (PCs) were anything but toys.

I didn't know that. At that time I was discovering Applesoft basic :-)

Oracle is the real elephant in the room, all pun intended.

Seconded. Postgres is such a high quality project in so many ways. I'm rarely left wanting when using it, and very often pleasantly surprised.

Agreed. After years of having used Oracle and SQL Server at work, to find I could have a working database in about 10 seconds with

    sudo apt-get install postgresql

was amazing to me!

In 2015 we upgraded from Oracle 10 to 11. It took two weeks.

I took the chance to upgrade our PostgreSQL instance. It took less than an hour. It was a smaller database, but the difference in complexity was still impressive.

SQL Server on Linux in a container is just as easy now:

   docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=password' -p 1433:1433 -d microsoft/mssql-server-linux
EDIT: Or with apt-get too:

   sudo apt-get install -y mssql-server

Those aren't all of the instructions for apt-get. The full instructions are on https://docs.microsoft.com/en-us/sql/linux/quickstart-instal... and appear to consist of:

  curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
  sudo add-apt-repository "$(curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list)"
  sudo apt-get update
  sudo apt-get install -y mssql-server
  sudo /opt/mssql/bin/mssql-conf setup
  sudo apt-get install -y mssql-tools unixodbc-dev
  echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile
  echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
  source ~/.bashrc
And at that point, you have a limited period trial license of a release candidate version installed, and so once you want to use it for production, you have to start dealing with license management issues, which could limit your ability to easily set up dev envs, test envs, hot spares, etc.

All of that is on top of the issues of how heavyweight it is due to containing a platform abstraction layer that consists of a very large amount of Windows running in user mode, as discussed by others in this thread.

Yes, you have to add the repo, the same as if you wanted the latest postgres bits on any distro. You also have to edit configs for both and you can also choose to install extra tools for both.

I'm surprised at how many people seem to have taken issue with a simple comment. Postgres is easy to install... and now so is MSSQL compared to the complex slow GUI installer they had before. Maybe it'll help the original commenter, maybe not. Let's just leave it at that.

Debian and derivatives have it in repos by default. Please don't make claims like those.

That's more to do with the OS than the DB, and many repo entries are outdated. MSSQL hasn't existed on Linux before. But either way:

> I'm surprised at how many people seem to have taken issue with a simple comment.

Due to Microsoft SQL Server licensing, I would be highly surprised if any Linux distribution were to make it available by default in their repositories.

It's not that the OS supplied versions are outdated. They are updated whenever a minor release comes out.

The thing is, that you must migrate your database between major pgsql releases and you don't want your distro to force it on you. The community pgsql repo has the same rules - you have separate repo for each major release, so no surprises either.

Let's not let this devolve into a db war. This submission is about PostgreSQL 10. Your parent expresses how easy it is for them to get PostgreSQL up and running in their environment—no comparison saying that others are bad (edit: though they did have some difficulty in the past). It's great that you've got a good solution that works for you! And I'm sure there are other ways of doing this as well for other dbs. If you have experience with PostgreSQL and SQL Server features that you'd like to compare that are relevant to these PostgreSQL release notes, please do share them constructively. I'm sure there are people that would be interested in hearing about them.

Parent here. Yeah, SQL Server is pretty nice to work with as well!

Where's the db war? The comment was implying that MSSQL wasn't as easy to install, and I offered advice for the latest version. Neither database is really production-ready with a one-line install, but it seems you're finding conflict that doesn't exist.

That strikes me as comparable only at the most superficial level. I started that SQL Server container and it's using 825 MB of RAM (according to systemd-cgtop), just sitting there idle with no data in it. I started a Postgres 9.6 container, also using the official Docker image, and it's sitting idle at 58 MB of RAM.

Insult to injury, if you're using Docker in a VM (as you must on a Mac) the instructions on that Docker Hub page say you need a minimum of 3.25GB RAM dedicated to the VM. That's quite a heavy lift if all you want is a database instance for local development.

And then sure, you could run SQL Server on Linux for your production instances, but you're well off the beaten path. You're going to have to roll your own infrastructure rather than relying on AWS RDS, Google Cloud SQL, Heroku, etc etc. And the moment you have a performance problem the first advice you'll hear seems likely to be "try it on Windows and see if it reproduces there".

That 58 MB RAM number isn't necessarily a good thing - one thing Postgres has regularly fought against adoption-wise is that its defaults are meant to be sane on a shared machine, not to provide maximum throughput on a dedicated database server. While I appreciate this choice personally, I also suspect a good container default would idle higher.

(I'm a huge pg fan if that isn't clear - just ...)

SQL Server uses its own mini-OS layer, and this is what lets it run on Linux too, so there are higher system requirements.

It seems you're just guessing on that last part though - SQL Server has been in production for decades and already has lots of tooling, and there are even new CLIs available for the Linux environment. Production SQL Server comes with support from MS; you're definitely not on your own, nor will they just tell you to run it on Windows. This is simply not how enterprise support works, and they would have a hard time staying in business with answers like that.

We can discuss support issues but it's best to leave the assumptions and hyperbole out of it.

"Assumptions and hyperbole"? That MSSQL isn't available as a service like RDS? That tooling and integration that's been building up for decades around PG (or Mysql/Maria for that matter) doesn't exist? These are just facts.

I'm sure you're right that MS would be happy to sell support. What I don't see is a reason to care, unless I run in to a Windows app that can't use another DB that I suddenly need to run under Linux.

I'm not sure what you're arguing. MSSQL is available on RDS and to quote the parent comment:

> ...SQL Server on Linux for your production instances, but you're well off the beaten path...first advice you'll hear seems likely to be "try it on Windows and see if it reproduces there"...

You don't think this is hyperbole? Have you actually used SQL Server or Microsoft support before? They do not treat enterprise customers this way, and both Linux and MSSQL have plenty of tooling available. There's nothing challenging here.

> What I don't see is a reason to care

You're not the target market then. MSSQL has some compelling features and a big ecosystem, adding cross-platform support and a much easier installation process only helps customers with more options for deployment.

Oops, you're right; I thought it wasn't on RDS. My mistake, and a good reminder to double check these things.

You left out the licensing step; working out what license you need is its own time sink.

That's not as easy in at least two ways: having to use Docker, and having to know that command line.

That command line is basic docker usage. Containers are far easier and cleaner than apt/yum installations at this point but you can just do this too:

   sudo apt-get install mssql-server

I had issues with the docker container on Debian 9. I think the container is only tested on Ubuntu.

And they didn't even ICO



* Native partitioning

* Parallel query

Honestly, some very welcome quality of life improvements for use cases even outside of what I would consider "Big Data".

> This means that users no longer need to create triggers for routing data; it's all handled by the system.

Trigger routing has always been a performance footgun... to the point that it's sometimes better to handle table routing in application logic. I wonder how the performance of this new "native" partitioning compares with existing methods (triggers and rules), and whether it makes application routing always inferior.
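For anyone who hasn't seen it yet, a minimal sketch of what the PG 10 declarative syntax looks like (table and column names made up):

```sql
-- Parent table declares the partitioning scheme; it holds no data itself
CREATE TABLE measurements (
    logdate date NOT NULL,
    reading numeric
) PARTITION BY RANGE (logdate);

-- Each partition covers a bounded range of the key
CREATE TABLE measurements_2017q3 PARTITION OF measurements
    FOR VALUES FROM ('2017-07-01') TO ('2017-10-01');

-- Rows are routed to the right partition automatically -- no triggers
INSERT INTO measurements VALUES ('2017-09-21', 42.0);
```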

So far as the speed is concerned, we ran some tests a while back and it came back as nearly native performance when compared to interacting with a non-partitioned table:


That's really awesome to hear. I think it is fair to say that app-level partitioning on top of Postgres is now obsolete.

Parallel query was already there in 9.6 though this release improves on it further.

It makes parallel query much more useful. There were very few places where parallel query helped with my workload in 9.6; that is not true for 10.

Parallel index scan is going to be huge for me. I've got a table with 80 million rows I have to constantly dig through; indexes help a lot, but when they get so stinking big, being limited to one thread really hampers performance.

I worry that having things "all handled by the system" can be even more of a performance footgun. I mean that users will be able to do things that are allowed by the docs, but which result in expensive and hard-to-see things happening under the hood.

This is a valid concern; I've seen simple automatic optimizations backfire before. An example I remember from MS SQL Server (older versions): a very simple optimization of using indexes can backfire with the dynamic-filter pattern:

    where ((@foo is null) or (foo = @foo))
      and ((@bar is null) or (bar = @bar))

Depending on statistics, order of execution, indexes in place, and the phase of the moon, this could produce fast results or have to revert to a full table scan (because it uses a cached query plan with the wrong index).

Devs still need some awareness of how queries translate to actual running instructions, and to measure that the results are in line with what is expected.

Modern pg libraries only cache the plan with the query params involved as well so this generally doesn't happen.

(I made it happen with an older version of things and ... yes, argh)

This is a completely reasonable worry, though I must note that Postgres has done it way more rarely than anything else I've dealt with, which may explain the voting response to this comment.

I noticed hash indexes are now crash proof and replicated -- this seems to make them actually usable in production. In other words, this release effectively "adds" a new index type. That seems like a much bigger deal than is being talked about, is there any reason to believe that new databases shouldn't be using hash indexes for columns that won't be supporting range queries? (In other words, pretty much all keys.) I've seen mixed reports about hash indexes being beneficial.

I think it's a neat feature. But note that orderedness isn't just useful for range queries; it also helps to satisfy ORDER BY and to allow for merge joins without a sort step. There's also no support yet for index-only scans (probably never), no constraints, and no multi-column index support.

There are also still some performance kinks to work out with current hash indexes - largely when growing the index quickly or under very high concurrency.
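For reference, trying the now crash-safe hash indexes is a one-liner (table and column names made up); note they only serve equality predicates:

```sql
CREATE INDEX orders_token_hash ON orders USING hash (session_token);

-- Can use the hash index:
SELECT * FROM orders WHERE session_token = 'abc123';

-- Cannot (a hash index carries no ordering):
SELECT * FROM orders ORDER BY session_token;
```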

Stop it! You're giving away all possible answers to one of my favorite interview questions.

Edit: not seriously asking you to stop, but the interview question part is true

I hope those are interview questions for DBAs, because it's way outside common knowledge for backend developers.

Why are back-end developers not supposed to have a deep understanding of their tools? I hear this sort of statement all the time, it’s infuriating. I once even had a dev team lead claim that it was not a dev’s responsibility at all if the app generates bad SQL; the DBA is supposed to make the queries fast (this person no longer works for me, but I consistently hear similar claims from consultants and others in the dev community).

DBAs are expected to understand the crappy inefficient row-at-a-time ORM in use by the devs, which implies understanding a particular programming language, but not the other way around? There’s nothing magical about SQL databases that a decent developer can’t understand.

> Why are back-end developers not supposed to have a deep understanding of their tools?

You mean all their tools? Like how does modern processors handle branch predictions in the face of out of order operations, or what differences in server cooling will affect throughput or peak performance?

Do you have a deep understanding of all the tools you use? Because I honestly doubt so.

I think there is a very large gap between being able to answer that interview question effectively and having decent understanding of how to not misuse your ORM.

Specifically with ORMs, it does seem that a lot of devs don't understand or care about the generated SQL... but that question was way above the knowledge level I would expect/want out of someone, IMO.

Not for DBAs, but I also don't expect candidates to get all the answers in that comment. It's a decent data structure question to ask why databases generally use binary trees for indexes when hash lookups are faster. Non-DBAs usually have to think a little bit.

Databases do not use binary trees for indexes and tables; they use B-trees. B-trees are a family of self-balancing tree data structures which usually are not binary, because in a database you want the tree to have few levels.
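The equality-vs-range trade-off behind that interview question is easy to demonstrate outside a database. A rough Python sketch: a hash map answers equality lookups directly, but only sorted order (a stand-in here for a B-tree's leaf level) can answer a range query without scanning everything:

```python
import bisect

keys = [3, 9, 14, 27, 31, 42]          # sorted, like a B-tree's leaf level
index = {k: f"row-{k}" for k in keys}  # hash index: equality lookups only

# Equality lookup: O(1) average in the hash map.
print(index[27])  # row-27

# Range query 10..30: binary-search both endpoints of the sorted list,
# touching only O(log n) entries -- a hash map cannot do this.
lo = bisect.bisect_left(keys, 10)
hi = bisect.bisect_right(keys, 30)
print(keys[lo:hi])  # [14, 27]
```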

Thank you. I remember having to balance a B-tree in my DB final. Five nodes deep. Had to show all of the work. I think it ended up with 20-some revisions based on all the insert and delete statements.

Thanks! The merge join issue seems like one that probably would affect many workloads. Ordering by key less so but good to realize.

I guess it would be helpful for this feature to also include some documentation about if/when to use hash indexes, since up to this point the db has issued a warning basically saying "don't use these." Right now I'd imagine that there will be a lot of confusion around it given the feature's history so people will just not consider it.

Do you know how that'd work out for those of us with UUID primary keys rather than integer primary keys?

Not quite sure I understand exactly what you're getting at. Are you wondering if those constraints are less important because you less frequently order by UUIDs? If so, I don't think it makes a huge difference - merge joins are still relevant, and you'll often ORDER BY anyway just to get stable resultsets. If the question is whether there are bigger benefits to hash indexes for largely unordered data, then yes, that's true.

Equality operations are also cheaper with hash indexes than btree because fewer data pages need to be fetched, so this can be a gain for certain columns with high cardinality. You should really look at whether those could be used or not; you may gain in performance with a switch.

> ICU Collation Support

This tiny little mention is a massive win! We now have access to the "und-x-icu" collation. As the postgres docs[1] put it: "ICU root collation. Use this to get a reasonable language-agnostic sort order."

I've been waiting on this forever. Columns containing strings from any/all languages can now be sorted well enough to use postgres in truly multi-language applications. If your application has users writing in English, German, Russian, Japanese, et al, using the "und-x-icu" collation should make your application much more user-friendly (as opposed to using the en_US or C locale).

The ICU root collation is far from a perfect system, but it's the best option available today. I've been tied to MySQL purely for its similarly-purposed utf8mb4_unicode_ci collation. Now I can give postgres another very serious look.

[1] https://www.postgresql.org/docs/10/static/collation.html
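For anyone wanting to try it, usage is just a COLLATE clause (this assumes a build with ICU support enabled; table and column names made up):

```sql
-- Sort a mixed-language column with the ICU root collation
SELECT name FROM users ORDER BY name COLLATE "und-x-icu";

-- Or set it per column, so every sort and comparison uses it by default
CREATE TABLE posts (title text COLLATE "und-x-icu");
```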

PgSQL more and more looks like the "redis of databases". Whatever the problem, you can almost always find a good reason to use it :)

You get great performance for the simple "dumb" use case (relational data in DB terms, key/value store in Redis's case), and lots of awesome additional features on top of it for more complex situations.

I love it.

> the "redis of databases"

That's like calling Swift "the Ruby of programming languages".

and it says more about redis and ruby than postgres and swift :p

or the vi of emacs ? Or was it the other way around ? :-)

> "redis of databases".

That's like saying <good truck> is a "Prius of 8 wheelers".

That's a really weird comparison, especially since PostgreSQL is more of a database than Redis is.

Maybe you meant the Swiss Army knife of databases.

Floating point timestamps - the only item that affects my work is also one of the few on this wiki page with no description. Support for them was removed, with no backward-compatibility path.

For anyone else using them, I just did a bit of reading [1] and it looks like they are inferior to the default 8-byte timestamps. FP timestamps have microsecond precision around the year 2000, but it drops off as you move away from there.

I was using them to be the same as some GPS timing code, but it seems this isn't needed or desirable.

[1] https://www.postgresql.org/docs/9.0/static/datatype-datetime...

Yeah... You should never use floating point when your use case wants uniform precision across the full range; that's the opposite of what floating point gives you.
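The drop-off is just IEEE 754 behavior: a double's representable values get sparser as magnitude grows, so a seconds-since-2000 float loses sub-microsecond resolution for dates far from the epoch. A small Python illustration (the 300-year offset is an arbitrary example):

```python
import math

MICROSECOND = 1e-6

# PostgreSQL's removed float8 timestamps counted seconds from 2000-01-01.
near_epoch = 86400.0                # one day after 2000-01-01
far_epoch = 300 * 365.25 * 86400    # roughly 300 years away

# math.ulp(x) is the gap to the next representable double, i.e. the
# best precision available at magnitude x.
print(math.ulp(near_epoch) < MICROSECOND)  # True: microsecond precision holds
print(math.ulp(far_epoch) < MICROSECOND)   # False: gaps now exceed 1 microsecond
```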

> Cross-column Statistics

Holy crap, I didn't even know this was feasible, much less in development! Can't wait to test this out, as a significant amount of my data sets have these kinds of relationships.

Parallel query will also be great for certain queries I do regularly. Been looking forward to this!

Thank you, thank you, to all the devs that had a part in this!

Cross-column statistics are one of my favorites. The way Postgres 10 handles it is really a _huge_ advancement that addresses a painful edge case that has been around for ages.


To be honest, the skewed-data problem can be addressed with partitioning in many cases.

Cross-column statistics is a great feature nevertheless.

Yes, but partitioning also has a number of limitations that may be quite undesirable. Like inability to create UNIQUE constraints or primary keys, for example.

More importantly it doesn't help at all with cross-column correlations. It can help with skew in between various rows though.

Thanks, I'm happy it got into PG10 too. Just to make it very clear - the current implementation is fairly simple, and may not be smart enough to deal with more complicated dependencies between columns. Hopefully that will be improved in PG11.
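For readers who want to try it: the current implementation is driven by an explicit CREATE STATISTICS object (table and column names made up):

```sql
-- city and zip_code are strongly correlated; record the functional
-- dependency so the planner stops multiplying the two selectivities
-- as if they were independent
CREATE STATISTICS city_zip_stats (dependencies) ON city, zip_code FROM addresses;

-- The extended statistics are gathered on the next analyze
ANALYZE addresses;
```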

Full text search in JSON & JSONB looks exciting. https://wiki.postgresql.org/wiki/New_in_postgres_10#Full_Tex...

As someone working with JSON output from the Gmail API, I'm curious to see how people smarter than me take advantage of this new functionality so I can adopt it as well.
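The basic shape, if I'm reading the docs right, is that to_tsvector() now accepts json/jsonb directly and searches only the string values (table and column names made up):

```sql
-- Search the text values inside a jsonb column
SELECT id FROM emails
WHERE to_tsvector('english', payload) @@ to_tsquery('invoice & overdue');

-- An expression index makes the search fast
CREATE INDEX emails_fts ON emails
    USING gin (to_tsvector('english', payload));
```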

SCRAM authentication looks like a nice security improvement also. No description on the site but http://paquier.xyz/postgresql-2/postgres-10-scram-authentica... gives a good overview.

Seems like this might make using passwords compliant with FIPS 140-2. (Not sure, so maybe someone else can share their opinion.) Previously I heard in a few places that people would use LDAP to delegate the auth to something else, e.g. here: https://news.ycombinator.com/item?id=12129906

It's an improvement for sure, but I am curious -- does anyone situate a Postgres instance where it is publicly accessible? Who was asking for this feature?

Several reasons:

- Checkbox item for people not wanting MD5 anymore.

- Storing passwords on the server in a securely hashed way, so the admin won’t know your password.

- PostgreSQL developers getting out of the roll-your-own-crypto game.

Lots of cloud-managed DBs are/can be publicly accessible - e.g. all Heroku-hosted Postgres instances.

Are they? Even without SSL? By default?

SSL is (was?) required. I left Heroku about a year ago and it's nearly inconceivable that this would be changed, having been the case for many years.

I don't think they've implemented certificate validation since I left, though.

My naive hope, going on many years, is that SCRAM with channel binding would have landed years ago (the first versions of the patch began to show up then), making client-side certificate checking obsolete. (Let's get real: it's hard enough to use that many people will not validate when developing from their laptops, simply backspacing out the optional cert-validation connection option - an elision that is invisible to the server.) It should be possible to modify the definitions of pg_hba.conf to require a channel-bound SCRAM connection, which would mean that the client is certain to have checked for an untampered certificate.

This implementation of SCRAM doesn't have that yet, but it's been an ambition of the author for some time to do so.

A patch implementing channel binding has been presented for integration into Postgres 11: https://commitfest.postgresql.org/14/1153/. Two channel binding types are proposed: tls-finish and endpoint. Per RFC 5802, SSL is mandatory if you want channel binding, as the data needed for binding validation is either the TLS Finished message, which can be found after the SSL handshake between the server and the client (which happens before the password-based authentication), or a hash of the server certificate. All those things are supported by a set of APIs in OpenSSL.

Not sure about SSL, but in the past customers of mine have copy-pasted full Heroku PG URLs to me and I was able to get in via `psql` immediately.

So yes they're public but their addresses are basically impossible to guess.

Known as "Security through Obscurity" [0]

[0] https://en.wikipedia.org/wiki/Security_through_obscurity

Debatable. If the address is really unguessable, the address acts like a regular key.

It's still not good practice, since most systems treat addresses with far less care than passwords and often save and/or transmit them unencrypted.

Oh, I am not saying it's a good practice at all. I was just answering the question.

I still think it's a low-friction solution. But a secure one -- hardly.

> yes they're public but their addresses are basically impossible to guess.

IPv6 only, then?

Sometimes people do a conference talk or just share the screen, and it's easy to take picture of that URL.

True. I am not saying it's the best idea around, only that it's low friction. I'd probably approach it differently but I can see why they did it like they did.

I think by far the biggest benefit is being able to check the "no insecure crypto algorithms used" box. Even though the way md5 was used wasn't really that concerning security wise, it constantly comes up.

For anyone using postgres on a daily basis I would highly recommend pgcli. It has a very good auto-complete, and can even suggest JOIN statements (parts of it actually).

do you find that it's significantly slower than just using psql?

Yes, in many cases (until I stopped using it) it messed up my benchmarks. I was testing some queries, and often it was:

- pg server fast, returns 10000 rows, pgcli takes time to parse -> result: total time 10 s

- pg server slow, returns 1 row, pgcli instant -> result: total time 10 s

As we are working in efficient C/C++, our internal time to parse the query set is closer to psql's than pgcli's, so we tend to prefer the "pg server fast" case, but the benchmark using pgcli did not clearly show the best time.

Also, when you are testing a query on one machine, you have to carefully monitor the CPU usage to differentiate the CPU pegging by postgres from the one by pgcli.

All those issues were gone when I got back to standard psql.

For benchmarking and optimising queries you almost certainly want `EXPLAIN ANALYZE`. That will show you both the plan Postgres chose and the time actually taken by the query.
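I.e. something like this, where the query itself is whatever you're measuring (names made up):

```sql
-- ANALYZE executes the query for real and reports per-node timings;
-- BUFFERS adds shared-buffer hit/read counts
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM orders WHERE created_at >= now() - interval '1 day';
```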

`EXPLAIN ANALYZE` - I read that in a Dalek's voice.

Yes, sure, but executing the query a couple of times on the command line is also something I do

yeah this was my experience as well when I first tested pgcli a couple months back. Lots of neat features, but the differences in time to parse bugged me too much

PostgreSQL keeps getting better and better every time! It's amazing the type of features you can pull off once you have a solid foundation :) (P.S. Having used MySQL in the past, MySQL now feels like a stone-age database.)

What is the current best option for Postgres failover? I looked at this at the start of the year, and found lots of options, but all of them seemed to have various drawbacks, and none were natively supported or built-in to Postgres.

I would recommend repmgr[1] over solutions like the DRBD one suggested elsewhere in this thread. The simple reason is that DRBD replicates at the block level and doesn't understand Postgres data structures, while repmgr relies on mechanisms provided by Postgres.

So if there is a failure, you're far less likely to learn that all your data is gone because some important block on disk was not replicated and the database is not readable. With repmgr the worst that could happen is that you might lose the few latest transactions, but your data should always be consistent and in a working state.

[1] http://repmgr.org/

Repmgr is relatively easy to set up and supports automatic failover with repmgrd. Resyncing a failed node is also just a few commands. The only limitation I have encountered was that I had to run it behind pgbouncer so the clients can always connect to the primary instance. I think that will change with 10, as it will support client-side discovery of the primary node.

This is a bit outside my area of expertise. Not sure about the best way to do the automated promotion process, but looking through these docs there's mention of a change to libpq that allows you to specify all the potential hosts and then have it only connect to whichever one is accepting writes (target_session_attrs).
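If I understand the new libpq behavior correctly, a client can list every node and let the driver pick the writable one - roughly like this (hostnames made up):

```text
psql "postgresql://db1.example.com:5432,db2.example.com:5432/mydb?target_session_attrs=read-write"
```

libpq tries the hosts in order and keeps the first connection that reports itself writable, so after a failover clients converge on the new primary without a config change.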


Edit to add: I'm also interested to hear the approaches people use here. Today we've been setting up WAL-E to run our db backups so we're in that headspace. I'd actually like to decommission my main db server and provision another one to take its place. Getting the new one up and running from the wal-e archives is easy enough, the switch-over is a little more complex.

There is no native solution because there is no single failover scenario. Some people prefer to do fencing using network tools like firewall rules or managed switches; others just reconfigure their application's connection string to remove the failed node, or reconfigure connections with a middleware like pgbouncer. There are surely other fencing solutions that don't come to mind right now, and others I don't even know of. For failover you can use tools like repmgr, which provides automatic failover, or set up your own monitoring, which can either trigger an alert to your pager so that you run the failover yourself, or trigger some configuration manager to reset the nodes (I strongly suggest using Ansible ;) ). This is what makes Postgres flexible: it's not tied to one solution.

I think a reasonable option is to use pgbouncer. I can't say much about it because I long ago abandoned databases with no failover in favor of solutions that work, but people who tried hard enough seem to have gotten pgbouncer to a working point.

Postgres fans will hate this, but if you really need automated failover, you might have to look at other databases: either Postgres/Oracle in the paid SQL world, or Elasticsearch/Cassandra/Riak in the free NoSQL world.

For a local cluster, try DRBD to replicate at the disk level. DRBD replication can be synchronous and is very fast.

Add Linux-HA for health monitoring and automatic failover. When failed over, the standby machine just starts up PostgreSQL and recovers from the transaction log. Uncommitted transactions on the primary machine that haven't been written to disk and replicated will be lost, which is consistent with the transaction semantics. Committed transactions are replicated and recovered from the transaction log on the standby.

Depending on the heartbeat interval and how fast PostgreSQL starts up and recovers from the log, the automatic failover can happen within seconds.

Configure a virtual IP for the cluster and let the standby machine take over the IP at failover. Linux-HA does all that for you. The clients talk to the virtual IP and don't have to know which machine is primary or standby. They just need to reconnect when disconnected at failover.

The nice thing about DRBD and Linux-HA is that they're not just for PostgreSQL. Any disk-based application can benefit.

I would strongly discourage this kind of solution. DRBD replicates disks. As pointed out, it is not Postgres-specific, and it doesn't understand the data it replicates.

One of the nightmares of replication is latency: data never makes it to the other host instantly. This means that once the master fails, the database files on the other host might be in a corrupted state.

If your data is important, you should use the replication mechanisms provided by the database. Repmgr[1] makes it easier to set up replication and perform failover. It also has mechanisms for automatic failover, but I would be careful with that as well.

[1] http://repmgr.org/

DRBD's replication is synchronous. That means the block device's write() call on the primary won't return until the data is written onto the standby's disk. It's as strong a guarantee as you can get.

If the secondary is unavailable, the data is queued on the primary and synced to the secondary when it comes back online. If the primary's disk is destroyed during that window, the failure case and data loss are the same as with the database's native replication.

DRBD is much simpler and less error-prone precisely because it doesn't try to understand the application data. It just deals with disk blocks. The simplicity also helps performance greatly. Benchmarks have shown negligible impact when doing synchronous replication.

I guess that's better, but it would just decrease performance.

I still think using binlogs is safer and more efficient though, since it transfers only what's needed to make an update.

Performance is the same as, if not better than, database-level replication. Database-level synchronous replication also has to wait for the standby DB to finish writing; if it's async, it has a worse data-loss problem.

DRBD replication is fast because it intercepts the write call and then issues the network send and the write to the underlying disk in parallel. Gigabit or 10G network IO is orders of magnitude faster than disk seek/write. The write to disk on the standby server can happen in parallel with the primary's write to its underlying disk.

Shouldn't that be PostgreSQL X?

Seriously good work. I'm not sure why my firm still buys Oracle licenses.

Well, this Postgres release introduces native declarative partitioning; in Oracle, it has been available for something like 20 years.
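For the curious, the new declarative partitioning in 10 looks roughly like this (table and range are illustrative):

```sql
-- The parent table declares the partitioning scheme; no rows live here.
CREATE TABLE measurements (
    logdate date NOT NULL,
    value   int
) PARTITION BY RANGE (logdate);

-- Each partition covers a range; inserts are routed automatically.
CREATE TABLE measurements_2017
    PARTITION OF measurements
    FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');
```

Before 10 you had to simulate this with inheritance, CHECK constraints, and insert triggers.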

Also, some shops are heavily invested in PL/SQL code.

The price of migration may far outweigh the license savings.

My previous company migrated to Postgres by using EnterpriseDB[1]. This allowed them to run existing Oracle data and PL/SQL while they worked on migrating it to raw PostgreSQL. So where there's a will there's a way.

[1] https://www.enterprisedb.com

>> The price of migration may far outweigh the license savings.

Definitely. The cost of the work is big compared to licensing. Now, PG has a "cool factor", so I bet some people would migrate to it off hours just for fun :-)

Certifications or compliance?

Postgres isn't just my favorite database. Postgres is an example for all people about how a project should be run. There are lots of stakeholders who want different things. The dev team looks at what's being asked for and what is reasonable within a certain timeline. And then the team delivers features and supports them. And then on top of that there is best-in-class documentation.

Thank you to all of the team members who put your time and effort into this project. It's not only a wonderful tool that I use every day, it's also a model of how to manage a project.

"best-in-class documentation"

What am I missing? I dread having to look up how to do things in Postgres because their documentation is so horrible to find things in.

Can you provide an example? In my experience I find the index[0] particularly useful, as well as the SQL command reference[1], the datatype reference[2], the function reference[3], and the server configuration section[4]. I know other people have different usage patterns and different amounts of familiarity with the docs, so I'm aware that my experience may not match yours.

[0]: https://www.postgresql.org/docs/current/static/bookindex.htm...

[1]: https://www.postgresql.org/docs/current/static/sql-commands....

[2]: https://www.postgresql.org/docs/current/static/datatype.html

[3]: https://www.postgresql.org/docs/current/static/functions.htm...

[4]: https://www.postgresql.org/docs/current/static/runtime-confi...

SQL Server admin and dev here: these features look really compelling. The best part, though, is that a team of devs could run these in production, do some self-support, and basically pay nothing. Not so in SQL Server land.

Does anyone have any use cases where PostgreSQL falls down/loses to other DB systems? I know sharding/replication has long been a sticking point, but what else is there?

Why do people still choose MySQL/MariaDB/Oracle over PostgreSQL at all?

The biggest case almost completely unhandled by PostgreSQL is if you need a clustered index like in MySQL. There are basically two common ways of organizing rows in a database: store everything in a tree according to the primary key, or store everything in a heap, and have a separate tree relating values to pointers into the heap. MySQL stores rows in a tree, and PostgreSQL stores rows in a heap.

The strategy used by PostgreSQL is simpler, more flexible, and cleaner, but there are some cases where MySQL's way of storing things can be significantly faster.

There has been talk of PostgreSQL supporting clustered indexes, but that would probably require reimplementing a ton of stuff to support both storage formats efficiently. Also, it turns out that there are a bunch of ways in which PostgreSQL can cheat such that in most workloads there's no difference at all - though not all of these cheats have been implemented.

Note that I didn't say MySQL is faster or slower than PostgreSQL. Such statements don't make sense, because it depends entirely on your workload, and as long as you aren't literally using PostgreSQL as a key/value store instead of a relational database, you can probably always find a way to make it as fast as necessary.

> The biggest case almost completely unhandled by PostgreSQL is if you need a clustered index like in MySQL.

Heh. It's funny how "the biggest thing" is different for a lot of users. I can't count the number of times I heard "if you only had XXX, I'd migrate / be happy / ..." for vastly different features. If you then tell them what other people think the most critical bit is they look at you with some surprise.

That's not to say these people, including you, are wrong, just that it's not always that clear what the highest priority items are.

It doesn't get easier when there's no coherent direction for the entire project, due to all the different directions various companies and people want to go in (although that has a lot of advantages too).

WRT clustered tables: I personally think it's a good feature, but there's other more pressing concerns. Since my employer thinks that as well, I'm for now working on other things...

> Why do people still choose MySQL/MariaDB/Oracle over PostgreSQL at all?

Oracle: because the enterprise is its own unique beast. Companies frequently stick to what they know and have invested into, so long as it still works. Enterprise customers buy/adopt software in a very different way than everyone else.

MySQL: if you've got a history with the product and it works well for what you're doing, there is absolutely no necessity-based reason to switch to PostgreSQL. You should switch from MySQL to PostgreSQL only if there's a very good reason to: a meaningful gain to be had. It's that simple. Otherwise, it's unnecessary optimization. There are a vast number of considerations when operating a business; upgrading to the latest & greatest (speaking very broadly) just to do it is not a good reason most of the time. There should be a specific prompt to action, based on a gain or capability that PostgreSQL provides for your use case over e.g. MySQL.

>>> Companies frequently stick to what they know and have invested into, so long as it still works.

This. For example, I'm still on Java 1.6 because the cost of migrating to 1.7 (Java itself is not the problem, the libraries we use are) is big. I'm still on Solaris because moving to Linux (for example) would imply redeploying thousands of programs, which translates to tens of thousands of man-hours.

Big corps go at geological speed :-)

Aside from those reasons, are there technical reasons for Oracle over Postgres? My coworkers and I want to move our data warehouse over to Postgres and my boss has been warming up to the idea. (I know switching databases with that much data can be complex and troublesome, but those concerns are outside the scope of this question)

Oracle is still a very good option for data warehousing. When well tuned it's extremely performant and reliable for that purpose. If I were operating an EDW today, I'd choose Oracle over Postgres 10 times out of 10 when it comes to very large scale operations. You'll still see plenty of credible arguments around that Oracle can be optimized to be faster than every major alternative. And you'll still see arguments around replication that are legitimate.

Obviously I don't know what your cost or performance considerations are. If your organization can save a meaningful amount of money and operational headache by switching to Postgres, that can be a big deal for any business. If cost isn't much of a consideration in your operation (that is, what sticking with Oracle is costing you), stick to Oracle for data warehousing over Postgres - at least for now imo.

With Oracle it's the same story it has been for a long time (and probably will remain), it's expensive and a pain in the ass to get up and running the way you want it. Postgres has everything to gain vs Oracle from here forward, it seems very likely to bite into some of Oracle's well guarded high value territory in the coming years.

Dynamic SQL in Postgres is really frustrating. I understand why. It's because you shouldn't fucking write SQL that writes SQL.

On the other hand, sometimes you just have to. SQL Server . . . I won't say it makes it easy, but it doesn't make it so goddam impossible.

If I lived in a world without Postgres, I would use SQL Server without thinking twice. MySQL/Maria I'm not fond of, and I kind of don't want to touch them. Oracle has some cute features and language extensions, but you have to deal with Oracle.

I can't think of a use case where Postgres falls down compared to other RDBMSs.

What we are seeing is Postgres competing with NoSQL and Mongo-type stuff. With full-text search support for JSONB types, we're looking at Postgres moving into Elasticsearch territory with a vastly simpler deployment model while still carrying ACID guarantees.

It's pretty remarkable.

I don't understand where you're having trouble writing dynamic SQL with Postgres. I've worked extensively with SQL Server and Postgres, and I found both to be pretty equal once I learned the ins and outs of each.

Dynamic SQL is easy in Postgres. Just create a simple PL/pgSQL function that takes a text argument and executes it. Works perfectly for me for anything I need, such as building tables, views, functions, etc.
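A minimal sketch of that kind of wrapper (the function name is made up; the `EXECUTE` statement and `format()` are standard PL/pgSQL):

```sql
-- Thin wrapper around EXECUTE. Beware of SQL injection: build the
-- string with format(), quoting identifiers with %I and literals with %L.
CREATE OR REPLACE FUNCTION exec(sql text) RETURNS void AS $$
BEGIN
    EXECUTE sql;
END;
$$ LANGUAGE plpgsql;

-- Usage:
SELECT exec(format('CREATE TABLE %I (id int)', 'my_table'));
```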

It's disappointing that pgsql has no index-organized tables aka clustered indexes. They are quite advantageous for a variety of workloads.

You can cluster a table by an index: https://www.postgresql.org/docs/9.6/static/sql-cluster.html

Or am I not understanding what you're asking for?

Indexes in PostgreSQL require lookups into the table to access the data values in the rows. Index organized tables have indexes which include the data values in the index itself, removing the need for the lookup in the table itself.

Here's some more detail:


Perhaps I'm missing something, but your description doesn't seem consistent with my understanding of the index-only scan feature that's been in PG since 9.2 https://wiki.postgresql.org/wiki/Index-only_scans
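A sketch of what that feature looks like in practice (table and column names are illustrative): with a suitable multicolumn index and a reasonably up-to-date visibility map, Postgres can answer a query from the index alone.

```sql
CREATE INDEX orders_cust_total_idx ON orders (customer_id, total);

-- EXPLAIN should show an "Index Only Scan" here, since every column
-- the query touches is present in the index:
SELECT customer_id, total FROM orders WHERE customer_id = 42;
```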

Yes, you're right about index-only scans, in the sense that the data in the indexed columns can be used in some cases. As I understand it, index-organized tables go further than that: row data for non-indexed columns is also included.

That kind of clustering is a point-in-time operation. Subsequent inserts and updates aren't stored in clustered order. When the ordering is guaranteed, the query planner can be more aggressive in optimizing certain kinds of queries. A real clustered index could also be used as an implicit covering index as grzm mentions.

PostgreSQL can do all of the same query optimizations. The main difference is that for primary key scans PostgreSQL will have an additional layer of indirection meaning more work needs to be done when scanning on primary key and potentially more disk seeks.

On the flip side PostgreSQL's approach is good when you query secondary indexes since these can point directly to the heap rather than to a primary key, removing the cost of long primary keys and allowing for scanning the heap in physical order after some kinds of secondary key lookups. It is also cheaper to sequentially scan heap compared to sequentially scanning a clustered index.

There are also some differences on the write side, but the gist of it is that both models have their own strengths and weaknesses.

I don't see how pgsql could perform such optimizations as eliminating a sorting step when ordering results by column(s) in a clustered index if there is no guarantee that new or updated data will also be ordered by the clustered index. I agree that both models have their pros and cons. While I am a big fan of pgsql, I do also appreciate those databases that offer both heaps and clustered indexes that I can mix and match in my schema as best suits my workload.

PostgreSQL can use that optimization since an index scan in PostgreSQL still returns all tuples from the heap in index order, not in heap order. During an index scan PostgreSQL traverses the B-tree in index order and for each matching tuple found in the index it fetches the data from the heap (which is a quite cheap O(1) operation).

Also, this operation takes a full read/write lock on the table.

MySQL has a native bit-count operator (and more data types, last time I checked). Using it ended up being the fastest way to solve one problem I had. I tried Python with NumPy and Java, but generating large combinations in MySQL (joining a table to itself multiple times) and using bit-shift and bit-count operators to filter the rows ended up being far faster than generating the combinations in either language I knew, or in Postgres, which lacks a bitcount operator. Spending a fair bit more time in C or Rust would probably have been the real solution, but that would have involved way more time than my work was willing to let me spend on it.
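The technique, roughly sketched (the table, masks, and threshold here are illustrative, not the original workload):

```sql
-- MySQL: self-join to generate pairs, OR the bitmasks together,
-- and keep only combinations with enough distinct bits set.
SELECT a.id, b.id, BIT_COUNT(a.mask | b.mask) AS bits
FROM masks a
JOIN masks b ON a.id < b.id
WHERE BIT_COUNT(a.mask | b.mask) >= 3;
```

Adding more self-joins generates triples, quadruples, etc., with the popcount filter pruning rows inside the database instead of in application code.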

Also, initial setup is easier with MySQL. It is installed everywhere, and setting it up involves less messing about. (I would imagine there are far more insecure MySQL installs around as a result.)

Yes, if you need to update large numbers of records. For example, updating a single column of every record will cause the whole table to be rewritten, causing a very high IO load.

Mmm, but it's the same for MySQL, no? Whenever we change a column in one of our tables (pretty big), the whole server hiccups for several seconds. We're using Google's Cloud SQL.

At least PostgreSQL allows you to wrap schema changes in BEGIN/COMMIT/ROLLBACK transactions, unlike MySQL.

UPDATE client SET enabled = true;

So something like this will rewrite the whole table because of MVCC. MySQL will update the record in place without rewriting the whole table.
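One common mitigation (a hedged sketch; the table comes from the example above, the batch size is made up) is to update in batches with VACUUM in between, so dead tuples get reused instead of bloating the table:

```sql
-- Repeat until zero rows are updated:
UPDATE client SET enabled = true
WHERE id IN (
    SELECT id FROM client
    WHERE enabled IS DISTINCT FROM true
    LIMIT 10000
);
VACUUM client;  -- run outside a transaction block
```

The `IS DISTINCT FROM` predicate also catches NULLs, so each pass only touches rows that still need the change.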

MyISAM I assume? InnoDB, which I hope you're using, uses MVCC too, just like pretty much all mainstream SQL databases of the last 30 years. Updates occur in place, with the older version of the columns relocated to UNDO space. Not sure what you're gaining over PostgreSQL there. Also, if the column is indexed, the index will contain all versions for each row. This is what allows index scans.

Because the filesystem operates on pages, not individual bytes, MySQL will also, for each updated row, read the whole page containing it and write it back; so updating a column across every row still rewrites the whole table.

MVCC is almost always a feature for my workloads, but it's worth remembering that it's a trade-off.

Thanks for reminding HN of this point.

You're talking about DDL.

They're talking about an in-place rewrite of the value of a single column, which, yes, InnoDB will do with way less write load than postgresql.

Can you provide specifics? AFAICT that applies only to indexes (where there is another layer of indirection in MySQL).

InnoDB clusters on disk by PK so if your record is the same width and you don't change the PK it can AFAIK update in-place.

Very late answer, but no, that is incompatible with MVCC. How can transactions (and rollbacks) work?

Yes, column updates. Although, using MySQL now, I really miss DDL migrations of PostgreSQL. It's a trade off.

Does anyone know whatever happened to the columnar storage work that 2nd quadrant posted on their blog: https://blog.2ndquadrant.com/postgresql-10-roadmap/

Is this essentially a pipe dream?

Not a priority for 2Q at the moment. Still general interest in the community, but hard to predict right now.

There seems to be an in-progress patch for pluggable storage:


The PostgreSQL 11 roadmap also mentions "multi-model" database


How does PostgreSQL 10 compare with Cassandra for BigData requirements?

They are completely different:

Postgres - Solid single-node relational database with SQL standards support, JSON, fulltext search, custom data types, ACID transactions, foreign data access, and more. HA/scaling through replication and some 3rd party offerings for automatic sharding, failover and multi-master if you need it. Great as the operational source-of-truth for your core data.

Cassandra - Distributed wide-column (basically advanced key/value) meant to run as a cluster of machines with native support for multiple data centers. Limited "SQL" support which is really only used for defining tables. Data access is very different (everything is an upsert or delete) with variable per-query consistency settings. Great if you need data spread globally, 100% availability, eventual consistency fast key/value, and some interesting data access patterns.
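A taste of how different the access model is (illustrative CQL; consistency is chosen per query on the client side):

```sql
-- CQL looks like SQL, but INSERT here is really an upsert:
INSERT INTO users (id, name) VALUES (42, 'alice');
-- Running it again silently overwrites the row; there is no
-- duplicate-key error, and DELETE just writes a tombstone.
```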

If you want high-performance Cassandra, look at ScyllaDB first. If you need something in between Cassandra and Postgres, look at CockroachDB.

By no means an expert, but isn’t it Apples and Oranges?

Cassandra is an eventual consistent database.

Postgres is ACID.

My 2c: you don’t need Cassandra or other BigData databases. If you did, you probably wouldn’t ask the question.

> you don’t need Cassandra or other BigData database. If you did, you wouldn’t ask the question probably.

...and by the time you do, Postgres improvements will have rendered your Big Data medium-sized anyway :-)

You can comfortably fit 8TB of data on a single box running Postgres, more depending on your hardware :)

Curious: how long would it take to create an index on a single node for such a large table?

That's about 1/250th of our working set size. :)

Of course, that's in a Spark cluster running massively parallel queries and not on a single (however large) node.

You did not ask that question. I assume you knew before that PostgreSQL is not the tool for your workload.

I didn’t ask a question. I love PostgreSQL for small relational stuff (where “small” is up to a few TB) and use it any time it’s appropriate. I wouldn’t classify it as a Big Data store, though.

That’s no criticism at all. I wouldn’t use Spark for a transactional DB because it’s not the best tool for the job. Same for PostgreSQL when the dataset grows far beyond what a single instance can reasonably process.

Having provisioned these sorts of "big data" systems in the past, it's not about how much you have today, it's how much you'll have over a growth period. The advantage of scale-out systems like Cassandra, Riak (RIP), MemSQL, Bigtable, Cockroach, etc., is that they can grow with you from 3TB, to 9TB, to 81TB (as an example, if you're on some exponential growth curve with your data). It's not that you can't do it with Postgres and MySQL via sharding; it's that you don't have to, if your access patterns match one of the models those databases fit well.

So, that person above may not need it today, but they may need it sooner rather than later.

Designing for the future is a guaranteed project failure.

If one is starting a new project, hence contemplating which DB to use, starting with an ACID DB is a safe bet in most cases (unless of course they are already starting with a huge amount of data).

By the time they outgrow the ACID database, they will have a better idea of what exactly they need and, more importantly, the resources to make the switch.

BTW: Riak is hopefully not dead yet; let's see what bet365 can do with it.

I don't think we disagree, but I want to be cautious about saying "whatever we start with is always good enough."

I'm co-founder of a company, doing all of the technical work, and I too am using a very vanilla stack: Postgres, Elasticsearch, Redis, Memcache, Rails. However, having worked on massive scale-out projects before, I recognize where my pain points are going to be with my tech choices.

[1] I'm storing calendar data in Postgres. This data is primarily key/value based, needs to be persistent, and is potentially tens of thousands of records per user. This is a medium-term problem, and I'll likely move this data out of Postgres and into a scale-out solution sooner rather than later. I need relatively fast lookup times, but this is primarily a "high write, low read" scenario that Cassandra is well suited for.

[2] I'm storing a bunch of text from blog posts, news articles, etc., in Postgres. I don't really need the data to be in Postgres; it's just a handy place to store it to synchronize with Algolia & Elasticsearch, where the true work happens. However, I don't want to hold onto TBs & TBs of news articles in a Postgres system. As that system scales out, it'll be unnecessarily expensive, as I don't need all the bells and whistles I'm getting from Postgres for what is effectively a form of cold storage. This is a long-term problem, and I likely won't tackle it for well over a year.

So, you're right, I'm not contemplating what DB to use, but I'm definitely aware of why I won't just leverage the same database in perpetuity. Being focused on delivering customer value is vital, and that's my intent; however, scalability and cost structure do matter over the long term, so an understanding of where one will go is good as well.

Have you considered Citus as your scale-out solution? https://www.citusdata.com - maybe you can have your PostgreSQL and eat it too.

Citus seems really cool (this is not a critique of the product), but probably not something I’d use. They seem to be very close in feature set to BigQuery, and if I have to choose between the two, BigQuery wins for me. There are definitely situations where Citus is a better solution, but in the scenarios where I could need a scale-out analytics platform, I’d choose BQ. Still, very supportive of what they are doing at Citus. It’s great.

MemSQL would be the on-premise alternative to BigQuery, RedShift, Snowflake and the others.

Citus is more focused on scalable OLTP rather than OLAP scenarios, giving you essentially an automatically sharding postgres database. Heap does use it for analytics but that's more so because of the JSON support and indexing in postgres itself.

I’m not sure I agree with this statement. BigQuery is not in-memory, nor is Redshift (I’m not super familiar with Snowflake), whereas MemSQL is, and it would be prohibitively expensive to operate on analytics-scale datasets precisely because it is memory-based. It’s my understanding that Citus is best suited to analytics workloads and parallel queries, and isn’t really comparable to MemSQL, as it provides scale-out support on disk, which MemSQL doesn’t natively prefer.

Can you share some docs with me around how citus is more OLTP based? Looking to be educated.

MemSQL has in-memory rowstores and disk-based columnstores. Every columnstore table has a rowstore to aid in fast data ingest. Rowstores are meant for access of individual rows, lots of data updates, and smaller reference tables while columnstores are for the bulk of data requiring large scans and analysis. We have tables with 20 billion rows that take up < 10gb of memory.

BigQuery streaming ingest also uses in-memory buffering into BigTable until it's written to disk, just like Cassandra with memtables flushed into on-disk SSTables.

Citus is more manual work than MemSQL and doesn't support columnstores natively, although you can use it with the cstore extension to get both. It's just a different design, giving you a Postgres database that automatically shards, vs. MemSQL's many optimizations around HA, replication, and very fast analytics performance.

Try the features page: http://www.memsql.com/features/

Thanks for the education on MemSQL. I wasn’t aware of the analytics use case (had never come up in discussions I was a part of for that tech), so it was news to me.

This article had a nice overview. Somewhat outdated but mostly still up to date. https://lnkd.in/g4anCf5

I’ll say it seems precarious to me to mix and match in-memory row-store data with a columnar store. Is it never a performance issue to have these two systems potentially in contention with one another for resources?

It's just data in memory. Rowstores are read from memory and columnstores are read from disk. The architecture uses leaf nodes and aggregator nodes, so aggregator memory is used for assembling the final results of a query.

Version 6 is even faster, you can try out the beta: http://www1.memsql.com/beta-6.3-features.html

Every BigQuery query actually loads all the data (for that query) into memory until it finishes.


> Designing for the future is a guaranteed project failure.

Based on what? There is no evidence or logic behind this.

If you know that you are producing say 1GB a day of data that you must store then it is sensible and prudent to select a technology that can support your needs a year or two from now.

And this idea that you always have the resources to make a switch later on is rarely true. The more common scenario is that a team is established to implement something, it moves into production, and then the team either disbands or moves on to something else. Why do you think so many people try to plan for the future?

In most enterprise companies at least it would be considered a failure to have a project that in a year's time required another business case just to replace the database that the team should've appropriately selected in the first place.

Knowing that you will hit the limit of the database in X years is different.

Designing for the future is a horrible choice because where do you stop?

Eg. Why are you sure that Cassandra, or whatever you choose, will be good enough for the future? Maybe you will need a Spanner type database or something even more big/consistent/feature-full.

> Designing for the future is a guaranteed project failure.

Yes, I understand this. I would not want to get bogged down on infrastructure when trying to deliver early versions of a product. At the same time, I think database choice is one of the more important tech stack decisions you should make early on.

Cassandra also sounds like a good bet for the type of data I'm interested in storing (ML model inputs, so lots of key -> numerical value data). But my experience is mostly around RDBMS and some NoSQL. Hence the question. Thanks for the answer!

> ML model inputs, so lots of key -> numerical value data

How big are the numerical values? AFAIK Cassandra doesn't like large amounts of data per value. https://docs.datastax.com/en/cql/3.3/cql/cql_reference/refLi...

Also what do you expect the total amount of data to be?

BTW, I just recommended starting with PostgreSQL because it just works and will have your back no matter how you like to retrieve your data.

However if you are absolutely sure you don’t require secondary indexes, joins, etc, and just need a KV database, Cassandra will work fine too.

No it's not. Say you have 2TB of logs daily: good luck handling that with a regular RDBMS without sharding / clustering / re-balancing capabilities.

I expect either Citus or Greenplum can deal with that kind of volume. Citus acts as an extension; Greenplum descends from an earlier fork (now being merged back to mainline as part of overall development).

Citus specifically cite analytics and write throughput as their strengths. I believe them. They do seem to charge for shard rebalancing and some other features.

Greenplum comes with an inbuilt data science toolkit (MADlib) and has a long history of dealing with intensive workloads. I don't know if there is a commercial version with extra features. I don't think so, but I haven't looked closely.

Greenplum also gave birth to HAWQ, which uses the Greenplum distributed/parallel query planner to plan queries over Hadoop.

Disclosure: I work for Pivotal, we do most Greenplum development.

And Citus will be a nightmare to set up and maintain, for what Elasticsearch could have done more easily and better out of the box. Use the right tool for the job.

I was pointing out that 2TB/day is not a very high write volume for modern OSS RDBMSes, especially distributed ones.

As for elasticsearch, I have found it to be temperamental. The "out of the box" experience was usually unhappy. For logging I've seen people move to other tools (splunk, stackdriver) more than once.

I have found Elasticsearch to be one of the easiest databases I've ever had to set up and maintain. Some tuning is needed when you have TBs across multiple boxes, but nothing hardcore.

2 TB/day is fairly easy... as long as you have a bit of experience AND one of the database that can scale horizontally AND the company buys the amount of hardware that is required.

StackDriver/Sumologic/logentries are cloud solutions, so obviously one doesn't have to deal with the database ever.

I think it's anecdote vs anecdote, but clearly you have more experience. In a fulltext indexing situation I'd probably look at elasticsearch again; I don't deliberately hamstring myself to try and be right on HN.

On the other hand, if I have structured-enough data, I'm going to prefer relational tools for it. Hammers and algebraic nails and so on.

In which case they would not be designing for the future.

Did you miss this: "(unless of course they are already starting with a huge amount of data)"?

This is a really insightful comment. It's not that you can't get Postgres to handle any volume...it's how much work is required on your part.

> Cassandra is an eventual consistent database.

No. Cassandra is a database with selectable consistency.

You can set it to be strongly consistent (ALL or QUORUM) with the tradeoff being worse performance.
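For illustration, in cqlsh the consistency level is set per session, trading latency for stronger guarantees (the keyspace/table below are hypothetical):

```sql
-- cqlsh sketch; sensor_data is an invented table
CONSISTENCY QUORUM;  -- a majority of replicas must acknowledge each read/write
SELECT * FROM sensor_data WHERE device_id = 42;

CONSISTENCY ONE;     -- fastest, weakest: a single replica suffices
INSERT INTO sensor_data (device_id, ts, value)
VALUES (42, toTimestamp(now()), 1.5);
```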

Define "big data"? You can buy normal tier 1 server vendor hardware with 6T of memory.

If it fits into 1 machine or a few, it isn't "big data".

To be honest 6 TB of data isn't that much. I'd say big data starts at around one petabyte, because one petabyte requires 3 servers with 45 [1] 8TB HDDs each. At that point your data is so large it's impossible to store it on a single physical machine.

"big data" usually doesn't require transactions though or at least no transactions across servers so you could probably get away with sharding postgresql via citus.

[1] http://www.45drives.com/products/storage/

I agree with this response. Those aren't very big either though. The Dell DSS7000 series servers[1] support 90 x 8TB drives (We run a ceph cluster on 4 of the DSS7500 storage nodes).

[1] http://www.storagereview.com/dell_announces_availability_of_...

Except that it's just stupid to do that: you don't want to store that much data in one host. It goes against any sane architecture design; good luck when your mb / cpu dies. Splitting data across smaller / cheap hosts is the way to go for ops / scaling / backups.

Maybe. I recently updated some of my company's internal training material, and discovered there still isn't a single (good) definition of "Big Data". Most definitions talk about evaluating data on four dimensions: volume, variety, velocity and veracity. That's still vague, but it's more helpful than looking at volume alone.

Besides, if we picked machine size as _the_ definition, it would change every year. And that feels like a bad way to talk about big data.

(BTW, I'm not a huge fan of the term in general: I've seen a lot of companies sold a bill of goods because some sales guy convinced them they needed a Big Data solution instead of something like Postgres)

I've also struggled with the definition of Big Data. I typically see "size" as being the line of demarcation people throw around when thinking about Big Data, but I think it's actually size + latency that's the real definition.

If I have a 12TB dataset, and I need to be able to find a single record in that dataset and return the result to a process in less than one second, I don't believe that needs to be considered "big data". The dataset is "large", but the latency / access pattern surrounding that data is lenient enough that you can solve it with a handful of hard drives and your data sorted on disk.

If I have a 12TB dataset and I need < 10ms latency on random lookups, that is now probably big data. To guarantee those sort of lookup times for random IO, I either need very expensive hardware, or a software based solution to distribute my data to multiple machines.

If I have 10GB of census data, and I need to run an analysis that shows what are the 100 most similar cities (non approximation) to Los Angeles in terms of demographic make up, and I need an answer in less than 1 second, that is also probably big data. The analysis is easily parallelized, but if done serially on a single machine is a "slow" process.

I realize this is a moderately unconventional way of approaching Big Data, but it's how I've started to structure some of my thinking and explanations around why it's not just size, but size + latency (which is inherently a question of what you're trying to do with the data).

I've heard this 6TB number thrown around a bit recently, but no one can actually provide me with links to hardware I can buy - has anyone successfully setup a machine with this much ram who can post links to what they used?

Don't know about hardware you can buy, but AWS will rent you a machine with 3TB Ram by the hour :) https://aws.amazon.com/ec2/instance-types/

And the Postgres maximum table size is 32TB.

You can partition your table if it got anywhere near that large (with the shiny native partitioning introduced in PostgreSQL 10).

You really don't want to have the whole DB be much above 30TB. You'd either want to shard or use some alternative solution.

Ultimately it depends on your use case and how much tuning/schema design/explicit management you are willing to do.

My team lead did a talk at NGINX Conf 2017 which touched on how we tuned Postgres to handle 1 Billion metrics per day for NGINX Amplify (https://www.nginx.com/products/nginx-amplify/). I would link it, but it seems it hasn't been published by our Conf team yet.

We did this on 9.4 so the changes outlined here regarding native partitioning, parallel queries, replication improvements, etc. would likely make doing a similar scaling structure easier and just straight up more performant.

I wouldn't classify 1 Billion metrics a day as "Big Data". I also would point out that our use case is unfairly biased to "recent" data rather than "all historical" data...which allows us to make some query optimizations that wouldn't be acceptable in the latter use case.

But we are using Postgres to host a constant, write-heavy workload that conventional wisdom dictates it is unsuited for. Moreover, it was/is fairly simple to implement, understand, and scale indefinitely. Personally, I think both Redshift and CitusData are various degrees of proof that scaling Postgres to handle these types of workloads is possible.

For me, the reason you want to consider using a Big Data focused tool (such as Cassandra) is at some point you end up fighting your tool to make a square peg fit in a round hole. At some point it makes more sense to adopt a specialized tool rather than work to specialize your implementation of a specific tool.

In all fairness, we are considering just that. Our road map actually has Scylla in the near future. It's not that we don't think Postgres could handle the load, but rather we think it will be easier to move specific data to Scylla and in the long run will save us resources (primarily in man hours and potentially in compute requirements as well).

Even if we move all our timeseries data to a specialized tool, we won't be getting rid of Postgres. For most/if not all our other persistence requirements Postgres is more than adequate. This is one of the reasons we chose to start with Postgres: even once we hit a scale that specialized tools made sense, Postgres would still be useful for everything else that doesn't require a specialized tool.

Moral of the story: Postgres will get you a lot farther than many give it credit for...and if you are just starting out I would highly encourage you to just pick Postgres and see how far it can take you. If you hit the limit, replace that specific part.

Edit: Also, there appears to be some movement on Columnar indexes which is worth keeping an eye on:


CitusData, yes, but Redshift is the ParAccel backend and just uses the frontend of PostgreSQL.

Hmmm...I thought ParAccel was a derivative of Postgres:


Does Cassandra allow aggregate functions like sum() or avg() yet?

Yes in newest versions it does. But for any non-trivial analytics you would use Apache Spark (most likely Spark/SQL).

Cassandra's CQL is very limited in terms of types of queries it supports.

Yes, Cassandra supports common aggregation functions as well as user-defined aggregators which have to be created in Java.
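As a sketch of the built-ins (available since Cassandra 2.2; the table name is invented):

```sql
-- CQL built-in aggregates over a single partition
SELECT count(*), min(value), max(value), avg(value)
FROM sensor_data
WHERE device_id = 42;
```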

I am interested in the answer to this question, too.

Very exciting. Still badly miss (bi-)temporal tables and native query support [1] without using extensions [2].



Does anyone know if the partitioning can enforce a uniqueness constraint across child tables?

I couldn’t tell from the write up if this is one of the things addressed with Native Partitioning

No it can't currently. There has been some discussion on how to do that for pg 11 though.

Seconded, this is a pretty important feature for some use cases

No. You need Global indexes for that.

To those working on and contributing to the GOAT DB PostgreSQL: thanks a lot.

Every time I see stuff about Postgres it sounds so much better than MySQL.

Every time I get a new job, the existing system is using MySQL. (Every time it causes problems of some sort or other. It's not a bad database, but, like JavaScript, it has too many quirks that will end up catching you out at some point.)

Last time I got to build something new the time pressure was so great I had to use MySQL, because it was there, set up and working and I knew it. I wanted to use Postgres, but at the end of the day the extra time setting it up wouldn't really have been much benefit as it was a fairly standard Django app, and didn't need any of Postgres's extra features. (I quit that job shortly afterwards).

What happens if I try to insert into a master partition table, and there are no child tables that can store my data?

For example, data is partitioned by month, and I have no table for the month in the data I'm trying to insert.

It will throw an error. Partitioning support in PostgreSQL 10 only solves a few simple use cases well, but the next version should get much more generally useful partitioning. This is how PostgreSQL usually releases major features.
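A minimal sketch of the PostgreSQL 10 behaviour (table and column names invented here):

```sql
CREATE TABLE measurements (
    logdate  date NOT NULL,
    peaktemp int
) PARTITION BY RANGE (logdate);

CREATE TABLE measurements_2017_09 PARTITION OF measurements
    FOR VALUES FROM ('2017-09-01') TO ('2017-10-01');

-- Routed automatically to measurements_2017_09:
INSERT INTO measurements VALUES ('2017-09-21', 23);

-- No partition covers November, so the insert errors out along the lines of:
INSERT INTO measurements VALUES ('2017-11-05', 21);
-- ERROR:  no partition of relation "measurements" found for row
```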

I recently started using PostgreSQL for a project and have been pleasantly surprised on more than one occasion. I also have been hearing more and more people speak highly of it, especially for its security.

Can anyone recommend a good resource for learning PostgreSQL? I have a decent understanding of SQL, I've used MySQL in the past, but every time I see the latest features of Postgres I'm left thinking I'll need to buy a book and take a pretty deep dive to figure out how to apply the new features.

You'll likely find the PostgreSQL docs[0] very useful. They're organized differently than the MySQL reference making it easier (in my experience) for reading stand-alone.† They're quite well-written and up-to-date.

[0]: https://www.postgresql.org/docs/

† The MySQL reference is very useful particularly when you're looking to compare feature differences between versions.

You may like PostgreSQL Exercises: https://www.pgexercises.com/

I'd start with the docs: https://www.postgresql.org/docs/

They are pretty good.

Completely off topic... but did anybody notice that the column names in the partitioning example were in Spanish?

What I miss the most from Postgres is compression on the same level as RocksDB or TokuDB. The current compression is very bad compared to them.

Does anyone know if they have something planned in this department?

Advanced compression of data is really something that should be looked at separately from database design. Look into using ZFS beneath Postgres for some pretty good compression.
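As a sketch (pool and dataset names are made up), putting the Postgres data directory on a compressed ZFS dataset looks roughly like this; matching the recordsize to Postgres's 8kB page is a common tuning choice:

```shell
# Hypothetical pool/dataset names; lz4 is a popular choice for databases
zfs create tank/pgdata
zfs set compression=lz4 tank/pgdata
zfs set recordsize=8k tank/pgdata    # match PostgreSQL's 8kB page size
zfs get compressratio tank/pgdata    # inspect the achieved compression
```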

Would love to know about FDW push down. That could be really good for specialized applications where you want to use a relational db and wrap a specialized data store outside postgres with a FDW.

The only thing I'm upset about is the fact I'll need to wait 3-6 months after it's released for AWS RDS to support it...

Can XMLTABLE accept any XML document? What if it's a bunch of unrelated elements?

Can you get back to xml from the table?

Any XML document that you can address via XPath in a sensible way. If your XML document is a mess, you'll only be able to get out a mess. :)

To get XML from a table, functions already existed in previous releases (XMLELEMENT etc.).
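A quick sketch of both directions (the document and names here are invented):

```sql
-- XMLTABLE (new in 10): shred XML into rows via XPath
SELECT t.*
FROM (VALUES ('<rows><row id="1"><name>ada</name></row></rows>'::xml)) AS d(x),
     XMLTABLE('/rows/row' PASSING d.x
              COLUMNS id   int  PATH '@id',
                      name text PATH 'name') AS t;

-- Going the other way with the pre-existing functions:
SELECT XMLELEMENT(NAME item, XMLATTRIBUTES(1 AS id), 'ada');
```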

What does native partitioning do?

It basically just adds convenient syntax for creating partitions, plus automatic tuple routing (routing inserts on the partitioned table to the right partition). For most users this is not a huge improvement over using PostgreSQL's table inheritance to implement partitioning, but it is a very important stepping stone for implementing competitive partitioning in PostgreSQL. There are lots of promising patches in the PostgreSQL 11 release cycle which improve on the native partitioning, making it useful for more users.

Thanks. I guess I meant to ask what a partition is used for? When is it useful to use one and why?

Oh, you meant partitioning in general. It is useful when you have huge tables with many rows, and works by splitting a table into multiple smaller tables which act like one big table to the outside. Some advantages:

- You can partition by for example month and cheaply delete or archive old data by instead of deleting all rows just dropping the entire partition which is much cheaper.

- You can have different indexes on different partitions. This is commonly used when some indexes are only used by queries which operate on recent data.

- You can put different partitions and their indexes in different table spaces. E.g. you can put your older and less frequently accessed data on cheaper and slower disks.

- Random inserts into B-tree indexes are expensive, and when you for example have time series data of the form (device_id, timestamp, v1, v2, ...) it can be useful to partition on the device id so all rows are inserted in strictly increasing timestamp order in each partition. This way you can get more write performance out of your hardware.

- Most databases have a maximum table size (PostgreSQL's maximum is 32 TB) and you can get around that by splitting your huge table into multiple smaller partitions which all fit in the maximum table size.

- The partitioning condition can be used as a very coarse index to entirely skip looking at some of the partitions (even at plan time for some queries). This does not always give you much over if you just had had one big table with one index over all of it, but in some cases this can be a big win.

- For PostgreSQL specifically it looks like you can have some of your partitions be foreign tables. Meaning some of your data can be on an entirely different server, not necessarily PostgreSQL or even a relational database system at all. I am not sure how useful this will be in practice.

Partitioning has a maintenance overhead and adds a bit of extra work to query execution and planning so it is not always worth it.
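For example, retiring old data by partition (names hypothetical) is a metadata operation rather than a row-by-row DELETE:

```sql
-- Archive: detach keeps the data around as a standalone table
ALTER TABLE measurements DETACH PARTITION measurements_2016_09;

-- Or discard it outright; far cheaper than DELETE plus a VACUUM
DROP TABLE measurements_2016_09;
```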

"This means that users no longer need to create triggers for routing data; it's all handled by the system. "

How does the native partitioning compare to something like Timescale's hypertables?

Hey, HN! Would you recommend me a practical introduction to data modelling with Postgres?

Learn normalisation from wikipedia, then read "The Art of SQL" to get a feeling for the shape of the weirdnesses, then read the postgresql documentation from end to end.

(disclaimer: worked for me, may not for anybody else)

I intend to read the PostgreSQL docs end to end at some point, but last time I checked the full PDF comes out to something like 3600 pages! I am astounded by the quality of everything about PostgreSQL, the community included.

Nice!
