I would appreciate a really long text that explains, in a convincing manner, why Postgres is so awesome.
I work in the industry, and all I see are Oracle and Sybase everywhere. The experts are zealots too, many not even having heard of Postgres, and not willing to believe a word I say about it.
I am already convinced of course, but the industry is not. Not finance, not trading, not telecom.
If you're fighting the attitude that the only viable option is one that costs a ton of money instead of a decision based on technical merit, additional facts about Postgres won't help.
Sometimes the decision gets made on a golf course and not after reading very long texts.
This is why I swore to only write software at software companies some years ago. For other industries software is just another expense like buildings or supplies. The people buying it have no idea what they're looking at so they buy the shiny thing that everyone else is buying.
When I worked in other sectors I regularly saw executives attempt to assemble software as if it was a large building. Materials were gathered, agreements were made to purchase or rent heavy equipment, and the whole project was planned out with exacting detail. When all the permits and contracts were in place construction would begin. This was to be done in a measured amount of code using the chosen architecture, and delivered complete on a certain date. Any deviations were seen as workarounds for not following the original plans, and overruns the result of incompetence. The whole thing was just a shitshow of incompetence at the higher levels.
Demanding and exercising the use of a "throat to choke" is a massive, massive part of how at least American business works, due to prevailing leadership styles. It also shows how they tend to view their workers. "Blameless culture" is practiced by a tiny, tiny minority and only seems to have worked out well among socialist-ish boutique software companies, and until we start seeing companies that practice traditional Taylorist, authoritarian leader-worship cultures fail very disproportionately, nothing will fundamentally change.
I'm familiar with a number of very large deals done basically C-level to C-level, where the scope of IT projects has nothing to do with technologies and is entirely about cost savings - literally "I will save you $n MM/yr in opex so you can get your bonuses" - and other vendors get shut out. Sometimes these deals work out; other times they don't, and the executive is basically ousted. Companies with bad politics and enormous cronyism may have worked fine for decades, but they just may not be doing as well anymore - unless you're on Wall Street and you make so much money it doesn't matter how it's done.
Software is not the end-goal itself. The point is not to make (or use) amazingly elegant software. The point is to make money.
If a supplier says "I will provide the same service as you are currently getting and cost you $X less" then that's a no-brainer regardless of what service they're providing. It's got nothing to do with technology, and technology doesn't change the nature of that decision.
"Having a throat to choke" is also a matter of insurance. You can't insure against your own incompetence, but you can sue a supplier for not fulfilling the terms of their contract. Executives would much rather negotiate what they think is a tough contract with a supplier than manage a complex project themselves. To that mindset, the removal of risk (because if anything goes wrong they can sue the supplier) is a huge bonus.
It's a totally different mindset from those of us who actually make things.
I was a Sybase point person for years at a Fortune 50 medical-devices enterprise (we had revenues of well over $1B annually on devices running Sybase DBs). There were dozens of bugs/issues I found that I pushed up the chain to an engineer and had special patches turned around within 48 hours, sometimes within hours.
Before I left that job I started playing around with Rails and mentioned MySQL and Postgres for a potential greenfield project. I was told it would be fine for internal use but no way, no how were they going to deploy any software without that kind of parachute based on economic leverage.
Often enough you'll even get similar turn-around times from the community, if you can't (or don't want to) afford such a support contract.
Making software work at scale without that economic leverage could never work in theory. It only works in practice.
Oh, and those who did it were beyond Oracle's comfortable data sizes, which made it a rational decision too. The calculation is still different for the majority of companies.
Specifically, I've seen pg take a query that looks like this:
select ... from a join b join c join (select ...) d
and rewrite it to:
select ... from (select ...) d join a join b join c
With the lack of hints, almost the only tool you have to control query plans effectively in postgres is parenthesized joins. Since it's more liable to rewrite the query, the language ends up being less imperative, and thus less predictable. And I like predictability in production.
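To make that concrete: the closest Postgres gets to binding join order is lowering `join_collapse_limit`, which stops the planner from reordering explicit JOIN clauses. A sketch, assuming hypothetical join keys (the table and column names here are illustrative, not from the original query):

```sql
-- With join_collapse_limit = 1 the planner keeps the textual join
-- order, so parenthesization and clause order become binding.
-- SET LOCAL scopes the change to the current transaction.
SET LOCAL join_collapse_limit = 1;

SELECT ...
FROM (SELECT ...) d        -- force the subquery to be joined first
JOIN a ON a.id = d.a_id    -- hypothetical join keys
JOIN b ON b.id = a.b_id
JOIN c ON c.id = b.c_id;
```

The trade-off is that you take full responsibility for the join order, which is exactly the predictability-versus-flexibility bargain the comment above describes.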
SQL-level feature set is no comparison of course, pg wins easily.
I'd also suggest setting enable_nestloop = off if you are having problems with bad cardinality estimates producing nestloop plans that run 100 times when the planner estimated one.
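For reference, disabling nested loops is a session-level planner switch, and it is a blunt instrument (it discourages nestloops for every join in the session, which is an assumption that your workload tolerates that):

```sql
-- Discourage nested-loop joins for this session when cardinality
-- misestimates keep producing pathological nestloop plans.
SET enable_nestloop = off;

-- Always re-check what the planner actually chose before
-- trusting the change in production:
EXPLAIN (ANALYZE, BUFFERS) SELECT ... ;
```

Note this doesn't forbid nestloops outright; it adds a large cost penalty, so the planner can still pick one when no other join method is possible.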
It is interesting to see how PostgreSQL will often choose a hash join, even with very up-to-date statistics and much better paths available.
SQL Server's planner does an amazing job of digging right into joins/sub-selects to constrain preliminary results for joins.
It's a very hard job, and MS and Oracle have obviously had some of the best people in the world, paid well, working on this.
The company for the longest time wouldn't even touch basic firewall rules without having the firewall contractor implement it.
Many also regard looking up issues on Stack Overflow, Google, or blogs as unreliable. Then there are times when issues might be specific to installs or data, in which case sharing the logs/sample data (even masked) can be risky. They feel comfortable sharing logs/masked data with, for example, Oracle because they believe it to be safe and locked under Oracle's security guidelines.
The second refrain I hear is security. When a major security issue is revealed, there is a general sense that FOSS will be slower to release a "stable" patch, whereas paid software vendors treat it as a reputation risk and work to release one quickly.
If people have to use FOSS, then they try and search for the paid support flavor. Recently we were looking at MQ software. When we zeroed in on RabbitMQ we were asked to deploy only the paid Pivotal version and not the free version because "support".
Sure, these things might not be completely true but for many higher ups paying for something somehow makes them sleep better at night than a "free" alternative.
> Written for longevity
OSS is much better at longevity than proprietary software. Even if the authors all die without a will, it is possible to fix the little bugs that prevent you from using the software on [NewTechnologyHere]. I've done it countless times with Java software; if anything, OSS is the guarantee that you own your future and that the system can live on as a legacy.
> Use paid flavor
It's good, but what's better is joining the golf club of a principal maintainer. That's the key to paying them to fix the issue you're having quickly and to getting the fix merged upstream.
Long, long ago when I worked at Nortel (a now defunct, but then huge telecommunications company), they used to pay millions of dollars a year to Cygnus to support a particular embedded version of GCC. This, despite the fact that Nortel had more than 10k programmers on staff including a compiler team!
I think the real reason these support contracts exist is because companies (even large ones) don't want to dilute their focus maintaining projects that are peripheral to their core business. It's not so much a technical problem, or a money problem -- it's a management problem. They can't scale out to handle every little thing.
I think OSS is a red herring in this conversation. Most companies just don't care about that. They don't want to support it themselves (even if they are big enough to do so), and they need to have confidence in the company that provides the support. Build that company (hint: you need to be sales heavy!) and you could sell Postgresql just as easily as any other database. Of course breaking into an entrenched area in Enterprise software is always going to be difficult, so I'm not sure how successful you would be with this particular product, but you get my point, I think.
As hindsightbias puts it, they want solutions and 24/7 support.
I think this is the better point, or a rewording of yours.
Let me rather call it, The Real World.
I work in a Fortune 500 company, a consultancy of 400,000 people, on projects for other Fortune 500 companies.
My experience is from the real world.
That is, although those two products may indeed have some merit, that is not why they are chosen.
Golden Gate, RAC, Management Tools, stable plans, decent parallelism, decent partitioning, some columnar stuff - that's not just marketing fluff.
Oracle's politics/sales tactics and cost are one large argument against it; being able to influence feature development or add features yourself is another. I've seen both drive companies - including big financial ones - away from Oracle over time. Often that doesn't start with the business-critical stuff, but with some smaller project, and then grows over time.
There are some things (better replication out of the box, higher performance).
> but not much discussion on how PG will come to parity
That's because this subthread started with "No one ever chooses Oracle or Sybase on technical merits." - neither Postgres's strengths nor its needed/planned improvements are relevant to refuting that position.
> on how PG will come to parity and how we'll know when it's finally good enough to use.
Just because Oracle has some features that postgres doesn't match doesn't mean it's not good enough. There's a lot of features where postgres is further along than Oracle, too. For a good number of OLTPish workloads postgres is faster.
We're talking about large and complex products here - it's seldom the case that one project/product is going to be better than all others in all respects. Postgres has been good enough to use for a long time.
If you're interested in which areas postgres needs to improve, I'm happy to talk about that too.
The first result is a blank copy of a license agreement that they presumably forgot was on their website.
You aren't allowed to publish benchmark results, a condition that is both upsetting and not at all unique as commercial databases go: http://m.sqlmag.com/sql-server/devils-dewitt-clause
This unfortunately has become very common.
It is awesome in a way, but it had some awful bugs (I lost hours to installer bugs) and was topped off with dark patterns, IMO (expensive features were one click away).
Edit: and can we stop pretending that OS X is better? It is different. Some people like that. Others have just as legitimate reasons to stay away.
I spent 3 years on a Mac and went from really enthusiastic to really disappointed. I still defend others' right to prefer it, though, and hope you'll defend my choice as well.
Having been exposed to Linux on the desktop for more than a couple of decades, in addition to using OS X since 2003, my subjective opinion is that it's not only far more mature, but better in almost every conceivable way.
That goes for casual users to developers. The ecosystem from Apple is maybe not perfect, but I still dare to use the word fantastic.
But we are not supposed to argue over such things here.
I'm just asking that Mac people respect that I and others much prefer other OSes like Linux, BSD, or even Windows.
Like: 1) no consistent shortcuts for moving around using the keyboard; 2) with two monitors, the menu bar is very far away when you work on the other monitor; 3) one Chrome window would block the other, preventing me from finding the instructions on the wiki while a file-select box was open in another.
Etc., etc. This is before I start my rant about things unrelated to the OS implementation itself, like a) putting fn in the bottom left corner, b) not giving me any chance to fix it, c) the fact that many programs I wanted to use were either unavailable or looked horrible.
The best example of this is Salesforce, which has its own proprietary SQL-like query language that's clearly just a crappy front end to generate raw SQL to feed its Oracle DBs. Without Oracle's per-tenant limits this would be far too risky because of idiots making bad queries.
A better solution these days is to put each tenant in a Postgres container and let the OS control resource limits for them, but this wasn't an option until recently.
Killing connections can be done in PostgreSQL too. The reason for Salesforce to be on Oracle is probably history.
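For the record, killing runaway sessions in Postgres is a one-liner against `pg_stat_activity`; a sketch (the 5-minute threshold is an arbitrary example, not a recommendation):

```sql
-- Terminate any backend whose current query has run longer than 5 minutes.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '5 minutes'
  AND pid <> pg_backend_pid();  -- don't kill our own session
```

In practice you'd usually prefer `statement_timeout` so queries are cancelled automatically instead of being hunted down by hand.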
A few things I've found great in Oracle that aren't (AFAIK) available in PG:
- Straight better/more reliable performance on average
- More advanced parallel queries (obviously this is changing in PG right now)
- Flashback queries
- Better materialized views
- Plan stability (maintains predictable query performance, rather than the nasty jumps you sometimes see when plans change)
- Better clustering story (RAC is super expensive but pretty good)
It was actually good and very easy to activate. If you did activate it, though, you could expect a sizable extra invoice after the next audit.
Netflix, eBay, Apple, Sony (for PSN), Spotify are all Datastax customers paying for the commercial version of Cassandra.
Facebook, Foursquare, eHarmony, Buzzfeed, LinkedIn are listed as paying customers of MongoDB.
Postgres is in the same range as the relational enterprise SQL database engines. Postgres even offers Oracle PL/SQL-compatible syntax, so you can think of Postgres as the Linux of relational databases (Linux is a clone of Unix). If you need advanced SQL syntax, XML support, complex triggers, inlined procedural code, GIS, etc., look no further and choose Postgres. If you just want to hit your DB with thousands of connections from your web application frameworks and public APIs, or are thinking of easy clustering, it might be a good idea to add a caching layer in front, like memcached/redis, or read on... (as the forking model doesn't scale that well).
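The caching-layer idea is usually the cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache. A minimal sketch in Python, with a plain dict standing in for memcached/redis and a stub function standing in for the Postgres query (all names here are hypothetical):

```python
# Cache-aside: read from the cache first, fall back to the database,
# then populate the cache so the next request skips the DB entirely.
cache = {}      # stand-in for memcached/redis
db_calls = 0    # counts how often we hit the "database"

def query_database(user_id):
    """Stub for an expensive Postgres query."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    if key in cache:                  # cache hit: no DB round-trip
        return cache[key]
    row = query_database(user_id)     # cache miss: hit the DB once
    cache[key] = row
    return row

first = get_user(42)
second = get_user(42)                 # served from the cache
```

A real deployment would add a TTL and invalidation on writes, but the shape is the same: the cache absorbs the thousands of cheap repeated reads so the limited pool of Postgres backends only sees the misses.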
And then there is a unique piece of database software with a common SQL dialect that supports dozens of database engines. It's called MySQL, and it supports plugin engines like InnoDB (default, true web scale, very fast), MyISAM (old-style features, very fast), etc. MySQL can handle many concurrent connections, and InnoDB is really good; that's why it's a very good fit for web apps using basic SQL features, and why it's used by Google, Twitter, Facebook, etc. as a main production database.
And there are NoSQL databases with different kinds of features, like MongoDB/RethinkDB, Lucene (Solr/Elasticsearch), Hadoop (HBase, etc.), Cassandra, etc. - they often have just a basic SQL-like query language or none at all, but a custom API to interact with the datastore. Those domain-specific solutions are often very fast for certain use cases. Some have limited index support, limited join support, limited transactions, etc., so it really depends. E.g., for a JSON datastore or full-text search, those are ideal solutions.
We would need a database guide that highlights the common open source database engines, and provides a transition guide and compatibility matrix against the legacy binary-blob database engines - the real competitors to open source. People in open source communities are often fishing in other communities, trying to convert them; instead, look no further than your corporate colleagues and enterprise fellows and try to convert them away from their rusty databases.
The real costs are:
1. Need for a third-party connection pooler (though pooling is built into many client libraries like JDBC, ActiveRecord, or Sequel). Some poolers, like pgbouncer, lack support for prepared statements.
2. Memory usage can become an issue, especially if you have tens of thousands of stored procedures, since stored procedures are compiled and cached per connection. Also, working memory for sorting is per connection, which means PostgreSQL will use more memory than MySQL on some workloads.
3. Having to recompile all stored procedures on first use after reconnecting to the database can be an issue.
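On point 1, a minimal pgbouncer setup looks roughly like this (database name, ports, and paths are illustrative only; `pool_mode = transaction` is the setting that breaks server-side prepared statements):

```ini
; pgbouncer.ini -- illustrative values, not a recommended production config
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; server connections rotate per transaction,
                          ; so session state like prepared statements is lost
max_client_conn = 1000
default_pool_size = 20
```

With `pool_mode = session` prepared statements work again, but you give up most of the multiplexing benefit, which is exactly the trade-off the list above alludes to.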
In the cloud/managed space, Postgres is also hit and miss: still no good cross-region option, and frankly, other than AWS RDS, not many managed Postgres services.
So MySQL rules the cloud/managed databases, and Oracle/Sybase/MSSQL rule on-prem.
Google recently added a beta version of Postgres to Cloud SQL as well (https://cloud.google.com/sql/docs/postgres/). Of course, AWS RDS and Heroku have been on the market for some time.
I'd claim the managed Postgres market is in pretty good shape.
"This is a Beta release of Cloud SQL for PostgreSQL. This product might be changed in backward-incompatible ways and is not subject to any SLA or deprecation policy."