Top Reasons I like Postgres Over SQL Server (datachomp.com)
235 points by craigkerstiens on April 8, 2013 | 106 comments



"Top Reasons I like Postgres over SQL Server as a Database for Web Applications."

SQL Server has some neat features, and yes, it's more expensive than a free tool (and yet, wildly less expensive than Oracle, its main competitor). What Postgres doesn't have is the entire BI suite of tools that comes with SQL Server: Analysis Services, Integration Services, and Reporting Services. It also comes with MDM, Data Quality Services, a complex event processor (StreamInsight), etc, etc. It's not just a database.

I like SQL Server, and I use it all the time--but again, I do BI. That being said, if I were creating a web application, I'd use Postgres in a heartbeat. SQL Server's defaults are aimed at different applications than Postgres's are.


Keep in mind I've been a SQL Server DBA for the past 10 years, and while it does come with built-in tools like SSIS/SSRS and such, it isn't all sunshine and lollipops. For example, how well does SSRS work for people who don't use Internet Explorer? Or try any kind of reasonable debugging in SSIS. Yes, there are tools that come with it, but many of them feel like crap thrown together by an MS intern. And yes, some are really nice and refined as well.

That being said, this post was intended for the web app crowd.


My point was that the context of this article is important. SQL Server has its place, and it does what it's meant to do fairly well. I really just despise these giant sweeping statements without context, so I wanted to add context. I also imagine that many HN readers are probably not very well versed in SQL Server. After all, it's not free (well, BizSpark makes it free for a while), and it's rarely used outside of the enterprise (for good reason).

To answer your questions: SSRS pre-2012 didn't work for non-IE users. In 2012, it works like a charm, though. As for debugging in SSIS, if you're talking about script tasks, again, they fixed it in 2012. Debugging data flows is pretty easy, too--I've never really had trouble except for the script tasks.

As with any Microsoft product, there are a plethora of quirks that you just have to know about (or a quick googling will tell you), but I've never felt my dev time was stunted because I was using SSIS instead of Informatica or Talend or something else. These things all have their quirks, so none really bothers me more than the next.

Anyway, I'm not here to fly the banner of SQL Server. Microsoft, sadly, doesn't pay me royalties. I use it, it's a decent tool that does what I need it to do, but I do work on too many different platforms to really care about one or the other. My point was to give context and other info about what SQL Server is.

It's a good list for web apps, though. Certainly things you want to think about when building a web application. Especially pessimistic concurrency. I think a lot of web devs have no clue what concurrency models are. I cannot tell you the number of `with (nolock)` arguments I've seen over the years.


with (nolock) is amazing. I worked at a very large Fortune 500 company where that was the company policy, because if you removed nolock, the thread contention would grind the servers to a halt. This also led to some really interesting bugs, like a $5 million order not flowing through a system because the row wasn't in the result set.
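
For readers who haven't met the hint: it tells SQL Server to read without taking shared locks, which is exactly how rows can vanish from result sets. A minimal T-SQL sketch (table and columns are hypothetical):

    SELECT order_id, amount
    FROM dbo.orders WITH (NOLOCK)  -- read uncommitted: no shared locks taken
    WHERE status = 'open';
    -- Under concurrent writes this can return dirty rows, skip rows,
    -- or read the same row twice -- hence the missing order above.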


We had to throw nolock into a few things, but to be honest, they should have been solved by better architecture and by stopping the DB engine from doing everything, including app logic.


Out of interest, how would you do OLAP on Postgres? (MSSQL comes bundled with an OLAP server...)


http://mondrian.pentaho.com/ is one I have used in the past


I thought Mondrian doesn't do OLAP cubes, it fakes them by just converting the OLAP queries into SQL queries and hitting the database with SQL directly?

Am I mistaken?


That's correct. You just have an abstraction layer that represents the cube as a specific star schema on a vanilla SQL database.

Other than the way you load data in, what's the difference, though? It's just an implementation detail.


Speed.

You can periodically build a read-only cube that has pre-computed answers to any query a pivot table could issue against that star schema.

Then, when the multinational supermarket CEO queries for his company's total profit by product line over the past five years, he gets an instant answer.

The other way, the CEO must wait for weeks to get that answer from the RDBMS - or the cube must be replicated inside the RDBMS somehow using tons of materialized views.
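
As a rough Postgres-flavored sketch of what "pre-computed" buys (star schema and column names are hypothetical), the cube plays the role of a summary table built on a schedule:

    CREATE TABLE profit_by_line_year AS
    SELECT p.product_line,
           date_trunc('year', f.sale_date) AS sale_year,
           sum(f.revenue - f.cost)         AS profit
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY 1, 2;
    -- The CEO's query then scans a few summary rows, not billions of fact rows.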


For my money, there's no better multidimensional database for operational analytics than SSAS. JasperSoft, Pentaho, Essbase, Cognos, etc. just don't cut it. Essbase is the exception if you're doing GL (SSAS doesn't aggregate parent-child very well), but SSAS is an incredibly powerful tool. You can use Postgres with it, so you don't need the SQL Server RDBMS to go with it; you can just pay the license for your SSAS server (which needs more memory than CPU, anyway).


But if you've already got SQL Server for the SSAS, would you go through the bother of installing Postgres?

I think that's Microsoft's advantage in the database market: their system has all the tools out-of-the-box.

And the licensing's really not that expensive - especially compared to the heart-attack inducing prices coming out of Oracle, IBM, SAS, Teradata, Kx etc etc etc


I have just started to get into Postgres. I have been using SQL Server across its various versions.

SQL Server really is a great database, but is it worth the Microsoft tax? I've come to realize that it's just not. For my own projects, I am starting to go with Postgres. I still have doubts on the performance side, but based on my understanding these have largely been addressed in the latest releases.

I think using Postgres on a linux box and ASP.NET MVC as the web platform is a good compromise. Are you doing this, or have you entirely moved to Ruby on Rails (or something else)? I would love to get your thoughts on this.

I think your statement around Microsoft's reporting and BI offerings is harsh but isn't entirely off the mark. The tools are good for the simple cases, but for real BI, Microsoft can't stand up to Oracle's (and others') offerings. I am sure there are companies that base their entire BI on the Microsoft stack, but based on my experience, it's just not common. A lot of the systems that hook into the BI stack tend to be legacy or non-Windows data sources, so in a lot of companies, purely using Microsoft isn't viable.

It's also a dangerous thing to do, as Microsoft tends to use SharePoint as glue, and I would be extremely hesitant to bring SharePoint into an environment (unless it's for its out-of-the-box functionality).


I think you're a tad off the mark as far as Microsoft BI goes. I've seen some very large firms use it as their entire BI stack. And I've had a lot of work in it.

Then again, you really only want to do this if you already have SharePoint, since that's the web presence (it's a module in SharePoint called PerformancePoint). But SharePoint has sold a metric shitton of licenses (I don't know why). You're right to be hesitant to bring SharePoint in, but for those who have, it works all right. Even for "real BI" (whatever that means).

The backend stuff works well, and there are a lot of tools that can pick it up--my personal favorite being Tableau. Also, PowerView in Excel 2013 is pretty nifty, and it's what I've been waiting on since Microsoft acquired ProClarity in 2006.


In a previous life, I did a lot of MS BI & SharePoint work. On the BI side MS was just starting a push into the BI world and was gaining quite a bit of traction. I've worked on huge MS BI implementations and you are correct there. A lot of large firms use it for their whole BI stack.

On the SharePoint side of things, while I believe MS did oversell its capabilities, the problem was (and could still be) that most places thought it was plug and play. It seriously wasn't. I don't know if things have gotten better, but documentation in those days was severely lacking. Add to that enterprises with the mindset that SharePoint was just another ASP.NET application that their in-house devs could pick up without some sort of guidance.

There were a bunch of places where I was brought in to clean up the mess that happened because of those two things. I'm not sure if the landscape has changed or not, but SharePoint is a beast unto itself. I've personally seen it implemented well at 30k+ seat organizations, and the largest I've done was a 50k-seat org across 3 continents. It's not cheap to set up or customize, but when it is done correctly, it does work.


Everyone owns SharePoint because of the CAL bundles. Many places were scared by the impending doom of Exchange public folders 10 years ago.

Others just got it in a licensing transaction... they signed up for the Core CAL, and took care of their Windows, Exchange, SharePoint and SCCM licensing, even if they didn't use all of them. If you owned any two products (Exchange/SMS, Exchange/SharePoint), you actually saved money in the process.


I don't disagree that the stack is good.

I also see that the term "real BI" is a bit meaningless :).

In terms of the companies using it, it looks like I might be a bit off, but it's probably somewhat dependent on the industry.

One company that I am following is http://www.domo.com/. The founder has a great blog (www.joshjames.com) and seems to be trying to shake up the BI space a bit. I would love to know their back office stack.


I've used Domo for mobile stuff, and it's worked very well. I've been very impressed with it, and it makes a great impression. I've really only used it for quick demos and minor things--not for a full project, sadly. So, I can't give you a full on review of it. Sorry!


Yeah, we have issues with SSRS and Crystal Reports in cross-browser environments. i.e. the "preview" never looks like what you intend. Instead, we work around it by exporting directly to disk, which seems to keep the WYSIWYG aspect.

I'm more interested to know how MS people have integrated Postgres into their environments. Which ORMs work? What are the gotchas, etc.?


NHibernate. No issues.

To be honest, we're shifting to Java. Our license fees are well into 7 figures for MS stuff, for no gain. This is the first step. Some subsystems are moving from ASP.NET Web Forms + MVC to Spring, Tomcat, Hibernate, Postgres. Cost savings are huge, the staff available are orders of magnitude better, and reliability is higher as, to be honest, Windows has been a piece of shit to manage over the last 10 years.

Fortunately we've got a pretty well designed system so we can move it over piecemeal as and when we update major subsystem features.


Wow, my experience has been a bit different. Every Java project I've worked on has been an exercise in frustration... though I can say the same of a lot of the .NET projects I've worked on.

I think that NodeJS + MongoDB has been the path of least resistance I've seen so far... though some of its limitations are really frustrating (non-indexed result limitations). I like Postgres at least as much as MS SQL Server. And the Entity Framework generators have been close to a blessing. That said, I'm much more a fan of the DB as dumb storage, and not a big fan of SSIS. So changing from one DB to another is mostly transparent when things are how I prefer them.


Well, we can't shift to Java--we're consultants. We're a total MS and Esri (GIS) shop. I was hoping for some hints that ASP.NET could integrate smoothly with Postgres in order to offer something else to clients.


I also think LLBLGen Pro (http://www.llblgen.com/) is highly underrated.

There are commercial providers, but npgsql (http://npgsql.projects.pgfoundry.org/) works well.


As someone who works on a project that extensively uses LLBLGen Pro I'd have to disagree with you.


Would you mind elaborating on this a bit? I would love to hear about it.

I understand that some people get turned off by the code generation aspect of it, but what in particular didn't you like about it?



I've looked at PetaPoco and Massive as replacements for Linq2Sql, etc., but don't know what experience others have had with them. We're mostly an MSSQL shop, but are seeing more Oracle projects. And now that Postgres is becoming stable in our GIS environment, I'd like to know how people are dealing with multiple DB backends. Have you built anything in production with PetaPoco?


I would really recommend you check out LLBLGen Pro. The name is just weird, but it's a really good ORM.

The support it has for all the major databases (SQL Server, Oracle, Postgres) and others is really solid.

Also, there is a full Linq provider that works well.


I would recommend BLToolkit (and the newer lighter version Linq2Db -- not to be confused with Linq to SQL). You keep an abstraction level above raw SQL via LINQ and with multiple DB providers you can easily switch between databases without rewriting your queries (provided you don't use any DB-specific features).


Check out the ServiceStack Postgres ORM for lightweight usage; consider it more on par feature-wise with Linq2SQL: http://nuget.org/packages/ServiceStack.OrmLite.PostgreSQL

Edit: just noticed someone posted this below. Sorry.


In most scenarios, end users aren't hitting SSRS' interface directly: the product offers a robust web service that is an excellent way of adding very rich, capable reporting to your own application.

The comment about "thrown together by an intern" seems suspect (though not as much as criticizing SQL Server for defaulting to data integrity. If ever there was a non-complaint, that is it). I've been using SQL Server since 6.5, and generally find the toolset to be excellent, and anything but "thrown together" (quite the contrary: the product moves far slower than I would prefer).

They both have their place. In SQL Server land the things that they limit to enterprise edition generally are functions that only really benefit data warehouses (partitioning and compression being two of the most obvious), and given that they know that the people who use data warehouses tend to have warehouses of cash, it comes with a cost.

Worth noting that SQL Server has had clustered indexes forever -- Postgres still has a rather hacked variant of them -- along with materialized views. Both are of tremendous value to web developers. Far more valuable than the default transaction isolation level.


A quick plug for my open-source alternative front end for SSRS - CrissCross

https://github.com/codeulike/crisscross

It uses the standard report viewer but aims to handle multi-pick parameters and reports with lots of optional parameters more intelligently.


"They both have their place. In SQL Server land the things that they limit to enterprise edition generally are functions that only really benefit data warehouses (partitioning and compression being two of the most obvious), and given that they know that the people who use data warehouses tend to have warehouses of cash, it comes with a cost."

Redmond could throw column-level encryption into the lower tiers as a security measure, but oh no, that's "enterprisey".


> end users aren't hitting SSRS' interface directly

They'll hit the rendered output though. Which in HTML4.0's case is horrid (at least in 2008).


We plugged Spring Integration, Spring Batch and Jasper in and we're sorted.

The main problem with SSIS and SSRS is that you just can't put them in version control or scale them up inexpensively. Plus, to be honest, the tooling is shitty and unreliable (says me, who spent all day fighting T-SQL debugger HRESULT crashes in SSMS whilst trying to reverse-engineer a huge shitty black-box SQL sproc into something scalable and testable in Java). Ugh.

We run both massive SQL server and postgres installations.


To save me a lot of reading time: are Spring Batch and Spring Integration more or less a drop-in replacement for SSIS?

I am looking at Postgres at the minute, and the biggest blocker for me is mass data imports from our customers on a nightly basis.

I've been keeping an eye on Postgres since version 9, as it (reportedly) has the same features as SQL Server but for a lot less.

We currently pay £250 a month per processor license and this shoots up costs a lot when you have a cluster of servers. If it's unavoidable then we carry on paying but if there are free alternatives out there then I'll jump at the chance.

The rest of the DB is used for serving sites, no other BI work done really.

EDIT: Typo


Re mass data imports - Postgres offers parallel restore of dumps since version 8.4:

(Quoting the documentation for pg_restore):

-j number-of-jobs
--jobs=number-of-jobs

Run the most time-consuming parts of pg_restore — those which load data, create indexes, or create constraints — using multiple concurrent jobs. This option can dramatically reduce the time to restore a large database to a server running on a multi-processor machine.

Each job is one process or one thread, depending on the operating system, and uses a separate connection to the server.

The optimal value for this option depends on the hardware setup of the server, of the client, and of the network. Factors include the number of CPU cores and the disk setup. A good place to start is the number of CPU cores on the server, but values larger than that can also lead to faster restore times in many cases. Of course, values that are too high will lead to decreasing performance because of thrashing.

Only the custom archive format is supported with this option. The input file must be a regular file (not, for example, a pipe). This option is ignored when emitting a script rather than connecting directly to a database server. Also, multiple jobs cannot be used together with the option --single-transaction.

http://www.postgresql.org/docs/8.4/static/app-pgrestore.html
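
A minimal invocation of what the docs describe, assuming a hypothetical database and an existing target database to restore into:

    pg_dump -Fc mydb > mydb.dump               # custom format, required for -j
    pg_restore -j 8 -d mydb_copy mydb.dump     # load using 8 parallel jobs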


It's not a drop-in replacement. You have to write code/configuration (but not much). It's definitely suitable for your use case - we use it for the same thing.

Feature set is comparable. We tend to avoid platform-specific features, as they are a migration risk.

Ha - our SQL license fee is around £60k a machine one off per major drop. We're not dropping that again for 2012. No way. Not when we have 8 machines :)


We have really dug ourselves into a hole by using feature-specific code and relying on certain features in our structure.

We are slowly digging ourselves out of the hole with dependency injection in our code from an Entity point of view, and we want to go a similar route with our DB.

We are nowhere near your scale, but we have spent £18k on SQL licenses in the past 2 years. It's a lot of cash, and I don't think we see the full benefit, to be fair. Coupled with the abundance of other Microsoft licenses we pay out for, more than half our server costs are licenses.

I appreciate your comment; it's taken me a step in the right direction to sorting our DBs out.


Out of curiosity, how comprehensive is your test suite? For one company I worked for, we actually found PostgreSQL was outperforming Oracle and because of how comprehensive our test suite was, the lead dev connected to PostgreSQL and got 80% of the test suite passing in one evening. A strong test suite allows you to instantly find out where your app breaks down.

Side note for those who don't believe PostgreSQL can outperform Oracle: we needed a custom data type that we were aggregating over. The projected table size was over a billion rows (last I heard it had reached over 4 billion rows). As I recall, in Oracle you were limited to writing user-defined data types in SQL or Java (now you can use C). PostgreSQL, being open source, allowed the lead to implement the custom data type in C. The query we needed took several minutes to run in Oracle due to the need to constantly serialize/deserialize the data over the aggregation. The PostgreSQL version returned in a few seconds.

We even hired a well-known Oracle performance consultant who did wonderful things to all of our queries ... except for this one custom data type which left him stumped.


The code change (and risk) can potentially be quite substantial if you're relying on default SQL Server behaviour (i.e. not MVCC/snapshot isolation). Code that relies on blocking when it hits rows that are locked by another process gets a bit of a nasty surprise during that kind of switch :-).


Always have deterministic known behavior :)

We shot ourselves in the foot when we first implemented NHibernate by setting our transaction boundary in the wrong place. This caused all sorts of portability problems.


I would recommend Talend Open Studio for the ETL portion. You'll get the same SQL bulk load feature as SSIS with any Database. http://www.talend.com/products/talend-open-studio

Choose Big Data or Integration downloads.


Thanks for the link, I've been looking around for software that does similar things.

I have been taking a look at SQL Web Edition with a view to moving away from Standard, which supports SSIS, so it's good that I'm getting a few options for other DBs too.

Although the majority of our stack is Microsoft-based, I would love to get off the Microsoft train when it comes to the DB. It's a massive cost to our business.


In a competitive world, you can either compete on having the best product, or try to shift focus to bundled extras. Fast-food restaurants were early adopters of this when they started the "a toy in each meal" campaign and stopped competing on the best food. Has SQL Server started to go down that road, focusing more on the bundled tools than the main product?


> Has SQL Server started ... focusing more on the bundled tools than the main product?

Yes, quite a while ago. SQL Server has an outstanding SQL engine, but it had that a couple of versions ago, so they need a new unique selling point. This may be because a SQL engine is now a very mature technology. They always tweak it, but gains here are marginal.


Is there still a shared tempdb across ALL databases in SQL Server?


I don't know; try Google. Is it important?


Yes, there's still the idea of tempdb. What's the issue, though - one DB eating it all up?


I totally understand your points in here - I have experience with both SQL Server and PostgreSQL. (+ Oracle, but that's not relevant in this case).

I wouldn't call myself advanced in either, more like an average user from a DBA point of view, plus some more advanced BI things. One of the last tasks I did on SQL Server was creating a set of SSIS/SSRS packages to replace odd SQL queries scattered around dozens of utilities. I had chosen SSIS as the ETL tool since I had heard lots of positive things about it, and I must say it was really pleasant to use ... up to the point when it became tedious to deal with wrong export formats of columns. Not to mention that when I was writing code for Script Tasks, the code would sometimes mysteriously disappear, and that signing packages, setting up agent tasks and all the boilerplate to make it work was a task of its own. I blamed my lack of knowledge of the whole stack at that point - you don't usually master a tool that big in a week.

Yes, PostgreSQL is harder to pick up, and it's harder for your average, not-IT-centric company to implement as a BI tool. Microsoft provides a ready-made, full-fledged and powerful behemoth which can solve most of your problems with a snap of a finger. Unless there's some magic happening and the lack of total insight into the tools is driving you furious. Postgres has its quirks - at least for me it was harder to pick up than other database engines - but I'm quite happy with all the additional features it provides and the additional understanding of the whole system I'm gaining as I learn.


You could use something like Greenplum for doing BI with Postgres. Unfortunately they do not list prices, which leads me to believe it's not cheap. I wonder how it compares in price to a similar SQL Server install.


Just to play devil's advocate:

5. SQL Server supports indexes on computed columns [1] (quick sketch below, after the links)

6. CLR Integration allows you to write a lot of things in C# (or VB.NET) [2]

10. Unicode is there by default if you use nchar/nvarchar/ntext columns; there are no problems unless you choose to use char/varchar/text instead

[1] http://msdn.microsoft.com/en-us/library/ms189292(v=sql.90).a...

[2] http://msdn.microsoft.com/en-us/library/ms254498(v=vs.80).as...
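
For point 5, a minimal T-SQL sketch of a computed column plus an index on it (table and column names are hypothetical):

    ALTER TABLE dbo.orders
        ADD line_total AS (qty * unit_price) PERSISTED;  -- computed column

    CREATE INDEX ix_orders_line_total ON dbo.orders (line_total);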


You did miss the point here:

> 6. CLR Integration allows you to write a lot of things in C# (or VB.NET) [2]

You can also use a Mono plugin with Postgres, just like Perl, Python, C, C++, Lisp, Haskell, R... But that's not it; #6 was about changing the core functionality of the DBMS. For example, you can add new index types to Postgres with plugins, and people do add them. Once, somebody wrote a plugin for manipulating SQL queries in what was at the time an object-oriented database called Postgres. That's how deep you can go.
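
As a small illustration of the language point (the function is hypothetical; PL/Python ships with Postgres as an optional procedural language):

    CREATE EXTENSION plpythonu;  -- 9.1+ syntax for enabling the language

    CREATE FUNCTION pick_greater(a integer, b integer) RETURNS integer AS $$
        return max(a, b)
    $$ LANGUAGE plpythonu;

    SELECT pick_greater(2, 3);  -- 3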

> 10. Unicode is there by default if you use nchar/nvarchar/ntext columns, there's no problems unless you choose to use char/varchar/text instead

Yeah, Unicode is used by default if you tell the DBMS to use Unicode... Thus, it's not the default. I'd agree that this is not a big problem.


> Yeah, Unicode is used by default if you tell the DBMS to use Unicode... Thus, it's not the default.

False. Positively false. Because there is no default. Whenever you create a new column which is meant to contain text, you need to choose a type for that column. There is no default column type. You need to be specific about how your data gets stored. That's why it's called a schema.

And when making that choice, for your column which will contain text, you must ask yourself two questions:

1. What sort of capacity is needed?

2. Are you living in the 1980s or in the year 2013? If your answer is the latter, you always choose Unicode unless you can argue why it won't ever be needed. That's all.

In 2013 you don't have to argue for the need for Unicode. You have to argue for the right to use antiquated, primitive 8-bit text types. And then you choose (because you and no one else will have to choose that column type) Unicode text types by default.

Anything else is madness. If any tool you rely on to do this for you chooses non-Unicode by default it is broken and needs to be fixed.

There might be a million other things SQL Server should address, but this is not one of them. Nothing is wrong with text-support.
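
Concretely, the choice being argued for looks like this in T-SQL (hypothetical table; the N prefix marks a Unicode literal):

    CREATE TABLE person (
        name nvarchar(100) NOT NULL  -- Unicode (UCS-2/UTF-16), vs. 8-bit varchar
    );

    INSERT INTO person (name) VALUES (N'Łukasz');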


To be even more pessimistic you could point out:

1. Read Committed is not that restrictive, to be honest. If you're short on memory it can be an issue, but even in a high-frequency system it should only lock at the row level. Beyond that, why would you be pushing to read dirty data? That's an issue in and of itself.

8/9. Arrays? JSON? XML? It's not relational data. Either parse it or put it in a different system; no RDBMS is going to handle these very well because they break the 'Relational' part. They might be accommodated, but trying to work with them is not a good idea in the vast majority of cases. I'm promoting SQL here and I still have to point out that SQL handles XML like crap too, try and index an XML blob.
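
For context on what the article's points 8/9 refer to: Postgres can index array columns natively with GIN (hypothetical table; whether such data belongs in an RDBMS is exactly the debate here):

    CREATE TABLE docs (id serial PRIMARY KEY, tags integer[]);
    CREATE INDEX docs_tags_idx ON docs USING gin (tags);

    SELECT id FROM docs WHERE tags @> ARRAY[42];  -- index-assisted containment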


> I'm promoting SQL here and I still have to point out that SQL handles XML like crap too, try and index an XML blob.

To be fair, SQL Server has constructs for this, but they are many in kind and awful in nature. It's even worse trying to understand how they work, how they interact, and how they will optimize your queries.

And I guess maybe that was your point?

Handling XML in SQL was "reasonable" when the alternative was full-fledged, triple-verbose DOM code in compiled languages. These days, though, you have better ways of "shredding" XML into usable data constructs, and the desire to do anything of that sort inside the DB is definitely approaching zero.


What I haven't seen recently is an explanation of "Reasons I like Postgres Over Mysql." It seems like after a long period of stagnation, postgres suddenly became hugely popular among startups again in spite of some of the things that I consider its core failings not being fixed (mainly: an even more expensive per-connection model than mysql and extremely painful replication, largely offloaded to third party tools).

Is it because of HStore? Heroku using it? The Oracle buyout of Inno and then Mysql?

Mysql 3.x was quite awful (other than replication), but by 5.x it had started getting good, and now the forks are doing some fantastic work. So it surprises me.


My own personal anecdotal evidence is that mysql is fine, until you have complicated queries or a lot of data. And then it suddenly sucks. We had a huge problem where the query planner refused to use the indexes we had unless we manually put index hints in the query (yuck).

Rather than actually doing that, we migrated all our data to PostgreSQL. The Postgres query planner works much better than the MySQL one. Since then we've started using some of the Postgres-specific features, like windowing, which have made complex queries that we didn't think were going to be possible not only doable, but easy and fast to run.
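
A small example of the kind of windowing meant here - a running total computed in one pass (hypothetical table):

    SELECT day,
           revenue,
           sum(revenue) OVER (ORDER BY day) AS running_total
    FROM daily_sales;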

Postgres has a somewhat different philosophy about certain things on the devops side, but if you don't try to force it to be like mysql it'll be fine.

At this point I wouldn't ever use mysql again (for new projects where I have the choice).


It's because, fundamentally, it's better for most stuff. It has transactional DDL, check constraints, window functions, CTEs, joins other than nested loop, and a better focus on safe-by-default behaviour. It also has a query optimiser that is generally more competent on difficult queries.

The per-connection cost isn't much of an issue with a connection pool, and replication has improved a lot lately (and is clearly only going to get better).

Aside from ease of getting up and running, IMO there's very little reason to pick MySQL over Postgres nowadays.


"an even more expensive per-connection model than mysql"

Did you actually run into a practical problem here, or is this theoretical? Most people get along fine for a while, and when they have a problem, they use pgbouncer, which seems to solve it.

"extremely painful replication, largely offloaded to third party tools"

Do you have some more specific/constructive criticism here? A lot of work has gone into postgres replication over the past few years, and it's improved dramatically for some use cases. But there are many use cases for replication, so it's possible that you didn't see improvements for your use cases.

FWIW, I started out with MySQL, and then tried out Postgres because of the license (MySQL was not yet GPL). After MySQL was GPLed, I used both for a while, but found the postgres community more welcoming/informative, and found postgres much more pleasant to use. At one point around 2003 I basically threw mysql away completely because I was fed up trying to do simple date manipulation to make simple reports. Postgres made it so easy and it seemed so natural -- it's really hard to use mysql afterward without getting frustrated. I'm sure there are things like that in reverse as well; e.g. mysql has the ON DUPLICATE KEY clause, but to me it still didn't add up to a pleasant/productive experience.


http://www.postgresql.org/about/featurematrix/#performance

IIRC, at 8.3 Postgres got an order-of-magnitude bump in speed. As MySQL gains credibility as a "real" db, Postgres is making itself fast.


Heroku integration did it for me, doubly so when they released Postgres.app for Mac. It's incredibly appealing to have the same database on both ends. MySQL isn't hard to install by any means, but Postgres has become dead simple thanks to the work of Heroku.


Postgres is primarily designed around data integrity. Therefore it does the "right" thing for certain data types. For example, it stores dates and times in ISO 8601 format and converts to UTC. The arbitrary-precision decimal type has a ridiculously high number of maximum digits and decimal places. It is so high that only a very small number of applications would exceed the precision. This is the main reason we are switching from MySQL to Postgres.
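
A rough sketch of those defaults (hypothetical table):

    CREATE TABLE payments (
        made_at timestamptz NOT NULL,  -- normalized to UTC, displayed as ISO 8601
        amount  numeric NOT NULL       -- exact arbitrary precision, no float rounding
    );

    INSERT INTO payments VALUES ('2013-04-08 12:00:00+02', 19.99);
    -- In a UTC session, made_at reads back as 2013-04-08 10:00:00+00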


By "a long period of stagnation" I assume you are referring to public exposure -- because it's not true that Postgres itself has stagnated.

Postgres has been developed steadily the last decade, and has been a serious competitor to major RDBMSes since version 8.1 (which added two-phase commit, table partitioning, index bitmap scan and shared row locking, among other things.)

I say public exposure because Postgres has clearly been popular among many developers for a long time. It's been beneath the radar for a lot of people, partly due to pervasive myths ("Postgres is slow"), partly due to the fact that, well, a product doesn't sell itself.

I think the reason Postgres has shown up on HN a lot is simply that people are finally discovering it and recognizing its technical brilliance, careful engineering and high performance, and it's snowballing. I have used Postgres for web apps since 2003 myself, and I have evangelized among colleagues and acquaintances to the best of my abilities, so I am very happy about this progress.

I don't see why it should surprise you. While I'm sure MySQL 5.0 and its various forks have gotten better over the years, it's hard to argue that Postgres is not technically superior in pretty much every way. Even if you care solely about performance, which used to be MySQL's main selling point, Postgres overtook MySQL a long time ago.

> an even more expensive per-connection model than mysql

This is not a practical issue for most people, in my experience. At least nobody is really complaining about it. Rails apps happily pool connections, for example.

If you can't pool connections in the app, there are several excellent, mature connection-pooling proxies around (including a very cool one, pgPool-II, which can also load-balance and replicate at the same time).

> and extremely painful replication

Is it possible that you have not been following the developments? If you don't need per-table/per-database replication, then streaming replication is actually incredibly painless. It is by all accounts awesome. And it scales very well.

If you do need fine-grained replication, there are several mature third-party projects that are fairly painless to operate, including Bucardo and Londiste. (I would not count Slony as "painless", although it's probably the most comprehensive solution right now. Avoid it unless you are a bank.)

There is also a multimaster-replicating fork (Postgres-XC) that is apparently working very well, and implements true distributed ACID replication, as opposed to the kind of "de facto multimaster replication" you set up in MySQL, where you have to make sure primary keys don't overlap between hosts.


On compression:

> Once you have paid for that ability, you still have to figure out how to implement it.

Postgres has lots of advantages, but "figuring out how to implement it" is rarely an issue on SQL Server. SQL Server must be the easiest-to-use complete database system on the market.


All that is missing in Postgres is a materialized views feature. I would happily contribute to a Kickstarter project or similar for implementing them in Postgres.


Good news! It's slated for the next release. :) http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-...


I think most people used to the materialized view features in proprietary databases will find the incarnation of materialized views as committed quite limiting... nevertheless, it seems important to me that this foundational work has been integrated.

In particular, incremental updates of the materialized view are not yet a feature. That said, a lot of hand-wringing has been going on on the lists about how to build on this work, so it looks like this whole subsystem will receive some attention.


That's how the Postgres team works. They tend to start with the primitives and build a bit more of the niceties in each release. And they do so very carefully, which is important.

People forget that this is true of every software system ever. No successful system ever starts with every feature it's going to have.


That is just the initial work to provide the basic infrastructure (and a solution to a couple of the simplest use cases) and not fully fledged materialized views.

In 9.3 materialized views will only be refreshable by recomputing the entire view and they will be locked while they are being refreshed. To shorten the lock you would have to swap the view with ALTER MATERIALIZED VIEW.
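
For reference, the 9.3 syntax as committed (view and table are hypothetical):

    CREATE MATERIALIZED VIEW sales_summary AS
        SELECT region, sum(amount) AS total
        FROM sales
        GROUP BY region;

    REFRESH MATERIALIZED VIEW sales_summary;  -- full recompute; locks the view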


According to the PostgreSQL wiki there is a way to get something like a materialized view using triggers: http://wiki.postgresql.org/wiki/Materialized_Views

Worth looking at.
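
The basic trick on that page is keeping a summary table up to date from triggers. A minimal insert-only sketch (hypothetical tables; deletes, updates and concurrent-insert races need more care):

    CREATE TABLE order_totals (
        customer_id integer PRIMARY KEY,
        total       numeric NOT NULL
    );

    CREATE FUNCTION bump_order_total() RETURNS trigger AS $$
    BEGIN
        UPDATE order_totals
           SET total = total + NEW.amount
         WHERE customer_id = NEW.customer_id;
        IF NOT FOUND THEN
            INSERT INTO order_totals VALUES (NEW.customer_id, NEW.amount);
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER orders_bump_total
        AFTER INSERT ON orders
        FOR EACH ROW EXECUTE PROCEDURE bump_order_total();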


It's not the same performance-wise. Materialized views are required to do ROLAP analytics efficiently.


I don't think this is a very helpful or accurate statement. I would encourage people to actually investigate things for themselves. Generally you do trade some performance (depending upon implementation; they can be closer to the metal, so to speak) for flexibility (again depending upon implementation).

Consider this documentation from oracle, a very popular materialized view implementation:

"When changes are made to the master table or master materialized view using DML, an internal trigger records information about the affected rows in the materialized view log. This information includes the values of the primary key, rowid, or object id, or both, as well as the values of the other columns logged in the materialized view log. This is an internal AFTER ROW trigger that is automatically activated when you create a materialized view log for the target master table or master materialized view. "


The main reason why my company uses SQL Server is that HR can find DBAs to hire here in São Paulo. Finding good DBAs to take care of our few MySQL servers was hard enough.

But I still sigh every time I have to convert perfectly valid UTF-8 before a bulk insert...


Well, he missed the main point:

PostgreSQL is free software.

And therefore, for example, no company can force you into their lock-in for their profit and your loss.


That's my main argument for using Postgres:

1 - At the start of your project's life, you'll want it because it's free as in beer. Except for ease of developing (where Postgres is also great), nothing else matters.

2 - Later, when you need some performance, even if you have a usage pattern where Postgres is slower, you'll discover that the price of any of the other DBMSes is higher than just adding hardware. Thus, free as in beer wins again.

3 - If you get huge, you'll discover that free software is a big deal, and the limitations of the proprietary DBMS add a huge amount of drag. Free as in speech wins now.


I like your reasoning, and it's a large part of why I'm moving away from SQL Server.

If I get to a point where I actually need the features that SQL Server or another commercial RDBMS has, then I have a good problem. It means my business is doing well and I may be in a position to consider a changeover.

I used to fret about features that frankly don't really matter when you are starting out. Coming from an Oracle and SQL Server background, I wouldn't touch MySQL but Postgres hits the 80% mark quite easily.


freedom is more valuable than features


Your point 3 makes no sense whatsoever.


Yes, because the great value of freedom is independent of the company size.


SQL Server issues I've encountered or found on the interwebs:

* Microsoft SQL Server 2005 (the release version and service packs) and earlier versions of SQL Server are not supported on Windows Server 2012 or Windows 8.
* SQL Server 2008 only runs on W2012/Win8 if you install R2.
* Importing .bak files from SQL2005 to 2012 Express (not sure about the full version) will not work.
* Copying a database from SQL2005 to 2012 Express in Management Studio will not work either.
* SQL Server 2012 will not run on Server 2003.
* SQL Server 2012 requires a minimum of 6 GB of available hard-disk space.
* SQL Server 2012 will not run on Server 2008, Vista, or Win7 without a service pack.

http://msdn.microsoft.com/en-us/library/ms143506.aspx


90% of the points are irrelevant. The real value of a database is how well it scales up and out. All the rest is minutiae.

Why do most companies that scale out (and thus massively benefit from free-as-in-beer software) actually choose MySql over PostgreSQL? Because of the features? Nope...

Why do most companies that scale up (and thus massively benefit from expensive, but highly performant software) actually choose MSSQL or Oracle over PostgreSQL? Because of the tag price of the Enterprise features? Nope...


> Why do most companies that scale out (and thus massively benefit from free-as-in-beer software) actually choose MySql over PostgreSQL?

http://en.wikipedia.org/wiki/Path_dependence


While I could probably write a long list of my bad experiences with MSSQL (or just any database, really), I think a lot of the author's grievances come from it really being solely an enterprise application, while ignoring some of the good points that come with enterprise applications, such as support (MS support is a mixed bag: some amazing support reps and some terrible ones) and business integration.


You can buy excellent support for PostgreSQL at quite cheap prices actually. Especially since you just have to pay for the support and not any license.


Just learned about sargable/function indexes for Postgres. Looks like a really cool feature.

http://www.postgresql.org/docs/9.1/static/indexes-expression...

One question: Does this work for bitwise operators?
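
On the bitwise question: expression indexes accept any immutable expression, bitwise operators included; the query just has to repeat the indexed expression so the planner can match it. A sketch with a hypothetical table (worth confirming with EXPLAIN):

    CREATE INDEX events_flag4_idx ON events (((flags & 4) = 4));

    SELECT * FROM events WHERE (flags & 4) = 4;  -- can use events_flag4_idx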


One of my top reasons: You get the latest and greatest with all the scalability features without having to pay for the production version.


How does Postgres performance rank on Windows?


It varies wildly based on platform, filesystem, hardware, and usage cases.

Is your workload write-heavy? Is it more read-heavy than write-heavy? How much query throughput are you talking?

At the smaller scale, it doesn't really matter. At the larger scale, your app's characteristics determine how much (if any) you are disadvantaged over running a more performant Postgres+Linux combo.


OK, I'll try to be more specific. How much slower is it than SQL Server for a web app that does a 70%-30% split between reading and writing, with 10,000 users a day each spending about an hour on it? I've no idea how to measure query throughput; will have to look that up. Maybe this is more of a Stack Exchange question.


That still depends. It'll depend on the size of queries times the number of queries, the type of data, the coverage of your indexes (which depends on your actual data), and a ton of other factors.

I'm not even sure there is somebody out there capable of correctly measuring this. You may need to go with a quick, biased study of your own anyway.

Also, why are you assuming Postgres is slower?


I'm not the parent, but I sort of operate on the assumption that Windows isn't Postgres's primary target when they make optimizations. Not only is SQL Server purely focused on Windows, its developers also have access to knowledge about (and in server editions, probably influence on) the low-level internals that Postgres people never will. As pointed out in one of the top comments, SQL Server has a bunch of extra junk built in(ish), but still, I too would, having not tested anything, assume SQL Server would be faster than Postgres on Windows.


I think you're correct here. Postgres has had lots of optimization work specifically for Linux and Unix/BSD. The Linux kernel has also had some optimization done to improve hot paths that Postgres hits.

I'm sure the Postgres team has kept an eye on Windows, but I doubt it gets nearly as much attention as the *nix environments.


Indeed, most of the benchmarking of the core PostgreSQL team is done on Linux. Some of those optimizations also benefit Windows but there is little dedicated performance work for platforms outside Linux as far as I can see from reading the mailing lists.


Oracle has spent a lot of effort improving MySQL's performance and usability on Windows for version 5.5. I believe Oracle would like to take a bite out of MSSQL sales with MySQL.


I would like everyone to call it "Microsoft SQL Server" rather than "SQL Server".

Or else Microsoft has successfully squatted a common, generic term as a name of their product. Not a good thing if you ask me.


Databases have always been known as relational, RDBMS, etc.

In the "old days" I never heard of people referring to Oracle as a "SQL Server", nor DB2, etc.

Even now, people talk of Postgres as a "RDBMS", not, "Postgres, a SQL server".

So I'd say that I don't particularly agree with your postulation.


Postgres and MySQL are SQL servers; SQLite is not, but it's still a SQL RDBMS. Similar with Gadfly. University Ingres was an RDBMS server, but it didn't speak SQL, so it wasn't a SQL server either. Generally speaking, software that's written to work with one SQL server is pretty easy to retarget to another, but retargeting it to work with SQLite or University Ingres would be pretty rough.


"RDBMS" does not necessarily imply that it speaks SQL, correct? My company produces document clustering software that can interact with virtually any database (via ODBC) that speaks SQL to examine any stored documents and cluster them. It is extremely annoying to try to convey to a prospective customer that it works with "virtually any SQL server" without them thinking that means just Microsoft SQL Server. I agree with guard-of-terra -- we should not be ceding common/descriptive terms to Microsoft.


The name comes from Sybase, not Microsoft.

Microsoft SQL Server is an offspring from Sybase SQL Server.


You will never find someone use the term "SQL Server" to refer to anything else. Now, sometimes people just call it "SQL" and that is legitimately confusing.


Too late, Microsoft already own the term SQL Server. Not that it matters because everyone knows SQL server as being the MS product anyway.


Your preferred term is too long. "mssql" will do.


I also don't do much Visual stuff (I do mostly console based apps on Windows) with Visual Studio, but hey - they got the name :)


I agree with you that this is not a good thing, but I don't think changes in language like this ever reverse (similar to generic trademarks).


ship: sailed.



