Hacker News new | comments | show | ask | jobs | submit login
PostgreSQL 9.2: Full-throttle Database (pgcon.org)
148 points by mace 1686 days ago | hide | past | web | 32 comments | favorite

If my experience with other DBMS (mostly SQL Server) is anything to go by, index only scans will do wonders for append-mostly workloads, and the new transactional scaling looks really good.

I can't wait to try this out, alas I use debian so it's a couple of years out :/

Also my inner grand prix follower is slightly irked by the varying years of the graphics.

there's an official postgres debian repository now which has the latest versions: http://pgapt.debian.net/

Thanks, I have no idea how I didn't find that.

> index only scans will do wonders for append-mostly workloads

Won't updates be actually slower once indexes are added to take advantage of index-only scans?

Another issue that I can see comes not from having the index-only scan feature (which is cool, though belated) but having a query planner intelligent enough to figure out which index to hit - ie query optimization.

If the columns used in WHERE clause do not contain the appropriate columns (particularly for non-clustered indexes), shouldn't index hints be supported to help pick up the relevant index or would we end up with horrible execution plans leading to tablescans?([1] - I see this as a case of devs overriding users)

[1]: http://it.toolbox.com/blogs/database-soup/why-postgresql-doe...

Yes, updates will be slower if you add more indexes, but if you're doing significantly more reads than writes then it's not much of an issue.

Also Postgres' query planner has been pretty smart about indexes for a long time, and its index format is pretty good as well. For example, if you have table (a, b, c) and index (b, c) Postgres can use the index to filter c by itself, which SQL Server certainly can't.

This is normally because indexes are sorted and hierarchical b-trees - it's ordered by b then c, so c is not ordered. However, a since the index could be smaller than the table itself it might be faster to scan every record of the index than the table.

Postgres' query planner is a thing of wonder.

This is a horribly naive theory (I have no idea what I'm talking about) but if you have foreign keys you need to get those indexed anyway (otherwise inserts or updates will need to do a sequential scan), and with index-only scans, if the referenced key exists in the index, the tuple won't have to be visited, hence inserts and updates getting faster in tables with foreign keys?

I think that's a very keen insight. There may be a detail that gets in the way, but nothing comes to mind at the moment. Did you know that PostgreSQL's referential integrity checks are -- in the most literal sense -- implemented as triggers that construct SQL, and then plan and execute them just like a client program?


I don't think anyone has measured the effect, which seems it would be greatest on tables with wide rows, whereby the wide rows would cause cache thrashing but the smaller index would not.

"shouldn't index hints be supported"

Index hints already are supported in multiple different ways:


"I see this as a case of devs overriding users"

I don't think that's an accurate characterization. I'm sure there are some improvements to hinting that would be approved if they are well thought out and don't introduce major maintainability concerns, etc. Someone just needs to do the work required, including considering many alternative approaches which may be radically different from what you typically think of as hints.

The proposals that tend to fail are ones that follow an argument pattern like "Oracle does it this way... therefore it is good... therefore postgres should do something similar".

I'm not trying to dismiss your point at all. I think it's important that postgres give users as much power as possible, and a part of that is having some control over the optimizer without waiting for a new release. Postgres offers that in some innovative ways, but it could do better.

It is already in Debian experimental and as alternize said there is an official Debian repo ran by the PostgreSQL project.


I'm sure it will end up in back-ports soon.

Here's the corresponding video to these slides from a separate event that gives an overview of features in 9.2: http://vimeo.com/49337386

Although PostgreSQL is featured database enterprises still using mainly SQL Server or Oracle. http://blog.protegra.com/2012/06/22/sql-server-2012-cheaper-... What can we do for wider adoptation of PostgreSQL?

The simple answer is to make better tools. There is still no single, awesome, dedicated tool to migrate from MySQL to Postgresql. (side note: Did you know that Postgres can actually act as a replication slave for SQL Server ? ).

Again, there is no official Mysql Workbench for Postgres, let alone things like SequelPro. (although a member on the Indian HN - Hackerstreet - is building one at http://pgxplorer.com/)

Second - documentation as compared to other database vendors. No, this is not flamebait - by itself, it is very good and adequate. But for people used to MySQL, SQL Server - it feels...scary.

Check out http://dev.mysql.com/doc/ (sidenote: has docs in epub format as well) vs http://www.postgresql.org/docs/

> Second - documentation as compared to other database vendors. No, this is not flamebait - by itself, it is very good and adequate. But for people used to MySQL, SQL Server - it feels...scary.

What is scary about it? Genuinely curious since I started out with MySQL and then when I started using PostgreSQL thought the PostgreSQL documentation was the best ever. It probably is a matter of preference, but I am curios what is scary.

EDIT: As for your two links the MySQL page is way scarier.. So many options, so many links.

I basically agree with you, except in regard to documentation, of which I find exceptional in PostgreSQL, even compared to other database systems.

Your toplevel comparison of the index of both manuals is not very informative to me. What do you feel is missing from PostgreSQL's manual that you find in other documentation sources?

I think it is hard for me to explain, because actually -> I <- prefer the Postgres documentation. However, I realized that was because it is fundamentally a man page. For someone who is exploring Postgres, or for someone who comes from other databases, it is way too abridged.

A simple example is the getting started part http://dev.mysql.com/doc/refman/5.5/en/creating-database.htm... vs http://www.postgresql.org/docs/9.2/interactive/tutorial-crea... Notice that Postgres has switched to using system commands like "createdb" while mysql sticks to SQL - there is a perfectly viable way to do this in Postgres using SQL, but perhaps the author didnt want to go directly to explaining template databases (which is simpler IMHO).

Go and ask someone who is new to both Mysql and Postgres to read these sections. Based on the distro they are on, you have completely lost them between the complexities of the OS vs the database Here is an example of the kind of confusions that result - http://superuser.com/questions/220771/how-can-i-create-a-dat...

Check this section out - http://dev.mysql.com/doc/refman/5.5/en/privilege-system.html. The reason why I am picking up client authentication example is that this where 90% of beginner postgres problems occur: they happened to me and they happen to every other first timer. Is postgres more complicated ? No, but you need to explain it better. I find this to be a much more useful tutorial to point first timers to, even if they are not on Ubuntu : https://help.ubuntu.com/10.04/serverguide/postgresql.html

I would say that I have the exact opposite opinion of you: the manual is not so great at narration because it is really extensive. Consider the example you gave: the postgres tutorial makes no assumption that you can even connect to the database or have the 'createdb' binary, and fairly exhaustively addresses each error case, whereas the MySQL tutorial at this stage presume you have a the mysql binaries and database session live already. In the stage of the tutorial that is most similar to the 'createdb' step is more similar to this page:


Whereby it's presumed that if you cannot get a connection you should talk to your administrator and give up. That may not be bad advice: the exhaustive covering of error cases by postgres tutorial may detract from rather than enhance the exercise.

I agree that Postgres' default authentication system one uses for development -- based on the account name -- probably deserves a different kind of explanation. It's an approach considerably better that most other software that it's probably worth keeping (especially for someone who hates more passwords), in spite of figuring out a way to explain it.

All in all, I basically don't agree with the notion that it is "abridged." It could be improved, but I think you are repeatedly using these words describing the scantness of the documentation whereas I think it is borderline over-detailed as it is. Perhaps a different kind of information presentation is required, is what I think.

Although still nascent, this lack of narrative documentation is probably why Craig started writing http://postgresguide.com/. This doesn't do a good job of explaining how to install the software, but is a lot more conversational than the Postgres documentation:




Clearly, one is more complete in some sense, but it's not going to help the beginner.

Personally, I found that SQL Server documentation was written to tick the checkbox of "100% documentation coverage", while PostgreSQL documentation was written for human to find useful information.

Get more billion-dollar companies to use PostgreSQL. How? Use PostgreSQL in our own startup, and build it into another billion-dollar company.

These will be web startups. Main problem for enterprises is lack of support from enterprise applications (ERP,CRM..) like SAP.

Web startups can become enterprises. Today's enterprises were car factories, farms, and accountants.

Wider adoption? In enterprise-scale companies?

For starters, acquire the reputation for providing high-touch, enterprise-grade, enterprise-focused, enterprise-costs support, with all that entails, and preferably with a cost somewhat below what Oracle charges.

Provide enterprise-friendly DBA training, certifications, and other related materials.

Have your sales reps schmooze the C-level office, and the IT leadership.

But adoption of databases in enterprises in general? They're loathe to change their databases, and have massive investments in their existing software and hardware processes, and their staff and training. The costs involved here are staggering.

If you're looking for more organic growth (and as some of those smaller companies grow and get bigger), that'll happen. But it'll take years. Maybe some acceptance in smaller projects in enterprises, but this sort of "shadow IT" can be a guerrilla effort in an enterprise; discouraged by management.

> enterprise-costs support, with all that entails, [...] Provide enterprise-friendly DBA training, certifications, and other related materials.

Most of those things are basically a scam (and often also a racket).

> Have your sales reps schmooze the C-level office, and the IT leadership.

And there you got your answer: basically, bribery.

This is just nonsense.

I work in enterprise and support/training is vital for maintaining 24/7 support coverage and ensuring that developers have the best knowledge from the vendor.

The presentation in the article you link to is highly biased in favor for SQL Server. it tells us that "From a cost perspective, SQL Server is the clear winner." which is untrue since PostgreSQL supported by EnterpriseDB, SQL Server and Sybase all have so close TCOs that any minor tweaking of the assumptions could make any of them the winner.

For costs the article really only shows one thing: Oracle is way more expensive than the rest.

For the scoring model the article just asks us to trust in them since they do not provide the results of the feature scoring method.

What a fantastic example of false quantification and pretension to objectivity.

Yeesh, you're right, that's pretty bad. The assumption that the customer isn't going to need to update the database version (and thus shell out more...) in those 10 years invalidates the cost assumptions all on its own.

There's legitimate reasons to buy from the big three DB providers: there's a shortage of quality DBAs for postgres, and of course there are the 5-10% of cases that actually need the features/performance of the big 3. Evaluations like this, however, do not make a good case.

We need objective comparisons to promote PostgreSQL.

You need more than that. Comparisons and bullet points and blog posts aren't the problem here, there's already billions of those.

What's needed is more "PostgreSQL for SQL Server People" type conversion material and tools to help facilitate this process. NaviCat is not an acceptable solution here.

Years ago when I switched to MySQL it was because it was easy to get going, a lot of community support around the product. You have questions, you get answers almost immediately.

Postgres is a much more sophisticated offering and still needs to file down a lot of the sharp edges on their toolset. The psql command line tool for one is very bewildering to the uninitiated and has two conflicting syntaxes, one for the backslash commands and one for actual queries. This is not unlike other SQL shells, but MySQL, while highly non-standard, does have the advantage of a unified interface here.

Great talk, love the commentary relating the the index contributions by the Russian contributors really impressive work.

Read and write performance improvements are something that will directly improve even current applications.

Talks like this provide valuable insight into what is going on in this amazing project.

Does anyone have accompanying video of the talk?

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact