

The Future of PostgreSQL - sprachspiel
http://rhaas.blogspot.com/2010/05/big-ideas.html

======
btilly
The WAL streaming/hot standby you can read from in 9.0 are big. Back when I
used PostgreSQL they were the biggest missing features that I wanted.

If you're doing reporting then you really care about analytic queries, which
arrived in 8.4. The truth is that most developers don't use their databases in
ways where they will benefit. But if you run across cases where you think,
"I wish I could just suck this data out, sort it this way, do this simple
processing/grouping, and upload the result back into the database," then you
have probably run across a use case for analytic queries. With analytic
queries available, the only cases where I've still had to do that are to join
data that is not in the database, to process datasets that were too big for
the database server to physically handle, and once because performance
really, _really_ required it.
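To make the "suck it out, sort, group, upload it back" pattern concrete, here is a minimal sketch of the in-database alternative: an analytic (window) query of the kind 8.4 added. It uses Python's sqlite3 for a self-contained demo (SQLite has supported window functions since 3.25); the table and column names are invented for illustration.

```python
import sqlite3

# Toy data: sales amounts per region. In the "old" workflow you would
# export these rows, sort/group them in application code, and re-import
# the results. A window query does it in one statement.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('east', 100), ('east', 300), ('west', 200), ('west', 50);
""")

# Rank each sale within its region, and attach the region total to
# every row, without leaving the database.
rows = conn.execute("""
    SELECT region, amount,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk,
           SUM(amount)  OVER (PARTITION BY region) AS region_total
    FROM sales
    ORDER BY region, rnk
""").fetchall()

for row in rows:
    print(row)
```

The same `OVER (PARTITION BY ...)` syntax is what PostgreSQL 8.4 supports, so the sketch transfers directly.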

~~~
warriors
You'd be better off using an OLAP engine for that, like SQL Server Analysis
Services.

~~~
btilly
OLAP engine for which? The reporting needs? Or the case where I needed
performance?

As for the reporting needs, introducing OLAP would have been a lot of work for
something that could be done perfectly well in the existing Oracle database
using a supported Oracle feature. There is no need to introduce an expensive
new technology stack for an already solved problem.

On performance, heh. Hundreds of thousands of items had been given arbitrary
tags (on average over a dozen per item, and the same tag could be given
repeatedly), and the question was to identify for each item the other items
which were "most closely related" based on shared tags.

The fundamental problem with doing this in any kind of database is that the
fastest access path to a given fact is an index lookup. That means a binary
search through a data structure to find the page and row number the data is
on, followed by parsing that page in memory to locate the row and extract the
wanted fields. And that is the _fastest_ form of lookup the database offers.

The equivalent operation in C++ is accessing an array by index and pulling an
element out of a struct. As a bonus, it is easy to organize your data so that
most of your accesses hit the CPU cache.

Processing time for the whole data set dropped from a week to 5 minutes.
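The contrast above can be sketched in a few lines. This is an illustration in Python rather than the commenter's C++ (and the item/tag layout is invented): items are packed into a dense list indexed by item id, so "look up item N" is a single subscript and a field read, instead of a tree descent plus page parsing. The CPU-cache benefit the comment mentions comes from the contiguous layout and is not something this sketch measures.

```python
from dataclasses import dataclass

@dataclass
class Item:
    # Hypothetical per-item record; in the real workload each item
    # carried its tags, with over a dozen tags per item on average.
    item_id: int
    tag_count: int

# Dense array indexed by item id: the whole "index lookup" is items[n].
items = [Item(i, i % 12) for i in range(1000)]

# Direct access by index + field read -- no search structure involved.
def tags_of(n: int) -> int:
    return items[n].tag_count

# A full pass over the data is just a linear scan of the array.
total = sum(item.tag_count for item in items)
print(tags_of(500), total)
```

A database index lookup does strictly more work per access (search, page fetch, row parse), which is why pulling the data into flat in-memory structures paid off so dramatically for this all-pairs tag comparison.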

------
volomike
I wish they had a click, click, click and boom -- easy data replication to
another server. Same with hot backups: I'd like to set up a backup job that
runs on the hour and lets me fall back to any hour within at least the last 3
days if I have an issue.

