

PostgreSQL 8.4 Released - xenoterracide
http://www.postgresql.org/docs/8.4/static/release-8-4.html
 After many years of development, PostgreSQL has become feature-complete in many areas. This release shows a targeted approach to adding features (e.g., authentication, monitoring, space reuse), and adds capabilities defined in the later SQL standards. The major areas of enhancement are:<p><pre><code>    * Windowing Functions
    * Common Table Expressions and Recursive Queries
    * Default and variadic parameters for functions
    * Parallel Restore
    * Column Permissions
    * Per-database locale settings
    * Improved hash indexes
    * Improved join performance for EXISTS and NOT EXISTS queries
    * Easier-to-use Warm Standby
    * Automatic sizing of the Free Space Map
    * Visibility Map (greatly reduces vacuum overhead for slowly-changing tables)
    * Version-aware psql (backslash commands work against older servers)
    * Support SSL certificates for user authentication
    * Per-function runtime statistics
    * Easy editing of functions in psql
    * New contrib modules: pg_stat_statements, auto_explain, citext, btree_gin</code></pre>
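For instance, the new windowing functions compute per-group results without collapsing rows the way GROUP BY does. A minimal sketch (the `empsalary` table and its columns are illustrative, not from the release notes):

```sql
-- Rank employees by salary within each department, keeping one row
-- per employee instead of aggregating the group away.
SELECT depname, empno, salary,
       rank() OVER (PARTITION BY depname ORDER BY salary DESC)
FROM empsalary;
```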
======
mattyb
tsally, don't look!

I was hoping to see simple built-in replication in 8.4 (didn't make it into
8.3, see here: [http://it.toolbox.com/blogs/database-soup/postgresql-
develop...](http://it.toolbox.com/blogs/database-soup/postgresql-development-
priorities-31886)), so I guess it's on the 8.5 wishlist.

~~~
mdasen
Part of the issue is that there is no such thing as "simple" replication
unless one means "unreliable" replication. Simple replication is just shooting
off the same SQL insert/update query on two servers. In the best scenario,
that means that if you use a time function, you're likely to get two different
results. In the worst case, it fails on one server and data gets out of sync.

Really, it's not that PostgreSQL needs more replication or built-in
replication. It needs _better_ replication than any of the current solutions.
pgpool, while great as a connection pooler, is a terrible replication
solution. Slony-I is based on triggers, and its communication costs grow
quadratically (O(n^2) - yuck!). Plus, Slony-I requires an immense amount of
setup for every table and every key. Mammoth replicator, which seems to be the
closest to the right track, has a website that doesn't show much life and only
a beta release for download. Plus, it still looks like there's a good amount
of setup.

What we all really want is for PostgreSQL to implement a log-shipping based
replication system where we can say "replicate this database" or "replicate
these tables" to another server and have it just work. But there's nothing
simple about that.

~~~
emmett
I completely agree there's nothing simple about that, but ever since we
switched from Slony-I to Londiste at Justin.tv I couldn't be happier. It's
been completely bulletproof, recovers from errors easily, and makes schema
upgrades a breeze.

------
nradov
It's a shame that PostgreSQL still lacks meaningful XML support. That's one
critical area where the commercial relational database vendors are way ahead.

~~~
sho
Why on earth would you store XML data in a database? XML is for when you have
structured data but _no_ database. Any structure you can store in XML is
already catered for in conventional RDBMS design. And if you just want to
store the string, there is nothing stopping you from doing that.

~~~
nradov
There are a lot of applications that benefit from native XML storage in a
database. Here are some examples:
<http://www-01.ibm.com/software/data/db2/customers/> . Of course those
customer statements are somewhat biased coming through IBM, but still fairly
accurate and realistic.

Storing XML as a string is really not very useful since you can't query into
the document contents. You also can't create indexes on particular elements or
attributes for fast access.

Microsoft has a good explanation of why it isn't necessarily appropriate to
store XML structures in a conventional RDBMS. I'll quote it here:

"The XML data model has characteristics that make it very hard if not
practically impossible to map to the relational data model. XML data has a
hierarchical structure that may be recursive; relational databases provide
weak support for hierarchical data (modeled as foreign key relationships).
Document order is an inherent property of XML instances and must be preserved
in query results. This is in contrast with relational data, which is
unordered; order must be enforced with additional ordering columns. Re-
assembling the result during querying is costly for realistic XML schemas that
decompose the XML data into a large number of tables."

[http://msdn.microsoft.com/en-
us/library/ms345117(SQL.90).asp...](http://msdn.microsoft.com/en-
us/library/ms345117\(SQL.90\).aspx#sql2k5xml_topic2)

~~~
rkischuk
Ok, seriously? You're asking an RDBMS to index into XML data you've shoved
into a column?

Quoting Microsoft & IBM on database design is like quoting GM & Chrysler on
vehicle design. All are years behind and focused on an under-informed and
declining customer base.

2 things:

1\. Postgres supports both regular expressions and indexing on expressions. If
you can't index your table on a regular expression, you are probably doing
something wrong.

2\. Reconstructing full XML documents from a single column based on a query is
a lazy anti-pattern. Either a) store the document in a way that it can be
retrieved on demand, or b) structure the data and the query in a way that is
performant. Jamming a verbose textual representation of data into a single
column of a table and then demanding performance is a fundamental
misunderstanding and misuse of an RDBMS table.
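Point 1 can be made concrete. A rough sketch of Postgres expression indexes (the table and column names here are hypothetical):

```sql
-- Expression index: queries filtering on lower(email) can use this
-- index instead of scanning the table.
CREATE INDEX idx_users_lower_email ON users (lower(email));

-- The indexed expression can also extract a field from stored text
-- with a POSIX regex via substring(), and index the result.
CREATE INDEX idx_orders_ref
    ON orders (substring(notes from 'REF-[0-9]+'));
```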

~~~
nradov
Apparently you don't understand how it actually works. At least in the case of
DB2, the database engine doesn't store a "verbose textual representation" of
XML documents. It breaks them down into a more efficient internal binary
format that still preserves the structure.

I deal with complex XML documents that go a dozen or more levels deep. It
would certainly be possible to represent that data in a relational structure.
But then to query based on the value of a deeply nested element would require
a large number of table joins. Keeping everything in a single XML column is
much simpler and faster, since the specific inner element values can be
indexed.

Hierarchical databases have been around for years, and actually predate
relational databases. XML databases are just a specialization of hierarchical
databases. Some applications are a better fit for the relational model and
some are a better fit for the hierarchical model. With hybrid databases you
get the best of both worlds and can choose the right tool for the job.

~~~
sho
_"Apparently you don't understand how it actually works."_

Man, nope. No disrespect but if you are regularly querying _12 levels deep_
into _an XML file you have stored in your database_ then IMO you are almost
certainly doing something horribly, horribly wrong.

~~~
nradov
Apparently you've never worked with complex healthcare data. If you can figure
out a simpler way be sure to let us know.

------
sethg
I am salivating over the new WITH RECURSIVE ... SELECT ... syntax
(<http://www.postgresql.org/docs/8.4/static/sql-select.html>). There's a very
hairy piece of C++ code in my group's codebase that grinds over a certain
schema to perform a certain transitive closure, and if I can redo that
operation in SQL, we might be able to change its status from "only godlike
programmers may dare touch this code" to "only demigodlike programmers may
dare touch this code".
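The transitive-closure case maps onto the new syntax roughly like this (the `edges(src, dst)` table is a stand-in for the actual schema):

```sql
-- Reachability (transitive closure) from node 1: the non-recursive
-- term seeds the set, the recursive term follows edges until no new
-- nodes appear; UNION (not UNION ALL) discards duplicates, so cycles
-- terminate.
WITH RECURSIVE reachable(node) AS (
    SELECT 1
  UNION
    SELECT e.dst
    FROM edges e
    JOIN reachable r ON e.src = r.node
)
SELECT node FROM reachable;
```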

------
drusenko
"A dump/restore using pg_dump is required for those wishing to migrate data
from any previous release."

Don't know much about postgresql, but that sounds fun...

~~~
jawngee
It's one command to dump and one command to restore:

    pg_dump -Fc database > database.backup

 _upgrade postgresql_

    pg_restore -d database database.backup

~~~
neilc
You should use pg_dumpall rather than pg_dump, typically. Also, remember to
use the pg_dump implementation from the _new_ version of PostgreSQL (e.g.
8.4), not the old one (e.g. 8.3).
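A sketch of that workflow, assuming the new binaries are installed alongside the old cluster (the paths and ports below are assumptions, not official defaults):

```shell
# Dump the whole cluster (all databases plus roles) with the NEW
# pg_dumpall, pointed at the old server still running on port 5432.
/usr/lib/postgresql/8.4/bin/pg_dumpall -p 5432 > all.sql

# ...initdb a fresh 8.4 cluster, start it on another port...

# Reload everything into the new server.
psql -p 5433 -f all.sql postgres
```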

There's also the new (beta) pg_migrator tool which allows upgrades to be
performed without a dump + reload. <http://pgfoundry.org/projects/pg-
migrator/>

