Hacker News
PostgreSQL 9.2 beta adds JSON (h-online.com)
108 points by voodoochilo on May 15, 2012 | 24 comments

Wow, range types with non-overlapping constraints:


    CREATE TABLE room_reservation (
        room   TEXT,
        during TSRANGE,
        EXCLUDE USING gist (room WITH =, during WITH &&)
    );

    INSERT INTO room_reservation VALUES ('123A', '[2010-01-01 14:00, 2010-01-01 15:00)');

    INSERT INTO room_reservation VALUES ('123A', '[2010-01-01 14:30, 2010-01-01 15:30)');
    ERROR:  conflicting key value violates exclusion constraint "room_reservation_room_during_excl"
    DETAIL:  Key (room, during)=(123A, [ 2010-01-01 14:30:00, 2010-01-01 15:30:00 )) conflicts with existing key (room, during)=(123A, [ 2010-01-01 14:00:00, 2010-01-01 15:00:00 )).

You used to have to rely on the application to enforce these sorts of constraints.

Yeah, this is another big one. I first came across the temporal project [1] in 2008. Glad to see that it's finally making it into a release.

[1] http://pgfoundry.org/projects/temporal/

I think the bigger deal (at least for most existing deployments) are the index-only scans. This is one of the only features of Oracle that I missed after switching to Postgres.

I don't know. Index-only scans bring a level of feature parity with Oracle. Range types / range exclusion constraints introduce features that no other DBMS has - and solve a messy, hard problem that you pretty much have to hand-code for each scenario.

One of these is legitimately revolutionary, and it isn't covering indexes.

I have to admit that I am not as familiar with this feature, so I will check it out. I glossed over it as something I could already accomplish with a check constraint or a trigger, but I think you might be right that there's more to it than that.

The reason a trigger or check constraint is not enough is race conditions. While the trigger or check constraint looks at the other rows in the table for a colliding range, another transaction may at the same time insert a colliding range. The two concurrent transactions won't see each other's rows, and both inserts will succeed.

Without PostgreSQL's exclusion constraints this can be solved in various more or less ugly, problem-specific ways: locking the entire table, allowing overlapping ranges and cleaning them up later, implementing your own locking infrastructure (e.g. one lock per day or per room), only having one process insert at a time, or just ignoring the problem and hoping you won't hit any race conditions.
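As a sketch (using the room_reservation table from the example above), the hand-rolled check is racy in exactly this way:

    -- Naive application-level check: two concurrent transactions can
    -- each run this SELECT, each see no conflict, and then both
    -- insert overlapping reservations.
    SELECT 1 FROM room_reservation
     WHERE room = '123A'
       AND during && '[2010-01-01 14:30, 2010-01-01 15:30)'::tsrange;

    -- The EXCLUDE constraint performs the same overlap test atomically
    -- inside the index, so one of two conflicting inserts is
    -- guaranteed to fail.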

Exclusion constraints are a clean solution to a messy problem.

Can you please explain index-only scans, and what the advantages are (or why this isn't the default already)?

Sure. It's not the default already because the functionality did not exist at all in previous versions.

Say you have a table with a bunch of columns: (column1...column15) and you have an index on (column3, column4) and you run a query like:

  select column4 from table where column3 = x;
the DB will read only from the index (and not look at the table at all). This is much faster, as the index is much smaller than the table.
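A concrete sketch (hypothetical table and index names; the plan node name is what 9.2's EXPLAIN prints when this optimization applies):

    -- Hypothetical four-column table with a two-column index.
    CREATE TABLE t (column1 int, column2 int, column3 int, column4 int);
    CREATE INDEX t_c3_c4 ON t (column3, column4);

    -- In 9.2 this can produce an "Index Only Scan using t_c3_c4" plan
    -- node, meaning the heap is not touched for all-visible pages.
    EXPLAIN SELECT column4 FROM t WHERE column3 = 42;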

Thanks, that's interesting. So it only kicks in when your index covers all the columns in the "what" (select) clause. That tends not to be true for tables I've designed/worked with. I guess you could index differently to take advantage of this though.

Yes, databases which support index-only scans tend to encourage this pattern of adding extra columns to your indexes. In some cases this can be a big gain, but larger indexes mean that the indexes are slower for queries which do not need the extra columns or need to read the entire row anyway.
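For a hypothetical table t with columns column1..column5, the pattern looks like this (note that 9.2 has no separate "include" mechanism, so the extra column simply widens the index key):

    -- column5 is added to the index purely so this query can be
    -- answered from the index alone; every other use of the index
    -- pays for the extra width.
    CREATE INDEX t_c3_c4_c5 ON t (column3, column4, column5);

    SELECT column4, column5 FROM t WHERE column3 = 42;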

Another set of queries which benefit from index-only scans are those with EXISTS, or JOINs used just for filtering the result set. In PostgreSQL 9.1 they always needed to read part of the actual row to check the visibility information. In 9.2, for pages marked all-visible by VACUUM, only the index and the visibility map need to be checked.
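A sketch of the kind of query that gains (hypothetical schema; the subquery only filters, and no columns from order_items are returned):

    -- With an index on order_items (order_id), 9.2 can answer the
    -- EXISTS probe from the index plus the visibility map alone;
    -- 9.1 had to fetch the heap row just to check visibility.
    SELECT o.id
      FROM orders o
     WHERE EXISTS (SELECT 1 FROM order_items i WHERE i.order_id = o.id);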

It's not the default already because, pre-9.2, the MVCC implementation in Postgres didn't guarantee that an index has the same visibility information as the table data.


The title of the linked article is actually "PostgreSQL 9.2 beta improves scalability, adds JSON". Not sure why it got truncated.

I, for one, am not excited by the JSON addition at all. On the other hand, index-only scanning, the scalability improvements, and reduced power consumption are all a very big deal.

Will the JSON type have document indexing, or is it just a text store that validates that it's JSON?

You don't necessarily need special support for that. There's no special XML indexing support, but you can use the XPath and XSLT functions to extract the data of interest and index that:

    CREATE TABLE foo (a_doc xml);
    CREATE INDEX doc_title ON foo ((xpath('/root/@title', a_doc)));
I don't see any JSON functions for extracting data in the devel documentation, but that's what you would be waiting on, and it could probably be added by an extension without having to wait for built-in support.
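A sketch of what such an extension function could look like, using PLV8 (server-side JavaScript); the function name and implementation here are assumptions, nothing like this ships with 9.2:

    -- Hypothetical: extract one top-level key from a json value so it
    -- can be used in an expression index, analogous to the xpath trick.
    CREATE FUNCTION json_get(doc json, key text) RETURNS text
    LANGUAGE plv8 IMMUTABLE STRICT AS $$
        return JSON.parse(doc)[key];
    $$;

    CREATE TABLE docs (body json);
    CREATE INDEX docs_title ON docs ((json_get(body, 'title')));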

There's no native function, but it can easily be implemented. See sickpig's answer (sibling to your post) for a Javascript-based solution.

Just validation; the json core type is intentionally very minimal (see the postgres hackers mailing list archives for more details).

Anyway, as a possible workaround, see Andrew Dunstan's post:


It's kind of annoying to read so many good things about PostgreSQL when you have hundreds of gigabytes of data in MySQL and no real reason to migrate to another database, given the expense. I loved Postgres in my previous project when I needed spatial indexing, and I started to like its interface a lot.

It seems to me that Postgres is solving the problem of having different databases for your app. I would love to see the Redis datatypes built into it and Postgres might then become the strongest choice when selecting tools for my next app.

"I would love to see the Redis datatypes built into it and Postgres might then become the strongest choice when selecting tools for my next app."

Have you considered writing an extension to do that? It may be easier than you think; PostgreSQL has a powerful extension mechanism (you can even do specialized indexing!).

As far as I can tell, JSON only made it into core because there are expected to be many users very quickly. But that doesn't scale -- it's better for users to look at pgxn.org and developers to use the extension mechanism. Once that becomes a matter of course, we'll see an explosion of specialized extensions.

Which Redis types specifically?

Have you seen?: https://github.com/dpage/redis_fdw

Seems pretty experimental. But even just for a key-value store and the json datatype, how good is the support for all these features in libraries like the ActiveRecord pg adapter? I remember that two years ago, features like bounding-box selects required plain SQL, although some quite bad libraries existed.

I suddenly got excited about JSON API support directly from PostgreSQL; I was being too naive. But I am still optimistic about the future.

Covering indexes is really good news!
