JSON vs hstore

wh-uws · on July 4, 2013

So I'm using both types in a current personal project and I have to agree with the comment on the blog post

"[...] I use JSON for external application data that doesn’t need to be interpreted by say, an SQL function or a trigger. Basically storage-only data..."

Basically I use hstore's as meta columns, for say a one off attribute of a model that I dont want to have as another column and have to do a migration. It is leaps and bounds better than using a text column because you can query.

JSON I use for basically caching api calls. For instance I save off the json from the itunes api for all of the albums of an artist or all of their songs.

It will be awesome when 9.3 comes out and you can query that though. That will be something.

_gtly · on July 5, 2013

I also think of this in terms of viable development migration paths:

For example, 9.3's upcoming: hstore_to_json(hstore) and hstore_to_json_loose(hstore) open up possibilities. If an hstore datatype is chosen now, you can feel fairly confident that it can be fairly straightforwardly migrated to a native json datatype in 9.3 or later versions, when you feel the json featureset and stability may have become mature enough for your particular project's needs.

moomin · on July 4, 2013

Now all we need is EDN support in Postgres.

craigkerstiens · on July 4, 2013

Here's the original post which fired off some further clarifications from Christophe:

http://www.craigkerstiens.com/2013/07/03/hstore-vs-json/

zapov · on July 4, 2013

I'm really failing to understand obsession with unstructured types in Postgres? Why can't you just use native types for hierarchical structures?

Is it really that hard for people to maintain database changes?

ketralnis · on July 4, 2013

I'm really failing to understand obsession with deriding people for solving problems that you personally don't have.

But yes, for some people it can be difficult to do this. It slows development, it makes programmers have to do sysadmin work (which in a two-person company without a specialised sysadmin can be difficult).

As another use-case, there was a point at which Twitter would take days or weeks to implement even minor schema changes across their entire MySQL cluster without impacting production.

And some people are solving problems that just do have unstructured or semi-structured data, rather than your assumption here that it's just a proxy for avoiding schema changes

zapov · on July 4, 2013

Ok, I admit my comment is a little overboard.

Comparing Mysql migration to Postgres is not really helping this discussion. Postgres has transaction DDL with many schema change operations being O(1).

Most of the time your data has structure. Only parts of it are usually semi-structured. And yeah, I had this problem too, until I automated it and don't have it anymore. So I guess I'm a little annoyed when I see that schema change automation didn't really took off and people accepted JSON as a storage medium.

jakejake · on July 4, 2013

Avoiding DB work strikes me as not the greatest reason to store JSON in your DB. But I find there are good reasons to do it. The main one for me is when the data fields are configurable by the user.

If I didn't want to be constrained by a schemas I'd just go with a NoSQL solution. Nothing wrong with that either.

tensor · on July 4, 2013

Postgres can make a pretty nice "nosql" solution too. Either via hstore, json, or bytea. That said, nothing frees you from a schema. Not having it explicit in the database just means you have it encoded into your application code.

philwelch · on July 5, 2013

> it makes programmers have to do sysadmin work (which in a two-person company without a specialised sysadmin can be difficult)

If you're a two-person company without a specialized sysadmin, programmers having to do sysadmin work should be inevitable, normal, and expected, not "difficult".

ishbits · on July 4, 2013

I used hstore in an existing database for metrics. It allowed me to add different metrics in the code, rather than having to update the database for new metrics, then a metrics to hstore layer that more or less "streamed" them into the database. I find it quite awesome for that case.

Sometimes there is a need to unstructured data, and if you are already using Postgres, hstore is a great alternative to adding another database backend to your stack.

kaonashi · on July 5, 2013

Depending on the application, hstore can see real performance increases.

http://blog.artlogic.com/2012/10/29/performance-of-custom-fi...

dragonwriter · on July 4, 2013

The article here seems to pretty clearly lay out usage consideration in selection between the types at issue (neither of which is really "unstructured".)

reczy · on July 4, 2013

a bit more hands-on, but it's possible to create a functional index to improve json query performance in postgres:

https://postgres.heroku.com/blog/past/2013/6/5/javascript_in...