

Postgres, NoSQL, or other... - brittonrt

I am writing an application which stores content (usually text, images, links, etc) and is highly relational (content can be associated with other content and queried as such).<p>An example would be fetching the 10 most recent submissions by a specific user, sorted by "rating" number which is also associated with a piece of content, as well as all content which has been related to each returned piece of content.<p>IE Bob requests the 5 most popular data points he's published and all the associated images/links/etc.<p>I am currently using MongoDB, and have had no issues, but after doing some research, I'm considering switching to an amorphous Postgres implementation where there are only 2 tables, a data table and a relationships table, with Memcached keeping some of the higher level requests precached for quick fetching.  This should keep my schema from the application's perspective highly flexible.<p>I know it's apples and oranges, but would anyone be willing to offer maybe some of their own experiences with similar setups that might help me determine if this is a good decision?  My main reason for looking into migrating away from MongoDB is write speed and data volatility.  I don't want to ever lose data if possible, and I understand that Mongo has table level locking when writing (is this true still?)<p>If anyone has any advice on the performance/reliability implications of either or both options, or if anyone thinks there's an even better 3rd option I haven't considered, I would be very interested to hear!<p>Thanks so much!
======
lmm
If you're using that kind of schema you won't be getting any of Postgres'
advantages. Obviously it has a pretty good reliability reputation, but I'd be
very cautious about switching to Postgres because you're going to be making a
lot of work for yourself.

If you're worried about performance, benchmark it (or find someone who has).
On the reliability side there are plenty of NoSQL options with better
reputations than MongoDB. My first thought is neo4j, but I don't really know
enough about your particular problem.

~~~
brittonrt
Thanks for the reply. From what I understand, the schema I described is
similar to what Reddit uses in their Postgres setup. You mentioned that I
"won't be getting any of Postgres' advantages", what are some of the most
important considerations that play into Postgres strengths? What are the
primary advantages you for Postgres your referring to?

Thanks!

~~~
lmm
I meant just in terms of Postgres expecting a "standard" relational schema;
the interface and query planner and the like are designed around that. It's
not a _bad_ schemaless data store per se, but it's not really designed for
that use case, so I would expect rough edges when using it that way.

The modern selection of schemaless data stores weren't as mature (if they even
existed) when Reddit was getting started, so don't assume that they would
choose that architecture if they were starting today. I suspect that if you
build on postgres in this way you'll have to write significant pieces of
functionality by hand that you get for free with
mongodb/couchdb/redis/neo4j/etc.

That said I'll emphasise again that I don't know too much about your system;
at some point, prototyping and measuring will be the only way to see what
works best for you.

------
dotborg
you should start from giving some numbers i.e. you want it to work for 10mil
documents, how ofter they will be updated etc.

if it's less than 1mil and they are not updated frequently, you should go for
postgres and have some fun with SQL,

more data and increased concurrency means problems also in postgres, a lot of
optimizing, materializing views and other fun problems to solve.

