Ask HN: Server application frameworks with excellent NoSQL resources?

oblib · on April 24, 2017

I'm using CouchDB and the Apache web server on Ubuntu 16.04 on the server side, and PouchDB, Bootstrap, and jQuery on the client side. While maybe not a "stack" in the traditional sense, it's a solid approach with a lot of benefits.

Personally, I believe this is a sweet spot right now that's mostly being overlooked. CouchDB and PouchDB have come a long way in the past two years. They are now solid and offer an incredibly rich API that's well documented with a lot of example code and an active and helpful user base to help keep you moving forward.

It may not sound as jazzy or have the media hype of some of the newer tools out there, like React and Angular, but once you start building with them you'll find things come together quickly.

I've not dug too deep into CouchDB "Views" yet but that's another powerful tool to get and present data that is very fast and efficient.

Since almost everything is written in JavaScript, you have a huge number of existing libraries you can drop in to add needed features, and you end up with an app that runs almost entirely on the client side and can run offline there if you want or need it to. When you add "Service Workers" to cache those app files, the web server doesn't do much besides deliver the app once and update it as directed.

In this setup the Apache web server is redundant since CouchDB is also a web server. I use it because I'm familiar with it, and it provides an option to separate the DB server from the app file server, which I figured might offer some management and scalability benefits. So far I haven't come close to needing that, but I didn't have to spend any time learning how to configure the CouchDB web server, so I saved a bit there. YMMV.

cpburns2009 · on April 24, 2017

Unless you have a very clear reason to use some NoSQL database, stick with SQL. And even if a valid use case does arise, you'll likely be better off using the RDBMS for the bulk of your data, and only use the NoSQL database where it is strictly needed.

oblib · on April 24, 2017

That really depends on the data and how you intend to use it.

There are some very good reasons to not use an RDBMS.

itamarst · on April 24, 2017

Why do you want to switch to "NoSQL"? For many applications the lack of transactions will eventually cause significant problems.

sweetheart · on April 24, 2017

It was recommended by our data scientist, as he made the claim it would be easier to glean useful data from what we've collected. I know nothing about this, so I have no opinion on it at all. It was also proposed to prevent increased server load as a result of not denormalizing any data in our schema. Granted, we haven't started building yet, as this is all preliminary. Would this fall into the "premature optimization" category of common mistakes?

oblib · on April 24, 2017

Go take a look at IBM's Cloudant. (https://cloudant.com)

They have some examples there that demonstrate how fast it is at digesting data from hundreds of thousands of records. It has to do with how the data from searches is stored in a B-tree. I won't pretend to know the science behind that, but I can tell you it's impressively fast and easy to achieve.

And you can use PouchDB to create the client side interface and plug it into a CouchDB server you set up or IBM's Cloudant, or both for that matter, with just a line or two of code.

So you really don't even need to set up a database server to get started on your application code. You can create a Cloudant DB, load it with data, and get right to work on your client.

oblib · on April 24, 2017

"he made the claim it would be easier to gleen useful data from what we've collected"

He's probably right.

"Would this fall into the "premature optimization" category of common mistakes?"

No. I don't believe so. You don't have to optimize your data. In CouchDB you need to convert it to JSON, but that's a pretty straightforward process. I wrote a Perl script to do that for data I was working with.
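
A minimal sketch of that kind of conversion, written in Python rather than the Perl script mentioned above (which isn't shown in this thread). The column names and sample rows are made up for illustration; the `{"docs": [...]}` wrapper is the shape CouchDB's `_bulk_docs` endpoint accepts.

```python
import csv
import io
import json


def rows_to_bulk_docs(csv_text):
    """Turn CSV text into a JSON payload shaped for CouchDB's _bulk_docs endpoint."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # One JSON document per row; CouchDB assigns an _id when none is given.
    docs = [dict(row) for row in reader]
    return json.dumps({"docs": docs})


# Hypothetical sample data, not from the thread.
sample = "name,city\nAda,London\nGrace,New York\n"
payload = rows_to_bulk_docs(sample)
```

The resulting string could then be POSTed to `/yourdb/_bulk_docs` with a `Content-Type: application/json` header.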

You can create databases that work on a specific dataset if you want, or create a single database and load it with everything you've got.

From there you design "Views" that work with data in the "files" you store data in. When you run the view the first time CouchDB creates a B-tree that is then stored and used to find the data you're working with and return the results based on your criteria.

That B-tree is where your data is "optimized" and CouchDB does that for you. The B-tree is updated incrementally as your dataset changes (CouchDB brings it up to date the next time the view is queried), so you don't have to do anything to "re-optimize" it.
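
To make the "Views" idea a little more concrete, here is a rough sketch of what a design document and a view query URL can look like. The database name, design document name, view name, and the `city` field are all hypothetical; the map function itself is JavaScript stored as a string, which is how CouchDB design documents hold it.

```python
import json
from urllib.parse import quote

# Hypothetical design document: the map function is JavaScript kept as a string.
design_doc = {
    "_id": "_design/example",
    "views": {
        "by_city": {
            "map": "function (doc) { if (doc.city) { emit(doc.city, 1); } }",
            "reduce": "_count",  # built-in reduce that counts emitted rows
        }
    },
}


def view_url(base, db, ddoc, view, key):
    """Build the URL that queries a view for one key (keys are JSON-encoded)."""
    encoded_key = quote(json.dumps(key))
    return f"{base}/{db}/_design/{ddoc}/_view/{view}?key={encoded_key}"


# Assumed local server and database name.
url = view_url("http://localhost:5984", "mydb", "example", "by_city", "London")
```

Saving `design_doc` into the database and then fetching `url` is what triggers CouchDB to build (or catch up) the index for that view.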

This is all fairly easy to do and the learning curve is fairly short. If this is being run in-house I'd recommend developing your app on a Raspberry Pi. You may find that's all you really need.

You can set up CouchDB on a Raspberry Pi with over 250 GB of storage for around $100. (http://wdlabs.wd.com/products/pidrive-compute-centre/). This actually provides a pretty great way to distribute your app and data too, and lets those who have access modify it themselves.

You can develop your app on a Pi, and from there, if you need more power, you can replicate everything you've got to a Cloudant DB, install CouchDB on Google Cloud or Amazon Cloud, or spin up a DigitalOcean VPS and install CouchDB on it. You've got several great options there.

CouchDB has a built-in tool to replicate your data to/from other CouchDBs so this is incredibly easy to do. Pretty much push a button easy.
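
Behind that button is an HTTP call to CouchDB's `_replicate` endpoint. A minimal sketch of preparing such a request is below; the host names and database name are placeholders, credentials are left out, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_replication_request(couch_url, source, target):
    """Prepare (but do not send) a POST to CouchDB's _replicate endpoint."""
    body = json.dumps({
        "source": source,
        "target": target,
        "create_target": True,  # create the target database if it is missing
    }).encode("utf-8")
    return urllib.request.Request(
        f"{couch_url}/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical hosts and database names.
req = build_replication_request(
    "http://localhost:5984",
    "http://localhost:5984/mydb",
    "https://example.cloudant.com/mydb",
)
# urllib.request.urlopen(req) would start the replication on a real server.
```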

And don't underestimate what you can do with the Raspberry Pi. They run CouchDB just fine. I'd be glad to send you step-by-step install instructions for CouchDB 2.0 (latest version) to help get you started with that.

itamarst · on April 24, 2017

Definitely seems like premature optimization. Unless you know you're scaling up a lot, worry about it later... and then you can separate the different kinds of data into different databases for different uses.