
Ask HN: I think I need to use NoSQL? - mothsonasloth
I've been writing software for about 6 years and have always found the
relational approach suitable for my needs.

However, after spending a week inheriting a prototype which is all over the
place, I am starting to think I will have to use something I never dreamed of
using: NoSQL, whether that be MongoDB, Redis or Cassandra.

To give you some background:

A large enterprise with ageing software infrastructure is looking to replace a
system which consists of a COBOL terminal app, 3 VB powered spreadsheets and a
lot of emailing.

The solution the team has prototyped is effectively a spreadsheet with a
business rules engine.

This spreadsheet at the moment has lots of tabular data and could change over
time.

I've spent several sessions with my team looking at entities and mapping their
relationships. However we keep hitting ruts.

There are other issues too:

* The business / stakeholders, can't agree on what data is needed and it is
  constantly changing
* The current schema has several tables but are just filled with CLOBs that
  store JSON
* A lot of the data the business wants presented would be hard to store in a
  normalised format
* The frontend is making multiple API calls from different services and
  assembling a document in redux / local storage. This is an antipattern!
* Business priorities are changing every sprint / product owner is getting arm
  twisted to add new "fields" , so that department X can get more out of this
  app
* The relational model could cause added maintenance and refactoring when new
  changes come around

However I have concerns against suggesting this:

* I am in a junior heavy team, with noobies
* Seniors are scared of new practices.
* I don't know how easy it is to do things like business reporting etc.
* It would mean a heavy refactor of the internal models
* How do I transplant existing data on production into a new DB technology?
* I have very little experience in NoSQL

Looking for people's thoughts or suggestions.
======
fiedzia
> The business / stakeholders, can't agree on what data is needed and it is
> constantly changing

You are about to spread this chaos everywhere. If they can't decide on what to
store and how, nobody will be able to make sense of that, create useful
reports or maintain it. Things will fall apart constantly and everywhere.

> product owner is getting arm twisted to add new "fields" , so that
> department X can get more out of this app

Why is that a problem? Using a relational db does not mean that adding a new
field should be a big effort or require any ceremony. If they want it and it
makes sense, they should get it on the same day.
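
To illustrate how small that change can be, here is a minimal sketch using
Python's built-in sqlite3 module as a stand-in for the real database. The
`orders` table and the `cost_centre` field are hypothetical names, not taken
from the system described in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL)")
conn.execute("INSERT INTO orders (customer) VALUES ('ACME')")

# Department X asks for a new field: a single additive statement.
conn.execute("ALTER TABLE orders ADD COLUMN cost_centre TEXT")

# Existing rows are untouched and simply read NULL for the new column.
old_row = conn.execute("SELECT id, customer, cost_centre FROM orders").fetchone()

# New rows can start filling the field in straight away.
conn.execute("INSERT INTO orders (customer, cost_centre) VALUES ('Globex', 'CC-42')")
new_row = conn.execute(
    "SELECT customer, cost_centre FROM orders WHERE customer = 'Globex'"
).fetchone()
```

Queries written before the change keep working because nothing they depended
on was renamed or removed.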

> The frontend is making multiple API calls from different services and
> assembling a document in redux / local storage. This is an antipattern!

It is, but it has nothing to do with data storage. Looks like nobody is owning
backend and data, so frontend does what they can to get the job done. It is
not their fault.

> The relational model could cause added maintenance and refactoring when new
> changes come around

A non-relational model adds a lot more maintenance and refactoring if you want
quality. It is only easier if nobody cares about tomorrow.

> The current schema has several tables but are just filled with CLOBs that
> store JSON

Nothing you said indicates that you need a non-relational database, only that
you have non-relational data. You can keep it in the db as JSON, as before,
with the option of moving values into separate fields as the definition
matures, and you keep the ability to use transactions.
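
A rough sketch of what "moving to separate fields" could look like, again with
Python's sqlite3 as a stand-in. The `claims` table, its `payload` column and
the `status` field are assumptions made up for the example. SQLite lets the
ALTER TABLE take part in the transaction; some databases, MySQL among them,
commit implicitly on DDL, in which case only the backfill step is
transactional.

```python
import json
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are exactly the ones written out by hand.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO claims (payload) VALUES (?)",
    [
        (json.dumps({"status": "open", "notes": "call back"}),),
        (json.dumps({"status": "closed"}),),
    ],
)

def promote_status(conn):
    """Move payload['status'] into a real column, all or nothing."""
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE claims ADD COLUMN status TEXT")
        rows = conn.execute("SELECT id, payload FROM claims").fetchall()
        for row_id, payload in rows:
            doc = json.loads(payload)
            status = doc.pop("status", None)  # rows without the key get NULL
            conn.execute(
                "UPDATE claims SET status = ?, payload = ? WHERE id = ?",
                (status, json.dumps(doc), row_id),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise

promote_status(conn)
result = conn.execute("SELECT status, payload FROM claims ORDER BY id").fetchall()
```

Whatever has not settled yet stays in the JSON payload; only the field that
has matured becomes a column.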

> * I am in a junior heavy team, with noobies
> * Seniors are scared of new practices.

It's a mess, but it looks like you are not in a position to do much about it.
You can't fix organizational problems with technology. The main decision to
make is whether you want to stay there or leave for a better place.

~~~
mothsonasloth
I just joined the company and I am broke after taking a long break, so leaving
the company anytime soon is not an option.

The juniors on my team are from a coding camp, so I can't really expect them
to be of much help at the moment.

I will take ownership of this and be pushing back, because I am not afraid to
say no after many years of working in development.

After doing some research and reading the feedback from this post, I will be
refactoring the MySQL schema and the Hibernate models in the Java service.

I will also make use of the JSON data type instead of CLOB so I can actually
query some dynamic data.
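
For anyone following along, a small sketch of what querying inside JSON looks
like. It uses SQLite's `json_extract` through Python's sqlite3 module (this
needs an SQLite build with the JSON functions, which most current builds
have). MySQL exposes a similar `JSON_EXTRACT` function, though its typing and
indexing details differ. The `sheet_rows` table and its fields are
hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sheet_rows (id INTEGER PRIMARY KEY, data TEXT NOT NULL)")
rows = [
    {"region": "north", "amount": 120},
    {"region": "south", "amount": 75, "approved_by": "dept-x"},
    {"region": "north", "amount": 30},
]
conn.executemany(
    "INSERT INTO sheet_rows (data) VALUES (?)",
    [(json.dumps(r),) for r in rows],
)

# Filter and aggregate on fields that live inside the JSON document.
total_north = conn.execute(
    "SELECT SUM(json_extract(data, '$.amount')) FROM sheet_rows "
    "WHERE json_extract(data, '$.region') = ?",
    ("north",),
).fetchone()[0]

# Fields that only some rows carry come back as NULL for the others.
approvers = conn.execute(
    "SELECT json_extract(data, '$.approved_by') FROM sheet_rows ORDER BY id"
).fetchall()
```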

------
osullivj
I've built large systems with RethinkDB, and proprietary NoSQL DBs. My
experience is this: if you need to query any significant volume of data, you
must have indexes. And once you start figuring out the indexes, you're
designing a schema. So it's best not to shy away from schema design up front,
and evolution during dev. Otherwise you risk very poor performance. On the
requirements front: dialog with management and users often degenerates into
handwaving. A concrete prototype can help clarify IMHO. Suggest you break out
one small piece of the legacy system and prototype a new solution. Disarm the
scared seniors by emphasizing that it's throwaway. When the users see
something new, shiny and better you'll have the political backing to face down
opposing noobs and seniors. That's a double edged sword, as you'll also have
the pressure to deliver. Good luck!
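
To make the point about indexes concrete, here is a minimal sketch (SQLite via
Python's sqlite3, with a made-up `docs` table and `customer_id` field). The
data is stored as schemaless JSON, but the moment a query has to be fast you
end up naming a field and committing to it. MySQL would typically reach the
same place with an indexed generated column.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [(json.dumps({"customer_id": i % 100, "total": i}),) for i in range(1000)],
)

QUERY = "SELECT id FROM docs WHERE json_extract(body, '$.customer_id') = ?"

def plan(conn):
    """Return the human-readable steps of the query plan for QUERY."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (7,))]

before = plan(conn)  # no index yet, so the planner scans the whole table

# Creating the index is a schema decision: customer_id exists and is queryable.
conn.execute(
    "CREATE INDEX idx_docs_customer ON docs (json_extract(body, '$.customer_id'))"
)
after = plan(conn)  # the planner can now search via idx_docs_customer
```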

------
geophile
This sounds like a management problem more than a technical problem.
Requirements need to settle down a bit so that you can get some work done. A
schemaless approach does make schema changes easier, but woe is you if you
have data in your database from several different versions of the schema. I
don't see how you can make any guarantees about query correctness in such a
situation. I.e., your business reports will be buggy. Not to mention, hard to
write without SQL.
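
A small, purely illustrative sketch of that failure mode in Python. The three
record shapes and all field names are invented for the example; the point is
only that a report written against the newest shape quietly miscounts older
documents unless every reader normalises them first.

```python
# Three generations of the "same" record sitting side by side in a
# schemaless store. All field names are hypothetical.
stored = [
    {"name": "Ann Lee", "amount": "120.50"},          # v1: untagged, amount as text
    {"v": 2, "name": "Bo Chen", "amount": 80.0},      # v2: numeric amount
    {"v": 3, "first": "Cy", "last": "Diaz", "amount_pence": 4599},  # v3
]

def to_current(doc):
    """Upgrade a document of any known version to the v3 shape."""
    version = doc.get("v", 1)
    if version == 1:
        doc = {"v": 2, "name": doc["name"], "amount": float(doc["amount"])}
        version = 2
    if version == 2:
        first, _, last = doc["name"].partition(" ")
        doc = {
            "v": 3,
            "first": first,
            "last": last,
            "amount_pence": round(doc["amount"] * 100),
        }
        version = 3
    if version != 3:
        raise ValueError(f"unknown schema version: {version}")
    return doc

# A report that assumes the newest shape silently ignores older records...
naive_total = sum(d.get("amount_pence", 0) for d in stored)
# ...while normalising first gives the intended figure.
correct_total = sum(to_current(d)["amount_pence"] for d in stored)
```

The naive total raises no error, which is what makes this kind of bug hard to
notice in a business report.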

I don't buy your statement that "a lot of the data the business wants
presented would be hard to store in a normalised format", at least not without
some justification. Also, data to be presented may be structured differently
from data as stored. Forget the presentation, is there any reason why the data
model itself isn't normalizable? (Not talking about tweaks to the model due to
changing requirements. Pick some version -- is that normalizable?)

~~~
mothsonasloth
It is a management problem. I have been pushing back since I joined the team
but the problem is that there are so many execs and people screaming for this
app to be released.

As senior developer on the team I am having to fight bad software practices
and a "just ship it" mentality.

In a way I am panicked because I don't know what the product owners want and
they in turn don't know what the business wants.

Therefore I am trying to engineer against management's incompetence and I
don't see that being fixed anytime soon.

~~~
geophile
The top-level manager on the engineering side needs to be pushing back, saying
"Guys, this isn't going to ship if the requirements keep changing." If he
isn't doing that, you are in a tough spot. It sucks to work on such a project,
and it sucks to go over his head, especially if you are junior. Maybe talk to
the senior engineers about the problem?

------
davismwfl
I'd think the correct solution is better project/product management, but as I
know that isn't always an option...

Overall, given the information, I am not sure NoSQL is the right path, but
there are some inherent flexibilities it gives you with a constantly evolving
data model. But that doesn't come for free, and your coding has to evolve to
support a dynamic data model which can be challenging on its own.

Simply based on your limited description, I'd avoid Cassandra, not because it
isn't awesome, but your use case doesn't seem to fit. Plus with mostly newbies
and skeptics it would most likely fail in implementation. Cassandra requires a
lot of design planning up front to get it right. It is not the most flexible
of the NoSQL alternatives in terms of dynamic data model and querying, but it
provides other trade offs that make it a great tool.

You can stick with SQL and just use a data migration toolset to help you
manage field updates etc. Set a policy for allowing additions but restrict
column removals etc. It has been done for years as this isn't a new problem,
just means you have lots more to manage and data migrations become a key
aspect, so data access code also needs to be isolated and flexible.
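
As a rough idea of what such a toolset does under the hood, here is a
deliberately tiny migration runner in Python with sqlite3. Real tools such as
Flyway or Liquibase handle this with far more care; the `policy` table, its
columns and the use of SQLite's `user_version` pragma to track progress are
all assumptions for the sketch.

```python
import sqlite3

# Ordered, append-only list of migrations. Policy as described above:
# additions are allowed, column removals are rejected.
MIGRATIONS = [
    "CREATE TABLE policy (id INTEGER PRIMARY KEY, holder TEXT NOT NULL)",
    "ALTER TABLE policy ADD COLUMN renewal_date TEXT",
    "ALTER TABLE policy ADD COLUMN broker_code TEXT",
]

def migrate(conn):
    """Apply migrations not yet recorded and return the schema version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS, start=1):
        if version <= current:
            continue  # already applied on an earlier run
        if "DROP COLUMN" in statement.upper():
            raise ValueError(f"migration {version} removes a column")
        conn.execute(statement)
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]

conn = sqlite3.connect(":memory:")
first_run = migrate(conn)
second_run = migrate(conn)  # nothing left to apply, so this is a no-op
columns = [row[1] for row in conn.execute("PRAGMA table_info(policy)")]
```

Each sprint's new "fields" become one more entry at the end of the list, which
keeps every environment on a known, numbered schema.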

Alternatively you could still use something like Postgres or MySQL and
properly use JSON storage and index it, which would let you have some benefits
of NoSQL without moving the entire app. For those tables you can define well,
do so in proper SQL; for those you can't, use JSON storage.

And of course you could move more to a full NoSQL solution using something
like Mongo. Mongo is solid if used properly, but again comes with some trade
offs, so you'd have to evaluate those.

Personally, depending on how much data exists already and how complex the app
is at this stage, I'd opt for pushing back on project management some and/or
moving to a more toolset driven SQL solution with planned data migrations as
part of your sprints. That plus make sure you add a flexible data access layer
to isolate the changes to the application as best as possible. If you or
someone on the team had more NoSQL experience or even a different team makeup
my recommendations might differ.
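
One possible shape for that data access layer, sketched in Python with
sqlite3. The `Record` and `RecordRepository` names, the `record` table and the
`extras` JSON column are all hypothetical; the idea is only that the rest of
the application talks to the repository and never sees SQL or column names.

```python
import json
import sqlite3
from dataclasses import dataclass, field

@dataclass
class Record:
    """What the application sees, regardless of how it is stored."""
    id: int
    title: str
    extras: dict = field(default_factory=dict)

class RecordRepository:
    """The only place that knows table names, columns and the JSON column."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS record ("
            "id INTEGER PRIMARY KEY, title TEXT NOT NULL, extras TEXT NOT NULL)"
        )

    def add(self, title, **extras):
        # Well-defined fields get columns; unsettled ones ride along as JSON.
        cur = self.conn.execute(
            "INSERT INTO record (title, extras) VALUES (?, ?)",
            (title, json.dumps(extras)),
        )
        self.conn.commit()
        return Record(cur.lastrowid, title, extras)

    def get(self, record_id):
        row = self.conn.execute(
            "SELECT id, title, extras FROM record WHERE id = ?", (record_id,)
        ).fetchone()
        if row is None:
            return None
        return Record(row[0], row[1], json.loads(row[2]))

repo = RecordRepository(sqlite3.connect(":memory:"))
saved = repo.add("Q3 forecast", owner="dept-x")
loaded = repo.get(saved.id)
```

If a field later moves from `extras` into its own column, only the repository
changes; callers keep using `Record` as before.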

