
Ask HN: Is it stupid to store GB's of data without a DBMS/DS/etc? - erik14th
I have a small website/personal project and very limited money.

I'm trying to work out a way to keep it alive and well for as long as possible while spending as little money as possible.

It basically gets data from an external API, transforms that data using some Node/Python scripts, and saves it as JSON files. Then I use the data in the JSON files to generate static HTML pages. Since I save the JSON files with the specific data I need to generate the pages (kinda like a view), there are no querying needs other than key/value.

I'm torn on whether to keep going with the file system or use a database/document store/whatever.

I'm running it on a small 512 MB DO VPS and I want to keep the costs as low as possible, so RAM and general system resource usage should be kept to a minimum.

I have a lot of data and most of it is kinda redundant, so it's not a big deal if some of it goes bad/is lost. So I don't really see an incentive to use a proper database/document store/etc. But it feels kinda wrong to deal with GBs of data all sorted in lil folders saved as JSON files. It may be just prejudice I got from reading too much stuff about big data, trendy DBs and such. But it also may be some technical debt I'm failing to see right now that may be a giant pain in the near future.
======
nunobrito
I'm doing this kind of thing for hundreds of billions of records stored across
hundreds of terabytes (our business is to generate fingerprints based on
binary files).

The added advantages:

- fewer failure points in case of data corruption
- any tool or any language can be used for parsing data
- easy to partition and move data elsewhere as needed
- grow as you see needed (only need disk space)
- dependable disk space, no caches or anything else needed

So it will depend on your context, but you are not alone if you decide to
follow that route. On our side we are happy with the approach; SQLite gets
slow when adding billions of records.

~~~
erik14th
Yes I see many of these advantages. I already use two different languages.

One of my fears is that at some point I might want to query the data, but then
I'd probably transform the data and create new "json views" for improved
performance anyway.

Thanks for sharing. If it's not a problem I'd like to know the company name.
Never heard of anyone doing this.

------
dublinclontarf
DBMS's generally bring a couple of things:

ACID

Queries

Unless you need either of these things, dumping onto the file system is
fine.

To ease your anxiety you could always use SQLite, which has JSON extensions
and simply writes to a file on the file system (which means no DBMS server to
run). This is only an option if access to SQLite is single threaded.
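
For illustration, here is a rough sketch of what the SQLite route could look like for plain key/value lookups, using only Python's standard `sqlite3` module. The table name `docs` and the helpers `open_store`, `put` and `get` are made up for this example and are not from the original post. The JSON is stored as plain text, so nothing here depends on the JSON extension being compiled in.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a single-file SQLite store. No server process is needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (key TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def put(conn, key, doc):
    """Insert or overwrite one JSON document under the given key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO docs (key, body) VALUES (?, ?)",
            (key, json.dumps(doc)),
        )


def get(conn, key):
    """Return the decoded document, or None if the key is absent."""
    row = conn.execute("SELECT body FROM docs WHERE key = ?", (key,)).fetchone()
    return json.loads(row[0]) if row else None
```

Passing a real file path instead of `":memory:"` keeps everything in one file on disk, which is the appeal here: one file to back up or move, and no daemon eating RAM on a small VPS.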

~~~
stray
SQLite is threadsafe:
[http://sqlite.org/threadsafe.html](http://sqlite.org/threadsafe.html)

~~~
dublinclontarf
Well, colour me surprised, it is. That's awesome.

------
stray
The file system _is_ a proper document store. You're doing just fine.
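
To make that concrete, here is a minimal sketch of a key/value JSON store on the plain file system. The helper names (`doc_path`, `write_doc`, `read_doc`) and the two-level hashed folder layout are assumptions for the example, not something the original poster described. Hashing the key spreads files over many small directories, and writing to a temp file followed by `os.replace` means a reader should never see a half-written JSON file.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def doc_path(root, key):
    """Map a key to root/ab/cd/<sha1>.json so no single folder gets huge."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return Path(root) / digest[:2] / digest[2:4] / f"{digest}.json"


def write_doc(root, key, doc):
    """Write the document to a temp file, then atomically swap it into place."""
    target = doc_path(root, key)
    target.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(doc, f)
        os.replace(tmp, target)  # same directory, so the rename is atomic
    except BaseException:
        os.unlink(tmp)  # do not leave a stray temp file behind
        raise


def read_doc(root, key):
    """Return the decoded document, or None if the key was never written."""
    try:
        with open(doc_path(root, key), encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return None
```

If the keys are already safe, well-distributed file names, the hashing step can be dropped and the key used directly as the path.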

------
detaro
As long as there is no pain point where you wish you had a database or are
starting to reimplement parts of one, you should be fine with flat files. If
you generate static files, parallel access, speed advantages, etc. are likely
not an issue either.

