
Json-Base – Database built as JSON files - davidbarker
https://github.com/Devs-Garden/jsonbase
======
bvinc
Someone at my old company basically did this and put it into production.

The first problem he encountered was that multiple connections couldn't both
be using the database at a time without clobbering each other. "No problem,"
he thought, "this is a good use case for microservices." A service sitting on
top would ensure that there was only one operation being performed at a time.

Next, his problem was that the database would get corrupt sometimes when
something bad happened in the middle of writing the file. His solution was to
put the entire JSON format inside of a JSON string. If it could be parsed
successfully, then he knew the whole file was written. Then all he needed were
"backup" files for each table, in case the current one was corrupt.

Next, his problem was that querying and iterating through a large table
performed badly, since it required parsing the entire thing first. Querying
several times required the whole file to be parsed every time. The solution
was to move SOME of the tables over to JSON-inside-SQLite.

EDIT: Oh yeah, the next problem was how to structure the data inside of
sqlite. He decided to make a single table called "kitchen_sink" that held
every JSON value. There was a column that said which "collection" it belonged
to. There was another column that represented the row's primary key. So you
could quickly query for a collection name, and a primary key, and get the full
JSON row.

So the next problem was that you couldn't query quickly for things that
weren't the primary key. So new columns had to be added called "opt_key1" and
"opt_key2" where certain rows could put key values, and indexes could be added
on those columns, so you could quickly query by its first optional key or its
second optional key.
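
Reconstructed from the description, the schema might have looked something like this (the exact column types and index names here are guesses, not the actual code):

```javascript
// Hypothetical reconstruction of the single-table design described above.
const createKitchenSink = `
  CREATE TABLE kitchen_sink (
    collection TEXT NOT NULL,  -- which logical "table" the row belongs to
    pk         TEXT NOT NULL,  -- the row's primary key within that collection
    opt_key1   TEXT,           -- optional secondary keys, indexed so
    opt_key2   TEXT,           -- non-primary-key lookups avoid a full scan
    doc        TEXT NOT NULL,  -- the full JSON row
    PRIMARY KEY (collection, pk)
  )`;

const createOptIndexes = `
  CREATE INDEX idx_opt1 ON kitchen_sink (collection, opt_key1);
  CREATE INDEX idx_opt2 ON kitchen_sink (collection, opt_key2)`;

// Point lookup by collection + primary key, parameterized.
const byPk = 'SELECT doc FROM kitchen_sink WHERE collection = ? AND pk = ?';
```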

~~~
boynamedsue
It's easy to get a little laugh from this, but congrats to the guy for
exploring. Now he knows first-hand the inordinate challenge and can describe
it in detail, but more importantly avoid these hard-learned patterns later.

~~~
xienze
> but more importantly avoid these hard-learned patterns later.

Depends, I’ve known people who have gone through similar experiences and still
poo-poo all those “unnecessarily bloated” solutions like a proper database.

~~~
colecut
I make crappy thrown together frontends and will probably forever poo-poo the
'unnecessarily bloated' js frameworks

------
chmod775
You definitely shouldn't use this in production.

Looking at the code there's:

- race conditions everywhere.

- bad and inconsistent formatting, which doesn't help with readability.

- huge if-else monstrosities.

- synchronous IO and asynchronous IO used seemingly at random.

- liberal use of try-catch that doesn't check the caught errors and just
re-tries blindly forever in some cases.

If you do any parallel updates/inserts/removals with this "database" you're
pretty much guaranteed to lose data. Updates are essentially: 1. read table,
2. make changes, 3. save table. Which at least would work if it was all
synchronous.

I know this is going to sound harsh, but building databases is hard for even
the most experienced coders, and whoever wrote this is clearly at the other
end of that spectrum.

[https://github.com/Devs-Garden/jsonbase/blob/master/tables.js#L1664](https://github.com/Devs-Garden/jsonbase/blob/master/tables.js#L1664)

~~~
adamredwoods
They're using readFileSync? Doesn't this lock up the thread?

~~~
mbreese
It’s been a while since I’ve looked at JavaScript like this, but I would guess
that the thread lock would only exist so long as the read() call is running.
That returns and then the write is initiated. It doesn’t look like there is a
lock that exists throughout the update callback. If that’s missing, then if
you had two threads/processes operating on the same table, you’d all but
guarantee one of those update calls would be lost (in the best case scenario).

Or maybe there's a different locking mechanism in place that my cursory look
missed?
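
For what it's worth, a common single-process fix for that lost-update pattern is to serialize mutations per table through a promise chain. This is a sketch of the general technique, not code from this library, and it doesn't help if two separate processes open the same files:

```javascript
// One promise chain per table name; each mutation waits for the
// previous one to settle before it starts.
const chains = new Map();

function withTable(name, fn) {
  const prev = chains.get(name) || Promise.resolve();
  const next = prev.then(() => fn());
  // Keep the chain alive even if fn rejects.
  chains.set(name, next.catch(() => {}));
  return next;
}
```

Two concurrent updates routed through `withTable` run one after the other instead of interleaving their read and write steps.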

------
russnewcomer
I did something vaguely similar to this recently, and I still maintain it was
a good choice.

I volunteered to write a medical visit recording app for an NGO in a
developing country (a friend works with the NGO and asked me if I would help),
and they have almost no budget, no guarantees of internet connectivity when
their folks are in the field, and the likelihood that they may be using this
software for years.

So I wrote a C# app that uses Winforms, and stores all data as JSON files, the
'table' structure is basically directories in the file system.

It lets them share visit files by importing/exporting a zip file of the JSON
via sneakernet USB drives [super naive, last record written wins], does not
rely on
an internet connection anywhere at all ever, and all files are stored in plain
JSON so that they can conceivably in the future do some data analysis on it.
Their alternate plan was to continue using paper, or some terrible regular
reconciliation of excel spreadsheets.

Having said all that and defending my decision on this single use-basically-I-
wanted-to-have-independent-JSON-instead-of-SQLite-so-in-the-future-maybe-have-
a-web-function-to-sync solution,

This feels like a different use case.

~~~
jaspax
I strongly approve of this sort of thing. Using dead-simple and human-readable
formats is a big win for things like this, even if it isn't architecturally
"correct". It sounds like your decision was a good one for the use case you
were looking at.

~~~
sangfroid_bio
It is great until your charity gets acquihired by a big think tank/bigger
charity/international aid group and most doctors/charity operators are not
known for their talent at scaling software.

~~~
russnewcomer
In this case, this NGO is not going to get acquihired. It's more likely that
I'll get an email in 5 years from someone who I don't recognize asking me if I
know anything about this program because my email has been attached to this
thing they got gifted from a dead project, and have been using after all of
the original people have moved on. :)

------
sanqui
Sounds like how MongoDB was born. I believe that has made a lot of people very
angry and been widely regarded as a bad move.

~~~
kevsim
Was my first thought as well. Mongo stores things as BSON rather than JSON but
that feels more or less like an implementation detail.

------
rubyn00bie
I'm not trying to be a huge asshole here, but this has zero tests and just
saves json files to disk. There's literally a full readme and contributors
guide but zero tests for something that's supposed to store data for you?

For a community that loves it some Jepsen analysis, I can't for the life of me
figure out why this has been up-voted so many times. This is just saving JSON
files to disk. I'd argue this is harder than using Redis (flushing to disk) or
(vomits in mouth) Mongo. Or shit, just use your filesystem and `jq`; you'll
have something likely faster, safer, and more maintainable.

~~~
identity0
What are you talking about? They have tests right here:
[https://github.com/Devs-Garden/jsonbase/blob/master/test.js](https://github.com/Devs-Garden/jsonbase/blob/master/test.js)

Just uncomment the tests you want to run! Easier than using a testing library
IMO.

~~~
choward
I can't tell if you're being serious but one of the main features of tests is
they are automated. They should be able to run as part of the build.

And just because you're not using a library doesn't mean you shouldn't have
assertions. All these "tests" do is log. What do I check the output for?

All you need to do to have a somewhat respectable build is uncomment those
tests, make them clean up after themselves, change the console logging to be
assertions instead, and make them run on GitHub.

------
jaywalk
It's all fun and games until this ends up in somebody's production
environment.

~~~
alecco
And there's never a serious consequence to the ones who did it. By then they
switched to a new position somewhere else. Like most prima donnas. And this
shows that deep, deep down they know they are fake.

~~~
mongojunction
That sounds like it comes from some specific experience you had... but it's
pretty uncalled for to apply it so confidently to someone you don't know.
Don't be mean, right?

Actually, another view is that there's nothing wrong with tinkering and DIY.
Perl, JS, Redis all came from people hacking their own solutions (as far as I
know).

Also, many big software orgs build extensive internal tools themselves.

Plus, making your own stuff is a lot of fun. You should try it sometime (if
you haven't already) :)

~~~
jfkebwjsbx
There is nothing bad about writing your own solutions.

What is bad is putting them in production when you don't have a clue about the
domain.

~~~
mongojunction
Being ignorant doesn't make you a prima donna though, as the comment above says.

Also, they have to have some clue about the domain, because the domain is
their own problem and they're writing a solution for it. So I don't think we
can really judge someone as not having any clue about their own engineering
challenges... especially if they're working out solutions to them...

Antirez said literally he didn't know about existing solutions when he went to
write redis, and he and redis are awesome. nothing bad about that

but I get your point that bad solutions are bad. That's sort of a tautology
though, it doesn't add much value, and who are we to judge someone else's
solutions as bad when we don't know everything about their use case?

Again... even if we can say that choosing someone else's technology for your
problem is not a good solution, we can't criticize the author, because what
you choose is your responsibility. So I just don't think it's valid to
criticize the author.

~~~
jfkebwjsbx
> they have to have some clue about the domain, because the domain is their
> own problem

They can be lifelong experts on their problem, yet have no clue about writing
a database engine and low-level programming in general.

> Antirez said literally he didn't know about existing solutions when he went
> to write redis

Nobody is born with knowledge. The difference is that Antirez studied previous
solutions, studied how to do it, and then applied that knowledge right.

Instead, that person did the equivalent of building a bridge disregarding
everything humans learnt about it since the Roman empire. It will not be a
surprise if the bridge ends up collapsing.

------
tobr
So much pessimism! Not sure if the author is here, but it would be interesting
to hear what makes this different from, say, Lowdb.

Also, the writing in the README feels sloppy, which doesn’t inspire
confidence. For example, you might want to decide if it’s called jsonbase,
JSON-base, JSONBASe, Json-Base, JSON-Base, json-base or jsonDB.

~~~
teej
If the author doesn’t provide any explanation of why this exists or what
motivated them to create it, what am I supposed to assume?

They’ve called it a database. They have said explicitly: “You can use this as
a backend for your ReST APIs.” But it doesn’t meet the table stakes for a
database and encouraging folks to use it in a production environment is
actively harmful.

I wish more folks were up front with the trade offs they make. I respect an
OSS author a lot more when they are honest and upfront with what a thing is
good at and where trade offs have been made.

When I don’t see that, I assume that either the author doesn’t know/care (red
flag) or they can’t be bothered (annoying).

~~~
zachrip
I don't believe the OP of this post is the same person as the author of the
library. Someone publishing a project isn't harmful, you don't have to use it.
If someone uses it and gets burned that is their fault, not the author's. If
you're making this project a dep, it's your job to vet it, especially if it's
a database. Just because something is OSS doesn't mean it needs to be some
polished stone that meets your standards.

~~~
teej
I agree with you. For those same reasons it’s reasonable for HN commenters to
be “pessimistic” about a library with no track record and no discernible take
on why it deserves to be production ready.

~~~
zachrip
The project is 2 months old and they say nothing about its readiness for
production, simply that it _could_ be used for a REST backend or similar. They
also say that it could be used for a quick PoC. I'm not quite understanding
how either of those claims are wrong. Why are people torching a young project
that someone is releasing publicly for free? Again, if you don't want to use
the project, nobody is forcing you to.

------
MH15
I wrote some code to do this in high school when I couldn't figure out SQL.
Then I learned why databases exist.

------
jarym
Whatever you try to do with JSON has probably been tried before with XML,
including XML databases.

Now I’ll accept that XML databases have their uses (especially if they involve
storing and transforming third-party XML), but I can’t think of any good use
for this when there are SO many better options.

~~~
mbreese
Having lived through XML databases and build systems, it’s
sad/funny/interesting(?) to see this all play out again.

This has all happened before, and it will all happen again.

~~~
toyg
Same here! I tried pretty hard at the time to work with XML databases. In the
end, SQL is just more practical in most circumstances, and easier to reason
about. The same will likely happen with this sort of effort.

------
time0ut
I could see some niche uses for this. Anywhere you want a quick and dirty
local db for demos and hacking. That said, I think you'll get more mileage out
of SQLite. It is generally my go to for these use cases and far richer and
more powerful.

------
syastrov
Back in the day, we used to use CSV for this :)

------
techntoke
This is what happens when someone with basically little technical experience
joins a JavaScript coding school and has to build something in order to
graduate.

------
tushonka
Aaah the beginners project to store info without having an actual database.

~~~
pstuart
SQLite with the JSON extension works well.

~~~
tushonka
Who has time to learn SQL when you can just read/write JS objects from/to
files. ;)

------
winrid
I work on a system with a JSONB column with 11k unique paths and no schema. I
created some tools to generate JSDoc, which is a start, but please, no.

~~~
beamatronic
Are there tools which will scan a body of JSON and generate all the unique
paths? Is it common to do this?

~~~
winrid
It's pretty easy to do. Also, a common interview question. :)
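
One sketch of such a scanner, for the curious (array elements are collapsed under a `[]` marker so repeated items map to one path):

```javascript
// Collect every unique leaf path in a JSON value, e.g. "friends[].name".
function jsonPaths(value, prefix = '', out = new Set()) {
  if (Array.isArray(value)) {
    for (const item of value) jsonPaths(item, prefix + '[]', out);
  } else if (value !== null && typeof value === 'object') {
    for (const [key, v] of Object.entries(value)) {
      jsonPaths(v, prefix ? `${prefix}.${key}` : key, out);
    }
  } else {
    out.add(prefix); // leaf value: record the full path to it
  }
  return out;
}

// jsonPaths({ name: "Sam", friends: [{ name: "Tom" }] })
// → Set { "name", "friends[].name" }
```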

------
smt88
This is both more effort and a worse outcome than just using Postgres with
JSONB data types.

~~~
boring_twenties
I had the same thought, especially nowadays when you can just type `docker run
--name mydb postgres` and wait like 3 minutes.

------
waltpad
That's a very strange idea: JSON is basically structured data, like XML; it's
nice for documents with deeply nested structures.

The main issue is that, unlike in a DB, any modification will shift everything
after it, so any indexing will have to be corrected. I suppose that if the
document is not stored as is, but instead broken up into pages (filesystems
are likely doing that already, so piggybacking on that could help), then
indexing could be improved, but then the storage starts to look like a regular
DB rather than JSON.

Interesting nonetheless, time will tell.

------
slifin
Nesting in your primary data store is an anti-pattern for information
workloads where you need to ask questions against your data

Consider:

    
    
        {
          "name": "Sam",
          "age": 12,
          "friends": [
            {
              "name": "Tom",
              "age": 15
            }
          ]
        }
    
    

Attributes like name and age are properties of a person entity, when placed in
a JSON hierarchy something else is happening, the one dimensional relationship
the things have between each other is also being saved into the structure

That's dangerous because relationships should be formed on read, not on write;
otherwise you set all future reads in concrete towards whatever the structure
was on write, and if you're particularly sloppy the data gets duplicated,
which is even worse.

The solution is to normalise your data store and use relational algebra to
reify relationships at runtime

The problem with mainstream databases is that they don't force normalisation,
and automatic indexing plus attribute-level normalisation is unworkable
performance-wise, so in most teams this doesn't work. This idea does work,
though; if you want to try it out, learn Datomic.
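
Normalising the example above in plain JavaScript might look like this (a sketch of the idea only, not Datomic): store flat entities plus a separate relation, and reify the friendship at read time.

```javascript
// Flat entity rows: each person exists exactly once, with no nesting.
const people = [
  { id: 1, name: 'Sam', age: 12 },
  { id: 2, name: 'Tom', age: 15 },
];

// The relationship lives in its own relation, not inside either person.
const friendships = [{ personId: 1, friendId: 2 }];

// The relationship is formed on read, not baked into the stored shape.
function friendsOf(personId) {
  return friendships
    .filter(f => f.personId === personId)
    .map(f => people.find(p => p.id === f.friendId));
}
```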

------
issa
I don't mean this negatively, but I'm having trouble thinking of a real-world
use case for this. Can anyone add some?

~~~
bradstewart
Looks a bit like
[https://github.com/typicode/lowdb](https://github.com/typicode/lowdb), which
I've used for a few production-ish usecases in the past.

Specifically for storing the results/state of a set of manually-executed
management scripts. The scripts needed to query the data from previous
executions, do some stuff, and store the output. Think a poor man's version of
Terraform.

Everything was dumped in a git repo that was shared across a few people. It
was a quick and dirty solution to manage some alpha customers before the
"real" system came online.

~~~
randtrain34
Yeah lowdb or even [https://github.com/LokiJS-Forge/LokiDB](https://github.com/LokiJS-Forge/LokiDB) will probably be much
more performant/more featureful.

------
neilobremski
Didn't everyone do this at some point? Sort of like how everyone that started
C++ in the 90s rolled their own string class. I remember doing this back
before "JSON" had even found its acronym (shaking rake ... get off my lawn!)

------
slantyyz
This reminds me a lot of LokiJS[1], which, if I recall correctly, could
optionally save the data as JSON file(s).

[1] [https://github.com/techfort/LokiJS](https://github.com/techfort/LokiJS)

~~~
waltpad
I think that, oddly enough, the point of that project is to use the JSON
format as the DB storage format, not as an export option. Just from the look
of it (I don't know either project), LokiJS will very likely always be
faster.

~~~
slantyyz
Sorry for the confusion, by "save as", I didn't mean export.

LokiJS has multiple persistence options, with JSON files in the filesystem
being just one of them.

Alternatively, you could also just use it in-memory or with IndexedDB.

~~~
waltpad
Oh, I understand. I suppose it makes sense, if for instance one needs to store
a bunch of parameters somewhere, it might as well be a JSON file.

------
maxpert
I once built a toy document store using SQLite and Python using an almost
similar idea: [https://maxpert.tumblr.com/post/47494540287/a-document-store-with-sqlite3-and-python](https://maxpert.tumblr.com/post/47494540287/a-document-store-with-sqlite3-and-python)

If done correctly, the advantages of that approach IMHO are:

- ACID (powered by SQLite)

- Complex and efficient indexes (powered by SQLite)

- A CouchDB-like API

I've been playing around with Rust recently; maybe I will do a simple
implementation in Rust, which will keep it memory safe and efficient.

------
randtrain34
Why would someone use this over
[https://github.com/typicode/lowdb](https://github.com/typicode/lowdb) ?

------
coding123
There are many use-cases for this for super simple ma-pa shops. It may be
relegated 100% to shopping carts and checkout processes, but this would allow
one to hand-edit her database, and even potentially let the server git push
live changes... kinda interesting concept.

------
aabbcc1241
I went further, to log the logical changes of state as json files. (aka
command sourcing)

To reduce the IO overhead, I batch multiple json values into a larger file.

I don't need random access because I'll replay all the changes when the server
starts.
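
A minimal sketch of that replay loop (the command shapes and names here are illustrative, not the poster's actual library):

```javascript
// Apply a single logged command to the in-memory state.
function applyCommand(state, cmd) {
  switch (cmd.type) {
    case 'set':    return { ...state, [cmd.key]: cmd.value };
    case 'delete': { const { [cmd.key]: _, ...rest } = state; return rest; }
    default:       return state; // ignore unknown commands
  }
}

// Replaying the whole log reproduces the current state on startup,
// so no random access into the batched files is needed.
function replay(logLines) {
  return logLines
    .map(line => JSON.parse(line))
    .reduce(applyCommand, {});
}
```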

Going to open source the library soon.

~~~
jarofgreen
I'm also working on a similar system - a JSON datastore with a event stream so
there is a full history of changes.

However we are using Postgres as a backend.

Our code is already technically Open Source, but it's not particularly stable
and not documented yet, so I won't link it.

~~~
aabbcc1241
I can see why people prefer an SQL database over plain files, for better
edge-case handling. But that requires more configuration and resources.

------
bullen
I also made a distributed JSON database over HTTP; it's been running live with
250,000 customers since 2016 without a reboot:
[http://root.rupy.se](http://root.rupy.se)

------
deft
This is useful for a lot of projects. I've used a similar db library in a rust
project I was working on for an example application. This way there are no
heavy dependencies or even the need to say "you need sqlite".

------
xrd
I would prefer to use pouchdb as an in memory database. Then, if I outgrow
that, use the pouchdb server. Then scale up to couchdb if I need that. It's
much better architected and your front end code never has to change.

------
jordic
Minio also does this... Haha [https://docs.min.io/docs/minio-select-api-quickstart-guide.html](https://docs.min.io/docs/minio-select-api-quickstart-guide.html)

------
mkl95
This brings me back sweet memories of saving my Pygame minigames state to
clunky JSON files when I was basically clueless about databases. It worked
surprisingly well until it became a huge spaghetti mess :-)

------
jmull
Seems like it should use standard callbacks and/or standard Promises for
responses.

That's probably beside the point, though. Hopefully no one ever uses this so
the non-standard async pattern doesn't matter.

------
laanfor
JSON file as backend, so can I directly edit the JSON file?

~~~
hedora
BIND does something like this (but not with JSON).

You have to run a “freeze” command before editing the database directly (so it
can flush the current version of the database, and redirect writes to memory +
log), and then “thaw” so it can read your changes and apply the log of updates
to it.

------
drbojingle
Could be good for things that are often read and rarely written. As soon as
multiple people try to update the same file, though, just use an established
db.

------
StreamBright
But why?

------
CogentHedgehog
This reminds me of XML Database from back in the day... except JSON instead of
XML.

------
tobyhinloopen
Makes me think of how many Minecraft server plugins use YAML files as a
database

------
globular-toast
What a stupid idea. SQLite exists and you should use it. I took over a
codebase at work that contained an ad hoc implementation of something like
this and it was surely the most unprofessional thing I'd ever seen. What are
they teaching kids in university these days?

------
trollied
Just learn SQL. Please!

