
JSONlite – A self-contained, serverless, zero-configuration, JSON document store - nodesocket
https://github.com/nodesocket/jsonlite
======
coleifer
If you want a real self-contained, serverless, zero-config JSON document
store, try UnQLite or the new SQLite JSON extension. I've written about both
of them on my blog if you're curious:

* [http://charlesleifer.com/blog/introduction-to-the-fast-new-u...](http://charlesleifer.com/blog/introduction-to-the-fast-new-unqlite-python-bindings/)

* [http://charlesleifer.com/blog/using-the-sqlite-json-extensio...](http://charlesleifer.com/blog/using-the-sqlite-json-extension-with-python/)
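
As an illustration of the second approach, here is a minimal Python sketch (assuming a SQLite build with the JSON1 extension compiled in, which modern Python's bundled sqlite3 has):

```python
import sqlite3

# Minimal sketch: using SQLite's JSON1 extension as a queryable JSON
# document store. Table and column names are illustrative.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
con.execute("INSERT INTO docs (body) VALUES (?)",
            ('{"name": "huey", "kind": "cat"}',))

# json_extract lets you query and project individual fields.
row = con.execute(
    "SELECT json_extract(body, '$.name') "
    "FROM docs WHERE json_extract(body, '$.kind') = 'cat'"
).fetchone()
print(row[0])  # -> huey
```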

~~~
NelsonMinar
That JSON1 extension for SQLite looks very promising. Thanks for the writeup!

------
sqrt17
There are several use cases where this is a sure-fire way of shooting yourself
in the foot:

* if you have many records (say, more than a few hundred), the file system has a lot of work to do and the whole thing becomes sluggish

* if you want to query the data by content, there's nothing that gives you sublinear search capability here

* modifying data is not easy under this scheme. If you add that functionality, you'll have the familiar choice between race conditions and added complexity. Having said that, if you don't modify the data, you can also drop the store and just use the data itself (maybe encrypted if you need that) instead of the key.

As an alternative to this, consider each process appending to a file and
keeping filename+offset as the identifier for a particular record. This solves
at least the "too many files" problem.
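
A minimal Python sketch of that append-only scheme (the file name and one-JSON-document-per-line layout are illustrative choices):

```python
import json
import os

# Each record is one JSON line appended to a log file;
# (filename, offset) identifies a particular record.
def append_record(path, doc):
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    with open(path, "ab") as f:
        f.write((json.dumps(doc) + "\n").encode("utf-8"))
    return (path, offset)

def read_record(path, offset):
    with open(path, "rb") as f:
        f.seek(offset)
        return json.loads(f.readline().decode("utf-8"))

key = append_record("records.log", {"user": "alice", "score": 10})
print(read_record(*key))  # -> {'user': 'alice', 'score': 10}
```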

Or, if you only need to read a static collection, put your JSON (or some moral
equivalent, e.g. msgpack) into a CDB database:
[http://cr.yp.to/cdb.html](http://cr.yp.to/cdb.html)

Next step up: use LevelDB, or KyotoCabinet/KyotoTycoon to organize the
storage.

~~~
michaelmior
Whether a large number of files becomes sluggish really depends on your file
system. In any case, a common technique is to break large numbers of files
into subfolders, which usually does a reasonable job of solving this problem.
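
A quick Python sketch of that subfolder technique, fanning files out across directories derived from a hash of the key (the depth and naming scheme are arbitrary choices):

```python
import hashlib
import os

# Derive a nested directory path from a hash of the key so that no
# single directory accumulates too many entries.
def doc_path(root, key, depth=2):
    h = hashlib.sha1(key.encode("utf-8")).hexdigest()
    parts = [h[i] for i in range(depth)]  # e.g. root/a/b/<hash>.json
    return os.path.join(root, *parts, h + ".json")

p = doc_path("store", "user:42")
os.makedirs(os.path.dirname(p), exist_ok=True)
```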

As for updating, flock[0] solves this issue on operating systems that support
it.

[0] [http://linux.die.net/man/2/flock](http://linux.die.net/man/2/flock)
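
A sketch of what that looks like from Python, via the stdlib fcntl module's flock wrapper (POSIX only; the lock is advisory, so it only helps if every writer takes it; the file name is illustrative):

```python
import fcntl
import json

# Serialize read-modify-write updates to a JSON document file with flock(2).
def update_doc(path, mutate):
    with open(path, "r+", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until we hold the exclusive lock
        try:
            doc = json.load(f)
            mutate(doc)
            f.seek(0)
            json.dump(doc, f)
            f.truncate()  # the new document may be shorter than the old one
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

with open("doc.json", "w", encoding="utf-8") as f:
    json.dump({"count": 0}, f)
update_doc("doc.json", lambda d: d.update(count=d["count"] + 1))
```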

~~~
sqrt17
Usenet news and maildir are cases where current operating systems already have
to cope with that kind of load, so it's definitely possible.

The question is whether this can be useful without becoming a partial,
bug-ridden reimplementation of a NoSQL database, given that we already have
NoSQL databases that fit the bill and carry lower maintenance costs than a
spit-and-glue solution.

------
NelsonMinar
I've been wanting something like SQLite for JSON for a while now. Or
alternatively, something like Mongo but without the server process.

The challenge is the query system: finding documents again. JSONlite doesn't
have one (yet) other than retrieving documents by UUID. There's been some
work to make jq usable as a library, which seems like a good basis for JSON
queries.
[https://github.com/stedolan/jq/wiki/C-API:-libjq](https://github.com/stedolan/jq/wiki/C-API:-libjq)

~~~
chrismanning
EJDB [0] is pretty good and seems to be actively developed again after a
period of inactivity. It uses BSON rather than JSON directly but close enough,
and the queries are modelled after Mongo too.

I made a C++14 wrapper [1] a while ago which will probably still work, unless
EJDB itself has had some drastic API changes. (I also have a C++14/1y
BSON/JSON library [2] that's handy for working with EJDB, but it's a bit of a
playground for template metaprogramming, so compile times will explode with
certain functionality.)

The main problem with EJDB is that it's not crash tolerant so you need signal
handlers to attempt a graceful flush/close on a global handle.

[0] [https://github.com/Softmotions/ejdb](https://github.com/Softmotions/ejdb)

[1]
[https://github.com/chrismanning/ejpp](https://github.com/chrismanning/ejpp)

[2]
[https://github.com/chrismanning/jbson](https://github.com/chrismanning/jbson)

~~~
coleifer
I took a look at EJDB recently and thought it looked like a pretty neat
project. One potential gotcha is that it is built on TokyoCabinet which AFAIK
is no longer maintained. Looks like they have plans to build a v2, though.

------
halosghost
Honestly, I would not be able to use this unless it had, at the least, a
native C library. Having it be bash-reliant, though quick and simple, makes
it much harder to actually use in anything native (shelling out is generally
a BadTime™). I don't mean to discourage you, though: using JSON as a
human-readable data store is something I've definitely done before, so you
have a good idea. I just differ in my needs for the implementation.

Personally, I've used jansson[1] for this kind of thing in the past since it
makes working with JSON in C an absolute breeze.

[1] [http://www.digip.org/jansson/](http://www.digip.org/jansson/)

------
baxter001
How is this not massively worse than hacking something together using shelve
and json from the python standard library?
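
For reference, the stdlib hack in question is roughly this (the shelf name is arbitrary):

```python
import json
import shelve

# shelve keyed by document id, with json handling the value encoding.
with shelve.open("docs") as db:
    db["a1"] = json.dumps({"title": "hello"})
    doc = json.loads(db["a1"])
print(doc["title"])  # -> hello
```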

~~~
cbgbt
It turns out it's much faster, even if you ditch shelve and use anydbm to
avoid pickle.

~~~
baxter001
Really? Over what size of document collection?

------
jchrisa
We have a mature JSON database with p2p sync capability, that is open source
under the Apache 2.0 license.

[http://developer.couchbase.com/mobile](http://developer.couchbase.com/mobile)

We have native implementations for iOS, Android (Java), Windows (C#), and play
well with the Apache CouchDB ecosystem, so you can also use projects like
PouchDB.

I wish I had time to write the code it would take to add p2p sync to this
project. It might not be hard considering the PouchDB source already has
implementations of most of the algorithms in JavaScript.

~~~
sweetiewill
Pretty neat that a peer-to-peer photo sharing app can be built using
Couchbase technologies:

[http://blog.couchbase.com/photodrop](http://blog.couchbase.com/photodrop)

------
marknadal
For all those who are curious, I'm building something similar - but it has
realtime sync (like Firebase) and offline support:
[http://github.com/amark/gun](http://github.com/amark/gun) . I'm glad others
are working on this problem too!

------
buffoon
I actually used a system back in 1997 that used exactly this data store system
but with a proprietary encoding rather than JSON. It worked really well in a
single-user single-process capacity but we had to adjust the block size of the
filesystem so we didn't run out of inodes quickly.

The only killer was when some muppet put it on an NFS share when they were
trying to be clever and get it working for 5 users. Every failure mode
possible turned up at once.

Edit: also, to keep directory lookups cheap, it used nested prefix-based
directories: /a/b/c/1/abc112341982389129382, for example.

------
KirinDave
What is this... what is its intended use?

The vast majority of people who use SQLite use it because they have flexible
queries they want to run. Just storing structures to disk is so simple that
it is usually easier to integrate directly into your codebase.

So, cool shell script hack. But what's it for?

~~~
nodesocket
In the README: "JSONlite is a proof of concept, and it may not make any sense
to actually use it in development or production."

~~~
ketralnis
A proof of what concept, though?

------
djfm
Git is also a very nice zero-conf tool to store arbitrary objects and retrieve
them by hash :)

------
lawlessone
If I had something like this for Android, it would be great.

~~~
GaiusCoffee
Shared Preferences* works for small data.

*[http://developer.android.com/training/basics/data-storage/sh...](http://developer.android.com/training/basics/data-storage/shared-preferences.html)

------
a_c
What about adding an option to use a user-defined key instead of the UUID? I
think it would be useful.

~~~
s_kilk
Just guessing, but it looks like JSONlite relies on the uniqueness property
of UUIDs to 'guarantee' uniqueness in what amounts to its primary-key index,
the index of ids.

User-defined keys would need to be checked against all existing keys before
inserting, to ensure uniqueness. Not impossible, but potentially much slower.

~~~
nodesocket
Correct; this all works because we assume the UUID is globally unique, thus a
read is an O(1) operation.
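
In other words, roughly this (a Python sketch; the directory name is illustrative, not JSONlite's actual layout):

```python
import json
import os
import uuid

# The UUID itself names the file, so a read is a single path construction
# plus open: O(1) with respect to the number of documents, no index needed.
DATA = "data"

def set_doc(doc):
    os.makedirs(DATA, exist_ok=True)
    doc_id = str(uuid.uuid4())
    with open(os.path.join(DATA, doc_id + ".json"), "w", encoding="utf-8") as f:
        json.dump(doc, f)
    return doc_id

def get_doc(doc_id):
    with open(os.path.join(DATA, doc_id + ".json"), encoding="utf-8") as f:
        return json.load(f)

doc_id = set_doc({"k": "v"})
print(get_doc(doc_id))  # -> {'k': 'v'}
```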

------
duaneb
Please rename? The current name gives no idea that it's a data store until
AFTER you see the association with SQLite.

