
LokiJS – Lightweight JavaScript in-memory database - joeminichino
http://lokijs.org/
======
skrebbel
Great!

I'm really no expert, but I think that language-specific persistent in-memory
databases are the way to go.

Deep inside, what I really want on my backends is just a data structure. A
single big tree (or graph) of objects. Some primitives to make it thread-safe
somehow (immutable collections, or mutexes, or just old-fashioned single-
threadedness like Node - whatever), and that's it. Requests from the client
query and mutate that data. In _very many_ situations, I wouldn't need
anything special to make this work fine with an application up to a pretty
large number of users. Not all apps do big data analyses. RAMs are gigantic
these days, bigger than the SQL datafiles of swathes of webapps.

However, there's that pesky problem of code upgrades and crashing servers. If
my server would never crash and I could hot-swap the code, I wouldn't need
persistence at all. But I do, so I don't just want an in-memory data
structure, but a _persistent_ in-memory data structure. I want using this data
structure to be as easy as the programming language can possibly make it,
while still guaranteeing some sane level of persistence and fault tolerance.

Redis is a nice idea, but the fact that it's a separate server, with a client
and a protocol and data structures that don't _quite precisely_ map to my
programming language's data structures, forces me to write a whole bunch of
boilerplate anyway. Even more so with other databases like Mongo or Postgres.

If I understand it well, Loki doesn't entirely do what I'd want: it does not
save the data to disk on every change, but less often, if I get it right. That
might be good enough if my problem allows for many little independent Loki
databases, but if the dataset is a gigabyte and persistence means flushing
that whole gigabyte to disk after every data change, it probably won't work
very well.

I'm really curious if other people here have similar ideas, maybe
implementations, of these concepts. Maybe Loki could be extended with a
snapshot+operation_log type of data storage like Redis has and then it'd be
pretty close.

~~~
yourad_io
> If I understand it well, Loki doesn't entirely do what I'd want: it does not
> save the data to disk on every change, but less often, if I get it right.

From a cursory glance at the source, it doesn't look like anything internal
ever calls .save()/.saveToDisk(), so it is up to you to decide how often you
want to be creating your on-disk checkpoints.

EDIT: Seems I am wrong: "In a nodejs/node-webkit environment LokiJS also
persists to disk whenever an insert, update or remove is performed." [1]. But
I still can't spot where .save[/ToDisk] is called in the source [2]. Anyone?
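If checkpointing really is left to you, a debounced save is one obvious way to handle it - coalesce bursts of updates into a single disk write. `db.save` below is just a stand-in for whatever persistence method the library actually exposes (hypothetical; I haven't verified this against Loki's source):

```javascript
// Debounced checkpointing: many rapid updates collapse into one save.
// `db.save` is a placeholder for the library's real persistence call.
function makeCheckpointer(db, delayMs) {
  let timer = null;
  return function scheduleSave() {
    if (timer) clearTimeout(timer); // a new update resets the countdown
    timer = setTimeout(() => {
      timer = null;
      db.save(); // one write covers the whole burst of updates
    }, delayMs);
  };
}
```

Usage would be something like `const scheduleSave = makeCheckpointer(db, 1000);` and then calling `scheduleSave()` after every insert/update/remove.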

> That might be good enough if my problem allows for many little independent
> Loki databases, but if the dataset is a gigabyte and persistence means
> flushing that whole gigabyte to disk after every data change, it probably
> won't work very well.

If you're looking at gigabyte+ datasets, then something designed to be in-
memory is probably not your best bet. Aside from saving, (one imagines) this
would also affect load times (reading->parsing->importing a multi-GB JSON file
at launch can't be quick).

From [1]:

> LokiJS is ideal for the following scenarios:

* where a lightweight in-memory db is ideal

* cross-platform mobile apps where you can leverage the power of javascript and avoid interacting with native databases

* data sets are not so large that it wouldn't be a problem loading the entire db from a server and synchronising at the end of the work session

[1] [https://github.com/techfort/LokiJS](https://github.com/techfort/LokiJS)

[2]
[https://github.com/techfort/LokiJS/blob/master/src/lokijs.js](https://github.com/techfort/LokiJS/blob/master/src/lokijs.js)

~~~
skrebbel
> _If you're looking at gigabyte+ datasets, then something designed to be in-
> memory is probably not your best bet. Aside from saving, (one imagines) this
> would also affect load times (reading->parsing->importing a multi-GB JSON
> file at launch can't be quick)._

Why not? I only need to load when my program starts, i.e. after a crash or
maybe an upgrade when hot code swapping is not available. Sounds like there
may be plenty of persistence schemes that make this not entirely painful (load
latest data first, or fallback to disk reads if the data isn't entirely loaded
yet, which makes the service slower but still available right after a crash).

Note, I'm really just dreaming here, and I appreciate you dreaming along.
Dreaming is good! Some frontend dev dreamed "i just want to rebuild the entire
page whenever data changes!" and then React happened.

Thanks for your dig into the code btw, nice findings! I can't find the save()
calls either, so I suspect that they used to be there but aren't now, or the
other way around. It's alpha, after all.

~~~
yourad_io
My (unseen) emphasis was on _best_ bet.

It would work, but I think it would be dangerous and/or inconvenient.

The danger part comes from having a sync-to-disk operation that lasts any
considerable amount of time - the longer it lasts, the greater the likelihood
that an inopportune crash would leave you with an incomplete (read: corrupted)
JSON file. A DB built for fast disk persistence would only update the relevant
records, keeping the disk writes as small as possible. I don't think Loki has
any option other than writing the entire thing each time (with the current
JSON savefiles, that is). Since save() is an expensive op, you also can't be
calling it on every update (it wouldn't even work: the first call would lock
the file for writing, and all subsequent save() ops would fail until the first
one finished), so it would be inherently unsafe _unless you committed GBs to
disk for each update_.

So, this might be somewhat dangerous, but we can mitigate that, right? We'll
save frequently, but not too frequently, and somehow version-control our JSON
file. Which is GB+ in size. So we'll also compress them? And when we need to
restore, we'd.. hmm.. start reading from the latest until a valid one is
found? (<-- inconvenience)

> Note, I'm really just dreaming here, and I appreciate you dreaming along.

Likewise!

> Sounds like there may be plenty of persistence schemes that make this not
> entirely painful (load latest data first, or fallback to disk reads if the
> data isn't entirely loaded yet, which makes the service slower but still
> available right after a crash).

While those certainly exist, the one currently chosen by Loki can't do any of
these things. Since you can't parse half a JSON file (especially in this
format, which appends index info etc. at the end of the file), you have to
read the entire thing from disk, JSON.parse it, and then feed it into Loki.
Only after all of that will you know that you are actually restoring from
meaningful (and complete) data files and not junk.

The way I see this, it would be great for throwaway prototypes on the server
side (or for small/unimportant data) but its real value would show on the
client side. You basically get a mini-MongoDB in 2K LOC that you can embed in
anything that speaks ECMA3.

What's even cooler IMHO is the future - in their SlideShare presentation they
list replication and horizontal scaling on their roadmap. While I have no idea
how they envisage implementing those, I would love to experiment with the
meteor.js concepts and this - providing a full, fast cache of the user's data
on the client side and then throwing differential updates around.

edit: spelling

------
Sir_Cmpwn
Related: SQL.js, which is sqlite compiled with emscripten

[https://github.com/kripken/sql.js](https://github.com/kripken/sql.js)

~~~
indubitably
Except that file is 10x bigger than Loki…

~~~
bpicolo
So? It's also full-blown sqlite. Let's not pretend file sizes matter much in
modern times.

~~~
rakoo
It does if you're going to send that to a browser.

------
imslavko
From skimming the presentation it looks like a project similar to Meteor's
Minimongo and Miniredis ([https://www.meteor.com/mini-databases](https://www.meteor.com/mini-databases)) used for browser caches.
Minimongo, for example, implements a great majority of Mongo selectors (there
are no secondary indexes, though).

(I contributed to both Minimongo and Miniredis)

------
jedireza
I'm curious why NeDB wouldn't suffice.

[https://github.com/louischatriot/nedb](https://github.com/louischatriot/nedb)

~~~
yourad_io
From the slideshare presentation on the homepage (p. 4)[1]

> Performs better than similar products (NeDB, TaffyDB, PouchDB etc.) and it's
> much smaller

It would be interesting to see benchmarks.

[1]
[http://www.slideshare.net/techfort/lokijs/4](http://www.slideshare.net/techfort/lokijs/4)

~~~
jedireza
Thanks for sharing that. I jumped the gun before commenting.

~~~
yourad_io
No worries. I just noticed they include the benchmarks here [1]

[1]
[https://github.com/techfort/LokiJS/tree/master/benchmark](https://github.com/techfort/LokiJS/tree/master/benchmark)

------
keithwhor
How does this compare to DataCollection.js?

[https://www.npmjs.org/package/data-collection](https://www.npmjs.org/package/data-collection)
[http://thestorefront.github.io/DataCollection.js/](http://thestorefront.github.io/DataCollection.js/)

They look fairly similar. What's the test coverage and do you have performance
benchmarks?

------
remon
Out of pure curiosity, what are the common use cases for a fast in-memory
database? Are they exclusive to server-side applications (e.g. caching)?

~~~
ilghiro
Cordova is a big one. The ability to have rich queries into some persistent
data set is something that's difficult to achieve atm.

~~~
joeminichino
It's precisely why I created LokiJS in the first place; it then grew much
bigger than that, but yeah - that is where it all came from.

------
marknadal
Excellent project, there needs to be more focus in this space. I've been
feeling a general trend of developers going back to basic embedded databases.

I'm working on a similar project that is focused on ease of use combined with
distributed/decentralized (concurrency-safe) behavior:
http://github.com/amark/gun

------
cloudsheet
Has anyone used this in lieu of, or in comparison to, Redis? Am I correct in
assuming that with LokiJS, one could move in-memory data storage from the
server (with Redis) to the browser (with LokiJS)?

This project looks quite interesting. Thanks for sharing. Could be a great
solution for single-page web apps (and mobile, Cordova, etc.).

~~~
panzi
If you want a simple key-value store in the browser, why don't you directly
use localStorage (+JSON)? It's been supported since IE8. My guess is that this
thing is using localStorage under the hood.
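e.g. something as small as this would cover the key-value case (the Map-based stand-in only exists so the snippet runs outside a browser; in a page you'd use window.localStorage directly):

```javascript
// Tiny key-value store over localStorage + JSON. Outside a browser
// we substitute a Map with the same getItem/setItem/removeItem shape.
const storage = (typeof localStorage !== 'undefined')
  ? localStorage
  : (() => {
      const m = new Map();
      return {
        getItem: (k) => (m.has(k) ? m.get(k) : null),
        setItem: (k, v) => { m.set(k, String(v)); },
        removeItem: (k) => { m.delete(k); },
      };
    })();

const kv = {
  set(key, value) { storage.setItem(key, JSON.stringify(value)); },
  get(key) {
    const raw = storage.getItem(key);
    return raw === null ? undefined : JSON.parse(raw);
  },
  del(key) { storage.removeItem(key); },
};
```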

~~~
RussianCow
Presumably because LokiJS is faster. (I don't know if that's actually true.)
And no, I just looked at the source code and Loki does not use localStorage
under the hood. It's an in-memory database without a persistence layer, so you
lose the data when you leave the page (in the browser).

Edit: Actually, reading more about Loki, it looks like the whole point of it
is to provide querying/indexing capabilities to data in memory. So I don't
know if there would be any value in using it as a replacement for key/value
stores.

------
leeber
This looks interesting. Coincidentally I was just searching around for
something like this to run in the browser.

I'll play around with it myself, but is there anyone who has used this that
knows how well it does with memory usage?

~~~
nobodysfool
Uhm, I'm pretty sure it can use all your RAM if you want. It's an in-memory
database, so all your RAM will be used for data...

~~~
RussianCow
I think the question was more about how compactly it stores the data in
memory.

------
quartzmo
I want a very fast, client-side, read-only database with rich query support.
Is this the best choice?

