Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: Squirrelbyte – a SQLite-based JSON document server (squirrelbyte.com)
118 points by adamlouis on April 11, 2021 | hide | past | favorite | 31 comments



I have been looking for exactly this for a while now:

“For my usecase, I wanted a way to:

Stash my data in its "original" JSON form Explore it later & build whatever views I want Keep costs & infrastructure complexity low Self-host it / own my data”

However, I think compression is a key part. Large scraped json datasets can be compressed by a few orders of magnitude, which is the difference between scp’ing a db up to aws in minutes vs days. SQLite has an official proprietary extension to DEFLATE backing pages, but that’s not easy to buy/distribute for hobby projects. I’ve tried row compression with zstd dictionaries and it works well, but then you lose native indexing.

Mongo wiredtiger does pretty much what I want, it’s just not a neat flat file :/


Nice! Glad it resonated. Never quite sure how a project like this will land.

Thanks for sharing those - will check them out. Interested to see what happens as the size of the dataset grows.

I have not looked deeply, but Typesense[1] seems like another interesting project. Similar to ES or Algolia, easy to self-host, & with a seemingly efficient memory & disk footprint.

[1]https://github.com/typesense/typesense


Except the flat file part. What was lacking with postgress?

It has decent json support and v8 javascript build-in.


No compression of json docs. Jsonb is even bigger than json text column. There’s an extension out there but it requires manual compression passes and doesn’t allow indexing. Might as well use SQLite


I think it would probably just make sense to run postgres on a compressed filesystem rather than use an extension.

Looks like citus had good results: https://www.citusdata.com/blog/2013/04/30/zfs-compression/


Thanks for the informative answer!


Looks very similar to something I made a while ago[1]. SQLite is an awesome piece of software, and with JSON and Fulltext search baked in is ideal for these kind of things. Glad to see that people are finally realizing it!

[1]: https://h3rald.com/litestore


Thanks for sharing! This is great work. Cool to see it's written in Nim!


I am glad to see more projects down these paths.

We've been sticking JSON documents into SQLite databases for years, simply because we can't be bothered to map thousands of business facts to individual columns. Turns out that this approach is fast enough that our customers can't tell the difference either way.

Decisions like this can make for one of those 10-100x speedups in development timelines. Even if you are convinced that JSON serialization+SQLite are too slow, you should still try it to make sure. You will almost certainly be surprised.


Love it. Got some questions: What request volume does it perform against? Do you shard requests? How large is the dataset? What is an acceptable response time? Any major slowdowns at the highest percentiles?


Our request volume has a strict upper bound that we can depend on. The part of the user base touching JSON+SQL code paths is well-constrained.

For the most contentious area of our application - updating business state instance per user - We find that we are able to serve on the order of 1~10k requests per second. The size of these datasets is around 0.5~15 megabytes.

There isn't a whole lot of other magic involved. WAL is the most important thing for improving throughput.


This is phenomenal. The React app alone is truly impressive. You've put a lot of attention into things like displaying the query time and round-trip time.

At a glance (and it's a been a while since I've touched MongoDB) the query language looks very similar, maybe compatible with a couple of very minor changes. Was that intentional?


Thank you! Appreciate it! Glad you noticed some of the details.

I didn't intentionally model the Mongo query language. Some requirements were:

- JSON-based: so I did not have to parse the query

- Easy to extend: so I could add new operators later

- DB independent: so I could swap out SQLite with postgres, mongo, cockroach, duckdb, etc. IF I ever wanted to

- Expressive: so I could do aggregations / group-bys / etc

I considered using more popular standards like the ElasticSearch API but they weren't quite what I wanted. I liked how jsonlogic was basically just an AST. Its primitive structure makes code generation to other DB query languages straightforward.

I started this project with my own "JSON lisp" variant, but then found jsonlogic & used that since mine was _pretty_ close (only lists, no objects).

The top-level fields of my queries look like SQL (select, where, group_by, order_by) - I really like the Honeycomb query UI (https://www.honeycomb.io/query/) - so wanted to be able to support user experiences like that ... but also go further via arbitrary jsonlogic expressions.


This is really cool and interesting!

I built something kind of similar, a JSON document explorer/viewer specifically for the database from a video game - https://data.destinysets.com. Something I think is neat about this is that it uses the OpenAPI spec to link relationships. This is all completely client side for better or worse.

I really like the table view you've made here.


Nice! The nested explorer you have there is REALLY slick. Great work!

This is something I had been thinking about - when dealing w/ normalized JSON documents, how might I allow exploration of referenced documents? & how might I encode those relationships is some way that's lightweight?

Thanks for sharing.


I think this is great. And on a tangent, a few months ago I was searching "everywhere" to find a somewhat standard rest api query format. There's odata, but that was way too heavy.

I don't know how popular jsonlogic itself really is, but it definitely seems like a good fit for queries.

https://jsonlogic.com/


Thanks!

Yeah, jsonlogic does not seem THAT popular. A JSON-based solution that looks like an AST means you can skip query parsing & just do generation. I started this project with my own JSON Lisp DSL before going with jsonlogic.

In either case, I like the idea of being able to swap out arbitrary storage layers - sqlite, postgres, mongo, etc. - while keeping the JSON api.

Had not encountered odata - thanks for sharing!


do you know sql.js?it is sqlite in js.

not sure the json support. if you move the sqlite to frontend, no server is necessary. upload a sqlite db or csv and explore.


Will look into it - thank you for the tip!


Thanks for posting. This can be very useful, especially for personal local use. The "README" needs a bit more polish.


Thank you for checking it out! Glad to hear it might be useful.

Will take a cleanup pass soon - was excited just to ship it.


As a new junior developer can you tell me what this is good and what use case it's best for ?


Somebody is using a database (sqlite in this case) to store JSON objects. There is nothing strange or unusual here.

When doing this, though, the problem is that it's more difficult to run search queries with the same expressiveness as we would otherwise do had the data been properly "unfolded" into many columns of a relational database.

This tool allows to make such expressive requests. This is very useful.

Interestingly the search query itself is expressed as a JSON object (this is a design choice, it could or could not have ben the case. In any case it's cute.)


Thank you for the explanation. I'll be adding this to my ever growing list of things to study further haha.


I've been thinking about writing an ElasticLite with SQLite.

Good job.


Nice! Thank you!

By implementing the the ES API you may then be able to re-use parts of Kibana, Elastalert & other parts of the ES ecosystem ... could be cool!


How big is the sqlite database you are using? Here is a similar project you might be interested in. https://datasette.io/


They are already using it, it is stated in the author's note.


is there a github link? or won't you publish the code?



thanks!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: