select country_code, long_name from wdi_country
order by rowid desc limit 100
I have an animated GIF demo of this here: https://twitter.com/simonw/status/1388933800445452290
select country_code, long_name from wdi_country where rowid >= 164 order by rowid asc limit 100;
Turns out doubling isn’t the best strategy. The optimal solution is actually to add a constant increment to the packet size. How much depends on the relative cost of the terms in the regret function.
> That’s because I implemented a pre-fetching system that tries to detect access patterns through three separate virtual read heads and exponentially increases the request size for sequential reads.
> Since you're doing a reverse table scan my "sequential access" detection doesn't kick in.
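A minimal sketch of what such a read-ahead heuristic can look like (hypothetical names and thresholds, not the actual sql.js-httpvfs code):

    // One "read head" for sequential-access detection. Sketch only.
    class ReadHead {
      private nextExpectedOffset = -1;
      private prefetchBytes = 4096; // start with one page

      // Decide how many bytes to request for a read of `length` bytes at `offset`.
      requestSize(offset: number, length: number): number {
        if (offset === this.nextExpectedOffset) {
          // Sequential read detected: grow the request size exponentially,
          // capped arbitrarily so a single request never gets huge.
          this.prefetchBytes = Math.min(this.prefetchBytes * 2, 1024 * 1024);
        } else {
          // Random access: fall back to the minimum.
          this.prefetchBytes = 4096;
        }
        const size = Math.max(length, this.prefetchBytes);
        this.nextExpectedOffset = offset + size;
        return size;
      }
    }

    // A reverse table scan (offsets decreasing) never hits nextExpectedOffset,
    // so it stays at one page per request - which matches the behaviour
    // described in the quote above.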
You know, starting off with the default 4kB page size naturally adds some resistance to these kinds of failure cases. If the VFS isn't issuing many requests in parallel, I would think that setting up a page size near target_bandwidth * round_trip_time would be a better initial guess. 1kB would be appropriate for a pretty low latency-bandwidth product.
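To put rough, illustrative numbers on that rule of thumb (not measurements from the post):

    // Page size guess ≈ bandwidth * round-trip time (the bandwidth-delay product).
    const rttSeconds = 0.1;                 // 100 ms round trip
    const bandwidthBytesPerSec = 1_000_000; // ~8 Mbit/s
    const pageSizeGuess = bandwidthBytesPerSec * rttSeconds; // ~100 kB
    // A 1 kB page only matches a ~1 kB bandwidth-delay product,
    // e.g. roughly 80 kbit/s at 100 ms RTT.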
And note that the request chunk size is bound to the SQLite page size, and to change that page size you have to rewrite the whole DB. So it can't be set on the fly unless you have multiple copies of the database.
It looks like one of the requests made to the DB included a little over 700 bytes of response status line and headers, so that would probably end up spilling into more than one response packet, unfortunately.
iirc Jigsaw was used to develop and validate the WebDAV protocols and XQUERY. At the time I remember thinking XQUERY was sure to be the future, since advanced data management, manipulation, querying, and distribution looked to be the whole point of webservers. The incredible distractions caused by "rich media", as opposed to multimedia as it was understood then, are really worth thinking about. Saying that, however, the BBC is doing excellent work on restoring the balance of necessary powers to the network standards engineers https://www.bbc.co.uk/sounds/help/questions/about-bbc-sounds...
Connecting this to HTTP range requests, the edited video feed can consist of a list of the exact byte ranges that the clients need to download and play. Found this description of how Akamai uses range requests to serve low-latency streams: https://blogs.akamai.com/2020/11/using-ll-hls-with-byte-rang...
frankly I think that unless we do things like this, the Internet is simply going to become a closed shop to anyone not wielding enterprise budgets and legal department capabilities.
If you want it to fetch less data when querying a subset of columns, you could create an index on those columns - then SQLite will do a COVERING INDEX scan and thus read only the necessary data (with the B-Tree itself and the start / end page alignment being the only overhead).
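A minimal sketch of that, assuming a sql.js-style API and the wdi_country columns from the example above (the index name is made up; in the static-hosting setup you would create the index before uploading the .sqlite3 file):

    import initSqlJs from "sql.js";

    async function demo(fileBytes: Uint8Array) {
      const SQL = await initSqlJs();
      const db = new SQL.Database(fileBytes);
      // A covering index on exactly the queried columns means SQLite can answer
      // the query from the index's B-tree pages alone, never touching the table pages.
      db.exec("CREATE INDEX idx_code_name ON wdi_country(country_code, long_name)");
      // EXPLAIN QUERY PLAN should now report a covering index scan:
      console.log(db.exec("EXPLAIN QUERY PLAN SELECT country_code, long_name FROM wdi_country"));
    }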
It also reminded me of a vague inverse of this hack. In old versions of Qemu (possibly it is still implemented, but I have vague memories it got ripped out), you could point Qemu at a directory on disk and it'd produce an emulated floppy disk drive with a virtual FAT12 image containing the directory contents. AFAIK it didn't keep the actual data in memory, I guess all it needed was file sizes to know how to build a virtual memory mapping that contained the filesystem metadata + proxied reads from the underlying files for data sectors. I look forward to seeing your implementation of this concept in a virtualized SQLite file <-> GraphQL proxy ;)
edit: insane, it still exists and apparently supports write mode?! https://en.wikibooks.org/wiki/QEMU/Devices/Storage#Virtual_F...
It's REALLY impressive that you only need to read 54 KB out of 700 MB to fetch the records.
the harsh reality is that doing sensible queries that only reference and return the data actually needed always makes things faster. Even with a server DBMS. Oh, how many times have I lamented the naive "select *" for forcing all the row contents even when there was index coverage for the actually needed data.
Not supporting range requests would be a disadvantage for any service hosting large files. Resuming failed long downloads wouldn't work, so users might not be happy, and there would be more load on your bandwidth and other resources as the client falls back to performing a full download.
On the other hand, it's still a lot of overhead.
I'm not saying the whole thing isn't impressive, just that the concept itself is one of those "because I can" rather than "because I should" things, which kinda devalues it a whole lot.
In the standard scenario for personal projects (not enterprise) I would have a small VPS/dedicated server with a REST service running - that would be hugged to death immediately if a link made it to some site like HN. And also, I completely share the author's experience that after a couple of years you have moved on, the VPS is dead, etc., and you don't want to invest time.
Again, before considering using this solution, be sure to understand how it works and the resulting limitations, or you will likely choose wrong.
From this paragraph it should be pretty clear that it's actually a great result. The database will obviously need to read more data than it presents, so more is fetched.
But this approach lets you actually work out what the optimal size is:
select sum(length(country_code) + length(long_name)) from wdi_country;
Or on average:
select sum(length(country_code) + length(long_name))/count(*) from wdi_country;
(Note that a limit clause doesn't restrict which rows an aggregate function sees - you'd need to put the limit in a subquery)
this achievement, and this blog post, are to me on par with blog posts that you would see from a major company where they solve some significant business-critical technical challenge in-house. for example: the GitHub blog post about how they created their spinning globe of commits on their homepage, or a Netflix blog post about how they optimized their network infrastructure to serve so many customers.
your work is truly incredible. You're next level of next level.
also does this mean that static sites can now scale without relying on a db???
what other implications and applications do you see?
wonderful work!!!! please let me know how i can donate
xhr.setRequestHeader("Range", "bytes=" + from + "-" + to);
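The same range read with the newer fetch API, as a sketch (the URL and offsets are placeholders):

    // Fetch bytes [from, to] (inclusive) of a remote file via an HTTP range request.
    async function fetchRange(url: string, from: number, to: number): Promise<Uint8Array> {
      const resp = await fetch(url, { headers: { Range: `bytes=${from}-${to}` } });
      if (resp.status !== 206) {
        // The server ignored the Range header and sent the whole file (200),
        // or refused the request entirely.
        throw new Error(`expected 206 Partial Content, got ${resp.status}`);
      }
      return new Uint8Array(await resp.arrayBuffer());
    }

    // e.g. read the first 100 bytes (the SQLite file header) of a hypothetical database:
    // const header = await fetchRange("/wdi.sqlite3", 0, 99);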
Any well-designed system, especially if it has static sources and is server-cached, should support it.
Surprisingly many web-frameworks don't support it out of the box, or don't support it well.
Either way, gh-pages is static content, probably with some server-side regional caches, so I'm not surprised it works.
If you're pulling a single TCP stream across a crowded network, you get maybe 1/40th of the available bandwidth. If you do four range requests instead, you might see >3/40th of the bandwidth.
This place we were contracting at, the managers were widely rumored to stream sports games at their desks, and our release cycle happened to fall on a game day. My poor coworker was downloading our installer every week, and the ETA was over 40 minutes. "Are you using DTA?" "No, what's that?" <fiddle fiddle> ETA: 12 minutes.
12 minute pauses in the middle of a manual process are a lot easier to stomach than 40 minutes. Especially if something goes wrong and you have to do it twice.
It's an embedded genome viewer; you can just point it at multigigabyte reference files and .seg files and it loads super quickly
I'm surprised this isn't used on mobile browsers to lower data usage. I'm sure with a little research you could figure out what a good mapping from pixel size to byte size should be to give good enough results.
However, one could use this approach: download as usual and process the data in a streaming fashion; if it's a progressive JPEG, you can close the connection before you have received everything, then cache the prefix and later download the rest if needed.
Fast clients will just swallow the whole file, while slow clients would be able to benefit from it.
It wouldn't work for pipelined HTTP connections though without cancelling the whole pipeline, so maybe not a very practical solution given the performance benefit that already gives. And HTTP/2 maybe doesn't support cancelling a transfer either, so.. ?
Maybe a direct "Accept" or "Prefer" header to indicate that it's enough to send just something useful for an icon would be a more ideal solution, but it would require server-side support.
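A rough sketch of the "cache the prefix, stop early" idea using the fetch streaming API (the byte threshold is arbitrary; a real implementation would parse the progressive JPEG scans to know when a usable pass is complete):

    // Download an image, but stop reading once we have "enough" bytes for a preview.
    async function fetchPreview(url: string, enoughBytes = 32 * 1024): Promise<Blob> {
      const resp = await fetch(url);
      const reader = resp.body!.getReader();
      const chunks: Uint8Array[] = [];
      let received = 0;
      while (received < enoughBytes) {
        const { done, value } = await reader.read();
        if (done) break;
        chunks.push(value);
        received += value.length;
      }
      // Cancelling the reader aborts the rest of the transfer
      // (on HTTP/2 this resets just that stream, not the whole connection).
      await reader.cancel();
      return new Blob(chunks, { type: "image/jpeg" });
    }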
Oh wow, initial release 1998, now I'm feeling a bit old...
Unfortunately, solid, stand-alone webdav servers are harder to come by than decent http2/1.1 servers.
But as long as you're dealing with a known server that does, then gravy!
Could you provide an example of server that does not?
AFAIK, Range is supported by all major CDNs, so not supporting it in a web server would be a death knell for its real-world adoption.
I would assume this is often because the site in question isn't using Apache etc. to serve a file directly, but is either essentially proxying it to some custom-built file serving service, or a script that processes/authenticates the file in some way, and they just never bothered to implement Range.
We convert the imagery into Cloud optimised geo tiffs and store them in S3 https://www.cogeo.org/ then the browser can request the tiles directly from S3.
Even the big imagery providers are now storing their imagery as COGs, eg https://registry.opendata.aws/sentinel-2-l2a-cogs/
I wonder when people using next.js will start using this for faster builds for larger static sites?
I actually did also implement a similar thing fetching data on demand from WebTorrent (and in turn helping to host the data yourself by being on the website): https://phiresky.github.io/tv-show-ratings/ That uses protobufs split into a hashmap instead of SQLite, though.
Datasette is one application for views of read-only SQLite dbs with out-of-band replication. https://github.com/simonw/datasette
There are a bunch of *-to-sqlite utilities in the corresponding Dogsheep project.
Arrow JS for 'paged' browser client access to DuckDB might be possible and faster but without full SQLite SQL compatibility and the SQLite test suite. https://arrow.apache.org/docs/js/
> Direct Parquet & CSV querying
In-browser notebooks like Pyodide and Jyve have local filesystem access with the new "File System Access API", but downloading/copying all data to the browser for every run of a browser-hosted notebook may not be necessary.
If the underlying DB has been changed then the server should respond with 412 Precondition Failed.
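For example, pinning range requests to the ETag observed when the database was opened (sketch; assumes the host sends ETags):

    // If the file is replaced on the server, the precondition fails and we know
    // our cached pages are stale.
    async function fetchRangeChecked(url: string, from: number, to: number, etag: string) {
      const resp = await fetch(url, {
        headers: { Range: `bytes=${from}-${to}`, "If-Match": etag },
      });
      if (resp.status === 412) {
        throw new Error("database file changed on the server; reload required");
      }
      return new Uint8Array(await resp.arrayBuffer());
    }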
I have one Next static site that has about 20k pages and takes about 20 minutes to build and deploy. I think that's an acceptable build time. But I do know of other people around the net who have mentioned having sites with 20k-ish pages taking an hour+ to build. For them I could see the desire to try this sqlite trick.
It’s got a ton of rough edges but the boilerplate is there to get a proof of concept pretty quickly
It's just a different stack.
In my experience, it can take a couple of minutes just to deploy 20 pages, but that could just be the overhead of Typescript and SASS compilation too...
Took some effort to get everything under 60 seconds, speed was the main reason to use Hugo.
And then using SQLite to insert and update DOM elements? Holy cow, icing on the cake. Unlike the first part, there’s no explanation of why you’d want to do that. But by that point I was so drawn in that I didn’t care and was just enjoying the ride.
Basically SQLite has a virtual table mechanism where you have to define a few functions that figure out how to scan your "fake" table / which indices to use and then how to read / write the actual data. I hook into this mechanism and redirect the requests to DOM functions like querySelector() etc. Then there's the issue of SQLite being fully synchronous, while I have to run it in a WebWorker - and the WebWorker can't actually access the DOM and can only communicate asynchronously with the main thread... So I have to do some weird stuff with SharedArrayBuffer and Atomics.wait to make that work.
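A stripped-down sketch of that SharedArrayBuffer/Atomics bridge (invented message shapes and buffer sizes, not the actual implementation; note SharedArrayBuffer requires the page to be cross-origin isolated via COOP/COEP headers):

    // --- worker side (worker.ts): SQLite runs here and must block until the DOM answer arrives ---
    let signal: Int32Array;   // [0] = "done" flag, [1] = result byte length
    let payload: Uint8Array;  // result bytes written by the main thread

    onmessage = (e) => {
      if (e.data.kind === "init") {
        signal = new Int32Array(e.data.sab, 0, 2);
        payload = new Uint8Array(e.data.sab, 8);
      }
    };

    // Called from SQLite's synchronous virtual-table callbacks.
    function querySelectorSync(selector: string): string {
      Atomics.store(signal, 0, 0);
      postMessage({ kind: "querySelector", selector }); // ask the main thread
      Atomics.wait(signal, 0, 0);                       // block this worker until the flag flips
      const len = Atomics.load(signal, 1);
      return new TextDecoder().decode(payload.slice(0, len));
    }

    // --- main thread (main.ts): has DOM access, must never block ---
    const sab = new SharedArrayBuffer(8 + 64 * 1024);
    const mainSignal = new Int32Array(sab, 0, 2);
    const mainPayload = new Uint8Array(sab, 8);
    const worker = new Worker("worker.js");
    worker.postMessage({ kind: "init", sab });

    worker.onmessage = (e) => {
      if (e.data.kind === "querySelector") {
        // We have DOM access here; the worker is blocked in Atomics.wait meanwhile.
        const el = document.querySelector(e.data.selector);
        const bytes = new TextEncoder().encode(el?.outerHTML ?? "");
        mainPayload.set(bytes.subarray(0, mainPayload.length));
        Atomics.store(mainSignal, 1, Math.min(bytes.length, mainPayload.length));
        Atomics.store(mainSignal, 0, 1); // flip the flag...
        Atomics.notify(mainSignal, 0);   // ...and wake the blocked worker
      }
    };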
:-) There's a lot of SQLite content/articles/blogs on the web that can provide this effect. SQLite is to programmers what the stars are to astronomers. A wonder.
Might it be feasible to easily strip down SQLite so that it only compiles the parts for read-only use? The browser version is obviously somewhat read-only but that's because of the sandbox. I'm talking about excluding the code for CREATE, UPDATE, INSERT and everything else which is just for writing. The aim here would be to produce a significantly smaller WASM binary.
I'm guessing that the answer is no, there's no easy way of doing this without significant rewrites of SQLite's core, but... I can't be the only one to think of this, surely?
Stripping out the write parts is a good idea. SQLite actually has a set of compile-time flags to omit features. I just tried enabling as many of those as possible, but it didn't seem to reduce the wasm size much, though I might be doing something wrong. There are also no easy flags to disable CREATE / UPDATE / INSERT.
The storage of SQLite is already really efficient, for example integers are always stored as varints so small ones only take a byte. The only thing I think could maybe be improved for this use case is changing the structure of the b-tree to be more columnar - since right now all the data from different columns is intermingled with the btree structure itself, querying a subset of columns has a high overhead.
Being able to use FTS5 without the penalty of having to pull down the whole index makes it work much better at larger scales, even with the penalty of additional network requests.
There isn't actually just 1 big domain instance either. It's more like one per user session, and then a global instance.
The impact this had on reducing complexity and bugs is incredible. I haven't seen a null ref exception in a long time. Also, being able to dump your entire universe to disk by serializing a single object is really nice.
For web browser based genome browsers I suspect this (very cool!) sqlite hack would require many more http requests.
I've used this for the same basic idea as this article, only letting me store SQLite databases in AWS's S3 that I can access with AWS APIs so they don't need to be public. It works well, though it's absolutely not for every use case, the overhead is considerable.
I even used it once to read SQLite database files in a zip file stored in S3 without having any local storage to use. Not one of my prouder moments, but hey, I coded my way out of the corner someone else designed for me.
Once you wrap your head around how you need to pass parameters to the helper, it's really straightforward; you just need to implement the xOpen and xRead calls.
I didn't fiddle with the window sizes like the submitted article (the chunk is fixed to 8KiB), but I did optimize it so that reading chunk N+1 of the file reused the response reader of chunk N rather than make a new request. Furthermore I keep an LRU cache of only the last three chunks in memory, because the ZIP files are each only read once.
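For anyone building something similar, here is a minimal sketch of a fixed-chunk range reader with a small LRU cache (not the parent comment's actual code; it is also async for simplicity, whereas a real SQLite VFS xRead callback has to be synchronous):

    // Serve byte reads out of fixed 8 KiB chunks fetched with Range requests,
    // keeping only the last few chunks in memory.
    const CHUNK_SIZE = 8 * 1024;
    const MAX_CACHED_CHUNKS = 3;

    class HttpChunkReader {
      private cache = new Map<number, Uint8Array>(); // insertion order doubles as LRU order

      constructor(private url: string) {}

      private async getChunk(index: number): Promise<Uint8Array> {
        const cached = this.cache.get(index);
        if (cached) {
          // Refresh the LRU position.
          this.cache.delete(index);
          this.cache.set(index, cached);
          return cached;
        }
        const from = index * CHUNK_SIZE;
        const resp = await fetch(this.url, {
          headers: { Range: `bytes=${from}-${from + CHUNK_SIZE - 1}` },
        });
        const chunk = new Uint8Array(await resp.arrayBuffer());
        this.cache.set(index, chunk);
        if (this.cache.size > MAX_CACHED_CHUNKS) {
          // Evict the least recently used chunk (first key in insertion order).
          this.cache.delete(this.cache.keys().next().value!);
        }
        return chunk;
      }

      // Equivalent of an xRead callback: copy `length` bytes at `offset` into `out`.
      async read(out: Uint8Array, offset: number, length: number): Promise<void> {
        let written = 0;
        while (written < length) {
          const pos = offset + written;
          const chunk = await this.getChunk(Math.floor(pos / CHUNK_SIZE));
          const within = pos % CHUNK_SIZE;
          const n = Math.min(length - written, chunk.length - within);
          if (n <= 0) break; // past end of file
          out.set(chunk.subarray(within, within + n), written);
          written += n;
        }
      }
    }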
Using the "Range" HTTP header to read chunks of the database file absolutely works!
But to be clear, there's no write equivalent, is there? You can't use "Range" with a PUT request.
It's amazing but also slightly terrifying in the knowledge that then someone's going to write an SMB-over-HTTP connector.
However, I imagine a service to support your scenario could be written in a standard back-end server language like Go or JS. The challenges involved would be much greater, however -- how to handle concurrency in particular. I suspect one would do better to just run PostgreSQL behind a web API.
But as always, it's seldom the tools, but the right tool used for the wrong usecase that is the problem.
You could of course implement a writing API with POST requests for changing pages of the database - but then you would lose most of the benefits of this (not requiring any special kind of server).
I also thought about implementing a kind of overlay filesystem, where chunks that are written to the file are stored in a local storage so the modified data is available locally while still reading everything else from the remote database.
Interestingly in SQLite that's already exactly what the WAL mode does: It's a second file next to the database that's just a set of pages that are overlaid over the main file when read queries happen - which allows concurrent readers and writers since the database itself isn't in an undefined state even when write transactions are happening.
So you could enable WAL mode and disable WAL auto checkpointing, then you get a downloadable WAL file that can be read by normal SQLite and written back to the main file. It would be neat, but I'm not sure what the actual use case would be ;)
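For anyone who wants to experiment with that, the pragmas involved would look roughly like this (sketch using Node and better-sqlite3 on the writing side; the table and values are just placeholders):

    // Put the database in WAL mode and turn off automatic checkpointing so
    // writes accumulate in the -wal file instead of being folded back into
    // the main database file.
    import Database from "better-sqlite3";

    const db = new Database("wdi.sqlite3");
    db.pragma("journal_mode = WAL");      // writes now go to wdi.sqlite3-wal
    db.pragma("wal_autocheckpoint = 0");  // never merge the WAL back automatically

    db.prepare("UPDATE wdi_country SET long_name = ? WHERE country_code = ?")
      .run("Example", "XYZ");
    // wdi.sqlite3 is untouched; wdi.sqlite3-wal now contains the changed pages.
    // A manual "PRAGMA wal_checkpoint(TRUNCATE)" would write them back later.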
If ever the intent were to involve eventually persisting those changes, then it would be worthwhile looking at remoteStorage, which works like this.
Perhaps adding IPFS to the mix for persisting data would be interesting, I'm sure there are use cases in peer to peer applications. Anyway, amazing innovation thank you for writing this :)
It's slightly more centralized than perfect, but man do you get a lot for a little.
There has to be a git.js implementation out there and you could move the DB to its own repo and create an https access token (for Github)... the issue there is that someone could use that token to commit whatever to your database repo.
I'd solve it via sharding: divide the database into N pieces via range- or hash-sharding.  Choose an N that's large enough for each piece to be reasonably small. When you look up a key, fetch the shard of interest.
You can put each piece into separate files (a little simpler to code, and most static servers will easily serve pre-gzipped files for "Accept-Encoding: gzip" requests, but you waste more disk space due to internal fragmentation) or one file (with range serving and an index of the byte-range offset for each piece).
The format for each piece can be anything, eg json (simple) or an sstable-like format (more efficient). 
 Content-Length of https://phiresky.github.io/youtube-sponsorship-stats/sql-was...
 hash-sharding means: piece[i] has all the keys where hash(key) % N = i.
 https://github.com/google/leveldb/blob/master/doc/table_form... although they just say "formatted according to the code in block_builder.cc" instead of describing the most relevant part.
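Concretely, the client-side lookup for the hash-sharded layout could look something like this (sketch; the shard file naming and the choice of FNV-1a as the hash are made up):

    // Look up a key in one of N pre-built JSON shard files hosted statically.
    // Shard files are assumed to be named shard-<i>.json and contain a plain
    // { key: value } object.
    const N_SHARDS = 256;

    function fnv1a(s: string): number {
      let h = 0x811c9dc5;
      for (let i = 0; i < s.length; i++) {
        h ^= s.charCodeAt(i);
        h = Math.imul(h, 0x01000193) >>> 0;
      }
      return h;
    }

    async function lookup(key: string): Promise<unknown> {
      const shard = fnv1a(key) % N_SHARDS;             // piece[i] holds keys with hash(key) % N == i
      const resp = await fetch(`shard-${shard}.json`); // static file, cacheable, gzippable
      const data = await resp.json();
      return data[key];
    }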
Fucking amazing, mad props. Beautiful work!
>>> From SQLite’s perspective, it just looks like it’s living on a normal computer with an empty filesystem except for a file called /wdi.sqlite3 that it can read from.
Or more generally: I predict a WASI implementation which will treat ANY server resource as a virtual file, replacing REST.
I have a question. It's a 668.8MB database file. What actually happens if the query has to scan 300 MB before finding the right answer? Wouldn't it be better to do the work up front and deliver the answers as static json files? Sure, you lose the flexibility of dynamic queries, but do you really have that flexibility in non-trivial cases (e.g. a 300 MB search)?
I see myself using this in conjunction with a conventionally hosted pg db for dynamic content.
Although one should be aware of one very important git behavior - git does not diff binary files (like SQLite dbs). That means 2 things:
1. Each db update will generate a new file in git, maintaining the whole old file in history, instead of the diff in bytes. This will accumulate a lot of clutter in the repo
2. As git does not diff binaries, there is a very small risk of corruption (especially if you work in multiple OSs, because of CRLF)
Ref - https://robinwinslow.uk/dont-ever-commit-binary-files-to-git
Note that the diffs that git shows you are completely unrelated to the deltas it uses to compress its database - which are always "binary deltas" and not line-based diffs.
Also I'm not sure why you think the possibility of db corruption has something to do with whether or not it stores diffs?
/edit: The sibling comment mentions that git can in fact delta-compress older commits for storage efficiency. But my point was that git commits are not deltas but full snapshots.
I moved from the US to Brazil 3 years ago, and I still notice the latency when a site runs its backend in only one location. This clever solution makes interacting with the graph super snappy, even compared to enterprises that do have database servers in Brazil. Very impressive!
If queries make use of indices, only a fraction of the database needs to be downloaded.
Also, you can use SQLite to query the DOM.
- SQLite opens the file and reads 4kb worth of header -> range request for bytes 0-4096
- headers/metadata refers to index table with root node at 8192kb
- user issues SELECT * from index WHERE name = 'foo'
- SQLite reads root node from the file (range request for 8192kb..)
- Root node indicates left branch covers 'foo'. Left branch node at address 12345kb
- Fetch left branch (range request for 12345kb)
- New node contains an index entry for 'foo', row 55 of data page at 919191kb
- SQLite reads data page (range request for 919191kb..)
etc etc etc
For the example the wdi_data table is 300MB and an index on it is 100MB in size. This index has a tree depth of 4 - which means SQLite has to read exactly 4 pages (4KiB) to get to the bottom of it and find the exact position of the actual row data.
you can check the depth of the b-trees with `sqlite3_analyzer`.
This is really a great hack: using a system in an unexpected and creative way for which it was not originally designed, but which is also very useful and nice.
But my question revolves around databases assuming that the disk they access is local or at least a fast network storage. I wonder if there are any databases optimized to access slow storage over low bandwidth, where you're really trying to optimize the amount of data read more than anything else.
Actually, this sort of partial access (i.e. partitioning) is rather easy to implement by addressing separate data files by name, instead of using numeric ranges into a database. Basically just put the data into files named by the years (in my example); or bucket the data into chunks of arbitrary size and use the chunks as files. Elementary to extend this to multiple fields in the index. In short, partitioning based on actual field values can be much easier in the static-http approach than using opaque ranges. Probably also more effective if something like http2 allows requesting several files in one request—since you can avoid requesting too little or too much.
There are of course other similar libraries too.
EDIT: Whoops, just saw a few comments below Lunr is already mentioned.
Indexes are usually b-trees; it finds what and where the relevant index is (probably from the schema queries), then goes trawling through the b-tree.
That’s how db engines normally work; they don’t just read through the entire db contents.
If there’s no index then it finds the start of the table and scans it sequentially (which is exactly what it sounds like).
Yes the web server supports range requests. Yes it only returns 50kb.
But what mechanism is letting it scan to just those ranges in the binary file. Doesn't the file system make you seek to the starting block and then read from there?
The point is, while it looks very efficient, there might be a crap ton of IO going on for your little 50kb response.
EDIT: probably the webserver is doing an fseek(), and performance will vary based on file system impl. This is something I will need to benchmark.
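For reference, this is roughly what a static file server does for a Range request: parse the header, do a positioned read, and stream just that slice (minimal Node sketch, single-range only, no validation):

    // Real servers (nginx, Apache, S3, ...) do the same thing with more care:
    // multi-range, If-Range, ETags, and so on. The underlying read is a
    // positioned read (seek + read), so only the requested blocks are touched,
    // plus whatever the filesystem and page cache read ahead.
    import * as http from "http";
    import * as fs from "fs";

    const FILE = "wdi.sqlite3";

    http.createServer((req, res) => {
      const size = fs.statSync(FILE).size;
      const match = /^bytes=(\d+)-(\d*)$/.exec(req.headers.range ?? "");
      if (!match) {
        res.writeHead(200, { "Content-Length": size });
        fs.createReadStream(FILE).pipe(res);
        return;
      }
      const start = Number(match[1]);
      const end = match[2] ? Number(match[2]) : size - 1; // Range end is inclusive
      res.writeHead(206, {
        "Content-Range": `bytes ${start}-${end}/${size}`,
        "Content-Length": end - start + 1,
      });
      fs.createReadStream(FILE, { start, end }).pipe(res); // stream just this slice
    }).listen(8080);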
perhaps read-only should be added to the title
I'm also surprised that Github Pages lets you present arbitrary JS to the point where you can upload SQLite as WebAssembly. Isn't this dangerous?
That said, all you need to query an SQLite database on the server side is a simple PHP script (as opposed to a running db server), and most static hosting providers offer PHP support.
For example, on a 1 megabit/s link with 300 ms RTT, one example would take about 2 seconds for the data transfer itself while spending another 3 seconds or so on waiting. Downloading the entire file would take around an hour and a half.
For your 10 MB database, transferring it as a whole would take 80 seconds. Assuming this solution instead needs to read e.g. 250 kB (taking 2 seconds to transfer), it could still bounce around 250 times to the database before those 10 MB are fully downloaded. (This would be a really odd query, since it would only read on average two pages per read request)
In my experience SQLite databases of millions of rows of raw tabular data tend to compress very well into dozens of megabytes. Indeed SQLite is often touted as a file format for applications.
SQLite is most interesting not when the database is small, but when there are very few writes and all you do is reading. You can also look at https://datasette.io/ and see how SQLite is perfect for representing a lot of datasets and querying them
Most workstations have GBs of available memory. If not, you can dump it into IndexedDB as a raw data store.
I never disputed that it would be useful for some use cases. I only said it would be unusable with high latency links. If you have a low latency link and aren’t running very complex queries with lots of random seeks, then this should work fine for you.
We're using them as a replacement for leveldb's sstables, but with the structure of full SQL. It is highly effective.
> The fact that SQLite continues to reach for more users is what has made it such a successful general-purpose tool.
I never disputed this. You’re responding to a straw man.
> this would be unusable over high latency links.
That is objectively true
> SQLite databases of pure data usually aren’t over 10MB in size.
No one here has refuted this point.
Any other counterargument is addressing a claim I did not make.
But most high latency links are very slow (so downloading large databases is a horrible experience) and (more importantly) are often priced by the size of downloads.
That’s false. Not all web applications suffer equally from high latency links. It depends on how reliant the web application is on dependent requests. Making one request and receiving a single bulk download is much less bad than making many dependent requests on a high latency link.
It’s certainly possible that people are using SQLite databases with sizes on the order of gigabytes but in my experience those are the exception not the rule.
BTW, SQLite has a (theoretical?) max size of 140TB! (or so I've read)
A more relevant question might be: what would the average size of SQLite databases for web type (or even this specific use case) applications. I don't know, but 10s or 100s of MBs might not be a bad guess.
But for a more open or a hobbyist project, where minimizing data transfer is less important than minimizing the amount of extra work (server-side code), this seems like a decent solution.
It is also worth noting that this solution only practically works for read-only databases.
Why do you think this?