
Better-sqlite3: A faster Sqlite library for Node.js - jeswin
https://github.com/JoshuaWise/better-sqlite3
======
jlongster
It's mentioned in other comments but I'm going to repeat it as a top-level
comment because it still doesn't seem clear enough to most people: sqlite's C
API is synchronous and doesn't use threading, and thus _it will always block
the node process no matter what_. Throwing an async API on top of it is
useless - all your doing is making it slower by deferring the computation by a
few ticks but when it does run, it's still going to block everything!

It's definitely a good example of buying into the "hype" of async without
really understanding what's going on. Async is helpful when you need to wait
on something in the future (like new packets come in from a socket) but don't
want to block the whole process while waiting. It doesn't make sense at all
for CPU work that's going to block it regardless.

I've lightly contributed to both better-sqlite3 and node-sqlite3 and the
latter's async implementation makes it much more confusing and difficult to
work with. And it slows things down considerably.

I switched node-sqlite3 with better-sqlite3 in my Electron app and found non-
trivial performance gains. Make sure to run it in a separate process - never
run sqlite in the main thread or the same renderer process as your app. If you
run it in the main thread, it still blocks the renderer process. I wrote an
article about this: [https://medium.com/actualbudget/the-horror-of-blocking-
elect...](https://medium.com/actualbudget/the-horror-of-blocking-electrons-
main-process-351bf11a763c)

I'm glad somebody finally made a robust synchronous API to sqlite.

~~~
zbjornson
Just because SQLite's API is synchronous does not mean that it must block the
node process. libuv manages a thread pool, which node uses to offload sync
APIs and let the main event loop run. This is described here under "What code
runs on the Worker Pool": [https://nodejs.org/en/docs/guides/dont-block-the-
event-loop/](https://nodejs.org/en/docs/guides/dont-block-the-event-loop/)

~~~
tkone
You don't just get the threadpool by default when you write a native add-on.
The native add-on needs to opt into the behavior. I can't find a single place
in this source code where they're using the threadpool to schedule synchronous
code to run in a separate thread.

~~~
zbjornson
Indeed, you have to write your native code to use it. My point is that you can
probably write an even better sqlite library by using the thread pool, and
that sync APIs can be made to act async.

~~~
jlongster
The fear is justified that somebody would naively use this library on a
network-heavy app and block the event loop. The readme should probably do a
better job describing how this should be used. It's not too hard to run it in
a separate process and create a "db service" that you query async. For many
small apps or certain use cases the sync api shouldn't be a problem.

------
masklinn
> Easy-to-use synchronous API (faster than an asynchronous API... yes, you
> read that correctly)

Isn't that entirely unsurprising?

The point of async is not that it's faster, it's that other shit can run while
you do your IO. Of course your sequence of operations is faster when you never
yield to other tasks.

~~~
gwbas1c
Uhhh, read the description.

node-sqlite3 uses an asynchronous API for CPU-bound functions.

An asynchronous API is only for IO-bound functions. CPU-bound functions should
be synchronous.

~~~
Klathmon
Well even CPU-bound async functions can have benefits in node.js as they can
free up the single javascript thread to do other work while it does the CPU
bound work on another thread in the native code.

It's definitely a bastardization of the terms a bit, but at its core it's
basically a message-passing multi-process system that hides the implementation
behind the well-understood (to most programmers that use node) async i/o

------
matharmin
I've considered this for an Electron app, but the synchronous API is actually
an issue for a responsive UI. It it's probably possible to work around it by
using a separate worker process for the database, but then you're just back to
doing things asynchronously.

With async-await being common now, a synchronous API is also not required to
have an easy-to-use API.

node-sqlite3 does have a massive performance problem when inserting 100 rows
in a transaction - the overhead on the JS side can be around 10x more than the
time required by SQLite. However, all that's needed to solve this is a batch
API: a way to run 100 insert statements in a transaction with only a single
callback at the end. The performance problem is really not about synchronous
or asynchronous - it's the fact that we need the same overhead for every
single insert in the transaction.

I've eventually worked around this issue by just combining multiple insert
statements into a single one. For example, instead of doing 3 separate
inserts, do one with `VALUES (?,?), (?, ?), (?, ?)`. While the code to achieve
this is ugly, it gets similar performance to better-sqlite3, without being
synchronous.

~~~
matharmin
[UPDATE] Based on jlongster's comment, it may still be best practice to run
this as a separate process, regardless of the module being used, since node-
sqlite3 isn't truely asynchronous.

------
wruza
>Easy-to-use synchronous API (faster than an asynchronous API... yes, you read
that correctly)

Who the hell decided that making sqlite asynchronous is a good idea in the
first place? Node is so religious on async that creating everyday
transactioned apps in it is PITA that never ends.

~~~
hliyan
Asking out of ignorance: won't a synchronous API block the entire process
while the I/O is going on?

~~~
masklinn
Yes. Unless the underlying reactor/scheduler is multithreaded but that's…
uncommon, as async code tends not to consider thread-safety.

It certainly isn't the case for nodejs.

One could wonder whether that's an issue when IO are fairly short & strongly
interspersed with code being run though, the overhead of switching tasks could
be higher than the IO you're synchronising. I guess whether this is a good
idea would strongly depend on _how_ you're using sqlite (aka do you tend to
make lots of very simple requests with small output or a smaller number of
more complex ones).

~~~
zbentley
> One could wonder whether that's an issue when IO are fairly short & strongly
> interspersed with code being run though

IMO one of the core observations made by Dahl and the other Node.js folks was
that this is _very very rarely_ the case in modern web applications. Most
webapps either a) spend the vast majority of their time waiting on IO, b) have
non-IO code very regularly interspersed with IO operations (making event-loop
based interleaving based on IO boundaries more useful) or c) both.

As a result, Node.js was built with the single-thread/IO-bounded reactor
model. I think that this, along with (and perhaps even more than) its choice
of language syntax, drove the platform's adoption. It means that code doesn't
need to think about thread safety, and that there is only _one_ IO model:
asynchronous by default (yes there are exceptions, but they're a minority).
That lets users get a lot of scale for free, provided their workloads meet
those criteria.

------
staticelf
It's probably a fantastic lib, and well done. But..

What if node-sqlite redesign it's library and gets even faster than this? Then
it won't be better anymore? I think it's a bit of a bad name for a lib to have
since stuff like this changes with time.

~~~
hnarn
You're absolutely correct. I mean, they could argue that "best" is still not
taken but objectively it's just bad naming practice to use relative terms.
Almost anything else would be better, they could call it "Blue Whale Sqlite3"
(not a good name) and it would be more iconic and neutral.

~~~
genezeta
They could call it "synchronous-sqlite3" since that seems to be the main
relevant difference.

------
sologub
Maybe not a big deal for many, but this is the first sqlite lib for Node.js
that supports User Defined Functions, would be nice to have an example though.

Anyway great work!

------
zubairq
I tried switching from sqlite3 to better-sqlite3 but found that fts didn't
work so it was a deal breaker for me, has anyone else got full text search to
work with better-sqlite3 as that would make me switch?

------
ivanfon
I'd love to see a benchmark against Sequelize as well.

Not sure how I feel about it not being async...

~~~
matharmin
Sequelize just exposes additional APIs on top of node-sqlite3.

------
always_good
The module name is a good example of why package systems should have qualified
names by default. `npm install JoshuaWise/sqlite3`

Elm got this right. You shouldn't have to be clever in thinking of a package
name just because the obvious one was taken and you have a different flavor to
offer.

Rust didn't and it's another ecosystem where people feel the need to squat on
names or suffix a '2' when they think they've written a better library, just
like NPM.

~~~
kryptiskt
And then you can't transfer a package to a new organization without stranding
users. Java showed how misguided using the domain name in the package name
was, organizations have far less permanence than software.

~~~
always_good
This can be managed in many ways, like identity immutability, package
redirection, and solved from a UI standpoint like "a/foo was moved to b/foo"
on your user's next package update because the package manager has first-class
support for it.

Did you think nothing can be improved upon Java?

