
RxDB – a real-time database on top of PouchDB - typingmonkey
https://github.com/pubkey/rxdb/blob/master/README.md
======
lukevp
The concept of RxDB, being able to iterate observables of your change stream,
is great. Centering on schemas and typescript is as well. The broadcast
channel based leader election also solved the common issue with Pouch where
you can’t respond to real-time changes and update your UI if a separate tab
was watching too.

It’s wise of them to also support pulling data from GraphQL. I built the first
version of NoteBrook on top of couch/pouch, and the biggest pain points were:

1\. Pouchdb was before async/await and typescript. The typings can be
inconsistent, and it’s very difficult to properly manage the lifetimes of
local databases because of the promise chaining.

2\. A database technology for real-time replication needs ACLs on a per-document
basis. I built provisioning scripts to manage separate databases per
user, as suggested, and it’s very cumbersome. It prevents me from lots of data
sharing models like promoting one record to public view, or sharing a record
with another user in a different tenant.

3\. Personally I feel that the naming / marketing of the product is poor. The
names (couch, futon, fauxton, pouch, couchbase) do not feel like
professional-grade products I can depend on to run a business.

I think it would be good for RxDB to support its own backend technology and
move away from PouchDB, and in the meantime, de-emphasize the relationship
with PouchDB. The RxDB product doesn’t require you to use PouchDB and at this
point it’s got a lot of buzz around it and doesn’t need that tie to PouchDB to
continue.

I feel an ideal library would offer a real-time and offline client db, user
auth, and billing, tied to a backend database of my choosing (eg. Postgres,
MariaDB). The data layer should allow me to specify the durability
requirements of each write (down to the specific entity type) and enforce an
atomic commit on all related records in one transaction at the level of the
most strict entity in the commit. It should be trivial to shard because the
syncing protocol the clients use itself should be available to the server.
Multitenancy should be built in. It should support real-time ephemeral in
memory data for collaboration/chat scenarios, with optional durability that’s
eventually consistent. Schema definitions and migrations should be built in to
the product.
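
The "commit at the level of the most strict entity" rule above can be sketched in a few lines. This is only an illustration of the idea, not an existing library: the `Durability` levels, the `policy` map, and the entity type names are all hypothetical.

```typescript
// Hypothetical durability levels, ordered from weakest to strictest.
enum Durability {
  Ephemeral = 0, // in-memory only
  Eventual = 1,  // persisted asynchronously
  Strict = 2,    // persisted before the commit is acknowledged
}

interface PendingWrite {
  entityType: string;
  payload: unknown;
}

// Assumed per-entity-type policy; the names are illustrative only.
const policy = new Map<string, Durability>([
  ["chatTyping", Durability.Ephemeral],
  ["note", Durability.Eventual],
  ["invoice", Durability.Strict],
]);

// A transaction commits at the strictest level among its writes.
// Unknown entity types fall back to Strict so that nothing is under-protected.
function commitLevel(writes: PendingWrite[]): Durability {
  let level: Durability = Durability.Ephemeral;
  for (const w of writes) {
    const required = policy.get(w.entityType) ?? Durability.Strict;
    if (required > level) level = required;
  }
  return level;
}
```

With this policy, a transaction touching a `note` and an `invoice` would commit as `Strict`, while one touching only `chatTyping` would stay `Ephemeral`.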

If you have similar ideas and are interested in working with me on this
project, or hearing more, drop me a line.

~~~
typingmonkey
Actually, the long term plan is to move away from pouchdb. Before the last
major release [1] I decided to make RxDB useable with other frontend/backend
databases but then found out that I first should refactor some pain points.

[1]
[https://github.com/pubkey/rxdb/issues/1636](https://github.com/pubkey/rxdb/issues/1636)

~~~
lukevp
That’s great to hear! Are you pubkey? If so you should put that in your
profile!

I hope you know that I think RxDB is a great lib and going in a good
direction! I recommended it to the Supabase team a couple weeks ago. I am less
long on PouchDB; it feels like an old generation of this tech and not the best
way to solve these problems in 2020.

~~~
typingmonkey
I updated my profile. I do not know Supabase, is it similar to Hasura? They
spent quite some effort [1] to make the RxDB GraphQL replication work with
their backend. Maybe Supabase can use a similar wrapper over GraphQL.

[1] [https://hasura.io/learn/graphql/react-rxdb-offline-
first/int...](https://hasura.io/learn/graphql/react-rxdb-offline-
first/introduction/)

~~~
adav
Cool! I was just thinking that RxDB + Hasura would be pretty nifty for an
offline-first MVP and here the legwork has already been done :)

------
diegoperini
Looks amazing. I always loved how Rx composes with IO on the client side, this
looks like the missing half. I hope it survives.

For people who are soon to be choosing a stack for their projects, please be
careful. In 2015, we adopted RethinkDB which had very similar ambitions as
RxDB and was open source. Unfortunately, RethinkDB is now abandoned (kinda).
Many promising subscribe-able databases are still experimental. If you are
thinking long-term, you may want to consider more boring options.

~~~
remon
Subscribe-on-update databases are, almost by definition, problematic to use at
scale as a generic storage solution. The fundamental problem is that they do
not solve many real world problems efficiently enough to warrant the
significantly higher running cost. Of course there are exceptions but it'll be
hard to launch a MongoDB type of product that uses a subscription only model
(see Firebase RTDB and its problems and lack of adoption in this space)

The reason developers gravitate towards subscription/rx based paradigms is
because it results in very clean architecture and code. Unfortunately it comes
at a cost which increases rather than decreases per client/user when volume
increases. Some companies or projects can absorb that cost but not all, and
typically less so when the project or its userbase grows.

A subscription based model will do work whenever data changes for each
subscriber whereas more traditional pull based architectures only do work when
a client specifically needs the data. This can be mitigated to some extent by
micromanaging subscriptions, but that kills most of the value of the
model.

There are also plenty of issues with this model if multiple clients are
allowed to write to the same data which every single example project seems to
try and do. There's a reason master-master updates, consensus algorithms and
CRDTs all come at significant cost. It's usually hard. And when it's easy you
probably don't need subscribe-on-update in the first place.
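
The push-versus-pull trade-off described in this comment can be made concrete with a very rough cost model. It is a sketch under strong assumptions: one "work unit" per delivered change or per answered poll, and no batching, caching, or fan-out optimisation. The function names are made up for illustration.

```typescript
// Push: every change is delivered to every subscriber.
function pushWork(changes: number, subscribers: number): number {
  return changes * subscribers;
}

// Pull: every poll costs one query, whether or not the data changed.
function pullWork(pollsPerClient: number, clients: number): number {
  return pollsPerClient * clients;
}

// 1,000 clients over one hour, polling every 10 seconds (360 polls each):
const pull = pullWork(360, 1000);        // 360,000 units
// Rarely changing data (10 changes in that hour):
const pushQuiet = pushWork(10, 1000);    // 10,000 units
// Frequently changing data (10,000 changes in that hour):
const pushBusy = pushWork(10000, 1000);  // 10,000,000 units
```

Under these assumptions, which model is cheaper depends on how the change rate compares to the polling rate, not on the model alone.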

~~~
joshribakoff
I think you’re oversimplifying it. I was on teams that participated in
architecture design for this stuff at Twitch. If we had 100,000s of clients
polling at the same moment, that would absolutely knock over the server. The
canonical “stupid easy” fix is to add jitter, which can be done on either
client or server.

In fact the pull based approach is doing work all the time to process all the
polling. A push based approach only does work when needed (when data changes).

I’m not saying it’s not new or hard to scale. I’m just saying objectively that
push based is more efficient at least in terms of raw data sent down the wire

------
mauflows
I love the idea of couchdb, but the ecosystem centers too much on pouch, which
doesn't have consistent maintenance. If IBM were smart they'd sponsor it in a
big way.

~~~
typingmonkey
Yes, PouchDB does not get the love it deserves. But it is still actively
maintained; the last bug I had was fixed promptly by the community and merged.

------
yen223
I've experimented with RxDB in a side project of mine, a spaced-repetition web
app that is designed to be usable offline, with background sync between
clients. The offline-first requirement meant using something like IndexedDB to
store state client-side, which is where RxDB comes in.

Overall I like the library. It makes the hard task of interacting with
IndexedDB a lot more pleasant. The GraphQL support is also a nice touch.
Coupled with something like Hasura, it took me like a day to get sync working.

My one criticism with RxDB specifically is that the documentation around
writing queries can be a bit hit-or-miss.

------
shellac
Title probably ought to be "a _reactive_ database". Although, used properly,
you should be able to reduce latency, it doesn't make any guarantees.

~~~
typingmonkey
Yes and no. Read "realtime" not like "realtime computing" but like a marketing
keyword introduced by firebase which means something like live-replication.
[https://firebase.google.com/docs/database?hl=en](https://firebase.google.com/docs/database?hl=en)

------
aabbcc1241
Saw RxDB before but it feels too heavy for my small projects. Would try that
if it can support leveldb/sqlite as storage backend in the future.

My typical approach is directly logging onto fs with replaying upon restart
for small projects, with fs based json store / leveldb for more demanding
tasks. For non-trivial scale projects, RethinkDB is a fair choice.

------
haolez
What's the advantage of this over triggers in a relational database? Or even
notify/listen in PostgreSQL? (assuming these triggers are connected to the
API, of course).

I guess it's probably a matter of scale, but I really don't know.

~~~
typingmonkey
It runs fully on the client and is offline first. A listener to PostgreSQL
will not work when the device goes offline.

~~~
snthpy
Fair enough, so what about triggers on a sqlite db on the client?

~~~
typingmonkey
There is a big difference between having a changestream of writes to the
database, and observing results of multiple queries.
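
The distinction can be illustrated with a toy in-memory store. This is a deliberately naive sketch (it re-runs the query on every write and compares serialized results); `TinyStore` and its methods are invented for the example and say nothing about how RxDB implements query observation.

```typescript
type Doc = { id: string; done: boolean };
type ChangeListener = (doc: Doc) => void;

class TinyStore {
  private docs = new Map<string, Doc>();
  private listeners: ChangeListener[] = [];

  // Change stream: fires once per write, whatever the write was.
  onChange(listener: ChangeListener): void {
    this.listeners.push(listener);
  }

  upsert(doc: Doc): void {
    this.docs.set(doc.id, doc);
    for (const listener of this.listeners) listener(doc);
  }

  find(predicate: (d: Doc) => boolean): Doc[] {
    return Array.from(this.docs.values()).filter(predicate);
  }

  // Observed query: emits the current result, then emits again only
  // when a write actually changes the result set.
  observeQuery(predicate: (d: Doc) => boolean, emit: (results: Doc[]) => void): void {
    const initial = this.find(predicate);
    let last = JSON.stringify(initial);
    emit(initial);
    this.onChange(() => {
      const results = this.find(predicate);
      const next = JSON.stringify(results);
      if (next !== last) {
        last = next;
        emit(results);
      }
    });
  }
}

const store = new TinyStore();
let writes = 0;
const emissions: Doc[][] = [];
store.onChange(() => { writes++; });
store.observeQuery((d) => !d.done, (results) => { emissions.push(results); });

store.upsert({ id: "a", done: false }); // changes the "not done" result
store.upsert({ id: "b", done: true });  // a write, but the result is unchanged
store.upsert({ id: "b", done: true });  // same again
// writes is 3; emissions.length is 2 (the initial empty result, then [a])
```

A trigger-style change stream sees all three writes, while the query observer only hears about the one that affected its result.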

------
dezmou
I don't see any authentication-related stuff like PouchDB has, such as
creating a user with a password hash, automatically handling cookies in the
browser, or restricting documents to a user or group.

------
b1ackb0x
But is it really "realtime"? I thought something should have consistent
millisecond or even microsecond latencies to be called realtime, but that's
probably not possible for a JavaScript application.

~~~
joshribakoff
If you wanted to be pedantic, nothing is real-time, not even what you see with
your eyes, due to photon latency. The word is clearly being used colloquially.

~~~
b1ackb0x
It is clearly used in a misleading way, since this database is not about
latencies or time at all.

~~~
SirSavary
I know few people that use "realtime" to mean it in the "realtime OS" sense.
It's usually used in layman's terms to mean "happening live".

RxDB isn't being misleading, it's another valid usage of the term.

------
remon
How is this different from Firebase RTDB? And perhaps more importantly, does
it address the scalability and consistency issues associated with Firebase
RTDB? Google introduced Firestore to Firebase specifically because RTDB has
limited usability for larger real world applications that go beyond "sync
device state to DB".

Even the offline first paradigm is fundamentally flawed in general and
certainly when it comes to offline data manipulation. Either you can afford to
mutate your data on the local device and sync it when possible, in which case
you clearly don't need subscriptions to real-time mutations of remote data
because within that scope you are the source of truth. Or you're interested
in mutations of real-time data from multiple clients, in which case you need to
deal with conflict resolution (mutually exclusive changes of data), which is
not reliably possible with this model, and with scalability (linear increase in
pub/sub * increasing query cost = exponential scaling).

Are there any large projects or companies that currently have this in
production?

~~~
jchrisa
In-flight business objects for major airlines is probably considered scale:
[https://www.couchbase.com/customers/united-
airlines](https://www.couchbase.com/customers/united-airlines)

~~~
evan_
That's CouchDB, not PouchDB. PouchDB is a JavaScript implementation of
CouchDB.

~~~
Graphguy
That's also Couchbase and not CouchDB. They forked off CouchDB many years ago.

Cloudant (API Compatible with CouchDB) has a number of case studies you can
reference for production success with the Couch API/ecosystem.

Cabify - [https://www.ibm.com/case-studies/cabify-
cloudant](https://www.ibm.com/case-studies/cabify-cloudant)

Ticket Fairy - [https://www.ibm.com/case-studies/the-ticket-fairy-cloud-
clou...](https://www.ibm.com/case-studies/the-ticket-fairy-cloud-cloudant)

We.Trade - [https://www.ibm.com/case-studies/wetrade-blockchain-
fintech-...](https://www.ibm.com/case-studies/wetrade-blockchain-fintech-
trade-finance)

Disclaimer: I work for IBM Cloud

