Slava Akhmechet – What's coming up for RethinkDB in 2016 [video]

szastupov · on Feb 29, 2016

I really dig the idea of subscribing to queries. I wish I had known about RethinkDB a year ago so I could have used it in my current project :/

eternalban · on Feb 29, 2016

Is that a neologism for DB Views? [edit: don't have the clicks to watch the OP]

true_religion · on Feb 29, 2016

It's kind of like PubSub. Whenever you subscribe to a query, you get updates on its results in soft-realtime.
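To make the idea concrete, here is a minimal sketch in plain Python. It is not RethinkDB's API (RethinkDB calls this feature changefeeds); the `Table` class and its `subscribe` and `insert` methods are hypothetical names invented for illustration. A subscription is a query predicate plus a callback, and every write is pushed to the subscribers whose query matches it.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Predicate = Callable[[Row], bool]
Callback = Callable[[Row], None]


class Table:
    """Toy in-memory table (hypothetical) that pushes writes to subscribed queries."""

    def __init__(self) -> None:
        self.rows: List[Row] = []
        # Each subscription pairs a query predicate with a callback.
        self.subscriptions: List[Tuple[Predicate, Callback]] = []

    def subscribe(self, predicate: Predicate, callback: Callback) -> None:
        """Register interest in every future row that matches the predicate."""
        self.subscriptions.append((predicate, callback))

    def insert(self, row: Row) -> None:
        """Store the row, then notify each subscriber whose query matches it."""
        self.rows.append(row)
        for predicate, callback in self.subscriptions:
            if predicate(row):
                callback(row)


table = Table()
received: List[Row] = []

# Subscribe to the "query" score > 10; matching inserts are pushed to us.
table.subscribe(lambda row: row["score"] > 10, received.append)

table.insert({"name": "a", "score": 5})   # does not match, no notification
table.insert({"name": "b", "score": 42})  # matches, pushed to the subscriber
```

The difference from plain PubSub is that the subscriber names a query rather than a channel, so the database decides which writes are relevant.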

sidi · on Feb 29, 2016

We do this at https://appbase.io, but with ElasticSearch. (disclaimer: I am the co-founder)

infocollector · on Feb 29, 2016

We use RethinkDB in our current project: Python + timeseries data + mining. Pretty happy with its out-of-the-box speed compared to MySQL.

eis · on Feb 29, 2016

My experience was pretty much the opposite. For timeseries data in particular, Rethink's sequential scan performance was pretty bad, especially once you add filters, grouping, etc. Postgres gives me about 3-10x the speed.

And that takes into consideration that Rethink uses all cores and Postgres only one. Plus I used jsonb in Postgres to make it totally fair.

On top of that, I encountered really inconsistent performance. For example, when you access a field that doesn't exist on an object, it apparently raises an exception internally, and that slows things down a lot. But you don't really see this.

I get that Rethink is trying to distinguish itself from the other players and find a nice niche for itself, but I really wish they would improve performance and, while at it, remove the limit of 32 servers in a cluster.

Oh, and the huge blowup of data on disk also needs a lot of work. Rethink used about 5x more disk space than PG in my trials for a table with about 10 million rows, and with fewer indexes.

I really like the ops simplicity of Rethink, especially when it comes to clustering. So I have high hopes for it.

coffeemug · on Feb 29, 2016

Slava @ RethinkDB here.

The challenge with database systems is that essentially they're complete runtimes -- the user can send a huge variety of queries which are programs that need to be executed by the runtime. So there is no way to fix performance once and for all -- it's a continuous process that never stops. We've worked out a lot of performance issues since then, and keep working on them every day. For anyone reading this, if there is a particular workload that's slow, please let us know and we'll work to prioritize it ASAP.

JDDunn9 · on Feb 29, 2016

I just switched from RethinkDB to Postgres as well, although my issues were more with NoSQL than RethinkDB specifically. With JSON support in PG, and the ability to let Amazon RDS handle DB admin/scalability, I don't see any reason to sacrifice ACID compliance and security on NoSQL.

don71 · on Feb 29, 2016

Just out of interest, why did you only use one core for Postgres?

eis · on March 1, 2016

I measured one query that does a sequential scan over millions of rows. In PG 9.4 this only uses one core. Current versions are adding a feature to parallelize this. If I had run several queries in parallel, PG would have used all cores.

cryptica · on Feb 29, 2016

This is pretty impressive. The inability to decouple the user-facing realtime transport layer from the database engine is one of the main things that has been holding big companies back from adopting Firebase (due to Firebase's tight coupling with MongoDB and the associated lock-in factor).

I imagine that eventually you would be able to hook fusion into any database engine or service you like. If you don't want to put your sensitive data on a public cloud, you could just use fusion as-a-service but host the database part yourself.

coffeemug · on Feb 29, 2016

Slava @ RethinkDB here.

> I imagine that eventually you would be able to hook fusion into any database engine or service you like

That's definitely the plan.

Scarbutt · on Feb 29, 2016

So will they endorse deploying only the fusion server (no expressjs, koa, etc. middleware) with your frontend code in production? (assuming the fusion API is enough for your app)

coffeemug · on Feb 29, 2016

Slava @ RethinkDB here.

The main goal behind Fusion is to let people get started quickly, but eventually any meaningful app will outgrow the Fusion API. When that happens, instead of starting Fusion as a stand-alone server on top of RethinkDB, you'll be able to load it into Node as a library and integrate it with koa/hapi/etc. That will provide the Fusion API to your node app, and an upgrade path for modern complex applications. That's an essential part of Fusion and will ship along with v1.

renke1 · on Feb 29, 2016

Any tl;dw?

krstffr · on Feb 29, 2016

My main takeaway: They're creating a thing called fusion (will be renamed) which will allow you to spin up a rethinkdb instance with a single command, which you can then connect to directly from your browser/JS frontend, no backend needed (kind of like meteor with insecure/autopublish, but with rethinkdb instead of mongo and no backend whatsoever). The benefit, I guess, is that you can create an app very quickly: just create a frontend and you're good to go.

I suppose you can also set up rules as to what operations should be allowed from the clients, so they can't just loop an insert to destroy the DB, but not much was said about security/auth etc.

I think he said they plan to release a first version in two months.

EDIT: Firebase is actually what he compares it to! But since I've worked a lot in meteor that's my first association :)

coffeemug · on Feb 29, 2016

Slava @ RethinkDB here.

Fusion will ship with a security model -- it will absolutely not be insecure out of the box. This type of project would be pretty useless without security (beyond basic demos), and a security engine is an essential feature in v1.