
We use RethinkDB - hiphipjorge
http://blog.workshape.io/we-use-rethinkdb-at-workshapeio/
======
_dancannon
I have been using RethinkDB for a while now (although mostly for small side
projects + maintaining the Go driver
[https://github.com/dancannon/gorethink](https://github.com/dancannon/gorethink))
and have really enjoyed it; having a functional query language is quite
refreshing. The recent introduction of change feeds is also really cool,
building a realtime app with websockets was surprisingly easy.

Honestly I would recommend RethinkDB to anybody looking to start a new (small
to medium-sized) project. While there are some small performance issues, this is
to be expected for a project at this early stage and after seeing how the
RethinkDB team works I am confident that these will be sorted pretty quickly.

------
simonpantzare
We have used RethinkDB in production for a handful of months now. 100M docs,
250 GB data spread out on two servers.

We added it to the mix because it got increasingly difficult to tune SQL
queries involved in building API responses, especially for endpoints that
needed to pull data from many tables.

Our limited experience of MySQL operations was also a factor. We're on 5.5 and
couldn't do some table operations that seemed promising without service
disruptions. There were solutions to perform the actions we wanted without
downtime but they scared us a bit. We also looked into upgrading to 5.6 or
MariaDB but that seemed like it would take a long time and need much testing,
while there were no guarantees that we would see performance gains.

We looked for alternative solutions and found RethinkDB. We reused the parts
that serialize data for the API and put the resulting documents in RethinkDB.
Then we had our API request handlers pull data from there instead of from
MySQL and added indexes to support various kinds of filtering, pagination, and
so on. We built this for our most problematic endpoint and got the two-server
cluster up and running in about a week, tried it out on employees for another
week, and then enabled it for everyone (with the option to quickly fall back
to pulling data from MySQL).
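
A minimal sketch of the read path described above, in plain JavaScript with in-memory stand-ins. The names `docStore`, `sqlSerializer`, `flags`, `syncDocument`, and `getItem` are hypothetical and only illustrate the idea of serving pre-serialized documents with a quick fallback to the relational source; this is not the actual setup.

```javascript
// Hypothetical in-memory stand-ins for the two stores; in the setup described
// above these would be a RethinkDB table and the MySQL-backed serializer.
const docStore = new Map(); // id -> pre-serialized API document
const sqlSerializer = {
  // Assumed helper: builds the API document from relational rows.
  buildDocument(id) {
    return { id, title: `Item ${id}`, source: "mysql" };
  },
};

// Feature flag so reads can be switched back to MySQL quickly.
const flags = { readFromDocStore: true };

// Called after a write lands in MySQL: re-serialize and store the document.
function syncDocument(id) {
  const doc = { ...sqlSerializer.buildDocument(id), source: "docstore" };
  docStore.set(id, doc);
  return doc;
}

// Request handler: prefer the document store, fall back to MySQL.
function getItem(id) {
  if (flags.readFromDocStore && docStore.has(id)) {
    return docStore.get(id);
  }
  return sqlSerializer.buildDocument(id);
}
```

Flipping `flags.readFromDocStore` to `false` sends every read back to the serializer, which mirrors the "quickly fall back" option mentioned above.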

This turned out to work well and we saw good response times, so we did the
same thing for other endpoints.

There's some complexity involved in keeping RethinkDB docs up to date with
MySQL (where writes still go) but nothing extreme and we haven't had many sync
issues.

RethinkDB has been rock solid and it's a joy to operate.

~~~
otterley
> We're on 5.5 and couldn't do some table operations that seemed promising
> without service disruptions.

Everyone has this problem. But it's been largely solved in practice by
performing the schema changes on slaves, and then promoting the slaves to
master.

Also, if you're just using RethinkDB as a delayed (and almost certainly
inconsistent) secondary storage system, why not use ElasticSearch instead?

BTW, 250GB fits in memory on any decent size box. You're not really going to
see how things scale till you get into the terabytes.

~~~
xordon
I don't think 250GB will fit in memory on any reasonably sized box. What world
do you live in?

~~~
krunaldo
An R720 from Dell, or a similar Dell model, with 2x 600GB Intel DC S3500 SSDs,
20 cores, and 256GB of RAM will go for 5k-7k. You can bump this to 386GB
of RAM without going above 10k.

~~~
davyjones
When I changed the country to Japan, the sticker price jumped from 2000 USD to
15,000 USD eq. for a very basic system. I am just at a loss as to what can
explain this disparity. Guess I will have to call up my vendor to get a
comparable quote.

~~~
krunaldo
My tip is always to try to get in contact with a couple of resellers and play
them off against each other on price.

If you are looking at larger purchases (50k+ USD), then you should talk directly
with Dell, HP, or a comparable vendor and put them into the playoff for who you
choose :)

~~~
davyjones
I always do that. I have worked with Strategic Sourcing for a while...so...
;-)

------
hckr1292
Things I love about RethinkDB as a NodeJS dev:

\- first class support for JS bindings, unlike mongoose which wraps the super
low level mongodb js library into something palatable but crashes in a
horribly undebuggable way.

\- server-side joins

\- a nice web UI for monitoring and running queries packaged up with the
service

\- public docker images that are super simple to run

\- easy clustering

------
felipesabino
From RethinkDB docs [1], I am still a bit confused how this locking system
works for read/write and also a bit skeptical regarding their claim that 'in
most cases writes can be performed essentially lock-free'.

I am using MongoDB and didn't have many issues when my databases had 120,000
documents either, the problem began when we hit the millions... The
combination of write locks and our need for dynamic queries (meaning: we can't
index) made the database the worst performance bottleneck in our system by
far. Although I must be honest that we haven't yet tried MongoDB's new 3.0
version that promises a boost in performance [2] and also has 'document-level
locking and compression' [3]

Is anybody aware of any benchmarks that perform random writes (inserts/updates)
and non-indexed reads for RethinkDB? (Is it even a common usage scenario,
anyway?)

[1] [http://rethinkdb.com/docs/architecture/#how-does-
rethinkdb-e...](http://rethinkdb.com/docs/architecture/#how-does-rethinkdb-
execute-queries)

[2]
[http://www.mongodb.com/mongodb-3.0#performance](http://www.mongodb.com/mongodb-3.0#performance)

[3] [http://docs.mongodb.org/manual/release-
notes/3.0/#wiredtiger...](http://docs.mongodb.org/manual/release-
notes/3.0/#wiredtiger-concurrency-and-compression)

~~~
tracker1
FYI, with MongoDB, just because you can't and shouldn't index everything,
doesn't mean you can't have any indexes... if your most common fields bring
your queries down, they're still pretty helpful.

I actually really like where RethinkDB is headed, and within the year most of
my issues should be resolved.

Another couple of databases to consider, depending on your needs, would be
ElasticSearch and Cassandra... it really depends on your use case.

~~~
coffeemug
Could you list your issues with Rethink? It would really help for product
prioritization.

~~~
tracker1
As mentioned in another thread, automagic failover is the one still pending, and
geospatial indexes/searches (now in the product).

It's wild how many options there are that tailor themselves to all kinds of
data out there.

~~~
pests
I'm not involved with RethinkDB but I lurk on their github issues and I'm
pretty confident that automagic failover is dependent on them getting (their
own implementation of) Raft integrated with everything. Looks like it's getting
close as a whole slew of issues were opened just the other day relating to
Raft work.

------
jamescostian
+1 to using RethinkDB! I'm also using RethinkDB in production, and I love it!
The only issue is that you have to set up _persistent_ filters via iptables in
addition to having an authKey. They do have a guide[0] for that, however they
do not provide any instructions for ensuring that the filters on iptables stay
up, or how to restore them if they are temporarily wiped out :/

[0] [http://rethinkdb.com/docs/security/](http://rethinkdb.com/docs/security/)

~~~
habitue
I created an issue for this on our docs repo[0]. Thanks for the feedback!

[0][https://github.com/rethinkdb/docs/issues/674](https://github.com/rethinkdb/docs/issues/674)

------
simi_
I've been using/following RethinkDB since I started as Lavaboom's CTO. It's
been a smooth ride so far, and the occasional perf improvements are always
welcome. Some aspects of the database are especially lovely, like the web
admin or painless deployments of new nodes, especially if you're using Docker.

Shameless plug (we've just open sourced most of our services):
[https://www.lavaboom.com/](https://www.lavaboom.com/)

------
kolencherry
We use RethinkDB in production and our main frustration lies around the lack
of automatic failover. We're looking forward to 2.0, which is supposed to
bring automatic failover (using Raft for consensus) to RethinkDB.

~~~
coffeemug
Slava @ RethinkDB here.

Unfortunately automatic failover won't be a part of 2.0, but it will happen
very quickly after that. Please hang in there, we expect to ship this feature
some time in May.

I just saw a demo of the failover feature yesterday from Tim Maxwell (the lead
engineer on this), and it's really impressive! Another side benefit of this
feature is live reshards -- you'll be able to reshard/rebalance data without
any availability loss on the cluster.

The code is there and just needs a bit more polish and _a lot_ of testing. I'm
very excited to get this out, it's probably the last part of RethinkDB that
I'm not 100% proud of yet (but will be in a month or two).

~~~
TylerE
You guys are killing it. Wish I had a product I could write around
rethink....currently at the day job our stuff is mostly Mongo....all layered
under django-nonrel with lots of mongo crud so a port wouldn't really be an
option I don't think.

~~~
hardwaresofton
As a person who agrees -- maybe you could write a port/adapter for django-
nonrel to Rethink?

Also, why not start a new greenfield project to test out Rethink? A something
something realtime something geospatial something app should be a fantastic
way to kick the tires, since that's one of the things that Rethink does really
well out of the box (as of 1.15) compared to other databases (relational or
not)

~~~
TylerE
Plenty busy at the moment and I don't generally code in my free time - I work
from home so I really try to maintain that work/home life separation.

------
pquerna
Is anyone using RethinkDB as a "lightweight" Business Intelligence / Analytics
/ Data warehouse store? (Maybe for use cases like Amazon Redshift?)

It seems like there could be a sweet spot of their nice query language,
schemaless, and easy scaling for moderate size data sets?

I'm kinda tired of having to go all in on a big hadoop-ecosystem just to
figure out average X to Y in a dataset....

~~~
coffeemug
Slava @ RethinkDB here.

Some of our biggest users (to be announced in a few weeks for 2.0) use
RethinkDB this way. You can't really do deep analytics/machine learning as
RethinkDB wasn't designed for that, but if you want to store a lot of data,
and then run lightweight aggregation or map-reduce queries on that data,
Rethink turns out to be a really good product for it.

One issue I see with this path is that if your queries ever get a lot more
complex, you'd have to migrate off of RethinkDB onto Hadoop (which is a pain).
I think that if you know for certain you just want lightweight querying
capabilities RethinkDB can be really wonderful, but if there is a good chance
you might need something deeper, it might be worth the effort to set up Hadoop
early on.

~~~
pquerna
Have you thought about a "read at timestamp" construct in RethinkDB?

It's not really an MVCC thing, and you can work around it in the data model, but
for lots of reports (say, running in a cronjob), I want to run a query "as"
the database saw things at midnight UTC, even if I start running it at
2am. It would also make reports more reproducible... but maybe this is
really a datamodel problem. I felt when I read the Google Spanner papers that
it was a pretty potentially useful feature for Read-Only queries.
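
A rough sketch of the data-model workaround mentioned above, in plain JavaScript: keep every version with its write timestamp and answer reads "as of" a chosen instant. `VersionedStore`, `put`, and `getAsOf` are hypothetical names for illustration only; this is not a RethinkDB feature or API.

```javascript
// Hypothetical append-only store: every write keeps its timestamp so that
// reads can be answered "as of" an earlier point in time.
class VersionedStore {
  constructor() {
    this.versions = new Map(); // key -> [{ ts, value }, ...] in write order
  }

  // Record a new version; assumes timestamps per key never go backwards.
  put(key, value, ts) {
    if (!this.versions.has(key)) this.versions.set(key, []);
    this.versions.get(key).push({ ts, value });
  }

  // Return the latest value written at or before `ts`, or undefined.
  getAsOf(key, ts) {
    const history = this.versions.get(key) || [];
    let result;
    for (const version of history) {
      if (version.ts > ts) break;
      result = version.value;
    }
    return result;
  }
}

// A report started at 2am still sees the state from midnight UTC.
const reportStore = new VersionedStore();
reportStore.put("orderCount", 10, Date.UTC(2015, 2, 1, 23)); // 23:00 the day before
reportStore.put("orderCount", 12, Date.UTC(2015, 2, 2, 1));  // 01:00, after the cutoff
const midnightCount = reportStore.getAsOf("orderCount", Date.UTC(2015, 2, 2, 0));
```

Because the cutoff is an explicit argument, rerunning the report later with the same timestamp reads the same versions, which is the reproducibility point raised above.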

~~~
emidln
I frequently use Datomic for this and it's awesome.

------
lewisjoe
I've always wondered what's the best way to integrate a database engine with
the application.

1) Use a middleware/ORM/Whatever which abstracts away the query-lang of the
db, and provides a pluggable multi-db support

2) Just use native db query language with all exclusive features of the
engine.

Companies like workshape.io, why do they prefer the latter?

~~~
mbrock
Abstractions are leaky. [1]

ORM systems are highly complicated abstractions.

Most of your developers will use them without understanding how they work. In
many cases, the only way to understand how they work is to read the source
code.

They have magic features that are advertised as convenient but when they
inevitably do something you don't want them to do you'll tear your hair out
trying to circumvent them.

They put lots of complicated weird stuff in your stack traces so when
something goes wrong with "the database stuff," which is probably going to
happen every day, you will feel confused and overwhelmed.

ORM was famously referred to as "the Vietnam war of computer science" by Ted
Neward. [2]

There's a point that I think is even more important than the unruly and
bewildering complexity of ORM, but I'm not sure I know how to formulate this
point.

One way to formulate it would be to point out that your dichotomy of two
choices is missing an alternative, so I present:

3) Code your data access in a separate module exposing query & save functions
that make sense within your domain model.

In a reasonably complex system, this module might consist of fifty functions
that concatenate SQL strings or whatever. In most cases, I'd bet money that
rewriting this module to support some other data storage—especially if there
are integration tests—would be easier and more pleasant than switching your
ORM and then dealing with the random problems that will inevitably occur.

And when some query fails or is slow, the developer assigned to fix it will just
go into the file, find the query, and change it. It's simpler, there's less
obscure technology to worry about, fewer things to get angry at.
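
As a small illustration of option 3, here is a sketch of such a module in plain JavaScript. The bookshop domain and the names `createBookStore`, `saveBook`, `findBookById`, and `findBooksByAuthor` are made up for the example, and an in-memory `Map` stands in for whatever storage sits behind it.

```javascript
// Hypothetical data access module for a bookshop domain. Callers only see
// domain-level functions; the storage details stay inside this file.
function createBookStore() {
  const rows = new Map(); // stand-in for a SQL table or a RethinkDB table
  let nextId = 1;

  return {
    // Persist a book and return the stored record, including its new id.
    saveBook({ title, author }) {
      const book = { id: nextId++, title, author };
      rows.set(book.id, book);
      return book;
    },
    // Look up one book, or null when the id is unknown.
    findBookById(id) {
      return rows.has(id) ? rows.get(id) : null;
    },
    // All books by one author, in insertion order.
    findBooksByAuthor(author) {
      return [...rows.values()].filter((book) => book.author === author);
    },
  };
}
```

Swapping the storage means rewriting the bodies of these three functions while their signatures, and every caller, stay the same.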

[1]:
[http://www.joelonsoftware.com/articles/LeakyAbstractions.htm...](http://www.joelonsoftware.com/articles/LeakyAbstractions.html)

[2]: [http://blog.codinghorror.com/object-relational-mapping-is-
th...](http://blog.codinghorror.com/object-relational-mapping-is-the-vietnam-
of-computer-science/)

~~~
politician
Preprocessed prepared queries that are shipped along with whatever package is
using them are a far easier solution than quirky ORM tools that can do the
simple things, but have a tendency to break on the harder things or encourage
an authoring style that destroys performance.

That said, I think the world could use a CoffeeScript-esque transpiler for
targeting SQL. Preferably with some kind of frontend/IL/backend separation, so
that everyone can take a crack at replacing the awful SQL syntax.

------
616c
I see him here and Reddit every once in a while, but there is a cool client-
side encrypted note taking app, Turtl, using Common Lisp and RethinkDB server-
side, and what was node-webkit client-side. Very cool, everyone should check
it.

[https://turtl.it/docs/server/running](https://turtl.it/docs/server/running)

[http://www.reddit.com/r/lisp/comments/1mkxp2/turtl_clientsid...](http://www.reddit.com/r/lisp/comments/1mkxp2/turtl_clientside_encrypted_browser_addon_for/)

Not a dedicated user, but I have been playing with this dude's CL work and I
like his approach and attitude. Thought people might want to see a
self-hosted RethinkDB project.

------
z3t4
I'm also working on a noSQL database. What I'm struggling with is the
abstraction for searches/filters. For example, if you want to get all books
with "beginner" in the title, in SQL it would look something like:

      "SELECT * FROM books WHERE title LIKE '%beginner%'"
    

Where in no-SQL it would look like

    
    
      book.filter({title: ["like", "beginner"]});
    
    

Any ideas on how to abstract the filtering in a more clear way?
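
One possible direction, sketched in plain JavaScript: build filters from small composable predicate functions instead of nested arrays. The helpers `field`, `contains`, `eq`, and `and` are hypothetical names, not part of any existing database or library.

```javascript
// Hypothetical predicate builders; none of these names come from a real library.
const field = (name) => ({
  // True when the field, as a string, contains the given substring.
  contains: (needle) => (doc) => String(doc[name]).includes(needle),
  // True when the field is strictly equal to the given value.
  eq: (value) => (doc) => doc[name] === value,
});

// Combine predicates: the result passes only if every predicate passes.
const and = (...predicates) => (doc) => predicates.every((p) => p(doc));

const books = [
  { title: "Go for beginners", year: 2014 },
  { title: "Advanced Go", year: 2015 },
  { title: "SQL for beginners", year: 2015 },
];

// Reads close to the intent: title contains "beginner" AND year is 2015.
const beginner2015 = books.filter(
  and(field("title").contains("beginner"), field("year").eq(2015))
);
```

Since each predicate is just a function of a document, a query planner could later swap the closures for a serializable description without changing how callers compose them.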

~~~
dkersten
In RethinkDB the query looks like this:

    
    
        r.table('books').filter(function(doc){
            return doc('title').match("beginner")
        })
    

Or like this:

    
    
        r.table('books').filter(r.row('title').match("beginner"))
    

Or if you have a secondary index set up correctly, this could work:

    
    
        r.table('books').getAll("beginner", {index: "title"})
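
One caveat, sketched in plain JavaScript rather than driver code: an index lookup such as `getAll` matches whole index values, while `match` tests a regular expression against the field. The `titleIndex` map below is a hypothetical stand-in for a secondary index, and no RethinkDB calls are involved.

```javascript
const books = [
  { id: 1, title: "beginner" },
  { id: 2, title: "Go for beginners" },
  { id: 3, title: "Advanced Go" },
];

// Hypothetical stand-in for a secondary index on `title`: exact value -> docs.
const titleIndex = new Map();
for (const book of books) {
  if (!titleIndex.has(book.title)) titleIndex.set(book.title, []);
  titleIndex.get(book.title).push(book);
}

// Index-style lookup: only documents whose title equals the key exactly.
const exact = titleIndex.get("beginner") || [];

// Filter-style scan: every document whose title matches the pattern.
const matched = books.filter((book) => /beginner/.test(book.title));
```

Here `exact` holds only the book titled exactly "beginner", while `matched` also includes "Go for beginners", so the index form is a substitute for the filter only when whole-title equality is what is wanted.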

------
nartz
Interesting to hear - would be interested to hear from someone utilizing more of
the selling points - distributed queries, etc.

------
ShinyCyril
I'm a relative newcomer to the NoSQL scene and have been using RethinkDB for a
couple of side projects. The IRC channel (#rethinkdb on FreeNode) is really
second-to-none - the people on there are incredibly friendly and patient when
answering what are probably obvious questions.

------
dilipray
I use RethinkDB for [http://soopara.com](http://soopara.com) with PubSub.
[http://rethinkdb.com/docs/publish-
subscribe/python/](http://rethinkdb.com/docs/publish-subscribe/python/)

------
junto
Is there a Windows version on the roadmap?

------
ddrum001
Great post, would love to see RethinkDB with horizontal scalability in future
releases.

------
emergentcypher
Why is it so trendy these days to say that everything was built "with love"?

~~~
bbcbasic
It's just a meme.

Like it is also trendy to have an over-sized picture of young people working
on wooden desks in an industrial-chic office taking up 70% of the screen
space.

Goes nicely with the Bootstrap template, Lobster font, Circular cropped photos
of founders, Ping pong tables, Ruby on Rails. Etc. Etc.

~~~
pbowyer
Upvoted for sheer cynical accuracy ;-)

