

RethinkDB screencast - from queries to sharding under 15 minutes - coffeemug
http://www.rethinkdb.com/screencast/

======
RyanZAG
Would be really nice to see a demo of RethinkDB under real load. All of the
RethinkDB slides and info I have seen generally have 1-4 servers and maybe
50mb of data at most. At this level, you might as well just be using
textfiles...

Anybody know of any demos of RethinkDB handling, say, 100gb of data? And
running decent queries on it?

~~~
coffeemug
slava @ rethink here. Let me explain the state of affairs on this.

The underlying storage engine was tested on commodity systems _and_ super-duper
enterprisey storage systems, and can do hundreds of thousands of ops/second on
tens of terabytes of data (that required pretty beefy setups, though). When we
added clustering on top of the storage engine, we avoided thinking about
performance too much (in the interest of shipping), so everything slowed down
significantly. Here's our (rough) roadmap:

      - New protocol buffer API and some more checklist features (1.4)
      - Secondary indexes, huge ReQL improvements (1.5)
      - Performance and scalability (1.6)

We'll be doing scalability and performance demos that I hope will be really
impressive, but it'll take ~4 months to get there.

~~~
amikazmi
Does this mean that until these changes are done, you won't declare RethinkDB
to be production ready? (You mentioned that would happen at 2.0.)

Can you guys add a "rough roadmap overview" page to the docs, so we can get a
general idea of the current status?

I like the way RubyMine does it:

[http://confluence.jetbrains.net/display/RUBYDEV/Development+...](http://confluence.jetbrains.net/display/RUBYDEV/Development+Roadmap)

------
nullspace
I love the way you guys have stated the advantages and disadvantages of
RethinkDB in your FAQs. Just wondering about one thing in there:

"RethinkDB is a great choice if you .... are planning to run anywhere from a
single node to sixteen node clusters."

With a sharded master-slave setup with one slave each, this leaves us with a
total of 8 shards. This is enough for most use cases, but is there a reason it
is limited to 16 nodes?
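
A quick sketch of the arithmetic behind that shard count, assuming every shard
needs one master plus a fixed number of slaves and that no node serves more
than one shard. The function name `max_shards` and its parameters are made up
for illustration; they are not part of RethinkDB.

```python
def max_shards(total_nodes: int, replicas_per_shard: int = 1) -> int:
    """Number of shards that fit when each shard uses 1 master + N slaves."""
    nodes_per_shard = 1 + replicas_per_shard  # the master plus its slaves
    return total_nodes // nodes_per_shard


# 16 nodes, one slave per master
print(max_shards(16))
```

With two slaves per master the same 16 nodes would hold fewer shards, since
each shard then occupies three nodes.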

~~~
coffeemug
There is a bottleneck in the metadata propagation code that slows down the
system after roughly 16 nodes (there is one place where we used an O(N^3)
algorithm in the interest of shipping the product). This isn't an inherent
limitation, just the state of affairs today. We'll resolve this in the next
few releases, but we wanted to be up front about this limitation for the time
being.
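
To get a feel for why a cubic step bites at around this cluster size, here is a
rough sketch that only assumes the cost of that one step grows proportionally
to N^3. The actual metadata propagation algorithm is not described here, so the
helper `relative_cubic_cost` and the 16-node baseline are purely illustrative.

```python
def relative_cubic_cost(nodes: int, baseline: int = 16) -> float:
    """Work of an O(N^3) step at `nodes`, relative to the same step at `baseline`."""
    return (nodes / baseline) ** 3


# Compare a few cluster sizes against the 16-node baseline.
for n in (4, 8, 16, 32, 64):
    print(f"{n:>3} nodes -> {relative_cubic_cost(n):g}x the work at 16 nodes")
```

Under that assumption, doubling the cluster from 16 to 32 nodes multiplies the
work of the cubic step by eight.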

------
amikazmi
What is the state of RethinkDB? Is it fit for production use?

The website lists it as 1.3.2 (which implies production ready), but I think I
saw some comments from you a month ago saying it's not fit for production use
yet.

What about secondary indexes?

Are the machines in the screencast very weak? A simple query (get 2 rows of
the dota table) taking ~100ms is really slow. Is it because you're using the
web interface?

RethinkDB seems cool and I really want to try it in my next pet project :)

~~~
coffeemug
> Is it fit for production use?

Not yet. We'll bump the release to 2.0 when it's ready for production.

> What about secondary indexes?

They're coming -- see <https://github.com/rethinkdb/rethinkdb/issues/88>

> Are the machines in the screencast very weak?

No, the 100ms roundtrip includes the HTTP request over our admittedly very
unsophisticated WiFi network.

Hope this helps!

~~~
tomjen3
Any reason you didn't go with the standard "1.0 is the first non-beta release"
convention?

~~~
coffeemug
Yes -- we had an internal versioning scheme that crossed 1.0 very early on.
Having different internal and external versioning schemes went against our
intuition of having an open development process, so we decided to bite the
bullet and keep the version post 1.0. It isn't ideal, but it's done :)

------
ukd1
RethinkDB has been awesome to learn. I found it really easy to get up and
running, install the Ruby driver, and just start coding! Reminds me of starting
with MongoDB.

------
taf2
[edit] I realize there is a lot of buzz around being "NoSQL", but seeing how
you support similar concepts (join, count, group, where), why not provide at
least a partial SQL interface for the features you do support? Even MySQL
originally did not provide "all" features of SQL (e.g. no foreign key
constraints), and it probably still leaves multiple features unsupported. But
at least with a SQL interface it is easy for anyone to quickly pick up the DB
and start integrating it. Plus, SQL is actually a really nice "query
language", IMO, and perhaps in many others' opinion too.

p.s. Love the demo video. It looks awesome how easy it is to use, and the
query language you created looks nice too.

~~~
coffeemug
Under the hood RethinkDB already supports multiple protocols, so from our
perspective there is little difference between a SQL front-end and a ReQL
front-end. Of course to users, that's the only thing that matters.

There are still many improvements we can make to the core system and an
enormous number of people are already interested in it, so we decided to
satisfy them first. We might add a SQL front-end at some point, but it's not
very high on the priority list now.

