
RethinkDB 1.6 is out: regex matching, new array operations, random sampling - coffeemug
http://rethinkdb.com/blog/1.6-release/
======
bryanh
I really encourage everyone to just give this DB a spin. With Homebrew or
their Ubuntu PPA it is a dead simple install process. I guarantee that the
first touch experience will be among the best DB experiences you've had. Their
_built-in_ web UI is so incredibly polished and handy.

Also, a nice +1 for RethinkDB is when I was importing way too much data for
the (accidentally) way too small EC2 instance, the OOM killer kept knocking
off RethinkDB but not a single doc had been corrupted. I've done similar
stupid things with other DBs and... well, data was lost.

We're still working through it for analytics; I hope to do a write-up sometime
soon.

------
efuquen
I'm kind of flabbergasted that there is no Java driver listed as either a
primarily supported driver or a community one. I personally do most of my work
in Scala now but it would make the most sense to have a well supported driver
written in Java so any other JVM based language can take advantage of it. Why
is there no love for the JVM?

~~~
dkhenry
Give me one or two more weeks and mine will be functional enough to use. I
haven't put it in the community wiki because it's just not ready yet.

[https://github.com/dkhenry/rethinkjava](https://github.com/dkhenry/rethinkjava)

~~~
mglukhovsky
dkhenry, thank you for your work on this driver!

I would encourage everyone working on drivers for RethinkDB to publish them
regardless of how complete they are, and advertise them as partial
implementations. I know of a lot of people interested in working on a Java
driver, and it would be great to see collaboration, even early on!

------
mrkurt
This is the first time I realized that the secondary indexes in rethink are
expression based. That's amazingly useful.
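
For anyone who hasn't met the idea: an expression-based index computes its key from each document with an arbitrary expression instead of reading a single stored field. Here is a minimal pure-Ruby sketch of the concept. The `ExpressionIndex` class and the sample documents are hypothetical, and this is not the RethinkDB driver API.

```ruby
# Toy model of an expression-based secondary index.
# ExpressionIndex is a hypothetical name used for illustration only.
class ExpressionIndex
  # key_fn: a block that computes the index key from a document
  def initialize(&key_fn)
    @key_fn = key_fn
    @entries = Hash.new { |hash, key| hash[key] = [] }
  end

  # Index a document under the key its expression evaluates to.
  def add(doc)
    @entries[@key_fn.call(doc)] << doc
    self
  end

  # Return all documents whose computed key equals `key`.
  def lookup(key)
    @entries.fetch(key, [])
  end
end

# The key is derived from two fields rather than stored in the document.
full_name = ExpressionIndex.new { |doc| "#{doc[:first]} #{doc[:last]}".downcase }
full_name.add({ id: 1, first: "Ada", last: "Lovelace" })
full_name.add({ id: 2, first: "Alan", last: "Turing" })

matches = full_name.lookup("ada lovelace")
puts matches.map { |doc| doc[:id] }.inspect
```

The point of the sketch is only that the key function can be any expression over the document; a real database would also have to keep the index in sync on updates and deletes.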

~~~
coffeemug
slava @ rethink here. This is entirely our fault -- good docs are really hard,
and we're working on it now. We should have _much_ more comprehensive
documentation up in a few weeks.

This has actually been one of my biggest frustrations -- there is an enormous
amount of really cool stuff in Rethink that's relatively poorly documented, so
people can't find out about it. I'm really looking forward to fixing this
soon.

~~~
winter_blue
Hi Slava, this is a bit tangential to the topic at hand -- but I was wondering
how RethinkDB plans to support itself...

It looks like an open source project with an enormous amount of amazing talent
behind it, and YC backing. But considering that it's FOSS now, how do you plan
on generating revenue?

~~~
coffeemug
Rethink is a VC-funded company. Our goal is to be a de facto database choice
for every new web app. This is tongue in cheek --
[https://github.com/rethinkdb/rethinkdb/issues/1000](https://github.com/rethinkdb/rethinkdb/issues/1000)
-- but should give you a sense of the direction for the project.

RethinkDB already has paying customers, and we'll soon be offering publicly
available commercial support (not unlike JBoss, MySQL, 10Gen, etc.) There are
also other revenue streams I can't talk about yet.

For anyone who's interested in the pilot program before commercial support
becomes publicly available, shoot me an email -- slava@rethinkdb.com.

~~~
eaurouge
I think you should work with the Meteor folks to bring RethinkDB support to
Meteor. There's a huge opportunity there.

~~~
coffeemug
We've been looking into it. I'd really like to get that done, but it's a
non-trivial undertaking, and Meteor's DB APIs are still in flux.

I'll see what we can do to accelerate this -- possibly by enticing the
community to pick this up.

~~~
JulianMorrison
Please consider this encouragement to do so, because MongoDB is an instant
interest killer for Meteor right now. I just don't trust MongoDB at all.

If you post what needs done in detail, someone will surely pick it up. I'd
certainly take a look and see if it was within my capabilities.

------
teraflop
I have a design question that I didn't see answered in the docs. What's your
consistency/atomicity model for secondary indexes?

It seems like it would be hard to guarantee that the document and its index
entries are always in the same shard, and since RethinkDB doesn't seem to
support general multi-document transactions, I'm curious how you go about
updating them simultaneously.

~~~
coffeemug
Great question. In Rethink we guarantee that the document and its index
entries are always in the same shard, so any change to the document that
affects secondary indexes is consistent and atomic.

There are tradeoffs to this approach -- to read a secondary index the db has
to go to every master/shard for the table, but there are lots of tricks we use
to minimize the impact of this tradeoff.
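
The tradeoff described above can be sketched with a toy model in plain Ruby. Each shard keeps its own documents together with the index entries for those documents, so a write touches exactly one shard, while an index lookup has to ask all of them. The `Shard` and `ShardedTable` classes, the routing by `hash`, and the sample data are all assumptions made for illustration; none of this reflects RethinkDB's actual internals.

```ruby
# Toy model: each shard stores documents AND the index entries for those
# documents, so a write touches exactly one shard. All names are hypothetical.
class Shard
  def initialize
    @docs = {}                                # primary key -> document
    @index = Hash.new { |h, k| h[k] = [] }    # index key -> primary keys
  end

  # Update the document and its index entry together, within one shard.
  def put(doc, index_field)
    old = @docs[doc[:id]]
    @index[old[index_field]].delete(doc[:id]) if old
    @docs[doc[:id]] = doc
    @index[doc[index_field]] << doc[:id]
  end

  def find_by_index(value)
    @index.fetch(value, []).map { |id| @docs[id] }
  end
end

class ShardedTable
  def initialize(shard_count, index_field)
    @shards = Array.new(shard_count) { Shard.new }
    @index_field = index_field
  end

  # Writes are routed by primary key only: one shard per write.
  def put(doc)
    @shards[doc[:id].hash % @shards.size].put(doc, @index_field)
  end

  # Index reads cannot be routed by primary key, so they ask every shard.
  def find_by_index(value)
    @shards.flat_map { |shard| shard.find_by_index(value) }
  end
end

table = ShardedTable.new(4, :city)
table.put({ id: 1, city: "Oslo" })
table.put({ id: 2, city: "Lima" })
table.put({ id: 3, city: "Oslo" })
table.put({ id: 2, city: "Oslo" })   # the update moves id 2's index entry

puts table.find_by_index("Oslo").map { |doc| doc[:id] }.sort.inspect
```

Because the old and new index entries for a document always live in the same shard as the document, the update in `Shard#put` never needs cross-shard coordination; the cost shows up in `ShardedTable#find_by_index`, which fans out to every shard.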

~~~
robert-zaremba
I would really love to see information about the time complexity of DB operations.

~~~
coffeemug
I'm actually working on this right now. We should get the updated api docs in
about two weeks.

------
farslan
Wow, the array operations are so good. I'm currently dealing with large sets
of arrays in Mongo, and the whole $pull/$push business is incredibly
frustrating. RethinkDB's array operations look so much better by comparison.

For example the "insertAt" operation is really helpful when you need ordered
arrays and you want to swap or change some elements. I'll definitely give it
a try.
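
To make the use case concrete, here is a small plain-Ruby sketch of positional insert and swap on an ordered array. The `insert_at` and `swap` helpers are ordinary Ruby methods defined here for illustration; they are not driver calls and only mimic the kind of operation mentioned above.

```ruby
# Return a copy of `array` with `value` inserted before position `index`.
def insert_at(array, index, value)
  array.dup.insert(index, value)
end

# Return a copy of `array` with the elements at positions i and j exchanged.
def swap(array, i, j)
  copy = array.dup
  copy[i], copy[j] = copy[j], copy[i]
  copy
end

playlist = ["intro", "verse", "outro"]
with_chorus = insert_at(playlist, 2, "chorus")
reordered = swap(with_chorus, 1, 2)

puts with_chorus.inspect
puts reordered.inspect
```

Both helpers return a new array and leave the input untouched, which is closer to how a query expression would behave than Ruby's in-place `Array#insert`.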

~~~
coffeemug
Thanks -- don't hesitate to let us know if there's something that's missing
that you need. You can use the standard channels
([http://rethinkdb.com/community/](http://rethinkdb.com/community/)) or feel
free to email me directly (slava@rethinkdb.com).

------
ConceitedCode
Since there is now basic authentication, I can finally finish up my Heroku
add-on for RethinkDB this weekend. If anyone would like to help me test it,
send me an email at cam@camrudnick.com.

~~~
elithrar
I've dropped you an email. Been hanging out for a Heroku RethinkDB add-on for
a while!

------
wc-
I'm about to start a new project and had the usual nodejs + mongo skeleton set
up and ready to go. After reading the comments here, I will definitely be
giving rethink a try. It just feels like rethink is headed in the right
direction, whereas mongo is slowing down (maybe focusing on enterprise /
profitability stuff instead of features and fixes?). just my 2 cents...

------
dkhenry
I have been working on a Java/Scala Driver for Rethink and I am now convinced
that everyone who ever wants to define a wire protocol should use protocol
buffers.

~~~
pjscott
I have fixed bugs in an IMAP protocol parser, and therefore could not possibly
agree with you more vehemently.

------
albiabia
RethinkDB vs. MongoDB? They seem comparable. What is the big picture
difference?

~~~
coffeemug
Take a look at these two writeups:

* A biased/big picture one -- [http://rethinkdb.com/blog/mongodb-biased-comparison/](http://rethinkdb.com/blog/mongodb-biased-comparison/)

* An unbiased/technical one -- [http://rethinkdb.com/docs/comparisons/mongodb/](http://rethinkdb.com/docs/comparisons/mongodb/)

~~~
andrewflnr
The biased comparison still says secondary indexes are in development. Might
want to update that.

~~~
mglukhovsky
Thanks for noticing this, we'll update it shortly.

------
rb2k_
Just playing around a bit, even with soft durability, insertion speeds seem to
be pretty slow (<100 inserts/s from Ruby using bulk inserts on an i7 MacBook
Pro with an SSD).

Also: All the ruby driver doc seems wrong. 1.6 changed a lot of the APIs?

~~~
coffeemug
< 100 inserts/s with soft durability is far too low. Shoot us an email to the
google group
([http://groups.google.com/group/rethinkdb](http://groups.google.com/group/rethinkdb))
and we'll see if we can work this out!

~~~
rb2k_
Seems the speed really declines with the size of the inserted documents. My
production data is a hash with 14-15 keys and string/int values of maybe 10
characters each.

When I loop and keep generating these documents:

    to_insert = {}
    10.times {|i| to_insert["key#{i}"] = rand(33333).to_s * rand(6)  }

I get around 140/s. Without the insert() call, I reach 50,000+, so generating
the documents doesn't seem to be the bottleneck.

With simple documents (3 keys, int values) I reach 350-400.

p.s. script @
[https://gist.github.com/rb2k/5777997](https://gist.github.com/rb2k/5777997)

~~~
coffeemug
Ah, I see. This is a known problem with the performance of the protobuf
serialization library we're using in the Ruby driver (see
[https://github.com/rethinkdb/rethinkdb/issues/897](https://github.com/rethinkdb/rethinkdb/issues/897)).
It bottlenecks the CPU and should be fixed in the next release.

In the meantime, you could try running a multithreaded/multiprocess script --
that would significantly increase throughput.

Sorry you ran into this -- it'll be fixed ASAP.

~~~
rb2k_
the driver doesn't seem to be threadsafe and just locks up once I start using
it in threads :(

~~~
coffeemug
Sorry -- the drivers are meant to be used in a way where you create a new
connection in each thread. If you do it this way, things will work.
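
A minimal sketch of the connection-per-thread pattern, in plain Ruby. `FakeConnection` is a hypothetical stand-in so the example runs without a server; a real script would open a driver connection in its place. The thread and document counts are arbitrary.

```ruby
# Hypothetical stand-in for a database connection that is NOT safe to share
# between threads. A real script would open a driver connection here instead.
class FakeConnection
  attr_reader :inserted

  def initialize
    @inserted = 0
  end

  def insert(_doc)
    @inserted += 1
  end
end

THREADS = 4
DOCS_PER_THREAD = 250

workers = Array.new(THREADS) do |n|
  Thread.new do
    conn = FakeConnection.new          # one connection per thread, never shared
    DOCS_PER_THREAD.times { |i| conn.insert({ "worker" => n, "seq" => i }) }
    conn.inserted                      # becomes the thread's value
  end
end

total = workers.sum(&:value)           # Thread#value joins and returns the count
puts "inserted #{total} documents"
```

Since each connection object is created and used inside a single thread, no locking is needed around it. On MRI the global VM lock limits how much CPU-bound work threads can overlap, so separate processes may help more with a serialization bottleneck; that is a general property of the runtime, not something measured here.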

------
zmitri
Random sampling is such a simple but nice touch. I've built up large
tables/collections and only afterwards realized I needed random samples, so I
had to go through and generate random seeds for entries.
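
A rough plain-Ruby sketch of the difference, under the assumption that "random seeds" means storing a random value with each entry and sampling by sorting on it. The `rand_key` field and both helper names are hypothetical, and `Array#sample` only stands in for a database-side sampling command.

```ruby
# Workaround described above: store a random value with every entry up front,
# then "sample" by sorting on it and taking the first n.
rng = Random.new(42)                       # fixed seed so runs are repeatable
entries = (1..1000).map { |id| { id: id, rand_key: rng.rand } }

def sample_by_stored_key(entries, n)
  entries.sort_by { |entry| entry[:rand_key] }.first(n)
end

# Direct sampling: no extra field needed. Array#sample picks n distinct
# elements; it stands in here for a database-side sampling command.
def sample_directly(entries, n, rng)
  entries.sample(n, random: rng)
end

by_key = sample_by_stored_key(entries, 5)
direct = sample_directly(entries, 5, rng)

puts by_key.map { |entry| entry[:id] }.inspect
puts direct.map { |entry| entry[:id] }.inspect
```

One weakness of the stored-key approach shows up immediately: it returns the same five entries every time until the keys are regenerated, whereas direct sampling draws a fresh sample per call.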

------
jroesch
I've also been working on a Scala driver in stealth mode, and I've been able to
successfully send a subset of queries. My primary focus, though, has been on
making sure queries are typed correctly in Scala, while also engineering a
good interface from typed objects to JSON data, so that you
don't end up with Map[String, Any] or Array[Any] all over the place. My
current approach is adopting some of the ideas from Bryan O'Sullivan's
excellent Aeson library in Haskell.

~~~
coffeemug
Cool -- shoot us an e-mail on the Google group
([https://groups.google.com/forum/?fromgroups#!topic/rethinkdb...](https://groups.google.com/forum/?fromgroups#!topic/rethinkdb/hnk_GTjuc2M))
and we'll be happy to help you out any way we can!

------
imslavko
I hope one day RethinkDB will be as wide-spread as MongoDB is now.

~~~
dexterbt1
I have a strong conviction that they will! Look at all the hard work they've
put in and their focus on the right stuff (durability, ReQL, UI, auto
sharding+clustering, etc.)

------
lampe3
hey

I could not find a good article about RethinkDB in production.

Can someone maybe share some thoughts about using RethinkDB for a
database-heavy e-commerce web shop?

thx!

~~~
coffeemug
There are a number of customers using RethinkDB in production now. We'll be
announcing them closer to the end of the summer and will write up case studies
(not in the business-y sense but in a tech-y sense) that will give you an idea
of how Rethink helps people solve problems in the wild.

~~~
lampe3
Thank you for your response. I'm looking forward to seeing it.

------
kclay
Just when I finish my Scala library... guess I need to find some time to add
these and release it to the public.

~~~
noelwelsh
Please add a link here:

[https://github.com/rethinkdb/rethinkdb/wiki/Community-contri...](https://github.com/rethinkdb/rethinkdb/wiki/Community-contributions)

Every time I see an announcement from RethinkDB I get excited to try it, and
then I see there is no driver for Java/Scala.

~~~
kclay
I guess I'll polish it up and move it to GitHub; I've been pushing to a private
Bitbucket repo.

------
orthecreedence
Damn, these guys are too quick for me, I haven't even updated my driver for
version 1.5 yet!

This is great news and excellent progress. Keep up the great work, Rethink
team.

~~~
mglukhovsky
Andrew, thanks for your hard work on cl-rethinkdb
([https://github.com/orthecreedence/cl-rethinkdb](https://github.com/orthecreedence/cl-rethinkdb)
-- anyone using Common Lisp should check out his client driver!)

Expect an email shortly briefing you on the protobuf changes for 1.6!

------
overgard
I have nothing against the database, but I wish they'd come up with a better
name. At work I'd feel silly trying to pitch a piece of technology that sounds
like a motivational slogan. That's not a criticism of their work, but
sometimes names matter.

~~~
dualogy
Doesn't matter in the long run; Mongo had similar issues in the early days.
"Mongo? What?" -- just pretend the name was chosen by Steve Jobs or whoever
and it will add the "weight" to your voice next time you're pitching.

------
rojabuck
Is there anything in the roadmap for geospatial index support?

Our major usecase for mongo is returning entries in ordered distance from a
location and / or within some radius of a location.

Support for such queries would be seriously useful.
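
For reference, the query shape described above (entries ordered by distance from a point, limited to a radius) looks roughly like this when done by brute force on the client. This is a plain-Ruby sketch using the haversine great-circle formula; the `haversine_km` and `nearest_within` helpers and the sample coordinates are assumptions for illustration, and a real geospatial index would avoid scanning every entry.

```ruby
EARTH_RADIUS_KM = 6371.0

# Great-circle distance in kilometres between two lat/lon points (degrees).
def haversine_km(lat1, lon1, lat2, lon2)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a))
end

# Entries within radius_km of (lat, lon), nearest first. Scans every entry.
def nearest_within(entries, lat, lon, radius_km)
  entries
    .map { |entry| [entry, haversine_km(lat, lon, entry[:lat], entry[:lon])] }
    .select { |_, km| km <= radius_km }
    .sort_by { |_, km| km }
    .map(&:first)
end

places = [
  { name: "Berlin",   lat: 52.5200, lon: 13.4050 },
  { name: "New York", lat: 40.7128, lon: -74.0060 },
  { name: "London",   lat: 51.5074, lon: -0.1278 }
]

# Search around Paris with a 1000 km radius.
nearby = nearest_within(places, 48.8566, 2.3522, 1000)
puts nearby.map { |place| place[:name] }.inspect
```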

~~~
coffeemug
We generally try to adopt the philosophy of making the existing features
really good before building new ones, and there is a lot of low-hanging fruit
for us to pick for now.

I'd like to get geospatial support in, but it will take a bit of time before
we can get to that. We'll likely get it in within a year, so it's a fairly
long-term feature for now.

------
gizzlon
I understand that it's not a priority right now, but how hard would it be to
port Rethink to other unix-like operating systems?

~~~
coffeemug
slava @ rethink here. I would _love_ to get a windows port (if only because it
will give us the option to use Visual Studio's debugger). Unfortunately it's a
rather non-trivial undertaking. I'd like to get this done some time within a
year, but I can't make any promises.

~~~
robert-zaremba
Right now there are much more useful issues than MS Windows support ;)

