Thanks for taking the time to do this. I have only looked at the Python example, but I think it highlights a basic issue with the API:
r.table('todos').get(todo_id).run(g.rdb_conn)
It would be more natural to have it go the other way:
g.rdb_conn.run(r.table('todos').get(todo_id))
This way, the connection runs the request instead of the other way round.
Also, you are opening and closing DB connections with each HTTP request. I imagine this is for the purpose of simplifying the example. However, it does make me curious: how expensive is the creation of a new connection? Are connections thread-safe? Would it make sense to have connection pooling?
The distinction between query.run(conn) and conn.run(query) is very fine indeed. There has been much discussion about this, and both solutions have their merits. At one point we even supported both, but decided that the simplicity of having only one solution merited dropping the other.
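To make the two spellings concrete, here is a minimal sketch in which each form simply delegates to the other. The `Query` and `Connection` classes are hypothetical stand-ins written for this illustration; they are not the RethinkDB driver's classes, and no real database is involved.

```python
class Connection:
    """Hypothetical stand-in for a driver connection (not the real driver)."""

    def __init__(self, name):
        self.name = name

    def run(self, query):
        # conn.run(query): the connection executes the query.
        return f"{self.name} executed {query.description}"


class Query:
    """Hypothetical stand-in for a query built with a chainable API."""

    def __init__(self, description):
        self.description = description

    def run(self, conn):
        # query.run(conn): the query hands itself to the connection.
        return conn.run(self)


conn = Connection("conn-1")
query = Query("table('todos').get(42)")

# Both spellings end up in the same code path and give the same result.
assert query.run(conn) == conn.run(query)
```

Supporting both costs only a one-line delegating method, so the choice between them is mostly about which reads better and how much API surface one wants to document.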
Creating a new connection involves opening a TCP connection and sending a message to the server to validate the driver, altogether a few round trips. There is a small bit of per-connection state stored on the server, but since RethinkDB uses a custom coroutine implementation for concurrency, this does not amount to the overhead of an independent OS thread for each connection.
Invoking the query with a specific connection object is thread safe. There is a feature designed to help REPL users that stores the last connection in global state; it is obviously not re-entrant, and it is slated for removal in the upcoming release (1.4).
It would make sense to have connection pooling, especially in the Python driver, where connections block on requests. This is an idea we're exploring, but it is lower down on the priority list. As it is a fully client-side feature, there is nothing stopping third-party driver developers from implementing a solution, though the official drivers will have to wait for other priorities.
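As a rough illustration of what a purely client-side pool could look like, here is a minimal sketch built on the standard library's `queue.Queue`. `ConnectionPool`, `connect`, and `handle_request` are names invented for this example, and `connect` returns a placeholder object rather than a real driver connection; a real pool would also need to deal with broken connections and reconnects.

```python
import queue
import threading
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool; not part of any official driver."""

    def __init__(self, connect, size):
        # Open all connections up front and park them in a thread-safe queue.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Block until a connection is free, then lend it to the caller.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection, even if the caller raised.
            self._idle.put(conn)


created = []


def connect():
    # Placeholder for "open a TCP connection and do the handshake".
    conn = object()
    created.append(conn)
    return conn


pool = ConnectionPool(connect, size=2)
used = []


def handle_request():
    # Each simulated HTTP request borrows a connection instead of opening one.
    with pool.connection() as conn:
        used.append(conn)


threads = [threading.Thread(target=handle_request) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Eight requests were served, but only two connections were ever opened.
assert len(created) == 2
assert len(used) == 8
```

Because each connection is held by one thread at a time, this approach sidesteps the question of whether a single connection object can be shared between threads.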
This way, the connection runs the request instead of the other way round.
I don't know if I'd call it an issue with the API -- there is a pretty active debate going on about it: https://github.com/rethinkdb/rethinkdb/issues/256 Please chime in there with your POV, it would really help!
how expensive is the creation of a new connection?
Connections aren't thread safe on the clients. It's very efficient to open a connection to the server, but it still requires a TCP handshake. This is a good low-level API -- we'll build a connection pool on top of that soon (I just opened an issue for it here -- https://github.com/rethinkdb/rethinkdb/issues/281). People seem to want one because they're used to one, but it hasn't really been necessary in our testing.
I would love to try out your DB, but your build system is garbage, and I mean that in the most constructive way possible. I know you guys put lots of effort into it; I can tell as I am reading through it trying to get it to compile on my Fedora system. Please, for all that is good in this world, take the time _now_ and switch to a sane build system. I know it's cool that you scripted everything in Make, but it is stupid how hard it is to build on Fedora.
I am putting together a patch and instructions for Fedora, but this is silly.
I would love to try out your DB, but your build system is garbage
If you only knew how much we agree with you :) We have an engineer working on that now, so it will soon be much, much nicer. Also, thanks for the instructions, I'll post them on the build page after I confirm everything.
I'm the maintainer of the Arch Linux AUR PKGBUILD - a better build system would be much appreciated :) For now, I do 3-4 sed invocations over your various Makefiles.
Thanks -- I've opened an issue for the new build system https://github.com/rethinkdb/rethinkdb/issues/286 and posted this info. We'll try to take it into account as we develop the new one. Please comment if you have feedback/ideas!
I am excited to hear that. I know it means pulling resources away from developing features, but seeing the plumbing taken care of properly really lends confidence to those who might be early adopters of a young project.
I continue to be very psyched about RethinkDB - looks like an awesome product coming together nicely.
As a boring non-cutting-edge, non-expert Rails person, any word on projects a la Mongoid to give a nice ActiveRecord-style ODM to guys like us? Would LOVE to be able to play with Rethink in place of Mongo.
(Also, as an aside, if I were about 3 years more advanced in my ruby skills I'd totally volunteer to start that project. But I'm not. So I can't ;)
There is a Python ORM already (https://github.com/nviennot/nobrainer) but no Ruby one yet. I suspect it would be pretty easy to port Mongoid to rethink, so I wouldn't be surprised if a port pops up soon.
Thanks for your interest -- we'd really love to get everything out ASAP, but the team has limited resources and we have some more low-hanging fruit to take care of first.
I really like RethinkDB, but the lack of a driver for JVM languages (Java, Scala, etc.) is a showstopper for me.
I read that you're revamping your driver architecture and am anxiously looking forward to a JVM driver (more than willing to write a Scala-friendly wrapper - let me know if you're interested).
Thanks -- helping with a Scala wrapper would be great. Only a few weeks left until the new API is out. I wish we got it right on the first release, but hey, better late than never (there was a lot we learned from the first version).
We had lots of people asking "how do I get started with RethinkDB and stack X". These familiar examples can be extremely helpful to people for getting started with a new technology, so we posted a couple based on existing code. Obviously they're not meant to be new or groundbreaking :)
We didn't have any to show off something based on RethinkDB. I'm looking forward to your example app that takes a program and tells you whether or not it will ever finish executing.
There are a few ways to deal with this issue -- contribute to the PHP protobuf library, or have an alternative layer on the server that accepts a more portable but less efficient serialization format. It's a little frustrating, but wouldn't be too difficult to solve.