ArangoDB 3.0 Release – A Solid Ground to Scale

hexalisk · on June 23, 2016

I would love to see a comparison between ArangoDB and RethinkDB, especially performance-wise. Is it suitable for realtime applications? What language drivers are supported?

don71 · on June 23, 2016

Claudius from ArangoDB here. Frank already answered a question about realtime queries below: "We are evaluating various possibilities, how to implement streaming queries in an efficient and scalable way. For instance, are restrictions to the general AQL necessary for such queries to be able to scale? Stay tuned."

Regarding performance, it would be great if somebody with excellent knowledge of RethinkDB would contribute to https://github.com/weinberger/nosql-tests

arthursilva · on June 23, 2016

Mad respect for these guys/gals, they managed to squeeze two game-changing features into a single release: VPack and Clustering 2.0.

pluma · on June 23, 2016

Not to blow my own trumpet but I think the new Foxx API (for letting you write your own HTTP endpoints in JS) is a huge step forward, too ;)

(I'm the lead developer of the new ArangoDB Foxx, AMA)

egeozcan · on June 23, 2016

Do you have an example app written with some benchmarks?

It says that we can utilize npm, so, how is the compatibility with node? Native (C++) modules?

Is it used already in production somewhere?

(I am asking a lot of questions because I'm very interested, not anything else)

pluma · on June 23, 2016

Hi egeozcan,

1) I don't think we currently have a benchmark specifically for Foxx. However we did some one-off tests in 2.8 and there was no noticeable difference between the plain HTTP API and a Foxx service using the equivalent JS API (just the V8 context switches).

2) See the section "Compatibility Caveats" in the docs: https://docs.arangodb.com/3.0/Manual/Foxx/Dependencies.html Specifically we can't support native modules and there are limits to what Node APIs we can support (network and filesystem APIs are different and more restricted, also Foxx is entirely synchronous).

3) Yes! Check out the case studies linked at the bottom of the website: https://www.arangodb.com/#casestudies All three of them use Foxx. I also dogfood Foxx: I'm the lead developer of a commercial e-learning product that uses full-stack JS with ArangoDB.

Feel free to swing by our community Slack channel if you have more questions: https://slack.arangodb.com

merqurio · on June 23, 2016

Foxx microservices are the best. I love them

buckbova · on June 23, 2016

Is the Foxx API supposed to imitate what would be stored procedures on the traditional rdbms?

don71 · on June 23, 2016

Hi! I'm Claudius from ArangoDB.

Saying Foxx is like stored procedures is overly simplistic.

ArangoDB's external API is the HTTP API. Foxx lets you extend that HTTP API with arbitrary code that will be executed in V8 with direct access to the same native APIs it uses internally.

The closest equivalents I'm aware of are RethinkDB's Horizon (which isn't hosted in the database) and CouchDB's CouchApps (which are far more limited in what they can do).

You can think of it as having a subset of Node.js running right on top of the database with direct memory access and scaling (Foxx services run on each coordinator).

buckbova · on June 23, 2016

So the expectation is for client applications to directly connect to the API. But of course that's not required.

don71 · on June 23, 2016

You are right that that is not required. That ArangoDB speaks HTTP doesn't mean it has to be exposed to the browser.

You can use Foxx to host your entire server-side application inside ArangoDB in some cases, but in our experience the best approach is a middle ground where the data-intensive server-side code lives in Foxx services and the client-facing part (e.g. server-side rendering) exists outside of it.

Foxx offers you the option to put backend logic directly inside the database; it's not an all-or-nothing decision.

saiko-chriskun · on June 23, 2016

whoa. foxx looks awesome!

merqurio · on June 23, 2016

Great news! We decided a couple of months ago to go with ArangoDB. We have already built a 32 GB graph database and so far it has performed really well. It's easy to install and great as a first contact with the graph DB world aside from Neo4j.

Diederich · on June 23, 2016

This looks pretty fantastic! I'm going to give it a whirl at my company...except...

It apparently uses Facebook's RocksDB, which is also, I'm told, a pretty nice package. However, there is this:

https://github.com/facebook/rocksdb/blob/master/PATENTS

"The license granted hereunder will terminate, automatically and without notice, if you (or any of your subsidiaries, corporate affiliates or agents) initiate directly or indirectly, or take a direct financial interest in, any Patent Assertion: (i) against Facebook or any of its subsidiaries or corporate affiliates," ... and so on.

I think that means that if my company goes after (in any way, shape or fashion) any Facebook patents, and we're using ArangoDB, we're instantly in copyright violation.

Is this a roughly accurate assessment?

Thanks!

bryanlarsen · on June 23, 2016

You don't lose your copyright license, you lose the patent grant. Most MIT/BSD licensed software doesn't come with a patent grant; many people assume that you get an implicit grant but that opinion will vary depending on which lawyer you talk to.

Diederich · on June 23, 2016

Ok, I think that makes sense. So, to restate, that means that RocksDB (or whatever) might have Facebook patented code/material in it, and Facebook is granting a very generous grant to use those patents.

...until a given organization comes into patent conflict with any of Facebook's patents, at which time their grants disappear.

So if I'm understanding that correctly, doesn't that basically mean the same thing? You're no longer allowed to use their software?

bryanlarsen · on June 23, 2016

"until a given organization comes into patent conflict with any of Facebook's patents"

Note that the grants only disappear if you initiate the conflict; they don't disappear if Facebook initiates and you counter-sue.

Diederich · on June 23, 2016

Ok, so that's a useful clarification.

My problem is that I'm working for a subsidiary of a really big company that is directly competing with some parts of Facebook. There's a non-zero chance that some part of my company, far far away from me, will initiate an action that triggers this clause some time in the future.

In short: my company's ability to use a growing set of otherwise high-quality projects is greatly impacted because of this legal language. I also strongly suspect this isn't limited to just my organization.

One way or another, thank you for your attention.

I would love to hear from any other members of large organizations about this matter. Granted, my view is pretty limited, but this doesn't strike me as a good way forward for the overall free software community.

bryanlarsen · on June 23, 2016

AFAICT, Apple finds this patent grant acceptable, whereas previous versions were unacceptable. Apple is well known for suing competitors for patent infringement, so they obviously feel that the loss of this patent grant would be acceptable, that the devolution to pure BSD is enough protection.

Diederich · on June 23, 2016

Well, Apple certainly qualifies as a big company, thanks for the input.

reactor · on June 23, 2016

We have been using ArangoDB (2.8.x) for some time. Solid product with awesome support on the Google group and SO. Great job guys! Looking forward to migrating to 3.x, especially for the persistent index.

overcast · on June 23, 2016

Awesome, I JUST started using ArangoDB for my latest project. So far so good, and these changes look solid. The only thing I wish they had (unless I'm missing something) is GridFS-type storage.

janemanos · on June 23, 2016

Hi this is Jan2 from ArangoDB... my team mate Jan1 wrote a blogpost: You can already handle binary data with ArangoDB Foxx framework... here is how http://jsteemann.github.io/blog/2016/06/22/handling-binary-d...

overcast · on June 23, 2016

Interesting, not sure I want to go down that route of creating a separate Foxx service to essentially shoehorn in blobs. We'll see what happens in the future, when that becomes native. But since you guys love incorporating many types of databases into one, add ArangoFS ;)

merqurio · on June 23, 2016

Pure curiosity, why do you miss GridFS? I think ArangoDB already does that. Back in the day there was a journal size setting; I'm not sure if it still has one.

overcast · on June 23, 2016

I need a simple way to store a lot of photos. The GridFS interface is simple, and I don't have to worry about screwing around with filesystems / folder structures. Mongo takes care of all the sharding and whatnot. One man operation here!

sedlich · on June 23, 2016

Congratulations! What a nice bunch of new features. Thinking through all three models from the ground up has really paid off! Will check out the docker image soon...

neunhoef · on June 23, 2016

Max from ArangoDB here. Just to avoid disappointment: The official Docker image arangodb will need a few days to be updated and be validated by Docker. Use arangodb/arangodb:3.0.0 in the meantime.

zimbatm · on June 23, 2016

Does ArangoDB support streaming queries like RethinkDB?

fceller · on June 23, 2016

We are evaluating various possibilities, how to implement streaming queries in an efficient and scalable way. For instance, are restrictions to the general AQL necessary for such queries to be able to scale? Stay tuned.

ralusek · on June 23, 2016

This looks really interesting. Is there a NodeJS ORM comparable to Mongoose for using Arango?

fceller · on June 23, 2016

You can use the Foxx framework to create a REST api that suits your needs. There is also a community project for an ORM, see https://github.com/arangodb/arangojs/issues/215

However, in general it is more flexible and safer to use Foxx, because it allows you to fine-tune complex queries and supports transactions.

continuational · on June 23, 2016

Can I change the number of shards on an existing collection without taking it offline?

fceller · on June 23, 2016

Hi, I'm Frank from ArangoDB. You should create a number of shards that is much higher than your initial number of servers. ArangoDB can cope with multiple shards per server. This way, you can easily redistribute shards when adding new servers.
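
To make the over-sharding advice concrete, here is a minimal sketch of why more shards than servers helps when a server is added. It is a toy model only: the round-robin placement, the `rebalance` helper, and the server names `db1` to `db4` are assumptions for illustration, not ArangoDB's actual placement or rebalancing logic.

```python
from collections import Counter


def assign_round_robin(num_shards, servers):
    """Place shard i on servers[i % len(servers)] (illustrative placement only)."""
    return {shard: servers[shard % len(servers)] for shard in range(num_shards)}


def rebalance(assignment, servers):
    """Return a new shard -> server map, moving only surplus shards.

    Assumes `servers` contains every server already used in `assignment`
    (servers are only added, never removed).
    """
    held = {server: [] for server in servers}
    for shard, server in sorted(assignment.items()):
        held[server].append(shard)

    # Each server should end up with `base` shards; the first `extra`
    # servers in the list may hold one more.
    base, extra = divmod(len(assignment), len(servers))
    quota = {s: base + (1 if i < extra else 0) for i, s in enumerate(servers)}

    # Collect shards from servers that hold more than their quota.
    surplus = []
    for server in servers:
        while len(held[server]) > quota[server]:
            surplus.append(held[server].pop())

    # Hand the collected shards to servers that are below their quota.
    result = dict(assignment)
    for server in servers:
        while len(held[server]) < quota[server]:
            shard = surplus.pop()
            held[server].append(shard)
            result[shard] = server
    return result


def moved_shards(before, after):
    """List the shards whose server changed."""
    return sorted(shard for shard in before if before[shard] != after[shard])


old_servers = ["db1", "db2", "db3"]
new_servers = old_servers + ["db4"]

# 12 shards on 3 servers: the new server can take over a share of the data.
many = assign_round_robin(12, old_servers)
many_after = rebalance(many, new_servers)
print(Counter(many_after.values()), moved_shards(many, many_after))

# 3 shards on 3 servers: whole shards are the unit of movement, so the
# new server has nothing to take and stays empty.
few = assign_round_robin(3, old_servers)
few_after = rebalance(few, new_servers)
print(Counter(few_after.values()), moved_shards(few, few_after))
```

In this toy model, the collection with one shard per server gains nothing from the extra machine, while the over-sharded collection spreads out by moving a handful of whole shards.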

continuational · on June 23, 2016

Thank you for the answer. I'm currently dealing with an inflow of about one million documents a day; it's an ever-growing collection (grows by a few TB each year). Should I just configure it with 1000 shards? Or would it perform better with fewer shards?
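
For a sense of scale, the numbers in this question can be turned into a rough per-shard estimate. This is back-of-the-envelope arithmetic only: "a few TB" is taken as 3 TB per year, which is an assumption, and documents are assumed to spread evenly across shards. The function name `estimate_per_shard` is hypothetical.

```python
def estimate_per_shard(docs_per_day, shards, growth_tb_per_year):
    """Rough yearly load per shard, assuming an even spread across shards."""
    docs_per_year = docs_per_day * 365
    bytes_per_year = growth_tb_per_year * 10**12  # decimal terabytes
    return {
        "docs_per_year": docs_per_year,
        "docs_per_shard_per_year": docs_per_year / shards,
        "gb_per_shard_per_year": bytes_per_year / shards / 10**9,
        "avg_doc_kb": bytes_per_year / docs_per_year / 10**3,
    }


# 1,000,000 docs/day and 1000 shards come from the comment above;
# 3 TB/year is an assumed reading of "a few TB".
for name, value in estimate_per_shard(1_000_000, 1000, 3).items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions each shard would grow by a few hundred thousand documents and a few gigabytes per year.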

fceller · on June 23, 2016

Yes, using 1000 shards would work. Eventually, we will also support splitting shards.

continuational · on June 23, 2016

Thank you, that was the answer I was hoping for.

nrjames · on June 23, 2016

Is there a reliable python module for working with ArangoDB 3.0?

merqurio · on June 23, 2016

We use joowani/python-arango, but I haven't tested it with 3.0 yet.

nrjames · on June 23, 2016

My dream is to see one of these graph databases use NetworkX for the python API and the database for the back end / storage.

merqurio · on June 23, 2016

That's a pretty cool idea. Up until now I have used JSON whenever I wanted to use NetworkX. It shouldn't be too complex to connect them.
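
As a rough illustration of the JSON route mentioned here, the sketch below turns exported vertex and edge documents into a plain adjacency mapping using only the standard library. The sample data and the `adjacency_from_documents` helper are made up for illustration; the `_id`, `_from` and `_to` attributes follow ArangoDB's document and edge conventions. A dict of lists like this should be usable as input to `networkx.DiGraph(...)`, but that step is left out so the example runs without NetworkX installed.

```python
import json

# Hypothetical export: one JSON array of vertices, one of edges.
VERTICES_JSON = '[{"_id": "users/alice"}, {"_id": "users/bob"}, {"_id": "users/carol"}]'
EDGES_JSON = """[
  {"_from": "users/alice", "_to": "users/bob"},
  {"_from": "users/bob", "_to": "users/carol"},
  {"_from": "users/alice", "_to": "users/carol"}
]"""


def adjacency_from_documents(vertices, edges):
    """Build {vertex_id: [successor_id, ...]} from vertex and edge documents."""
    adjacency = {vertex["_id"]: [] for vertex in vertices}
    for edge in edges:
        # setdefault keeps the sketch robust if an edge names an unknown vertex.
        adjacency.setdefault(edge["_from"], []).append(edge["_to"])
        adjacency.setdefault(edge["_to"], [])
    return adjacency


graph = adjacency_from_documents(json.loads(VERTICES_JSON), json.loads(EDGES_JSON))
for vertex, successors in graph.items():
    print(vertex, "->", successors)
```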