ArangoDB 3.0 Release – A Solid Ground to Scale (arangodb.com)
136 points by sachalep on June 23, 2016 | 41 comments


I would love to see a comparison between ArangoDB and RethinkDB, especially performance-wise. Is it suitable for realtime applications? What language drivers are supported?


Claudius from ArangoDB here. Frank already answered a question about realtime queries below: "We are evaluating various possibilities, how to implement streaming queries in an efficient and scalable way. For instance, are restrictions to the general AQL necessary for such queries to be able to scale? Stay tuned."

Regarding performance, it would be great if somebody with excellent knowledge of RethinkDB would contribute to https://github.com/weinberger/nosql-tests


Mad respect for these guys/gals, they managed to squeeze two game-changing features into a single release: VPack and Clustering 2.0.


Not to blow my own trumpet but I think the new Foxx API (for letting you write your own HTTP endpoints in JS) is a huge step forward, too ;)

(I'm the lead developer of the new ArangoDB Foxx, AMA)


Do you have an example app written with some benchmarks?

It says that we can utilize npm, so, how is the compatibility with node? Native (C++) modules?

Is it used already in production somewhere?

(I am asking a lot of questions because I'm very interested, not anything else)


Hi egeozcan,

1) I don't think we currently have a benchmark specifically for Foxx. However, we did some one-off tests in 2.8 and there was no noticeable difference between the plain HTTP API and a Foxx service using the equivalent JS API (just the V8 context switches).

2) See the section "Compatibility Caveats" in the docs: https://docs.arangodb.com/3.0/Manual/Foxx/Dependencies.html Specifically, we can't support native modules and there are limits to which Node APIs we can support (network and filesystem APIs are different and more restricted, also Foxx is entirely synchronous).

3) Yes! Check out the case studies linked at the bottom of the website: https://www.arangodb.com/#casestudies All three of them use Foxx. I also dogfood Foxx: I'm the lead developer of a commercial e-learning product that uses full-stack JS with ArangoDB.

Feel free to swing by our community Slack channel if you have more questions: https://slack.arangodb.com


Foxx microservices are the best. I love them.


Is the Foxx API supposed to imitate what would be stored procedures in a traditional RDBMS?


Hi! I'm Claudius from ArangoDB.

Saying Foxx is like stored procedures is overly simplistic.

ArangoDB's external API is the HTTP API. Foxx lets you extend that HTTP API with arbitrary code that will be executed in V8 with direct access to the same native APIs it uses internally.

The closest equivalents I'm aware of are RethinkDB's Horizon (which isn't hosted in the database) and CouchDB's CouchApps (which are far more limited in what they can do).

You can think of it as having a subset of Node.js running right on top of the database with direct memory access and scaling (Foxx services run on each coordinator).


So the expectation is for client applications to directly connect to the API. But of course that's not required.


You are right that that is not required. That ArangoDB speaks HTTP doesn't mean it has to be exposed to the browser.

You can use Foxx to host your entire server-side application inside ArangoDB in some cases, but in our experience the best approach is a middle ground where the data-intensive server-side code lives in Foxx services and the client-facing part (e.g. server-side rendering) exists outside of it.

Foxx offers you the option to put backend logic directly inside the database; it's not an all-or-nothing decision.


whoa. foxx looks awesome!


Great news! We decided a couple of months ago to go with ArangoDB. We have already built a 32 GB graph database and so far it has performed really well. It's easy to install and a great first contact with the graph database world besides Neo4j.


This looks pretty fantastic! I'm going to give it a whirl at my company...except...

It apparently uses Facebook's RocksDB, which is also, I'm told, a pretty nice package. However, there is this:

https://github.com/facebook/rocksdb/blob/master/PATENTS

"The license granted hereunder will terminate, automatically and without notice, if you (or any of your subsidiaries, corporate affiliates or agents) initiate directly or indirectly, or take a direct financial interest in, any Patent Assertion: (i) against Facebook or any of its subsidiaries or corporate affiliates," ... and so on.

I think that means that if my company goes after (in any way, shape or fashion) any Facebook patents, and we're using ArangoDB, we're instantly in copyright violation.

Is this a roughly accurate assessment?

Thanks!


You don't lose your copyright license, you lose the patent grant. Most MIT/BSD licensed software doesn't come with a patent grant; many people assume that you get an implicit grant but that opinion will vary depending on which lawyer you talk to.


Ok, I think that makes sense. So, to restate, that means that RocksDB (or whatever) might have Facebook patented code/material in it, and Facebook is granting a very generous grant to use those patents.

...until a given organization comes into patent conflict with any of Facebook's patents, at which time their grants disappear.

So if I'm understanding that correctly, doesn't that basically mean the same thing? You're no longer allowed to use their software?


"until a given organization comes into patent conflict with any of Facebook's patents"

Note that the grants only disappear if you initiate the conflict; they don't disappear if Facebook initiates and you counter-sue.


Ok, so that's a useful clarification.

My problem is that I'm working for a subsidiary of a really big company that is directly competing with some parts of Facebook. There's a non-zero chance that some part of my company, far far away from me, will initiate an action that triggers this clause some time in the future.

In short: my company's ability to use a growing set of otherwise high-quality projects is greatly impacted because of this legal language. I also strongly suspect this isn't limited to just my organization.

One way or another, thank you for your attention.

I would love to hear from any other members of large organizations about this matter. Granted, my view is pretty limited, but this doesn't strike me as a good way forward for the overall free software community.


AFAICT, Apple finds this patent grant acceptable, whereas previous versions were unacceptable. Apple is well known for suing competitors for patent infringement, so they obviously feel that the loss of this patent grant would be acceptable, that the devolution to pure BSD is enough protection.


Well, Apple certainly qualifies as a big company, thanks for the input.


We have been using ArangoDB (2.8.x) for some time. Solid product with awesome support in the Google group and on SO. Great job guys! Looking forward to migrating to 3.x, especially for the persistent index.


Awesome, I JUST started using ArangoDB for my latest project. So far so good, and these changes look solid. The only thing I wish they had (unless I'm missing something) is GridFS-style storage.


Hi, this is Jan2 from ArangoDB. My teammate Jan1 wrote a blog post: you can already handle binary data with the ArangoDB Foxx framework. Here is how: http://jsteemann.github.io/blog/2016/06/22/handling-binary-d...


Interesting, not sure I want to go down that route of creating a separate Foxx service to essentially shoehorn in blobs. We'll see what happens in the future, when that becomes native. But since you guys love incorporating many types of databases into one, add ArangoFS ;)


Pure curiosity: why do you miss GridFS? I think ArangoDB already does that. Some time ago there was a journal size setting; I'm not sure if it still has it.


I need a simple way to store a lot of photos. The GridFS interface is simple, and I don't have to worry about screwing around with filesystems / folder structures. Mongo takes care of all the sharding and whatnot. One man operation here!


Congratulations! What a nice bunch of new features. Thinking through all three models from the ground up has really paid off! Will check out the Docker image soon...


Max from ArangoDB here. Just to avoid disappointment: The official Docker image arangodb will need a few days to be updated and be validated by Docker. Use arangodb/arangodb:3.0.0 in the meantime.


Does ArangoDB support streaming queries like RethinkDB?


We are evaluating various possibilities, how to implement streaming queries in an efficient and scalable way. For instance, are restrictions to the general AQL necessary for such queries to be able to scale? Stay tuned.


This looks really interesting. Is there a NodeJS ORM comparable to Mongoose for using Arango?


You can use the Foxx framework to create a REST API that suits your needs. There is also a community project for an ORM, see https://github.com/arangodb/arangojs/issues/215

However, in general it is more flexible and safer to use Foxx, because it allows you to fine-tune complex queries and supports transactions.


Can I change the number of shards on an existing collection without taking it offline?


Hi, I'm Frank from ArangoDB. You should create a number of shards that is much higher than your initial number of servers. ArangoDB can cope with multiple shards per server. This way, you can easily redistribute shards when adding new servers.
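To illustrate why creating many more shards than servers makes later redistribution easy, here is a toy sketch in Python. It is not ArangoDB's actual placement algorithm: the function names `assign_round_robin` and `add_server`, the server names, and the idea of modelling shards as plain integers are all assumptions made for this example.

```python
from collections import Counter


def assign_round_robin(num_shards, servers):
    """Spread shard numbers 0..num_shards-1 evenly over the given servers."""
    return {shard: servers[shard % len(servers)] for shard in range(num_shards)}


def add_server(assignment, new_server):
    """Move whole shards to new_server until it holds its fair share.

    Returns the new assignment and the number of shards that had to move.
    Shards are never split, so only whole shards can be handed over.
    """
    result = dict(assignment)
    servers = set(result.values()) | {new_server}
    target = len(result) // len(servers)  # fair share per server, rounded down
    load = Counter(result.values())
    moved = 0
    for shard in sorted(result):
        if load[new_server] >= target:
            break
        owner = result[shard]
        if load[owner] > target:  # only take shards from overloaded servers
            result[shard] = new_server
            load[owner] -= 1
            load[new_server] += 1
            moved += 1
    return result, moved


servers = ["db1", "db2", "db3"]  # hypothetical server names

# 12 shards on 3 servers: the new server can take over 3 whole shards.
many, moved_many = add_server(assign_round_robin(12, servers), "db4")

# 3 shards on 3 servers: nothing can be handed over without splitting a shard.
few, moved_few = add_server(assign_round_robin(3, servers), "db4")
```

In this toy model, the 12-shard collection ends up with three shards on each of the four servers, while the 3-shard collection leaves the new server empty.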


Thank you for the answer. I'm currently dealing with an inflow of about one million documents a day; it's an ever-growing collection (grows by a few TB each year). Should I just configure it with 1000 shards? Or would it perform better with fewer shards?


Yes, using 1000 shards would work. Eventually, we will also support splitting shards.
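For a rough sense of scale, the numbers from the question can be turned into per-shard growth. This is back-of-the-envelope arithmetic only: "a few TB" is taken as 3 TB purely as an assumption, documents are assumed to spread evenly over the shards, and the helper name `per_shard_growth` is made up for this sketch.

```python
DOCS_PER_DAY = 1_000_000     # from the question above
NUM_SHARDS = 1000            # from the question above
BYTES_PER_YEAR = 3 * 10**12  # ASSUMPTION: "a few TB" taken as 3 TB


def per_shard_growth(docs_per_day, bytes_per_year, num_shards, days=365):
    """Yearly growth per shard, assuming an even spread across all shards."""
    docs_per_year = docs_per_day * days
    return {
        "docs_per_shard_per_year": docs_per_year / num_shards,
        "bytes_per_shard_per_year": bytes_per_year / num_shards,
        "avg_doc_bytes": bytes_per_year / docs_per_year,
    }


growth = per_shard_growth(DOCS_PER_DAY, BYTES_PER_YEAR, NUM_SHARDS)
for name, value in growth.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions each shard grows by 365,000 documents and about 3 GB per year.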


Thank you, that was the answer I was hoping for.


Is there a reliable python module for working with ArangoDB 3.0?


We use joowani/python-arango, but I haven't tested it with 3.0 yet.


My dream is to see one of these graph databases use NetworkX for the Python API and the database for the back end / storage.


That's a pretty cool idea. Up until now I have used JSON whenever I wanted to use NetworkX. It shouldn't be too complex to connect them.
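A minimal sketch of that JSON route, using only the standard library: it turns exported edge documents into a dict-of-dicts adjacency mapping, which is one of the input formats NetworkX graph constructors accept. The `_from` and `_to` attributes are what ArangoDB stores on edge documents; the sample data and the helper name `edges_to_adjacency` are invented for illustration.

```python
import json


def edges_to_adjacency(edge_docs):
    """Build {source: {target: edge_attributes}} from ArangoDB-style edge documents."""
    adjacency = {}
    for doc in edge_docs:
        source, target = doc["_from"], doc["_to"]
        # Keep user attributes only; drop system attributes such as _from, _to, _key.
        attrs = {k: v for k, v in doc.items() if not k.startswith("_")}
        adjacency.setdefault(source, {})[target] = attrs
        adjacency.setdefault(target, {})  # make sure sink vertices appear as nodes
    return adjacency


# Hypothetical export of an edge collection as a JSON array.
exported = """[
  {"_from": "users/alice", "_to": "users/bob", "weight": 2},
  {"_from": "users/bob", "_to": "users/carol", "weight": 5}
]"""

adjacency = edges_to_adjacency(json.loads(exported))

# With NetworkX installed, the mapping can be loaded with:
#   import networkx as nx
#   graph = nx.DiGraph(adjacency)
```

Going through plain dictionaries keeps the database driver and NetworkX decoupled, at the cost of holding the whole edge list in memory.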



