
Modeling Data in MongoDB vs. ArangoDB - spountzy
https://www.arangodb.com/2014/11/06/data-modeling-mongodb-arangodb
======
tedchs
This actually looks pretty interesting. I appreciate that their FAQ has a great
answer to "What is ArangoDB and for what kind of applications is it designed
for?" -- more projects need to offer this kind of statement.
[https://www.arangodb.com/faq](https://www.arangodb.com/faq)

~~~
shangxiao
I like this project and am keeping an eye on it, but tbh that answer doesn't
really answer the question in a way that seems objective. It just says it's a
_"general purpose database offering all the features you typically need for
modern web applications"_.

------
wiremine
Does ArangoDB use the same storage strategy MongoDB does? From the FAQ:

"So how much RAM do you need? This depends on the size and structure of your
data: Your application will access one or many collections (think of
collections as denormalized tables for the time being). Once you open a
collection the indexes for this collection are created in the RAM and the data
is loaded into the RAM using memory-mapped files. If your collections are
bigger than your RAM, the operation system will be forced to swap data in and
out of the swap space."

I'm not an expert, but a lot of people seem to harp on MongoDB for this very
reason. Does ArangoDB use the same strategy? If not, how is it
similar/different?

~~~
bjerun
In principle, ArangoDB behaves similarly to MongoDB here. Both are essentially
"mostly-in-memory" databases in the sense that they hold the data in memory
and persist it at the same time to disk via memory mapped files. This approach
is good for performance and if you run out of RAM you ought to shard your
data.

However, MongoDB often uses a lot of memory for the actual data, since its
BSON binary format stores the names of the attributes with every single
document. ArangoDB detects similar shapes of documents (see
[https://www.arangodb.com/faq#how-do-shapes-work-in-arangodb](https://www.arangodb.com/faq#how-do-shapes-work-in-arangodb))
and thus avoids this particular problem.
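
To make the difference concrete, here is a rough sketch in Python. It is only a toy model: it uses compact JSON as a stand-in for the on-disk format, so it reproduces neither real BSON nor ArangoDB's actual shape storage, and the function names `size_with_names` and `size_with_shapes` are made up for this illustration. It just compares repeating the attribute names in every document against storing each distinct set of names once and referring to it by id.

```python
import json


def encoded_size(value):
    """Size in bytes of a compact JSON encoding (stand-in for a binary format)."""
    return len(json.dumps(value, separators=(",", ":")).encode("utf-8"))


def size_with_names(docs):
    """Every document carries its own attribute names."""
    return sum(encoded_size(d) for d in docs)


def size_with_shapes(docs):
    """Attribute names are stored once per distinct shape (hypothetical scheme)."""
    shapes = {}  # tuple of sorted attribute names -> shape id
    total = 0
    for d in docs:
        shape = tuple(sorted(d))
        if shape not in shapes:
            shapes[shape] = len(shapes)
            total += encoded_size(shape)  # pay for the names only once
        # Each document stores its shape id followed by the bare values.
        total += encoded_size([shapes[shape]] + [d[k] for k in shape])
    return total, len(shapes)


docs = [
    {"first_name": "user%d" % i, "last_name": "x", "age": i % 90}
    for i in range(1000)
]

named = size_with_names(docs)
shaped, shape_count = size_with_shapes(docs)
print("names in every document:", named, "bytes")
print("names once per shape:   ", shaped, "bytes,", shape_count, "shape(s)")
```

The longer the attribute names are relative to the values, the larger the gap gets in this model.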

~~~
wormit123
I have been bitten by this using MongoDB as well. The shape recognition of
ArangoDB sounds very useful. If this works well, it would alleviate a problem
that NoSQL solutions so far have in comparison to classical relational
databases.

------
neunhoef
Interesting article. An obvious reaction is to say: "In a document store, not
all joins will be efficient in a sharding situation!". This is true, but
certain queries involving joins backed by the right secondary indexes will
indeed scale well, therefore one should not use this argument as a reason not
to implement joins at all.

~~~
MillstoneX
Can you give an example for a join between two different sharded document
collections that can be executed efficiently on say 100 servers?

~~~
rafekett
just because the dataset is sharded doesn't mean that one query has to hit
every shard. for example, suppose you're looking for documents with `parent_id
= foo` and your sharding key is `parent_id`, then an intelligent query planner
would only query one shard (the one that "foo" hashes to), and then this looks
a lot like a join in an RDBMS. indeed, if you wanted to do (in RDBMS terms) a
self-join to load the whole tree of documents rooted at parent_id = foo, and
your sharding key were the root for each document, that query would only hit
one shard. the trick is deciding which keys to shard on (and, in many
cases, what other keys to shard on in redundant datastores that serve
different types of queries).
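
a minimal sketch of that routing idea in Python, assuming simple hash-based sharding on `parent_id`. the class `ShardedCollection`, the helper `shard_for`, and the `shards_touched` counter are all hypothetical names for this illustration; this is not how any particular database implements its planner.

```python
import hashlib

NUM_SHARDS = 100


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a shard-key value to a shard number with a stable hash."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedCollection:
    def __init__(self, num_shards=NUM_SHARDS):
        self.shards = [[] for _ in range(num_shards)]
        self.shards_touched = 0  # counts shard visits, for illustration only

    def insert(self, doc):
        # Documents are placed by their shard key, here parent_id.
        self.shards[shard_for(doc["parent_id"], len(self.shards))].append(doc)

    def find_by_parent(self, parent_id):
        # The predicate is on the shard key, so exactly one shard is visited.
        self.shards_touched += 1
        shard = self.shards[shard_for(parent_id, len(self.shards))]
        return [d for d in shard if d["parent_id"] == parent_id]

    def find_by_name(self, name):
        # The predicate is not on the shard key: every shard must be asked.
        self.shards_touched += len(self.shards)
        return [d for s in self.shards for d in s if d["name"] == name]


coll = ShardedCollection()
for i in range(1000):
    coll.insert({"parent_id": "p%d" % (i % 50), "name": "doc%d" % i})

children = coll.find_by_parent("p7")
print(len(children), "children found, shards visited:", coll.shards_touched)
```

calling `find_by_name` on the same collection visits all 100 shards, which is the case the original objection is about.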

~~~
neunhoef
Right, you were quicker but are essentially saying the same thing as I said in
my example.

------
Lerato
Is there a rule of thumb for when to model a connection as a foreign key and
when to model it as a graph? Or do you always use graphs?

~~~
Marc64
I think if you are connecting the same type of objects (e.g. users) you should
use graphs. If you have a 1:n relation between different types, you could just
as well use foreign keys. For n:m you again need graphs.

~~~
Philippos91
If you have a 1:n relation that you might want to annotate with, for instance,
a "type of relation", it is also feasible to use the graph model, as edges can
carry attributes.
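
A small sketch of the two modeling styles, written as plain Python dictionaries rather than real database calls. The `_from` and `_to` attribute names follow ArangoDB's edge documents; everything else (the sample data and the helpers `posts_by` and `neighbours`) is made up for illustration.

```python
# Foreign-key style: each post stores the key of its single author (1:n).
users = {"u1": {"name": "Ann"}, "u2": {"name": "Bob"}}
posts = [
    {"_key": "p1", "author": "u1", "title": "Hello"},
    {"_key": "p2", "author": "u1", "title": "Again"},
]


def posts_by(user_key):
    """Follow the foreign key from posts back to one user."""
    return [p["_key"] for p in posts if p["author"] == user_key]


# Graph style: each connection is its own edge document and can carry
# attributes such as the type of relation.
edges = [
    {"_from": "users/u1", "_to": "users/u2", "type": "follows", "since": 2013},
    {"_from": "users/u2", "_to": "users/u1", "type": "blocks", "since": 2014},
    {"_from": "users/u1", "_to": "posts/p1", "type": "wrote"},
]


def neighbours(start, edge_type):
    """Targets reachable from start over edges of the given type."""
    return [
        e["_to"] for e in edges
        if e["_from"] == start and e["type"] == edge_type
    ]


print(posts_by("u1"))
print(neighbours("users/u1", "follows"))
```

The foreign key has no place to put "since" or "type" without adding more fields to the post itself, which is where the edge documents become convenient.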

------
MillstoneX
I like that ArangoDB can be extended by micro services. Does this not raise
security concerns, because user code is executed on the DB server?

~~~
don71
This is an argument one often hears. However, V8 is encapsulated quite well,
since Chrome has the same issue.

Furthermore, these micro services can actually improve security: You can
implement your own scheme for authentication and authorisation on the document
level and deploy it to the database. Then, if your application has various
clients for different devices, they are all authorized in the same way by the
same code. This leads to a simplification in app development and thus to more
security, because there are fewer places to get right and the whole approach
is less error prone.

------
aikah
First time I've heard about ArangoDB, and it looks quite interesting.

When did the project start?

Could the Foxx thing be an independent application?

~~~
don71
ArangoDB was only started in 2012 but many years of experience in developing
special-purpose database solutions went into it. This is how the rapid
evolution into a market-ready product was possible at all.

Foxx is designed as the extension framework for ArangoDB and so it does not
really make sense to rip it out of the DB kernel. Furthermore, a lot of its
advantages would vanish if it no longer had immediate and rapid access
to the data.

------
dang
No astroturfing on HN, please.

~~~
shangxiao
Can you explain how this is astro-turfing?

~~~
dang
It's hard to know for sure, of course, but according to the data we look at,
some of these comments appear promotional rather than organic discussion.

That's not to say this isn't a great database, and we admire anyone who's
undertaking a hard project. But there are proper and improper ways to get
attention on HN. This one appeared to cross a line, hence my comment.

