
ArangoDB 3.7 – a big step forward for multi-model - porker
https://www.arangodb.com/2020/08/arangodb-3-7-a-big-step-forward-for-multi-model/
======
rishav_sharan
ArangoDB has always been the db that I keep checking back on every few
months. I love the idea and using it has been a peach so far. This is a very
underrated db and I wish more people would get into it.

To me, its best bits are:

1\. AQL - I absolutely adore AQL, the query language that you use in ArangoDB.
I haven't seen anything like it so far and it was something I could pick up in
5 mins.

2\. Multi-model - love the idea of being able to use the same db for graph and
object based data. With the new search and other geo features, this bit has
become even more tempting.

3\. Orchestration - the ability to have self-healing clusters of db nodes makes
the overall backend super redundant and safe, and also helps with latency by
enabling me to have a node in each region that I want to optimize for.

Despite all this, I almost always end up going for Postgres for my personal
projects.

Coming to the negatives, for me the biggest missed opportunities are:

1\. Not providing a free tier for the managed service. I really would want to
use the service for small toy projects without worrying about a time limit.
All of the managed service providers have a free/community tier and without
it, I don't see their service succeeding.

2\. Foxx server-in-db - This one pains me so much. I feel Foxx is the killer
feature they have that can push them past Postgres. Having the web server and
the db in the same node makes architecting the backend so simple. But Foxx
really isn't designed with developer ergonomics in mind. Issues that I opened
years ago are still open, without any major update which would make developing
in Foxx less of a chore. I would drop other web frameworks+dbs in a
heartbeat, if Foxx was easy to develop in.

3\. No Jepsen report. No techempower entry for a Foxx+arangodb backend. No
realworld backend app using arangodb. No major independent benchmarks.

~~~
rikroots
I agree with the above points (except Foxx - that looked too scary, so I
pretended it didn't exist so I have no opinion on how good/useful it could
be). Getting the database up-and-running on my local machine was pretty
painless and AQL was a joy to use.

The thing that put me off using it for anything other than a local toy project
was the costs (at the time) of hosting the database anywhere outside of my
laptop. Deploying ArangoDB in a Docker container on AWS ECS was far too much
money to justify for any of my side projects. Hopefully running costs are a
lot cheaper nowadays?

------
mroche
I’ve been meaning to look at Arango for some side projects to learn some NoSQL.
Having the primary NoSQL types in one db makes that a bit easier when writing
the code, though OrientDB can fit that bill as well. From my limited
experience their AQL syntax is pretty good and easy to grok, and the
documentation is also good for diving in and understanding the
models/features. The concept of Foxx APIs that someone can quickly spin up is
interesting, though I’m not sure I’d use it myself. I also quite like their
WebUI console, and having everything packaged up in the official arangodb[2]
container image makes it simple to get up and running.

They have some performance benchmarks on their site from 2018[0] that at the
time showed it to be reasonably competitive with the competition. I’ll leave
that up to actual DBAs to qualify though, I’m not really that knowledgeable in
databases. The benchmark is also open-source[1].

[0] [https://www.arangodb.com/2018/02/nosql-performance-benchmark...](https://www.arangodb.com/2018/02/nosql-performance-benchmark-2018-mongodb-postgresql-orientdb-neo4j-arangodb/)

[1] [https://github.com/weinberger/nosql-tests](https://github.com/weinberger/nosql-tests)

[2] [https://hub.docker.com/_/arangodb/](https://hub.docker.com/_/arangodb/)

~~~
PaulHoule
I use adb for online applications (my home control system; it archives sensor
readings, keeps configuration data) and also research projects such as loading
the MeSH biomedical ontology, following blood vessels from heart to hand, etc.

Arangodb = Mongodb - hype + results

Don't let your bad experiences with overhyped databases scare you away from
arangodb. It is real German engineering, not US startup culture that starts
with "the problem I am trying to solve is Larry Ellison is worth more than
me... People's lives are empty if they don't have an expensive service
contract..."

~~~
jd_mongodb
I'm sure ArangoDB is wonderful technology but your characterisation of MongoDB
is misplaced. The people who build the WiredTiger storage engine that MongoDB
depends on have been building database engines for over 20 years. They created
the original BerkeleyDB, built Oracle's NoSQL engine then left Oracle to
create the WiredTiger engine. They were acquired by MongoDB to completely
revamp our storage engine technology in 2014. Don't let 7 year old memes cloud
your judgement.

(I work for MongoDB).

~~~
oxfordmale
The below is not a seven year old meme:

Even at the strongest levels of read and write concern, MongoDB 4.2.6 failed
to preserve snapshot isolation. Instead, Jepsen observed read skew, cyclic
information flow, duplicate writes, and internal consistency violations. Weak
defaults meant that transactions could lose writes and allow dirty reads, even
downgrading requested safety levels at the database and collection level.

~~~
fortran77
I find MongoDB to be a very reliable database as long as I build my apps to
tolerate losing data.

~~~
jd_mongodb
If you have ever lost data running MongoDB (whether you are a paying customer
or not) we want to hear about that and fix it. We treat any instance of data
loss as a high priority bug.

------
FpUser
I guess it is just me but as soon as I see "contact our sales" instead of a
price I turn away. The absence of a perpetual license as an alternative to a
monthly fee does not help either.

~~~
dotBen
Running a database business is especially hard - constant development is
needed and so one time perpetual licenses don't really reflect the on-going
costs (esp staff) that a database company has.

Mix in the risk that your high value users end up using the FOSS edition and
it can get especially difficult.

------
dariosalvi78
I have used Arango in a couple of projects, it's a great piece of software. I
wish they had support for time series and then I'd be 1000% happy.

------
didibus
Those features are interesting, but how much can it scale? What performance
(read and write) can I expect? How easy is it to update a schema? How does it
handle pagination or queries with really large result sets? What are the limits
on the size of data I can store in it (total, per field, per row, etc.)? And
how does it handle security? Is everything encrypted at rest and in transit?
Can the index work over encrypted fields? Etc.

~~~
kyriakos
Great points here. I hear a lot about a lot of DBs, especially the new kids
on the block, but unless I see how they behave once you've got a few TB of data
in them I cannot even contemplate using them in production. Things like schema
changes, backup/restore, latency, fault tolerance etc. are too important to
leave to chance and to the initial marketing.

------
maxpert
Would somebody say it can hold data as reliably as Postgres? I’ve been very,
very impressed recently with Postgres and how long a DB can go (using Aurora
right now). I would love to try it out but I am just too iffy on whether it
will withstand the load reliably (even in the event of a crash).

~~~
brian_cloutier
I wish databases were more modular, I would love to be able to experiment with
other models for what a database could be, backed by the rock-solid postgres
write ahead log.

------
jmnicolas
Do I understand correctly that multi-model means you can create relational dbs,
document dbs, key-value pairs etc. in Arango?

If yes, I wonder why someone would need a multi model db, do you have any
examples?

~~~
controversy
Let’s play make-a-Facebook. You need profile information. That’s information
that’s accessed as a block. We can use documents for that. We then want to track
relationships. Friends. Friends of friends. We can use a graph. We might need
a lightweight cache. Opaque entries accessed by key. We can use a key-value
store for that. ArangoDB does all of these. Sometimes you want to join
documents to documents or any other form of pairing. ArangoDB does that too.

You can then scale this across multiple machines as necessary. The benefit of
such a design is that your team only needs to learn one technology, not many.
You don’t need to know Redis, Postgres and Neo4j to derive the same benefits.
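
To make the three access patterns concrete, here is a toy sketch in plain
Python. It uses in-memory dicts only and is not ArangoDB's API; every name in
it (profiles, friend_edges, cache, friends_within, friend_cities) is made up
for illustration. It just shows documents, a graph and a key-value cache being
combined in one query path.

```python
from collections import deque

# Hypothetical toy data; plain Python structures, NOT ArangoDB calls.
profiles = {  # document model: a whole profile is read as one block
    "alice": {"name": "Alice", "city": "Berlin"},
    "bob": {"name": "Bob", "city": "Cologne"},
    "carol": {"name": "Carol", "city": "Munich"},
    "dave": {"name": "Dave", "city": "Hamburg"},
}
friend_edges = [  # graph model: undirected friendship edges
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
]
cache = {}  # key-value model: opaque entries accessed by key


def neighbours(user):
    """Direct friends of a user, following edges in both directions."""
    out = set()
    for a, b in friend_edges:
        if a == user:
            out.add(b)
        elif b == user:
            out.add(a)
    return out


def friends_within(user, max_depth):
    """Breadth-first traversal: everyone reachable in 1..max_depth hops."""
    seen = {user}
    found = set()
    queue = deque([(user, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the requested depth
        for n in neighbours(current):
            if n not in seen:
                seen.add(n)
                found.add(n)
                queue.append((n, depth + 1))
    return found


def friend_cities(user, max_depth=2):
    """Join graph results to profile documents, memoised in the cache."""
    key = ("friend_cities", user, max_depth)
    if key not in cache:
        cache[key] = sorted(
            profiles[f]["city"] for f in friends_within(user, max_depth)
        )
    return cache[key]
```

Calling friend_cities("alice") walks friends and friends of friends, looks up
each profile document, and stores the result under a cache key. In a
multi-model db those three steps would hit one system instead of three.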

~~~
zaphirplane
Isn’t a graph DB a superset of a document DB?

Node == document

Properties == attributes

Edge == relationship

~~~
jlokier
At a high level, you could say that you can model your data in either, so
either can implement the other, and you can also include relational DBs in
that too. They are all "equivalent" in an abstract sense. But it doesn't mean
they support all uses equally well.

A graph DB is optimised for traversing a general graph structure, whereas a
document DB is optimised for a tree-structured document and sometimes queries
can't traverse links between documents.

Optimised means performance, layout in storage (so locality, retrieval and join
patterns), the kinds of query operators that are offered, and even that the
language they use is more suited to different ways of modelling data.

~~~
zaphirplane
Once you’ve implemented a graphDB you have a document DB. In a graphDB you may
query all vertices with a property x=foo, which translates to get all
documents with field x=foo.

Effectively you can market a graphDB as a document DB, the reverse isn’t true.
What am I missing?

~~~
jlokier
You're missing that the documentDB will be faster and simpler to use for some
kinds of use cases, is simpler to understand in some ways, and that the
query/update language used by the documentDB will funnel application design
towards storage and access patterns that work better with a documentDB.

Of course you can implement a documentDB on top of a graphDB, or market the
latter as the former. And of course there are applications running on a
documentDB that would be as fast or faster on a graphDB.

The differences are a matter of "impedance mismatch" rather than being
insurmountable.

For example, if you query all vertices with a property x=foo, then query all
properties of the vertices, and then traverse all tree-child like properties
to more vertices, and continue doing this recursively, _that_ query will be
like getting all documents with x=foo. But that's more complicated to express
in a graphDB QL than a documentDB QL, and likely to run slower on the graphDB
(due to data non-locality) if there are many properties or much tree depth.

In general a documentDB stores all the data for a document clustered together
_without being told to_ , and likes to retrieve them as a unit. Because that
structure is clear, applications tend to be designed around it as an
assumption.
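
A rough sketch of that difference in plain Python, under made-up assumptions
(in-memory dicts, hypothetical names like doc_store, vertices, child_edges and
assemble; no real database API). The document store returns the nested record
in one lookup, while the graph version has to find the matching vertex and then
recursively follow child edges to rebuild the same record.

```python
# Document model: each record is stored (and fetched) as one nested unit.
doc_store = {
    "d1": {"x": "foo", "author": {"name": "Ann", "address": {"city": "Oslo"}}},
    "d2": {"x": "bar", "author": {"name": "Bob"}},
}


def docs_where(field, value):
    """One pass over the store; each hit already carries its whole tree."""
    return [d for d in doc_store.values() if d.get(field) == value]


# Graph model: the same data split into flat vertices plus labelled edges.
vertices = {
    "v1": {"x": "foo"},
    "v2": {"name": "Ann"},
    "v3": {"city": "Oslo"},
    "v4": {"x": "bar"},
    "v5": {"name": "Bob"},
}
child_edges = [  # (parent vertex, property name, child vertex)
    ("v1", "author", "v2"),
    ("v2", "address", "v3"),
    ("v4", "author", "v5"),
]


def assemble(vertex_id):
    """Rebuild a nested record by recursively following child edges."""
    record = dict(vertices[vertex_id])
    for src, label, dst in child_edges:
        if src == vertex_id:
            record[label] = assemble(dst)  # one extra hop per nested level
    return record


def graph_docs_where(field, value):
    """Match vertices on a property, then traverse to collect their subtree."""
    return [
        assemble(vid)
        for vid, props in vertices.items()
        if props.get(field) == value
    ]
```

Both docs_where("x", "foo") and graph_docs_where("x", "foo") give back the same
nested record here; the graph version just needs one more edge lookup per level
of nesting, which is where the non-locality cost would come from in a real
system.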

