
ArangoDB Receives $10M Series A Funding - kylesellas
https://www.arangodb.com/2019/03/arangodb-receives-series-a-funding-led-by-bow-capital/
======
rrampage
Congrats to the Arango team! I used it at my earlier workplace for computing
suggested friends/followers which replaced an older service (Postgres + redis
and application server-side caches). The resulting solution was faster, ran on
a single modest machine (8 GB RAM, 4 cores) and allowed us to spin down 3
higher-end machines and reduce a layer of caches on application-servers.
Writing the microservice using Foxx (which is built-in to Arango) was a
pleasure and the easy deployment + Swagger API was a great developer
experience. The community slack was friendly and helped me out with some AQL.

------
ZeroCool2u
I'd never heard of Arango before seeing this post, but the main selling point
seems to be:

"A native multi-model database from the ground up, supporting key/value,
document and graph models. You can model your data in a very flexible way."

Which actually seems like it could justify having its own query
language. The built-in search is also a nice touch.

~~~
ifcologne
Imagine you would go to your preferred online marketplace and search for a
generic product.

You get 1000+ results.

So you filter by avg.star-rating > 4.0

Still 500+ results.

Those with just one 5 star rating in front of the one with 300 reviews and a
4.8 avg. Annoying.

What I really want:

I would like to filter for products that have at least 5 (relatively long)
reviews, an average rating of 4.0 and at least 2 of these review comments
mentioning the use case for which I would like to use this product. Maybe I
just want the verified purchases to be counted or the reviews of friends and
friends of friends...

Using a native multi-model approach you can do both. Simply retrieve all
category X products ranked by product rating, limit 50/page or perform
advanced lookups - without having to synchronize data from a document or
relational model with an additional graph or search engine.

Combining full text search with scorers, graph traversals and/or join
operations you could do an ad-hoc query in AQL to get the most relevant
products & reviews with a single query.

Multi-model provides choice. In data modeling and querying.
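
To pin down the criteria listed above, here is a minimal in-memory sketch in plain JavaScript. It is not AQL and not how ArangoDB evaluates such a query. The field names (`reviews`, `rating`, `text`), the 20-character threshold for a "relatively long" review, and the choice to average only over the long reviews are all assumptions made for illustration.

```javascript
// Hypothetical sample data; field names are assumptions, not a real schema.
const tent = {
  name: "Tent",
  reviews: [
    { rating: 5, text: "Took it winter camping and stayed warm all night." },
    { rating: 4, text: "Great for winter camping, a bit heavy to carry." },
    { rating: 5, text: "Sturdy poles and easy setup even in strong wind." },
    { rating: 4, text: "Good value, the zippers feel slightly cheap though." },
    { rating: 4, text: "Packs down small and has survived three trips." },
  ],
};
const mug = { name: "Mug", reviews: [{ rating: 5, text: "Nice." }] };

// True if the product has at least 5 long reviews, those reviews average
// at least 4.0, and at least 2 of them mention the use case.
function isRelevant(product, useCase, minLength = 20) {
  const longReviews = product.reviews.filter((r) => r.text.length >= minLength);
  if (longReviews.length < 5) return false;

  const total = longReviews.reduce((sum, r) => sum + r.rating, 0);
  const average = total / longReviews.length;

  const needle = useCase.toLowerCase();
  const mentions = longReviews.filter((r) =>
    r.text.toLowerCase().includes(needle)
  ).length;

  return average >= 4.0 && mentions >= 2;
}

const relevantNames = [tent, mug]
  .filter((p) => isRelevant(p, "winter camping"))
  .map((p) => p.name);
console.log(relevantNames);
```

The product with a single five-star review is filtered out, which is the annoyance described earlier in this comment.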

~~~
ddebernardy
> Those with just one 5 star rating in front of the one with 300 reviews and a
> 4.8 avg. Annoying.

That's easy enough to solve with a bayesian average. The problem is many
developers and product managers don't know much or anything about stats.

~~~
dorgo
How exactly? By adding a number (C) of average (2.5 stars) "virtual" reviews
to all products?

[https://en.wikipedia.org/wiki/Bayesian_average](https://en.wikipedia.org/wiki/Bayesian_average)

~~~
ddebernardy
Like so:

[http://www.evanmiller.org/how-not-to-sort-by-average-rating....](http://www.evanmiller.org/how-not-to-sort-by-average-rating.html)
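
For reference, the "virtual reviews" idea dorgo asks about can be sketched in a few lines of JavaScript. This is a minimal illustration of a Bayesian average only; the prior mean of 2.5 stars comes from dorgo's question, while the weight of 10 virtual reviews is an arbitrary assumption. Note that the linked article describes a different technique, the lower bound of a Wilson score interval for up/down votes, which is not shown here.

```javascript
// Bayesian average: treat every product as if it started with
// `priorWeight` virtual reviews, each rated `priorMean`.
function bayesianAverage(ratings, priorMean, priorWeight) {
  const sum = ratings.reduce((acc, r) => acc + r, 0);
  return (priorWeight * priorMean + sum) / (priorWeight + ratings.length);
}

const PRIOR_MEAN = 2.5;  // "average" virtual rating, as in the question above
const PRIOR_WEIGHT = 10; // number of virtual reviews (C); assumed value

const singleFiveStar = [5];
const manyReviews = Array(300).fill(4.8); // stand-in for 300 reviews averaging 4.8

const scoreSingle = bayesianAverage(singleFiveStar, PRIOR_MEAN, PRIOR_WEIGHT);
const scoreMany = bayesianAverage(manyReviews, PRIOR_MEAN, PRIOR_WEIGHT);

// The product with 300 reviews now ranks above the one with a single 5-star review.
console.log(scoreSingle < scoreMany);
```

With few reviews the score stays close to the prior; with many reviews it converges to the plain average.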

------
asien
Seeing all the comments here it seems like Arango is a good fit for many use
cases.

I would really recommend that the founders of the company invest in
marketing; it’s really important for developers to have something that speaks
to them.

My point is that maybe the issue here isn’t the performance or features of the
database but rather the marketing that prevents it from finding its market fit.

~~~
ganeshkrishnan
We don't really care about the performance (speed) but mostly about features
(text search + timeseries) and scalability with clusters

~~~
MarvelousWololo
What? We who?

~~~
ganeshkrishnan
"We" is a prospective customer of this db. Maybe other devs/companies do need
speed over multi-functionality

------
jwr
I'm waiting for an official round of testing by Jepsen (i.e. not in-house
testing, but paid testing by Kyle Kingsbury).

It should be a bar to pass for every distributed database.

------
Sander_Marechal
Well deserved! At work I have been using ArangoDB for a few years now as a
graph database. So far it's been working great with up to 100K graph nodes
across two dozen collections.

~~~
k__
I liked the idea very much, but I guess I won't use any unmanaged
infrastructure software any time soon.

I had too many struggles in the past with MySQL and RethinkDB so I will go
with what cloud providers are offering.

------
antirez
Congrats, the ArangoDB folks are very nice people as well as skilled
developers.

~~~
don71
Claudius from Arango. Thanks Salvatore! Long time not seen, I hope you are
well.

~~~
antirez
All fine, greetings!

------
arxpoetica
ArangoDB is a perfect tool for prototyping or early stages of companies that
need data that might have multiple looks to it. I've used it on tons of small
projects and have nothing but praise. It's a solid not-so-little beast.

~~~
bitcoinmoney
Can you ELI5 the use cases?

------
yingw787
Congratulations on the funding!

I don't understand under what specific use case ArangoDB works best; the
comparisons section lists Cassandra and Neo4j, and my understanding was
Cassandra was for something like chat apps and Neo4j was for something like
GIS analytics. Enlighten me?

Also, how convertible is a proprietary query language like AQL/CQL to SQL? Is
it fully declarative, and is it versioned completely independently of the
database core?

~~~
jsteemann
Regarding the question on the query language, AQL is fully declarative. In
this respect it is like SQL. However, there are a few differences between AQL
and SQL:

* SQL is an all-purpose database management and querying language. It is very
  complex and heavy-weight as it has to solve a lot of different problems,
  e.g. data definition, data retrieval and manipulation, stored procedures
  etc. AQL is much more lightweight, as its purpose is querying and
  manipulating database data. Data definition and database administration
  commands are not part of AQL, but can be achieved using other, dedicated
  commands/APIs.
* For data retrieval and manipulation, the functionality of SQL and AQL
  overlaps a lot, but they use different keywords for similar things. Still,
  simple SQL queries can be converted to AQL easily and vice versa. There are
  some specialized parts of AQL, such as graph traversals and shortest path
  queries, for which there may be no direct equivalent in SQL.

AQL is versioned along with the database core, as sometimes features are added
to AQL which the database core must also support and vice versa. However,
during further development of AQL and the database core, one of the major
goals is to keep it always downwards-compatible, meaning that existing AQL
queries are expected to work and behave identically in newer versions of the
database (but ideally run faster or are better optimized there).

~~~
yingw787
Okay, I like how backwards compatibility is preserved. I worked with mongoDB
at my previous company and we ended up not being able to migrate to mongoDB
3.x. I think it was because we forked 'eve-mongoengine' and couldn't merge
upstream changes, which ended up forcing us to version the entire stack
through the database at the same time, which passed the threshold of
feasibility.

We were absolute idiots, but I still think a data warehouse should be idiot-
proof, which is why I like SQL.

I read through the documentation for ArangoDB and I would be concerned about
the lack of native strict type definitions and referencing in AQL, as well as
the dearth of type availability in ArangoDB in general. Is this a design
decision related to not supporting data/database administration, or something
to be added later to the roadmap?

It sounds like if you support write-intensive paths through the database, it
would be considered an OLTP database for some OLTP workloads; do you publish
TPC-C benchmarks anywhere? What about resource utilization?

Is there a particular reason to support JavaScript first? Is it because
Swagger has JavaScript-first support, or a different reason?

~~~
jsteemann
ArangoDB is a schema-less database. There is currently no support for schemas
or schema validation on the database core level, but it may be added later,
because IMHO it is a very sensible feature. When that is in place, AQL may
also be extended to get more strict about the types used in queries. However,
IMHO that should only be enforced if there is a schema present.

To keep things simple and manageable, we originally started with AQL just
being a language for querying the database. It was extended years ago to
support data manipulation operations. I don't exclude the possibility that at
some point it will support database administration or DDL commands, however, I
am just one of the developers and not the product manager. And you are right
about the main use case being OLTP workloads. For OLAP use cases, dedicated
analytical databases (with fixed data schemas) are probably superior, because
they can run much more specialized and streamlined operations on the data. To
the best of my knowledge we never published any TPC benchmark results anywhere. I
think it's possible to implement TPC-C even without SQL, however, implementing
the full benchmark is a huge amount of work, so we never did...

------
topicseed
Used the free edition for a while, mainly for graphs, and it was amazing. Will
most likely use a paid version once it exists as a managed service.

~~~
princetman
I also feel a managed service is quite a critical offering they're missing
right now. Considering the recent AWS and Elastic debacle, the market is going
to be tough for open source products like ArangoDB.

~~~
janemanos
Jan from ArangoDB here. Can't disclose anything yet but feel free to join the
webinar of our co-founder Claudius. He will share details about our future
plans in only 2 weeks:
[https://www.arangodb.com/arangodb-events/why-native-multi-mo...](https://www.arangodb.com/arangodb-events/why-native-multi-model-plus-sneak-peek/)

------
mjburgess
I'm a bit more confused on this one, not having seen the tech before. Isn't
the DB space absurdly saturated with open source tools like this not really
having much life in them?

Postgres is a multi-model db, with document/keyvalue/graph -- isn't it just
pretty easy for an established player to add a data model to their platform?

~~~
ifcologne
The art is to combine all data models using one query language without
duplicating data.

------
veritas3241
I played with Arango a few years ago to prototype some graph stuff. Super fun
to play with and it was awesome being able to traverse the graph so easily.

We were playing with data to make it easy to go from a specific analyte that
was generated all the way up through its protein, DNA, chromosome, disease,
and phenotype via the graph. I'm sad the project never went anywhere, but even
back then Arango was great.

Congrats to the team!

------
azzuwan
ArangoDB is definitely my database of choice. There is a lot to like. Ease of
setup and clustering, free REST API, solid graph features with AQL, great
docs. I have been promoting it in my projects. I would love to be their
partner or tech evangelist for Southeast Asia. If you guys are looking, I am
game for it.

------
magthor
Congrats, Arango! We recently ported a large rethinkdb app to arango and it
has been a joy to use. AQL is awesome.

------
princetman
I wonder how one would douse investors' concerns about having an open source
product like ArangoDB, with AWS effectively eating their lunch if/when wide
adoption comes?

Congratulations on the funding btw! I'm a happy and grateful user.

~~~
jsteemann
The database market with all its competition is definitely challenging. I have
no doubt AWS will increase their database market share over time. The good
thing about this competition is that it is forcing all vendors to be
innovative and to find (more) USPs.

AWS DocumentDB seems to be pretty much tied to the MongoDB API right now... so
at the moment this will somewhat limit its functionality. However, they will
not stand still and probably also extend into the multi-model space at some
point. Apart from that, not everyone will be willing to pay for DocumentDB or
have their data located in Amazon datacenters.

~~~
k__
_" AWS DocumentDB seems to be pretty much tied to the MongoDB API"_

I could imagine that they didn't build DocumentDB from the ground up.

DocumentDB is probably just a MongoDB compatible API for one of their base
services (S3 or DynamoDB).

As far as I know, they built Serverless Aurora on top of S3, with the help of
S3 select. So they will probably just create another custom-DB compatible API
if they have the impression that this custom DB becomes the next big thing.

~~~
jsteemann
Exactly, AWS DocumentDB is only MongoDB API-compatible, but it's not using any
MongoDB components.

It's an implementation of its own, leveraging many of the base building blocks
and infrastructure Amazon has created.

DocumentDB is currently tied to the MongoDB 3.6 API, which means all the
transactional extensions MongoDB has added recently are not present in
DocumentDB (yet).

------
jbjorge
Congrats!

We've used ArangoDB for a while where I work, and have only had positive
experiences so far. The query language, speed, and flexibility are all nice to
work with.

------
z3t4
hmm, I wonder how hard it would be to make a JavaScript driver that lets you
manipulate data just like you do in JavaScript, eg. using map,reduce/filter,
push etc. For me it's a lot of overhead when switching back and forth between
different languages, eg. between JS and SQL. Even though SQL is a powerful
language and I'm really good at it.

~~~
pluma
(full disclosure: I work for ArangoDB but this is my own personal opinion)

Coming from a JS background AQL is actually pretty easy to learn. Personally
the only thing that keeps tripping me up is that AQL doesn't have a triple-
equals and JS has trained me to avoid double-equals in comparisons.

This is how you fetch every user in a collection:

    
    
        FOR user IN users RETURN user
    

This is how you fetch every admin:

    
    
        FOR user IN users FILTER user.role == "admin" RETURN user
    

This is how you fetch their email addresses:

    
    
        FOR user IN users FILTER user.role == "admin" RETURN user.email
    

Compare this to the equivalent in SQL:

    
    
        SELECT email FROM users WHERE role = 'admin'
    

The AQL example is IMO easy to read if you know JS or any similar language.
AQL even has object and array literals. There are a few idiosyncrasies but you
can get very far without needing to invest time to "properly" learn the
language. The naive approach usually results in pretty good performance out of
the box.

I'd say the mental overhead of switching into AQL and out doesn't quite
compare to that of e.g. SQL or even MongoDB queries but you are of course
correct that there is some overhead nevertheless. That said, there are
community-maintained ODMs for ArangoDB if you don't want to touch another
language to write the queries by hand.

I would strongly recommend giving AQL a try though. When I started using
ArangoDB (before becoming a contributor) I was hesitant as well but what
quickly won me over was that I was able to read most AQL queries without
having to learn an entirely new language.

~~~
z3t4
> This is how you fetch every user in a collection:
    
    
        await db.users
    

This is how you fetch every admin:

    
    
        await db.users.filter( user => user.role == "admin" )
    

This is how you fetch their email addresses:

    
    
        await db.users.filter( user => user.role == "admin" ).map( user => user.email )
    
    

I think it can be done using JavaScript Proxy.

Here's an update query:

    
    
       user.email = "updated@email.ltd"
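
A rough sketch of how the Proxy idea could look, assuming an in-memory object stands in for the database. Everything here (`store`, `pendingUpdates`, `trackDocument`) is hypothetical and not part of any ArangoDB driver. It also loads whole collections and filters them client-side; translating the callbacks into server-side queries would be the hard part and is not attempted.

```javascript
// In-memory stand-in for a database; a real driver would talk to a server.
const store = {
  users: [
    { _key: "1", role: "admin", email: "a@example.com" },
    { _key: "2", role: "user", email: "b@example.com" },
  ],
};

// Writes captured by the `set` trap; a real driver would turn these
// into update queries instead of just collecting them.
const pendingUpdates = [];

// Wrap a document so that a plain assignment is recorded as an update.
function trackDocument(collection, doc) {
  return new Proxy(doc, {
    set(target, field, value) {
      pendingUpdates.push({ collection, key: target._key, field, value });
      target[field] = value;
      return true; // tell the runtime the assignment succeeded
    },
  });
}

// `db.<name>` resolves to an array of tracked documents, so the usual
// Array methods (filter, map, ...) work unchanged.
const db = new Proxy({}, {
  get(_target, collection) {
    const docs = store[collection] || [];
    return docs.map((doc) => trackDocument(collection, doc));
  },
});

// Fetch the email addresses of every admin.
const adminEmails = db.users
  .filter((user) => user.role === "admin")
  .map((user) => user.email);

// Update query: the assignment is intercepted and recorded.
const [firstAdmin] = db.users.filter((user) => user.role === "admin");
firstAdmin.email = "updated@email.ltd";

console.log(adminEmails, pendingUpdates);
```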

------
wray
Arango feels wonderful already. I'm thrilled to see how new funding improves
the user experience. :) Cheers and congrats!

------
RHSman2
Single best tool out there and am a big fan and user in our products.

------
iblaine
What are the closest equivalents / competitors to ArangoDB?

~~~
vlangber
OrientDb is a multimodel db with graph support, so it seems pretty similar.

------
cwoodward
Outstanding! Well deserved for an outstanding product!

------
lrfink
Congratulations to the whole team. Well deserved!

------
LFNL
Nice going, congratulations!

------
ahs1200
World domination from here

------
dlxsrc
Great work boys.

------
janemanos
Congrats, guys!

