

DB-Engines Ranking ranks database management systems according to popularity - X4
http://db-engines.com/en/ranking

======
espeed
It's cool to see graph DBs like Titan and Neo4j moving up the ranks.

There are some major advancements coming down the pipe in the world of open-
source graph computing that will make working with big graph data accessible
to anyone, not just the Googles, Facebooks, and Twitters of the world.

Here are some of the things coming down the pipe:

Titan 0.4 was just released
([https://github.com/thinkaurelius/titan](https://github.com/thinkaurelius/titan)),
and the number of backend datastores Titan supports is growing. Datagrid
support was just added for Hazelcast, and it can serve as the reference
implementation if anyone wants to add support for Infinispan, Galaxy
([http://puniverse.github.io/galaxy/](http://puniverse.github.io/galaxy/)), or
one of the other datagrids.

MapR is in the final stages of certifying Titan on M7 Tables
([https://groups.google.com/d/msg/aureliusgraphs/RTeFVssIvoI/m...](https://groups.google.com/d/msg/aureliusgraphs/RTeFVssIvoI/mkUEla0MgCgJ)),
which will allow you to run Titan on HBase without all the HBase complexity.
AWS
([http://aws.amazon.com/elasticmapreduce/mapr/](http://aws.amazon.com/elasticmapreduce/mapr/))
and GCE
([http://www.mapr.com/products/google-cloud-platform](http://www.mapr.com/products/google-cloud-platform))
already have direct support for M7, so it's easy to spin up a cluster.

TinkerPop3 is scheduled for release within the next six months
([https://github.com/tinkerpop/tinkerpop3/wiki](https://github.com/tinkerpop/tinkerpop3/wiki)),
and it will blur the lines between graph databases and graph processing
engines.

Marko just released the first version of TinkerPop3's OLAPGraph -- this is
Blueprints for graph-processing engines, which means that in addition to OLTP
graph databases like Titan and Neo4j, TinkerPop3 will support OLAP engines
like Giraph, HAMA, Faunus, GraphLab, and the new GraphX engine in Spark
([http://amplab.github.io/graphx/](http://amplab.github.io/graphx/),
[http://www.youtube.com/watch?v=mKEn9C5bRck&list=PLbDk7g7PotW...](http://www.youtube.com/watch?v=mKEn9C5bRck&list=PLbDk7g7PotW0vi5zI0w7Cn4APmsrUXz0E&index=4)).

You'll be able to run Gremlin over any Blueprints-enabled graph database or
graph-processing engine, and Gremlin will be able to jump between the database
and the processing engine, depending on whether it's a local or global graph
algorithm
([http://markorodriguez.com/2011/04/19/local-and-distributed-t...](http://markorodriguez.com/2011/04/19/local-and-distributed-traversal-engines/)).
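
To make the local/global distinction concrete, here is a minimal Python sketch.
It uses plain dictionaries, not the Gremlin or Blueprints API, and the graph and
function names are made up for illustration. A local traversal starts at one
vertex and only touches its neighborhood, which suits an OLTP graph database. A
global algorithm such as PageRank touches every vertex on every iteration, which
suits an OLAP engine.

```python
# Toy directed graph as adjacency lists (hypothetical data).
# Assumption: every vertex has at least one outgoing edge.
GRAPH = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def friends_of_friends(graph, start):
    """Local traversal: walk two hops out from a single start vertex."""
    result = set()
    for neighbor in graph[start]:
        for second in graph[neighbor]:
            if second != start:
                result.add(second)
    return result


def pagerank(graph, damping=0.85, iterations=20):
    """Global algorithm: each iteration visits every vertex and every edge."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in graph}
        for v, out_edges in graph.items():
            share = rank[v] / len(out_edges)
            for w in out_edges:
                new_rank[w] += damping * share
        rank = new_rank
    return rank
```

The first function only ever reads the edges near `start`, while the second has
to sweep the whole graph repeatedly, which is why it makes sense to route them
to different engines.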

And Marko's recent breakthrough on swarm computing over derived graphs
([https://groups.google.com/d/msg/gremlin-users/1KObZ8F2d00/CJ...](https://groups.google.com/d/msg/gremlin-users/1KObZ8F2d00/CJo_tN7qqKQJ))
means you'll be able to run traditional graph
algos over property graphs. This will pave the way for the community to
construct a massive library of graph algorithms in Gremlin
([https://github.com/tinkerpop/furnace/wiki](https://github.com/tinkerpop/furnace/wiki)).

All of this is coming together now. 2014 will see a major leap in open-source
graph computing.

~~~
kbenson
I understand very little of what you just posted, but purely because of my own
ignorance. This is wonderful; I now have a reference doc on what to read up on
this weekend. HN FTW.

------
SloopJon
What's not apparent from the title is that this is a popularity ranking,
similar to the TIOBE Programming Community Index:

[http://www.tiobe.com/index.php/content/paperinfo/tpci/index....](http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html)

------
smoyer
Interesting but I wonder how accurate it really is. Two (at a minimum) of the
metrics in their average can't really be used to measure popularity (of use):

\- Number of mentions of the system on websites

\- General interest in the system

There are two different problems with these metrics:

\- Both will include people who have negative opinions of the system. There's
no effective way to separate out the people who are discussing their distaste
for a system (remember comcastsucks.com?).

\- Both are subject to misrepresentation caused by external factors such as
the quality of the system's documentation, use of forums for support and how
consistent the system is in general. Some systems require much more research
to be run in a stable fashion.

I've used Oracle, PostgreSQL and CouchDB extensively, and MySQL plus a couple
of others to a (much) lesser extent, and I get the sense that the rankings are
in the right order (I have no idea about MSSQL), but I doubt that the scale is
really so logarithmic.

~~~
pfarrell
Mentions of a database, no matter its popularity, are signals of its use and
the activity of its community. I guess it depends on whether you define
popularity as being liked, or as being deployed and used. They should make the
raw data available if they want this to be believed. I'd like to know which
metrics contribute most to a specific DB's rank.

I have a problem, however, with the use of Google as a source. I'm not sure how
you can trust it to give you an unbiased result now that it produces
personalized results using who knows what (e.g. incognito mode in Chrome still
exposes your IP address).

------
Spearchucker
There's little use for me in popularity rankings. Much more useful are TPC-C
rankings
([http://www.tpc.org/tpcc/results/tpcc_results.asp?orderby=dbm...](http://www.tpc.org/tpcc/results/tpcc_results.asp?orderby=dbms))

~~~
X4
wtf, are you kidding?! There are ONLY commercial solutions from a handful of
commercial vendors. Yeah, truly trustworthy resource... Really, have you even
looked at the results? I mean, dude: IBM, MS, ORACLE, NEC, and then it starts to
repeat...

So I understand that you request a cost per transaction column and it makes
sense, but that link sucks.
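
For what it's worth, TPC-C results do report a price/performance figure: total
system price divided by throughput in tpmC. A minimal Python sketch of ranking
by that figure follows. The system names and numbers are entirely hypothetical
and not taken from tpc.org.

```python
# Hypothetical results: (system, total system cost in USD, throughput in tpmC).
# These numbers are invented for illustration only.
RESULTS = [
    ("system-a", 3_000_000, 6_000_000),
    ("system-b", 500_000, 400_000),
    ("system-c", 1_200_000, 1_500_000),
]


def price_per_tpmc(total_cost, tpmc):
    """Price/performance: cost per unit of throughput (lower is better)."""
    if tpmc <= 0:
        raise ValueError("tpmC must be positive")
    return total_cost / tpmc


def rank_by_price_performance(results):
    """Sort results from cheapest to most expensive per tpmC."""
    return sorted(results, key=lambda r: price_per_tpmc(r[1], r[2]))
```

Sorting this way can put an expensive but very fast system ahead of a cheap,
slow one, which is the point of looking at cost per transaction rather than
raw throughput.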

~~~
Spearchucker
That's still a useful set of benchmarks. Do you know of something similar that
includes F/OSS? I'm assuming tpc.org doesn't include them because open source
OLTP isn't as performant. The TPC-C benchmark is particularly interesting to
me because I work with high availability systems where eventual consistency
isn't something I can consider.

~~~
jeltz
> I'm assuming tpc.org doesn't include them because open source OLTP isn't as
> performant.

That assumption is incorrect. The open source databases are not included due
to nobody having spent the time and money necessary to be included there. Sun
did it for PostgreSQL many years ago and got decent results.

------
gwu78
This is a decent list.

Columns I'd like to see added:

1\. Language the db is written in

2\. Lines of code

Popularity tells me little about how capable the developer is and the quality
of the software.

~~~
ams6110
How do the implementation language and LOC counts tell you anything about the
capability of the developer and the quality of the software?

~~~
gwu78
See my other reply.

In short (pun intended), it is part of the heuristics I have developed over
the years for evaluating software.

I like small programs. That's not unheard of. I also look for programs that
compile quickly and easily on more than one platform.

The latter is actually of practical significance, since I cannot as quickly
and easily try new software if it will not compile without modifications, or
if it takes too much memory or too much time. If it takes me longer to compile
some userland program than it takes me to compile my kernel, I'm more likely
to look for lighter-weight alternatives.

Personally, I have not found popularity alone to be a reliable heuristic when
it comes to software.

------
mgleason_3
Hmm. How does Oracle come out that far ahead, given that they supposedly
measured:

\- Number of mentions of the system on websites

\- General interest in the system

~~~
misterjangles
It's not really surprising to me. Oracle has been the DB system for enterprise
systems for years - they have a lot of corporations running on their platform.

------
vjoel
Where's Datomic? Maybe it's excluded because it relies on other databases as
storage services?

------
snissn
It would be great if column-oriented DBs weren't tagged as relational
databases!

~~~
cynwoody
Column orientation is an implementation detail. It has nothing to do with
whether or not a DBMS is relational. Whether a system should be classified as
relational depends on how fully it supports the standard SQL interfaces, not
on how it works under the covers.
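
A minimal Python sketch of that point, using a hypothetical table and made-up
function names rather than any real DBMS: the same logical table stored
row-wise and column-wise answers the same query with the same result. Only the
physical layout differs.

```python
# The same logical table "users(name, age)" in two physical layouts.
ROW_STORE = [
    {"name": "ann", "age": 31},
    {"name": "bob", "age": 25},
    {"name": "cid", "age": 40},
]

COLUMN_STORE = {
    "name": ["ann", "bob", "cid"],
    "age": [31, 25, 40],
}


def names_older_than_rows(rows, min_age):
    """SELECT name FROM users WHERE age > min_age, over a row layout."""
    return [row["name"] for row in rows if row["age"] > min_age]


def names_older_than_columns(columns, min_age):
    """The same query over a column layout: scan one column, read another."""
    matches = [i for i, age in enumerate(columns["age"]) if age > min_age]
    return [columns["name"][i] for i in matches]
```

Both functions return the same names for the same query. The column layout
only changes how much data has to be scanned to get there.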

------
TheLoneWolfling
It would be interesting to have a programming language with an integrated OO
DBMS.

------
coldcode
Amazes me to see that there exist this many databases.

~~~
X4
Yeah, me too. I have the feeling that developers find themselves attacking the
problems of compression, entropy and abstraction of meaning more and more often
these days. I see that as a good future, though I hope that some convergence
of the evolutionarily better DBs happens. Commercial DBs rot and die in
popularity compared to open-source solutions, which marks the value of our
time: cooperation and usefulness over individual benefit.

Can't wait for 2014, and not just because 2013 was a bad year.

