Poll: What database does your company/you use?
228 points by rorhug on May 11, 2014 | 95 comments
Please upvote this if you would like to see more people take the poll.

This has been asked on occasion in the past but like all things, database preference/technology changes quickly. Let's see what people are using today.

If you think I've left out an important one, leave a comment and I will try to edit the post.

PostgreSQL
627 points
MySQL
501 points
Redis
301 points
MongoDB
220 points
Microsoft SQL Server
196 points
SQLite
157 points
Memcached
128 points
ElasticSearch
128 points
Oracle
111 points
The File System
80 points
MariaDB
68 points
Cassandra
61 points
Other
45 points
Amazon DynamoDB
41 points
Riak
27 points
Neo4j
25 points
DB2
25 points
CouchBase (Couch, Membase included)
24 points
BigTable
19 points
Custom
19 points
LevelDB
18 points
RethinkDB
16 points
HBase
15 points
Firebase
12 points
Amazon SimpleDB
11 points
Vertica
10 points
RavenDB
9 points
Datomic
8 points
Tokyo
7 points
Teradata
7 points
Firebird
5 points
Informix
5 points
Netezza
2 points
VoltDB
2 points
Greenplum
1 point


A word of opinionated advice to anyone deciding on a database to use for their new project: Just go with something popular (MySQL, Postgres, Mongo, etc.), or go with an Amazon hosted cloud database if you really don't want to set anything up (RDS, SimpleDB, DynamoDB).

There's no good reason to start a project with other databases. If you're worried about the scalability of something like MySQL when you haven't even got your first 1000 users, your priorities are messed up. Finish your product and get your users first. And I hope to god if you DO get your users and I ever have to maintain it, it's not running on some bullshit like Tokyo.


Better advice: just use Postgres.

Three reasons: 1) your data is probably relational, even if you think it's not, 2) if you have non-relational records, Postgres has JSON and hash storage types that perform better than most NoSQL databases, and 3) The architecture, standards compliance, and sheer capabilities of Postgres are all best in class. Seriously, if you want to read some good C, peruse the PostgreSQL source. Go Bears.

Unless you're trying to do some analytics (in which case, find a column store), or want to use it and forget it (Amazon's database services; and RDS is Postgres), Postgres is the way to go.


RDS can be Postgres, but isn't always. You can choose from MySQL, SQL Server, Oracle and Postgres, but Postgres support is currently in beta [1].

[1]: https://aws.amazon.com/rds/postgresql/


I don't think this is terribly good advice. Many databases have specific domains in which they excel. I agree you shouldn't be worried about scaling before you have users, but you should absolutely evaluate different tools before picking one.


This is good advice. I've scaled multiple sites to 3+ million registered users and more than 1 million unique visitors with heavy DB usage. This is excellent advice for starting out. Along the way we found areas that could be improved, and we were able to move on them quickly because our underlying DB was standard stuff. We knew which areas were breaking down and fixed them with targeted moves to other DBs and things like memcache. Yes, each DB has a specialty, but a good general DB gets you a lot of mileage: specialty DBs do very little that a general DB can't, while the reverse is often not true.

You would never start a million-mile journey with an unclear destination by asking for a moon rover as your vehicle. As your journey develops you would start looking at other options, or at options for certain segments. Apps follow similar patterns.


What you're saying is true, but worrying about scaling before it's necessary is unfortunately common, and I've felt the pain caused by this (modern) premature optimization.


Balance is the answer.

This tends to be true of all things.


What exactly is wrong with Tokyo Cabinet?


I would guess very few people have any experience with it, let alone know its quirks and the good reasons to use it.

I wouldn't build my MVP on obscure technologies just because they're hip or promise something useful when you have 1M users...


This might be good advice if you're building cookie cutter systems for clients that have high turnover of contractors, but for startups I think it's horrible.

I'd take one person that really groks devops and database internals well enough to make informed choices over a million one-size-fits-all SQL/Mongo+MVC devs. Any day of the week.


"but for startups I think it's horrible."

Why? Wouldn't it make more sense for startups to color within the lines, in terms of getting features to market faster?

"I'd take one person that really groks devops and database internals well enough to make informed choices"

Funded or not, a startup has the responsibility to turn every dollar into a feature that creates revenue, not architecture.


First, I never said "don't use SQL, even where it makes sense." The original commenter wrote, "advice to anyone deciding on a database".

I strongly disagree with the notion that any practicing or aspiring creator should refuse to learn about and consider the full range of tools at their disposal, simply because there's a popular default choice.

Specialized tools provide a form of leverage to those who understand and employ them wisely. Every problem I have ever come across that is sufficiently difficult and interesting to warrant forming a startup requires deeper insight on a technical level than "just ignore everything else and use X".

Can you imagine if the founders of Google took this kind of advice?


"Can you imagine if the founders of Google took this kind of advice?"

As a couple of examples, Google started on clusters of relatively commodity hardware. For years, AdWords ran on MySQL. Google's advantage wasn't in the innovative architecture, but rather, in the PageRank logic, which led to leadership in other areas. The architecture supports the innovation, which is the competitive advantage; the architecture isn't the competitive advantage.


What do you do when that guy quits?


    575 PostgreSQL
    462 MySQL
    275 Redis
    201 MongoDB
    184 Microsoft SQL Server
    144 SQLite
    124 Memcached
    122 ElasticSearch
    100 Oracle
     80 The File System
     66 MariaDB
     57 Cassandra
     44 Other
     41 Amazon DynamoDB
     24 Riak
     23 CouchBase (Couch, Membase included)
     23 DB2
     20 Neo4j
     19 Custom
     18 BigTable
     16 LevelDB
     14 HBase
     14 RethinkDB
     12 Firebase
     11 Amazon SimpleDB
      9 RavenDB
      9 Vertica
      8 Datomic
      7 Tokyo
      5 Teradata
      5 Firebird
      5 Informix
      2 VoltDB
      2 Netezza

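    // Run in the browser console on the poll page. Assumes the poll is the 4th <table>
    // and that its rows come in groups of three (name row, points row, one skipped row).
    var res = [], rows = document.querySelectorAll('table')[3].childNodes[0].childNodes;
    for (var i = 0; i + 1 < rows.length; i += 3) {
      res.push({name: rows[i].innerText.replace(/\n/g, ''), points: parseInt(rows[i + 1].innerText, 10)});
    }
    // Sort by points, highest first, then print "points name" for each option.
    res.sort(function (a, b) { return b.points - a.points; })
       .forEach(function (r) { console.log(r.points, r.name); });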


I'm really happy to see PostgreSQL doing so well. They've been up against MySQL and Oracle for so long and really deserve it due to all the hard work the team have put into the amazing OSS project.


As far as I'm concerned, PostgreSQL should be the standard for any new free software projects. It's just too bad so many older projects (WordPress, etc.) are tied to MySQL.


Sybase ASE and Sybase SQL Anywhere are not listed, and yes, they are different.

If you are going to have column stores, you should have InfiniDB.

And if you are going to have SQLite, you should also list MS SQL Server Compact. MS Access is also a viable option, as are dBase and FileMaker.

Clustrix and NuoDB should also be in the list if you are listing NewSQL systems like VoltDB.

I am sure there are others :S As a DBA, I would be interested to know which businesses or systems back onto column stores alone, without a traditional RDBMS or a modern NoACID system.


Sorry. There are hundreds of Database projects and I can see where you're coming from. I added the most popular and had a search around for some smaller ones that seem to be gaining traction. The comments should suffice otherwise.


Ask HN: Has anyone ever considered "MS Access a viable option"?


"Access, why would we need that? We already have a shared folder with all our Excel sheets"


This is true, and several of the world's biggest companies run that way. Several-hundred-megabyte spreadsheets, with VBA interfaces copy & pasted between them; can't change it because the unknown cost of breaking it is potentially greater than the known cost of hiring an entire department of people to do one database server's job :(


I freelanced once for the local VA hospital and they sure tried!


The numbers for redis are going to be misleading.

Are people voting for it because they use redis as their primary data store, or just because they use it at all?

I use redis as a cache and I haven't voted for it because I assume this poll is asking about the primary data store. Does anyone use redis as their main db?


I personally would have included Google Datastore given how many app engine apps there are out there, but I guess Bigtable is a reasonable approximation of that.


For some reason I thought all major distros dropped MySQL in favor of MariaDB. The list, however, is not too short and it includes Debian, Ubuntu, RHEL, Fedora, openSUSE, among others.

https://mariadb.com/kb/en/distributions-which-include-mariad...

If you thought you were using MySQL, check your package repository!


I just felt silly voting so many times.

I guess I work at a big company. Our big data team is dealing with HBase and Vertica, our game team is dealing with mysql, our web team is migrating a redis into couchdb, and my team deals with mysqldb, neo4j, cassandra and sqlite on a test server where no one wants to set something proper up. Memcached and the file system are in use on most services where it makes sense.


Yep, similar here -- many teams, many use cases, many DBs in use:

* Postgres (including RDS), Oracle, and MariaDB all used as transactional stores on projects of varying vintage

* Vertica for aggregate reporting

* Hadoop/HBase in another data warehouse

* Mongo in several key-value stores

* Amazon DynamoDB under evaluation for likely adoption for a newer key-value store

* memcached ubiquitous

* SQLite for simple read-only stores within web services

* Cassandra and Redis under evaluation for task queue result stores

In a large, mature organization, I think this kind of heterogeneity is both inevitable and appropriate.


"Some companies have valid reasons not to start prototyping with postgres, it's just yours isn't one of them" - anon

I've found that to be true time and time again. With other systems it's easy to quickly paint yourself into a corner, with postgres it's fairly straightforward to move to another system, e.g. cassandra/mongo/couch/etc (that's IF you actually ever need to move from postgres)


All the academic records, such as which student took which course and what grades they got, are stored in Informix at Middle East Technical University (metu.edu). It is such a pain in the ass that at the beginning of the semester, during interactive registration, the system always goes down. As far as I know it has a maximum of something like 30-100 active sessions. They were unable to move it to a more modern database. Here is some discussion: https://groups.google.com/forum/#!topic/comp.databases.infor...


It'd be nice if HN sorted polls by points


Only after you had voted and reloaded the page. Otherwise the effect would be to promote the most popular options.

Ideally when you hadn't voted, the ordering would be random.


Has anyone written a little web app to turn an HN poll into a nice chart, automagically? (Extra hipster points for using D3 and some random and inappropriate visualisation no one has ever heard of.)



Nice!


I suppose it really depends on the poll. Answering ‘What is your age?’ would be frustrating if all the options were scrambled. But anything subjective like ‘What’s your favourite programming language?’ would certainly benefit from random ordering.


This is not a popularity vote, it's a vote on what DB you actually use. For a "Best DB system" question, you need to evaluate each DB and rank them all.


It has a lot of options, people will answer a few, get bored... scroll down and read some comments.

The options towards the bottom of the list may not get as much attention and votes even if they are used by people.

If it were alphabetical, people might hunt and peck for the one they use most, e.g. PostgreSQL, they might then see things around there (memcached) but fail to scroll back up and vote for Custom too.

Random deals with it well enough without even knowing the poll content or number of options.
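
The ordering rule discussed in this subthread (random until you have voted, sorted by points afterwards) could be sketched roughly as below. This is only an illustration, not how HN works: `orderOptions`, the `hasVoted` flag and the injectable `rng` parameter are all hypothetical names, and the three sample options are taken from the poll above.

```javascript
// Hypothetical helper: decide the display order of poll options.
// rng is injectable (defaults to Math.random) so the shuffle can be tested.
function orderOptions(options, hasVoted, rng = Math.random) {
  const out = options.slice(); // copy, so the caller's array is never mutated
  if (hasVoted) {
    // After voting, show the most popular options first.
    return out.sort((a, b) => b.points - a.points);
  }
  // Before voting, use a Fisher-Yates shuffle so list position carries no signal.
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1)); // random index in 0..i
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

const poll = [
  { name: 'PostgreSQL', points: 627 },
  { name: 'MySQL', points: 501 },
  { name: 'Custom', points: 19 },
];
console.log(orderOptions(poll, true).map(o => o.name));  // sorted by points
console.log(orderOptions(poll, false).map(o => o.name)); // random order
```

As the comment above says, the shuffle needs no knowledge of the poll content or the number of options, which is why it works as a generic default.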


No Solr but ElasticSearch?


The reason I thought of adding ElasticSearch in the end was because some projects have started to use it as a DB on its own.


We have been using Solr on its own as a database for years. Many times I hear the "don't use solr as a database" rhetoric, but honestly for our usage it is amazing. We're using it as a document store and querying using fulltext and key/value indexes. It is fast and reliable and has handled millions of documents pretty easily. I've been happy with it.

Currently evaluating ElasticSearch and it looks really promising also. We may move over to it as it seems to have a bit more momentum and better support for cloud/scaling.


ElasticSearch and Solr are both search engines built on top of Lucene. ElasticSearch just has all the hipster hype these days. Maybe update the description to include all three.


Lucene isn't really a database, though.


It would be interesting to know the context: small or large company, whether there is a strong IT focus or IT is just an internal service for the company, etc. I think that may give more telling results (i.e. enterprises that buy off-the-shelf are often forced down the Oracle/SQL Server route, causing them to select the same tech for bespoke systems to limit DB diversity).


I think the only easy way to find that sort of stuff out is through the comments as HN doesn't allow for much customisation. It has its upsides though.


Relevant – http://www.databasefriends.co/2014/03/favorite-relational-da...

It's a recent poll about favorite relational database and Postgres came first by far.


At work I inherited a MyISAM, no-foreign-keys, no-transactions, duplicate-indexes, bad-indexes, no-composite-indexes, no-backups mysql db, along with plenty of other problems.

But at home and for personal projects I use postgresql + its advanced features. So I voted for both.


More opinionated advice: ElasticSearch and the entire "ELK" stack (Logstash+ElasticSearch+Kibana) is awesome, but don't use it as a primary data store. It's not meant for that, even though some of us are using it as such.


I wonder how this distribution is skewed based on company size, sector, and region (e.g. large established financial companies in the Midwest primarily prefer X; small social media startups on the West Coast primarily prefer Y).


Currently using Postgres, but not utilising any of its more powerful features.


I'd start with going over the points of this article (assuming you haven't done just that:) -- https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Serv...


The PostgreSQL 9.0 High Performance book has been great for us. It goes in depth into most of the interesting variables that you can tweak on your instance to make sure you're making the most out of your machine's resources: http://www.packtpub.com/postgresql-90-high-performance/book

A bit of a lengthy read, but worth it.


Seconding this. Packt Publishing is really hit or miss, but PostgreSQL 9.0 High Performance is a hit.


No SQL Server?


I have it in as 'Microsoft'.


That could also mean Access, unfortunately. Microsoft SQL Server is the name most people would be looking for.


Fixed :)


For those curious here's a similar post from ~4 years ago:

https://news.ycombinator.com/item?id=1411937


The vast majority of places I've worked at used MySQL, which isn't surprising considering most have been LAMP shops. Only one Postgres, and one MSSQL


Only 8 for Datomic; I thought HN was only Clojure users?


We are using RocksDB as well for something. Although it is based on LevelDB, it's pretty different now, so it could be added as a separate entry.


Interested, what are you using RocksDB for?




No love for ZODB? (Zope Database) No, we don't use it, but I'm surprised to see it drop completely off the map.


Redshift? or would that go under PostgreSQL?


memcached as a database is a stretch.


MemcacheDB and memcached are both databases. You put data in, you get data out (if it exists).

In CAP theorem they are CP.


I just realized that I only use DB2 on AS/400 right now. That's scary.


BerkeleyDB?


LevelDB


Good idea, added.



Neo4j is listed twice.


Fixed, Thanks.


Virtuoso Open Source


Virtuoso is great for handling large-ish graphs and then being able to use SPARQL to walk the nodes. We used it at the last company I worked for and we were able to get fairly good performance out of it while walking hundreds of thousands of edges. I think it would have scaled even beyond that; however, the Python RDFLib bindings we were using were a bit slow.


None of your companies are old enough to use Oracle?


IBM Domino... :-(


I worked on Domino--it's a mail server that uses DB2.


Wasn't aware of it using DB2, not that I care much what it is as long as it works. As you probably know, having worked on it, it's an application platform too, which is mostly what we use it for. Sadly, none of my colleagues has the time to develop any apps for it and I only dabble around in high level languages, so everyone just uses the blank office template and creates documents with tables and stuff. But the search is great, and the save conflict management keeps some of the IT-related headaches away. I wish there was a self-hosted web app equivalent though, as the UI for editing is similar to MS Office and thus easy for Average Joe to grasp. The drag-drop storage of files inside documents is nifty.


Poor Domino. Domino is a NoSQL-ish distributed document database and application platform that traces its roots back to 1989. Mail is just one possible application that happens to come out-of-the-box. Domino was mismanaged by IBM after its purchase of Lotus. If IBM had only provided more killer apps out-of-the-box, like project management, etc., that businesses need (instead of hoping 3rd parties would write 'em), it could have dominated (pun?). Instead, we have MS SharePoint (blah!). In fact, today Domino sucks ... but I do get misty eyed thinking about what it could have been.


Where is OrientDB on this list? Please add it


FoundationDB


No CouchDB?


CouchBase (Couch, Membase included)


CouchBase isn't the same as CouchDB.


PostgreSQL and MySQL have more in common than Couchbase and CouchDB


Hana


LDAP?


MonetDB


/dev/null


Which only supports one operation: DELETE, doesn't it?


INSERT works fine too, but you cannot SELECT tho :(


I bet you get wicked performance.



