Poll: What database does your company/you use?

fragsworth · on May 11, 2014

A word of opinionated advice to anyone deciding on a database to use for their new project: Just go with something popular (MySQL, Postgres, Mongo, etc.), or go with an Amazon hosted cloud database if you really don't want to set anything up (RDS, SimpleDB, DynamoDB).

There's no good reason to start a project with other databases. If you're worried about the scalability of something like MySQL when you haven't even got your first 1000 users, your priorities are messed up. Finish your product and get your users first. And I hope to god if you DO get your users and I ever have to maintain it, it's not running on some bullshit like Tokyo.

calinet6 · on May 11, 2014

Better advice: just use Postgres.

Three reasons: 1) your data is probably relational, even if you think it's not, 2) if you have non-relational records, Postgres has JSON and hash storage types that perform better than most NoSQL databases, and 3) The architecture, standards compliance, and sheer capabilities of Postgres are all best in class. Seriously, if you want to read some good C, peruse the PostgreSQL source. Go Bears.

Unless you're trying to do some analytics (in which case, find a column store), or want to use it and forget it (Amazon's database services; and RDS is Postgres), Postgres is the way to go.

ahjones · on May 12, 2014

RDS can be Postgres, but isn't always. You can choose from MySQL, SQL Server, Oracle and Postgres, but Postgres support is currently in beta [1].

[1]: https://aws.amazon.com/rds/postgresql/

imsofuture · on May 11, 2014

I don't think this is terribly good advice. Many databases have specific domains in which they excel. I agree you shouldn't be worried about scaling before you have users, but you should absolutely evaluate different tools before picking one.

tiquorsj · on May 11, 2014

This is good advice. I've scaled multiple sites to 3+ million registered users and more than 1 million unique visitors with heavy DB usage. This is excellent advice for starting out. Along the way we found areas that could be improved and we were able to move quickly to them because our underlying DB was standard stuff. We knew what areas were breaking down and then fixed them with targeted moves to other DBs and things like memcache. Yes, each DB has a specialty, but a good general DB gets you a lot of mileage: those specialty DBs do very little that a general DB can't, and the reverse is often not true.

You would never start a million-mile journey with a cloudy destination by asking for a moon rover as your vehicle. As your journey develops you would start looking at other options, or at options for certain segments. Apps follow similar patterns.

emn13 · on May 11, 2014

What you're saying is true, but worrying about scaling before it's necessary is unfortunately common, and I've felt the pain caused by this (modern) premature optimization.

calinet6 · on May 11, 2014

Balance is the answer.

This tends to be true of all things.

MasterScrat · on May 11, 2014

What exactly is wrong with Tokyo Cabinet?

raspie · on May 12, 2014

I would guess, very few people have any experience with it, let alone know its quirks and good reasons to use it.

I wouldn't build my MVP on obscure technologies just because they're hip or promise something useful when you have 1M users...

christiansmith · on May 11, 2014

This might be good advice if you're building cookie cutter systems for clients that have high turnover of contractors, but for startups I think it's horrible.

I'd take one person that really groks devops and database internals well enough to make informed choices over a million one-size-fits-all SQL/Mongo+MVC devs. Any day of the week.

bdcravens · on May 11, 2014

"but for startups I think it's horrible."

Why? Wouldn't it make more sense for startups to color within the lines, in terms of getting features to market faster?

"I'd take one person that really groks devops and database internals well enough to make informed choices"

Funded or not, a startup has the responsibility to turn every dollar into a feature that creates revenue, not architecture.

christiansmith · on May 11, 2014

First, I never said "don't use SQL, even where it makes sense." The original commenter wrote, "advice to anyone deciding on a database".

I strongly disagree with the notion that any practicing or aspiring creator should refuse to learn about and consider the full range of tools at their disposal, simply because there's a popular default choice.

Specialized tools provide a form of leverage to those who understand and employ them wisely. Every problem I have ever come across that is sufficiently difficult and interesting to warrant forming a startup requires deeper insight on a technical level than "just ignore everything else and use X".

Can you imagine if the founders of Google took this kind of advice?

bdcravens · on May 13, 2014

"Can you imagine if the founders of Google took this kind of advice?"

As a couple of examples, Google started on clusters of relatively commodity hardware. For years, AdWords ran on MySQL. Google's advantage wasn't in the innovative architecture, but rather, in the PageRank logic, which led to leadership in other areas. The architecture supports the innovation, which is the competitive advantage; the architecture isn't the competitive advantage.

pbiggar · on May 11, 2014

What do you do when that guy quits?

Flenser · on May 12, 2014

    575 PostgreSQL
    462 MySQL
    275 Redis
    201 MongoDB
    184 Microsoft SQL Server
    144 SQLite
    124 Memcached
    122 ElasticSearch
    100 Oracle
     80 The File System
     66 MariaDB
     57 Cassandra
     44 Other
     41 Amazon DynamoDB
     24 Riak
     23 CouchBase (Couch, Membase included)
     23 DB2
     20 Neo4j
     19 Custom
     18 BigTable
     16 LevelDB
     14 HBase
     14 RethinkDB
     12 Firebase
     11 Amazon SimpleDB
      9 RavenDB
      9 Vertica
      8 Datomic
      7 Tokyo
      5 Teradata
      5 Firebird
      5 Informix
      2 VoltDB
      2 Netezza

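    // Run in the browser console on the poll page. Assumes the poll is the 4th <table>
    // and that its rows repeat as: option name, points, spacer.
    var res = [], arr = document.querySelectorAll('table')[3].childNodes[0].childNodes;
    for (var i = 0; i + 1 < arr.length; i += 3) {
      res.push({name: arr[i].innerText.replace(/\n/g, ''), points: parseInt(arr[i + 1].innerText, 10)});
    }
    // Sort by points, highest first, and print one "points name" line per option.
    res.sort(function (a, b) { return b.points - a.points; })
       .forEach(function (r) { console.log(r.points, r.name); });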
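
For anyone who would rather work from the pasted tally than from the page DOM, here is a minimal sketch that parses lines like the ones above and adds each option's share of the total. `parseTally` and `withShare` are hypothetical names, not anything from HN, and because one voter can tick several options the share is of votes cast, not of voters.

```javascript
// Hypothetical helper: turn lines like "    575 PostgreSQL" into {points, name} objects.
// Lines that do not start with a number (blank lines, headers) are skipped.
function parseTally(text) {
  return text
    .split('\n')
    .map(function (line) { return /^\s*(\d+)\s+(.+?)\s*$/.exec(line); })
    .filter(function (m) { return m !== null; })
    .map(function (m) { return { points: parseInt(m[1], 10), name: m[2] }; });
}

// Hypothetical helper: add each option's percentage of all votes cast,
// rounded to one decimal place.
function withShare(rows) {
  var total = rows.reduce(function (sum, r) { return sum + r.points; }, 0);
  return rows.map(function (r) {
    var share = total === 0 ? 0 : Math.round((r.points / total) * 1000) / 10;
    return { points: r.points, name: r.name, share: share };
  });
}
```

Usage would be something like `withShare(parseTally(pastedText))`, where `pastedText` is the list copied from this comment.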

rorhug · on May 11, 2014

I'm really happy to see PostgreSQL doing so well. They've been up against MySQL and Oracle for so long and really deserve it due to all the hard work the team have put into the amazing OSS project.

ForHackernews · on May 11, 2014

As far as I'm concerned, PostgreSQL should be the standard for any new free software projects. It's just too bad so many older projects (Wordpress, etc.) are tied to MySQL.

mathnode · on May 11, 2014

Sybase ASE and Sybase SQL Anywhere are not listed; yes, they are different.

If you are going to have column stores, you should have InfiniDB.

And if you are going to have SQLite, you should also list MS SQL Server Compact. MS Access is also a viable option, also DBase and Filemaker.

Clustrix and NuoDB should also be in the list if you are listing NewSQL systems like VoltDB.

I am sure there are others :S As a DBA, I would be interested to know which businesses or systems rely only on column stores, without a traditional RDBMS or a modern NoACID system.

rorhug · on May 11, 2014

Sorry. There are hundreds of Database projects and I can see where you're coming from. I added the most popular and had a search around for some smaller ones that seem to be gaining traction. The comments should suffice otherwise.

dgesang · on May 11, 2014

Ask HN: Has anyone ever considered "MS Access a viable option"?

staz · on May 11, 2014

"Access, why would we need that? We already have a shared folder with all our Excel sheets"

Shish2k · on May 12, 2014

This is true, and several of the world's biggest companies run that way. Several-hundred-megabyte spreadsheets, with VBA interfaces copy & pasted between them; can't change it because the unknown cost of breaking it is potentially greater than the known cost of hiring an entire department of people to do one database server's job :(

gamegoblin · on May 11, 2014

I freelanced once for the local VA hospital and they sure tried!

evv · on May 11, 2014

The numbers for redis are going to be misleading.

Are people voting for it because they use redis as their primary data store, or just because they use it at all?

I use redis as a cache and I haven't voted for it because I assume this poll is asking about the primary data store. Does anyone use redis as their main db?

icco · on May 11, 2014

I personally would have included Google Datastore given how many app engine apps there are out there, but I guess Bigtable is a reasonable approximation of that.

pestaa · on May 11, 2014

For some reason I thought all major distros dropped MySQL in favor of MariaDB. The list, however, is not too short and it includes Debian, Ubuntu, RHEL, Fedora, openSUSE, among others.

https://mariadb.com/kb/en/distributions-which-include-mariad...

If you thought you were using MySQL, check your package repository!

tetha · on May 11, 2014

I just felt silly voting so many times.

I guess I work at a big company. Our big data team is dealing with HBase and Vertica, our game team is dealing with mysql, our web team is migrating a redis store into couchdb, and my team deals with mysqldb, neo4j, cassandra and sqlite on a test server where no one wants to set something proper up. Memcached and the file system are in use on most services where they make sense.

aaronharnly · on May 11, 2014

Yep, similar here -- many teams, many use cases, many DBs in use:

* Postgres (including RDS), Oracle, and MariaDB all used as transactional stores on projects of varying vintage

* Vertica for aggregate reporting

* Hadoop/HBase in another data warehouse

* Mongo in several key-value stores

* Amazon DynamoDB under evaluation for likely adoption for a newer key-value store

* memcached ubiquitous

* SQLite for simple read-only stores within web services

* Cassandra and Redis under evaluation for task queue result stores

In a large, mature organization, I think this kind of heterogeneity is both inevitable and appropriate.

Pci · on May 12, 2014

"Some companies have valid reasons not to start prototyping with postgres, it's just yours isn't one of them" - anon

I've found that to be true time and time again. With other systems it's easy to quickly paint yourself into a corner; with postgres it's fairly straightforward to move to another system, e.g. cassandra/mongo/couch/etc. (that's IF you actually ever need to move from postgres).

darkhorn · on May 11, 2014

All the academic records, such as which student took which course and what grades they got, are stored in Informix at Middle East Technical University (metu.edu). It is such a pain in the ass that at the beginning of the semester, during interactive registration, the system always goes down. As far as I know it has a maximum of something like 30-100 active sessions. They were unable to move it to some modern database. Here is some discussion: https://groups.google.com/forum/#!topic/comp.databases.infor...

DevX101 · on May 11, 2014

It'd be nice if HN sorted polls by points

buro9 · on May 11, 2014

Only after you had voted and reloaded the page. Otherwise the effect would be to promote the most popular options.

Ideally when you hadn't voted, the ordering would be random.
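
A minimal sketch of that ordering rule, assuming each option is an object with a `points` field: shuffle before the user has voted, sort by points afterwards. `orderOptions` and its `random` parameter are hypothetical names for illustration; nothing here reflects how HN actually renders polls.

```javascript
// Hypothetical helper: return poll options in display order without mutating the input.
// Before voting: uniformly random order (Fisher-Yates shuffle on a copy).
// After voting: sorted by points, highest first.
function orderOptions(options, hasVoted, random) {
  var rand = random || Math.random; // injectable so the shuffle can be tested
  var out = options.slice();
  if (hasVoted) {
    return out.sort(function (a, b) { return b.points - a.points; });
  }
  for (var i = out.length - 1; i > 0; i--) {
    var j = Math.floor(rand() * (i + 1)); // pick an index in 0..i
    var tmp = out[i];
    out[i] = out[j];
    out[j] = tmp;
  }
  return out;
}
```

Because the shuffle never looks at the option names or at how many options there are, it works for any poll, which is the point made above.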

bshimmin · on May 11, 2014

Has anyone written a little web app to turn an HN poll into a nice chart, automagically? (Extra hipster points for using D3 and some random and inappropriate visualisation no one has ever heard of.)
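
Short of a full web app, a text chart takes only a few lines. This is a rough sketch, not D3: `textChart` is a hypothetical helper that assumes rows shaped like `{name, points}` and scales every bar against the largest value.

```javascript
// Hypothetical helper: render rows of {name, points} as a text bar chart.
// width is the length, in characters, of the longest bar.
function textChart(rows, width) {
  var max = rows.reduce(function (m, r) { return Math.max(m, r.points); }, 0);
  return rows.map(function (r) {
    var len = max === 0 ? 0 : Math.round((r.points / max) * width);
    // Right-align the count in 5 columns, then the bar, then the option name.
    return String(r.points).padStart(5) + ' ' + '#'.repeat(len) + ' ' + r.name;
  }).join('\n');
}
```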

Sami_Lehtinen · on May 12, 2014

Results as chart: http://hnlike.com/hncharts/chart/?id=7729603

bshimmin · on May 12, 2014

Nice!

craz · on May 11, 2014

I suppose it really depends on the poll. Answering ‘What is your age?’ would be frustrating if all the options were scrambled. But anything subjective like ‘What’s your favourite programming language?’ would certainly benefit from random ordering.

Moru · on May 11, 2014

This is not a popularity vote, it's a vote on what DB you actually use. For a "Best DB system" question, you need to evaluate each DB and rank them all.

buro9 · on May 11, 2014

It has a lot of options, people will answer a few, get bored... scroll down and read some comments.

The options towards the bottom of the list may not get as much attention and votes even if they are used by people.

If it were alphabetical, people might hunt and peck for the one they use most, e.g. PostgreSQL, they might then see things around there (memcached) but fail to scroll back up and vote for Custom too.

Random deals with it well enough without even knowing the poll content or number of options.

thezilch · on May 11, 2014

No Solr but ElasticSearch?

rorhug · on May 11, 2014

The reason I thought of adding ElasticSearch in the end was because some projects have started to use it as a DB on its own.

cakeface · on May 12, 2014

We have been using Solr on its own as a database for years. Many times I hear the "don't use solr as a database" rhetoric but honestly for our usage it is amazing. We're using it as a document store and querying using fulltext and key/value indexes. It is fast and reliable and has handled millions of documents pretty easily. I've been happy with it.

Currently evaluating ElasticSearch and it looks really promising also. We may move over to it as it seems to have a bit more momentum and better support for cloud/scaling.

jjguy · on May 11, 2014

ElasticSearch and Solr are both search engines built on top of Lucene. ElasticSearch just has all the hipster hype these days. Maybe update the description to include all three.

mhluongo · on May 11, 2014

Lucene isn't really a database, though.

johnlbevan2 · on May 11, 2014

It would be interesting to know the context: small or large company, whether there is a strong IT focus or IT is just an internal service for the company, etc. I think that may give more telling results (e.g. enterprises that buy off-the-shelf are often forced down the Oracle/SQL Server route, which leads them to select the same tech for bespoke systems to limit DB diversity).

rorhug · on May 11, 2014

I think the only easy way to find that sort of stuff out is through the comments as HN doesn't allow for much customisation. It has its upsides though.

lauriswtf · on May 15, 2014

Relevant – http://www.databasefriends.co/2014/03/favorite-relational-da...

It's a recent poll about favorite relational database and Postgres came first by far.

ddorian43 · on May 11, 2014

At work I inherited a MySQL DB on MyISAM: no foreign keys, no transactions, duplicate indexes, bad indexes, no composite indexes, no backups, and many other problems.

But at home and for personal projects I use PostgreSQL and its advanced features. So I voted for both.

whatever2001 · on May 12, 2014

More opinionated advice: ElasticSearch and the entire "ELK" stack (Logstash + ElasticSearch + Kibana) are awesome, but don't use it as a primary data store. It's not meant for that, even though some of us are using it as such.

jtcain · on May 12, 2014

I wonder how this distribution is skewed by company size, sector, and region (i.e., large established financial companies in the Midwest primarily prefer X; small social media startups on the West Coast primarily prefer Y).

wldlyinaccurate · on May 11, 2014

Currently using Postgres, but not utilising any of its more powerful features.

wfn · on May 11, 2014

I'd start with going over the points of this article (assuming you haven't done just that:) -- https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Serv...

akurilin · on May 11, 2014

The PostgreSQL 9.0 High Performance book has been great for us. It goes in depth into most of the interesting variables that you can tweak on your instance to make sure you're getting the most out of your machine's resources: http://www.packtpub.com/postgresql-90-high-performance/book

A bit of a lengthy read, but worth it.

sciurus · on May 11, 2014

Seconding this. Packt Publishing is really hit or miss, but PostgreSQL 9.0 High Performance is a hit.

jliechti1 · on May 11, 2014

No SQL Server?

rorhug · on May 11, 2014

I have it in as 'Microsoft'.

JohnTHaller · on May 11, 2014

That could also mean Access, unfortunately. Microsoft SQL Server is the name most people would be looking for.

rorhug · on May 11, 2014

Fixed :)

bredman · on May 15, 2014

For those curious here's a similar post from ~4 years ago:

https://news.ycombinator.com/item?id=1411937

canadiancreed · on May 11, 2014

The vast majority of places I've worked at used MySQL, which isn't surprising considering most have been LAMP shops. Only one used Postgres, and one MSSQL.

elwell · on May 12, 2014

Only 8 for Datomic; I thought HN was only Clojure users?

vaidik · on May 11, 2014

We are using RocksDB as well for something. Although it is based on LevelDB, it's pretty different now, so it could be added as a separate entry.

canadi · on May 11, 2014

Interested, what are you using RocksDB for?

ghosttie · on May 11, 2014

JADE http://www.jade.co.nz/jade/index.htm

khaledh · on May 12, 2014

Compare http://db-engines.com/en/ranking

HeyLaughingBoy · on May 11, 2014

No love for ZODB? (Zope Database) No, we don't use it, but I'm surprised to see it drop completely off the map.

warrenmar · on May 11, 2014

Redshift? or would that go under PostgreSQL?

nodesocket · on May 11, 2014

memcached as a database is a stretch.

buro9 · on May 11, 2014

MemcacheDB and memcached are both databases. You put data in, you get data out (if it exists).

In CAP theorem they are CP.

errorrrr · on May 12, 2014

I just realized that I only use DB2 on AS/400 right now. That's scary.

blogytalky · on May 11, 2014

BerkeleyDB?

oakaz · on May 11, 2014

LevelDB

rorhug · on May 11, 2014

Good idea, added.

securingsincity · on May 11, 2014

Lebron stack? http://lebron.technology/

onedognight · on May 11, 2014

Neo4j is listed twice.

rorhug · on May 11, 2014

Fixed, Thanks.

yamalight · on May 11, 2014

Virtuoso Open Source

Patrick_Devine · on May 11, 2014

Virtuoso is great for handling large-ish graphs and then being able to use SPARQL to walk the nodes. We used it at the last company I worked for and we were able to get fairly good performance out of it while walking hundreds of thousands of edges. I think it would have scaled even beyond that; however, the Python RDFlib bindings we were using were a bit slow.

exadeci · on May 12, 2014

None of your companies are old enough to use Oracle?

rMBP · on May 11, 2014

IBM Domino... :-(

wting · on May 11, 2014

I worked on Domino--it's a mail server that uses DB2.

rMBP · on May 11, 2014

Wasn't aware of it using DB2, not that I care much what it is as long as it works. As you probably know, having worked on it, it's an application platform too, which is mostly what we use it for. Sadly, none of my colleagues has the time to develop any apps for it and I only dabble around in high level languages, so everyone just uses the blank office template and creates documents with tables and stuff. But the search is great, and the save conflict management keeps some of the IT-related headaches away. I wish there was a self-hosted web app equivalent though, as the UI for editing is similar to MS Office, thus easy for Average Joe to grasp. The drag-drop storage of files inside documents is nifty.

BareNakedCoder · on May 12, 2014

Poor Domino. Domino is a NoSQL-ish distributed document database and application platform that traces its roots back to 1989. Mail is just one possible application that happens to come out-of-the-box. Domino was mismanaged by IBM after its purchase of Lotus. If IBM had only provided more killer apps out-of-the-box like project management, etc. that businesses need (instead of hoping 3rd parties would write 'em) it could have dominated (pun?). Instead, we have MS SharePoint (blah!). In fact, today Domino sucks ... but I do get misty eyed thinking about what it could have been.

lvca · on May 11, 2014

Where is OrientDB on this list? Please add it

spullara · on May 11, 2014

FoundationDB

sparrish · on May 11, 2014

No CouchDB?

rorhug · on May 11, 2014

CouchBase (Couch, Membase included)

sparrish · on May 11, 2014

CouchBase isn't the same as CouchDB.

hmsimha · on May 11, 2014

PostgreSQL and MySQL have more in common than Couchbase and CouchDB

nuetrino · on May 12, 2014

lr · on May 11, 2014

LDAP?

hfmuehleisen · on May 11, 2014

MonetDB

etaty · on May 11, 2014

/dev/null

tokanizar · on May 11, 2014

Which only supports one operation: DELETE. Doesn't it?

patrick3a · on May 11, 2014

INSERT works fine too, but you cannot SELECT tho :(

rgrieselhuber · on May 11, 2014

I bet you get wicked performance.