This article is a good example of how myths are created and engineering ignorance is perpetuated.
CouchDB doesn't "scale"? If you're trying to "scale" with it, you don't know what you're doing in the first place. CouchDB federates. That's a wholly different thing. And in terms of federated databases, I challenge anyone to come up with one as good or better than CouchDB. (And if you do, it will be news to me, and I'll thank you profusely!)
If it's not obvious to you how to scale a federated database, then it's not CouchDB that can't scale, it's you. (Which is OK, everyone has to learn sometime. Just don't put forth your lack of knowledge as proof of a weakness in an open source product!)
Further, rather than just saying "We've got this great new invention-- a better technology, and we're moving to that!" the message seems to be "we are just wanting to re-invent the wheel, so to justify it, we have to make a negative claim about CouchDB."
Now, I expect fans of some particular databases to tell us, in the future, that "CouchDB doesn't scale".
Ironically, they're punting on CouchDB to use, among other possibilities, SQLite. To claim that "scaling" is the problem is... bad engineering form.
CouchDB is great if you want to federate, have databases across the planet talking to each other and keeping in sync (it's almost a turnkey CDN in a way), want to run a NoSQL DB on a mobile device, etc.
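To make "federate" concrete, here is a minimal sketch of what wiring two CouchDB databases together looks like. It only builds the JSON bodies you would POST to CouchDB's `/_replicate` endpoint (`source`, `target`, `continuous`); the hostnames and the `notes` database are made-up placeholders, and nothing is sent over the network.

```python
import json

# Hypothetical database URLs, used only for illustration.
EU = "http://eu.example.com:5984/notes"
US = "http://us.example.com:5984/notes"


def replication_body(source, target, continuous=True):
    """Return the JSON body for a POST to CouchDB's /_replicate endpoint."""
    return json.dumps(
        {"source": source, "target": target, "continuous": continuous},
        sort_keys=True,
    )


def federate(a, b):
    """Two one-way replications give a bidirectional sync between a and b."""
    return [replication_body(a, b), replication_body(b, a)]


if __name__ == "__main__":
    for body in federate(EU, US):
        # Each body would be POSTed to http://<host>:5984/_replicate
        print(body)
```

Replication in CouchDB is one-way, so keeping two sites in sync takes one request in each direction, which is why `federate` returns two bodies.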
MongoDB is great if you care about SQL-like queries and single node performance and its complex distribution mechanism works for you.
If you want "scale", your choices are Riak or CouchDB, for "scale" where homogeneous distributed servers are the best solution.
And of course there's Cassandra and graph databases, etc., which provide different solutions to scalability.
If you're serious about scalability, I strongly recommend people look at and choose Riak. I don't think anything out there touches it, at least for the type of data I need. Cassandra and what I consider the "more complicated" alternatives might fit your particular problem type well. And if you think that it's silly of me to recommend Riak, then this is probably the case for you. But in terms of general databases, Riak seems to be pulling away from the pack. If you're a fan of CouchDB, then BigCouch is a Dynamo/Riak-like version of it that I understand to be quite good. Plus, since it's based on CouchDB, if the CouchDB way of doing queries (which is distinctly different from Riak's) fits your way of working, then BigCouch deserves a look.
But please, don't ever say "CouchDB doesn't scale". If you do, really it's that you don't scale; CouchDB is fine.
In an earlier edit I named a database. That was a mistake: not only is it bad form, but I don't think that my characterization is appropriate at this time, as that database's fans are not as rabid as I implied. Apologies.
> If you're trying to "scale" with it, you don't know what you're doing in the first place
They said they worked with the company behind CouchDB and were not able to make it scale. So while you might accuse Canonical of not knowing what they're doing, I doubt you could say the same of the founders of CouchDB. Here is the official announcement:
> For the last three years we have worked with the company behind CouchDB to make it scale in the particular ways we need it to scale in our server environment. Our situation is rather unique, and we were unable to resolve some of the issues we came across. We were thus unable to make CouchDB scale up to the millions of users and databases we have in our datacentres, and furthermore we were unable to make it scale down to be a reasonable load on small client machines.
There John Lenton lays out in not so many words that "for the last three years we have worked with the company behind CouchDB to make it scale in the particular ways we need it to scale in our server environment. Our situation is rather unique, and we were unable to resolve some of the issues we came across..."
This sounds like a fair assessment to me. No "myths created", no "ignorance perpetuated".
I would also say that for every kind of technology, at some point it's fair to say that it does not scale or is not sufficient in other ways.
I think what nirvana is worried about is that this will get simplified to "CouchDB doesn't scale" for developers who don't know all the circumstances. It's the Slashdot blurb that landed on the front page of HN, not the sober list post.
It's like saying "$hot-software-du-jour doesn't scale" when the article actually said "we couldn't get it to scale in our server environment - which is a cluster of Arduinos powered by solar cells".
There's nothing wrong with having a workload/architecture/environment that doesn't suit a particular piece of software, and if you do, it's right to choose something else that does suit your needs. Saying (or implying) that a piece of software is bad because it doesn't suit _your_ strange requirements is about as credible as saying "Photoshop sucks 'cause it can't send email!"
I have read quite a few things on the different larger key-value stores, especially on how they scale. And from what I have seen, I really like Riak as well. However, we have been setting it up over the last few weeks, and so far it's less stable than I had hoped/expected: we have had nodes crash for no apparent reason. I hope we can resolve these crashes, as I really like the model, especially the horizontal scaling, but it must be stable to use...
There are worse things than crashing. Like soldiering on and corrupting data.
The most unstable clustered database I have ever come across was suffering from broken TCP drivers. Never assume a cause until you have actually tracked it down.
I agree with itaborai83 that individual node crashes shouldn't be as big a deal with the consistency model and redundancy offered by Riak. That is one reason you might go with Riak over something that offers stronger consistency, but is more picky about node crashes and recovery.
I've never had Riak crash on me. If you are having trouble, ask any of the Riak guys and they will bend over backwards to help you. If you have found a crasher bug, I bet they want it fixed more than you do.
But aren't nodes supposed to crash, albeit in a tolerable way? Even though I conceptually like Riak, I haven't tried it yet, because the problems that I deal with aren't worth the trouble of setting up a cluster and managing it.
I think that the cluster is supposed to be stable. The nodes, not so much.
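A rough way to see why the cluster can stay up while nodes come and go is to count replicas against a quorum. The sketch below is a simplified Dynamo-style model, not Riak's actual behavior: it assumes each key has `n_val` replicas and a request needs `quorum` of them to answer, and it ignores sloppy quorums and hinted handoff, which let a real Riak cluster keep accepting requests in more failure cases. The function names and the 3/2 values are just illustrative assumptions.

```python
def replicas_up(n_val, failed):
    """Number of replicas of a key still reachable after `failed` are down."""
    return max(n_val - failed, 0)


def quorum_met(n_val, quorum, failed):
    """True if enough replicas remain to satisfy a strict quorum."""
    return replicas_up(n_val, failed) >= quorum


def max_tolerated_failures(n_val, quorum):
    """How many replicas of a key can be down before requests start failing."""
    return n_val - quorum


if __name__ == "__main__":
    # Assumed values for illustration: 3 replicas, quorum of 2.
    for failed in range(4):
        print(failed, "down ->", quorum_met(3, 2, failed))
```

Under these assumptions, one crashed replica is tolerable and two are not, so a single node falling over is an operational annoyance but not an outage.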
> Further, rather than just saying "We've got this great new invention-- a better technology, and we're moving to that!" the message seems to be "we are just wanting to re-invent the wheel, so to justify it, we have to make a negative claim about CouchDB."
Ubuntu reinvents a lot of wheels (sometimes poorly). Just off the top of my head: Upstart, Unity, Launchpad...
I get the impression that the problem was less about scaling within one database than about scaling across many; Lenton practically says as much about a half dozen messages down-thread. Deploying literally millions of separate databases would be a total nightmare of resource contention, version skew, and general administrative burden. U1 was probably looking for ways to use some sort of multi-tenancy to serve the same number of users with fewer CouchDB instances, and that's the kind of scalability they apparently found lacking. Your point about scale vs. federation, while perhaps accurate and valuable, doesn't seem to address the actual reason for this change.
That is exactly how I understood the problem. The term "doesn't scale" is far too fuzzy to describe the problem.
I also see no mechanism in CouchDB to solve this. If such a problem exists, I expect a company to react with architectural improvements. If that doesn't happen, the software does not fit the use case and has to be replaced. This is really bad both for the vendor's reputation and for the user, who loses a large part of their investment.