If you assume that will happen, then all the things you suggest are true: the schema is indeed the best documentation, and clients will have to pay close attention to versioning since you can really only have one version of a fixed schema at a time. You'll probably also want to move business logic into the database in the form of foreign key constraints, triggers and the like. Getting that right is really important to protect against a broken client corrupting data.
But that isn't the only strategy. You can instead have clients connect to an API, with the API implementation being the only thing that connects to the database. The API becomes the documentation. It can handle versioning. It handles business logic. In this world, the database schema is much less important, and you can safely use schemaless databases.
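To make the pattern concrete, here is a minimal sketch of "API owns the database", assuming a hypothetical `users` table and `create_user_*` functions (none of which come from the thread itself; they are purely illustrative):

```python
# Minimal sketch of the "API owns the database" pattern: clients never
# touch the schema directly; the API layer validates and versions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

def create_user_v1(email: str) -> int:
    # Business logic lives here, not in triggers or in client code.
    if "@" not in email:
        raise ValueError("invalid email")
    cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    conn.commit()
    return cur.lastrowid

def create_user_v2(payload: dict) -> int:
    # A later API version can reshape its input without any schema change,
    # which is the versioning point made above.
    return create_user_v1(payload["email"])

uid = create_user_v1("a@example.com")
```

Clients of v1 and v2 see different contracts, but only this one module ever knows the table layout.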
Both designs have their advantages, and multiple clients connecting directly to the database may well be a better choice in many circumstances, but it's not inevitable.
Implying that schemaless design is a GOOD thing
And as I mentioned before, the "shape of your data" is quite often defined in your application anyway.
Also, some of those systems have a _dynamic_ schema, e.g. Elasticsearch, where they learn the data type on the fly, but disallow incompatible changes on fields already known, which is a bit of an inbetween.
If you have a well-defined application data model and you use an ORM, then what does a database schema actually get you? You can centralise and better enforce data integrity within your application.
I realize that having worked with dynamic languages for many years, I rely on the database for keeping my data types in check. That way the language can be more relaxed about it. Nulls are still a major pain in the arse though.
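A small sketch of what "the database keeps my types in check" can look like from a dynamic language. SQLite is itself loosely typed, so a CHECK constraint stands in here for the stricter column typing a Postgres or MySQL schema would give you; the table and column names are invented for the example:

```python
# Sketch: letting the database keep a dynamic language honest.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE people (
        name TEXT NOT NULL,                            -- nulls rejected up front
        age  INTEGER CHECK (typeof(age) = 'integer')   -- type enforced by the db
    )
""")

conn.execute("INSERT INTO people VALUES (?, ?)", ("Ada", 36))  # accepted

try:
    conn.execute("INSERT INTO people VALUES (?, ?)", (None, 36))
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # NOT NULL constraint failed

try:
    conn.execute("INSERT INTO people VALUES (?, ?)", ("Bob", "old"))
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # CHECK constraint failed
```

The application code can stay relaxed about types because bad rows never make it past the constraint — which is exactly the division of labour the comment describes.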
Right up until someone writes an import script! That's WebScale(tm)
This is not accurate -- Bigtable is not eventually consistent. The scope of transactions supported by a system is a different set of considerations from the level of consistency it provides. Bigtable is consistent but only allows for transactionality at the row level.
Optimistic concurrency control is nothing new and Percolator layered transactions on top of Bigtable years back. Furthermore, TrueTime -- allowing for comparatively low-latency update across a globally distributed set of DCs -- is the real innovation in Spanner, not the use of optimistic concurrency control.
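Since the comment notes that optimistic concurrency control is nothing new, here is the classic shape of it, reduced to a toy in-memory store with a version number per row (this is the generic technique, not Percolator or Spanner code):

```python
# Toy optimistic concurrency control: read a value with its version,
# compute the update, and commit only if the version is unchanged;
# otherwise loop and retry against the fresher value.
store = {"balance": (100, 0)}   # key -> (value, version)

def occ_update(key, fn, retries=5):
    for _ in range(retries):
        value, version = store[key]
        new_value = fn(value)
        # "Compare-and-swap": commit only if nobody wrote in between.
        if store[key][1] == version:
            store[key] = (new_value, version + 1)
            return new_value
    raise RuntimeError("too much contention, giving up")

occ_update("balance", lambda v: v + 50)   # balance becomes 150, version 1
```

In a real system the compare-and-swap happens atomically inside the storage layer; the point is only that no locks are held while `fn` runs.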
Honestly, I am not sure what this article is trying to claim, except perhaps that per-node performance has been improved. AFAICT, most of this is due to the fact that RAM is cheaper than it was, SSDs have reached commoditization, and networks in the DC are faster than they used to be.
Certainly the majority of proponents I speak to want to use noSQL databases without any real reason, and where a relational database would be a better fit.
Basically, I feel like it is all promotional material, despite the fact that it is dealing with technical stuff that I research all the time, and well, as a hypocrite also try to promote my work. But I open source my work, while with them I feel like they are just trying to bait/give-an-excuse for me to click the buy button.
But I know, as with all software, it is going to come with the initiation... and in psychology, there is the sunk cost fallacy. Right? Where the more I invest, the more I want to believe and will make excuses to keep going with it. And so the incentives aren't actually aligned.
Compare this with, let's say, MongoDB - not only is it free AND ridiculously easy to start playing with, but they also have to wait long enough to bait you into their consulting. However, as we go up the "cult ladder of software" we don't have the sunk-cost fallacy of money we've invested (just time), so it is easier for us to bail. That sucks for 10gen.
So with Foundation, I'm sure that once you are in the circle, it is wonderful, because that is what you are paying for. But it makes me not even want to enter into the circle in the first place. So what am I asking of Foundation? To open source their product? Make their lives suckier and harder to pay their bills?
Unfortunately yes, not because I'm malicious, but because database technology is much more about academia than anything else. Not that business and money-making can't co-exist with that, but because it is a field where we fundamentally have to have open access and collaborate even if we are competing for funds/grants/money/customers.
Let's ask why again. Why? Because nobody, businesses or not, is going to get reliable progress until then. Yes, we'll get incremental innovations, and Foundation probably has those, but they will be lost and reset time and time again, until everyone in the field is willing to sacrifice their all. Unfortunately, it is a field that requires extreme expertise and costly talent to push forward, and even then, only slowly.
So I'm not going to even bother analyzing the technical claims of the post, because we have so much work to do first in even defining and making the terms more common, understandable, concrete and clear. Despite a century or so worth of work in the field.
Honestly, I feel like http://aphyr.com/ is making the most important contribution, despite the fact that he isn't even building solutions (compared to my team and Foundation's). Why? Because Aphyr has had great success in popularizing the discussion and providing needed clarity. And yes yes yes, I know that Foundation claims to have run their own Jepsen tests and so on. Good for them, I'm still trying to even get to that point. But... openness.
Like I probably shouldn't be as critical as I am being, a lot of my complaints in here can easily be refuted with various aspects of Foundation, like their free getting started, like their pay-what-you-need, etc. And obviously they are contributing to the discussion by making posts like this. But I feel like at least some of my comments and sentiments resonate, maybe at least in helping Foundation in knowing they are giving off a weird vibe/signal that they aren't even aware of.