And for those wondering, this is why Oracle wants billions of dollars from Google for "Java Copyright Infringement" because the only growth market for Oracle right now is their hosted database service, and whoops Google has a better one now.
It will be interesting if Amazon and Microsoft choose to compete with Google on this service. If we get to the point where you have databases, compute, storage, and connectivity services from those three at equal scale, well that would be a lot of choice for the developers!
There are also plenty of choices evolving for developers who aren't looking for hosted solutions (which can sometimes be a showstopper for enterprise on-prem deployments). There's a growing ecosystem of distributed open-source databases to look out for too.
Take Citus, for instance – a Postgres-compatible distributed store which automatically parallelizes normal SQL queries across machines. It's as easy to set up as adding an extension, and people are doing some staggering things in prod with it.
Different audience from BigQuery and Spanner, but no less exciting.
Disclaimer: no professional association, but love their product and the team.
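To give a feel for what "parallelizes normal SQL queries across machines" means, here is a minimal scatter-gather sketch in Python. It is not how Citus is implemented; `shard_rows`, `partial_sum` and `distributed_sum` are hypothetical names, and threads stand in for worker machines. It mimics a `SELECT customer_id, SUM(amount) ... GROUP BY customer_id` split into per-shard partial sums that are merged at the end.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def shard_rows(rows, num_shards):
    """Place each row on a shard chosen by hashing its distribution column."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[hash(row["customer_id"]) % num_shards].append(row)
    return shards


def partial_sum(shard):
    """Work done independently on one shard: a local GROUP BY / SUM."""
    totals = defaultdict(int)
    for row in shard:
        totals[row["customer_id"]] += row["amount"]
    return dict(totals)


def distributed_sum(rows, num_shards=4):
    """Scatter the aggregate to every shard, then merge the partial results."""
    shards = shard_rows(rows, num_shards)
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        partials = list(pool.map(partial_sum, shards))
    merged = defaultdict(int)
    for partial in partials:
        for customer_id, amount in partial.items():
            merged[customer_id] += amount
    return dict(merged)


orders = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 2, "amount": 5},
    {"customer_id": 1, "amount": 20},
]
totals = distributed_sum(orders)
```

Because rows for one customer always hash to the same shard, each partial result is already complete for that customer; the merge step only matters for aggregates that span shards.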
If you are looking for something that is more Postgres flavored (meaning we're just an extension to it so you get all the good stuff of Postgres such as JSONB, PostGIS, etc.) then we hope we'd be a good fit. And we run a managed service on top of AWS as well (https://www.citusdata.com/product/cloud) built by the team that built Heroku Postgres. If curious on pricing you can find it at https://www.citusdata.com/pricing/
I'm not trying to compare on a per-mb level, but it would be nice for smaller scale workloads.
For example, being a MSSQL performance tuning expert requires years of experience and probably pays very well, but just the other day I read an anecdotal story where someone switched a large BI database to use columnar indexes, allowing them to replace very complex (extreme manual tuning to achieve acceptable performance) queries with just standard SQL with comparable performance.
How long until the scale, pricing, and now transparent & full(?) sql compliance offered by these cloud platforms starts to make traditional RDBMS platforms a niche platform?
EDIT: Also, most DB users don't need global-scale databases.
Four years ago, I determined that while development work might seem to be near the top of the food chain, there will be some point where my work will be replaced by AIs.
This is not so different from how word processors replaced the specialist job of typesetters. Word processors make "good enough" typesetting. You can still find typesetters practicing their craft; the rest of us use word processors and don't even think about it.
At the time, I was learning to put the Buddhist ideals of emptiness and impermanence to practice, and to become more emotionally aware: the _main_ reason I had thought I would never be replaced by an AI writing software has more to do with wishful thinking and attachment than any clear-sighted look at this.
I also made a decision to work on the technologies to accelerate this. Rather than becoming intoxicated by the worry, anxiety, and existential anguish, I decided to try to face it. Fears are inherently irrational, but just because they are irrational does not mean it is not what you are experiencing. Fears are not so easily banished by labeling them as irrational. Denial is a form of willful ignorance.
Now, having said all that, whether our tech base will come to that, who can say?
Since then, I have been tracking things like:
Viv - a chat assistant that can write its own queries
DeepMind's demonstration of creating a Turing-complete machine with deep learning using a memory module.
I watched a tech enthusiast write a chat bot. He does not write software professionally. Talking with him over the months as he tinkered with it in his spare time, I realized that in the future, you won't have as many software engineers writing code; you would learn how to _train_ AIs when they become sufficiently accessible to the masses. Skills in coaching, negotiation, and management become more important than some of the fundamental skills supporting software engineering. And like typesetting, I can see development work being pushed down the eco-ladder.
It's not surprising to me to see that Wired article about coding becoming blue-collar work. And even that will eventually be pushed down even further.
It's not surprising to me about Google's site-reliability engineering book, branding, and approach. I have done system admin work in the past, and I can already see traditional, manual sysadmin work being replaced.
It's easy to get nihilistic about this, but that isn't my point here either. I know the human potential is incredible, but I think we have to let go of our self-serving narratives first.
A sad choice though. The centralization of computation is likely not a good thing in the long run.
I agree. It only makes sense if you need special data for statistics, AI training, etc.
In all other cases the classic way of programming on pc and notebook is smarter. If you do everything in the cloud, what if you lose Internet connection? I had that experience several times over the last years.
So, the computational "Gini index" is increasing, but no one is being thrown into computational poverty.
Now, you could play with that analogy further and see some issues as well, but I don't think the issue here is centralized failure; all these data centers/"clouds" are at least good. The Cloud is about businesses focusing on core business and not supporting functions.
[Disclosure, I work on the Google Cloud team, I'm biased]
This is not true. Oracle is far more than a database company nowadays in the same way that Microsoft is more than Windows. Oracle has been acquiring high-growth startups at a significant rate.
Which has their 'cloud services' doubling their contribution to revenue year over year and licenses losing 50% of their contribution to revenue year over year.
Their 'cloud' collateral is pretty opaque though.
We've been seeing this demand at Fauna. FaunaDB offers distributed consistency, based on Raft and the Calvin protocol instead of depending on specific networking and clock hardware. We've seen a big part of our appeal is the ability to run FaunaDB across multiple cloud services.
This is a globally-available, nearly-CAP-beating datastore that powers one of the biggest websites on the internet.
It's not quite apples and oranges, but this is definitely a different problem they are solving.
[1] http://www.computerworld.com/article/2953299/cloud-computing...
Do you have a sense of what that limit is?
There's a pretty big price difference between Spanner and Aurora at the entry level so it's useful to explore this.
I don't have a good idea what the upper limit is for an Aurora database setup.
For example, see this benchmark:
http://2ndwatch.com/wp-content/uploads/2016/09/Graph-3.jpg
from this article:
http://2ndwatch.com/blog/benchmarking-amazon-aurora/
Thoughts?
To be fair, Spanner's cross-region service is coming "later 2017".
EDIT: Found quite a bit of my answers in your linked article:
> Cloud Spanner uses a SQL dialect which matches the ANSI SQL:2011 standard with some extensions for Spanner-specific features. This is a SQL standard simpler than that used in non-distributed databases such as vanilla MySQL, but still supports the relational model (e.g. JOINs). It includes data-definition language statements like CREATE TABLE. Spanner supports 7 data types: bool, int64, float64, string, bytes, date, timestamp[20].
> Cloud Spanner doesn't, however, support data manipulation language (DML) statements. DML includes SQL queries like INSERT and UPDATE. Instead, Spanner's interface definition includes RPCs for mutating rows given their primary key[21]. This is a bit annoying. You would expect a fully-featured SQL database to include DML statements. Even if you don't use DML in your application you'll almost certainly want them for one-off queries you run in a query console.
> Though Cloud Spanner supports a smaller set of SQL than many other relational databases, its dialect is well-documented and fits our use case well. Our requirements for a MySQL replacement are that it supports secondary indices and common SQL aggregations, such as the GROUP BY clause. We've eliminated most of the joins we do, so we haven't tested Cloud Spanner's join performance.
This seems like it'd prevent any kind of easy switch over to Spanner.
Details -
https://cloud.google.com/spanner/docs/query-syntax#join-type...
Disclaimer: I work on Cloud Spanner
And yeah it makes it sound like writing an ORM adapter will be much more difficult.
>> Cloud Spanner doesn't, however, support data manipulation language (DML) statements. DML includes SQL queries like INSERT and UPDATE. Instead, Spanner's interface definition includes RPCs for mutating rows given their primary key[21].
Does this mean I need to rewrite my application?
My application uses an ORM and it typically converts my logic to SQL statements and fires them off to Postgres. Would I need to change it such that it doesn't issue INSERT / UPDATE statements?
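To make the quoted distinction concrete, here is a rough Python sketch of what "mutating rows given their primary key" looks like compared with sending INSERT or UPDATE text. This is not the Cloud Spanner client API; `Mutation`, `ToyTable` and `apply` are hypothetical names, and the table is just an in-memory dict. An ORM adapter would have to emit something shaped like this instead of DML strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Mutation:
    """A write expressed as data rather than SQL text (hypothetical shape)."""
    op: str                    # "insert", "update" or "delete"
    key: Tuple                 # primary key of the affected row
    values: Dict[str, object]  # column values; empty for a delete


class ToyTable:
    """In-memory stand-in for a table addressed by primary key."""

    def __init__(self) -> None:
        self.rows: Dict[Tuple, Dict[str, object]] = {}

    def apply(self, mutations: List[Mutation]) -> None:
        """Apply a batch atomically: stage every change, then swap it in."""
        staged = {key: dict(row) for key, row in self.rows.items()}
        for m in mutations:
            if m.op == "insert":
                if m.key in staged:
                    raise KeyError(f"row {m.key} already exists")
                staged[m.key] = dict(m.values)
            elif m.op == "update":
                if m.key not in staged:
                    raise KeyError(f"row {m.key} not found")
                staged[m.key].update(m.values)
            elif m.op == "delete":
                staged.pop(m.key, None)
            else:
                raise ValueError(f"unknown op {m.op!r}")
        self.rows = staged  # reached only if every mutation succeeded


users = ToyTable()
# Roughly: INSERT INTO users (id, name) VALUES (1, 'ada')
users.apply([Mutation("insert", (1,), {"name": "ada"})])
# Roughly: UPDATE users SET name = 'Ada L.' WHERE id = 1
users.apply([Mutation("update", (1,), {"name": "Ada L."})])
```

Note what has no direct equivalent here: an UPDATE whose WHERE clause filters on a non-key column. With key-addressed mutations you would first query for the matching keys and then issue one mutation per key.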
The join performance is by far the most interesting part of this to me. A more traditional NoSQL solution sounds like it would have worked just as well for you, sans all the atomic clock fanciness. Joining across geographically disparate data is a real trick, and it seems like there would be some physical performance limits?
No, why? Query can be executed in parallel.
BTW, isn't 20k/sec very low performance for a 30-node installation? Cassandra can handle 50k+ (both writes and reads) on a single node. When in most queries you are trying to collect data from many nodes it will scale almost linearly.
And yes Cassandra will scale linearly-ish as long as you're in the same datacenter. Try running a geo-distributed 30-node Cassandra ring and it's a whole different story at that level of consistency and availability.
How:
1) Hardware - Gobs and Gobs of Hardware and SRE experience
"Spanner is not running over the public Internet — in fact, every Spanner packet flows only over Google-controlled routers and links (excluding any edge links to remote clients). Furthermore, each data center typically has at least three independent fibers connecting it to the private global network, thus ensuring path diversity for every pair of data centers. Similarly, there is redundancy of equipment and paths within a datacenter. Thus normally catastrophic events, such as cut fiber lines, do not lead to partitions or to outages."
2) Ninja 2PC
"Spanner uses two-phase commit (2PC) and strict two-phase locking to ensure isolation and strong consistency. 2PC has been called the “anti-availability” protocol [Hel16] because all members must be up for it to work. Spanner mitigates this by having each member be a Paxos group, thus ensuring each 2PC “member” is highly available even if some of its Paxos participants are down."
Anyone know how exactly this is defined for them? (Time? Queries? Results?)
Google prefers building advanced systems that let you do things "the old way" but making them horizontally scalable.
Amazon prefers to acknowledge that network partitions exist and try to get you to do things "the new way" that deals with that failure case in the software instead of trying to hide it.
I'm not saying either system is better than the other, but doing it Google's way is certainly easier for Enterprises that want to make the move, and why Amazon is starting to break with tradition and release products that let you do things "the old way" while hiding the details in an abstraction.
I've always said that Google is technically better than AWS, but no one will ever know because they don't have a strong sales team to go and show people.
This release only solidifies that point.
Spanner doesn't exactly hide the details, but it lets you make transactions that span multiple shards. You still eat the cost of the transaction, you're just free from having to implement it at the application level, which is a more difficult and error-prone way of doing things. The bottom line is that if you need consistency, it needs to be implemented somewhere in your stack. If you don't need consistency (analytics workloads come to mind) then you have more flexibility with your database.
Disclosure: Google employee, reconstructing what I know from published information.
However, by 'abstracting' this away, you're not being forced to think about failure domains. If there is ever a massive country-wide connectivity break to the wider Internet (feasible for lots of people inside censored countries), you'll be pretty pissed when you can't use the DB services for your servers in the Google-local datacenter that you still have connectivity to because it can't get quorum.
Google tried that a decade ago and found it lacking, this is why Spanner exists in the first place.
You're both right.
I don't think it's easy to port existing applications to use it and in the end you will still need to accommodate shortcomings in your application.
Either way, they are trying to abstract away having to think about eventual consistency with this offering.
As cutesy of a sentiment as it is, it's also full of misconceptions. The pens were invented by an American corporation that wanted better pens to sell in general (a smoother flow in a pen, regardless of gravity/orientation, is a better pen), and they saw a good opportunity to market the pen to NASA for use in space. Both NASA and the Russians used pencils in space, but the problem with pencils is that the flakes can pollute an environment pretty quickly in low gravity and the pens turned out to be a much better solution. (So far as I've heard, every space agency these days buys similar pens.)
[1] https://www.cockroachlabs.com/docs/frequently-asked-question...
[2] https://www.cockroachlabs.com/docs/frequently-asked-question...
Google launching Spanner is generally a positive thing for our industry and our product. It's more proof that what we're aiming for is possible and that there's demand for it. We expect that in five years, all tech companies will be deploying technology like ours.
One of the big differences is that Spanner only uses SQL for read-only operations, with a custom API for writes. We use standard SQL for both reads and writes, which means we also work with major ORMs like GORM, SQLAlchemy, and Hibernate (docs should be live today or tomorrow). Spanner's custom write API will make it difficult to work with existing frameworks, or to convert an existing application to Spanner.
Cloud Spanner only works on Google Cloud and is a black-box managed service. CockroachDB is open source and can be run on-prem or in any cloud on commodity hardware. (We don't offer CockroachDB as a service yet, but may in the future)
At this point, both products are still in beta and are still missing features like back-up and restore (according to the Quizlet blog post). We plan to launch CockroachDB 1.0 with back-up / restore enabled.
* For anyone wanting to know more about how we make CockroachDB work without TrueTime, see our blog post: https://www.cockroachlabs.com/blog/living-without-atomic-clo...
No startup will be able to replicate that anytime soon, a lot of time (and money) has been put into it by a lot of people over a long time.
Could any government? Has any government?
My impression is that, infrastructure wise, Google is genuinely in a class of size one.
> A simple statement of the contrast between Spanner and CockroachDB would be: Spanner always waits on writes for a short interval, whereas CockroachDB sometimes waits on reads for a longer interval. How long is that interval? Well it depends on how clocks on CockroachDB nodes are being synchronized. Using NTP, it’s likely to be up to 250ms. Not great, but the kind of transaction that would restart for the full interval would have to read constantly updated values across many nodes. In practice, these kinds of use cases exist but are the exception.
CockroachDB is waiting for time keeping hardware to improve.
for anyone interested
Companies with more data than can fit in a single-instance RDBMS system (like >3TB of hot data, more throughput than a single node can handle) but still seeking transactional consistency are a clear use case. Single-person startups could definitely benefit, but it's a less-likely scenario that they would require the level of coverage Spanner provides.
I'd certainly love to see us get to a world where we can split up a single spanner "install" in an isolated, multitenant manner, but even for a small company, $8k/year is admittedly a small fraction of one engineer. At a company with several, you can share your single Spanner instance just like you would any other database.
Disclosure: I work on Google Cloud (but not Spanner).
My intuition, which I hope is wrong, suggests this is a small market.
Having your database problem solved, however, is one less thing to worry about if you're bigger.
It's somewhat ironic that Brewer, the original author of the CAP theorem, is making this sort of marketing-led bending of the CAP theorem terminology. I think what he really should be saying is something in more nuanced language like this: https://martin.kleppmann.com/2015/05/11/please-stop-calling-...
But perhaps Google's marketing department needed something in the more popular "CP or AP?" terminology. I don't see what would be wrong with "CP with extremely high availability" though.
It's certainly wacky to be claiming that a system is "CA", since as the post admits it's technically false; to me this makes it clear that CP vs. AP (vs. CA now?) does not convey enough information. I'd prefer "a linearizably-consistent data store, with ACID semantics, with a 99.999% uptime SLA". Not as snappy as "CA" (I will never have a career in marketing I suppose), but it makes the technical claims more clear.
Global Spanner looks like a different beast, though. It looks like Google has configured a database for master-master(-master?) replication, across regions and even continents. They seem to be pulling it off by running only their own fiber, each master being a Paxos cluster itself, GPS, atomic clocks and a lot of other whiz-bangery.
> Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA.
The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system.
However, no system provides 100% availability, so the pragmatic question is whether or not Spanner delivers availability that is so high that most users don't worry about its outages. For example, given there are many sources of outages for an application, if Spanner is an insignificant contributor to its downtime, then users are correct to not worry about it.
Basically, the underlying system is CP, but A is so high (because of the custom fiber, paxos etc) that they're rounding it off to 100% and calling it CAP.
The kool-aid isn't too bad, though if they can measurably guarantee A > 99.999999%, I'm happy to round off to 100% and call it CAP.
(I work for Google Cloud)
CAP is about how the system deals with partitions not whether it has partitions or not.
I don't believe that's true, but I could be mistaken?
(Work at Google, not on Cloud Spanner)
With Aurora the basic instance is $48/month and they recommend at least two in separate zones for availability, so it's about $96/month minimum. Storage is $.10/GB and IO is $.20 per million requests. Data transfer starts at $.09/GB and the first GB is free.[1]
Spanner is a minimum of $650/mo (about 6.8X the Aurora minimum), storage is $.30/GB (3X), and data transfer starts at $.12/GB (1.3X).
Of course with Aurora you have to pick your instance size and bigger faster instances will cost more. Also there's the matter of multi-region replication, although it appears that aspect of Spanner is not priced out yet. So maybe as you scale the gap narrows, but it's interesting to price out the entry point for startups.
[1] https://aws.amazon.com/rds/aurora/
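The entry-level comparison above is easy to recheck. The snippet below only uses the figures quoted in the comment (taken as given, not verified against current price lists); the dictionary keys are made-up labels.

```python
# Entry-level figures as quoted in the comment above.
aurora = {
    "instance_month": 48.0 * 2,  # two basic instances in separate zones
    "storage_gb": 0.10,
    "transfer_gb": 0.09,
}
spanner = {
    "instance_month": 650.0,
    "storage_gb": 0.30,
    "transfer_gb": 0.12,
}

# Ratio of Spanner to Aurora for each line item.
ratios = {item: spanner[item] / aurora[item] for item in aurora}

for item, ratio in ratios.items():
    print(f"{item}: {ratio:.1f}x")
```

The instance ratio works out to a little under 7, storage to 3, and transfer to about 1.3, so the gap is mostly in the fixed monthly minimum rather than in the per-GB rates.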
When Google announced Spanner back in 2012, I'm sure Amazon and Microsoft started teams to reproduce their own versions.
Spanner is not just software. The private network reduces partitions. GPS and atomic clocks for every machine help synchronize time globally. There won't be a Hadoop equivalent for Spanner, unless it includes the hardware spec.
You're right that there's literally nothing else out there that has tight synchronization using atomic clocks, though.
I just noticed Google says the cross-region feature is coming later in 2017. Amazon might be planning to announce a similar change for Aurora in the coming months.
The idea is that the A-or-C choice in CAP only applies during network partitions, so it's not sufficient to describe a distributed system as either CP or AP. When the network is fine, the choice is between low latency and consistency.
In the case of Spanner, it chooses consistency over availability during network partitions, and consistency over low latency in the absence of partitions.
1: http://cs-www.cs.yale.edu/homes/dna/papers/abadi-pacelc.pdf
https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp...
https://cloud.google.com/spanner/docs/whitepapers/SpannerAnd...
This is a bold claim. What do they know about the CAP theorem that I don't?
Separately, (emphasis mine):
> If you have a MySQL or PostgreSQL system that's bursting at the seams, or are struggling with hand-rolled transactions on top of an eventually-consistent database, Cloud Spanner could be the solution you're looking for. Visit the Cloud Spanner page to learn more and get started building applications on our next-generation database service.
From the rest of the article it seems like the wire protocol for accessing it is MySQL. I wonder if they mean to add a PostgreSQL compatibility layer at some point.
It's right there in the article:
"Remarkably, Cloud Spanner achieves this combination of features without violating the CAP Theorem. To understand how, read this post by the author of the CAP Theorem and Google Vice President of Infrastructure, Eric Brewer."
The post they are referring to: https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp...
Furthermore, there are already more than a few attempts underway to build scalable relational databases ("NewSQL") outside Google.[4]
1: https://research.google.com/pubs/pub36971.html
2: https://research.google.com/archive/spanner.html
3: http://datascienceassn.org/sites/default/files/F1%20A%20Dist...
4: http://db.cs.cmu.edu/papers/2016/pavlo-newsql-sigmodrec2016....
1 - https://www.cockroachlabs.com
2 - https://github.com/pingcap/tidb
I don't know your expertise on the subject, but they do have a post on this topic.
Some highlights:
"Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA."
"The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system."
It looks like the wire protocol is Protocol Buffers and client libraries will likely use GRPC: https://cloud.google.com/spanner/docs/reference/rpc/google.s...
Their statement, for what it's worth:
https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp...
I doubt it. Spanner is not even offering full MySQL capabilities, so it's unlikely to support any advanced PG SQL.
https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp...
> The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system.
> However, no system provides 100% availability, so the pragmatic question is whether or not Spanner delivers availability that is so high that most users don't worry about its outages.
Seems like they might know a lot :)
Being Google they are probably prideful enough to think their servers could never have an outage. Which yes, I agree with you, that is a very scary claim.
This sounds too good to be true. But it's Google, so maybe not. Time to start reading whitepapers...
An example is the rows you get back from a query like "select * from T where x=a" can't be part of a RW transaction. I believe because they don't have the time-stamp associated with them. So, you have to re-read those rows via primary key inside a RW transaction to update them. This can be a surprise if you are coming from a traditional RDBMS background. If you are thinking about porting your app from MySQL/PostgreSQL to Spanner, it will be more than just updating query syntax.
Disclaimer: I used F1 (built on top of Spanner, https://research.google.com/pubs/pub41344.html) a few years ago.
If he was alive, he could say these computers are Google, Apple, Microsoft, Amazon and Facebook.
https://arstechnica.com/civis/viewtopic.php?f=21&t=1109206
"You are charged each hour for the maximum number of nodes that exist during that hour."
We've been educated by Google to consider per-minute, per-instance/node billing normal - and presumably all the arguments about why this is the right, pro-customer way to price GCE apply equally to Cloud Spanner.
However, with a database it is rare to scale up and down rapidly. Rather, you expect change on the order of days. Imagine you go from 10 instances to 15 instances over a week. Per-minute billing only saves a possible 5 instance-hours over the week compared to hourly billing, which is less than a 1% saving.
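The "less than 1%" estimate can be checked with a small simulation. This is a sketch under stated assumptions: one node is added on each of five days, each time in the last minute of an hour (the worst case for hourly billing), and the function names are invented for illustration.

```python
HOURS_PER_WEEK = 7 * 24
MINUTES_PER_WEEK = HOURS_PER_WEEK * 60


def per_minute_node_hours(start_nodes, scale_up_minutes):
    """Bill each node only for the minutes it actually exists."""
    minutes = start_nodes * MINUTES_PER_WEEK
    for added_at in scale_up_minutes:
        minutes += MINUTES_PER_WEEK - added_at
    return minutes / 60


def hourly_node_hours(start_nodes, scale_up_minutes):
    """Bill every hour for the maximum node count seen during that hour."""
    total = 0
    for hour in range(HOURS_PER_WEEK):
        end_of_hour = (hour + 1) * 60
        added_so_far = sum(1 for t in scale_up_minutes if t < end_of_hour)
        total += start_nodes + added_so_far
    return total


# Assumed schedule: grow from 10 to 15 nodes, one node per day on days 1..5,
# each added at minute 59 of an hour.
scale_ups = [day * 24 * 60 + 59 for day in range(1, 6)]

hourly = hourly_node_hours(10, scale_ups)
per_minute = per_minute_node_hours(10, scale_ups)
savings = hourly - per_minute
savings_pct = 100 * savings / hourly
```

With this schedule the difference stays just under 5 node-hours on a bill of more than 2,000 node-hours, which supports the point that billing granularity barely matters for a slowly scaling database.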
How is this possible across data centres? Does it send data everywhere at once?
Seems too good to be true of course but if it works and scales it might be worthwhile just not having to worry about your database scaling? Still I don't believe it ;-)
EDIT: further info...
> Spanner mitigates this by having each member be a Paxos group, thus ensuring each 2PC “member” is highly available even if some of its Paxos participants are down. Data is divided into groups that form the basic unit of placement and replication.
So it's SQL with Paxos that presumably never gets confused but during a partition will presumably not be consistent.
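The quoted idea, that each 2PC "member" is itself a replicated group, can be illustrated with a toy model. This is only a sketch: there is no real Paxos here, `ReplicaGroup` and `two_phase_commit` are hypothetical names, and "quorum" is simply a majority of replicas being reachable.

```python
from typing import List


class ReplicaGroup:
    """A 2PC participant backed by several replicas (toy model, no real Paxos)."""

    def __init__(self, name: str, replicas_up: List[bool]) -> None:
        self.name = name
        self.replicas_up = replicas_up

    def has_quorum(self) -> bool:
        # A majority of replicas must be reachable for the group to act.
        return sum(self.replicas_up) > len(self.replicas_up) // 2

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the group can still make progress.
        return self.has_quorum()


def two_phase_commit(participants: List[ReplicaGroup]) -> str:
    """Commit only if every participant votes yes in the prepare phase."""
    if all(p.prepare() for p in participants):
        return "commit"
    return "abort"


# Each group has lost one replica out of three but still has a majority.
groups = [
    ReplicaGroup("a", [True, True, False]),
    ReplicaGroup("b", [True, False, True]),
]
# Plain 2PC: every participant is a single machine, and one of them is down.
singles = [ReplicaGroup("a", [True]), ReplicaGroup("b", [False])]

outcome_groups = two_phase_commit(groups)
outcome_singles = two_phase_commit(singles)
```

The same protocol aborts when a lone machine is down but still commits when each group has merely lost a minority of its replicas. If a partition leaves a group without a majority, the model aborts, which is the "chooses C and forfeits A" behaviour discussed elsewhere in the thread.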
"CA except when there are partitions" is CP. It's not "effectively CA".
It's one thing to do that for a key-value store. Entirely another to support joins on a globally distributed database. This ain't just one availability zone. Spanner is amazing.
It took them a few years to make it a service, but when they announced its use internally a few years ago, it seemed like the nail in the coffin for in-house database hosting.
There's nothing wrong with saying it's CP, but since we control everything there's extremely rare P. Then he can show availability numbers (which he kinda does).
Saying it's "effectively CA" defeats the point of the CAP theorem, which says you have to make tradeoffs. See: https://codahale.com/you-cant-sacrifice-partition-tolerance/
If data from other places has synchronized to your zone, you may be able to do this globally-consistent read while only touching your local datacenter (because TrueTime guarantees that no other records anywhere in the system will be created at the time you are querying).
Note: I work at Google, but I don't know more about Spanner than the Spanner paper.
If you use 2 nodes/hour,
Cost = (2 * 0.9) * 24 * 31 ≈ $1,340/month, not accounting for storage and network charges.
PostgreSQL? How does this work for people migrating from traditional SQL databases? Typically people use an ORM. How would this fit in with, say, Rails or SQLAlchemy?
Maybe I'm misunderstanding how the pricing works here. Any clarification would be highly welcomed :)
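For what it's worth, the compute part of that estimate can be rechecked in a few lines. The $0.90 per node-hour rate is an assumption here, inferred from the roughly $650/month single-node minimum quoted elsewhere in the thread; storage and network are left out.

```python
NODE_HOUR_USD = 0.90  # assumption: inferred from the ~$650/month minimum
NODES = 2
HOURS_IN_MONTH = 24 * 31  # a 31-day month

monthly_compute_usd = NODES * NODE_HOUR_USD * HOURS_IN_MONTH
print(f"${monthly_compute_usd:,.2f} per month before storage and network")
```

That comes out near $1,340 for two nodes, so the order of magnitude in the comment is right even if the exact figure was rounded.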
Looks like Google forgot to mention one central requirement: latency.
This is a hosted version of Spanner and F1. Since both systems are published, we know a lot about their trade-offs:
Spanner (see OSDI'12 and TODS'13 papers) evolved from the observation that Megastore guarantees - though useful - come at a performance penalty that is prohibitive for some applications. Spanner is a multi-version database system that, unlike Megastore (the system behind the Google Cloud Datastore), provides general-purpose transactions. The authors argue: "We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions." Spanner automatically groups data into partitions (tablets) that are synchronously replicated across sites via Paxos and stored in Colossus, the successor of the Google File System (GFS). Transactions in Spanner are based on two-phase locking (2PL) and two-phase commits (2PC) executed over the leaders for each partition involved in the transaction. In order for transactions to be serialized according to their global commit times, Spanner introduces TrueTime, an API for high precision timestamps with uncertainty bounds based on atomic clocks and GPS. Each transaction is assigned a commit timestamp from TrueTime and using the uncertainty bounds, the leader can wait until the transaction is guaranteed to be visible at all sites before releasing locks. This also enables efficient read-only transactions that can read a consistent snapshot for a certain timestamp across all data centers without any locking.
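The "wait out the uncertainty before releasing locks" step can be sketched as follows. This is a toy simulation, not Google's implementation: `ToyTrueTime` fakes the uncertainty bound with a fixed `epsilon` around the local monotonic clock, and all names are illustrative.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class ToyTrueTime:
    """Simulated clock that reports an interval instead of a single instant."""

    def __init__(self, epsilon: float) -> None:
        self.epsilon = epsilon  # assumed uncertainty bound, in seconds

    def now(self) -> TTInterval:
        t = time.monotonic()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, timestamp: float) -> bool:
        """True once the timestamp has definitely passed on every clock."""
        return self.now().earliest > timestamp


def commit(tt: ToyTrueTime) -> float:
    """Pick a commit timestamp, then wait until it is safely in the past."""
    commit_ts = tt.now().latest
    while not tt.after(commit_ts):  # the "commit wait"
        time.sleep(tt.epsilon / 10)
    return commit_ts  # only now would locks be released


tt = ToyTrueTime(epsilon=0.005)
started = time.monotonic()
commit_ts = commit(tt)
waited = time.monotonic() - started
```

In this model every commit stalls for roughly twice the uncertainty bound, which is why the tightness of the clock synchronization translates directly into write latency.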
F1 (see VLDB'13 paper) builds on Spanner to support SQL-based access for Google's advertising business. To this end, F1 introduces a hierarchical schema based on Protobuf, a rich data encoding format similar to Avro and Thrift. To support both OLTP and OLAP queries, it uses Spanner's abstractions to provide consistent indexing. A lazy protocol for schema changes allows non-blocking schema evolution. Besides pessimistic Spanner transactions, F1 supports optimistic transactions. Each row bears a version timestamp that is used at commit time to perform a short-lived pessimistic transaction to validate a transaction's read set. Optimistic transactions in F1 suffer from the abort rate problem of optimistic concurrency control, as the read phase is latency-bound and the commit requires slow, distributed Spanner transactions, increasing the vulnerability window for potential conflicts.
While Spanner and F1 are highly influential system designs, they do come at a cost Google does not tell in its marketing: high latency. Consistent geo-replication is expensive even for single operations. Both optimistic and pessimistic transactions increase these latencies even further.
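The read-set validation described above can be sketched with a single-process toy. It is a generic illustration of optimistic concurrency control, not F1's code: `VersionedStore` and `OptimisticTxn` are hypothetical, and a simple counter stands in for the per-row version timestamp.

```python
from typing import Dict, Tuple


class VersionedStore:
    """Rows carry a version stamp; a counter stands in for a timestamp."""

    def __init__(self) -> None:
        self.rows: Dict[str, Tuple[object, int]] = {}
        self.clock = 0

    def write(self, key: str, value: object) -> None:
        self.clock += 1
        self.rows[key] = (value, self.clock)

    def read(self, key: str) -> Tuple[object, int]:
        return self.rows[key]  # (value, version)


class OptimisticTxn:
    """Reads without locks, remembers versions, validates them at commit."""

    def __init__(self, store: VersionedStore) -> None:
        self.store = store
        self.read_versions: Dict[str, int] = {}
        self.writes: Dict[str, object] = {}

    def read(self, key: str) -> object:
        value, version = self.store.read(key)
        self.read_versions[key] = version
        return value

    def write(self, key: str, value: object) -> None:
        self.writes[key] = value  # buffered until commit

    def commit(self) -> bool:
        # Short validation step: abort if any row we read has changed since.
        for key, seen_version in self.read_versions.items():
            if self.store.read(key)[1] != seen_version:
                return False
        for key, value in self.writes.items():
            self.store.write(key, value)
        return True


store = VersionedStore()
store.write("balance", 100)
t1, t2 = OptimisticTxn(store), OptimisticTxn(store)
t1.write("balance", t1.read("balance") - 30)
t2.write("balance", t2.read("balance") - 50)
first = t1.commit()   # validates, then writes
second = t2.commit()  # the row changed after t2 read it
```

The second transaction aborts because its read is stale. The longer the gap between reading and committing, the more often this happens, which is the abort-rate problem mentioned above.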
It will be very interesting to see first benchmarks. My guess is that operation latencies will be in the order of 80-120ms and therefore much slower than what can be achieved on database clusters distributed only over local replicas.
> The underlying time references used by TrueTime are GPS and atomic clocks. TrueTime uses two forms of time reference because they have different failure modes... TrueTime is implemented by a set of time master machines per datacenter and a timeslave daemon per machine. The majority of masters have GPS receivers with dedicated antennas; these masters are separated physically to reduce the effects of [GPS] antenna failures, radio interference, and spoofing. The remaining masters (which we refer to as Armageddon masters) are equipped with atomic clocks. An atomic clock is not that expensive: the cost of an Armageddon master is of the same order as that of a GPS master.
Source: https://static.googleusercontent.com/media/research.google.c...
What is a distributed system that is CA? Can you build a distributed system which will never have a partition?
> For distributed systems over a “wide area,” it's generally viewed that partitions are inevitable, although not necessarily common. If you believe that partitions are inevitable, any distributed system must be prepared to forfeit either consistency (AP) or availability (CP), which is not a choice anyone wants to make. In fact, the original point of the CAP theorem was to get designers to take this tradeoff seriously. But there are two important caveats: First, you only need to forfeit consistency or availability during an actual partition, and even then there are many mitigations. Second, the actual theorem is about 100% availability; a more interesting discussion is about the tradeoffs involved to achieve realistic high availability.
How does that answer it? Are they implying that partitions will not happen if you don't believe in them?
This, of course, is effectively useless in practice, and is dependent on an infinite buffer of pending operations, etc.
https://brooker.co.za/blog/2014/07/16/pacelc.html
Doesn't availability mean getting a response on success or failure? If during a partition there is no response on success or failure, how is the system available? It seems rewriting a term like "x will happen" to "x will happen after an infinite timeout" should not be valid.
https://en.wikipedia.org/wiki/PACELC_theorem
"PACELC builds on the CAP theorem. Both theorems describe how distributed databases have limitations and tradeoffs regarding consistency, availability, and partition tolerance. PACELC however goes further and states that a trade-off also exists, this time between latency and consistency, even in absence of partitions, thus providing a more complete portrayal of the potential consistency tradeoffs for distributed systems."
And I would take that argument one step further and say that latency and partitioning are effectively identical, and from the point of view of any given operation, it is impossible to say whether the system is in a partitioned state until max latency (timeout) has elapsed, because failure to make progress within timeout is the only meaningful definition of partition-induced unavailability.
Have they documented the wire protocol? I couldn't find it.
RPC: https://cloud.google.com/spanner/docs/reference/rpc/
Rest: https://cloud.google.com/spanner/docs/reference/rest/
https://github.com/GoogleCloudPlatform/google-cloud-node#clo...
Edit: not every server has an atomic clock; see replies by Google employees
The timeslave daemons running on each machine keep them synchronized with the master time servers, and maintain tight bounds on their inaccuracy.
(Disclaimer: I work at Google)
I know this is a single system, but I'll still say it. This seems like another step in a scary trend for our internet.
In any case this is much better than Amazon's offerings... when they actually ship it. :)
Of note:
They say Spanner is "both consistent and highly available despite operating over a wide area". So not 100% availability but they've got it to "more than five 9s of availability (less than one failure in 10^6)."
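To put "five 9s" in perspective, the availability figure can be converted into allowed downtime per year. This is plain arithmetic over a 365-day year; the function name is made up.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability: float) -> float:
    """Maximum downtime per year implied by an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year(0.99999)
three_nines = downtime_minutes_per_year(0.999)
print(f"99.999%: {five_nines:.2f} minutes/year")
print(f"99.9%:   {three_nines / 60:.2f} hours/year")
```

Five 9s allows a little over five minutes of downtime per year, against almost nine hours at three 9s, which is the scale at which most applications have larger sources of outage than the database.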
Upd: Downvoting this warning will only increase that number.
Software is about separating concerns, and decentralizing authority. Responsible engineers shouldn't be using this service.
