
I'm surprised nobody's picked up on the main tradeoff here: very high write latencies (and high read latencies too, just not as spectacularly). From slide 9:

Performance:
● Very high commit latency - 50-100ms
● Reads take 5-10ms - much slower than MySQL
● High throughput

(And that's reads-with-no-joins, since they are not supported either.)




I don't think that has much to do with F1; any system with synchronous cross-datacenter replication will have that kind of write latency. Presumably if you ran in one datacenter it would be lower latency and just as scalable.
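To make the point concrete, here is a minimal sketch of why a synchronous quorum write is bound by the network rather than by the database. It assumes the write is sent to all replicas in parallel and commits once a majority has acknowledged. The function name and every round-trip time below are hypothetical illustrations, not figures from the F1 slides.

```python
def quorum_commit_latency_ms(replica_rtts_ms, quorum=None):
    """Time until `quorum` replicas have acknowledged a write sent to all in parallel.

    Simplified model: ignores disk time, queuing, and extra protocol rounds.
    """
    if not replica_rtts_ms:
        raise ValueError("need at least one replica")
    n = len(replica_rtts_ms)
    if quorum is None:
        quorum = n // 2 + 1  # simple majority
    if not 1 <= quorum <= n:
        raise ValueError("quorum must be between 1 and the replica count")
    # The write commits when the quorum-th fastest acknowledgement arrives.
    return sorted(replica_rtts_ms)[quorum - 1]


# Hypothetical round-trip times in milliseconds (assumptions, not measurements).
cross_datacenter = [1.0, 40.0, 70.0]   # one local replica, two remote
single_datacenter = [0.5, 0.6, 0.8]    # all replicas close together

print(quorum_commit_latency_ms(cross_datacenter))   # 40.0
print(quorum_commit_latency_ms(single_datacenter))  # 0.6
```

In this model the commit latency is set by the slowest replica the quorum needs, so it depends on where the replicas sit, whatever storage system runs on top.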


I think it's a fair thing to bring up, given that the F1 abstract comes across as claiming your-cake-and-eating-it-too:

"With F1, we have built a novel hybrid system that combines the scalability, fault tolerance, transparent sharding, and cost benefits so far available only in “NoSQL” systems with the usability, familiarity, and transactional guarantees expected from an RDBMS."

In other words: sure, you can build a distributed, transactional system by making all writes synchronous... but most (all?) NoSQL systems prefer to optimize for low-latency operations over transactionality. As an author of one of those systems, I think that low latency is a more useful property in general, but the fact remains that it's a tradeoff, and it's a shame that the F1 authors deliberately gloss over that.


Deliberately gloss over it? Second paragraph of the abstract:

> The strong consistency properties of F1 and its storage system come at the cost of higher write latencies compared to MySQL. Having successfully migrated a rich customer-facing application suite at the heart of Google’s ad business to F1, with no downtime, we will describe how we restructured schema and applications to largely hide this increased latency from external users. The distributed nature of F1 also allows it to scale easily and to support significantly higher throughput for batch workloads than a traditional RDBMS.


F1 is a successor to Megastore, including in its high write/read latencies. From bits leaked here and there by Google, one can safely assume they put a high value on cross-datacenter consistency nowadays, even if it hurts write performance this much. I recommend this video from Google I/O 2011: http://www.youtube.com/watch?v=rgQm1KEIIuc (from minute 35:40 on). Megastore and F1 have abysmal latencies because they synchronously write to at least 3 datacenters, through Paxos or 2PC, before acknowledging the client.

Only an idiot would assume that F1/Megastore is a drop-in replacement for MySQL (or Cassandra, jbellis!), but those guys invented BigTable and battle-tested it long before writing the papers, so they know where their priorities lie nowadays. On the other hand, I am highly curious about Spanner, the successor of BigTable, that powers F1.


> (And that's reads-with-no-joins, since they are not supported either.)

Am I missing something? The very next slide seems to describe how they support one-to-many joins by flattening the table hierarchy. (They also claim to allow "joins to external sources", but unfortunately there are no details given for that feature.)
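For what it's worth, here is a minimal sketch of how I read "flattening the table hierarchy". It assumes child rows are stored under their parent's key prefix, so that a one-to-many join becomes a single contiguous range scan. The table contents, key layout, and function name are all hypothetical, not taken from the slides.

```python
from bisect import bisect_left

# Hypothetical flattened storage: parent rows are keyed by (customer_id,),
# child rows by (customer_id, campaign_id). Sorting the keys places each
# customer's campaigns directly after the customer row.
rows = {
    (1,): {"name": "acme"},
    (1, 10): {"campaign": "spring"},
    (1, 11): {"campaign": "fall"},
    (2,): {"name": "globex"},
    (2, 20): {"campaign": "launch"},
}
sorted_keys = sorted(rows)


def customer_with_campaigns(customer_id):
    """Return (customer_row, campaign_rows) using one contiguous key range."""
    lo = bisect_left(sorted_keys, (customer_id,))
    hi = bisect_left(sorted_keys, (customer_id + 1,))
    span = sorted_keys[lo:hi]
    if not span or span[0] != (customer_id,):
        return None, []  # no parent row for this customer
    return rows[span[0]], [rows[key] for key in span[1:]]


parent, campaigns = customer_with_campaigns(1)
print(parent, campaigns)
```

Because parent and children are adjacent in key order, the join touches one key range on one shard instead of looking up two separate tables.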


You're right, I missed the part where "can't do cross-shard transactions or joins" was describing the old MySQL infrastructure, not F1.



