> I don’t understand how Aurora achieves the speed it does with a log-based approach. Can someone please clarify?

Aurora splits out 'database' nodes (the server instances you provision and pay for) from 'storage' nodes (a 'multi-tenant scale-out storage service' that automatically performs massively parallel disk I/O in the background). Where stock MySQL writes data to tablespaces, the redo log, the double-write buffer, and the binary log, Aurora sends only the redo log over the network to the storage service (in parallel to 6 storage nodes across 3 AZs for durability).

There's no need for separate tablespace, double-write buffer, or binary-log writes, or for extra storage-layer mirroring, since durability is guaranteed as soon as a quorum of storage nodes acknowledges the redo log. The reduced write amplification results in 7.7x fewer network IOs per transaction at the 'database' layer for Aurora (vs standard MySQL running on EBS networked storage, in the benchmark described in the paper), and 46x fewer disk IOs at the 'storage' layer [1].
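
To make the quorum part concrete, here's a rough sketch (a toy model, not Aurora's actual code). It assumes a write quorum of 4 out of the 6 storage nodes, which is the figure the paper uses; quorum_write, QuorumResult, and the latency numbers are made up for illustration. The point is that the commit becomes durable at the 4th-fastest acknowledgement, so a couple of slow or unreachable nodes don't hold it up.

```python
from typing import List, NamedTuple


class QuorumResult(NamedTuple):
    durable_at_ms: float      # time at which the write counts as durable
    acked_nodes: List[int]    # nodes whose acks formed the quorum, fastest first


def quorum_write(ack_latencies_ms: List[float], write_quorum: int = 4) -> QuorumResult:
    """Toy model: the redo log record goes to all nodes in parallel, and the
    write is durable once `write_quorum` of them have acknowledged it."""
    if not 1 <= write_quorum <= len(ack_latencies_ms):
        raise ValueError("write_quorum must be between 1 and the node count")
    # Order node indices by how quickly each one acknowledges.
    by_speed = sorted(range(len(ack_latencies_ms)), key=lambda i: ack_latencies_ms[i])
    quorum_nodes = by_speed[:write_quorum]
    # The slowest member of the quorum determines the commit latency.
    return QuorumResult(ack_latencies_ms[quorum_nodes[-1]], quorum_nodes)


# Hypothetical ack latencies for 6 storage nodes; nodes 3 and 5 are stragglers.
result = quorum_write([1.2, 0.9, 1.5, 40.0, 1.1, 250.0])
print(result.durable_at_ms, result.acked_nodes)
```

With those made-up latencies the write is durable after 1.5 ms even though one node takes 250 ms, since the two stragglers never sit on the commit path.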

[1] https://www.allthingsdistributed.com/files/p1041-verbitski.p...

That’s an impressive savings in I/O.