
Serious question: how do you back up a Cassandra database of that size? Do you even back it up, or do you just rely on the sharding to prevent data loss?

Cassandra has a snapshot command (nodetool snapshot) that creates a directory of hard links to the files that hold the data (this is safe because Cassandra's data files are immutable). Then you just upload those files to your backup storage. This is mainly for catastrophic recovery scenarios.
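To make the snapshot-then-upload flow concrete, here is a rough Python sketch. It assumes nodetool is on the PATH and that snapshots land under the usual <data_dir>/<keyspace>/<table>/snapshots/<tag>/ layout; the function names, the tag, and the keyspace are made up for illustration, and the actual upload step is left out.

```python
import subprocess
from pathlib import Path


def snapshot_command(tag, keyspace):
    """Build the nodetool invocation that snapshots one keyspace under a tag."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def take_snapshot(tag, keyspace):
    """Run the snapshot on the local node (assumes nodetool is on the PATH)."""
    subprocess.run(snapshot_command(tag, keyspace), check=True)


def snapshot_files(data_dir, keyspace, tag):
    """List the files of one tagged snapshot across every table in a keyspace.

    Assumes the layout <data_dir>/<keyspace>/<table>/snapshots/<tag>/.
    """
    root = Path(data_dir) / keyspace
    return sorted(p for p in root.glob(f"*/snapshots/{tag}/*") if p.is_file())
```

You would call take_snapshot("nightly", "my_keyspace") on each node, then hand whatever snapshot_files(...) returns to your uploader of choice.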

Normally, though, since the data is replicated on 3 nodes, you can lose a node completely and rebuild it from the other nodes.

Cassandra natively supports multi-datacenter replication, so you can run an entirely separate group of nodes in another location that also holds a copy of the entire dataset (with its own configurable replication factor) and use it as an online, fully active backup.

We run 3 geo-distributed clusters with no offline backups because of this.
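For anyone wondering what that looks like in practice: replication across locations is usually set per keyspace with NetworkTopologyStrategy. Below is a small Python sketch that just builds the CQL statement. The helper name, keyspace, and datacenter names are hypothetical, and the replica counts are only an example, not the setup described above.

```python
def multi_dc_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement with a replica count per datacenter."""
    dcs = ", ".join(f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', {dcs}}};"
    )


# Hypothetical example: three replicas in each of two datacenters.
print(multi_dc_keyspace_cql("events", {"us_east": 3, "eu_west": 3}))
```

The datacenter names have to match what the cluster's snitch reports, so treat the ones here as placeholders.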
