
"once you have self-healing clusters having a node disappear / crash is not a big deal for availability of the entire cluster."

I was assuming the self-healing clusters in their setup would operate using code not produced with stability in mind.

"Also the blog post says nothing about integrity. It does not seem that integrity is affected."

I was assuming that the code that maintained integrity had to run correctly and stably in order to maintain integrity.




As long as it crashes rather than breaking integrity, the integrity-maintaining code doesn't necessarily have to be stable.


You can't know it will do that unless you designed it to. The high-assurance kernels of the Orange Book era often used that strategy: they would either run correctly or fail safe. They had strong assurance of that from design, specs, proofs, tests, etc. What you said, applied to CockroachDB minus a good subset of such practices, amounts to "We hope it will crash instead of affecting integrity." Different ballpark entirely.

EDIT to add: Recall that this played out in filesystems and regular databases, where software errors could corrupt data. Now imagine the same thing in distributed programming: the same or worse results from defects as always.
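
To make "crash rather than break integrity" concrete, here is a minimal Python sketch of a fail-stop check. It is only an illustration, not how CockroachDB or any particular system does it; the names `seal`, `open_sealed`, and `IntegrityError` are made up for the example. The point is that the halt-on-doubt behavior has to be written in deliberately.

```python
import hashlib


class IntegrityError(Exception):
    """Raised to halt processing instead of continuing with suspect data."""


def seal(payload: bytes) -> bytes:
    """Prefix the payload with its SHA-256 digest (hypothetical record format)."""
    return hashlib.sha256(payload).digest() + payload


def open_sealed(record: bytes) -> bytes:
    """Return the payload, or fail-stop if the stored digest does not match."""
    digest, payload = record[:32], record[32:]
    if hashlib.sha256(payload).digest() != digest:
        # Fail-stop: refuse to hand back data we cannot vouch for.
        raise IntegrityError("checksum mismatch; refusing to continue")
    return payload
```

A checksum like this only catches corruption after the record was sealed. It does nothing for a logic bug that seals the wrong data in the first place, which is where the specs, proofs, and tests come in.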



