Author here :) We did have high-level metrics and expectations for how this change would behave, but a couple of factors, happening in parallel, made it much harder to reason about in practice.
Data in these systems moves slowly and with a lot of inertia, so the effects show up gradually and can lag behind the change itself. On top of that, the impact wasn’t uniform. Most of the overhead came from a small subset of volumes, so it took time to isolate what was actually driving the increase. These systems are hard to test at scale!
Author here. With SMR, you do have large zones that are essentially immutable. However, in this case our extents and volumes are immutable because we do volume-level striping for erasure coding. This means that if any extent changes, the parities have to be rewritten as well. Others do block-level striping, so they can just move data around within a disk. There are lots of trade-offs with both approaches. Also, keeping volumes/extents immutable makes reasoning through correctness much simpler.
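To make the trade-off concrete, here's a minimal sketch (hypothetical illustration, not Magic Pocket code) using simple XOR parity as a stand-in for a real erasure code. With volume-level striping, mutating any one extent invalidates the parity for the whole stripe, so the parity has to be recomputed and rewritten:

```python
from functools import reduce

def compute_parity(extents: list[bytes]) -> bytes:
    """XOR all extents together; a toy stand-in for a real erasure code."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), extents)

extents = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]
parity = compute_parity(extents)

# If any single extent changes, the old parity is no longer valid,
# so the parity extent must be rewritten as well:
extents[1] = b"\xff\x00"
assert compute_parity(extents) != parity
```

Keeping extents immutable sidesteps this entirely: a "change" becomes writing a new volume, so a stored parity is always consistent with the extents it was computed from.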
Author here. We do try to post updates as much as we are able to in our blog: https://dropbox.tech/tag-results.magic-pocket. While the talk did go through details of the system we've covered in the past, the purpose of the talk was to convey my personal learnings of managing such a system at this scale. See key takeaways here: https://qconsf.com/presentation/oct2022/magic-pocket-dropbox.... Sustaining a high amount growth while maintaining high availability, durability, and efficiencies at this scale is very difficult to do.
Author here. We recently published a post about our last 4 years of SMR usage here: https://dropbox.tech/infrastructure/four-years-of-smr-storag.... Note that SMR technology just 5 years ago was rather nascent, and software support for it was often lacking. Using SMR without penalty is only possible for us because our write workloads are sequential.
We use a custom disk format along with libzbc, but libzbd now provides many advantages, which we are looking to adopt. I did want the QCon talk to have some super straight-to-the-point conclusions, and these, I believe, are the ones that have saved us the most since I have been on the team, largely due to the sheer scale of managing such a system.