The filesystem's supposedly atomic file creation is buggy. The putIfAbsent implementation leaves partial commit files visible to others (forever on errors), so other users see corrupt transactions. For filesystems, the correct answer is to write to a tempfile, sync that, then use link(2) to claim a transaction id.
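A minimal sketch of the tempfile, sync, then link(2) sequence described above. The function name `commit_transaction`, the `.tmp-` prefix, and the zero-padded `.json` file naming are assumptions made for illustration, not taken from the project under discussion.

```python
import os
import tempfile


def commit_transaction(log_dir: str, txn_id: int, payload: bytes) -> bool:
    """Try to claim txn_id. Return True if this writer won, False if it lost."""
    # Hypothetical naming scheme for the final commit file.
    final_path = os.path.join(log_dir, f"{txn_id:020d}.json")

    # Write to a private temp file in the same directory, so the hard link
    # created below stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=log_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # contents are on disk before they become visible

        try:
            # link(2) fails with EEXIST if final_path already exists, so only
            # one writer can ever claim a given transaction id.
            os.link(tmp_path, final_path)
        except FileExistsError:
            return False
        return True
    finally:
        # The temp name is no longer needed whether we won, lost, or crashed
        # partway through the write.
        os.unlink(tmp_path)
```

Readers only ever see a fully written file under the final name, and a failed write leaves nothing behind under it. The sketch skips fsyncing the directory itself, which may also be needed for the new name to survive a crash.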
We use the same approach in the time series database I'm working on. While file creation and fsync aren't atomic, the rename [1] syscall is. So we create a temporary file, write the data, call fsync and, if all is good, rename it atomically so it becomes visible to other users. I gave a talk about this [2] a few months ago.
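A rough sketch of that write, fsync, rename sequence, assuming a POSIX system. `publish_atomically` is a hypothetical helper, not code from the database mentioned. One difference from link(2) worth noting: a rename silently replaces an existing destination, so by itself it does not tell you whether someone else already claimed the name.

```python
import os
import tempfile


def publish_atomically(path: str, data: bytes) -> None:
    """Make `data` appear at `path` all at once, never partially written."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file lives next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # flush file contents to disk first
        os.replace(tmp_path, path)  # atomic rename; overwrites any existing file
    except BaseException:
        os.unlink(tmp_path)  # nothing was published, drop the temp file
        raise

    # Assumption: POSIX, where a directory can be opened and fsynced so the
    # new directory entry itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```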
That's a good point, but I think the system will still be correct in this scenario, just unavailable. If a transaction ever fails to fully write a transaction metadata file, all future transactions will crash while reading it. You can recover by deleting the faulty (and final) transaction file. Link is a better choice though, you're right. I didn't remember that it also fails if the new file already exists. I'll rewrite my code. Thanks!
Only by virtue of the code seeming to store everything in JSON. If you used a format where binary data could be stored as-is, you could end up with a zeroed sector in the middle of a large value. If you used a format that allowed zero bytes, you could end up with a structure that could be confused for something legal (e.g. list of TLV with many type=0 len=0 objects ignored).
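To make the TLV case concrete, here is a toy illustration. The format is invented for this example (1-byte type, 1-byte length, type 0 treated as ignorable padding) and does not correspond to any real on-disk layout.

```python
def parse_tlv(buf: bytes) -> list[tuple[int, bytes]]:
    """Parse a hypothetical TLV stream: 1-byte type, 1-byte length, then value."""
    records = []
    i = 0
    while i + 2 <= len(buf):
        rtype, length = buf[i], buf[i + 1]
        value = buf[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated record")
        i += 2 + length
        if rtype == 0:
            continue  # type 0 is padding in this toy format, so it is skipped
        records.append((rtype, value))
    if i != len(buf):
        raise ValueError("trailing byte")
    return records


intact = bytes([1, 3]) + b"abc" + bytes([2, 2]) + b"hi"
# Same length, but the second record's bytes were never written and read back as zeros.
damaged = bytes([1, 3]) + b"abc" + bytes(4)

print(parse_tlv(intact))   # [(1, b'abc'), (2, b'hi')]
print(parse_tlv(damaged))  # [(1, b'abc')]
```

The damaged buffer parses without any error: the zeroed region reads as two empty type-0 records, so the lost record goes unnoticed. A JSON parser would reject the same zero bytes.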