I've been messing around with writing a toy database for fun/learning, and realised I've got a fairly big gap in my knowledge when it comes to the performance and durability of file reads/writes.
Examples of some questions I'd like to be able to answer, or at least make reasonable decisions about (note: I don't actually want answers to these right now; they're just examples of the sort of thing I'd like to read about in depth to build up some background knowledge):
* how to ensure data's been safely written (e.g. when to flush or fsync, what
guarantees that gives, using a WAL)
* block sizes to read/write for different purposes, tradeoffs, etc.
* considerations for writing to different media/filesystems (e.g. disk, SSD, NFS)
* when to rely on OS disk cache vs. using own cache
* when to use/not use mmap
* performance considerations (e.g. multiple small files vs. few larger ones,
concurrent readers/writers, locking, etc.)
* OS specific considerations
I recall reading some posts (about Redis/SQLite/Postgres) on this, which made me realise that it's a fairly complex topic, but not one I've found a good entry point for.
Any pointers to books, documentation, etc. on the above would be much