And it feels like it's been like this pretty much since the Wayback Machine was launched. Back then I expected that annoyance to be fixed over time - now, almost 16 years later, I'm less hopeful. :/
However, I suppose the Wayback Machine receives many unique requests that would hit the database anyway, so caching may not be super effective.
I'm not an expert on big-data systems though.
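To make the intuition concrete, here's a toy sketch (my own illustration, not how the Wayback Machine actually works) comparing an LRU cache's hit rate under skewed traffic, where a few pages are popular, versus long-tail traffic, where nearly every request is unique:

```python
import random
from collections import OrderedDict

def lru_hit_rate(requests, capacity):
    """Simulate a fixed-size LRU cache and return the fraction of hits."""
    cache = OrderedDict()
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(requests)

random.seed(42)
N = 100_000
# Skewed (Zipf-like) traffic: a handful of popular pages dominate.
skewed = [int(random.paretovariate(1.2)) for _ in range(N)]
# Long-tail traffic: almost every request targets a different page.
unique = [random.randrange(10_000_000) for _ in range(N)]

print(f"skewed traffic hit rate: {lru_hit_rate(skewed, 10_000):.2f}")
print(f"unique traffic hit rate: {lru_hit_rate(unique, 10_000):.2f}")
```

With skewed traffic the cache absorbs most requests; with mostly-unique traffic the hit rate collapses toward zero and nearly everything falls through to the database, which is the point being made above.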
The priority is to make as much data accessible as possible, and for the same cost we can crawl and store far more web content on slow spinning disks than with SSDs or large RAM caches. Most RAM on storage nodes, which would otherwise be used as disk cache, gets used for derive tasks (like OCR or video conversion) or crawling/crunching tasks. That being said, if the service is so slow as to be unusable then there's no point operating it in the first place; hopefully we can get the latency a bit lower.
(I was puzzled why it didn't seem to vary much depending on the time of day, etc.)
Really appreciate the writeup.