Oh, I see they recommend writing WARC files with wget using `--no-warc-digests`. You really should not do that. One, it's just a SHA-1, which costs next to nothing in CPU or storage. Two, the digest is what's used to create revisit records for de-duplication. If you disable it, you or someone else might end up with lots of duplicate resources on re-crawling.
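To make the de-duplication point concrete, here is a rough sketch in Python of how a payload digest lets a crawler decide between storing a full response and writing a revisit record. It is not wget's actual code: `payload_digest`, `plan_records`, and the in-memory `seen` index are hypothetical names for illustration, and the `sha1:<base32>` label is assumed as the digest form.

```python
import base64
import hashlib


def payload_digest(payload: bytes) -> str:
    """Return a digest label in the assumed 'sha1:<base32>' form."""
    raw = hashlib.sha1(payload).digest()  # 20 bytes
    return "sha1:" + base64.b32encode(raw).decode("ascii")


def plan_records(captures, seen=None):
    """Decide per capture whether to store a full response or a revisit.

    captures: iterable of (url, payload_bytes) pairs.
    seen: hypothetical index mapping digest -> URL of the first capture,
          e.g. loaded from an earlier crawl.
    """
    seen = {} if seen is None else seen
    plan = []
    for url, payload in captures:
        digest = payload_digest(payload)
        if digest in seen:
            # Same payload already stored: only a small revisit record is needed.
            plan.append(("revisit", url, digest, seen[digest]))
        else:
            seen[digest] = url
            plan.append(("response", url, digest, None))
    return plan


if __name__ == "__main__":
    crawl = [
        ("http://example.com/a.css", b"body { margin: 0 }"),
        ("http://example.com/b.css", b"body { margin: 0 }"),  # identical payload
        ("http://example.com/c.css", b"body { margin: 1px }"),
    ]
    for record in plan_records(crawl):
        print(record[0], record[1])
```

Without a digest there is nothing to look up in the index, so every capture in a sketch like this would have to be stored as a full response.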