The modern trend in compressors is to use more memory to achieve speed. This is good if you're using big-iron cloud computers...
"Zstandard has no inherent limit and can address terabytes of memory (although it rarely does). For example, the lower of the 22 levels use 1 MB or less. For compatibility with a broad range of receiving systems, where memory may be limited, it is recommended to limit memory usage to 8 MB. This is a tuning recommendation, though, not a compression format limitation."
8 MB for the smallest preset? Back in the mid-2000s, I attended a Jabber/XMPP discussion about the viability of using libz to compress the stream. It turned out that even a 32 KB window is huge when your connection server is handling thousands of connections at a time, and they were investigating the effect of using a modified libz with an even smaller window (it was hard-coded back then).
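To put rough numbers on that: zlib's zconf.h gives the deflate state as about (1 << (windowBits+2)) + (1 << (memLevel+9)) bytes, plus a few KB of small objects. Here is a back-of-the-envelope sketch in Python. The helper name and the connection count are mine and purely illustrative, not anything from that discussion.

```python
def deflate_state_bytes(window_bits: int = 15, mem_level: int = 8) -> int:
    """Approximate per-stream deflate memory, using the formula from zlib's zconf.h."""
    return (1 << (window_bits + 2)) + (1 << (mem_level + 9))


CONNECTIONS = 10_000  # hypothetical number of concurrent connections

# Default settings first, then two progressively smaller configurations.
for wbits, mlevel in [(15, 8), (12, 5), (9, 1)]:
    per_conn = deflate_state_bytes(wbits, mlevel)
    total_mib = per_conn * CONNECTIONS / 2**20
    print(f"windowBits={wbits} memLevel={mlevel}: "
          f"{per_conn / 1024:.0f} KiB per connection, {total_mib:.1f} MiB total")
```

With the defaults the compressor side comes to 256 KiB per stream, which is why the window size mattered so much at that connection count.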
I know Moore's law is in Zstandard's favor w.r.t. memory usage (what's 8 MB when your server's got 64 GB or more?), but I think it's useful to note that this is squarely aimed at web traffic backed by beefy servers.
Any modern server that handles a thousand or more concurrent connections on commodity hardware already uses only about as many threads as there are processor cores. In that architecture it's trivial to also limit the number of compression threads to the number of processor cores. That architecture gives the best performance and very low memory use.
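A minimal sketch of what I mean, in Python with zlib standing in for whatever compressor you use. The workload and names are made up, and it assumes each response is available in full before it is compressed.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor

# One compression worker per core, no matter how many connections exist.
WORKERS = os.cpu_count() or 1
pool = ThreadPoolExecutor(max_workers=WORKERS)


def compress_response(body: bytes) -> bytes:
    # One-shot compression: the zlib state only lives for this call.
    return zlib.compress(body, 6)


# Hypothetical workload: 1000 buffered responses from 1000 connections.
responses = [(b"response %d " % i) * 200 for i in range(1000)]
compressed = list(pool.map(compress_response, responses))
pool.shutdown()
```

At most WORKERS compression states are alive at any moment, so the compressor's memory scales with the core count rather than with the connection count.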
In the mid-2000s it was still the accepted norm to spawn one thread for each connection, and in that setup the memory usage of the compressor would have been a problem. I doubt that it's a problem with today's software architecture.
A server like this could only work by buffering an entire response before compressing it in one go, requiring temporary space for both the compressed and the uncompressed bytes. In reality, most servers of the design you mention operate in a streaming fashion, flushing the compressor's output just in time as the backend fills its input buffer. In designs like that (most of them), a compression context per connection is still required.
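Roughly what the streaming design looks like, sketched in Python with zlib. The Connection class is hypothetical; the point is only that the compressor object has to live as long as the connection does.

```python
import zlib


class Connection:
    """Hypothetical connection object that owns one long-lived compressor."""

    def __init__(self) -> None:
        self.deflater = zlib.compressobj()

    def send(self, chunk: bytes) -> bytes:
        # Z_SYNC_FLUSH pushes out everything buffered so far, so the peer can
        # decode it immediately, while keeping the window for later chunks.
        return self.deflater.compress(chunk) + self.deflater.flush(zlib.Z_SYNC_FLUSH)


conn = Connection()
inflater = zlib.decompressobj()  # the peer needs matching long-lived state

chunks = [b"<message to='a@example.org'>hello</message>"] * 3
wire = [conn.send(c) for c in chunks]
received = [inflater.decompress(w) for w in wire]
```

Later chunks come out smaller because they can reference earlier ones, and that is exactly the state you have to pay for on every open connection.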
Not sure I agree. The 8 MB is the recommended upper limit, so I don't think anyone is planning to use it for web traffic. I think it's designed to be faster and to compress better even at smaller window sizes, though I'm not sure how small. It most likely performs better than zlib even at 32 KB, or is at least faster, I would assume. Now, if you are a Jabber/chat server holding thousands of long-running connections open, it can be an issue. You already said that even standard zlib doesn't work there.
I don't think 8 MB is the smallest preset; the text you quoted says that the lower levels use "1 MB or less".
The concern I have is that this makes it sound like the compressor can choose how much memory the decompressor will need to use. Does this mean that zstd can't be used in a potentially adversarial environment? (E.g., is there a denial-of-service vector here by forcing the server to use large amounts of memory to decompress my requests?)
It will not use (much) more memory than the size of the output in any case. 8MB is the window here, which just means the decompressor can discard data that falls outside this 8MB window as it is decompressing.
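For the adversarial case the receiver can still set its own limits. Here is a sketch with Python's zlib standing in for zstd (I believe libzstd has a comparable decoder-side knob, ZSTD_d_windowLogMax, but check the manual). The function name and the limits are made up for illustration.

```python
import zlib


def bounded_decompress(payload: bytes, max_output: int, max_wbits: int = 15) -> bytes:
    # A stream whose header declares a window larger than 2**max_wbits
    # is rejected with zlib.error instead of being allocated for.
    d = zlib.decompressobj(wbits=max_wbits)
    # Ask for at most one byte over the limit, so oversized output is
    # detected without ever materialising all of it.
    out = d.decompress(payload, max_output + 1)
    if len(out) > max_output:
        raise ValueError("decompressed size exceeds limit")
    return out


# Small on the wire, 10 MB when expanded.
bomb = zlib.compress(b"\0" * 10_000_000)

try:
    bounded_decompress(bomb, max_output=1_000_000)
except ValueError as exc:
    print("rejected:", exc)
```

So the sender picks the window it wants, but the receiver decides the largest window and the largest output it is willing to accept.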
Minix 1.5 for 8086 had a slightly mad decompress program that would fork itself up to 4 times in order to address enough memory to decompress certain *.Z files:
You can get pretty far with "amnesiac" zlib for networking, too. You collect up writes in your out-buffer, and use zlib to compress it before transmission. The trick is that you don't retain context or find matches between chunks, so there's no memory overhead between sends.
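Something like this, sketched in Python. The function name and the stanzas are just for illustration.

```python
import zlib


def amnesiac_send(out_buffer: bytearray) -> bytes:
    """Compress whatever has accumulated, then forget everything."""
    packet = zlib.compress(bytes(out_buffer))  # temporary state, gone on return
    out_buffer.clear()
    return packet


buf = bytearray()
buf += b"<presence from='a@example.org'/>"
buf += b"<presence from='b@example.org'/>"
first = amnesiac_send(buf)

buf += b"<presence from='a@example.org'/>"
second = amnesiac_send(buf)

# Each packet decodes on its own, with no shared decompressor state.
print(zlib.decompress(second))
```

The cost is that the second packet can't reference anything from the first, so you give up ratio in exchange for holding zero compressor memory per idle connection.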
Note that 8 MB is the size of the L3 cache on some modern Intel chips. You want fast lookups in the window area, where you constantly do random-access reads.
"Zstandard has no inherent limit and can address terabytes of memory (although it rarely does). For example, the lower of the 22 levels use 1 MB or less. For compatibility with a broad range of receiving systems, where memory may be limited, it is recommended to limit memory usage to 8 MB. This is a tuning recommendation, though, not a compression format limitation."