
Show HN: DivANS – Rust simd compression algorithm demo in WASM in the browser - daniel_rh
https://dropbox.github.io/divans
======
derchu
Interesting post. Assuming this is used to store user-uploaded files more
efficiently, does Dropbox apply the compression blindly to all user files, or
selectively (e.g. skipping files that are already compressed, like JPEGs)?

~~~
daniel_rh
To compress files for a system like Dropbox, we would recommend: a) first
check whether the file can be compressed with
[https://github.com/dropbox/lepton](https://github.com/dropbox/lepton).
Lepton has some new flags that help it compress a wider range of files, so it
is no longer limited to pure JPEG files.

b) then compress it with zlib -6 to get an idea of how compressible the data
is.

c) if the data compresses by at least a percent with zlib, it's likely to do
significantly better with more advanced compression like DivANS. About 1/3 of
files in Dropbox fall into that >1% category but aren't compatible with
Lepton. For those compressible files, DivANS gets 12.08% savings over zlib
with the settings we chose in the blog post. Brotli, in contrast, gets 9.64%
savings over zlib on that same data.
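
The a/b/c steps above can be sketched as a small decision function. This is
only an illustration, not Dropbox's actual pipeline: the names `Choice` and
`choose` are hypothetical, the Lepton check and the zlib -6 size are assumed
to be computed elsewhere and passed in, and what happens to files under the 1%
threshold is not stated above, so the sketch simply leaves them alone.

```rust
/// Outcome of the selection heuristic (hypothetical names).
#[derive(Debug, PartialEq)]
pub enum Choice {
    /// Step a): the file is something Lepton can handle.
    Lepton,
    /// Step c): zlib saved at least 1%, so try DivANS (or Brotli).
    Advanced,
    /// Assumption: below the threshold the file is left as it is.
    Skip,
}

/// `lepton_ok`: result of the Lepton compatibility check (step a).
/// `original_len`, `zlib6_len`: sizes in bytes before and after zlib -6 (step b).
pub fn choose(lepton_ok: bool, original_len: u64, zlib6_len: u64) -> Choice {
    if lepton_ok {
        return Choice::Lepton;
    }
    // Bytes saved by zlib; zero if zlib made the file larger.
    let saved = original_len.saturating_sub(zlib6_len);
    // Integer form of "saved / original >= 1%", widened to avoid overflow.
    let at_least_one_percent =
        original_len > 0 && (saved as u128) * 100 >= original_len as u128;
    if at_least_one_percent {
        Choice::Advanced
    } else {
        Choice::Skip
    }
}
```

For example, a 1000-byte file that zlib -6 shrinks to 990 bytes just clears
the threshold, while one that only shrinks to 995 bytes does not.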

------
aey
Glad to see how much progress you guys made!!!

------
debra119
Does SIMD in WASM even work in the browser?

~~~
daniel_rh
The crate is built with cargo build --release --features=simd
--target=wasm32-unknown-unknown, which aliases the DefaultCDF16 type to the
SIMD types.
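
For readers unfamiliar with the mechanism, here is a minimal sketch of how a
Cargo feature can switch a type alias at compile time. Only the alias name
DefaultCDF16 and the feature name simd come from the build line above;
`Cdf16`, `ScalarCdf16` and `SimdCdf16` are hypothetical stand-ins, not the
crate's real types, and the "SIMD" variant uses a plain lane-wise loop rather
than real vector intrinsics.

```rust
/// Minimal interface shared by both stand-in types (hypothetical).
pub trait Cdf16: Default {
    /// Record `amount` occurrences of `symbol` (expected range 0..16).
    fn bump(&mut self, symbol: usize, amount: u16);
    /// Cumulative count of all symbols less than or equal to `symbol`.
    fn at(&self, symbol: usize) -> u16;
    /// Total count over all 16 symbols.
    fn total(&self) -> u16 {
        self.at(15)
    }
}

/// Scalar stand-in: touches only the entries at or above `symbol`.
#[derive(Default)]
pub struct ScalarCdf16 {
    cdf: [u16; 16],
}

impl Cdf16 for ScalarCdf16 {
    fn bump(&mut self, symbol: usize, amount: u16) {
        for entry in self.cdf[symbol..].iter_mut() {
            *entry = entry.wrapping_add(amount);
        }
    }
    fn at(&self, symbol: usize) -> u16 {
        self.cdf[symbol]
    }
}

/// "SIMD-shaped" stand-in: every lane gets the same branch-free update,
/// which is the form a vector add with a mask would take.
#[derive(Default)]
pub struct SimdCdf16 {
    cdf: [u16; 16],
}

impl Cdf16 for SimdCdf16 {
    fn bump(&mut self, symbol: usize, amount: u16) {
        for (lane, entry) in self.cdf.iter_mut().enumerate() {
            let mask = (lane >= symbol) as u16; // 1 for lanes to update, else 0
            *entry = entry.wrapping_add(mask * amount);
        }
    }
    fn at(&self, symbol: usize) -> u16 {
        self.cdf[symbol]
    }
}

// Built with `--features=simd`, the alias points at the SIMD-shaped type...
#[cfg(feature = "simd")]
pub type DefaultCDF16 = SimdCdf16;

// ...otherwise it points at the scalar one. Callers only name `DefaultCDF16`.
#[cfg(not(feature = "simd"))]
pub type DefaultCDF16 = ScalarCdf16;
```

Code that only refers to `DefaultCDF16` needs no changes when the feature is
toggled, since both stand-ins give the same results.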

However, since WASM doesn't support vectorized instructions yet, I believe
LLVM translates these to scalar code, and the result is actually 20% slower on
both Firefox and Chrome than with SIMD disabled.

Compared to the speed of the native binaries in
[https://blogs.dropbox.com/tech/2018/06/building-better-compr...](https://blogs.dropbox.com/tech/2018/06/building-better-compression-together-with-divans/),
the Firefox WASM binary seems to decode DivANS roughly 3x slower than native
without multithreading, at around 100 Mbit/s. Chrome is more than 12x slower,
running at around 20 Mbit/s. I'm a bit surprised that there is such a big
performance gap between the browser and the native binaries.

This may be a reason to look further at the generated wasm and see where the
time is being spent.
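
For anyone wanting to reproduce this kind of comparison on the native side,
here is a rough sketch of turning a timed run into Mbit/s and a slowdown
factor. The helper names are hypothetical and this is not how the numbers
above were measured. It relies on `std::time::Instant`, which as far as I know
is not usable on wasm32-unknown-unknown, so a browser run would need a
different clock.

```rust
use std::time::{Duration, Instant};

/// Convert `bytes` processed in `elapsed` into megabits per second
/// (1 Mbit = 1,000,000 bits).
pub fn mbit_per_sec(bytes: u64, elapsed: Duration) -> f64 {
    (bytes as f64 * 8.0) / 1_000_000.0 / elapsed.as_secs_f64()
}

/// Run `work` once, assuming it processes `bytes` bytes, and report Mbit/s.
pub fn measure<F: FnMut()>(bytes: u64, mut work: F) -> f64 {
    let start = Instant::now();
    work();
    mbit_per_sec(bytes, start.elapsed())
}

/// How many times slower `other` is than `baseline` (both in Mbit/s).
pub fn slowdown(baseline_mbit: f64, other_mbit: f64) -> f64 {
    baseline_mbit / other_mbit
}
```

A single timed run is noisy, so in practice one would likely repeat the
measurement and take the best or median result.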

