
Ask HN: What is the best approach to compressing ProtoBuf streams? Does it still depend completely on the content being serialized, even with the ProtoBuf metadata?

For example: brotli ships with a built-in dictionary optimized for HTTP content. Given that design decision, it doesn't seem like an obvious fit for ProtoBuf, especially for small serialized plain-old-[whatever]-objects in a data access layer.

Poking around a bit so far, http://blosc.org/blosc-in-depth.html looks like it may be the right choice, but I'm not sure about adopting its serialization layer since it isn't as widely cross-platform as ProtoBuf.

If your decision was obvious after reviewing multiple algorithms, you might save me some time by sharing your experience; thanks in advance!
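
For the small-message case specifically, here is a minimal sketch (my own, not taken from the thread) of why compressing each object on its own tends to disappoint, and why compressing the stream as a whole does better. It assumes the Python zstandard bindings (zstd appears in the results further down) and uses placeholder bytes standing in for real serialized ProtoBuf messages:

    import zstandard

    # Placeholder payloads standing in for small serialized ProtoBuf messages.
    messages = [bytes([i % 7]) * 40 + b"example_field_name" for i in range(200)]

    cctx = zstandard.ZstdCompressor(level=19)

    # Compressing each small message on its own: the codec sees too little
    # data per call to exploit repetition, so the ratio is poor.
    per_message_total = sum(len(cctx.compress(m)) for m in messages)

    # Compressing the concatenated stream in one frame lets repeated field
    # names and values shared across messages be exploited.
    whole_stream_total = len(cctx.compress(b"".join(messages)))

    print(per_message_total, "bytes per-message vs", whole_stream_total, "bytes as one stream")

A trained dictionary (zstandard.train_dictionary over representative sample messages) is the usual middle ground when each message still has to be decompressible on its own; it plays roughly the same role as brotli's built-in HTTP dictionary, but tuned to your own payloads.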




Results with geo.protodata (118588 bytes uncompressed) from snappy test data:

11728 bytes - brotli 11
11941 bytes - brotli 10
12056 bytes - lzma
12219 bytes - zstd 22
12314 bytes - brotli 9
12512 bytes - brotli 5
12526 bytes - zstd 15
12809 bytes - zstd 12
12831 bytes - zstd 9
14753 bytes - zopfli
15110 bytes - gzip 9
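
A rough sketch of how a comparison like this can be reproduced, assuming geo.protodata from the snappy test data is available locally and that the brotli and zstandard Python packages are installed (the package names are my assumption; gzip and lzma come from the standard library):

    import gzip
    import lzma

    import brotli      # assumed: pip install brotli
    import zstandard   # assumed: pip install zstandard

    # geo.protodata ships in snappy's testdata directory; adjust the path.
    with open("geo.protodata", "rb") as fh:
        data = fh.read()

    results = {
        "brotli 11": len(brotli.compress(data, quality=11)),
        "lzma": len(lzma.compress(data)),
        "zstd 22": len(zstandard.ZstdCompressor(level=22).compress(data)),
        "gzip 9": len(gzip.compress(data, compresslevel=9)),
    }

    for name, size in sorted(results.items(), key=lambda kv: kv[1]):
        print(f"{size} bytes - {name}")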



