The core idea here is really interesting: running a lossless compression scheme in a heuristic, approximate mode. Instead of looking backwards for identical subsequences, it looks for approximately matching ones.
Unlike tools like pngquant, which use quantization to create palettized images, this hooks directly into the compression stage. I haven't looked at the source yet, but this is an idea that I've been thinking about recently, especially regarding the somewhat obscure farbfeld [1] image format, which relies entirely on external compression.
Zlib in particular is "easy" to do this for, or at least one approach is clear: build up a big prefix tree and then "massage" it by combining subtrees. There are a number of other heuristic ways you could approach this as well.
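To make the "approximate subsequences" idea concrete, here is a rough sketch of a simpler heuristic than the prefix-tree one: a greedy pre-pass that rewrites near-repeats into exact repeats, so that an unmodified zlib finds them as ordinary back-references. It is not taken from the tool under discussion. The function name `approx_lz_preprocess` and its `tolerance`, `window`, and `min_len` parameters are made up for illustration, and the per-byte tolerance assumes 8-bit samples.

```python
import random
import zlib


def approx_lz_preprocess(data: bytes, tolerance: int = 2,
                         window: int = 256, min_len: int = 4) -> bytes:
    """Lossy pre-pass: snap near-repeats onto earlier output so they repeat exactly."""
    out = bytearray(data)
    n = len(data)
    i = 0
    while i < n:
        best_len, best_src = 0, -1
        for src in range(max(0, i - window), i):
            # Keep the copy non-overlapping and inside the buffer.
            limit = min(i - src, n - i)
            length = 0
            # Compare the ORIGINAL bytes against already-finalized output,
            # so the error never accumulates beyond `tolerance` per byte.
            while (length < limit
                   and abs(data[i + length] - out[src + length]) <= tolerance):
                length += 1
            if length > best_len:
                best_len, best_src = length, src
        if best_len >= min_len:
            out[i:i + best_len] = out[best_src:best_src + best_len]
            i += best_len
        else:
            i += 1  # no good approximate match: keep this byte as a literal
    return bytes(out)


# Synthetic input: a 32-byte pattern repeated 20 times with +/-1 noise per byte.
rng = random.Random(0)
pattern = bytes(2 + (i * 37) % 251 for i in range(32))
noisy = bytes(b + rng.randint(-1, 1) for b in pattern * 20)
lossy = approx_lz_preprocess(noisy)

print("original:", len(zlib.compress(noisy, 9)), "bytes compressed")
print("lossy:   ", len(zlib.compress(lossy, 9)), "bytes compressed")
```

The search is brute force (every window offset at every position), which is fine for a demo but far too slow for real images. A real implementation would presumably do the matching inside the compressor's own match finder instead of in a separate pass.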
Bzip is a different beast -- the Burrows–Wheeler transform is sensitive to changes in the byte stream, so it's not easy to poke a byte to make it similar to another byte and then get the delta on the transform itself.
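A toy illustration of that sensitivity, using a naive sorted-rotations BWT (no sentinel byte, and nothing like bzip2's actual block-sorting code). The helper names `bwt` and `count_diffs` are just for this sketch.

```python
def bwt(data: bytes) -> bytes:
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    n = len(data)
    doubled = data + data
    # Rotation i is doubled[i:i + n]; sort rotation start indices by content.
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    # The last byte of rotation i is doubled[i + n - 1].
    return bytes(doubled[i + n - 1] for i in order)


def count_diffs(a: bytes, b: bytes) -> int:
    """Number of positions at which two equal-length byte strings differ."""
    return sum(x != y for x, y in zip(a, b))


before = bwt(b"banana")   # b'nnbaaa'
after = bwt(b"banbna")    # b'nbanba' (one input byte changed)
print(before, after, count_diffs(before, after))  # 4 of 6 output bytes differ
```

Changing a single input byte reorders the rotations, so the effect is spread across the whole output rather than staying local. That is why "nudge a byte and see what it saves" is much harder to reason about here than with an LZ77-style matcher.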
[1] https://tools.suckless.org/farbfeld/