If you can put a bunch of large things together into a small file and then later (lossily) extract those large things out of that smaller file, I'd argue that's compression, yeah. It doesn't really matter whether it was designed as a compression algorithm or not, in my opinion. If anything, this approach could be considered a revolution in lossy image compression, even though there's no real market for that at the moment.

If someone finds a way to reverse a hash, I'd also argue that hashing has now become a form of compression.

I think that across 5 billion images there are more than enough common image features for the average storage cost per image to drop below a single byte. This is a lossy process; it doesn't need a complete copy of the source data, just as an MP3 doesn't contain most of the audio data fed into it.
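
A back-of-the-envelope version of that claim (the figures are my own rough assumptions: a ~4 GB checkpoint and the ~5 billion images of LAION-5B):

```python
# Rough arithmetic: bytes of model weights per training image.
# Both figures below are approximations, not exact values.
checkpoint_bytes = 4 * 1024**3    # ~4 GiB Stable Diffusion checkpoint
training_images = 5_000_000_000   # LAION-5B scale dataset

print(checkpoint_bytes / training_images)  # ~0.86 bytes per image
```

If the model were a literal archive of its training set, each image would have to fit in under a byte, which is only plausible because the process is lossy and the images share so much structure.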

I think the argument that SD revolves around lossy compression is quite an interesting one, even if the original code authors didn't realise that's what they were doing. It's the first good technical argument I've heard, at least.

All of this could have been avoided if the model had been trained on public-domain images instead of random people's copyrighted work. Even if this lawsuit succeeds, I don't think image generation algorithms will be banned. Some AI companies will just have spent a shitton of cash failing to get away with copyright violation, but the technology can still work with art that's either free of copyright or licensed in a way that allows model training.

There is a strong, well-understood connection between deep latent-variable models (e.g. VAEs, diffusion models) and compression.

Many state-of-the-art compression algorithms are in fact based on generative models. But the thing is, the model weights themselves are not the compressed representation.

The trained model is the compression algorithm (or more technically, a component of it... as it needs to be combined with some kind of entropy coding).
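
As a minimal sketch of that division of labor (the `model` here is a hypothetical stand-in, and the entropy coder is replaced by its ideal output size, -log2(p) bits per symbol, rather than a real arithmetic coder):

```python
import math

def ideal_code_length(symbols, model):
    # Bits an entropy coder (e.g. arithmetic coding) would need if it
    # were driven by `model`. `model(prefix)` returns a dict mapping
    # each possible next symbol to its predicted probability.
    total_bits = 0.0
    for i, sym in enumerate(symbols):
        p = model(symbols[:i])[sym]   # the model predicts, the coder consumes
        total_bits += -math.log2(p)   # Shannon code length of this symbol
    return total_bits

# Toy stand-in: a uniform model over bytes gives the incompressible
# baseline of 8 bits per symbol. A trained generative model that
# concentrates probability on likely continuations does much better.
uniform = lambda prefix: {b: 1 / 256 for b in range(256)}
print(ideal_code_length(list(b"hello"), uniform))  # 40.0 bits
```

The weights never appear in the output; they only shape the probabilities, which is exactly the sense in which the model is the algorithm rather than the archive.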

You could use Stable Diffusion to compress and store the training data if you wanted, but nobody is doing that.
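
For illustration, here's a sketch of what that could look like using just the VAE half of Stable Diffusion as a lossy codec; the checkpoint name and the diffusers usage are my assumptions, not something from the thread:

```python
# Sketch: using Stable Diffusion's VAE as a lossy image codec.
# Assumes the `diffusers` library and a public VAE checkpoint.
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
vae.eval()

@torch.no_grad()
def compress(image):
    # image: float tensor of shape (1, 3, H, W), scaled to [-1, 1].
    # The "compressed file" is the 4-channel latent at 1/8 resolution.
    return vae.encode(image).latent_dist.mean

@torch.no_grad()
def decompress(latent):
    # Lossy reconstruction; fine detail is resynthesized, not recovered.
    return vae.decode(latent).sample
```

At 512x512 that latent holds 48x fewer values than the pixels, and a real codec would still need to quantize and entropy-code it; the point is only that the machinery can act as a codec, not that anyone ships it as one.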


I still would not call the diffusion process a form of compression. The reason is that, as a whole, these models don't aim to exactly replicate their dataset; if they did, that would be overfitting, which is a failure of the model (as another commenter said). Generally, these models can almost never be coaxed into giving their original data back. To really be considered a form of compression, you'd have to make it easier to do that. Technically, you can do it (e.g. by describing a very specific scene in a very specific style), but at that point you're basically just giving detailed instructions on what to do. If I told a human to paint a very specific picture and gave them extremely detailed steps, that would not be considered compression; that would just be them knowing how existing art patterns work and using that knowledge to follow my instructions. In general, I don't think it should be considered compression, because the results are almost always novel and it's extremely hard to get anything even close to the original dataset.
