Hacker News new | past | comments | ask | show | jobs | submit login

Still waiting for a competitive diffusion llm





So I can't find that paper that was posted on HN that said that, when viewed under the right theoretical framework, asserts that diffusion and transformers are doing the same thing under a different basis.. am I misrembering something?


Why?

Diffusion works significantly better for images than sequential pixel generation, there is a good chance it would work better for language as well.

Sequential generation used to be state of the art in 2016 and it's basically how current LLMs work:

https://arxiv.org/abs/1601.06759


Neural LMs used to be based on recurrent architectures until the Transformer came along. That architecture is not recursive.

I am not sure that a diffusion approach is all that suitable for generating language. Word are much more discrete than pixels.


I meant sequential generation, I didn't mean using an RNN.

Diffusion doesn't work on pixels directly either, it works on a latent representation.


All NNs work on latent representations.

The contrast here is real: there are pixel space diffusion models and latent space diffusion models. Pixel space diffusion is slower because there's more redundant information.

The most popular method using autoregression in image generation space is to predict image patches/tokens and not pixels, though that still scales worse than diffusion.

A fairly new but promising approach for autoregression that seems to scale as well as diffusion is predicting the next image scale/resolution rather than the next image patch.

https://arxiv.org/abs/2404.02905


I had similar thoughts to you.

However diffusion models suck at details, like how many fingers on a hand, and with language words and characters matter, both which ones and where they are.

So while I'm sure diffusion could produce walls of text that look convincingly like a blog post at a glance say, I'm not sure it would hold up to anyone actually reading.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: