Speeding up FLUX.1[dev]: A Comparison between torch.compile, TensorRT, and Pruna (pruna.ai)
1 point by bertrand_charp 1 day ago | 1 comment





At Pruna AI, we are huge fans of the FLUX.1[dev] model from the incredible team at Black Forest Labs.

It’s the highest-ranked open-weights model on the Artificial Analysis text-to-image leaderboard.

With over 8,500 likes, it’s also the most popular model on Hugging Face.

However, with 12B parameters, generating a single high-resolution image can take up to 30 seconds even on modern hardware, which is too slow for many applications. Fortunately, there are ways to reduce inference time. In this post, we compare three such approaches: torch.compile, TensorRT, and our own Pruna optimization engine. The summary of this comparison is presented in the table below.



