
I don't follow this stuff very closely - is there any open-source model for text generation that outclasses GPT-3? Stable Diffusion has been out for barely a week and already seems like the clear winner. It doesn't seem like any of the open (as in actually open) text models have made as much of a splash.

Of course maybe it's just because text is less visually impressive than images.




There are some open models as good as the initial GPT-3 (which wasn't hard), but whatever OpenAI did to create InstructGPT hasn't been reproduced as far as I know, and that's the first model to really seem magical.
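
What they did is at least described in the InstructGPT paper (Ouyang et al., 2022): supervised fine-tuning on human demonstrations, then a reward model trained on pairwise human preferences, then PPO against that reward model. The reward-model step boils down to a ranking loss, roughly like this (a minimal PyTorch sketch, not OpenAI's code; reward_model_loss is my own naming):

    import torch
    import torch.nn.functional as F

    def reward_model_loss(r_chosen, r_rejected):
        # Train the reward model so human-preferred completions (r_chosen)
        # score higher than rejected ones (r_rejected) for the same prompt:
        # loss = -log sigmoid(r_chosen - r_rejected), per Ouyang et al. 2022.
        return -F.logsigmoid(r_chosen - r_rejected).mean()

    # Toy usage with scalar reward-model outputs for two comparison pairs:
    loss = reward_model_loss(torch.tensor([1.2, 0.3]), torch.tensor([0.4, 0.9]))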


They’re just harder to run on your own resources, since large language models are very large. BLOOM was released a month ago, is likely better than GPT-3 in quality, and requires 8 A100s for inference, which pretty much no one has on their desk.
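
Quick back-of-the-envelope on why, assuming the published 176B parameter count and bfloat16 weights (activations and the attention cache are extra, so this is a lower bound):

    # Weights-only lower bound for BLOOM-176B inference memory.
    params = 176e9          # published parameter count
    bytes_per_param = 2     # bfloat16
    weights_gb = params * bytes_per_param / 1e9
    a100_gb = 80            # largest A100 variant
    print(f"{weights_gb:.0f} GB of weights -> >= {weights_gb / a100_gb:.1f} A100s")
    # 352 GB of weights -> >= 4.4 A100s; 8 leaves headroom for activations
    # and an even tensor-parallel split.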


Can anyone confirm whether BLOOM is better than GPT-3 at instruction following? I think I read somewhere that it's not as well behaved.


GPT-3 was fine-tuned after release to be better at following instructions. I don’t think that’s been done for BLOOM.

BLOOM incorporates some newer ideas like ALiBi (attention with linear biases), which might make it better in a more general sense. They haven't released official evaluation numbers yet, though, so we'll have to see.
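
For reference, ALiBi drops learned positional embeddings and instead adds a fixed, per-head linear penalty on query-key distance to the attention scores, which is what lets it extrapolate to longer sequences. A minimal sketch of the bias construction (PyTorch; slopes follow the geometric sequence from the ALiBi paper by Press et al., not BLOOM's exact code):

    import torch

    def alibi_bias(n_heads, seq_len):
        # Per-head slopes: geometric sequence 2^(-8/n), 2^(-16/n), ..., 2^-8.
        slopes = torch.tensor([2 ** (-8.0 * (h + 1) / n_heads)
                               for h in range(n_heads)])
        pos = torch.arange(seq_len)
        # distance[i, j] = j - i: zero on the diagonal, negative for past keys;
        # future positions are clamped to 0 and removed later by the causal mask.
        distance = (pos[None, :] - pos[:, None]).clamp(max=0)
        # Bias grows more negative the farther a key sits behind the query.
        return slopes[:, None, None] * distance[None, :, :]

    # Inside attention: scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
    #                   scores = scores + alibi_bias(n_heads, seq_len)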


That makes sense, I didn't consider that angle. Thanks for the info.



