From my comparison tests focusing on prompt adherence, I would agree 4o edges out Imagen3 as long as speed is not a concern.

https://genai-showdown.specr.net

If Imagen3 had the multimodal features that 4o has, it would certainly be closer to 4o; being able to edit an image by instruction (InstructPix2Pix style) is incredibly powerful.

It's crazy how far GenAI for imagery has come. Just a few short years ago, you would have struggled to get three colored cubes stacked on top of each other in a specific order, SHRDLU-style. Now? You can prompt for a specific four-pane comic strip and have it reasonably follow your directives.