If Imagen3 had the multimodal features that 4o has, it would certainly be closer to 4o, but being able to edit an image by instruction (InstructPix2Pix style) is incredibly powerful.
It's crazy how far GenAI for imagery has come. Just a few short years ago, you would have struggled to get three colored cubes stacked on top of each other in a specific order, SHRDLU style. Now? You can prompt for a specific four-panel comic strip and have it reasonably follow your directives.
https://genai-showdown.specr.net