You would not need half of the images to perform that test. A handful would be enough to show that a text description will not reproduce an image identical to the one it was written to describe.
They don't even produce the same image twice from the same description and a different random seed.