At this point the watermark can be meaningful content, like "every time there's a bird there is also a cirrus cloud", or "blades of grass lean slightly further to the left than they would in a natural distribution".
Because this meaningful content is exactly what we care about in the image, the watermark becomes much harder to scrub without degrading the image itself.
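To make that concrete, here's a minimal sketch of how such a statistical watermark could be detected. Everything here is hypothetical: the "grass lean angle" feature, the bias size, and the noise level are made up for illustration; a real detector would need an actual feature extractor over the image.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical feature: lean angle (degrees) of each detected grass blade.
# Assume natural images have mean lean 0; the watermarking generator nudges
# it slightly left (negative). Both numbers are invented for this sketch.
natural_mean = 0.0
watermark_shift = -0.5   # tiny per-blade bias injected by the generator
angles = rng.normal(natural_mean + watermark_shift, 5.0, size=2000)

# One-sample t-test: does the observed mean lean deviate from natural?
t, p = stats.ttest_1samp(angles, popmean=natural_mean)
print(f"mean lean = {angles.mean():+.3f} deg, p = {p:.2e}")
```

A bias this small is invisible on any single blade, but across thousands of samples the test flags it with high confidence. And since the signal lives in the semantic content itself, removing it means re-synthesizing that content, not just filtering pixels.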
That would be indistinguishable from a model that was also trained on that output, wouldn't it?
It seems much more likely that it's their solution for detecting AI-generated images and filtering them out of their training corpus - a kind of latent "robots.txt".