It's quite unlikely that OpenAI managed to avoid breaking any TOS, given all the data they used to train their models.
Not just OpenAI but all companies that are developing LLMs.
IMO, it would look bad for OpenAI to push this story hard. It would look like they're losing their technological edge and are now looking for other ways to stay on top.
Similar to how a patent contract becomes void when the patent expires, regardless of what the terms of the contract say, it's not clear to me that OpenAI can enforce a contract provision covering API output they own no copyright in.
Since they have no intellectual property rights in the output, it's not clear to me they have a cause of action to sue over how the output is used.
I wonder if any lawyers have written about this topic.
How many thousands or millions of contracts has OpenAI breached by scraping data from websites whose terms of service explicitly prohibit scraping?