In an age of crisis in reproducible research, publications without data and models aren't really "open". Having read many ML papers, I've found very few are easily reproducible.
How would you explain the dozens of other open-source computer vision and language models that didn't generate all those harms, even ones trained to recreate the exact models OpenAI withheld over those concerns?
The open source models can and do generate those harms.
The harms themselves are probably overblown. There are plenty of deepfakes of various celebrities. Mostly people can tell the difference, or they just don't care.
I think the reality is that training these models and paying ML engineers is incredibly expensive. That's not a good fit for the open-source model, which is why OpenAI had to convert to SaaS.
This is the problem with all advanced technology: it has to have control built in. Think of all the bad things you could do with self-driving cars, for example. If we ever make interstellar travel possible, the amount of energy involved could destroy worlds. It's a very sad thing about the future.
In a way, censorship seeks to make the Human AI "safe".
https://arxiv.org/abs/2102.12092
Their code/models are indeed closed, but there is no realistic alternative.
If they let the public have unrestricted access, deepfakes and child abuse imagery would appear on Day 1, and OpenAI would get cancelled.
For OpenAI to survive, it has to be closed source.