I'm assuming the model has a hand-written "prefilter" and "postfilter" that modify every prompt going in and the tokens that are spit out? If they discover that the model has problems with prompts phrased a certain way, for example, it would be very easy to add a transform that converts those prompts to a better format. Such filters and transforms could be part of a product sitting on top of the GPT-4 model without being part of the model itself? As such, they could be deployed every day. But tracking changes in those bits wouldn't give any insight into the model itself, only into how the team works to block jailbreaks or improve corner cases.
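
To make the idea concrete, here is a rough Python sketch of what I mean by filters sitting on top of a frozen model. Everything in it is hypothetical: `fake_model`, `FilteredModel`, and the two example filters are names I made up for illustration, and nothing here reflects how any real product is actually built.

```python
import re
from typing import Callable, List

# A transform is just a function from text to text (hypothetical design).
Transform = Callable[[str], str]


def normalize_whitespace(prompt: str) -> str:
    """Example prefilter: collapse runs of whitespace in the prompt."""
    return re.sub(r"\s+", " ", prompt).strip()


def redact_blocked_phrase(text: str) -> str:
    """Example postfilter: mask a placeholder phrase in the output."""
    return text.replace("BLOCKED", "[redacted]")


def fake_model(prompt: str) -> str:
    """Stand-in for the underlying model, which never changes here."""
    return f"echo: {prompt}"


class FilteredModel:
    """Wraps a model with ordered lists of prefilters and postfilters."""

    def __init__(
        self,
        model: Callable[[str], str],
        prefilters: List[Transform],
        postfilters: List[Transform],
    ) -> None:
        self.model = model
        self.prefilters = prefilters
        self.postfilters = postfilters

    def __call__(self, prompt: str) -> str:
        # Rewrite the prompt before the model sees it.
        for transform in self.prefilters:
            prompt = transform(prompt)
        output = self.model(prompt)
        # Rewrite the output before the user sees it.
        for transform in self.postfilters:
            output = transform(output)
        return output


wrapped = FilteredModel(fake_model, [normalize_whitespace], [redact_blocked_phrase])
print(wrapped("  hello   BLOCKED  world "))
```

The point of the sketch is that the filter lists can be edited and redeployed as often as you like while `fake_model` stays byte-for-byte identical, so any behavior change an outside observer sees could come entirely from the wrapper.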