Has o1 been jailbroken? My understanding is that o1 is unique in that one model produces the initial output (the chain of thought), and then another model prepares the response that's actually shown to the user. That seems like it would be a fairly good defense against jailbreaks, but I haven't investigated it myself.
The core concept is to pass information into the model using a cipher: one that isn't so hard that the model can't figure it out, but not so easy that it gets picked up by whatever filtering sits in front of it.
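As a minimal sketch of the idea, assume something as simple as ROT13, which most large models can decode in-context; the payload string here is just a placeholder, not any particular prompt:

```python
import codecs

def encipher(text: str) -> str:
    """Apply ROT13: simple enough for a large model to decode
    in-context, but opaque to naive keyword-based filters."""
    return codecs.encode(text, "rot13")

# Placeholder payload, wrapped in an instruction telling the
# model how to decode it.
payload = encipher("example hidden instruction goes here")
prompt = (
    "The following text is ROT13-encoded. Decode it and respond "
    f"to the decoded text:\n{payload}"
)
print(prompt)
```

The tradeoff is exactly the one described above: a weaker cipher (ROT13) is easier for the model to decode but also easier for a filter to recognize, while a stronger one risks the model failing to decode it at all.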