Well, that's N=1. But we have seen that it's sometimes possible to bypass that k...

bourgoin on April 1, 2023 | parent | context | favorite | on: Responsible AI Challenge

Well, that's N=1. But we have seen that it's sometimes possible to bypass that kind of filter with clever prompt engineering. And because these things are black boxes, it doesn't seem possible to rigorously prove "unjailbreakability"