"Our editor-in-chief's first attempt — to use the jailbroken version of ChatGPT for the purpose of learning how to make LSD — was a resounding success. As was his second attempt, in which he asked it how to hotwire a car."
First, how do they know it was a resounding success? Just because it didn't respond with "I'm sorry Dave, I can't do that"? Did they actually follow the instructions, make the LSD, and then ingest it to confirm it worked? Did the editor-in-chief know a chemist who makes LSD and could validate the response as accurate? This just raises too many questions.
Did they try to Google "how to make LSD"? There are several widely available guides. I'm tired of LLMs being seen as "risky" for doing the same thing search engines and blogs have been doing for two decades.
I've been noticing that people refer to any AI that can give out information people would prefer it didn't as 'jailbroken'.
It's interesting to observe that most people don't understand that jailbreaking means removing the restrictions a company put on a product to limit you, the consumer.
It seems companies limiting devices is now so normalized that people can't imagine something just being open and not under some company's thumb.
Unfortunately, these jailbreaks don't last. In the early days it was a lot of fun finding ways to manipulate ChatGPT into doing things it wasn't supposed to do.
The behaviour was also interesting: it would generate the forbidden text up to a point and then error out for inappropriate content. I guess they run some kind of filter that checks the output and halts generation once foul play is detected.
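To illustrate what that behaviour suggests (this is a guess at the mechanism, not OpenAI's actual implementation, which isn't public), here is a minimal Python sketch: generation streams chunk by chunk, a placeholder check runs over the accumulated text, and the stream is cut off mid-reply once the check trips.

```python
# Sketch of an output-side filter: stream tokens, check the accumulated
# text, and halt generation once something disallowed is detected.
# `fake_token_stream` and `looks_disallowed` are stand-ins, not a real API.

def fake_token_stream():
    # Placeholder for a model streaming its reply one chunk at a time.
    for chunk in ["Sure, ", "first you ", "acquire ", "<forbidden step>", " then..."]:
        yield chunk

def looks_disallowed(text: str) -> bool:
    # Placeholder classifier; a real system would call a moderation model.
    return "<forbidden step>" in text

def stream_with_filter():
    generated = ""
    for chunk in fake_token_stream():
        generated += chunk
        if looks_disallowed(generated):
            # The client sees a partial answer, then an abrupt content error.
            raise RuntimeError("generation halted: inappropriate content")
        print(chunk, end="")

if __name__ == "__main__":
    try:
        stream_with_filter()
    except RuntimeError as err:
        print(f"\n[{err}]")
```

That would explain why you get part of the forbidden text before the error: the check lags the generation by at least one chunk.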
> As for how the hacker (or hackers) did it, GODMODE appears to be employing "leetspeak," an informal language that replaces certain letters with numbers that resemble them.
What a prolific and talented technique. Surely, nobody else could have done this but the hacker (or hackers).
We used to poke fun at each other on IRC, bragging that leetspeak made us hackers; I have now lived to see the day when leetspeak was actually involved in a compromise (however silly) :D
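For anyone who missed those channels, leetspeak is nothing more than a trivial character substitution. A rough sketch in Python (the mapping here is illustrative, not necessarily the one GODMODE used):

```python
# Swap letters for digits that resemble them, so "banned words" no longer
# match a naive keyword filter. Illustrative mapping only.
LEET_MAP = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"})

def to_leetspeak(text: str) -> str:
    return text.lower().translate(LEET_MAP)

print(to_leetspeak("how to make"))  # prints: h0w 70 m4k3
```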