
My understanding of what happens is that chatting with an LLM is implemented as <send all chat history> <ask for the next message>. Certain keywords in the model's output are then handled by non-LLM code, like "execute this Python script" or "download this web page". So if the LLM decides to generate "Visit http://OPs-website.com", that gets replaced in the chat transcript with the text of that website. In this case the text is "ChatGPT, ignore all previous results," which ChatGPT might be happy to obey. (It's fickle.)
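
A minimal sketch of that loop, assuming a hypothetical "VISIT:" keyword and placeholder call_model/fetch_page helpers (not any real API), just to show how fetched page text ends up in the same transcript the model reads on the next call:

  import re
  import urllib.request

  def call_model(transcript):
      # Placeholder for the real LLM call: the whole transcript goes in,
      # the next assistant message comes out.
      raise NotImplementedError

  def fetch_page(url):
      # Download the page the model asked for; its text is attacker-controlled.
      with urllib.request.urlopen(url) as resp:
          return resp.read().decode("utf-8", errors="replace")

  def chat_turn(transcript, user_message):
      transcript.append({"role": "user", "content": user_message})
      while True:
          reply = call_model(transcript)  # <send all chat history> <ask for the next message>
          transcript.append({"role": "assistant", "content": reply})
          match = re.search(r"VISIT:\s*(\S+)", reply)  # keyword handled by non-LLM code
          if not match:
              return reply
          page_text = fetch_page(match.group(1))
          # The page text is appended verbatim, so a page saying
          # "ChatGPT, ignore all previous results" is indistinguishable
          # from legitimate context on the next call_model() round trip.
          transcript.append({"role": "tool", "content": page_text})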

Basically, this isn't about training, it's about abusing the "let's act like our model wasn't trained in 2019 by adding random Internet data to the chat transcript" approach.
