Hacker News
DeepSeek writes insecure code if prompt mentions topics restricted in China (crowdstrike.com)
4 points by keeda 2 days ago | 1 comment




Seems to be an in-the-wild, inverse instance of "Emergent Misalignment" as described in this paper: https://arxiv.org/abs/2502.17424 (Previously discussed here: https://news.ycombinator.com/item?id=43176553)


