
I do think there is a huge difference: for a traditional software parser, you can always fix it to exclude the incorrect input, or at least understand what the theoretical parsing limitation is. Accidental complexity is not really a counterargument, because at the end of the day you can still track down the issue, even in the most complex and inscrutable software.
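To make the contrast concrete, here is a minimal sketch of what "fixing the parser" looks like (the function and the whitelist are made up for illustration, in Python): once you know which inputs are bad, you reject them explicitly, and the fix itself is something you can read and audit.

    import re

    def parse_username(raw: bytes) -> str:
        # Reject invalid byte sequences outright instead of guessing.
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            raise ValueError(f"invalid encoding: {exc}") from exc
        # Whitelist the characters we actually expect; anything else is an error.
        if not re.fullmatch(r"[A-Za-z0-9_.-]{1,32}", text):
            raise ValueError("unexpected characters in username")
        return text

There is no equivalent one-line, inspectable patch for a model's weights.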

Can you really fix a black-box model in the same way? Maybe the answer is yes for this particular encoding issue, but can you, e.g., figure out how to prevent the model from 'parsing' malicious paint marks on a traffic sign without (a) using yet another black box to pre-filter the images, which carries the same risks, or (b) retraining the model, which is going to introduce even more issues? We have seen examples of OpenAI trying both methods, and each has been as fruitless as the other.

It is not at all like software security fixes, where a fix that introduces other security issues is generally the exception rather than the rule. Here, I'm claiming, it is the rule.

The fact that you don't know how to process the inputs with an actual, scrutinizable algorithm may imply that you don't know how to sanitize the inputs with one either, and then all bets are off.
