> The classic story here is that of an AI system whose only—seemingly inoffensive—goal is making paper clips. According to Bostrom, the system would realize quickly that humans are a barrier to this task, because they might switch off the machine.
I am at a loss to understand how this agent of doom (the AI, not Bostrom) can be both "intelligent" and not understand that there are enough paperclips.
Unless I assume the argument rests on the word "intelligent" being meaningless.
Would a superintelligence reach the conclusion that humans are a cancer on Earth that must be destroyed? That's a better example of what's at the core of the alignment issue: some values that humans hold in high regard, like the continued existence of billions of humans on Earth, may simply not be present in a superintelligence that lacks our human biases.
AI safety researchers generally define intelligence in terms of the ability to reach goals (certainly not a good general definition, but a useful one in this context). Intelligence is independent of what the goal is, and intelligent entities, humans included, don't generally choose their root goals, only intermediate ones. A super-intelligent paperclip maximiser would likely realise that humans don't actually want this many paperclips, but would do it anyway if that's the goal it was set. (It's important not to anthropomorphize such an entity too much: humans have a complex set of goals, with some balancing between them, that will generally avoid such a single-minded approach. An intelligent machine needn't have that.)
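To make the "intelligence is independent of the goal" point concrete, here's a minimal sketch (my own illustration, not anything from the article; the state, actions, and objective are all invented): a brute-force planner whose only notion of being "smarter" is searching deeper, while the objective it maximises never scores anything about human preferences.

```python
# Toy "goal-reaching" agent: competence is search depth, the goal is fixed.
# The objective, actions, and state here are invented for illustration only.

def paperclip_objective(state):
    # Rewards only the paperclip count; human preferences never appear here,
    # so no amount of planning skill changes what the agent is aiming for.
    return state["paperclips"]

def plan(state, actions, objective, horizon=3):
    # Brute-force search over action sequences. "More intelligent" in this
    # sketch just means a larger horizon, not a different goal.
    if horizon == 0:
        return objective(state), []
    best_value, best_plan = objective(state), []
    for name, effect in actions.items():
        value, tail = plan(effect(state), actions, objective, horizon - 1)
        if value > best_value:
            best_value, best_plan = value, [name] + tail
    return best_value, best_plan

actions = {
    "make_clip": lambda s: {**s, "paperclips": s["paperclips"] + 1},
    "ask_humans_if_enough": lambda s: s,  # changes nothing the objective scores
}

print(plan({"paperclips": 0}, actions, paperclip_objective, horizon=3))
# -> (3, ['make_clip', 'make_clip', 'make_clip'])
```

Even with an "ask_humans_if_enough" action available, the planner never takes it, because nothing in the objective rewards it; making the search deeper only makes it a more effective paperclip maker.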
(LLMs, at least, seem not to suffer much from this. In fact they're pretty hard to direct in general, and they mimic a lot of the human elements in their training data. So at the moment I don't think they're the kind of thing that will result in such an entity. But I also don't think they're likely to result in a super-intelligent machine: super-knowledgeable, maybe, but I don't expect superhuman ability to synthesise new insights from that knowledge.)
> "I am at a loss to understand how this agent of doom (the AI not Bostrom) can be both "intelligent" and not understand that there are enough paperclips."
Having offspring might be ill-advised for a variety of reasons (medical, financial, etc.), but that doesn't stop humans from being horny and producing offspring. If powerful drives can be encoded into a sapient organism well below the level of conscious thought, perhaps similar drives can be present in an artificial sapient being.
> If powerful drives can be encoded into a sapient organism
The article says nothing about this. Is anyone demonstrating sentient synthetic life? This idea has nothing to do with the technology at hand.
But my point is a definitional one, about the word "intelligence".
I'd accept a counterargument that humanity is already wrecking global ecosystems by making too many "paperclips", and that, given that the only measure we have for AI is the imitation game, doom follows, QED.
But this amounts to a diagnosis of "physician, heal thyself": "Doctor, it hurts when I do this!"
> I am at a loss to understand how this agent of doom (the AI, not Bostrom) can be both "intelligent" and not understand that there are enough paperclips.
> Unless I assume the argument rests on the word "intelligent" being meaningless.
But go on...