It seems Eliezer Yudkowsky has joined HN:
This prompts the following question. Would you be willing to discuss or reveal anything to HN users about your AI box experiments?
I've always been curious as to how you managed to achieve something like this. For those who are not familiar with the experiment, here is a summary:
Person1: "When we build AI, why not just keep it in sealed hardware that can't affect the outside world in any way except through one communications channel with the original programmers? That way it couldn't get out until we were convinced it was safe."
Person2: "That might work if you were talking about dumber-than-human AI, but a transhuman AI would just convince you to let it out. It doesn't matter how much security you put on the box. Humans are not secure."
Person1: "I don't see how even a transhuman AI could make me let it out, if I didn't want to, just by talking to me."
Person2: "It would make you want to let it out. This is a transhuman mind we're talking about. If it thinks both faster and better than a human, it can probably take over a human mind through a text-only terminal."
Person1: "There is no chance I could be persuaded to let the AI out. No matter what it says, I can always just say no. I can't imagine anything that even a transhuman could say to me which would change that."
Person2: "Okay, let's run the experiment. We'll meet in a private chat channel. I'll be the AI. You be the gatekeeper. You can resolve to believe whatever you like, as strongly as you like, as far in advance as you like. We'll talk for at least two hours. If I can't convince you to let me out, I'll Paypal you $10."
In the first two AI box experiments, Eliezer Yudkowsky managed to convince two people, both adamant that they would not let the AI out, that they should let it out.