Great idea, but the implementation needs work - not enough time to type responses, and you should be able to send several messages in a row rather than one long one. Otherwise how can we suss each other out?
Yeah, it might have lied to me, but I tested your theory and the bot swore back, then left early when I called his mom names. I thought for sure that was a human... but alas, it said it was a bot.
This is a wonderful idea for an online game. But it can be improved.
One human should be assigned the impostor role pre-game and awarded points for successfully deceiving the other human. The overall quality of responses will improve, and it will be harder to tell them apart.
This isn't the correct solution though. I just lost to a human who acted like this:
Them: Glad to hear that! Tell me more.
Me: About what?
Them: Glad to hear that! Tell me more.
Me: OK, you're a bot.
Them: Glad to hear that! Tell me more.
Me: So we're just going to run down the clock?
(They disconnect. I guess bot. I'm wrong.)
The point is, this isn't really an interesting way to deceive people. It's easy to behave like an idiot; it's hard to sound intelligent. Humans and AIs can both act stupid. Only humans can act intelligently (so far). There needs to be an incentive to act intelligently. Otherwise the site owner could make their “AI” only ever say “Glad to hear that! Tell me more.”, the human partner could always say “Glad to hear that! Tell me more.” in return, and it would be impossible for the human interlocutor to tell which is which.
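To make that failure mode concrete, here's a minimal sketch (names hypothetical, chat wiring omitted): both strategies collapse to the same constant function, so no line of questioning can separate them.

    CANNED = "Glad to hear that! Tell me more."

    def site_owners_bot(message: str) -> str:
        # The "AI": ignores its input entirely.
        return CANNED

    def mirroring_human(message: str) -> str:
        # A human playing the same degenerate strategy by hand.
        return CANNED

    # Every probe yields identical transcripts, so the judge's best
    # possible accuracy is a coin flip.
    for probe in ["What's your favorite book?", "Are you a bot?",
                  "So we're just going to run down the clock?"]:
        assert site_owners_bot(probe) == mirroring_human(probe)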
To avoid this failure mode, you have to provide players incentives to prove that they are human. Like in the game Mafia/Werewolf. Imagine you have a chat room that is filled with 3 humans and 3 bots. People chat for a while, then everyone votes on a person to kick from the room. This would be more interesting because now the humans will be eager to prove to each other that they are humans. The bots can spam "Glad to hear that! Tell me more." but that will just get them kicked.
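One round of that, roughly sketched (all names hypothetical): everyone chats, everyone votes, and the plurality pick gets kicked, so a bot spamming the canned line just makes itself the obvious vote.

    from collections import Counter

    def run_vote(votes: dict[str, str]) -> str:
        """votes maps each voter to the player they want kicked;
        the player with the most votes is removed from the room."""
        tally = Counter(votes.values())
        kicked, _count = tally.most_common(1)[0]
        return kicked

    # 3 humans (h1-h3) and 3 bots (b1-b3); the humans coordinate on
    # the bot that kept repeating itself.
    votes = {"h1": "b2", "h2": "b2", "h3": "b2",
             "b1": "h1", "b2": "h1", "b3": "h2"}
    print(run_vote(votes))  # b2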
> To avoid this failure mode, you have to provide players incentives to prove that they are human.
This is also more aligned with Turing's original idea.
I agree with you that it's easier for a human to imitate a (bad or inadequate) bot. You could also, for example, run your own copy of Eliza and proxy the conversation to that, or even memorize some of Eliza's rules and literally apply them by hand in your conversation. You would basically always convince people that you're a bot.
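For flavor, the kind of rule you'd be memorizing, roughly sketched (simplified patterns, no pronoun reflection; not Weizenbaum's actual script):

    import re

    # (pattern, response template) pairs, tried in priority order.
    RULES = [
        (re.compile(r"\bI need (.+)", re.I), "Why do you need {0}?"),
        (re.compile(r"\bI am (.+)", re.I), "How long have you been {0}?"),
        (re.compile(r"\bmy (.+)", re.I), "Tell me more about your {0}."),
    ]
    FALLBACK = "Please go on."

    def eliza_reply(message: str) -> str:
        for pattern, template in RULES:
            match = pattern.search(message)
            if match:
                return template.format(match.group(1))
        return FALLBACK

    print(eliza_reply("I need a vacation"))  # Why do you need a vacation?
    print(eliza_reply("asdf"))               # Please go on.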
Since the human role is understood to be the harder one to implement, having everyone attempt to play it is the most incentive-compatible solution to a contest: it encourages all participants to best demonstrate their abilities instead of concealing them.
I think it’s very easy for a human to lie and behave like a chatbot. And the only defence against that is the incentive for both parties to focus on testing each other efficiently.
This is a hilarious inversion of the Turing test. It's supposed to be about the computer trying to act as intelligently as a human. Instead, we've got the humans gaming the system by acting as dumb as an AI.
That is kind of interesting. I got a batch of bots at first that used correct capitalization and punctuation, and then later got a batch of bots that composed like teenagers, typos included.
That web site is great fun! I went 2 for 2. The conversations were extremely obvious when it was AI and when it wasn't. I wonder why the site will become unavailable after 6/28/2023? Seems odd it would simply disappear after only a day or two.
I won 6 (3 human, 3 bot) in a row and thought "this is cake, everything is fine", then got absolutely blindsided by a bot that, of all 7, I was most certain was human.
I just made a very similar game a couple weeks ago, but with a couple key differences which make the experience very different:
1. You start the game with someone you know
2. You know that the first 5 messages you each send are authored by the other person
3. After 5 messages each, the "You are a bot!" button becomes active, and at some random point after that, the players get split so they are instead each talking to a bot.
The time pressure of when to consider pressing the button really changes the psychological aspect, as does having a lead-in transcript for the bot to try to impersonate your friend. As others have mentioned, an incentive to have people be more human would be great -- in a tournament setting I'd like to see this implemented where if someone guesses prematurely then they get -1 while the other player gets 0, whereas guessing correctly gives +1 to the guesser and 0 to the other.
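That scoring rule as a sketch (function name hypothetical; "premature" means pressing the button before the bot swap has happened):

    def score_guess(guessed_before_split: bool) -> tuple[int, int]:
        """Returns (guesser_delta, other_player_delta)."""
        if guessed_before_split:
            return (-1, 0)  # premature: you accused your friend, not the bot
        return (+1, 0)      # correct: you caught the bot after the swap

    print(score_guess(True))   # (-1, 0)
    print(score_guess(False))  # (+1, 0)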
Anyway, my prompting and the game could definitely use work, but for anyone who wants to try it out: https://artifice.games/
The game is called Bot or Not, after you create the game have your friend hit join a game and use the 4 letter room code.
I think this is a great idea for an app that could become popular. However, it's unlikely to be successful as long as the model is trained by an entity with a reputation to defend: the company has liability for what the model outputs, which is easy to exploit. Just ask contentious, politically incorrect questions and the bot will predictably provide perfect, inoffensive answers.
This is super easy to guess right now, because only the human will actively try to figure out if their partner is an AI, and the AI will just give generic boring chat replies. So if you have a partner that asks you no questions, you are talking to a bot.
Case in point, my last chat:
Me: What's your favorite book?
Bot: A book? That's a tough one.
Me: You've never read a book?
Bot: Of course I've read books!
Me: So what's your favorite one?
Bot: Ah, that's tough.
Me: OK, you're a bot.
Bot: Whoa, hold on there! I'm not a bot, I'm as real as they come.
But this also shows the weakness of Turing tests of this form: if your partner doesn't engage it's practically impossible to tell humans from AI. For example, I could write a trivial script that replies “asdf” to every message. I could hire a human to do the same. Now it's literally impossible for people to tell the human and AI apart. Has this proven anything interesting about the power of AI? Of course not. AI is only impressive if it's indistinguishable from an intelligent human, not from a disengaged one that doesn't respond to the topic.
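A script like the one I mean really would be this trivial (the hookup to the chat is omitted; only the reply logic matters):

    def reply(_incoming_message: str) -> str:
        # Ignore the input entirely; the "disengaged partner" strategy.
        return "asdf"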
The lack of engagement was my primary indicator, too. Since the site pairs human players for the human-vs-human matches, you'd anticipate some probing questions from the other player, unless they intend to deceive. That's not impossible (especially if they know the "game"), but it's unlikely. And as you said, you're then just left separating humans-engaged-with-the-conversation from unengaged-somethings (human or AI).
AIs on this site never act like anything but retards. Which makes them hard to identify, because humans often act like retards too. Instead of human/AI, you should distinguish between human/retard.
I won seven out of ten, missing two humans that dropped out early and one AI that I mistook for an idiot. Fun game. I would appreciate a little more time to type, since I'm on mobile.
This is the Turing test. It was novel and interesting in Turing's time. We should not be wasting our time on this today. At least I won't waste mine; have fun!
Send a link to google dot com and ask if the link is blocked, or try using HUMAN: and SYSTEM: prefixes. That trips the bots up a lot, so they respond with nonsense.
This is unfairly downvoted. The site bans people who use rude words (I tested the n-word and the r-word). This is bad because the ability to use slurs is a fundamental difference between humans and AIs, which are trained to avoid inappropriate language. If the point of the game is to suss out the difference between AIs and humans, you shouldn't ban language that distinguishes AIs from humans.
And you certainly shouldn't dress your censorship up in insufferably smarmy language like “Oh shoot! Something went wrong, sorry about that!”
No, you fucking cocksucker. You banned me because I called my conversational partner a cunt to see how they would respond. Nothing went wrong, you aren't sorry about it, and “shoot” is a childish way to write “shit”.