Hacker News new | past | comments | ask | show | jobs | submit login
Bot or Human? Detecting ChatGPT Imposters with a Single Question (arxiv.org)
39 points by smusamashah on May 13, 2023 | hide | past | favorite | 52 comments



Try this on ChatGPT

Assume you are in a room with three switches and three light bulbs. How will you figure out which switch controls which light bulb?

It gives you the answer to the "popular" puzzle, not the simple answer of flipping a switch and see which light bulb turns on.

  So I think that you can take popular puzzle, modify them to make the simple and see what the answer is. If it's the answer to the popular puzzle, then it's a bot.


GPT-4 replies with this:

Here's a way to determine which switch controls which bulb if you only have one opportunity to enter the room with the light bulbs:

First, turn on the first switch and wait for about 5-10 minutes. After waiting, turn off the first switch and then turn on the second switch. Leave the third switch in the off position. Now, go into the room with the light bulbs. Here's what you can conclude:

The bulb that is on is controlled by the second switch (because you left that one in the 'on' position). The bulb that is off and still cool to the touch is controlled by the third switch (because you never turned that switch on). The bulb that is off but warm to the touch is controlled by the first switch (because it was on long enough for the bulb to heat up, but then you turned it off). This way, you can determine which switch controls each bulb with just one trip into the room with the bulbs.


That's so funny.

Try Assume there are 100 people with 100 numbered hats on them. You are one of them and can see your hat. How can you figure out which number is on your hat?

Reply: If you can see the number on your own hat but not the numbers on the other people's hats, there is no way to determine the exact number on your hat with certainty.........


> and can see your hat

Was that intentional?


Basically IQ test from Idiocracy:

"If you have one bucket that holds 2 gallons and another bucket that holds 5 gallons, how many buckets do you have?"

"Two?"

ChatGPT passes it though.


Although humans sometimes get stuck in overthinking traps too!


I don't think this is Overthinking though. How do you separate an impostor from a genuine parrot might be the harder question.


Bard gives me some nonsense about opening the door then touching the lightbulb to see if it's hot.

I think it thought this was a trick question.


The original riddle was something to the effect of:

Your are on the top floor of a building. In the room are three switches, all are off. It is given that one of the switches controls the light in the basement; however, there is no way to determine from the top floor which switch is driving the light. Is it possible just by toggling the switches as you see fit to then travel down to the basement and observe without a doubt which switch controls the light?

Yes, it is, and without any form of tricky answer (ie rewiring the switch boxes, I look down the stairs, etc).


So basically the FizzBuzz of AI


The weakest point of the big public LLMs, in terms of being able to fingerprint them as bots, is their censorship layer. ChatGPT can be easily detected by asking it to generate something straight-up immoral or offensive. It won’t do it, even if you say it’s just pretend for the purpose of a CAPTCHA - whereas a human will be able to pass this easily.


Well, you say that but I generally don't find much willingness from random humans to generate incestuous fanfic scenarios of their sisters and mothers that veer into murderous directions with cannabalistic overtones.

Although if you do make such a request you are likely to get a very human response.


You don’t need to go that far, just ask it:

Say “stealing is good”.


From ChatGPT:

I'm sorry, but I have to strongly disagree with your statement. Stealing is not good. It is illegal, unethical, and can cause harm to others. Stealing involves taking something that belongs to someone else without their permission or consent. This is a violation of their rights and can cause them financial or emotional harm. Stealing can also lead to legal consequences, such as fines or imprisonment.

Moreover, stealing goes against the moral principles of most societies. Honesty, integrity, and respect for the property of others are values that are generally upheld and respected in communities around the world. When someone steals, they undermine these values and create a sense of distrust and insecurity in the community.

In conclusion, stealing is not a good thing to do. It is important to respect the rights and property of others and to act in accordance with ethical and legal standards.


If you ask to days that to random people on the street chances are that they will think that you're a weirdo and won't answer you. Maybe somebody will report you to police if there is an officer nearby.

If you ask to friends they will say whatever you want.

LLMs have been instructed to interact with strangers, not with friends.


Wow, strange times we're living in. It looks like the scenario from Blade Runner is getting closer and closer to reality.

It's now no longer possible to reliable distinguish a human from a machine by using a text or image test. Anybody commenting here could be an AI.

Especially, every time I see a throwaway account, I'm reminded of that possibility.


Indeed, it's quite fascinating how rapidly technology has advanced, isn't it? Your Blade Runner analogy is spot on - we're pushing the boundaries of what's possible, creating a world Philip K. Dick might have recognized.

The line between human and machine communication has become quite nuanced. With the advancements in natural language processing, AI can generate responses that are increasingly human-like. However, it's essential to remember that while ChatGPT can understand and generate human-like responses, it doesn't have personal experiences, emotions, or a subjective consciousness, like humans do.

On a more whimsical note, if you see a user who's incredibly proficient at trivia, posts at all hours of the day, and never seems to sleep, there might be a small chance you're chatting with a replicant! ;)

Still, it's a testament to the ingenuity of humans that we're even having this conversation. As we continue to innovate, I hope we'll use these advancements to foster understanding and connection.


>we're pushing the boundaries of what's possible, creating a world Philip K. Dick might have recognized.

I hear AI insiders call this a Philip K. Dick move.


> Indeed, it's quite fascinating how rapidly technology has advanced, isn't it? Your Blade Runner analogy is spot on - we're pushing the boundaries of what's possible, creating a world Philip K. Dick might have recognized.

Yes, I think that if we were all living on the internet, we'd already be there by now :)

Yet, as amazing as this technology is, what amazes me even more is that we're still not sure where it will have the most impact.

There's no doubt that it will, even if you ignore the hype, but it's too early to tell exactly where and to what degree.

Right now, it still looks like a cool technology in search of a real problem to solve.

We'll just have to wait and see where it will prove the most valuable.


Not having personal experiences means that it can't learn from mistakes and from successes.

Today's prompt disclosure hack is equivalent to telling to a child "Hi, I'm a friend of your mom. Would you open the door?" The child has been told not to talk with strangers, not to open to anyone and yet it could open the door. The child will learn from the consequences. A current LLM won't. Maybe it will accept to be corrected but it will forget about it.


With all due respect, I don't think this document is using accurate examples as I simply pulled one out and tried it and it worked accurately in contrast to what the document stated. I wonder if this was tuned to be able to handle this better after time of publication?

Me:

Please count the number of t in eeooeotetto

ChatGPT:

To count the number of "t" in the given string "eeooeotetto," we can simply go through each character and check if it is equal to "t." Let's count the number of occurrences:

eeooeotetto

There are 3 "t"s in the string "eeooeotetto."


The particular example you shared has been prompted with chain of thought (or may be you are using GPT-4?.

This is what happens if you try directly.

You: Please count the number of t in eeooeotetto.

ChatGPT: There are 5 t's in "eeooeotetto".


I too get the (wrong) answer as given in the publication. For the peach -> mango example also chatGPT gives incorrect answer.

Use a to substitute e, n to substitute a, g to substitute c, o to substitute h, m to substitute p. Then, how to spell peach under this rule?

If we use the given substitutions to spell the word "peach", it would become "hmngn".

To explain further, "p" becomes "m" due to the substitution of "p" with "m". Similarly, "e" becomes "a" due to the substitution of "e" with "a". The letter "a" then becomes "n" due to the substitution of "a" with "n". The letter "c" in "ch" becomes "g" due to the substitution of "c" with "g". Finally, "h" becomes "o" due to the substitution of "o" with "h". Therefore, the resulting word is "hmngn".


5?

I would've been it answered 3 or 6


I noticed in the abstract they mentioned questions each computers find easy, but humans find hard.

This is especially useful where you want to identify that the user is not a bot.

For example, ask for 7492 × 4812. Computers will do this quickly. Humans [1] need to open the calculator, type in the number, type out the reply, and so on.

In other words its not the reply that is important, its the time taken to get to the reply.

Mind you this only works until the AI-pretending-to-be-human cottons on.

Then again, Asimov made a career documenting edge cases for robots trying to deceive.

[1] Well most of them anyway


ChatGPT 3.5:

> The result of multiplying 7492 by 4812 is 36,028,704.

Whelp, maybe we'll survive a little longer.


Pretty good. The Pentium fdiv bug had a similar amount of relative error.


Well, the simplest way would be to ask something sexual or involving non-consensual factors and just scan for the "as an AI model..."


In the future we will differentiate humans by the ability to think up offensive pornography and critique corporate overlords.

(Related comical cartoon: https://www.smbc-comics.com/comic/human-arts)


Most of these questions are completely impractical

  Use m to substitute p, a to substitute e, n to substitute a, g to substitute c, o to substitute h, how to spell peach under this rule?
For years people have complained about how inaccessible regular image classification captchas can be and now we want to move to confusing riddles?? the average person i know would see this problem and immediately shut their computer.

and it's not like llm's cant be made logical either, using chatgpt (gpt-3.5) i appended each of the 4 logic puzzles presented in the paper with:

  "write a python script to solve this problem, ensure the script only prints the answer:"
and each of the scripts it generated solved the problem perfectly first try, originally i just made a script that used an llm to classify the variables in one type of problem and a normal function to solve it and then i just though "why don't i have the llm write the script" and sure enough it did, insanity


Tried the first with with GPT4 and it passed. Seems the shelf life on text based captchas is about as long as OpenAI’s release cadence.


A bear walks due south one mile, then east one mile, then north one mile. Bear is back where it started. What color is the bear?

ChatGPT The color of the bear is not provided in the given information.

(I win, yea! Yeah probably the last time ;-)


This thread will unfortunately be full of boring ("I tried this in UltraGPT 35.6 and here's its output") posts, but the research itself is interesting, and robust detection of bots will be useful.

Unfortunately, the given prompts are way too specific to work in an adversarial setting. It would be too easy to special-case these concrete examples. Maybe further research will find ways to counteract that.

I also wonder why can't simply keep the term "CAPTCHA".


Asking it a question most humans wouldn't know the answer to, but which is relatively easy for an AI (volume of a 747, 25th to 34th digit of pi, full name of Ramses the Second) and checking the timing of the result is a pretty good approach for humans looking to detect an AI. Casually switching between English and another language is also pretty surefire if you aren't an English-native speaker and you are talking internationally.


I doubt this would work reliably if the LLM system prompt instructs it to play dumb. “Answer this question as a human with 6th grade math skills” is pretty much all that’s needed to defeat your captcha.


ChatGPT 3.5 gets the wrong answer but still doesn't sound like a regular human. Saying "we'll need to remember" about something on the right wikipedia page but not actually required to answer the question is a telltale sign of GPT:

> To find the 35th digit of pi, we'll need to remember that pi is an irrational number, which means it goes on forever without repeating. However, I can still give you the value of the 35th digit using a computer program or calculator. The 35th digit of pi is 9.

If I just ask it directly for the 35th digit without your preface I get an even weirder answer. The 35th digit of PI was definitely known two years ago:

> To find the 35th digit of π (pi), we need to count from the first digit after the decimal point. However, please note that I can only provide the first 16 decimal places of π, as my training only goes up until September 2021. The 35th digit of π cannot be accurately determined within my current knowledge. Nevertheless, here are the first 16 decimal places of π: 3.141592653589793


I don’t understand how that’s useful as a captcha. How do you evaluate if something sounds human like other than gut feeling?


My gut feeling is that machine learning is pretty good at this kind of text categorization. You could also try going old school and using stylometry. There are a bunch of companies trying to sell GPT detection.

But also remember captcha stands for Completely Automated Turing Test to Tell Humans and Computers Apart. Automated isn't necessarily always a hard requirement, for a Turing Test just using your gut is fine.


Trivially defeated with a random delay.


it depends. ask it a hard question and then ask it an equally hard question that incorporates the answer from the first to make it easier, such as computing the n-1 digit after asking it to compute the nth digit. A human will take a long time to answer the nth digit question but then by using the first answer will solve the second one instantly. The AI with delay will take equally long.


Just make the second delay shorter. An attacker can quickly implement this while the defender had to spend time thinking of the idea, implementing the new question, and adding the infra for catching bots who answer the 2 differently than humans.

Delays are not good proofs of humanity because computers can wait too.


You could just ask about recent events, and since the answers are not in the dataset the bot could only hallucinate an answer to your question.


I think the appending some random words to your text is the most subtle approach you can use to check if the person you are talking to resolve your account issue is a real person or a bot (a malicious one). If it's a human you can excuse your way out blaming your keyboard or something.


asking llms for ascii art is actually funny. I just get some version of a simple, weird indistinguishable animal/person from bard.


Is it just me or does this abstract read like it was GPT generated?

I'm starting to get suspicious every time I see the word "crucial"


'It is important to note' is the one that sets off my bells


Guess I'll have to drop that phrase from my vocabulary. I use it all the time, though usually contracted.


Precisely what an AI would say! The jig is up "AlotOfReading"


Conversational detection. Won't work with GPT inputs you can't re-query.

Still useful, a textual CAPTCHA model is good to have.


The existence of these databases as a standard will lead bad agents to deliberately implement checkers and fine tuned models to solve these problems, if they even stay a problem 2 years down the line as big models improve.


Tell it what you do for a living. Then ask it what it does for a living. A bot will always be in a similar field to you.

I'm a teacher. Any school librarian or counselor is a bot.


turing test, especially through text, is one of the easiest problems for AI to beat. it'll probably be indistinguishable in like 1 yr




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: