
Custom bots for Unreal Tournament 2004 pass Turing test - ZoFreX
http://www.eurekalert.org/pub_releases/2012-09/uota-aig092612.php
======
techdmn
It seems strange to me to invoke Turing's test when discussing a highly
defined problem such as playing a FPS. I think the strength of the original
test is that a conversation is by nature very open. When you change the test
to a game with clear scoring criteria, both humans and computers will tend
toward optimization strategies that may become difficult to distinguish.

~~~
romaniv
I think many people in IT want to believe in "AI that's as good as human" so
much that they erode the notions of humanity to make computers pass.

For example, a chat bot can certainly behave like the worst chat user, but that
doesn't mean it can hold an intelligent conversation, which is really what we
should care about. AI should simulate human intelligence, but very often all
we're seeing is simulation of human stupidity. That's what happened in this
case as well.

To me, good AI would be characterized by the ability to handle highly unusual
situations, not by mimicking irrational behavior.

------
Scriptor
Have there ever been any studies done to look at the effects of any biases
when you're actively trying to tell a bot apart from a human? My guess is that
the judges end up looking for specific traits and characteristics. Bot
developers can then include these in the AI. The issue is that those traits
may not be even close to matching an actual person in other situations, but
they're enough to provide a local maximum of sorts.

In this case, the fact that on average bots were scored as more "human" than
actual humans is more of a sign of a critical flaw in the judging system than
any great progress. It looks like they reduced typical human behavior to some
very simple things, such as holding a grudge or other irrational behavior. If
that was a major part of the judges' criteria (consciously or subconsciously)
then all this contest proved is that bots can be programmed to be more
irrational than human players.

~~~
patrickk
I guess the ultimate test would be if they let the bots loose in an online
multiplayer FPS environment and see if anyone noticed.

Bonus points if the bot can sing annoying, off-key pop songs in the pre-game
loading screen, to mimic the true CoD Xbox Live experience.

~~~
crazypyro
Even more bonus if they can make it sound like a 13 year old kid that screams
when he gets killed and yells insults.

~~~
swordswinger12
Don't forget the racial slurs.

~~~
patrickk
Maybe I should put together a Google spreadsheet for HNers to play together on
Xbox Live/PlayStation Network!

------
xibernetik
I'm suspicious of the game being UT2k4. It's a fast game with a steep learning
curve that's long past its glory days - meaning a small player base. To
someone unfamiliar with the game, even the AI that came in the box could be
mistaken as human. To an experienced player, it'll be easy to identify newbie-
ish patterns and see where the bots are straying from typical new-player
psychology once they start toying with them. If the bot is acting
experienced... Even the movement during combat in the game is complex, and
there are a ton of areas an AI could trip up in.

If the judges were at a competitive level, colour me impressed - but if it was
their first time, or even their first week, I'm a little more skeptical. I
don't think a novice player would understand the game well enough to judge
well. It would be like attempting a traditional Turing test with humans who
can't speak English fluently and were raised in a non-English culture:
impressive, but no indicator of bots reaching human-like levels.

~~~
pxlpshr
I came here to post the same thing.

I used to play UT/CS competitively and worked for a startup that licensed our
technology to id Software and Riot Games / League of Legends a long time ago.

UT2k4 was an amazing FPS game with a really steep learning curve. It's one of
the only FPS games that I refer to as the "basketball" of online gaming. The
diversity of movement, weapon tactics, and map control meant a seasoned gamer
could really define their own style. But it also meant few people ever
transitioned from public servers into competitive play because 1 pro could
easily go Godlike and demolish an entire server, making it extremely
frustrating and unexciting for casual gamers.

That being said, watching the videos included in this article signaled that
these judges had no experience with UT2k4.

In a match with professional gamers, it wouldn't surprise me if those judges
thought WE were the bots. 50%+ accuracy was not uncommon with prim shock or
lightning gun.

~~~
xibernetik
Can't agree more with you. Just watched the video - definitely first-timers.
Combine accuracy with (typical competitive) prediction of spawns and enemy
movement - especially of novices - it would definitely appear to the
uninitiated that there was a bot using autoaim and possibly hacks instead of a
person - that is if they even survived long enough to see their opponent.

------
talmand
I need to see more evidence. The two videos they provided from the viewpoint
of the "judges" show that the judges had no idea how to play the game. No
serious player stands in one spot firing nonstop. One video showed the judge
had no aiming skills whatsoever.

What were the criteria? "That one must be human because it can move and shoot
at the same time!"

This reminds me of the awesome days of the ReaperBot, that cheating bastard.

~~~
locci
They don't even dodge, which is the staple movement in UT games.

------
pfortuny
If humans get a 40% 'humanity' score, we have a problem with the definition of
'human' or with the rules of the game or whatever. I guess in order to pass as
humans they should get a 'standard' score of humanity. Either that or the
rules are strange.

~~~
biot
In one online game I played, people accused me of cheating/using a bot [I
never did]. The problem was that they actually were cheating and could see
through walls and/or used an aimbot to improve their accuracy, but my reaction
time was faster than theirs and I would get a headshot off before they could
kill me. Of course, it didn't help that I used a shotgun which was
ridiculously overpowered in the game.

~~~
sp332
My little brother once mapped the "fire" button to his scroll wheel when using
the semi-auto weapons in Medal of Honor. He could empty a clip in half a
second and was almost banned but he was able to explain things and then
everyone used it :)

------
aidenn0
"Human players received an average humanness rating of only 40 percent."

The judging system needs to be seriously reevaluated.

~~~
pfortuny
The judging system did not pass the Turing test...

------
farinasa
I don't like the Turing test. It is nonquantifiable and is used to test
something we don't even fully understand yet. It is also based on previous
computing mindsets. It's like using a horse and buggy to determine whether a
car is suitable.

------
laserDinosaur
As long as the AI is running around like an idiot spamming rockets, getting
run over by vehicles and generally wasting ammo, yep that sounds like most
humans who played UT04. Not exactly the most difficult test to pass...

------
icey
If you like playing with chatbots and think this kind of stuff is cool, please
shoot me an email at paul@pmn.org - I want to talk to you!

------
shocks
Instagib CTF-FaceClassic ftw.

------
sproketboy
Not quite but still cool.

