
IBM's Project Debater does debate club-style discussions with humans - rfreytag
https://www.theverge.com/2018/6/18/17477686/ibm-project-debater-ai
======
einr
Briefly attempted to read more about this on IBM's site (linked in the
article), had to close the tab immediately to retain my sanity.

Seriously, _LOOK_ at this:

[https://i.imgur.com/an7857N.png](https://i.imgur.com/an7857N.png)

I have a 1920x1080 24" display, and you can't fit more than eight words across
the screen, and you feel the need to hide all the navigation in a hamburger
menu that disappears when scrolling? This is some of the actual worst design
I've seen, ever.

~~~
Barrin92
If you want to be really upset try to scroll through the website dedicated to
their new typeface:

[https://www.ibm.com/plex/](https://www.ibm.com/plex/)

~~~
applecrazy
That website is interestingly horrifying.

There are random call-to-action buttons, spaces where I couldn't tell if
something was loading or if the blank was intentional, things fading and
moving in too slowly, and animations that are just plain _weird_.

To me it looks like _bold design_ that got a bit too...bold.

Edit: Don't get me wrong, the Plex typeface is great, but the website
dedicated to it is wonky at best.

------
wbhart
There is an article claiming that an AI held its own against a human in a
debate. But the supplied video gives just a few seconds of canned material
which could apply to any debate. Am I missing a video somewhere? Even on their
website they don't seem to want to show an actual debate so that we can judge
for ourselves whether it "held its own".

~~~
wbhart
It turns out there are some longer videos on YouTube, though also curiously
cut down sufficiently far that a lot of it is without context, e.g.

[https://www.youtube.com/watch?v=PkSzmnA1CQQ](https://www.youtube.com/watch?v=PkSzmnA1CQQ)
[https://www.youtube.com/watch?v=ZIY1uSxL-qQ](https://www.youtube.com/watch?v=ZIY1uSxL-qQ)

But now I am suspicious that it always debates the same people (they make
reference to having worked with the machine for over a year). If that is the
case, then they know precisely what to say to trigger certain canned responses
(it makes jokes that were quite obviously well-planned by a human). The rest
of the material seems to just be content culled from online documents and read
out by the machine.

The achievement here, if there is one, is to parse online material and decide
which snippets support a given argument and which don't.

~~~
dforrestwilson
Typical IBM marketing > reality

------
mv4
This quote:

"Watching the debate, I figured the answer was that it didn’t quite get it,
but I wasn’t positive. I couldn’t tell the difference between an AI not being
as smart as it could be and an AI being way smarter than I’ve seen an AI be
before. It was a pretty cognitively dissonant moment. Like I said,
unsettling."

... reminds me of when Kasparov lost to Deep Blue, when it made a strange
move. He thought the move was too sophisticated for a computer. Fifteen years
later, one of Deep Blue's designers said the move was the result of a bug in
the software.

IBM is a marketing machine.

------
chmod775
The author seems to celebrate the sneakiness displayed by the AI in that
debate.

I, for one, am not excited about the prospect of AI-generated logical
fallacies.

Dismantling those is where the real work is. Give me a heads up when they've
automated _that_.

~~~
mannykannot
I took it as the author being somewhat unsettled by the appearance of
dissembling, but regardless, the explanation given by Jeff Welser suggests
something more mundane: it generates a bunch of candidate responses, and
sometimes its scoring algorithm picks one that does not address the points
made by the other side.
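
A toy sketch of that generate-and-score idea, in Python. Nothing here is IBM's actual method: `topic_score`, `pick_response`, and the sample sentences are all made up for illustration. The point is only that a scorer which never looks at the opponent's speech can happily pick a reply that ignores it.

```python
def tokens(text):
    """Lowercase word set with basic punctuation stripped."""
    return {w.strip(".,!?").lower() for w in text.split()} - {""}

def topic_score(candidate, topic):
    """Hypothetical scorer: fraction of topic words the candidate mentions."""
    topic_words = tokens(topic)
    return len(tokens(candidate) & topic_words) / len(topic_words)

def pick_response(candidates, topic):
    """Return the best-scoring candidate. The opponent's speech is never consulted."""
    return max(candidates, key=lambda c: topic_score(c, topic))

topic = "subsidize space exploration"
opponent = "The money is better spent on schools."  # unused by the scorer, on purpose
candidates = [
    "Schools matter, but research spending also funds education.",
    "We should subsidize space exploration because exploration drives innovation.",
]

best = pick_response(candidates, topic)
print(best)  # the on-topic line wins even though it never addresses schools
```

No intent is needed for this to look like dodging the question: the first candidate actually engages with the opponent, but the assumed scorer has no way to reward that.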

An accusation of dissembling implies intent, and that probably requires some
sort of theory of mind, to make assumptions about how the response will be
received. The author of the article seems to be anthropomorphizing,
attributing greater cognitive powers to the system than it is displaying.

Welser also says "there’s been no effort to actually have it play tricky or
dissembling games", leaving open the question of whether it has nevertheless
been exposed, in training, to arguments that avoid the issue, and whether that
has influenced the way in which it constructs and scores candidate replies.

~~~
ethbro
Something I read on the anthropomorphisation of AI really clicked with me, and
explains much of the media and marketing wrongness.

If I tell a random person on the street that my AI can play checkers, they
think, "A human can play checkers. A human can also drive a car. Therefore, if
an AI can play checkers it must also be able to drive a car."

People use AI tasks as markers that AI is AGI / human, then extrapolate to
logical abilities it must have.

Which, if you don't know what a matrix is, is fairly understandable.

------
tziki
Considering the Watson Jeopardy match was more of a PR stunt than an actual demo (no
voice recognition, questions given as plain text, speed advantage of a
computer) I'll hold my opinion until any technical details come out.

------
craftinator
Shame on you, Dieter Bohn, for either accepting bribes from IBM or happily
regurgitating their marketing spume like some sick parody of a bird feeding
its child.

Pretty much everyone who works in tech knows that IBM Marketing has made much
more progress in technology than IBM AI. Please don't add to the noise and
confusion the average person is already under by writing an article so askew
from reality. Even from the limited and carefully trimmed Debater quotes that
you chose to include, it is apparent that what's going on is merely a gimmick.
It is also apparent that you cherry picked the best sounding responses, then
stripped any context away from them that might have exposed your charade.

So again, shame on you Dieter Bohn, for deceiving your audience. As a
journalist you have a responsibility to educate and inform, and you have
failed at that responsibility in this article.

------
smsm42
Does this mean AI is getting good, or that humans (at least the ones the AI
has been compared to) are a pretty low bar to clear, and just Chinese-rooming
the arguments would pass as a debate? Reading some internet forums, it's hard
not to think a bot could produce many of the comments. Watching some political
coverage only intensifies the feeling. How many of the debates, on average,
can be classified as more than Chinese-rooming?

I wonder if anybody has done a sort of reverse Turing test: just like the
regular one but with no computers involved at all, unbeknownst to the tester,
who is told one of them is an AI. How many people would be declared failing it?

> Another IBM researcher suggested that this technology could help judge fake
> news.

That part is actually scary. Opaque algorithms from IBM would decide for us
what are facts and what are fakes? And they'd know how? Because IBM marketing
dept said they're so good? Thanks but no thanks.

------
evrydayhustling
Would love to hear some inside baseball on Project Debater, or even the
planning of the event itself. It's been a while since one of these Big Blue
demos panned out once removed from the venue, so I think they'd do well to be
as transparent as possible about technology, API-accessible demos, etc.

~~~
luka-birsa
Call me a cynic, but I don't trust anything IBM puts out on AI. There have
been so many bullshit PR articles, with lots of nice working packaging
bullshit. This one feels the same.

It's so discouraging hearing all about Watson and health for the past X years and
then reading AI insider articles refuting the work as total bullshit and a PR
job to keep IBM relevant today.

Don't know why this would be any different.

~~~
Eridrus
I think you made the same mistake as the media - thinking that today's AI
systems are more general than they truly are.

IBM did build a computer that could play Jeopardy. But being able to play
Jeopardy doesn't tell us how good any of the subcomponents are at, e.g.
understanding doctors' notes/journal articles.

The same thing goes here, they built a system that can debate with some
ability. Can it be used to spot fake news as a researcher says? Maybe some
subcomponents could be reused for that, but there's no guarantee that they
will be good enough for that task either.

You really have to take modern AI at face value and not extrapolate from what
it actually does to anything else, otherwise you will whipsaw between optimism
and cynicism.

~~~
randcraw
Actually, real debate is about responding to the _other_ debater's arguments
and countering them. Thus this IBM marketing show was _not_ really a debate.

This was a demo that IBM could synthesize a plausible argument that supports
or refutes a given assertion. That's interesting and a bit impressive
(presuming its argument isn't simply a regurgitation of some position paper it
found online). But without understanding cause and effect, its arguments will
remain very superficial, probably driven by a small catalog of argument
'frames' (templates) that adaptively fill a handful of slots like [needs]
[means] [goal] [conflicts] and [emotional hooks]. Using such simple recipes it
can produce verbiage that plausibly sounds humanlike, but isn't actually
reasoning. It's likely that the system couldn't even diagnose help desk
problems using basic logic, e.g. backtracking to those dependencies that might
have caused the given outcome, thereby identifying only the possible and
plausible causes.
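
To make the 'frames with slots' guess concrete, here is a minimal Python sketch. It is pure speculation about how such a system *could* work, not a description of Project Debater: the `FRAMES` templates, the `fill_frame` helper, and the slot values are all invented for the example.

```python
# Hypothetical argument frames; the slot names follow the list above.
FRAMES = {
    "pro": "We need {needs}. {means} is how we reach {goal}, and {hook}.",
    "con": "{means} conflicts with {conflicts}, so it cannot deliver {goal}.",
}

def fill_frame(stance, slots):
    """Fill the frame for a stance; slots the frame does not use are ignored."""
    return FRAMES[stance].format(**slots)

slots = {
    "needs": "cheaper energy",
    "means": "Nuclear power",
    "goal": "a stable grid",
    "conflicts": "short-term budgets",
    "hook": "our children will thank us",
}

pro = fill_frame("pro", slots)
con = fill_frame("con", slots)
print(pro)
print(con)
```

The same five slot values produce a fluent-sounding case for either side, and at no point does anything in the code model cause and effect.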

------
nailer
Original content: [http://www.research.ibm.com/artificial-intelligence/project-...](http://www.research.ibm.com/artificial-intelligence/project-debater/)

------
microcolonel
This is a new low for IBM. If they had actually produced the system they are
describing, they would bring a person to debate the machine in real time, and
they would have an uncut clip of that happening, because it would be an
absolute technical marvel.

Obviously I cannot _prove_ that they _haven't_ developed a miracle device
which can construct complete arguments and theories in real time, but if they
_had_ done, they'd've shown something more convincing.

------
pjc50
Does anyone have a clean link to a transcript of this?

------
bunnycorn
I still maintain that IBM is above everyone else in the AI game, everyone.

Thing is that the media doesn't pay IBM the attention and respect that it pays
to other companies that haven't proven anything yet.

Where's that Google thing they showed at the last I/O, where an AI would call
a business to schedule an appointment? Vaporware.

~~~
carlsborg
Thought experiment: Given the mass unemployment that would come from
autonomous self driving vehicles and AI agents that can replace humans: if you
had the tech in place would you release it, or wait for something like
Universal Basic Income to be rolled out first?

~~~
paraditedc
I know this is hugely unpopular, but...that sounds like a problem that
communism was initially designed for, where productivity is so high that
people can just take what they need.

------
DevX101
I've long been interested in seeing debate formalized. That is, a back and
forth structured discussion with the end goal of arriving at a narrowly scoped
truth. One strategy for winning could include attacking core assumptions
underlying an argument, thus weakening any claims that followed from that
assumption (e.g. questioning the methodology/replication of a study that
claims vaccines cause autism).
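
One way to picture that strategy is as a small dependency graph, sketched here in Python. The `SUPPORTS` graph and the `weakened_by` helper are hypothetical, and the node labels are just the example above spelled out: knock out an assumption, and everything resting on it is weakened too.

```python
from collections import deque

# Hypothetical argument graph: each statement maps to the statements that rest on it.
SUPPORTS = {
    "study methodology is sound": ["study result is reliable"],
    "study result is reliable": ["vaccines cause autism"],
    "vaccines cause autism": [],
}

def weakened_by(attacked, supports):
    """Breadth-first walk: every statement that depends, directly or not, on `attacked`."""
    weakened = []
    queue = deque([attacked])
    while queue:
        node = queue.popleft()
        for dependent in supports.get(node, []):
            if dependent not in weakened:
                weakened.append(dependent)
                queue.append(dependent)
    return weakened

hit = weakened_by("study methodology is sound", SUPPORTS)
print(hit)
```

Attacking the final claim directly weakens nothing else, which is why going after the core assumption is the better move in this framing.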

The problem I have with debates in popular culture is that they're often
performative, and they devolve into a contest of who is best at dominating a
discussion, or who is the more eloquent, with not much light at the end of
the tunnel.

------
slededit
Debates have really gone off the deep end with their scoring rules. They
should be about convincing the audience, not scoring based on how many points
of the other side were responded to. If the other side makes a ridiculous
claim the audience can see through it.

It does mean that topics need to be apolitical in order to ensure a fair score
- no debate will turn a Republican into a Democrat. But for other topics this
scoring works extremely well.

------
borand
How can we rule out the possibility that Watson, Sophia etc are some WiFi
operated, human controlled PR stunts?

I'm not saying they are, but even when it did beat Kasparov, where was the
proof: code we can run ourselves and see, witnesses, just basic accountability
stuff that can make us safely believe that these are not staged?

Does anyone know anything?

~~~
casper345
Seeing this with the Google Assistant "scandal", with claims that a lot of the
demos didn't add up. How do we know this is really working? Most startups that
say they are doing "ML/AI" tech are just exporting the data to factories of
workers in developing countries to sift the data. Besides in research papers,
I have yet to see these claims be palpable in reality.

~~~
jvanderbot
> most

Is this possible?

If so, then call it Data Science or Push-button analysis.

------
rchaud
Sounds pretty much like how human debates are already structured:

1\. Grab key points from influential books and articles to create templated
responses

2\. Do no original research and thus be completely unable to rebut criticisms
of the source data or analysis

3\. Counter #2 above by falling back on crowd-pleasing templated responses,
forcing the moderator to move the debate along

4\. Never, ever concede anything, because "debates" are no longer about
listening to an argument and carefully considering its merits; they are a
time-boxed verbal combat sport where there must be a winner and a loser.

~~~
tw1010
You think you're getting closer to the truth by being cynical, but what you're
really doing is overshooting the target so far that you're basically just as far
from the reality of debates as the uncynical view is.

~~~
dfxm12
What is the reality of debates?

~~~
kiriakasis
I think that was an intentionally terrible response meant to illustrate the
efficacy of the list.

