
Measuring the Progress of AI Research - blacksmythe
https://www.eff.org/ai/metrics
======
Animats
I'd add robot manipulation in unstructured situations. Progress has been very
slow, but has picked up a little in recent years. Stanford vision-guided robot
assembly, 1973.[1] DARPA robot manipulation project, 2012.[2]

[1]
[https://archive.org/details/sailfilm_pump](https://archive.org/details/sailfilm_pump)
[2]
[https://www.youtube.com/watch?v=jeABMoYJGEU](https://www.youtube.com/watch?v=jeABMoYJGEU)

------
joe_the_user
This seems like a compendium of metrics for tasks that AI is making
progress on now. Doing something like that seems like a fine idea - I can't
judge the quality of these metrics, but it's hard to be excited by this.

However, what I think would be interesting would be for researchers to make a
compendium of "human abilities", classifying and quantifying them as well as
possible. One could then analyze the progress which AI could make towards
emulating those capacities.

Obviously, this would be a rather crude measure, but it could at least give
some idea of AI's progress toward new capacities.

------
jamessb
I made an alternative interface that displays the same data using D3:
[https://jamesscottbrown.github.io/ai-progress-
vis/index.html](https://jamesscottbrown.github.io/ai-progress-vis/index.html)

------
shpx
[https://rodrigob.github.io/are_we_there_yet/build/](https://rodrigob.github.io/are_we_there_yet/build/)
is something similar, although its last update was in Feb 2016.

~~~
albertzeyer
This looks like it's mostly (or only) about images.

Here is something similar for speech recognition:
[https://github.com/syhw/wer_are_we](https://github.com/syhw/wer_are_we)
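For context, the word error rate (WER) that repo tracks is just the word-level edit distance between hypothesis and reference transcripts, divided by the reference length. A minimal sketch (function name and example strings are my own, not from the linked repo):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[-1][-1] / len(ref)

# One deleted word out of a six-word reference -> WER of 1/6
print(wer("the cat sat on the mat", "the cat sat on mat"))
```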

------
IshKebab
There's no way we have reached human-level performance on image recognition
tasks. My guess is that when there is ambiguity (e.g. 'ship' vs 'boat') the AI
is better at learning the answer the labellers chose. Humans haven't looked
through the training data so they use their real-life biases which may not
match those of the labellers.

Just a guess, but whether that is true or not we're definitely not at human-
level performance.

~~~
IanCal
I think we're past human performance on imagenet, but one of the reasons for
that might be categories like specific species of dog.

If I remember right, face recognition was passed a long time ago.
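The ImageNet comparison is usually stated in terms of top-5 error: a prediction counts as correct if the true label is anywhere in the model's five highest-scored classes, which softens exactly the fine-grained ambiguity (e.g. dog breeds) mentioned above. A toy sketch of the metric (function name and data are illustrative, not from the site):

```python
def top5_accuracy(scores, labels):
    """Fraction of examples whose true label is among the 5 highest-scored classes.

    scores: list of per-example lists of class scores
    labels: list of true class indices, one per example
    """
    correct = 0
    for row, label in zip(scores, labels):
        # indices of the 5 largest scores for this example
        top5 = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:5]
        correct += label in top5
    return correct / len(labels)
```

Top-5 error is then simply `1 - top5_accuracy(...)`; the oft-quoted "human-level" figure of roughly 5% top-5 error on ImageNet is about this quantity, not exact-match accuracy.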

------
0x4f3759df
Harvard Professor, "We are Building Artificial Brains and Uploading Minds to
Cloud right now"
[https://www.youtube.com/watch?v=amwyBmWxESA](https://www.youtube.com/watch?v=amwyBmWxESA)

------
randcraw
1) No robotics? No interaction with the physical world at all?

2) No measure of the AI's ability to teach others? How can you say an AI
really understands if it can't then 1) teach what it has learned, and 2)
recognize which essential facts a tyro does not know or misunderstands?

3) No assessment of the AI's semantic interpretive skills, like those long
emphasized by cognitive scientists, such as those in Doug Hofstadter's "Fluid
Concepts and Creative Analogies" -- i.e. Miller analogies, double entendres,
literary symbolism, poetry interpretation, and so on?

Without mastery of analogies, an AI will have all the cultural insightfulness
of a portrait of leisuresuit Elvis in neon paint on velvet.

~~~
albertzeyer
There is research in all of the topics you mention; it's just missing from
this list. You could contribute and extend the list if you are familiar with
these areas.

------
ragebol
I'd love to see a game with high complexity, long-term planning, incomplete
information, etc. in there as well - e.g. StarCraft.

~~~
hughperkins
Looks like 'bowling' is quite a challenge, for now...

------
narvind
Great compilation!

