
Computer Scientists Are Astir After Baidu Team Is Barred from A.I. Competition
http://www.nytimes.com/2015/06/04/technology/computer-scientists-are-astir-after-baidu-team-is-barred-from-ai-competition.html
======
nl
Some background:

This competition (the "Large Scale Visual Recognition Challenge" aka
"ImageNet") isn't just some random competition. This is _the_ competition that
gave rise to the recent explosion in interest in Neural Networks.

In recent years Google, Microsoft and Baidu have been one-upping each other to
the point now where they are getting very close to better-than-human
performance (ie, humans disagree with other humans more often than their
systems disagree with the average human rating)[0].

Andrew Ng went to Baidu to start their team. I don't believe he is still
involved in this challenge.

Baidu has been getting closer and closer to Google's performance. Every year so
far Google has come out on top at the right time, but Baidu has later passed
the Google benchmark from that year.

There were reasons to think that this could be the year they finally beat
them. [1] is a story from January about the system that Baidu had built then.

Now this.

[0] [http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/](http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/)

[1] [https://gigaom.com/2015/01/14/baidu-has-built-a-supercomputer-for-deep-learning/](https://gigaom.com/2015/01/14/baidu-has-built-a-supercomputer-for-deep-learning/)

~~~
Retric
They were just caught blatantly cheating.

This raises quite reasonable questions about their past performance.

~~~
jdmichal
> This raises quite reasonable questions about their past performance.

I don't think this second claim necessarily follows from your first. Yes,
they were caught cheating... by making multiple submissions of their own
systems in order to determine faster which is the most promising. IMO, the
performance they achieved is still theirs to claim, because in the end
they did create a system that reaches that performance.

~~~
Retric
[http://en.m.wikipedia.org/wiki/Overfitting](http://en.m.wikipedia.org/wiki/Overfitting)

Ideally, your final test data set should be completely separate from training
data. Because even limited exposure quickly renders that test data
meaningless.

PS: In the end it's much like taking the same test with word for word
identical questions a second time. Yes, it's a test of something, just not the
intended subject matter.
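A toy simulation (numbers made up for illustration, not from the contest) makes the test-reuse problem concrete: score many skill-free random "models" against one fixed test set, keep the best, and the winning score looks far better than the true 50% accuracy, purely because the test set was reused.

```python
import random

random.seed(1)

# Toy setup: 200 test examples with hidden binary labels, and 500
# candidate "models" that are in fact pure coin flips (no real skill).
n_examples, n_models = 200, 500
labels = [random.randint(0, 1) for _ in range(n_examples)]

def accuracy(preds):
    """Fraction of predictions matching the hidden test labels."""
    return sum(p == y for p, y in zip(preds, labels)) / n_examples

# Repeatedly scoring against the same test set and keeping the best
# candidate inflates the apparent accuracy well above the true 50%,
# even though none of the candidates has any predictive power.
best = max(
    accuracy([random.randint(0, 1) for _ in range(n_examples)])
    for _ in range(n_models)
)
print(best)  # well above 0.5, from test-set reuse alone
```

Selecting by test score is itself a form of training on the test set, which is why the reported number stops measuring generalization.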

~~~
ninjin
It is hard to overemphasize how bad overfitting on the test set is. Your
cheating analogy is accurate, and cheating in science is a serious matter. If it is
unintentional, you should have your work rejected or retracted. If it is
intentional, you should have your work rejected or retracted and, frankly
speaking, you should most likely leave the field.

Science is based on a huge deal of trust, and violating that trust for your own
short term gains is inexcusable. Even worse, any honest scientist will find it
more difficult to improve upon results that can be explained by overfitting.

A good short description of several issues that plague the field is "Clever
Methods of Overfitting".

[http://hunch.net/?p=22](http://hunch.net/?p=22)

------
jordigh
Cheating is pretty entrenched in Chinese culture. Sometimes I even wonder
whether they consider it cheating at all, or just cleverness that should be
rewarded. I believe they feel that way about infringing copyrights too: why
should they follow someone else's rules that put them at a disadvantage when
they can instead demonstrate cleverness and break those rules?

There was an interesting story recently about how a disproportionate number of
Chinese students are expelled from US universities for cheating.

[http://blogs.wsj.com/chinarealtime/2015/05/29/u-s-schools-expelled-8000-chinese-students-for-poor-grades-cheating/](http://blogs.wsj.com/chinarealtime/2015/05/29/u-s-schools-expelled-8000-chinese-students-for-poor-grades-cheating/)

The problem is so entrenched that some students _rioted_ when they were
prevented from cheating, because they felt that this put them at an unfair
disadvantage compared to other schools where cheating was tolerated:

[http://www.telegraph.co.uk/news/worldnews/asia/china/10132391/Riot-after-Chinese-teachers-try-to-stop-pupils-cheating.html](http://www.telegraph.co.uk/news/worldnews/asia/china/10132391/Riot-after-Chinese-teachers-try-to-stop-pupils-cheating.html)

~~~
stephengillie
If you look at it another way, it's not so bizarre. The person is retrieving
their information from another's head. They're using a social storage
mechanism. This works fine in the classroom, it works OK in the workplace. It
works well in most situations. The only place it breaks down is during
testing.

Because standard testing is only supposed to verify that you're storing the
data locally. Standard testing is designed in a way that only helps "local
storage" people, and alienates these users who get their information from
their environment.

Open-book, open-note, and open-internet testing removes this barrier and
levels the playing field for these "network storage" learners.

Or am I completely out of touch?

~~~
crimsonalucard
Why is it even bizarre? Chinese people cheat for the same reasons anyone would
cheat. Being Chinese and born in America, I actually found it really
surprising the first time I met software engineers who refuse to download even
one copy of pirated software.

Culturally we just feel significantly less guilty about it. That doesn't mean
we aren't aware that it's wrong. Chinese people are fully aware of what's
right and wrong, and we choose to deliberately cheat. Whether that's a
cultural tendency or a genetic one is another story.

~~~
dimino
> Chinese people are fully aware of what's right and wrong and we choose to
> deliberately cheat. Whether that's cultural tendency or genetic one is
> another story.

What?! Genetic?! There is _very_ little, if _any_ evidence to support the idea
that genetics has anything to do with the Chinese propensity to cheat...

~~~
crimsonalucard
There's no evidence at all. Even when attempting to do research on this, the
political backlash could destroy any scientist's career. No official evidence
will ever be collected because of this.

I only speak from personal experience. I know many Chinese people, both born
here and born abroad. I am also Chinese and born in the United States. I am
telling you, honestly, from a purely anecdotal standpoint: I think there's a
chance it's genetic.

Edit: Just to keep things from getting out of hand, and more balanced, I want
to state this fact: statistically, it is far more likely for a serial killer
to be a white Caucasian male than for a serial killer to be of any other
race. Do I think this is a cultural thing? No, I'm leaning towards genetics.
But that's a purely anecdotal opinion, as there's no evidence pointing in
either direction.

~~~
kmicklas
Why would those things be genetic? If there were a "serial killer gene" or a
"cheater gene", those traits should manifest themselves in all kinds of
obvious differences in behavior - which we do not see. It's hard to imagine a
protein causing such complex differences in behavior while affecting nothing
else.

~~~
crimsonalucard
What's the other explanation then? Culture? Why don't we see higher rates of
serial killers in other races born in the United States?

It's very possible for many differences in behavior between races to be
genetic in origin, in fact it's the more logical hypothesis versus the
alternative which states genetics doesn't influence behavioral differences
between races.

Think about it. If genetics influences physical traits from height, skin
color, facial features, and even athleticism, what black magic in this world
makes it so that genetics doesn't even touch behavior or intelligence?

>It's hard to imagine a protein causing such complex differences in behavior
while affecting nothing else.

It's impossible to logically deduce a conclusion from the bottom up. We simply
currently don't have enough knowledge to know how proteins scaffold the entire
human neural network. With highly limited knowledge, we can only look at the
problem from the top down. That being: genetics is known to influence physical
traits, therefore it is logical to conclude that it also influences mental
traits.

~~~
dimino
> If genetics influences physical traits from height, skin color, facial
> features, and even athleticism, what black magic in this world makes it so
> that genetics doesn't even touch behavior or intelligence?

Because we have no evidence that such is the case on a culturally grouping
level.

There is no genetic concept of "Chinese". It literally doesn't exist.

You're making this about the possibility of genetics impacting behavior, when
the real issue is you thinking cultural boundaries exist in genetics. They
don't.

~~~
crimsonalucard
There's no concept of humanity on the atomic level. It literally doesn't
exist. One configuration or mishmash of atoms we call rocks is no different
than the mishmash we call humans. Try, without using any high-level concepts
or groupings, to define what configuration of atoms signifies a rock and what
configuration signifies a human.

If you go low enough on any topic, the boundaries between categories become
vague and the definitions become extremely complex. It's very hard to define
what a human is in terms of atoms. The same goes for race: it's very hard to
define, at the genetic level, what is Chinese and what is not, but the
category and boundary exist at all levels, and we can't ignore them.

I've heard your argument before. They say that the delta in genetic
differences between two people of different races is the same as the delta
between two people of the same race, therefore race doesn't exist. This
argument is flawed. I believe the "genetic" definition of race is immensely
more complex than simply the delta of genetic differences. Here's a more
accurate definition: people of the same race have a higher probability of
sharing certain genetic traits.

So let me restate my argument in a way you can understand. The people we
label as "Chinese", who share similar physical/genetic traits, I believe will
be more likely to also share a behavioral genetic trait that makes them more
likely to cheat.

~~~
dimino
I'm sorry, you misunderstand -- the people we label as "Chinese" do _not_
share similar physical/genetic traits.

Common misconception that they do, but there is very little genetic
consistency across cultural boundaries, and when such a thing does exist, it's
quite noteworthy.

~~~
crimsonalucard
> the people we label as "Chinese" do not share similar physical/genetic
> traits.

This statement is utterly and completely incorrect. It is a common myth in the
social sciences.

Please read:
[http://en.wikipedia.org/wiki/Race_and_genetics#Population_ge...](http://en.wikipedia.org/wiki/Race_and_genetics#Population_genetics)

first paragraph from above page: "The relationship between race and genetics
is relevant to the controversy concerning race. In everyday life many
societies classify populations into groups based on phenotypical traits and
impressions of probable geographic ancestry and socio-economic status - these
are the groups we tend to call "races". Because the patterns of variation of
human genetic traits are clinal, with a gradual change in trait frequency
between population clusters, it is possible to statistically correlate
clusters of physical traits with individual geographic ancestry. The
frequencies of alleles tend to form clusters where populations live closely
together and interact over periods of time. This is due to endogamy within kin
groups and lineages or national, cultural or linguistic boundaries. This
causes genetic clusters to correlate statistically with population groups when
a number of alleles are evaluated. Different clines align around the different
centers, resulting in more complex variations than those observed comparing
continental groups."

In short it's saying genetic traits can be statistically correlated with
population groups (race) but variations of traits that are different within
population groups can actually be more complex than those observed when
compared with people outside of their race.

This is literally exactly my argument. Supported by wikipedia at the very
least.

~~~
dimino
I do not accept the given definition of race from this page, as it presumes
the term "race" is in any way scientific or rigorously defined when in
actuality it is not.

What we "tend to call" race is _not_ defined, despite this wiki page's attempt
to do so.

~~~
crimsonalucard
This wiki page is the reflection of the general opinions of the scientific
community. You can redefine any word to have any definition that fits your
universe, but when communicating with other people, we must go with general
consensus.

~~~
dimino
> This wiki page is the reflection of the general opinions of the scientific
> community.

It isn't. The concept of "race" is not rigorously defined.

~~~
crimsonalucard
A word not having a rigorous definition does not make the concept non-existent
among scientists. "Life" is not rigorously defined.

~~~
dimino
Life is very rigorously defined, however it's not unequivocal:

[http://en.wikipedia.org/wiki/Life#Definitions](http://en.wikipedia.org/wiki/Life#Definitions)

A word not having a rigorous definition means it cannot be discussed
scientifically. Hence the actual problem of studying the existence of life,
e.g. is a virus alive?

~~~
crimsonalucard
Please note: unequivocal and rigorous are synonyms.

[http://www.thesaurus.com/browse/unequivocal/4](http://www.thesaurus.com/browse/unequivocal/4)

>A word not having a rigorous definition means it cannot be discussed
scientifically.

Life is discussed scientifically in many contexts yet it is not unequivocally
or rigorously defined. In fact there's an entire field based on the study of
life. It's called biology, aka the study of life. If a scientific field can
stem from a word that does not have a rigorous or unequivocal definition, then
it can be discussed scientifically.

~~~
crimsonalucard
@dimino

I'm getting pretty tired too. You choose not to accept the facts even when a
scientific description proving my point is thrown in your face. Ideas need
evidence for support; you have presented me with ideas, but no evidence.

The folks in the field are in agreement with me (see the old Wikipedia link I
sent you). You've got nothing, only empty claims.

------
trway
In a globalized world, institutions need to tighten up their safeguards
against cheating, fraud and dishonesty. In much of the world, cultural
attitudes toward cheating are a lot more relaxed than some of us presume.

There was a story about Indian students' rampant cheating at US colleges, that
was presumably flag-killed off HN despite the horror stories that were
emerging from academics. I've read about Australian universities basically
selling degrees to foreigners who can't speak English and get admitted
fraudulently. There is a current story about fraudulent admissions to US
colleges by Chinese students as a result of massive SAT cheating. I'm not
picking on Asians: Switzerland seems to have its fair share of scoundrels, as
the FIFA scandal reminds us.

We need to acknowledge that globalization sometimes brings unwanted side-
effects and deal with these head-on.

~~~
rayiner
In my opinion, tightening up the safeguards isn't the right approach. Rather,
institutions, particularly educational institutions, need to fulfill their
obligation to inculcate the right values in people. Cheating and petty
corruption culture is an existential threat, and the solution isn't to catch
the cheaters, it's to acculturate them to follow the rules.

Educational institutions, however, have totally abandoned this obligation.
Shocking cheating behavior will merit just a note in someone's record, if
administrators are even willing to take it that far. Cheating isn't
publicized, shamed, and punished in the way necessary to have any impact on
the cheaters' values. Instead, everyone gets to save face.

~~~
TeMPOraL
I totally agree with your opinion here, and especially that "cheating and
petty corruption" - and the breakdown of public trust that follows - is an
_actual_ existential threat, something that may ultimately lead to a collapse
of technological civilization. The current trend of trying to figure out
trustless solutions for everything actually worries me.

~~~
bkcooper
_The current trend of trying to figure out trustless solutions for everything
actually worries me._

If you're worried about breakdown in the face of loss of trust, then these
sorts of solutions seem like very important things to be looking at. Do we
have any good ideas about how to create trust at the institutional level?

I agree with your concerns. I think they are representative of a broader theme
- the acceleration of technical change seems like it will eventually (if it
hasn't already) bring us to a point where cultural change is too slow to
properly adapt to what is now possible.

------
protomyth
Who wrote that headline? It makes it sound like they were barred for no reason
and the community is mad. A more accurate headline would be "Baidu Team Barred
from A.I. Competition for Cheating". It's like they are trying not to get
censored or something.

------
phreeza
As a former glider pilot I had serious trouble parsing the headline because of
the all caps and this:
[http://en.wikipedia.org/wiki/Grob_G102_Astir](http://en.wikipedia.org/wiki/Grob_G102_Astir)

~~~
mafribe
Astir (αστήρ) means "star" in Greek. I imagine that's where the G102 got its
name from!

------
dekhn
One important thing to recognize here is that this isn't just unfair play (in
the sense of trying to win a competition by submitting multiple entries). It's
a loss for all players. The point of having distinct test and training sets,
and not training on the test set, is to ensure that your system can generalize
(i.e., work on things it wasn't trained on). If you don't do that, you're not
making progress, just memorizing examples. Example memorization is fine on its
own, but it's not really a true improvement.

------
vdnkh
Just some anecdotal evidence, but at work I had to implement a clone of our
Google Maps application in Baidu Maps for our customers in China. The only
reason I was able to use their API was because they copied Google's
interfaces, in some cases, word for word.

~~~
IshKebab
Evidence for what? Sounds like a good thing that they used the same API as
Google for their maps to me. Like how Google used Sun's API for Java.

~~~
Zigurd
Google uses the APIs from Java 6 and back for the "boring" bits. Android Java
is the de facto client Java today because Google abandoned the UI APIs in Java
6 and created their own. Sun's, then Oracle's argument is that Google is
"harming" Java by breaking the standard.

~~~
virmundi
I think there is a subtle difference. Java was, is, and shall be a standard.
To say you had a Java implementation meant that you supported the Java APIs.
It was a legal definition. Google had a Java-like runtime that they promoted
as a Java runtime. This led to the question of API protections and
reasonableness.

Using Google's API for your own competing product is different. Google has
not set a standard for map APIs, at least not officially. It might be THE
standard for maps, but that is neither here nor there. At this point you could
call it cheating: they took intellectual property and appropriated it for a
clone. Plus side, we now have an implied standard. Downside, theft of
intellectual property.

~~~
Zigurd
You are mixing a number of issues here:

1. Google Play Services APIs: These are optional, proprietary APIs. You don't
have to use them, and you have to treat them as optional if you want to
operate across Google, Amazon, and other AOSP-derived Androids. What's the
issue here? They're like any other proprietary API.

2. "Breaking" the "Java standard": Google "appropriated," via a permissive
open source license, an open source implementation of some Java base classes,
and the (not protectable) syntax of the Java language. They added support for
their own remote API feature and a bunch of APIs, mainly to make a usable UI
system. The result runs on a runtime Google devised, using a bytecode Google
devised. Where is the intellectual property theft?

------
ilaksh
It is odd that there is such a significant cultural difference. On the other
hand, the difference in culture may be more subtle than we realize.

For this test, they should have had an automated way to enforce that rule, if
possible.

Look at the actual levels of piracy in the US versus China. In the US, you
will get some people who say they are worried it might be wrong or that they
will get caught. But most people will say something like 'watching this free
stream doesn't hurt anyone'.

In the US, we have no problem with high end shoes that look similar to one
another, or low-end similar to high-end. But there is a subtle distinction --
there must be some way to claim that this is not copying the other. But maybe
it more often comes down to taking a different attitude towards the same
thing. E.g. "cheap Chinese knock-offs" versus "inexpensive sensible
alternative to overpriced designer brand".

It's also perfectly fine in general for companies to copy a successful
business model. But we insist that there must be some distinction. However,
the difference between these companies may be only slight or mainly
surface-level, and so when you get down to the fine analysis, I think the real
difference between the cultures is smaller than people want to admit.

------
chvid
Why is the competition designed in such a way that it is an advantage for a
competitor to "run test versions of their programs twice weekly ahead of a
final submission"?

How does it work? You submit your classifier to some server and it is run
against what? The data set that determines your final score - hopefully not.

~~~
solve
> Why is the competition designed in such a way that it is an advantage

See my comment & links at the bottom of this page. For some reason HN always
puts my comments near the bottom, even when they're getting many upvotes. (I'm
guessing that HN has a comment-editing penalty from editing my comments too
many times, but I'm not sure, or maybe it's from times I must have insulted
the mods here.)

Edit:

I constantly get "submitting too fast", so I'll reply here.

Thanks for the info kefka. Yeah, I'll happily agree with that linked comment,
that the childish crude moderation techniques used here have been hurting a
once great community. It was the exact same scenario for me as for that guy.
Seems I got rank-banned immediately after confronting dang about the trends of
excessive downvoting and increasing side-project criticizing on HN.

~~~
chvid
Is it correctly understood that in the "Large Scale Visual Recognition
Challenge" the competitors' twice-weekly runs ahead of the final submission
are against the set that actually determines the final result of the contest?!

~~~
mcguire
That's what I was wondering. If so, submitting many test runs makes it trivial
to provide a submission that works perfectly _against the test data_
specifically.

~~~
dumitrue
But the test set has 50,000 images (across 1000 categories). It's not that
easy to just "try many times" to get it right.

~~~
mcguire
...for some definitions of "trivial" and "perfect". At this level, I suspect
even a small advantage would result in winning the contest, which is the point
here.

------
solve
Background:

The technique is actually pretty fascinating. This is something that's been
well understood by the cryptography community for decades, but is somehow just
recently being fully appreciated by the ML community. See here:

[http://blog.mrtz.org/2015/03/09/competition.html](http://blog.mrtz.org/2015/03/09/competition.html)

[https://www.kaggle.com/c/restaurant-revenue-prediction/forums/t/13950/our-perfect-submission](https://www.kaggle.com/c/restaurant-revenue-prediction/forums/t/13950/our-perfect-submission)

Summary -

Submitting guesses to a system that gives you back scores for those guesses
will quickly leak enough information to reverse engineer a huge number of
hidden numbers/labels in surprisingly few iterations, e.g. 700 iterations to
covertly extract 10,000+ real numbers with high precision. This surprisingly
rapid convergence is a bit reminiscent of the birthday paradox.

Further, this not only lets you win against the "test" dataset, as opposed
to the final "validation" set, but it also lets you significantly increase
the data available to train your model on, since now you can train your
model against both the "test" and "training" datasets.

Layman summary -

ML breaks datasets into 3 partitions "test", "train", and "validation". In
cases where they're evenly split, this technique can double the training data
you have access to, which is a massive advantage in ML competitions where
scores differ by tiny amounts.
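As a toy illustration of how score feedback leaks hidden labels, here is a deliberately naive sketch (my own, not the technique from the links above, which is far more query-efficient): assume a binary-label test set and an oracle that returns plain accuracy. Flipping one position per query recovers every label exactly.

```python
import random

random.seed(0)

# Hidden binary labels held by the contest server (toy example).
N = 32
hidden = [random.randint(0, 1) for _ in range(N)]

def score(guess):
    """Oracle: accuracy is the only feedback a contestant receives."""
    return sum(g == h for g, h in zip(guess, hidden)) / N

# Attack: start from an all-zeros submission, then flip one position
# per query. The direction the score moves reveals that hidden label.
guess = [0] * N
base = score(guess)
recovered = []
for i in range(N):
    trial = list(guess)
    trial[i] = 1
    recovered.append(1 if score(trial) > base else 0)

print(recovered == hidden)  # full label set extracted in N+1 queries
```

Smarter encodings batch many labels into each query, which is how the number of queries drops far below the number of hidden values.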

Moral judgement -

My opinion: this moral argument is misdirecting attention from where it needs
to be. Yes, what occurred here is bad. But at this point, in 2015, with tools
readily available to crack this problem effortlessly, it's inexcusable for
contests to allow so many scoring reports against their validation sets
anymore. It's no longer a question of whether contestants will do it, but how
many of them will. We might as well just let people self-report their scores
on an honor system, if we're going to be this overly trusting.

Try creating a contest system like this in the cryptography field any time in
the past 3 decades and you'd be insulted and laughed out. Allowing so many
scoring reports against the validation set is fundamentally flawed. The only
solution is to _globally_ limit calls to the scoring api.

Another proposed solution -

Allowing everyone to see everyone else's guesses & resulting scores against
the "test" set, so that everyone is on equal ground for reverse engineering
the "test" set, and then globally limiting the number of scoring attempts so
that the test set isn't reverse engineered too significantly.

Overfitting the "validation" set actually is not a problem either way, because
none of these contests are dumb enough to let anyone score against the
validation set at all until the contest submission deadline is over.

~~~
kastnerkyle
Academia is basically "self-reporting on the honor system". It works generally
but there are lots of holes. Ultimately, "trust but verify" is necessary to
avoid getting caught in a wave of hype, or at least having someone in your own
"circle of trust" say it works. This system leads naturally to elitism and a
bunch of other problems which are seen in academia, but it seems better than
the current alternatives to me given the current rabid focus on exact
percentage score instead of quality/utility of an idea.

The "right way" to do it is test once only _per model/paper_. If you are
interested, there are a huge number of sneaky ways overfitting can happen in
ML [1]. Also interesting that you too see crypto and ML as related - I see
them as opposite sides of the same coin. One tries to pull signal out of
noise, the other tries to bury the signal _in_ noise... but special noise.

[1] [http://hunch.net/?p=22](http://hunch.net/?p=22)

------
McElroy
Is training of the Google AI the reason that reCAPTCHA is now showing pictures
and asking users to select all images of a certain kind?

------
squigs25
This is clearly cheating.

HOWEVER, the goal of these contests should be to promote the most accurate and
powerful image recognition algorithms that will transform the world as we know
it. Limiting access to training data makes it more difficult to test changes
to an algorithm. These rules do not make sense to me, and I would advocate
against them.

~~~
sweezyjeezy
Two things - firstly, you need to understand the concept of overfitting:
[http://en.wikipedia.org/wiki/Overfitting](http://en.wikipedia.org/wiki/Overfitting).
If teams were allowed to train on the full dataset, it would be possible to
get a 100% score, yet still not have a model that was useful on any images
that were not in the training set. Furthermore, if you allowed infinite
submissions, teams could just train a million models with slightly different
hyperparameters, and submit the one that did best (which may be what Baidu was
trying to do here). This is a problem because now there is the possibility
that you are overfitting the test data - there would be no way to tell if the
accuracy generalised to other images without coming up with more labelled
data, i.e. making a new test set.

Second of all, cross validation:
[http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29](http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29).
You don't HAVE to submit to the test set to get a feel for how well your model
is performing. On datasets this large, cross validation should be an
effective, if more time-consuming, method (unless your model is extremely
unstable).
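The cross-validation idea can be sketched in a few lines of plain Python (a minimal, generic version; the `fit`/`evaluate` callables and the toy majority-label "model" below are illustrative, not from any particular library):

```python
import random

def k_fold_indices(n, k, seed=0):
    """Split indices 0..n-1 into k roughly equal, shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(fit, evaluate, data, k=5):
    """Average held-out score over k folds; no test-set queries needed."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for i in range(k):
        held_out = [data[j] for j in folds[i]]
        train = [data[j] for f in folds[:i] + folds[i + 1:] for j in f]
        model = fit(train)
        scores.append(evaluate(model, held_out))
    return sum(scores) / k

# Toy usage: the "model" is just the majority label of the training fold.
rng = random.Random(1)
data = [(x, int(x > 0.5)) for x in (rng.random() for _ in range(100))]
fit = lambda train: round(sum(y for _, y in train) / len(train))
evaluate = lambda model, held: sum(y == model for _, y in held) / len(held)
print(cross_validate(fit, evaluate, data))
```

Every example is scored exactly once while held out, which is why the averaged score estimates generalization without ever touching the contest's test set.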

~~~
squigs25
Was there a training set made available and distributed? I got the impression
that there was not.

~~~
sweezyjeezy
Yeah, the training set is called ImageNet; it's widely used in research.

------
programmer_dude
Funny, for a moment I thought someone had submitted a headline with an Indian
word in it. "Asthir" in Hindi/Bengali means unstable, not stationary, or
worked up (remotely connected to "astir"?)

~~~
dalke
My thought is that the "a-" prefix in this sense probably comes from the
Germanic use "to show a state, condition, or manner";
[http://en.wiktionary.org/wiki/a-#Etymology_2](http://en.wiktionary.org/wiki/a-#Etymology_2)
. It's no longer used to make new words, but the "a-" form remains in many
words, including abloom, aflame, and abuzz.

The first known use of the word "astir" is from 1765, says Merriam-Webster at
[http://www.merriam-webster.com/dictionary/astir](http://www.merriam-webster.com/dictionary/astir).
This is what I would expect if "stir" was a word, "a-" was a possible
mechanism to create new words, and eventually people started to use "astir".

Etymonline gives a first known use as 1823:
[http://www.etymonline.com/index.php?allowed_in_frame=0&search=astir&searchmode=none](http://www.etymonline.com/index.php?allowed_in_frame=0&search=astir&searchmode=none):

> "up and about," 1823, from phrase on the stir, or from Scottish asteer; from
> stir. Old English had astyrian, which yielded Middle English ben astired "be
> stirred up, excited, aroused."

The root "stir" has a longer heritage. Etymonline at
[http://www.etymonline.com/index.php?term=stir&allowed_in_frame=0](http://www.etymonline.com/index.php?term=stir&allowed_in_frame=0)
says:

> Old English styrian "to stir, move; rouse, agitate, incite, urge"
> (transitive and intransitive), from Proto-Germanic *sturjan (cognates:
> Middle Dutch stoeren, Dutch storen "to disturb," Old High German storan "to
> scatter, destroy," German stören "to disturb"), from PIE *(s)twer- (1) "to
> turn, whirl" (see storm (n.)).

Hindi is another descendant of Proto-Indo-European, so that may be where
there's a connection to "asthir". However, do bear in mind that the surface
similarity is false - "astyrian" would be linguistically closer to the Hindi
than "astir".

~~~
programmer_dude
Thanks for the informative post! However, I think "asthir" may be a false
friend in this case. The a- prefix is used to negate "sthir" in "asthir".
"Sthir" in Indian languages means stable, stationary, motionless, etc. Kind
of like the a- prefix in English sometimes (social, asocial, etc.)

But it is not uncommon for word meanings to change in such a way that they
take on a meaning exactly opposite of what they used to mean (see for example
the meaning of the word nice:
[http://www.etymonline.com/index.php?term=nice&allowed_in_frame=0](http://www.etymonline.com/index.php?term=nice&allowed_in_frame=0)).
I wonder if there is a name for this phenomenon.

~~~
dalke
Awesome!
[http://etymonline.com/index.php?term=awe&allowed_in_frame=0](http://etymonline.com/index.php?term=awe&allowed_in_frame=0)
:)

------
peter303
In science and technology, cheating is often self-correcting: if you cheat too
much, your results are not reproducible or your product is defective, and
everyone will know.

------
aminorex
The PLA appears to have taken down the NYT story.

~~~
ohitsdom
Huh? Link works fine for me.

~~~
an_ko
PLA =
[https://en.wikipedia.org/wiki/People%27s_Liberation_Army](https://en.wikipedia.org/wiki/People%27s_Liberation_Army)

I think that was intended to humorously imply that the link appears down when
accessed from China.

~~~
chvid
NY Times has been blocked for years in China.

~~~
seanmcdirmid
For a couple of years. WSJ has been blocked for less than a year. CNN is still
up, though I guess it's just a matter of time.

------
thansharp
I'm not an expert, so apologies if my question is stupid.

Why can't the competition have the same test data each week across all
participants, so that no matter how many accounts you create, you train with
the same images everyone else gets to train with?

------
jokoon
I began watching Andrew Ng's course just this morning. A few months ago I had
also watched part of the course by Pedro Domingos at the University of
Washington.

Having never been employed, but having coded for 8 years, I wish I could get
myself educated in this field.

------
kailuowang
A major company is spending so much, even risking its reputation, on a
competition in a field where computers have already achieved better-than-human
performance. I am starting to feel that the industry should look more at other
new applications of deep neural networks (e.g. reinforcement learning by
DeepMind). There is still a long way to go before we can achieve a thinking
machine.

~~~
ohitsdom
We only recently (and barely) passed the performance level of a human. There's
still a lot to be gained from competitions like this.

------
amelius
Another solution would have been to allow other researchers unlimited access
to the test server as well.

~~~
mertd
That would turn the competition into an exercise in overfitting.

~~~
solve
It wouldn't, because final scores are only evaluated against the "validation"
set.

As for turning current contests into an exercise in overfitting the "test" set
- we already reached that point long ago. Test vs validation scores often
diverge wildly in these contests.

Edit - Replying to arnsholt:

Completely true. The huge problem I see is that all the classic NLP tagging
corpora are created from the very narrow domain of news articles, plus a few
good corpora now appearing for biology texts, and that's about it. Want to do,
e.g., NER for product reviews or chat logs? Incredibly bad results. There's a
huge corpus problem in NLP today.

~~~
arnsholt
Not to mention out-of-domain performance.

I'm not familiar with computer vision, but in NLP taggers are hovering around
human-level performance, and parsers are quickly approaching that level. But
if you take a state-of-the-art system and test it on a slightly different
corpus (even something as simple as text from the same newspaper, but a year
later!) performance drops by a lot.

------
travelhead
What's more important, winning a competition or improving AI for the entire
world? Do the ends justify the means? I don't think we should be too hard on
Baidu, considering they are attempting to improve their algorithm for the
interest of humanity (or evil AI that will bring an end to mankind, depending
on how you look at it).

~~~
netheril96
> I don't think we should be too hard on Baidu, considering they are
> attempting to improve their algorithm for the interest of humanity

They are not improving the algorithm for the interest of humanity. They
submitted and obtained test results much more frequently than allowed. With
that information, they can tune parameters to more closely fit the test data.

Basically it is like scoring higher on an exam where some of the questions
have been leaked. It does not suggest you now have a better understanding of
the subject.
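The mechanism described above is sometimes called adaptive overfitting: given
enough queries against a fixed hidden test set, you can select the
best-scoring candidate and report an inflated number even when no candidate is
genuinely better. A minimal sketch of the effect, where every detail (the
label set, the "test server", the "models") is a hypothetical stand-in: 200
purely random classifiers, each with a true accuracy of 50%, are scored
against one fixed test set, and the best reported score lands well above 50%.

```python
import random

random.seed(0)

# Hypothetical hidden test set: 1000 binary labels held by the "test server".
N = 1000
test_labels = [random.randint(0, 1) for _ in range(N)]

def server_score(predictions):
    """Stand-in for the competition's test server: returns accuracy
    of a submission against the hidden labels."""
    return sum(p == t for p, t in zip(predictions, test_labels)) / N

# 200 "models" that are really just random guessers (true accuracy: 0.5).
# Submitting all of them and keeping the best mimics excessive submissions.
best = max(server_score([random.randint(0, 1) for _ in range(N)])
           for _ in range(200))

print(f"best reported accuracy: {best:.3f}")  # noticeably above 0.5
```

None of the 200 guessers is better than chance, yet the selected score beats
chance by a few percentage points; with real models and real tuning against
the test server, the same selection effect inflates leaderboard numbers
without improving performance on fresh data.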

~~~
squigs25
If they are building an algorithm that overfits the training data, their
algorithm will lose performance in the actual validation test.

It is NOT like looking at leaked answers. It's like re-reading the textbook
more than everyone else. Definitely an unfair advantage, however.

~~~
varelse
You misunderstand. They're using repeated attempts on the test set to improve
their network. That is equivalent to training on the test set, and that is
unequivocally cheating.

