
The Bandwagon – Claude Shannon (1956) [pdf] - CarolineW
http://dsp.rice.edu/sites/dsp.rice.edu/files/shannon-bandwagon.pdf
======
apoms
"It will be all too easy for our somewhat artificial prosperity to collapse
overnight when it is realized that the use of a few exciting words like deep
networks, back propagation, training, do not solve all our problems. "

------
skybrian
The idea that "only first-class research should be published" sounds good, but
I wonder whether this sort of attitude might tend to discourage researchers
from replicating previous studies or publishing negative results. Those are
important too.

~~~
artifaxx
Testing of past research should be considered first class. It isn't sexy, but
it is critical to increasing our confidence in what we know!

------
Animats
In the 1950s, computers of very modest power were described as "giant brains".
Mathematics was viewed almost as a form of magic. Mathematical logic was
elevated to cult status.[1] Apparently this spilled over into information
theory, which bothered Shannon.

[1] http://thecomputerboys.com/wp-content/uploads/2014/02/Ensmenger2010-CB-desk-set.pdf

~~~
dredmorbius
That's a trope which extends to the present, especially in BBC coverage, where
there's a perverse anti-intellectualism-disguised-as-worship: "Gee, these
boffins are so clever, but we really can't explain to you just how clever they
are because you wouldn't possibly understand." With a bit of the old
clever-as-insult connotation mashed in.

You'll find that in the US as well, more in popular media, though there's a
growing amount of technically competent coverage. More often, the US sin is
one of overhyping technology for commercial gain -- hucksterism.

~~~
DonHopkins
You've clarified the vague negative impression I get when I hear British TV
shows use the word "boffin" -- I've always found it kind of annoying, with a
slightly insulting "not one of us" feeling that "boffins" are not normal
people, nor something to aspire to be. I think "perverse anti-intellectualism-
disguised-as-worship" perfectly captures what I find so annoying about the
term.

Wikipedia gives a reference to its origin [1], and I think it still retains
some of its original contemptuous meaning:

[1] https://en.wikipedia.org/wiki/Boffin

... the article, entitled "Cold Bath for a Boffin", defines the term for its
American audience as "civilian scientist working with the British Navy" and
notes that his potentially life-saving work demonstrates "why the term
'boffin', which first began as a sailor's expression of joking contempt, has
become instead one of affectionate admiration."

... by the 1980s boffins were relegated, in UK popular culture, to semi-comic
supporting characters such as Q, the fussy armourer-inventor in the James Bond
films, and the term itself gradually took on a slightly negative connotation.
[2]

[2] https://www.theguardian.com/science/blog/2010/sep/24/scientists-boffin-stereotype

There is little that irritates scientists more than the idea of the
"boffin"...

~~~
dredmorbius
Thanks for digging that up; I've been curious about the origins myself.

So, "boffin" isn't _gratuitously_ condescending. But it's one of those
slightly sneering down-your-nose terms the Brits can be so good at.

------
Smerity
I'm sure many will read into the similarity between the bandwagon for
information theory described here and the bandwagon for machine learning /
artificial intelligence. While it may be obvious to some, let me decode (one
of the few situations where Genius would make sense for me) some of the
recurring issues. As I've noted elsewhere[1], machine learning / artificial
intelligence hold great promise for many fields, but we need to fight hype and
unscientific use. To steal from Jack Clark[2]:

"Machine Learning is modern alchemy. People then: iron into gold? Sure! People
now: shoddy data into new information? Absolutely!"

---

"Our fellow scientists in many different fields, attracted by the fanfare and
by the new avenues opened to scientific analysis, are using these ideas in
their own problems."

"[E]stablishing of such applications is not a trivial matter of translating
words to a new domain, but rather the slow tedious process of hypothesis and
experimental verification."

This is immensely important. While many of these methods appear general, and
can be used with little effort thanks to the wide variety of machine learning
toolkits available, they should be applied with care, not dropped into
problems in other domains without careful consideration of the differences
and complexities that might arise.

The availability of advanced toolkits does not make your work impervious to
flaws. With machine learning it's even worse: the flaws in your data, model,
or process can be explicitly worked around by the underlying machine learning
algorithm. That makes debugging difficult, as your program is, to some loose
degree, self-repairing[3].
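
To make that "self-repairing" point concrete, here is a minimal sketch
(assuming scikit-learn and a synthetic dataset; the `buggy_preprocess` name
and the whole setup are my own illustration, not anything from the thread): a
preprocessing bug silently zeroes half the features, yet the learner leans on
whatever signal survives, so the headline metric never flags the broken step.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Synthetic data: 20 features, 10 of them actually informative.
    X, y = make_classification(n_samples=2000, n_features=20,
                               n_informative=10, random_state=0)

    def buggy_preprocess(features):
        # Meant to pass features through untouched; a slip zeroes the last ten columns.
        out = features.copy()
        out[:, 10:] = 0.0
        return out

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clean = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    buggy = LogisticRegression(max_iter=1000).fit(buggy_preprocess(X_train), y_train)

    print("clean pipeline accuracy:", clean.score(X_test, y_test))
    print("buggy pipeline accuracy:", buggy.score(buggy_preprocess(X_test), y_test))
    # The buggy score usually stays respectable: the model re-weights the
    # surviving features, so aggregate accuracy alone won't reveal the bug.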

Using these toolkits without proper analysis and experimental proof that
they're working as intended, especially when their predictions are used for an
important decision, is negligence.

"Research rather than exposition is the keynote, and our critical thresholds
should be raised."

As a field, we don't have a strong grasp on many of the fundamentals. Issues
that are obvious in hindsight are hiding in plain view. Just a few days ago,
layer normalization popped up. It will likely make training faster and results
better for a variety of applications, and you can literally explain the idea
to a skilled colleague in all of ten seconds. Somehow we were using far more
complicated methods (batch normalization, weight normalization, etc.) before
trying the "obvious" stuff.

We need more work like that, and fewer papers and media publications
grandstanding about vague potential futures that have little theoretical or
experimental basis.

Also, it's worth reading Shannon's "A Mathematical Theory of Communication"[4]
from 1948. There's a reason it has 85,278 citations - entire fields started
there.

[1]: http://smerity.com/articles/2016/ml_not_magic.html

[2]: https://twitter.com/jackclarkSF/status/755257228429406208

[3]: https://twitter.com/sergulaydore/status/746098734946201600

[4]: http://worrydream.com/refs/Shannon%20-%20A%20Mathematical%20Theory%20of%20Communication.pdf

~~~
Houshalter
There is some of that, but in general I get the opposite impression. People
are extremely resistant to machine learning. They don't trust it. It's
actually a phenomenon studied in psychology called algorithm aversion:
https://marketing.wharton.upenn.edu/mktg/assets/File/Dietvorst%20Simmons%20&%20Massey%202014.pdf

There is a paper I was just reading here
(https://meehl.dl.umn.edu/sites/g/files/pua1696/f/167grovemeehlclinstix.pdf)
where the author surveyed the literature for all the comparisons he could find
of human experts and statistical methods. In all but a few cases the
algorithmic methods did better. Most of these were crude, simple models using
only a few features; most date from before the age of computers and were
calculated with pencil and paper. And yet they still generally outperform
human intuition, which is just terrible and barely better than chance.
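
To give a sense of what "crude, simple models" means here, a hedged sketch of
the kind of hand-computable linear rule those comparisons typically used; the
feature names, weights, and numbers below are invented for illustration, not
taken from the paper.

    def admission_score(gpa, test_percentile, interview_rating):
        # A pencil-and-paper rule: a handful of features, hand-picked weights, no fitting.
        return 1.0 * gpa + 0.02 * test_percentile + 0.5 * interview_rating

    # (name, GPA out of 4, test percentile, interview rating 1-5) -- invented numbers
    applicants = [("A", 3.6, 85, 4), ("B", 2.9, 97, 5), ("C", 3.9, 60, 2)]
    for name, gpa, pct, rating in applicants:
        print(name, round(admission_score(gpa, pct, rating), 2))
    # Rank applicants by score and act on the ranking; rules about this simple
    # are what repeatedly matched or beat unaided expert judgment in those surveys.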

Yet most of the industries where algorithms were shown to do better than
humans decades ago did not switch to the algorithms.

