The Bandwagon – Claude Shannon (1956) [pdf] (rice.edu)
60 points by CarolineW on July 23, 2016 | hide | past | web | favorite | 12 comments

"It will be all too easy for our somewhat artificial prosperity to collapse overnight when it is realized that the use of a few exciting words like deep networks, back propagation, training, do not solve all our problems."

The idea that "only first-class research should be published" sounds good, but I wonder whether this sort of attitude might tend to discourage researchers from replicating previous studies or publishing negative results. Those are important too.

Testing of past research should be considered first class. It isn't sexy, but it is critical to increasing our confidence in what we know!

I'm starting to think that only research that's been properly replicated at least twice should be reported in the non-academic press. It would save a lot of "my experts can beat up your experts" arguments about whether (e.g.) coffee is good for you or not.

And, I think, it might make science look a little better in the eyes of the general public.

In the 1950s, computers of very modest power were described as "giant brains". Mathematics was viewed almost as a form of magic. Mathematical logic was elevated to cult status.[1] Apparently this spilled over into information theory, which bothered Shannon.

[1] http://thecomputerboys.com/wp-content/uploads/2014/02/Ensmen...

That's a trope which extends to the present especially in BBC coverage, where there's a perverse anti-intellectualism-disguised-as-worship. "Gee, these boffins are so clever but we really can't explain to you just how clever they are because you wouldn't possibly understand". With a bit of the old clever-as-insult connotation mashed in.

You'll find that in the US as well, more in popular media, though there's a growing amount of technically competent coverage. More usually the US sin is one of overhyping technology for commercial gain -- hucksterism.

You've clarified the vague negative impression I get when I hear British TV shows use the word "boffin" -- I've always found it kind of annoying, with a slightly insulting "not one of us" feeling that "boffins" are not normal people, nor something to aspire to be. I think "perverse anti-intellectualism-disguised-as-worship" perfectly captures what I find so annoying about the term.

Wikipedia gives a reference to its origin [1], and I think it still retains some of its original contemptuous meaning:

[1] https://en.wikipedia.org/wiki/Boffin

... the article, entitled "Cold Bath for a Boffin", defines the term for its American audience as "civilian scientist working with the British Navy" and notes that his potentially life-saving work demonstrates "why the term 'boffin', which first began as a sailor's expression of joking contempt, has become instead one of affectionate admiration."

... by the 1980s boffins were relegated, in UK popular culture, to semi-comic supporting characters such as Q, the fussy armourer-inventor in the James Bond films, and the term itself gradually took on a slightly negative connotation. [2]

[2] https://www.theguardian.com/science/blog/2010/sep/24/scienti...

There is little that irritates scientists more than the idea of the "boffin"...

Thanks for digging that up, I've been curious about the origins myself.

So, "boffin" isn't gratuitously condescending. But it's one of those slightly sneering down-your-nose terms the Brits can be so good at.

I'm sure many will notice the similarity between the bandwagon for information theory described here and the bandwagon for machine learning / artificial intelligence. While it may be obvious to some, let me decode (one of the few situations where Genius would make sense for me) some of the recurring issues. As I've noted elsewhere[1], machine learning / artificial intelligence hold great promise for many fields, but we need to fight hype and unscientific use. To steal from Jack Clark[2]:

"Machine Learning is modern alchemy. People then: iron into gold? Sure! People now: shoddy data into new information? Absolutely!"


"Our fellow scientists in many different fields, attracted by the fanfare and by the new avenues opened to scientific analysis, are using these ideas in their own problems."

"[E]stablishing of such applications is not a trivial matter of translating words to a new domain, but rather the slow tedious process of hypothesis and experimental verification."

This is immensely important. While many of these methods appear general, and can be used with little effort thanks to the wide variety of machine learning toolkits available, they should be applied with care. Applying them to problems in other domains without careful consideration of the differences and complexities that might arise is exactly the casual translation of words to a new domain that Shannon warns against.

The availability of advanced toolkits does not make your work impervious to flaws. With machine learning, it's even worse - the flaws in your data, model, or process can be silently worked around by the underlying machine learning algorithm. That makes debugging difficult as your program is, to some loose degree, self repairing[3].
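To make the "self repairing" point concrete, here's a toy sketch (hypothetical bug, numpy only): a sign flip introduced in a preprocessing step never shows up in the loss, because least squares simply learns the opposite weight. Your metrics look perfect while your pipeline is broken.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground truth relationship: y = 2*x0 + 3*x1
X = rng.normal(size=(1000, 2))
y = X @ np.array([2.0, 3.0])

# Hypothetical preprocessing bug: the sign of feature 0 gets flipped.
X_buggy = X.copy()
X_buggy[:, 0] *= -1

# Least squares "repairs" the bug by learning w0 = -2 instead of 2.
w, *_ = np.linalg.lstsq(X_buggy, y, rcond=None)

# Predictions match y exactly, so the loss gives no hint anything is wrong.
perfect_fit = np.allclose(X_buggy @ w, y)
```

A linear model is the simplest case; a deep network has far more capacity to absorb this kind of error, which is why inspecting the learned model (here, noticing w0 has the wrong sign) matters as much as watching the loss.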

Using these toolkits without proper analysis and experimental proof that they're working as intended, especially when their predictions are used for an important decision, is negligence.

"Research rather than exposition is the keynote, and our critical thresholds should be raised."

As a field, we don't have a strong grasp of many of the fundamentals. Issues that are obvious in hindsight are hiding in plain view. Just a few days ago, layer normalization popped up. It will likely make training faster and results better for a variety of applications. You can literally explain the idea to a skilled colleague in all of ten seconds. Somehow we were using far more complicated methods (batch normalization, weight normalization, etc.) before trying the "obvious" stuff.

We need more work like that, and fewer papers and media publications grandstanding about vague potential futures that have little theoretical or experimental basis.

Also, it's worth reading Shannon's "A Mathematical Theory of Communication"[4] from 1948. There's a reason it has 85,278 citations - entire fields started there.

[1]: http://smerity.com/articles/2016/ml_not_magic.html

[2]: https://twitter.com/jackclarkSF/status/755257228429406208

[3]: https://twitter.com/sergulaydore/status/746098734946201600

[4]: http://worrydream.com/refs/Shannon%20-%20A%20Mathematical%20...

There is some of that, but in general I get the opposite impression. People are extremely resistant to machine learning. They don't trust it. It's actually a phenomenon studied in psychology called algorithm aversion: https://marketing.wharton.upenn.edu/mktg/assets/File/Dietvor...

There is a paper I was just reading here (https://meehl.dl.umn.edu/sites/g/files/pua1696/f/167grovemee...) where the author surveyed the literature for all the comparisons he could find of human experts and statistical methods. In all but a few cases the algorithmic methods did better. Most of these were crude, simple models, using only a few features. Most of them are before the age of computers and were calculated on pencil and paper. And yet they still generally outperform human intuition, which is just terrible and barely better than chance.

Yet most of the industries where algorithms were shown decades ago to do better than humans still did not switch to the algorithms.


I rarely comment here, but the similarity was so striking that I came back with the express intention to comment on this exact vein. I actually suspect this analogy was the OP's intention all along :)

Also, it is striking how human nature never changes and societal mores tend to remain the same; 60 years is not such a long time on that scale, but the buzz and the (less/un-informed) fever and fervour of the general public that Shannon remarked on all sound eerily familiar.

Upvoted. Was going to make the same point.
