
Memrefuting - wglb
http://www.scottaaronson.com/blog/?p=2212
======
jimrandomh
tl;dr: Scientific American published an article about an analog computing
technique that the author claims can solve NP-complete problems, but which
contains an elementary mistake that invalidates the result. Before SciAm
published it, an arXiv preprint circulated and commenters on HN and Reddit
pointed out the mistake. The editor either did not read these replies, or did
read them but decided to publish anyways.

This is an embarrassing fuckup on their part, but one that is endemic in mass-
market scientific publications, which want to publish sensational results but
lack the expertise to vet them or to summarize results without major
distortions. The problem is so severe that I recommend avoiding these sources
entirely, because the frequent major errors will effectively leave you less
knowledgeable than you started.

~~~
pekk
The problem is the standard of review. It cannot become mandatory to read
_Reddit_ before publishing a paper. _Reddit_ has its own endemic habit of
publishing sensational results with major distortions, and _Reddit_ contains
frequent major errors that will effectively leave you less knowledgeable than
you started. It makes no sense to avoid scientific journals but then trust
whatever you read on Reddit.

~~~
duaneb
> It cannot become mandatory to read Reddit before publishing a paper.

No, but if reddit does better than your review team, your journal is less
worth reading than reddit. That's not a good place to be.

------
xxxyy
The original paper on "memcomputing" was also posted on HN[0], not that it is
worth reading in detail.

But I can recommend Scott's paper "NP-complete Problems and Physical
Reality"[1]; the whole thing is a brilliant piece of work.

[0]
[https://news.ycombinator.com/item?id=8652475](https://news.ycombinator.com/item?id=8652475)

[1]
[http://www.scottaaronson.com/papers/npcomplete.pdf](http://www.scottaaronson.com/papers/npcomplete.pdf)

~~~
pjungwir
Reading the comments in your [0], I'm confused about the relationship between
thermodynamic entropy and information entropy. In thermodynamics, I've always
thought of entropy as randomness, with the 2nd law saying that eventually the
universe will turn to static. That is, entropy is increasing and information
is decreasing.

But in [2] and [3] I read that information entropy _is_ information, and it
has the same formula as thermodynamic entropy! Am I missing a negation or
reciprocal somewhere? Is entropy information or lack of information?

Reading those Wikipedia pages isn't helping me. Can someone explain why my
intuition is wrong here?

[2]
[http://en.wikipedia.org/wiki/Entropy_%28information_theory%29](http://en.wikipedia.org/wiki/Entropy_%28information_theory%29)

[3]
[http://en.wikipedia.org/wiki/Entropy_in_thermodynamics_and_information_theory](http://en.wikipedia.org/wiki/Entropy_in_thermodynamics_and_information_theory)

~~~
pash
No, you're right, it's just a terminological morass. Entropy and information
are different names for the same concept, one that in its full generality is
simply a measure of the uncertainty about the value of a random variable.

 _Entropy_ is a bad name, one that will probably continue to confuse people
for generations. _Information_ is a decent name, but it's a bit backwards:
information is the resolution of uncertainty. Really, the concepts implied by
the names _information_ and _uncertainty_ are opposite sides of the same coin:
for each unit of uncertainty, you gain one unit of information when you
observe a random variable's outcome.

The problem is that in the conventional definition (with the negative sign in
front of the sum), a more positive quantity denotes more uncertainty. Perhaps
the conventional quantity should have been called _uncertainty_ and the name
_information_ should have been given to the inverse quantity, i.e., to the sum
without the negative sign.

As it is, people usually use the word _information_ as a synonym for _entropy_
or _uncertainty_, but when they focus on resolving uncertainty they sometimes
use it to mean something like the opposite. In the end, so long as you are
confident in your understanding of the concept, it doesn't much matter because
it's easy to figure out what everybody means.
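
For concreteness, here is the conventional definition described above, H = -Σ p·log2(p), as a few lines of Python (an illustrative sketch of mine, with made-up distributions):

```python
import math

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)), in bits.
    Outcomes with probability 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin is maximally uncertain for two outcomes: 1 bit.
print(entropy([0.5, 0.5]))    # 1.0
# A heavily biased coin is nearly predictable: ~0.08 bits.
print(entropy([0.99, 0.01]))
# A certain outcome carries no uncertainty, so observing it
# yields no information.
print(entropy([1.0]))         # 0.0
```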

~~~
hackinthebochs
How about:

entropy := amount of disorder

information := knowledge of the system (knowledge of disorder)

uncertainty := unknown disorder (entropy - information)

And so entropy and information are opposites: as you gain more information
about a system the uncertainty is reduced. If the entropy of a system
increases, the amount of information required to describe it increases. If you
have an amount of information equal to the entropy, your uncertainty is zero
and the system is completely described. This seems to square with our usage of
the terms in the context of thermodynamics and information theory.

~~~
tjradcliffe
Another way to look at this is the process of reading a stream of bytes.

A high-entropy stream means that based on what you have read so far you have a
very small chance of predicting the next byte you read.

By the same token (as it were) in those circumstances the next byte you read
will contain a great deal of information about its value that you did not
previously have.

In a low-entropy stream the bytes coming in might be: 0, 1, 2, 3... and by the
time you get to byte N you can be pretty sure of its value, so the next byte
contains very little information you don't already have.

Both information and entropy are measures of the novelty of the next byte in
the stream, but information is measured from the perspective of what you get
when you read it and entropy is measured from the perspective of what you have
before you read it.
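
A rough sketch of that byte-stream picture (mine, with made-up streams): measure how uncertain the next byte is given the one before it, for a fully predictable counter stream versus uniformly random bytes.

```python
import math, os
from collections import Counter, defaultdict

def conditional_entropy(stream):
    """Estimate H(next byte | previous byte), in bits, from a byte string."""
    following = defaultdict(Counter)
    for prev, nxt in zip(stream, stream[1:]):
        following[prev][nxt] += 1
    total = len(stream) - 1
    h = 0.0
    for counts in following.values():
        n = sum(counts.values())
        h += (n / total) * -sum((c / n) * math.log2(c / n) for c in counts.values())
    return h

low  = bytes(i % 256 for i in range(1_000_000))   # 0, 1, 2, 3, ... repeating
high = os.urandom(1_000_000)                      # uniformly random bytes

print(conditional_entropy(low))    # ~0 bits: each byte fully determines the next
print(conditional_entropy(high))   # close to 8 bits: the next byte is anyone's guess
```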

------
praptak
Subset sum is actually only _weakly_ NP-complete. What that means: the best
known algorithms are exponential only in the problem size (the number of bits
it takes to encode the numbers), not in the magnitudes of the numbers
themselves. Dynamic programming solves subset sum in time polynomial in the
magnitudes, which is obviously exponential in terms of bits.
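
For illustration, a minimal sketch of that dynamic program (mine, not from the article): track the set of sums reachable so far, so the work grows with the magnitudes of the numbers rather than with their bit-length.

```python
def subset_sum(nums, target):
    """Pseudo-polynomial subset sum: work is roughly
    len(nums) * (number of distinct reachable sums)."""
    reachable = {0}                              # sums achievable with the empty subset
    for x in nums:
        reachable |= {s + x for s in reachable}  # every old sum s also yields s + x
    return target in reachable

print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True  (4 + 5)
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # False
```

This is fast when the numbers are small and hopeless when they are hundreds of bits wide, which is exactly the point above.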

So there's a danger here associated with demonstrating your superpowers on
subset-sum: unless the numbers are huge, you are actually solving a trivial
problem.

I didn't dig into the article enough to check whether the authors made this
mistake.

~~~
tlb
Yes, they also make this mistake. The frequencies correspond to the numbers
being summed, so if the numbers are huge (>100 bits) the frequencies will be
so high that the wavelengths would be smaller than atoms, at which scale you
can't construct physical circuits. If you instead scale the frequencies down,
the time it takes for a complete period goes up exponentially with the number
of bits in the numbers.
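
Rough back-of-the-envelope numbers (my own, with arbitrarily chosen base frequencies) for why neither end of that scaling works:

```python
# A 100-bit number forces a frequency dynamic range of about 2**100.
C = 3.0e8          # speed of light, m/s
ratio = 2 ** 100

# Keep the lowest frequency practical (say 1 kHz) and let the top scale up:
f_hi = 1e3 * ratio
print(C / f_hi)    # wavelength ~2e-25 m, far below the atomic scale (~1e-10 m)

# Or keep the highest frequency practical (say 1 GHz) and let the bottom scale down:
f_lo = 1e9 / ratio
print((1 / f_lo) / (3600 * 24 * 365))   # one period of the slowest component: ~4e13 years
```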

------
frankus
I did the k = 0 case in an interview once (with nonempty subsets), and there's
a neat little trick to do it in O(n).

~~~
teraflop
Even with k=0, the problem is still NP-complete.

Are you referring to the "pseudo-polynomial" trick where you maintain a bit
vector of partial sums?
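
For reference, a compact version of that bit-vector trick (my sketch, assuming non-negative numbers), with a Python integer standing in for the bit vector:

```python
def subset_sum_bits(nums, k):
    """Bit i of `reachable` is set iff some subset sums to i (pseudo-polynomial)."""
    reachable = 1                       # only the empty sum 0 is reachable at first
    for x in nums:
        reachable |= reachable << x     # every old sum s also yields s + x
    return bool(reachable >> k & 1)

print(subset_sum_bits([3, 34, 4, 12, 5, 2], 9))   # True
```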

~~~
frankus
Now that I think about it, it was a sequence of integers (looking for sub-
sequences), not a set.
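
Assuming it really was the contiguous case (substrings, in the terminology of the reply below), the O(n) trick is presumably prefix sums: a nonempty zero-sum run exists exactly when two running totals collide.

```python
def has_zero_sum_run(nums):
    """O(n) check for a nonempty contiguous run summing to 0: two equal
    prefix sums bracket such a run (the initial total of 0 counts too)."""
    seen = {0}          # prefix sum before reading anything
    total = 0
    for x in nums:
        total += x
        if total in seen:
            return True
        seen.add(total)
    return False

print(has_zero_sum_run([3, -1, 4, -3, 2]))   # True: -1 + 4 - 3 = 0
print(has_zero_sum_run([1, 2, 3]))           # False
```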

~~~
repiret
To be pedantic with terminology, if you take a sequence and remove some
elements, what you're left with is a subsequence. If you take a sequence and
you select some contiguous elements, what you get is a substring. So "amqz" is
a subsequence of the alphabet but not a substring. "lmnop" is a substring and
a subsequence.

