
Death of the sampling theorem? - gballan
https://markusmeister.com/2018/03/20/death-of-the-sampling-theorem/amp/
======
tacon
I don't recall ever reading a rejecting peer reviewer for one journal calling
out the same work after it was accepted in another ("peer reviewed") journal.
Before the web, I guess there wasn't much for a reviewer to do other than
complain to their peers. Now the low-quality journal is called out in public,
and the rejecting reviewer can call out the authors for ignoring direct
contradictory evidence against their work. The threat of breaking peer-review
anonymity is now a force to be reckoned with.

~~~
mannykannot
I do not see how this article poses a threat of breaking peer-review
anonymity. The only anonymity broken here is the author's own, only in this
particular case, and after the fact. What is the purpose of reviewer
anonymity? It protects the reviewer from being pressured into accepting a
paper (and protects science from that happening), which is not a risk here,
and it prevents reviews from turning into long-running disputes about the
validity of a rejection. But in this case, the issue is one of the authors
willfully ignoring objective, valid and significant objections to their work.

------
jwilk
Non-AMP version, which works with JS disabled:

[https://markusmeister.com/2018/03/20/death-of-the-sampling-t...](https://markusmeister.com/2018/03/20/death-of-the-sampling-theorem/)

And here's JS-free rendering of the notebook:

[https://nbviewer.jupyter.org/github/markusmeister/Aliasing/b...](https://nbviewer.jupyter.org/github/markusmeister/Aliasing/blob/master/Tsai_2017_Critique.ipynb)

~~~
VladTheImplier
Thx. You are not the hero we need, but the hero we deserve.

------
blauditore
I find it funny how the Nyquist theorem is often seen as a highly theoretical
rule with a complex technical background. But if you think about it, it's
super straightforward: you just need to sample each up- and down-deflection of
a wave at least once.

Trying to "beat" this rule is like trying to beat the Pythagorean theorem. If
you want to derive higher frequencies, you'll quite obviously have to make
additional assumptions about the measured wave.

~~~
pishpash
It's not straightforward. Your intuition only works for periodic signals;
Nyquist applies to all bandlimited L2 functions.

~~~
OrganicMSG
Non-periodic signals can still be thought of as a superposition (a Fourier
integral rather than a series) of sine waves, though.

------
gowld
It's theoretically plausible that the noise has some structure, and with some
Bayesian priors, one can classify some of the variation as "probably noise" vs
"probably signal". Of course it works best if you know what the signal is
before you start, but if that's the case, why are you trying to experimentally
construct it? :-)

This is a common subtle problem with statistical denoising (aka pattern
recognition, aka lossy compression). Commonly the source data and noise models
are synthetic formulas (a simple sine wave or a Gaussian), and knowledge of
the data-generation procedure can infect the supposed algorithm (a subtle
form of training your model on your test dataset).

~~~
smallnamespace
> but if that's the case why are you trying to experimentally construct it?

People are able to interpret speech under very high levels of noise due to
having strong priors / being able to guess well / having shared state with the
speaker.

Of course you should take advantage of whatever priors you have about the
signal; we know rather trivially from information science that information
gain / transmission rate is _maximized_ when the listener knows as much as
possible about the source. This is also key for compressed sensing.

You see analogues of this pattern everywhere in the world: e.g. people learn
most quickly when they have _some_ context about the problem space already.

The real question is why the sender would produce a signal with any level of
predictability in it, rather than a signal that purely maximizes the effective
bit rate of transmission (i.e. minimizes redundancy modulo noise). But it's
easy to see that physical constraints (e.g. how the human voice works) are at
least partly responsible for enforcing that regularity in many of the signals
we're interested in in practice.

~~~
dsacco
_> People are able to interpret speech under very high levels of noise due to
having strong priors / being able to guess well / having shared state with the
speaker._

In the information-theoretic sense, this just means there is enough redundancy
in the signal that you can error-correct the noise. If you're in a noisy room
talking to a coworker, your shared context and priors really are redundant
bits of information. If she says something about "nachos" but you couldn't
tell whether she said "nachos", "tacos" or "broncos", the nachos next to you
add redundancy to the signal.

So in that context, what you’re saying is that we should take advantage of any
redundancy we can find in an incoming signal.

~~~
AstralStorm
The problem is that there is no real way to estimate the redundancy of a real
signal of specific form, or, more importantly, to discern between a likely
alias and a likely true signal when the signal has unknown statistics. The
statistical approach might work for simple, clean signals but makes critical
mistakes on complex ones. Specifically, you have to estimate true phase and
true magnitude in a given subband...

~~~
dsacco
Yes, agreed. My point was basically that priors (as a form of redundancy) do
not undercut the sampling theorem.

------
EtDybNuvCu
While it might seem arrogant of the author to take this stance, the Sampling
Theorem generalizes extremely well and is considered a very stable result. My
favorite generalized Sampling Theorem is
[https://arxiv.org/abs/1405.0324](https://arxiv.org/abs/1405.0324)

~~~
gowld
It's hardly arrogant to vote in favor of a 100-year-old mathematical proof
over a new supposed refutation.

~~~
posterboy
It is an argument from authority. That is asking a lot from the reader, i.e.
to suspend belief or go research, the latter of which is not too much to ask
if directed at a suitable audience.

"Arrogance" literally is akin to "to ask", e.g. interrogate ... unless the
"a-" is akin to "anti" instead of "ad-" or whatever, so that arrogance would
be like waltzing in and taking without asking, or making an assumption without
due inquiry.

------
kazinator
Usually claims like this fall into two buckets: (1) crazy, (2) existing idea
spun as something new. (Like airless tires, which have been around for a
hundred years.)

This is quite clearly (2).

> _Effectively they can replace the anti-aliasing hardware with software that
> operates on the digital side after the sampling step_

Filtering in the digital domain after sampling is absolutely old hat, found
in everyday tech.

Let's look at audio. We can sample audio at 48 kHz and use a very complicated
analog "brick wall" filter at 20 kHz.

There is another way: use a less aggressive, simpler filter, which lets plenty
of ultrasonic material past 20 kHz through. Sample at a much higher frequency,
say 384 kHz or whatever. Then reduce the sample rate of the resulting data
_digitally_ to 48 kHz. There we go: digital side, after the sampling step.

This is cheaper and better than building an analog filter with multiple
carefully tuned stages/poles that require precision components.
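
A minimal sketch of that pipeline (the rates and the 1 kHz/60 kHz test tones
are illustrative assumptions, not from any particular device):

```python
import numpy as np
from scipy import signal

fs_high = 384_000              # oversampled rate (Hz)
decim = 8                      # 384 kHz -> 48 kHz
t = np.arange(0, 0.1, 1 / fs_high)

# A 1 kHz tone we want to keep, plus ultrasonic content that a gentle
# analog filter would have let through.
x = np.sin(2 * np.pi * 1_000 * t) + 0.5 * np.sin(2 * np.pi * 60_000 * t)

# Sharp digital low-pass plus downsampling in one step; the FIR filter
# here plays the role of the analog brick wall, but in software.
y = signal.decimate(x, decim, ftype='fir', zero_phase=True)
fs_out = fs_high // decim      # 48 kHz output, 60 kHz component removed
```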

~~~
weinzierl
First sentence from the original article: "A team from Columbia University
led by Ken Shepard and Rafa Yuste claims to beat the 100 year old Sampling
Theorem"

> (2) existing idea spun as something new

The existing idea is oversampling, and it is often spun as beating the
sampling theorem. But this is unlike the airless-tires example because, while
oversampling works, it isn't beating the sampling theorem but rather is based
on it.

You _can_ filter after sampling, but that filter is _not_ a replacement for
the analog anti-aliasing filter. To avoid aliasing you have to somehow limit
the bandwidth of your input.

~~~
kazinator
Right; it's not a replacement. But you can then pretend (out of naivete or
dishonesty) that the analog one doesn't exist and claim you've beaten the
sampling theorem. Why? Because the input stage before the sampler naturally
has a limited frequency response, and you happen to sample well above it. Thus
a filter is _de facto_ there, it just wasn't formally designed in.

------
wolfgke
Just a consideration (I am a mathematician, but have nothing to do with
signal processing or its mathematics, so don't overestimate my knowledge in
this area):

Let us consider a periodic signal (though I believe this assumption can be
weakened). If we have to assume that all frequencies from the inverse of the
period length up to the Nyquist frequency might occur in the signal, the
sampling theorem gives us the information that we need.

Now some "practical" consideration: assume that the signal we want to sample
is very "well-behaved", e.g. we know that lots of its low frequencies are 0 or
near 0 (or in general we have some other equation that tells us what the
signal has to look like). So if we reconstruct the frequencies of the signal,
but some value other than 0 appears in those Fourier coefficients, we know
that it has to come from "noise" originating at some frequency above the
Nyquist frequency.

My mathematical intuition tells me it is plausible that such a trick might be
used to reconstruct the signal exactly even if we sample at less than 2 times
the highest occurring frequency. Why? Because we know more about the signal
than the sampling theorem assumes. So a "wrong" reconstruction, whose
existence the sampling theorem guarantees when a frequency above the Nyquist
frequency occurs, is not important for us, since we can plausibly show that
such a signal cannot have 0s in the "excluded low frequencies" Fourier
coefficients.

This would explain to me why such an impossible-looking algorithm works so
well in practice.

It is very plausible to me that such a trick is already used somewhere in
engineering.
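
A toy numerical version of this idea (the signal model, with a known set of
allowed harmonics, is my own illustrative assumption): if we know in advance
which Fourier coefficients can be nonzero, far fewer samples than the Nyquist
count pin the signal down exactly.

```python
import numpy as np

N = 64                                   # full signal length
support = np.array([5, 9, 12])           # only these harmonics may be nonzero
rng = np.random.default_rng(0)
coeffs = rng.normal(size=len(support))   # the "true" coefficients

n = np.arange(N)
basis = np.cos(2 * np.pi * np.outer(n, support) / N)  # reduced basis
x = basis @ coeffs                       # the true signal

# Sample at only 8 of the 64 points, far below what Nyquist would demand
# for an unconstrained length-64 signal.
idx = rng.choice(N, size=8, replace=False)
rec, *_ = np.linalg.lstsq(basis[idx, :], x[idx], rcond=None)
print(np.allclose(rec, coeffs))          # True: exact recovery
```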

~~~
tanderson92
This is called compressed sensing.

~~~
wolfgke
Thanks.

~~~
whyever
Compressed sensing originated in math; you might be interested in the original
paper:
[https://statweb.stanford.edu/~donoho/Reports/2004/Compressed...](https://statweb.stanford.edu/~donoho/Reports/2004/CompressedSensing091604.pdf)

------
avip
With the rise of "big data" and "machine learning", we should painfully
expect shit-science to become the norm (if it isn't already). If you ever read
a research paper somewhere in the bio-something space, you'll share the
sentiment ("double-blinded this... double-blinded that... our unpublished data
analysis pipeline... _96% accuracy_")

~~~
JorgeGT
I particularly enjoy the re-discovery of numerical integration in 1994 by a
medical researcher: [https://fliptomato.wordpress.com/2007/03/19/medical-research...](https://fliptomato.wordpress.com/2007/03/19/medical-researcher-discovers-integration-gets-75-citations/)

------
Vaslo
This article reminds me that I am on the low end of intelligence for the
average Hacker News reader.

~~~
SmooL
I'd take that with context. Everything in this article should be understood
by a third-year electrical/computer engineer as standard knowledge. Because
I'm a computer engineering graduate, I can understand it. However, give me an
article with "third-year standard knowledge" from any other engineering
discipline and I'll have no idea what's happening.

~~~
antognini
One of the lessons college taught me is that I can learn anything, but I can't
learn everything.

~~~
tmuir
One of the lessons many engineers who pass the PE exam learn is, "not with
that attitude you can't".

------
pishpash
This isn't about the sampling theorem, it's a misapplication of compressed
sensing.

------
std_throwaway
Similar to the conservation of energy in physics, you can ask yourself "is
there enough space for the information?". If somebody tries to pull two
signals out of one signal, those two had better not have too much entropy
each. For a given bandwidth (number of independent samples), a given signal
energy, and a given noise energy, there is a hard limit to the amount of
information you can pull out.

If your signals are not "full entropy" or "white noise", you can do all sorts
of funny tricks to increase the SNR to some extent. These tricks sure are
nice, but they veil the true issue at hand, which is about information and
capacity.
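
That hard limit is the Shannon-Hartley capacity; a back-of-the-envelope
calculation (the bandwidth and SNR numbers here are purely illustrative):

```python
import math

def channel_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley limit for an AWGN channel, in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)

snr = 10 ** (20 / 10)                   # 20 dB SNR as a linear power ratio
print(channel_capacity(10_000, snr))    # ~66,582 bits/s; no trick gets more
```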

------
tw1010
I love this post for two reasons. First, it's a great reminder to question
assumptions about high-abstraction mathematical models that try to say
something about reality, and not to take what a community, even of scientists,
says as settled truth. Ask a room full of signal processing engineers if it's
possible to do better than the sampling theorem, and they'll confidently, with
bravado, tell you that such an idea is ridiculous. Second, the end is a great
example of a "proof by incentive" argument, which is one of my favourite ways
to produce trust in a theorem.

~~~
tgb
I read this comment before reading the article and got excited to read an
article showing how the "establishment" was wrong. Much to my surprise, the
article was saying the exact opposite! I'm still not really sure how to read
this comment in a way that makes sense.

~~~
dsacco
I think that’s because the comment is ambiguously distinguishing between
academics and engineers _or_ lumping them together.

~~~
carlmr
The comment is digitized brain noise in ASCII form. It just shows that if
there's too much unfiltered noise before keyboard sampling, the signal can't
be reconstructed.

------
keithnz
so, hold on, he's wanting to charge people $10 (to get a file) to attempt to
solve the problem, then give a prize of $1000 to anyone who can do it?

Hmmm.

Interesting.

~~~
fellellor
He does provide a Jupyter notebook to try out your preliminary ideas.

From what I've seen, people in the DSP community are very anal about the
inviolability of the sampling theorem. So charging $10 may be his way of
punishing novice "fools" who think they know better.

I'm not sure I disagree.

~~~
enriquto
If you understand the sampling theorem, its meaning is self-evident: you just
count the degrees of freedom. Ask a mathematician how many samples they need
to recover the coefficients of a polynomial of degree 10. They will say that
11 samples are necessary and sufficient: with fewer than 11 samples you cannot
recover any coefficient, while with 11 or more samples you can recover all the
coefficients.

The sampling theorem is exactly the same thing, but for trigonometric
polynomials.
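
The degree-of-freedom count is easy to check numerically (a quick sketch, not
from the article):

```python
import numpy as np

rng = np.random.default_rng(1)
coeffs = rng.normal(size=11)            # a degree-10 polynomial: 11 unknowns

x = np.linspace(-1, 1, 11)              # exactly 11 distinct sample points
y = np.polyval(coeffs, x)

recovered = np.polyfit(x, y, deg=10)    # solve the 11x11 Vandermonde system
print(np.allclose(recovered, coeffs))   # True: 11 samples suffice

# With only 10 samples the system is underdetermined: infinitely many
# degree-10 polynomials pass through them, so no coefficient is pinned down.
```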

~~~
tmuir
I don't understand it from this angle, but to me the general description of
what they employ sounds very analogous to how noise-cancelling headphones work
for pilots. The noise generated by the engines, which would normally drown out
any signal of humans conversing through the air, is of known frequencies, and
a synthesized version of that noise can be played back 180 degrees out of
phase along with the original audio, cancelling the noise and revealing the
signal. No one is implying that the headphones do this without ever sampling
the engine noise; they might just not have to rely on the actual measurements
in question to glean the information needed to synthesize a signal capable of
such cancellation. It reads to me like this broadband thermal noise is
predictable, which is why they are able to synthesize it in the first place.
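
The cancellation itself is just a sign flip; a toy sketch (real ANC has to
estimate amplitude and phase adaptively, and the frequencies here are
illustrative):

```python
import numpy as np

fs = 8_000
t = np.arange(0, 1, 1 / fs)

speech = 0.1 * np.sin(2 * np.pi * 300 * t)   # stand-in for the voice
engine = np.sin(2 * np.pi * 120 * t)         # known-frequency engine drone
mixed = speech + engine                      # what the microphone hears

anti_noise = -engine                         # synthesized copy, 180 deg out of phase
cleaned = mixed + anti_noise
print(np.allclose(cleaned, speech))          # True: the drone cancels
```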

~~~
enriquto
The statement of the sampling theorem has nothing to do with noise. It only
states that there is conservation of information.

~~~
tmuir
You misunderstood. I was relaying that I, too, have a decent grasp of
sampling theory, and that if I understand the article's main premise, then
saying they have disproven Nyquist's theorem with this example is akin to
saying that being able to draw both 30-degree and 90-degree angles with a
compass disproves the idea that angle trisection is generally impossible.

They have simply shifted the sampling of the noise in time. Even if they argue
that they synthesize the noise in realtime, they had to have sampled it at
some point to know what to synthesize in the first place. It was at that point
that they were required to sample at at least 2 MHz to accurately quantify
their noise at 1 MHz.

The mechanism used in pilots' noise-cancelling headphones is much the same:
they don't sample the engine noise as it's being made, they synthesize it from
known frequencies.

------
madengr
I suppose you could say the antialias filter doesn't reduce the noise, but
rather coherently averages the signal to increase its amplitude. If the
desired signal is noise-like itself, then I don't see how the scheme would
work.

~~~
baking
A simple (low-pass) anti-aliasing filter such as a Butterworth has a roll-off
that is linear in dB per decade, meaning the greater the difference between
the frequency of the signal and the frequency of the noise, the better the
resulting signal-to-noise ratio. On the other hand, sampling random
Gaussian-distributed noise yields random Gaussian-distributed noise no matter
what frequency you sample it at.

[https://en.wikipedia.org/wiki/Low-pass_filter#/media/File:Bu...](https://en.wikipedia.org/wiki/Low-pass_filter#/media/File:Butterworth_response.svg)
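
For a concrete sense of that roll-off, here is a sketch of a 4-pole
Butterworth low-pass (the cutoff and sample rate are illustrative):

```python
import numpy as np
from scipy import signal

fs = 48_000
sos = signal.butter(4, 1_000, btype='low', fs=fs, output='sos')  # 1 kHz cutoff

w, h = signal.sosfreqz(sos, worN=2048, fs=fs)
gain_db = 20 * np.log10(np.abs(h) + 1e-12)

# A 4-pole Butterworth falls off roughly 24 dB per octave above the cutoff,
# so the farther the noise sits from the signal, the more it is suppressed.
for f in (1_000, 2_000, 4_000):
    print(f, gain_db[np.argmin(np.abs(w - f))])   # ~-3, ~-24, ~-48 dB
```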

~~~
AstralStorm
Remember that the low-pass will change the phase or time response of the
original signal. Anything in the frequencies and phases affected by the
low-pass will get distorted.

The trick works only because the signal is spike-like. I reckon it will miss
some of the spikes on real signals in real life, or produce spurious spikes.
The statistics are not accurate at any given point in time, only in the limit.

~~~
sohkamyung
I was going to point this out (phase-response distortion) but you beat me to
it.

During university, I had to digitally sample an ECG (electrocardiogram)
signal. Since the shape of the signal is important in diagnosis and must be
preserved, I used a Bessel filter [1].

[1]
[https://en.wikipedia.org/wiki/Bessel_filter](https://en.wikipedia.org/wiki/Bessel_filter)
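
The reason the Bessel filter preserves the waveform is its nearly flat group
delay in the passband; a sketch comparing it to a Butterworth of the same
order (the 500 Hz rate and 40 Hz cutoff are illustrative ECG-ish numbers):

```python
import numpy as np
from scipy import signal

fs = 500                        # illustrative ECG-ish sampling rate (Hz)
filters = {
    "bessel": signal.bessel(4, 40, fs=fs, output='sos', norm='delay'),
    "butterworth": signal.butter(4, 40, fs=fs, output='sos'),
}

for name, sos in filters.items():
    w, h = signal.sosfreqz(sos, worN=4096, fs=fs)
    phase = np.unwrap(np.angle(h))
    # Group delay in samples: -d(phase)/d(omega), omega in rad/sample.
    gd = -np.diff(phase) / np.diff(w * 2 * np.pi / fs)
    in_band = gd[w[:-1] < 40]                   # delay across the passband
    print(name, in_band.max() - in_band.min())  # Bessel varies far less
```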

------
a-dub
It seems a little tongue-in-cheek to me. Sampling with no low-pass
antialiasing filter in general seems like a different problem from sampling
with no low-pass antialiasing filter where you _know_ that the only spectral
content above the Nyquist cutoff is Gaussian noise.

That said, his takedown of their approach makes sense to me. It's hard to
really judge it, though, as my signal processing is getting _really_ rusty
these days. I guess that's why we have professors. :)

------
RantyDave
But ... they only want signals up to 5 kHz, which is bugger all. So you could
oversample (even a stock audio chip will do 96 kHz) and then apply a low-pass
FIR filter. Why wouldn't this work?

------
jangeco
There is so much useless discussion here. What the blogger takes issue with
(and refutes convincingly) is the claim that ALIASED THERMAL NOISE can be
RECONSTRUCTED. It clearly cannot (if you think you can, this is your chance to
make $1000). It doesn't matter what rate you sample at, nor is there any need
to invoke compressed sensing; what the authors of the original papers do has
nothing to do with it.

~~~
mameister4
couldn't have said it better myself (I'm the blogger)

------
ineedasername
Tl;dr: the authors of the paper claim their digital signal processing method
beats the limits imposed by sampling theory; in practice, their method works
only on examples where there is a high level of a priori knowledge of the
signal, i.e., not in any practical environment.

~~~
jey
Close, but not quite: compressed sensing really works in a lot of domains.
You "just" have to know a basis set in which your signal is sparse. This basis
set can be (and often is) overcomplete; i.e., you don't need a minimal
orthogonal set of basis functions. So if you know that "most pixels of an MRI
are black" or "music has only a few non-zero Fourier frequencies", you can
apply compressed-sensing techniques to recover/reconstruct the underlying
signal. This is a fairly mild and merely structural form of a priori side
information, as opposed to having to know detailed, precise prior
distributions.

See
[https://en.wikipedia.org/wiki/Sparse_dictionary_learning](https://en.wikipedia.org/wiki/Sparse_dictionary_learning)
and [https://arxiv.org/abs/1012.0621](https://arxiv.org/abs/1012.0621) for the
gory details.
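
A minimal sparse-recovery sketch in that spirit (the sizes and the LASSO
penalty are illustrative assumptions; this is the classic L1 formulation, not
the overcomplete-dictionary machinery of the links above):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, m, k = 200, 60, 5                    # signal length, measurements, sparsity

x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.normal(size=k)  # sparse ground truth

A = rng.normal(size=(m, n)) / np.sqrt(m)   # random measurement matrix
y = A @ x                                  # only 60 measurements of 200 unknowns

# The L1 penalty promotes sparsity; alpha is a tuning knob, not a magic value.
lasso = Lasso(alpha=1e-3, fit_intercept=False, max_iter=50_000)
lasso.fit(A, y)
print(np.abs(lasso.coef_ - x).max())    # small: near-exact recovery,
                                        # residual bias scales with alpha
```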

~~~
highd
Most of the research in compressed-sensing-type applications now focuses on
more sophisticated prior distributions than sparse/Laplacian. These more
general Bayesian approaches provide significant performance improvements over
LASSO etc.

~~~
jey
Sweet, great to hear. Any particular papers you'd recommend checking out?

------
bcaa7f3a8bbc
Death of the clickbait?

------
Ono-Sendai
There's no problem with doing anti-aliasing digitally later based on samples
you have collected. You don't need some kind of physical analog device to do
it.

~~~
ckocagil
Only if you oversample to cover the entire noise frequency range.

~~~
analog31
Indeed, and digital audio systems typically combine some sort of oversampling
with digital filtering. They still need an anti-aliasing filter, but the
filter works at a higher cutoff frequency and can be much simpler.

