Hacker News
Death of the sampling theorem? (markusmeister.com)
217 points by gballan on Mar 21, 2018 | 101 comments

I don't recall ever reading of a rejecting peer reviewer for one journal calling out the same work after it was accepted in another ("peer reviewed") journal. Before the web, I guess there wasn't much for a reviewer to do other than complain to their peers. Now the low-quality journal is called out in public, and a rejecting peer reviewer calls out the authors for ignoring direct contradictory evidence against their work. The threat of breaking peer review anonymity is now a force to be reckoned with.

I do not see how this article poses a threat of breaking peer review anonymity. The only anonymity broken here is the author's own, only in this particular case, and after the fact. What is the purpose of reviewer anonymity? It protects the reviewer from being pressured into accepting a paper (and protects science from that happening), which is not a risk here. It also prevents reviews from turning into long-running disputes about the validity of a rejection, but in this case the issue is that the authors willfully ignored objective, valid and significant objections to their work.

People often write "Comments" in the same journal that address a work's shortcomings. This can be the referees who are ultimately over-ruled by an editor, or any other interested reader.

Non-AMP version, which works with JS disabled:


And here's JS-free rendering of the notebook:


Thx. You are not the hero we need, but the hero we deserve.

I find it funny how the Nyquist theorem is often seen as a highly theoretical rule with a complex technical background. But if you think about it, it's super straightforward. You just need to sample each up- and down-deflection of a wave at least once.

Trying to "beat" this rule is like trying to beat the Pythagorean theorem. If you want to derive higher frequencies, you'll quite obviously have to make additional assumptions about the measured wave.

Another very simple way to look at this is with the pigeonhole principle. If I have (for simplicity) 4 signed 8-bit numbers, there are only 2^32 possible digital outputs I can represent. However, in the presence of aliasing, there's a theoretically infinite number of possible analog signals that could have produced those numbers, and no way from the numbers alone to distinguish them. Therefore, you need additional data to distinguish the original signal. In the analog domain, the most practical choice is to pre-filter the signal so that you know the sampling is adequate in the frequency range you are sampling. With that additional constraint you can take those numbers and reconstruct the original signal within the parameters of signal processing theory.

This point of view also has some advantages in that if you think about it, you can see how you might play some games to work your way around things, many of which are used in the real world. For instance, if I'm sampling a 20 kHz range, you could sample 0-10 kHz with one signal, and then have something that downshifts 40-50 kHz into the 10-20 kHz range, and get a funky sampling of multiple bands of the spectrum. But no matter what silly buggers you play on the analog or the digital side, you can't escape from the pigeonhole principle.

From here we get the additional mathematical resonance that this sounds an awful lot like the proof that there can be no compression algorithm that compresses all inputs. And there is indeed a similarity; if we had a method for taking a digital signal that could have been 500Hz or 50500Hz despite an identical aliased signal, we could use that as a channel for storing bits in above and beyond what the raw digital signal contains; if we figure out it's the 500Hz signal it's an extra 0, or if it's 50500Hz it's a 1. With higher harmonics we could get even more bits. They don't claim something quite so binary, they've got more of a probabilistic claim, but that just means they're getting fractional extra bits instead of entire extra bits; the fundamental problem is the same. It doesn't matter how many bits you pull from nowhere; anything > 0.0000... is not valid.
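To make the 500 Hz versus 50500 Hz ambiguity concrete, here is a minimal sketch. The 50 kHz sampling rate is my assumption (it is the rate at which those two tones alias onto each other); the thread does not state one.

```python
import numpy as np

fs = 50_000.0          # assumed sampling rate in Hz (not stated in the thread)
n = np.arange(64)      # sample indices
t = n / fs             # sample instants in seconds

low = np.sin(2 * np.pi * 500.0 * t)       # 500 Hz tone
high = np.sin(2 * np.pi * 50_500.0 * t)   # 50500 Hz tone, i.e. 500 Hz + fs

# The two tones agree at every sample instant (up to rounding error),
# so the samples alone cannot tell them apart.
max_difference = np.max(np.abs(low - high))
print(max_difference)
```

Any scheme that claimed to pick the right tone from these samples alone would be pulling an extra bit out of identical data, which is the pigeonhole problem described above.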

Of course, one of the things we know from the internet is that there is still a set of people who don't accept the pigeonhole principle, despite it literally being just about the simplest possible mathematical claim I can possibly imagine (in the most degenerate case, "if you have two things and one box, it is not possible to put both of the things in a box without the box having more than one thing in it").

When dealing with bits, the situation is different since algebraic degrees of freedom (dimensions or coefficients of sinusoids) are different than information degrees of freedom (bits). This difference in the context of the sampling theorem is explored in https://arxiv.org/abs/1601.06421, where it is shown that sampling below the Nyquist rate (without additional loss) is possible when the samples must be quantized to satisfy some bit (information) constraint.

Not even close. Nyquist requires sampling at least twice the /bandwidth/ of the signal, not necessarily twice the highest frequency, because of aliasing. For example, a signal that’s 1 megahertz +/- 1 kHz requires only 4 kHz sampling to capture the detail.

Aliasing is always a factor because no real signal has a highest frequency nor a fixed bandwidth (noise is never zero, and all filters roll off gradually forever).
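For the 1 MHz +/- 1 kHz example, the usable rates can be enumerated with the commonly quoted bandpass-sampling condition 2*f_high/n <= fs <= 2*f_low/(n-1). This is a sketch under the assumption of a real-valued signal strictly confined to that band and uniform sampling; the function name is made up for illustration.

```python
import math

def valid_bandpass_rates(f_low, f_high):
    """Return (n, fs_min, fs_max) ranges of alias-free uniform sampling rates
    for a real signal confined to [f_low, f_high] Hz (hypothetical helper)."""
    bandwidth = f_high - f_low
    ranges = []
    for n in range(1, math.floor(f_high / bandwidth) + 1):
        fs_min = 2.0 * f_high / n
        fs_max = 2.0 * f_low / (n - 1) if n > 1 else math.inf
        if fs_min <= fs_max:
            ranges.append((n, fs_min, fs_max))
    return ranges

ranges = valid_bandpass_rates(999_000.0, 1_001_000.0)
n, lowest_fs_min, lowest_fs_max = ranges[-1]   # largest n gives the lowest rates
print(n, lowest_fs_min, lowest_fs_max)
```

Under these assumptions the lowest usable window sits just above 4 kHz, so the order of magnitude in the example holds, but the exact rate depends on where the band sits relative to multiples of fs/2.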

I think it's a bit harsh to say not even close. I think it captures the idea pretty well.

The reason it's not perfect is because of your example, where a signal is not baseband. The extra leap required to understand that is amplitude modulation and demodulation.

Notice that to reconstruct the original signal of your example, you need to know the samples which are collected following the hypothesis of the sampling theorem, and you ALSO need to know the magic frequency 1 MHz so that you can shift up your 2 kHz bandwidth. That's the same setting as modulation.

The only concept missing, then, is recognizing that sampling can perform demodulation.

Demodulation via sampling isn't a weird side case; it's at the core of understanding Nyquist, and it doesn't line up with the intuition that you just need to "sample each up- and down-deflection of a wave at least once".

I disagree with you. To be clear, "sampling each up and down deflection" is exactly the right idea in the case that you have no other information besides the samples (and besides knowing the hypothesis of the sampling theorem is satisfied). To use the more general version of the sampling theorem, you in addition need to know the center frequency (1 MHz in your example), otherwise you cannot reconstruct the signal. So already the setting is slightly different. You need an additional assumption.

Take a look at Wikipedia: https://en.m.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_samp... it suggests that Shannon himself considered your case an additional point on top of the sampling theorem.

All real signals have limited bandwidth, including noise.

Assuming there's no external source of radiation, I can absolutely guarantee there is no energy propagating at cosmic ray frequencies in a circuit built around an audio op-amp with standard off-the-shelf components.

No lowpass filter or circuit attenuates any frequency to zero, including your example. The attenuation will be 10^(-huge number), but it won't be zero.

This isn't just pedantry; you really can't just sample at 2x the corner frequency of your circuit.

(Unless there's some quantum effect that gives a minimum energy level possible for a signal? But even then it would be probabilistic? Is this what you mean? I didn't pay close enough attention in physics class.)

It's not straightforward. Your intuition only works for periodic signals. Nyquist applies to all L2 functions.

Why L2? So that the Fourier integral is guaranteed to converge. But now you know your signal is the sum of periodic signals ;)

Non-periodic signals can be thought of as a Fourier series of periodic sine waves though.

Every signal can be represented as a finite or infinite series of sinusoids.

But intuition is not always correct.

It's theoretically plausible that the noise has some structure, and with some Bayesian priors, one can classify some of the variation as "probably noise" vs "probably signal". Of course it works best if you know what the signal is before you start, but if that's the case why are you trying to experimentally construct it? :-)

This is a common subtle problem with statistical denoising (aka pattern recognition, aka lossy compression). Commonly the source data and noise models are synthetic formulas (a simple sine wave or a Gaussian), and knowledge of the data-generation procedure can infect the supposed algorithm (a subtle form of training your model on your test dataset).

> but if that's the case why are you trying to experimentally construct it?

People are able to interpret speech under very high levels of noise due to having strong priors / being able to guess well / having shared state with the speaker.

Of course you should take advantage of whatever priors you have about the signal; we know rather trivially from information science that information gain / transmission rate is maximized when the listener knows as much as possible about the source. This is also key for compressed sensing.

You see analogues of this pattern everywhere in the world: e.g. people learn most quickly when they have some context about the problem space already.

The real question is why the sender would produce a signal with any level of predictability in it, rather than a signal that purely maximizes the effective bit rate of transmission (e.g. minimize redundancy modulo noise). But it's easy to see that physical constraints (e.g. how the human voice works) are at least partly responsible for the enforcement of regularity, at least for many signals that we're interested in in practice.

"Strong Bayesian priors" is just another way of saying "high-redundancy signal".

Yes, and also sometimes the redundancy is structurally unavoidable (e.g. if I make a public statement, inevitably some people will have more context about my mental state than others and understand my message better).

English, like all other natural languages, is very redundant. We can't change it to remove all (unnecessary) redundancy, since a language needs to be both speakable and understandable by most possible pairs of communicators, and we are constrained by pesky little facts like the physiology of our mouths and vocal cords.

So in a very real sense, you can probably beat the sampling theorem in practice for most real-life signals that we care about, as long as you have the priors and the necessary compute to apply them.

But that's not really beating the sampling theorem.

Sampling is a self-contained symbol-transmission technique that is absolutely signal-agnostic. You can throw anything at it, including band-limited noise with no redundancy. As long as there's nothing in the signal above N/2, it works.

As soon as you include redundancy and priors, you're solving a different problem. Your channel is no longer signal-agnostic, and you can use known information external to the signal to reconstruct it. The advantage is you can use less data, but the disadvantage is that your reconstruction system has to make strong assumptions about the nature of the signal.

If those assumptions are incorrect, reconstruction fails.

Technically you could argue that the N/2 limit is a form of prior, and sampling is a special instance of a more general theory of channel transmission systems where assumptions are made.

The practical difference is that N/2 filtering followed by sampling at N is relatively trivial with a usefully general result. More complex systems can be more powerful for specific applications, but are more brittle and can be harder to construct.

It depends on what you mean by 'beating' the theorem, doesn't it?

If you mean that the theorem is actually wrong, then I agree— the proof works and you can't actually avoid its conclusions given its assumptions.

But I think we can agree that for many (most?) practical symbol-systems in their particular contexts, the signals are actually highly redundant as viewed against a particular basis, so slavishly applying the conclusions of the sampling theorem will cause you to miss the possibility of side-stepping its assumptions entirely.

Whether you're really in a position to take advantage of the latent priors in the signal very much depends on the tools at your disposal. Obviously you will not be extracting speech with smart priors if all you have is analog filters.

> People are able to interpret speech under very high levels of noise due to having strong priors / being able to guess well / having shared state with the speaker.

In the information theoretic sense, this just means there is enough redundancy in the signal that you can error correct the noise. If you’re in a noisy room talking to a coworker, your shared context and priors are really redundant bits of information. If she says something about “nachos” but you couldn’t tell if she said “nachos”, “tacos” or “broncos”, the nachos next to you add redundancy to the signal.

So in that context, what you’re saying is that we should take advantage of any redundancy we can find in an incoming signal.

The problem is there is no real way to estimate the redundancy of a real signal of a specific form or, more importantly, to discern between a likely alias and a likely true signal given a signal with unknown statistics. The statistical approach might work for simple clean signals but makes critical mistakes in the case of complex ones. Specifically you get to estimate true phase and true magnitude in a given subband...

Yes, agreed. My point was to basically say that priors (as a form of redundancy) do not undercut the sampling theorem.

Or, in the time domain, if you talked a lot about nachos recently.

That's not unlike what you might do to recover a high frequency signal by using a band-pass filter and then sampling that (which is typically done by a mixer in a traditional analog system). If you know what the alias is you can reconstruct your signal. The point though is that you can't sample signals of a higher bandwidth without aliasing. If you know in advance the only alias is at a given frequency then you don't need to filter. The purpose of the filter is to ensure that the bandwidth of the sampled signal is correct given that you know nothing about the frequency content of the original signal. The sampling theorem is pretty specific and its proof is solid. So while it might be true you could use a single pixel to tell the difference between an apple and a banana, without filtering the image before sampling, it doesn't really relate to the sampling theorem at that point.

This is briefly discussed in the article.

“For example we know that the neural recordings of interest are a superposition of pulse-like events – the action potentials – whose pulse shapes are described by just a few parameters. Under that statistical model for the signal one can develop algorithms that optimally estimate the spike times from noisy samples. This is a lively area of investigation.”

You can try to estimate the interference at 50Hz or 60Hz from the electric current from the wall. It's usually better to add some additional ground cables and shielding to reduce it as much as possible. But if everything fails, it's a nice stable signal that is easy to subtract. [Protip: If your "signal" is too nice, check that the frequency is not 50Hz or 60Hz.]
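As a rough sketch of what "easy to subtract" can look like: fit a sine and cosine at the mains frequency by least squares and remove the fitted component. The sampling rate, amplitudes and the random stand-in for the measurement are all made up for illustration.

```python
import numpy as np

fs = 1_000.0                      # assumed sampling rate in Hz
t = np.arange(2_000) / fs         # two seconds of samples
mains_hz = 50.0                   # use 60.0 where the grid runs at 60 Hz

rng = np.random.default_rng(0)
signal = 0.1 * rng.standard_normal(t.size)            # stand-in for the real measurement
hum = 0.8 * np.sin(2 * np.pi * mains_hz * t + 0.6)    # synthetic mains interference
recorded = signal + hum

# Least-squares fit of a*sin + b*cos at the mains frequency, then subtract it.
basis = np.column_stack([np.sin(2 * np.pi * mains_hz * t),
                         np.cos(2 * np.pi * mains_hz * t)])
coeffs, *_ = np.linalg.lstsq(basis, recorded, rcond=None)
cleaned = recorded - basis @ coeffs

print(np.std(recorded), np.std(cleaned))
```

This only works because the interference is a stable narrowband tone with two unknowns (amplitude and phase). Broadband thermal noise has no such compact description.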

But the article is about an article that tries to calculate and subtract the thermal noise, which is not nice at all.

As you say, filtering mains pollution is completely orthogonal to the stated problem. You would need to do that even if you used an anti-aliasing filter, since these are typically low-pass filters, not band-pass.

While it might seem arrogant of the author to take this stance, the Sampling Theorem generalizes extremely well and is considered an extremely stable result. My favorite generalized Sampling Theorem is https://arxiv.org/abs/1405.0324

It's hardly arrogant to vote in favor of a 100-year-old mathematical proof over a new supposed refutation.

It is an argument from authority. That is asking a lot from the reader, i.e. to suspend disbelief or go research, the latter of which is not too much to ask if directed at a suitable audience.

"Arrogance" literally is akin to "to ask", e.g. interrogate ... unless the "a-" is akin to "anti" instead of "ad-" or whatever, so that arrogance would be like waltzing in and taking without asking or making an assumption without due inquiry.

That's an extremely interesting paper. Do you have any other references if I were to read more about applications of Sheaf theory to signal processing or relevant fields like information theory, control theory, machine learning?

About signal processing, AFAIK it's only this particular author, Michael Robinson https://scholar.google.com/citations?user=WxsA8yEAAAAJ&hl=en He's also got a short book, Topological Signal Processing.

About all the other things, you want to follow the field of Applied Algebraic Topology: people such as Robert Ghrist, Shmuel Weinberger, Herbert Edelsbrunner, Gunnar Carlsson, and Justin Curry, whose thesis was on applied sheaf theory (it's nice and readable). Mostly follow the collaborations they take part in, since they are in good places in the citation graph (OTOH Carlsson's Ayasdi Inc. is sadly very much vapourware). Also, in over a decade of very active research in this area progress is being made mostly on the applicable algebraic topology part, not necessarily the applied part.

> Also, in over a decade of very active research in this area progress is being made mostly on the applicable algebraic topology part, not necessarily the applied part.

I'm not entirely sure what this means in terms of applied mathematics. Does it mean applied mathematicians produce these papers but applied fields like EECS, CS, biology etc never use them?

I mean the field's notions are being customized, but so far the language hasn't been shown to be all that insightful in numerous engineering fields probed. Algebraic topologists seem happy trying to be vaguely relevant though.

Usually claims like this fall into two buckets: (1) crazy, (2) existing idea spun as something new. (Like airless tires, which have been around for a hundred years.)

This is quite clearly (2).

> Effectively they can replace the anti-aliasing hardware with software that operates on the digital side after the sampling step

Sampling with filtering in the digital domain after sampling is absolutely old hat, found in everyday tech.

Let's look at audio. We can sample audio at 48 kHz, and use a very complicated analog "brick wall" filter at 20 kHz.

There is another way: use a less aggressive, simple filter, which lets through plenty of ultrasonic material past 20 kHz. Sample at a much higher frequency, say 384 kHz or whatever. Then reduce the resolution of the resulting data digitally to 48 kHz. There we go: digital side after sampling step.

This is cheaper and better than building an analog filter which requires multiple stages/poles that have to be carefully tuned, requiring precision components.
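A minimal sketch of that oversample-then-decimate chain, using only NumPy. The test tones, the 255-tap windowed-sinc filter and the Hamming window are arbitrary choices for illustration, not what any particular audio chip does.

```python
import numpy as np

fs_high = 384_000          # oversampled rate from the comment, in Hz
factor = 8                 # 384 kHz / 8 = 48 kHz
t = np.arange(fs_high // 10) / fs_high           # 100 ms of samples

audible = np.sin(2 * np.pi * 1_000 * t)          # 1 kHz tone we want to keep
ultrasonic = np.sin(2 * np.pi * 45_000 * t)      # would alias to 3 kHz at 48 kHz
x = audible + ultrasonic

# Windowed-sinc FIR low-pass with a 20 kHz cutoff (tap count is an arbitrary choice).
taps = 255
k = np.arange(taps) - (taps - 1) / 2
cutoff = 20_000 / fs_high                        # cutoff in cycles per sample
h = 2 * cutoff * np.sinc(2 * cutoff * k) * np.hamming(taps)
h /= h.sum()                                     # unity gain at DC

filtered = np.convolve(x, h, mode="same")        # centered, so no net delay
decimated = filtered[::factor]                   # now at 48 kHz
naive = x[::factor]                              # no digital filter: ultrasonic tone aliases
clean = audible[::factor]                        # reference: the 1 kHz tone at 48 kHz

print(np.max(np.abs(decimated[32:-32] - clean[32:-32])), np.max(np.abs(naive - clean)))
```

Note that the sampler itself still runs above twice the highest frequency present, so the sampling theorem is being obeyed throughout. The digital filter only relaxes the requirements on the analog one.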

First sentence from the original article: “A team from Columbia University led by Ken Shepard and Rafa Yuste claims to beat the 100 year old Sampling Theorem”

> (2) existing idea spun as something new

The existing idea is oversampling and is often spun as beating the sampling theorem. But this is unlike the airless tires example, because, while oversampling works, it isn't beating the sampling theorem but rather based on it.

You can filter after sampling, but that filter is not a replacement for the analog anti-aliasing filter. To avoid aliasing you have to somehow limit the bandwidth of your input.

Right; it's not a replacement. But you can then pretend (due to naivete or dishonesty) that the analog one doesn't exist and claim you've beaten the sampling theorem. Why: because the input stage before the sampler naturally has a limited frequency response and you happen to sample well above that. Thus a filter is de facto there but wasn't formally designed in.

Yes (though it doesn't seem they're increasing the sample frequency much - but still above the limit)

It seems because the signal and noise characteristics are known (in the original article), the noise can just be "averaged away"

Just a consideration (I am a mathematician, but have nothing to do with signal processing or its mathematics; so don't overestimate my knowledge in this area):

Let us consider a periodic signal (though, I believe, this assumption can be weakened). If we have to assume that all frequencies from the inverse of the period length up to the Nyquist frequency might occur in the signal, the sampling theorem gives us the information that we need.

Now some "practical" consideration: Assume that the signal that we want to sample is very "well-behaved", e.g. we know that lots of its low frequencies are 0 or near 0 (or in general have some other equation that tells us "what the signal has to look like"). So if we reconstruct the frequencies of the signal, but in these Fourier coefficients some other value than 0 appears, we know that it has to come from "noise" that originates in some higher frequency above the Nyquist frequency.

My mathematical intuition tells me that it is plausible that such a trick might be used to reconstruct the signal exactly even if we sample with less than 2 times the highest occurring frequency. Why? Because we know more about the signal than what the sampling theorem assumes. So a "wrong" reconstruction, whose existence the sampling theorem guarantees when a frequency higher than the Nyquist frequency occurs, is not important for us, since we can plausibly show that this signal cannot have 0s in the "excluded low frequencies" Fourier coefficients.

This would explain to me why such an impossible looking algorithm works so well in practice.

It is very plausible to me that such a trick is already used somewhere in engineering.

This is called compressed sensing.


Compressed Sensing originated in math, you might be interested in the original paper: https://statweb.stanford.edu/~donoho/Reports/2004/Compressed...

With the rise of "big data", and "machine learning", we should painfully expect shit-science to become the norm (if it's not already). If you ever read a research paper somewhere in the bio-something space, you'll share the sentiment ("double-blinded this... double-blinded that... our unpublished data analysis pipeline... 96% accuracy")

I particularly enjoy the re-discovery of numerical integration in 1994 by a medical researcher: https://fliptomato.wordpress.com/2007/03/19/medical-research...

This article reminds me that I am on the low end of intelligence for the average Hacker News reader.

I'd take that with context. Everything in this article should be understood by a 3rd-year electrical/computer engineering student as standard knowledge. Because I'm a computer engineering graduate, I can understand it. However, give me an article with '3rd year standard knowledge' from any other engineering discipline and I'll have no idea what's happening.

One of the lessons college taught me is that I can learn anything, but I can't learn everything.

One of the lessons many engineers who pass the PE exam learn is, "not with that attitude you can't".

No you aren’t. (Well, you could be, but not because you can’t follow this). On any given article with a lot of technical content, it’s reasonably safe to assume that it’s on the front page of HN because a lot of people are enthusiastic about the subject or interested in it. Only a small number of people actually follow what’s going on for most of these topics.

I can follow this because I have a signal processing & controls background. I feel the same as you when I see some crazy reverse engineering feats or security breaches explained. They might be simple if you have the background, but it sure looks difficult.

This isn't about the sampling theorem, it's a misapplication of compressed sensing.

Similar to the conservation of energy in physics, you can ask yourself "is there enough space for the information?". If somebody tries to pull two signals out of one signal, they had better not have too much entropy each. For a given bandwidth (number of independent samples), a given signal energy, and a given noise energy, there is a hard limit to the amount of information you can pull out.
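I read that hard limit as the Shannon-Hartley capacity of an additive white Gaussian noise channel, C = B * log2(1 + S/N); the comment does not name it, so treat that identification as my assumption. A tiny sketch with made-up numbers:

```python
import math

def shannon_capacity(bandwidth_hz, signal_power, noise_power):
    """Upper bound on error-free bits per second over an AWGN channel."""
    return bandwidth_hz * math.log2(1.0 + signal_power / noise_power)

# 5 kHz of bandwidth at 20 dB SNR (illustrative numbers, not from the paper).
capacity = shannon_capacity(5_000.0, 100.0, 1.0)
print(capacity)
```

Aliasing more noise into the band raises the noise power, which can only lower this bound. No post-processing gets the lost capacity back.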

If your signals are not "full entropy" or "white noise", you can do all sorts of funny tricks to increase the SNR to some extent. These tricks sure are nice but they veil the true issue at hand which is about information and capacity.

I love this post for two reasons. First, it's a great reminder to question assumptions about high abstraction mathematical models that try to say something about reality, and to not take what a community of even scientists says as stable truth. Ask a room full of signal processing engineers if it's possible to do better than the sampling theorem, and they'll confidently, with bravado, tell you that such an idea is ridiculous. Second, the end is a great example of a "proof by incentive" argument, which is one of my favourite ways to produce trust in a theorem.

Proof by incentive is a terrible argument. There are a ton of cranks out there offering money for proof that the Earth is round or whatever, and holding up the lack of takers as proof that whatever nonsense they espouse is correct. All it proves in these cases is that either nobody cares about them, or their standards of proof are crazy.

I think the author here is clearly correct, I just don’t think their $1,000 offer does anything to demonstrate it.

I don't understand. The post is about a paper that was written with weak math and suspect empirical process.

What is the "unstable" truth?

What is the "proof by incentive"?

I read this comment before reading the article and got excited to read an article showing how the "establishment" was wrong. Much to my surprise, the article was saying the exact opposite! I'm still not really sure how to read this comment in a way that makes sense.

I think that’s because the comment is ambiguously distinguishing between academics and engineers or lumping them together.

The comment is digitized brain noise in ASCII form. It just shows that if there's too much unfiltered noise before keyboard sampling, the signal can't be reconstructed.

Have you actually read the article?

so, hold on, he's wanting to charge people $10 (to get a file) to attempt to solve the problem, then give a prize of $1000 to anyone who can do it?



He does provide a Jupyter notebook to try out your preliminary ideas.

From what I've seen people in the DSP community are very anal about the inviolability of the sampling theorem. So charging $10 may be his way to punish novice "fools" who think they know better.

I'm not sure I disagree.

If you understand the sampling theorem, its meaning is self-evident, you just count the degrees of freedom. Ask a mathematician how many samples do they need to recover the coefficients of a polynomial of degree 10. They will say: 11 samples are necessary and sufficient, if you have less than 11 samples then you cannot recover any coefficient. If you have 11 or more samples you can recover all the coefficients.

The sampling theorem is exactly the same thing but for trigonometric polynomials.
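The degree-of-freedom count is easy to check numerically. This sketch uses an arbitrary random degree-10 polynomial and equally spaced sample points, both chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
degree = 10
true_coeffs = rng.standard_normal(degree + 1)      # 11 unknowns, highest power first

# 11 samples: the square Vandermonde system has a unique solution.
x11 = np.linspace(-1.0, 1.0, degree + 1)
y11 = np.polyval(true_coeffs, x11)
recovered = np.linalg.solve(np.vander(x11, degree + 1), y11)

# 10 samples: a different degree-10 polynomial matches every sample.
x10 = np.linspace(-1.0, 1.0, degree)
vanishing = np.poly(x10)                 # monic degree-10 polynomial with roots at x10
impostor = true_coeffs + vanishing
same_samples = np.allclose(np.polyval(true_coeffs, x10), np.polyval(impostor, x10))

print(np.max(np.abs(recovered - true_coeffs)), same_samples)
```

The "impostor" plays the same role as an aliased signal: it agrees with the true one at every sample point and differs everywhere else.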

I don't understand it from this angle, but to me, the general description of what they employ sounds very analogous to how noise cancelling headphones work for pilots. The noise generated by the engines, which normally would drown out any signal of humans conversing through air, is of known frequencies, and a synthesized version of that noise can be played back 180 degrees out of phase along with the original audio, cancelling the noise, revealing the signal. No one is implying that the headphones do this without sampling the engine noise, they might just not have to rely on the actual measurements in question to glean the necessary information to synthesize a signal capable of such cancellation. It reads to me like this broadband thermal noise is predictable, which is why they are able to synthesize it in the first place.

The statement of the sampling theorem has nothing to do with noise. It only states that there is conservation of information.

You misunderstood. I was relaying that I, too, have a decent grasp of sampling theory, and that if I understand the article's main premise, then saying they have disproven Nyquist's Theorem with this example is akin to saying that being able to draw both 30 degree and 90 degree angles with a compass disproves the idea that angle trisection is generally impossible.

They have simply shifted in time the sampling of the noise. Even if they argue they synthesize the noise in realtime, they had to have sampled it at some point to know what to synthesize in the first place. It was at this point that they were required to sample at at least 2MHz to accurately quantify their noise at 1MHz.

The mechanism used in pilots' noise cancelling headphones is much the same. They don't sample the engine noise as it's being made; they synthesize it from known frequencies.

The $10 is a filter to avoid people wasting his time. Anyone with knowledge in signal processing would be able to generate their own test data sets from the description.

The problem is entirely solvable without the file (if there's really a solution, that is). If you believe you have solved the problem, you will surely pay $10 to get the file so you can prove it and get $1000.

I suppose you could say the antialias filter doesn’t reduce the noise, but rather coherently averages the signal to increase its amplitude. If the desired signal is noise-like itself, then I don’t see how the scheme would work.

A simple (low pass) anti-aliasing filter has a linear drop-off, meaning the greater the difference between the frequency of the signal and the frequency of the noise, the better the signal to noise ratio. On the other hand, sampling random gaussian distributed noise will be random gaussian distributed noise no matter what frequency you sample it at.
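The second point can be checked with a few lines: white Gaussian noise keeps its full variance when subsampled without a filter, while even a crude low-pass (here a block average, which is my stand-in for an anti-aliasing filter) removes most of it first. The factor of 16 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(2)
factor = 16                                   # assumed ratio of noise bandwidth to sampled bandwidth
wideband = rng.standard_normal(1 << 18)       # white Gaussian noise, unit variance

# No anti-aliasing filter: keep every 16th sample.
unfiltered = wideband[::factor]

# Crude anti-aliasing filter: average each block of 16 samples, keep one value per block.
filtered = wideband.reshape(-1, factor).mean(axis=1)

print(np.var(unfiltered), np.var(filtered))
```

Without the filter, all the out-of-band noise power folds into the sampled band, and once it is there it is statistically indistinguishable from in-band noise.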


Remember that the lowpass will change the phase or time response of the original signal. If there is anything in the frequencies and phases affected by the lowpass it will get distorted.

The trick works only because the signal is spike-like. I reckon it will miss some of the spikes in real life on real signals, or produce spurious spikes. The statistics are not accurate at any given point in time, only in the limit.

I was going to point this out (phase response distortion) but you beat me to it.

During my university period, I had to digitally sample an ECG (electrocardiogram) signal. Since the shape of the signal is important in diagnosis and must be preserved, I used a Bessel filter [1].

[1] https://en.wikipedia.org/wiki/Bessel_filter

There's no point in saying that a signal is "noise like".

The signal may be "noise like", but it should be a noise that has a distinct bandwidth from the "real noise".

The antialias filter just reduces the amplitude of the data collected (signal plus noise) in the exact frequencies that pertain to the bandwidth of the "real noise".

I.e., you have to guarantee that your desired signal has a different bandwidth than the noise. In that case, thermal noise must have this property in comparison to neuron signals.

PS: English is not my first language. Sorry about any grammar or orthographical errors.

If you can guarantee that, it means you are actually oversampling your input signal. The sampling theorem still holds.

It seems a little tongue in cheek to me. Sampling with no low-pass antialiasing filter in general seems like a different problem from sampling with no low-pass antialiasing filter where you _know_ that the only spectral content above the Nyquist cutoff is Gaussian noise.

That said, his takedown of their approach makes sense to me. It's hard to really judge it though as my signal processing is getting _really_ rusty these days. I guess that's why we have professors. :)

But ... they only want signals up to 5 kHz, which is bugger all. So you could oversample (even a stock audio chip will do 96 kHz) then apply a low-pass FIR filter. Why wouldn't this work?
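For reference, oversample-then-filter looks roughly like the sketch below. Everything numeric here is an assumption chosen for illustration (96 kHz input, decimation by 8, a 101-tap Hamming-windowed sinc, test tones at 2 kHz and 20 kHz), and it only deals with content below the 48 kHz Nyquist frequency of the oversampled stream; whatever lies above that has already aliased at the converter.

```python
import cmath
import math

FS_IN = 96_000            # oversampled rate in Hz (assumed)
FACTOR = 8                # decimation factor, giving a 12 kHz output rate
FS_OUT = FS_IN // FACTOR
CUTOFF = 5_000            # low-pass cutoff in Hz

def lowpass_taps(num_taps, cutoff_hz, fs_hz):
    """Hamming-windowed sinc low-pass FIR, normalised to unity gain at DC."""
    fc = cutoff_hz / fs_hz                      # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [h / total for h in taps]

def filter_and_decimate(x, taps, factor):
    """Evaluate the FIR only at every `factor`-th output (fully overlapped region)."""
    n_taps = len(taps)
    return [sum(h * x[start + n_taps - 1 - i] for i, h in enumerate(taps))
            for start in range(0, len(x) - n_taps + 1, factor)]

def tone_amplitude(y, freq_hz, fs_hz):
    """Amplitude of one frequency component, measured with a single-bin DFT."""
    y = y[:len(y) - len(y) % 12]    # whole number of cycles for the tones used here
    acc = sum(v * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
              for n, v in enumerate(y))
    return 2 * abs(acc) / len(y)

# Wanted 2 kHz tone plus an unwanted 20 kHz tone, which lands on 4 kHz at 12 kHz.
x = [math.sin(2 * math.pi * 2_000 * n / FS_IN) +
     math.sin(2 * math.pi * 20_000 * n / FS_IN) for n in range(9_600)]

naive = x[::FACTOR]                                   # decimate with no filter
clean = filter_and_decimate(x, lowpass_taps(101, CUTOFF, FS_IN), FACTOR)

alias_naive = tone_amplitude(naive, 4_000, FS_OUT)    # full-size alias
alias_clean = tone_amplitude(clean, 4_000, FS_OUT)    # alias largely removed
wanted_clean = tone_amplitude(clean, 2_000, FS_OUT)   # wanted tone preserved
print(alias_naive, alias_clean, wanted_clean)
```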

there is so much useless discussion here. what the blogger takes issue with (and refutes convincingly) is that ALIASED THERMAL NOISE can be RECONSTRUCTED. it clearly cannot (if you think you can, this is your chance to make $1000). it doesn't matter what rate you sample at, nor is there any need to invoke compressed sensing - what the authors of the original papers do has nothing to do with it.

couldn't have said it better myself (I'm the blogger)

Tldr: the authors of the paper claim their digital signal processing method is superior to the methods imposed by sampling theory; in practice, their method works only on examples where there is a high level of a priori knowledge of the signal, i.e., not in any practical environment.

Close, but not quite: compressed sensing really works in a lot of domains. You "just" have to know of a basis set in which your signal is sparse. This basis set can be (and often is) overcomplete; i.e. you don't need a minimal orthogonal set of basis functions. So if you know that "most pixels of an MRI are black" or "music only has a few non-zero Fourier frequencies", you can apply compressed sensing techniques to recover/reconstruct the underlying signal. This is a fairly mild and merely structural form of a priori side information, as opposed to having to know detailed, precise prior distributions and the like.

See https://en.wikipedia.org/wiki/Sparse_dictionary_learning and https://arxiv.org/abs/1012.0621 for the gory details.

Most of the research in compressed-sensing-type applications now is focused on more sophisticated prior distributions than sparse/Laplacian. These more general Bayesian approaches provide significant performance improvements over LASSO etc.

Sweet, great to hear. Any particular papers you'd recommend checking out?

Death of the clickbait?

There's no problem with doing anti-aliasing digitally later based on samples you have collected. You don't need some kind of physical analog device to do it.

Only if you oversample to cover the entire noise frequency range.

Indeed, and digital audio systems typically combine some sort of oversampling with digital filtering. They still need an anti-aliasing filter, but the filter works at a higher cutoff frequency and can be much simpler.

That is clearly not the case, as the article repeatedly shows. By the time you have a digital signal, it’s too late. Any noise above the Nyquist frequency has already been mixed into the noise below it and they cannot be separated.
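A small illustration of why it is too late, using assumed numbers (a 10 kHz sampling rate and tones at 3 kHz and 7 kHz): once sampled, a cosine above the Nyquist frequency produces exactly the same numbers as one below it, so no digital filter can tell them apart.

```python
import math

FS = 10_000          # sampling rate in Hz (assumed); Nyquist frequency is FS / 2

def sample_cosine(freq_hz, num_samples, fs_hz=FS):
    """Sample cos(2*pi*f*t) at t = n / fs for n = 0 .. num_samples - 1."""
    return [math.cos(2 * math.pi * freq_hz * n / fs_hz) for n in range(num_samples)]

below = sample_cosine(3_000, 50)     # below Nyquist
above = sample_cosine(7_000, 50)     # above Nyquist, folds back onto 3 kHz

max_diff = max(abs(a - b) for a, b in zip(below, above))
print(f"largest difference between the two sample sets: {max_diff:.2e}")
```

The difference is at the level of floating-point rounding. The same folding applies to every noise component above FS / 2, which is why it ends up added to the in-band noise.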

Ok let me describe it in a bit more detail.

This is what we do in high-quality computer graphics rendering.

Suppose you want to reconstruct a signal, showing only the frequencies < N Hz.

With an analog filter, you filter out everything above N Hz. Then you can sample at the Nyquist rate of 2N Hz, and then perfectly reconstruct the filtered signal.

Alternatively, without the analog filter, you can take some number of samples M >= 2N. The samples are randomly placed/timed. You accumulate the samples at the 2N locations, but you weight the sample when you accumulate it with some filter function. This effectively convolves and filters the signal with a low-pass filter. Because of the random sample placement/timing, high frequencies get converted to noise instead of causing aliasing.

Admittedly you are taking more than 2N samples; however, you don't have to do any analog filtering, and you can still reconstruct the bandlimited signal. Your reconstruction will have some noise, but the noise will decrease as you take more samples (i.e. as M increases).
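Here is a rough sketch of that scheme, under assumptions that are mine rather than the parent's: a signal that is periodic on [0, 1), `P` output locations standing in for the 2N locations, a tent (triangle) weighting function, and uniformly random sample times. It compares the weighted estimate with the exactly tent-filtered signal for a small and a large number of samples M.

```python
import math
import random

P = 8                                   # output locations on [0, 1) (assumed)
COMPONENTS = [(1.0, 1), (0.5, 9)]       # (amplitude, cycles per unit time)

def signal(t):
    return sum(a * math.sin(2 * math.pi * f * t) for a, f in COMPONENTS)

def tent_weight(t, centre):
    """Triangle filter of half-width 1/P, using wrap-around distance."""
    d = abs(t - centre)
    d = min(d, 1.0 - d)
    return max(0.0, 1.0 - d * P)

def stochastic_estimate(num_samples, rng):
    """Accumulate randomly timed samples into the P outputs with filter weights."""
    centres = [(j + 0.5) / P for j in range(P)]
    num, den = [0.0] * P, [0.0] * P
    for _ in range(num_samples):
        t = rng.random()
        s = signal(t)
        for j, c in enumerate(centres):
            w = tent_weight(t, c)
            num[j] += w * s
            den[j] += w
    return [n / d for n, d in zip(num, den)]

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def filtered_reference():
    """Exact tent-filtered signal: each sinusoid is scaled by sinc(f / P) ** 2."""
    return [sum(a * sinc(f / P) ** 2 * math.sin(2 * math.pi * f * (j + 0.5) / P)
                for a, f in COMPONENTS) for j in range(P)]

def rms_error(est, ref):
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))

ref = filtered_reference()
err_few = rms_error(stochastic_estimate(500, random.Random(1)), ref)
err_many = rms_error(stochastic_estimate(20_000, random.Random(2)), ref)
print(f"RMS error with M=500:   {err_few:.4f}")
print(f"RMS error with M=20000: {err_many:.4f}")
```

The 9-cycle component is above what 8 regularly spaced samples could represent, yet it shows up as noise around the filtered result rather than as a coherent low-frequency alias. The price is the one stated above: M has to be much larger than the number of outputs.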

I think you’re thinking of stochastic sampling. This still requires additional samples in the time domain and only works for signals that are relatively static over that increased number of samples.

If you have enough samples, then of course you can do it. The article is all about someone claiming they can do it without taking more samples. Given that was the topic at hand, I kind of assumed that was what you meant.

Ok. I was mostly pointing out that sentences from the original article like

"That filtering process, because it happens in the analog world, requires real analog hardware"

are wrong or misleading.

I’m pretty sure “that filtering process” refers to removing frequency components beyond what your sampling rate can support.

My spidey sense tingled on the sentence just previous:

"Filtering means removing the high frequency components from the signal"

Sure, it can mean that. To claim it always and only means that, even in the context of periodic waveforms, makes me question the author's understanding of the subject.

I think the problem is that it doesn’t make sense to digitally filter frequencies higher than 10kHz from a signal that was sampled at 10kHz.

According to the sampling theorem, shouldn't it rather be: "it doesn’t make sense to digitally filter frequencies higher than 5kHz from a signal that was sampled at 10kHz"?

Yes, weakened it a little to make the statement more obvious.
