
24/192 Music Downloads Are Very Silly Indeed (2012) - Ivoah
https://xiph.org/~xiphmont/demo/neil-young.html
======
Mediterraneo10
Not entirely silly. Yes, the purported benefits of this high-fidelity audio
are imaginary or even include undesirable traits. However, when most
"remasters" of pop music today involve the dynamics being boosted to "loudness
wars" standards for a target audience of people listening to the music through
earbuds in the street, the 24/192 downloads or SACD releases are often the
only way to hear the album with real dynamic range.

I’m not sure how big that market is these days, though. I recently decided to
move from a block in the city to a house in the surrounding countryside purely
to get a quieter listening environment that lets me really enjoy the
high-dynamic-range recordings I have – I listen to a lot of classical music,
especially the avant-garde with its even greater range, e.g. the Ligeti cello
concerto starting from pppppp. Yet even among my friends who are really
obsessed with such music, seeking a better listening environment seemed an
extreme measure to take.

So, people who a generation ago would have invested in higher-end equipment
(not audiophile snake oil, just better speakers) and who would have sought
silence are now giving in to listening to music on their phones or cheap
computer speakers. It’s a big shift.

~~~
5zBFyURxgY
Still very silly IMAO. Very, very few recordings are done in 24/192 because of
the many implications this has for your entire studio setup.

You'll need an exceptionally good clock to start with, and all other equipment
needs to align to that clock. Then every plugin and processor you use needs to
be in the same 24/192 domain; otherwise your signal is reduced to the limit of
that plugin/processor and all previous efforts are lost.

Most music producers use samples, and most of those are 16/44, so what's the
point of trying to get that to 24/192 and filling the signal with zeros?

If, on a very rare occasion, a piece of music is truly 24/192, then the
listener who downloaded the track still needs an exceptionally good clock (both
expensive and hard to find) for playback without signal degradation.

IMAO 24/192 is just a marketing thing for audiophiles who don't really
understand the implications. 24/96 should be a reasonable limit for now,
although personally I think 24/48 is enough for very high quality audio.
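The "filling the signal with zeros" point can be checked directly: upsampling a 44.1kHz signal to 192kHz adds no spectral content above the original Nyquist frequency. A minimal numpy/scipy sketch (my own illustration, not from the thread; the tone frequency is placed on an exact FFT bin so the FFT-based resampler has no edge effects):

```python
import numpy as np
from scipy.signal import resample

fs_in, fs_out = 44100, 192000
N = 4704                        # chosen so N * fs_out / fs_in is an integer (20480)
f = 1067 * fs_in / N            # ~10003 Hz, an exact FFT bin of the input
t = np.arange(N) / fs_in
x = np.sin(2 * np.pi * f * t)   # a 44.1 kHz source tone

# Upsample to 192 kHz (ideal bandlimited interpolation via FFT).
y = resample(x, N * fs_out // fs_in)

# Everything above the original 22.05 kHz Nyquist is numerically zero:
spec = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(len(y), 1 / fs_out)
leakage = spec[freqs > fs_in / 2].max() / spec.max()
print(leakage)   # effectively zero: the extra bandwidth carries nothing new
```

The 192kHz file is bigger, but its spectrum above 22.05kHz is empty.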

~~~
soundwave106
This depends on whether you are in the studio or are just playing back things.

In the studio, I would say that 24 bit at least _should_ be the norm for
_recording_ purposes.

24-bit recording gives you a very noticeable increase in headroom (about 20dB).
This gives you quite a bit more flexibility to record at lower levels without
worrying about the noise floor. The difference isn't _huge_ for most prosumer
setups in practice, but given that the processing power and storage capacity of
computers make recording in 24 bit trivial, there really is no reason not to
record in 24 bit these days IMHO.
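The arithmetic behind figures like this is roughly 6dB of dynamic range per bit, which is easy to sanity-check numerically. A small numpy sketch (my own illustration: a full-scale sine through a plain rounding quantizer, a simplification since real converters dither):

```python
import numpy as np

def quantized_snr_db(bits, n=500_000, fs=48000, freq=997.0):
    """SNR of a full-scale sine after uniform quantization to `bits` bits."""
    t = np.arange(n) / fs
    x = np.sin(2 * np.pi * freq * t)
    scale = 2 ** (bits - 1) - 1
    q = np.round(x * scale) / scale          # quantize, then renormalize
    noise = q - x
    return 10 * np.log10(np.mean(x**2) / np.mean(noise**2))

# Theory predicts 6.02*bits + 1.76 dB: roughly 98 dB at 16 bits, 146 dB at 24.
print(quantized_snr_db(16), quantized_snr_db(24))
```

The ~48dB gap between the two is the extra room 24-bit recording gives you below full scale.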

Sample rate also comes into play, mainly _if_ you have older plugins that do
not oversample. Some of the mathematical calculations involved, particularly
if they are quickly responding to audio changes (eg limiting / compression,
distortion), or are using "naive" aliasing-prone algorithms (eg naive sawtooth
wave vs. something like BLEP / PolyBLEP etc.), can introduce frequencies
beyond the Nyquist that may translate into aliasing. These days, I would say
most plugins _do_ oversample internally or at the very least give you the
option to do so. There's also a VST wrapper to oversample older plugins
([http://www.experimentalscene.com/software/antialias/](http://www.experimentalscene.com/software/antialias/)).
So I do not think recording over 44.1kHz is very necessary these days. I don't
discount opinions from people that recording at 192kHz "sounds better",
though, given the possibility that they are using plugins that are prone to
aliasing at 44.1kHz rates.

I personally do not see any benefit beyond 16/44.1kHz for _playback_ of most
recordings. _Maybe_ 24 bit would be useful for heavily dynamic music (one of
the few genres where you generally find this is orchestral music), but I'm
thinking even here the 96dB range of 16-bit audio should be enough for most
cases.

~~~
nullc
> This gives you quite a bit more flexibility recording lower levels without
> concerning yourself about the noise floor.

To be fair, that only applies to the digital part of your signal chain. The
analog portion is going to have nowhere near 24 bits of room above the noise
floor.

The article is pretty clear that 24/192 can be reasonable for production--
it's just not reasonable for playback.

------
lz400
There's a very straightforward practical reason why you might want to keep
downloading 24/192. The people who put those together are on average
"audiophile" snobs so the rips tend to be perfect, often from high quality
sources like SACDs, hdtracks, etc. The _source_ of the recording is usually
the best master known for that record (special remastered editions, etc.). If
you download mp3/spotify chances are the copy you download is from a worse
source. Sure, you don't need 24/192; a well-ripped 320kbps mp3 would be the
same in practice. But looking for 24/192 on the internet is an easy way of
getting better-quality music on average.

~~~
em3rgent0rdr
> "If you download mp3/spotify"

You are misconstruing Monty's argument here. He is very much against mp3... in
fact he says he could tell the difference between high-bitrate mp3 and 16-bit
44k wav. The real point of the video is that 16-bit 44k wav is beyond
sufficient... there's no need to go beyond that to 24-bit/192kHz.

~~~
lz400
As far as I know, people consistently fail to tell the difference in blind A/B
tests in listening between (any decent) FLAC and a well ripped mp3 from the
same source. I know I can't. I can't link to proper double blind studies but
that's the general consensus in the non-BS-audiophile community as far as I
know.

~~~
ycombinete
My anecdote is that I've done a number of these tests (Foobar has a plugin
that allows you to do them on yourself), and I can reliably tell the
difference between FLAC and 128 MP3, but can't tell the difference on 256 MP3
and up.

~~~
lz400
Yes, I've heard people putting the threshold at ~192kbps. Above that it's
pretty much impossible these days. I can also tell the difference at 128kbps,
although sometimes it's a bit of nitpicking (I can only hear differences in the
sounds of hi-hats and things like that).

~~~
72deluxe
I think the type of music you are listening to affects your ability to
differentiate. To me (much rock, blues, guitar music) 128kbps MP3 sounds like
somebody chewed it first. Cymbals, and bizarre snares that sound like the snare
is now made of paper, indicate the low bitrate.

But 256+ and I certainly cannot tell the difference reliably.

~~~
knute
I had satellite radio for a while but had to cancel it because I couldn't
stand the sound of cymbals at whatever bitrate they encode at. Weirdly, when
the phone operator asked why I was cancelling and I told him "audio quality"
he acted like he had never heard that before.

~~~
svachalek
I'm not an audiophile at all but satellite radio sounds worse to me than
anything other than AM/FM radio. Spotify over LTE and Bluetooth sounds night-
and-day better.

------
jahnu
I've said it before, this video demo is one of the very best I've ever seen.

[https://xiph.org/video/vid2.shtml](https://xiph.org/video/vid2.shtml)

So well prepared, so well presented, so little that could be removed without
ruining it.

I aspire to do such good demos but always fall so short.

~~~
amygdyl
I felt that the presenter is a very rare example of an engineer who felt that
his expression and purpose was _attractive_ to his imagined audience.

Without diversion via the questions embedded in the profession and the
perception and self perception of the male geek working and academic world, I
so rarely see a presenter reacting to a sense of apparent _human warmth_ in
the room, and beyond the lens, which even with the most encouraging assistance
behind the lens, is genuinely hard to do. Hard enough that I think it is a
classic contribution to the stereotypes of inflated ego newsreel presenters,
which Hollywood loves to satirise, in my opinion because Hollywood is mocking,
to their narrow and insecure view, a subspecies of acting which when done
well, can so massively capture the greater audience than ever some most
serious actors may manage to capture.

This is a bit more than a little bit of geek knowhow and applied thought, but
I think many geeks, by virtue of sheer analysis without the obstruction of an ego,
could be handily outperforming the supposedly inherent talent they are "meant"
to possess. It may be reaching well into "real serious" acting, very easily. I
don't pretend to be a judge of that, but if acting abilities are "I know it
when I see it", this is excellent acting indeed.

Edit, is not was, first line. A comma for clarity but later on.

~~~
sk5t
What kind of software generated this comment?

~~~
soneil
Hah, so cynical. No, this is just how audiophiles describe anything. Even
their floorboards.

~~~
StavrosK
Could it be cargo cult science (language)? By making it sound like an academic
paper, it passes for scientific.

------
flavio81
Good points in the article, but it has some flaws.

The problem whenever somebody writes about digital audio is that it is very
tempting to hold on to sampling _theory_ (Nyquist limit, etc.) and totally
discard the problems of implementing _an actual Analog-Digital and Digital-
Analog chain_ that works perfectly _at a 44100Hz sample rate._

I agree with the assessment that 16-bit depth is good enough; even 14 bits is
good enough and was used with good results in the past (!). However, the
problem is with the sampling rate.

> _All signals with content entirely below the Nyquist frequency (half the
> sampling rate) are captured perfectly and completely by sampling;_

Here lies the problem. This is what _theory_ says; however, when using a 44KHz
sample rate, this means that to capture the audio you need to low-pass at
22KHz. And this is not your gentle (6, 12 or 24dB/octave) low-pass filter; no,
this needs to be HARD filtering; nothing should pass beyond 22KHz. And it must
be done in the analog domain, because your signal is analog. To implement such
a filter, you need a brickwall analog filter, and this is not only expensive,
but it also makes a mess of the audio: 'ringing' effects and/or ripple in the
frequency response and/or strong phase shifts.

So on Analog-to-digital in 2017, converters should be operating at a higher
rate (say, 192KHz), because this makes analog filtering of the signal much
easier and without side effects.
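How much the wider transition band relaxes the filter can be illustrated in the digital domain (an analogy only; the concern above is analog filtering, where steepness costs components and phase behavior rather than taps). Using scipy's Kaiser-window FIR estimator, and assuming a 100dB stopband for the "brickwall", squeezing the transition from 20kHz→96kHz down to 20kHz→22.05kHz inflates the required filter length by roughly an order of magnitude:

```python
from scipy.signal import kaiserord

stop_atten_db = 100.0   # how hard we want the "brickwall" to reject

def fir_taps(f_pass, f_stop, fs):
    """Estimated Kaiser-window FIR length for a given transition band."""
    width = (f_stop - f_pass) / (fs / 2)   # transition width, normalized to Nyquist
    numtaps, _beta = kaiserord(stop_atten_db, width)
    return numtaps

taps_44k = fir_taps(20000, 22050, 44100)    # tight transition: on the order of 140 taps
taps_192k = fir_taps(20000, 96000, 192000)  # relaxed transition: a couple dozen taps
print(taps_44k, taps_192k)
```

The same trade-off is what makes the analog anti-alias filter at a raw 44.1kHz rate so unpleasant, and why converters oversample instead.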

Now, for Digital-to-Analog, if your sample rate is 44KHz, you have two
alternatives:

a) Analog brickwall filtering, with the problems noted above

or

b) filtering on the digital domain + using oversampling

the article mentions:

 _> So the math is ideal, but what of real world complications? The most
notorious is the band-limiting requirement. Signals with content over the
Nyquist frequency must be lowpassed before sampling to avoid aliasing
distortion; this analog lowpass is the infamous antialiasing filter.
Antialiasing can't be ideal in practice, but modern techniques bring it very
close. ...and with that we come to oversampling._

So they are mentioning alternative (b). The problem is that oversampling does
not solve all problems. Oversampling implies that the filtering is done on the
_digital_ domain and there are several choices of filtering you could use, for
example FIR (Finite Impulse Response), IIR (infinite impulse response), etc.

And each one of these choices has side effects...

In short, the problem is that with a 44KHz sampling rate, your filter cutoff
(22KHz) is too close to your desired bandwidth (20Hz-20KHz). Using a sample
rate of 192KHz gives the DAC designer much more leeway for a better
conversion. And CONVERSION is the key to good digital sound.

 _> What actually works to improve the quality of the digital audio to which
we're listening?_

It is interesting that the author mentions things such as "buying better
headphones" (agree), but he _never_ mentions "getting a better digital-to-
analog converter", which is highly important!

On the other hand, he backs up his claim that "44KHz is enough" with an
interesting AES test i was already aware of in the past:

 _> Empirical evidence from listening tests backs up the assertion that
44.1kHz/16 bit provides highest-possible fidelity playback. There are numerous
controlled tests confirming this, but I'll plug a recent paper, Audibility of
a CD-Standard A/D/A Loop Inserted into High-Resolution Audio Playback, done by
local folks here at the Boston Audio Society._

This is a very interesting paper, and I did have a copy; however, the test
equipment should be examined. There are systems and better systems. The AES
paper cited above had the particularity that the ADC and DAC used were
provided by exactly the same machine (a Sony PCM converter), with the same
strategy: no oversampling, brickwall analog filters. I can bet (99% sure) that
the brickwall filters were identical on the ADC and the DAC of that machine;
Murata-brand filters in a package.

The devil, as they say, is in the details.

~~~
nayuki
I don't think there are flaws in the article as you claim. It already
explicitly explains that oversampling at the ADC or DAC is an acceptable
engineering solution:

> Oversampling is simple and clever. You may recall from my A Digital Media
> Primer for Geeks that high sampling rates provide a great deal more space
> between the highest frequency audio we care about (20kHz) and the Nyquist
> frequency (half the sampling rate). This allows for simpler, smoother, more
> reliable analog anti-aliasing filters, and thus higher fidelity. This extra
> space between 20kHz and the Nyquist frequency is essentially just spectral
> padding for the analog filter.

> That's only half the story. Because digital filters have few of the
> practical limitations of an analog filter, we can complete the anti-aliasing
> process with greater efficiency and precision digitally. The very high rate
> raw digital signal passes through a digital anti-aliasing filter, which has
> no trouble fitting a transition band into a tight space. After this further
> digital anti-aliasing, the extra padding samples are simply thrown away.
> Oversampled playback approximately works in reverse.

> This means we can use low rate 44.1kHz or 48kHz audio with all the fidelity
> benefits of 192kHz or higher sampling (smooth frequency response, low
> aliasing) and none of the drawbacks (ultrasonics that cause intermodulation
> distortion, wasted space). Nearly all of today's analog-to-digital
> converters (ADCs) and digital-to-analog converters (DACs) oversample at very
> high rates. Few people realize this is happening because it's completely
> automatic and hidden.

The main point of the article is to argue that storing or transmitting music
above 16-bit, 48 kHz is wasteful and potentially harmful. It still fully
condones using higher specs for audio capture, editing, and rendering.

~~~
flavio81
> I don't think there are flaws in the article as you claim. It already
> explicitly explains that oversampling at the ADC or DAC is an acceptable
> engineering solution

Of course it is _acceptable_. Even 14 bit audio at 36KHz with a great DAC
would be fairly nice, acceptable.

What the article claims is that 192KHz is useless, of no benefit. And I
contend that it is of benefit when you want more than just good or acceptable
performance. Not if you have a run-of-the-mill DAC and OK headphones/speakers,
but it is if you are a music lover and critical listener.

~~~
stephen_g
You've missed the point though - no human has demonstrated the ability to
distinguish the fidelity of audio above 44.1kHz sampling in a properly
controlled comparison test (double blind). This empirical result is to be
expected given the biology of the ear and the science of sampling theory, as
the article explains.

It doesn't matter if you're a music lover or critical listener!

------
tzs
What about for non-humans?

Consider a dog that lives with a musician who plays, for example, trumpet. The
musician plays the trumpet at home to practice, and also records his practices
to review.

A trumpet produces significant acoustic energy out to about 100 kHz [1]. When
the musician plays live the dog hears a rich musical instrument. When the
musician plays back his recordings, half the frequency range that the dog
could hear in the live trumpet will be gone. I'd imagine that this makes the
recorded trumpet a lot less aesthetically pleasing to the poor dog.

[1]
[https://www.cco.caltech.edu/~boyk/spectra/spectra.htm](https://www.cco.caltech.edu/~boyk/spectra/spectra.htm)

~~~
peatmoss
What about transhumans? I'm looking forward to my HiFi Cochleatron 9000(tm)
when my hearing starts to go.

Then I'll be mighty glad we made all these high-res recordings.

~~~
exikyut
That makes for some genuinely interesting thought experiments.

What if this actually becomes possible, but we discover that because we
previously couldn't hear these frequencies, our instruments and equipment are
_HORRIBLY_ mis-tuned and sound terrible? We may end up having to re-record
tons of stuff.

Something something premature optimization. And part of me is glad that the
art of hand-making instruments is not yet lost; we might need the originals in
the future.

Disclaimer: I say this as a completely naive person when it comes to
instruments. The answer to this may be "if it wasn't built to resonate at
frequency X, it won't by itself," which would be a good thing.

~~~
photojosh
Higher frequencies aren't even _in_ most recordings, so we wouldn't have to
re-record them for that reason.

And if they were (such as in 96 kHz hi-res audio), you could just run it
through a low-pass filter to strip off the higher frequencies.

~~~
exikyut
Ah, good point.

And... heh, using a filter to strip out the audio we used all that extra
filesize to deliberately store. Haha. :)

~~~
photojosh
My webapp manages photos for photography competitions. People upload 30MB
JPEGs that would be visually lossless at a _tenth_ of that file size. And I
keep the originals, but actually resize down to 300KB for every function
within the software. Haven't had a single complaint about image quality... :)

------
Const-me
If the only thing you do with your music is listen, then yes, 24/192 delivers
questionable value compared to the more popular 16/44.1 or 16/48 formats.

However, all musicians I know use these high-res formats internally. The
reason is that when you apply audio effects, especially complex VST ones,
discretization artifacts noticeably decrease the quality of the result.

Maybe the musicians who distribute their music in 24/192 format expect their
music to be mixed and otherwise processed.

~~~
kazinator
I do not believe it. 24 bits is definitely needed for processing (better yet,
use floating-point).

Not 192 kHz; no friggin' way.

Repeated processing through multiple blocks at a given sample rate does not
produce cumulative discretization problems in the time domain; it's just not
how the math works.

~~~
Const-me
> better yet, use floating-point

Both your inputs (ADC) and outputs (DAC) are fixed-point. Why would you want
to use a floating point in between? Technically, 64-bit floating point format
would be enough for precision. But that would inflate both bandwidth and CPU
requirements for no value. 32-bit floating point ain’t enough. Many people in
the industry already use 32-bit integers for these samples.

> Not 192 kHz; no friggin' way.

I think you’re underestimating the complexity of modern musician-targeted VST
effects. Take a look:
[https://www.youtube.com/watch?v=-AGGl5R1vtY](https://www.youtube.com/watch?v=-AGGl5R1vtY)
I’m not an expert, i.e. I’ve never programmed that kind of software. But I’m
positive such effects are overwhelmingly more complex than just multiply+add
these sample values. Therefore, extra temporal resolution helps.

BTW, professionals have used 24-bit/192kHz audio interfaces for decades
already. E.g. the ESI Juli@ was released in 2004, and it was a very affordable
device back then.

~~~
nayuki
I habitually edit audio using 32-bit float, not 16-bit integer.

> Why would you want to use a floating point in between?

Because 32-bit float has enough mantissa bits to represent all 24-bit integer
fixed-point values exactly, so it is at least as good.

Because 32-bit float is friendly to vectorization/SIMD, whereas 24-bit integer
is not.

Because with 32-bit integers, you still have to worry about overflow if you
start stacking like 65536 voices on top of each other, whereas 32-bit float
will behave more gracefully.

Because 32-bit floating-point audio editing is only double the storage/memory
requirements compared to 16-bit integer, but it buys you the ultimate peace of
mind against silly numerical precision problems.
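The first point - that 32-bit float covers 24-bit fixed point exactly - follows from float32 having a 24-bit significand (1 implicit bit + 23 stored), and is easy to verify with a quick numpy check (my own illustration):

```python
import numpy as np

# Every 24-bit signed sample value survives a round trip through float32...
rng = np.random.default_rng(1)
samples = rng.integers(-(2**23), 2**23, size=200_000)
assert np.array_equal(samples.astype(np.float32).astype(np.int64), samples)

# ...but the first value needing 25 bits already does not: 2^24 + 1 gets rounded.
assert int(np.float32(2**24 + 1)) == 2**24
print("float32 represents all 24-bit integers exactly")
```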

~~~
kazinator
float has scale-independent error because it is logarithmic/exponential.

If you quiet the amplitude by some decibels, that is just decrementing the
exponent field in the float; the mantissa stays 24 bits wide.

If you quiet the amplitude of integer samples, they lose resolution (bits per
sample).

If you divide a float by two, and then multiply by two, you recover the
original value without loss, because only the exponent was decremented and
then incremented again.

(Of course, I mean: in the absence of underflow. But underflow is far away. If
the sample value of 1 is represented as 1.0, you have tons of room in either
direction.)
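The halve-then-double claim can be demonstrated directly: dividing a binary float by two only moves the exponent, so the round trip is exact, while the integer equivalent discards a bit for good. A tiny sketch (my own illustration, with an arbitrary sample value):

```python
import numpy as np

# Float: /2 then *2 is exact - only the exponent moves, the mantissa is untouched.
x = np.float32(0.8414709568023682)   # an arbitrary float32 sample value
assert (x / np.float32(2)) * np.float32(2) == x

# Integer: halving a 16-bit sample throws away the low bit permanently.
s = np.int16(32767)
assert np.int16(s // 2) * 2 != int(s)   # 16383 * 2 = 32766; the LSB is gone
print("float round trip is exact; integer round trip is lossy")
```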

------
josteink
I've done signal and sound-processing courses at university. I know the
Nyquist-Shannon theorem. I know all about samples, and that digital sound is
not square staircases.

I know and understand how _incorrect_ down-sampling from high frequencies can
cause distortion in the form of sub-harmonics in the audible range.

I know about audible dynamic range and how many decibels of extra range 8
extra bits are going to give you.

I know all this, but I still have to admit: if there's a hi-res recording
(24-bit, >48kHz) available for download/purchase, I'll always go for that
instead of the "regular" 16-bit 44.1/48kHz download. I guess escaping your
former audiophile self is very, very hard.

Anyone else want to admit their guilty, stupid pleasures? :)

~~~
xiphmont
I collect vintage headphones, and especially relish the truly absurd ones I
can't believe anyone thought were a saleable item.

I'm up to about 300 different models.

~~~
TheRealDunkirk
Would love to see a shot of the collection on /r/headphones.

~~~
specto
Yeah seriously, that's really cool.... and on that note what headphones does
he actually use?!

------
tomc1985
While the extra quality is lost on listeners, from experience I've found that
super-HQ source material can change your results (for the better) when fed
through distortion/compression/warming effects processors.

~~~
acchow
> Its playback fidelity is slightly inferior to 16/44.1 or 16/48, and it takes
> up 6 times the space.

The article is highly technical. Does anyone have a way to describe this
phenomenon intuitively?

~~~
fusiongyro
Basically, the sampling theorem says you can reconstruct the exact waveform
from a certain number of samples. Adding a bunch more samples bulks up the
file, but you didn't need them to restore the exact waveform. However, the
unnecessary samples sit in the file between you and the next sample you do
need. At high enough levels of waste this creates an I/O bottleneck that
hampers performance.

Another way to look at it is that digital audio is not like digital imaging.
There aren't pixels. Increasing the data rate does not continue to make the
waveform more and more detailed to human auditory perception in the way that
raising the pixel density does for human visual perception.

To describe it intuitively, forget your intuition that audio is like visual
and start from "there is no metaphor between these two things."

~~~
sitharus
> Increasing the data rate does not continue to make the waveform more and
> more detailed to human auditory perception in the way that raising the pixel
> density does for human visual perception.

Early low data rate codecs - such as the one used for GSM mobiles - are
obviously inferior, but still functional. I think a better analogy is that an
iPhone 7 has a 1 megapixel screen, so there's no difference between a 1
megapixel image and a 5 megapixel image, except one is much larger. Of course
visually you can zoom in (or move closer in real life), but audibly you can't.

~~~
fusiongyro
I get what you're saying, that if you remove the ability to zoom you can
equalize things, but without that twist I think this preserves the essential
problem with the analogy that Monty works so hard to correct. If your data
rate is too low, you don't have enough samples to recreate the waveform. So
you create an approximation. But there is a _finite number_ of samples that you
need to recreate _all_ of the information in the waveform, so once you have
that, there isn't any additional information there for you to obtain if you
continue to increase the sample rate.

------
tigerBL00D
If you are downloading to simply consume, then sure. But as others have noted
this article assumes that mastering is always the last step in the editing
process. Anyone who ever remixed, sampled or DJ'ed will disagree.

If we keep that in mind, then the setup of the problem changes significantly
and most arguments made here do not apply to the download itself.

~~~
krisdol
Right. I used to DJ throughout college and tried out a variety of formats. At
high amplification, artifacts become apparent. VBR0 and 256kbit MP3s did not
hold up, at all. Even smaller speakers and headphones didn't hide some of the
compression artifacts for me.

320kbit CBR MP3s were... I'd say, generally OK if you did not plan on skewing
their tempo much outside of a very small range (give or take 5% speed). Really
bad artifacting becomes audible quickly beyond that range. Maybe it was
placebo, but I also found differences between FLAC and 320kbit MP3 discernible
when working on bigger, more powerful speakers.

But, with FLAC, it didn't matter if the sound was played at 10% tempo or
extreme amplitude, audio was always crisp, clear, and free of compression
artifacts (obviously).

On my laptop speakers or earbuds, no way would I be able to tell the
difference between any of these today.

------
mrob
A super-high sample rate could make a difference if you're sampling audio and
pitching it down heavily to make a bass-line. But in practice, I think most
such audio has very little information above 20kHz. Ordinary microphones are
designed for audible frequencies, and ultra-sonics are easily lost in
production. What music actually contains interesting ultra-sonic information
for slowed-down listening?

~~~
TazeTSchnitzel
If anything, wouldn't the high sample rate be a liability when trying to shift
pitch? If it captures too much outside the range of human hearing, then that
ultrasound might suddenly become audible when you change pitch, which could
sound weird.
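That effect is just the arithmetic of resampling: slowing playback scales every frequency down by the same factor, so content parked above 20kHz slides into the audible band. A minimal numpy illustration (my own example: a 30kHz tone in a 96kHz recording, "played" at half speed):

```python
import numpy as np

fs = 96000
n = 4800                              # 50 ms; 20 Hz bin spacing, so 30 kHz is an exact bin
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 30000 * t)     # ultrasonic: inaudible at normal speed

# "Pitching down" an octave = playing the same samples at half the rate.
fs_slow = fs / 2
spec = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(n, 1 / fs_slow)
print(freqs[spec.argmax()])           # 15000.0 Hz: now squarely in the audible range
```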

~~~
mrob
Exactly, and the weirdness might be interesting.

~~~
TazeTSchnitzel
Ah, right. And if ultrasound is actually much like regular sound, it may sound
perfectly normal.

------
pmalynin
My go-to comparison track for sound quality for a new device, headphones, DAC,
amp, codec etc. is "Speak To Me" from Dark Side Of The Moon.

I don't hear a difference between various rips that are 16/128 or 24/192, but
I have noticed a difference listening to the Blu-ray version of it (which is
many, many gigabytes in size). It is definitely an interesting experience, but
the best way I can describe it is as an absolute absence of noise.

Every single version but this one exhibits noise at the start (the heartbeat
sound) as the sound goes from very quiet to loud.

But, to be fair, it could just be different masters.

~~~
terinjokes
Which would be in line with the article: a SACD release pressed onto a CD-R
sounded better than the CD release, because the SACD release had a better
master.

------
gwbas1c
"Under ideal laboratory conditions, humans can hear sound as low as 12 Hz[8]
and as high as 28 kHz..." From
[https://en.m.wikipedia.org/wiki/Hearing_range#Humans](https://en.m.wikipedia.org/wiki/Hearing_range#Humans)

That explains why some people can tell the difference between 44.1 kHz and the
higher-resolution sampling rates. It also means that the ideal audiophile
sampling rate is somewhere between 58 kHz and 60 kHz, not 192 kHz.

~~~
TheRealDunkirk
You just need these speakers
([http://magico.net/product/ultimate.php](http://magico.net/product/ultimate.php))
to reproduce those frequencies! The question, of course, is whether even
"prosumer" recordings keep frequencies higher than 20 or 22kHz, even if they
were recorded on equipment that sampled at 192kHz. It seems like most would
LPF the rest away in the process of mastering for the masses.

------
Veratyr
Perhaps this is a silly question - the explanations mostly make perfect sense
to me - but in the case of sample rate, what happens when two waves with very
different frequencies overlap?

Say I've got an 18kHz wave and a 9kHz wave, and the 9kHz wave is ever so
slightly out of phase. Then imagine there are 10 different waves under 20kHz
all interfering with each other in different ways.

Is it still possible to reproduce everything accurately?

And on bit-depth and dynamic range: Given that much audio doesn't use the full
range available to it, wouldn't higher bit depth increase the fidelity in the
range the audio does fill? The article talks about the bit depth and range
only in terms of the maximum volume but what about fidelity? What's the
minimum difference in volume the human ear can hear?

~~~
agency
With regards to the first part of your question - as long as the signal is
bandlimited the Nyquist-Shannon theorem applies. If all of the waves are under
20kHz, regardless of how they're interfering with each other, the signal can
be reconstructed perfectly with a 40kHz sample rate.
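This is easy to demonstrate numerically for exactly the scenario in the question: sum several sub-20kHz sines with arbitrary phases, sample at 44.1kHz, and compare a bandlimited reconstruction against the analytic signal. A scipy sketch (my own illustration; frequencies are placed on exact FFT bins so the FFT resampler has no edge effects):

```python
import numpy as np
from scipy.signal import resample

fs, N, up = 44100, 4096, 8
bins = [836, 1672]                 # ~9001 Hz and ~18002 Hz, both exact FFT bins
phases = [0.3, 0.0]                # deliberately out of phase with each other

def signal(t):
    return sum(np.sin(2 * np.pi * (k * fs / N) * t + p)
               for k, p in zip(bins, phases))

x = signal(np.arange(N) / fs)              # the 44.1 kHz samples
y = resample(x, N * up)                    # ideal bandlimited interpolation, 8x denser
exact = signal(np.arange(N * up) / (fs * up))
print(np.max(np.abs(y - exact)))           # down at float64 round-off: reconstruction is exact
```

The interference between the components doesn't matter: sampling is linear, so a sum of reconstructible waves is itself reconstructible.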

~~~
umaxumax
Incorrect. The Nyquist-Shannon theorem does not guarantee phase! It only
guarantees frequency reproduction. This paper illustrates the issue
succinctly:
[http://www.wescottdesign.com/articles/Sampling/sampling.pdf](http://www.wescottdesign.com/articles/Sampling/sampling.pdf)

------
duderific
Kind of a side topic, but does anyone know what's up with Sirius XM satellite
radio? Whenever I listen to that in my wife's car, I find the sound quality to
be obnoxiously bad, to the point that I choose to listen to plain old FM if
I'm driving her car.

The higher frequencies (hi hats for example) are mushy and sound like they are
warbling, and the sound just generally has a lack of depth. What is the
explanation for this, or am I just imagining it?

~~~
TD-Linux
I notice it too, and it's because the music is extremely low bitrate - usually
48kbps, but sometimes as low as 32. It's HE-AACv2 - not as good as Opus, but
close. Its characteristic artifacts include destroying transients, like your
hi-hats.

------
rconti
Off topic, but I've noticed in the gym, sometimes the music playing from the
instructor's iPhone through one of those cheap Ion Block Rocker bluetooth
speakers (big boxy one, looks like a guitar amp) has very very noticeable
pitch sag in various parts of various songs.

It's via bluetooth, and the source is Pandora. I even googled Pandora and
pitch change, and found some forums with folks discussing _bluetooth_ causing
this to happen, which has never been my experience.

It occurred to me that slowing the playback (eg, causing the pitch to sag)
could be a fantastic way of dealing with a connection that's too slow. It
would be far better than stuttering, even though many of us would find the
pitch change really annoying. The instructor and my wife simply can't hear it,
or say "I thought that was just part of the song".

Anyway, has anyone ever heard of this happening? Do certain products do it?
Were the Pandora forums wrong and this is actually a Pandora problem? Of all
of the bluetooth problems I've had in my life, I've never experienced this
before, so I tend to lean towards it being a data rate issue on the cell
connection and Pandora slowing the music down.

~~~
jaquers
At my greatest aspiration, I'm no more than a casual music listener - however
I am very conscious of pitch. On more than one occasion, I've noticed that a
song I'm familiar with seems off pitch, as in lower or higher key than I think
of it in my head - and I seem to notice that particularly around shitty
speakers or bluetooth or some combination of the two. Glad to know I'm not
crazy :)

To your point, I would lean towards bluetooth, as I've never associated the
phenomenon with Pandora. My friends and I mostly use Spotify.

~~~
rconti
I use Spotify as well. And what I describe is a sag for 5 or 10 or 15 seconds,
not just a song that's playing too slowly. It's really weird.

------
squarefoot
FTA: "192kHz digital music files offer no benefits. They're not quite neutral
either; practical fidelity is slightly worse. The ultrasonics are a liability
during playback."

I would expect the samples to be interpolated before the final D/A conversion
takes place, so no extra ultrasonics should be involved. And there would be
filters for them anyway; we already have to use them for 44.1, 48, 96, etc.

~~~
sp332
But if you're filtering out ultrasonics, then you've guaranteed that the extra
samples will be wasted. There won't be any difference in the reconstructed
signal at all.

~~~
zkSNARK
As an engineer, I'd say those extra samples are never wasted, and as a
listener, I couldn't care less about the "wasted space." Drive space is
meaningless at this
point. When I buy a piece of audio, give me the highest quality available and
I will make my own decisions regarding the frequency I want to listen to it
at. The reasons behind not giving the user the highest quality possible all
sound like a misguided excuse to prevent piracy.

~~~
IgorPartola
So 192/1024? 1024/4M? 1G/1T? Like is there a practical limit for you when you
say "highest quality possible"?

~~~
zkSNARK
I see no issue with storing 24/192 multichannel audio like what comes on blu-
ray audio discs, or DSD streams. For my current hard drive situation, I could
store 5 GB per song and easily have enough space to have a very large catalog.
But drive space is only getting cheaper.

------
tinix
I challenge y'all to take two copies of the same professionally mastered
track, 16 vs 24 bit... and phase invert one, and put them on top of each
other. What you can then hear is basically like an auditory diff.

Do you not hear anything? Yeah... all that sound is what amounts to lost
fidelity when you down-sample. You can't just argue on bit-rates alone... It's
about having head room in the mix and room for more fidelity. Sure, if you mix
shit badly, you can't hear the difference, but that's missing the point
entirely.
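The "auditory diff" described here is a null test, and it's easy to quantify. A rough sketch in plain Python (the tone level and length are arbitrary; real masters would differ in more ways than quantization alone):

```python
import math

# render one second of a 1 kHz tone at half of full scale
fs = 44100
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]

def quantize(x, bits):
    """Round each sample to the nearest step of a signed `bits`-bit grid."""
    step = 1.0 / (2 ** (bits - 1))
    return [round(v / step) * step for v in x]

hi = quantize(tone, 24)
lo = quantize(tone, 16)

# the "auditory diff": invert one copy and sum, i.e. subtract
residual = [a - b for a, b in zip(hi, lo)]
peak = max(abs(v) for v in residual)
print(20 * math.log10(peak))  # roughly -96 dBFS: the 16-bit quantization error
```

For two copies that differ only in word length, the residual is the 16-bit quantization error, down around -96 dBFS.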

It's not nearly as bad as the time someone tried to tell me that Opus was an
adequate audio file format for music, but still... frustrating.

I know I don't have super-human hearing, and I can easily hear the difference.
Just because not everyone can hear the difference, or can tell that they hear
it, doesn't negate the fidelity loss of lower bitrates.

Further, given a large enough speaker stack, it's not about what you can hear
any more, it's about what you can feel.

And never-mind the benefits of a 24bit DAW pipeline... Hello, low latency?

------
0xCMP
I've always been able to tell when something on Spotify I've heard before is
lower quality now. I can instinctively tell in my car when that's the case.
It's mostly due to how loud certain parts of the song are or how it sounds at
higher volumes. I guess, given what he says, that wouldn't be the case if I
made sure they were at the same dB.

~~~
anyfoo
It is well known that music media for "popular" consumption are often
intentionally mastered with much less dynamic range. This is called
"compression", but on the surface it has nothing to do with compression of
bits* . It's the dynamic range that's compressed, bringing quieter and louder
sounds closer together.

For markets with people that are more likely to care about sound quality,
though, a much larger dynamic range is preserved. This is why the same album
often sounds better on vinyl than on digital media[1]. It has nothing to do
with the media, it's the superior mastering that was consciously chosen.

The Wikipedia article on the Loudness War[2] offers a good explanation.

* Well, technically, music with compressed dynamics has less entropy, so it can be encoded at a lower bitrate without loss.

[1] See this database for example: [http://dr.loudness-war.info](http://dr.loudness-war.info)

[2] [https://en.wikipedia.org/wiki/Loudness_war](https://en.wikipedia.org/wiki/Loudness_war)
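The kind of dynamic-range compression described above can be sketched as a toy peak compressor (the threshold and ratio values here are arbitrary illustrations, not any mastering standard):

```python
# toy peak compressor: reduce gain on everything above a threshold
def compress(samples, threshold=0.25, ratio=4.0):
    out = []
    for v in samples:
        mag = abs(v)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio  # squash the excess
        out.append(mag if v >= 0 else -mag)
    return out

quiet, loud = 0.1, 1.0
y = compress([quiet, loud])
print(loud / quiet)   # 10.0  : level ratio before compression
print(y[1] / y[0])    # 4.375 : quieter and louder sounds are now closer
```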

~~~
baddox
It's true that much popular music is heavily compressed, but you normally
wouldn't encounter two differently-compressed masters of the same song. Two
exceptions would be radio (some stations used to apply their own compression,
and perhaps they still do) and some high-quality releases aimed at audiophiles
that are produced from a less-compressed master. I wouldn't think that
streaming services like Spotify would re-compress songs.

~~~
anyfoo
I was a bit imprecise, but the "high-quality releases aimed at audiophiles" is
essentially what I was referring to. The record stores are full of remastered
vinyl editions.

------
dep_b
Without even reading the article, I already know that hearing the difference
between 16/44.1 and 24/192 is really hard if not impossible.

I only use higher bitrates when creating music so when I mix or record I don't
need to make the levels go almost into red to get a hot enough signal, I can
do all kinds of changes like slowing down or up, distortion, EQ, compression
without losing any perceived detail in the end. If I record on 16 / 44.1 then
start manipulating the sound I start losing detail immediately.

But in the end, more than 20/88.2 won't do much for you. 16/44.1 might be a
bit less than ideal for music with a lot of dynamics, but it's absolutely fine
for most purposes.

~~~
anyfoo
No, not "really hard if not impossible", it is impossible. The article itself
cites that no one has ever been observed to be able to do so.

~~~
dep_b
More than 44.1kHz is useful for tempo matching while mixing.

~~~
anyfoo
Yes, the article also mentions how 24/192 can be useful for production and
editing.

------
mmahemoff
Mirror: [http://evolver.fm/2012/10/04/guest-opinion-why-24192-music-d...](http://evolver.fm/2012/10/04/guest-opinion-why-24192-music-downloads-make-no-sense/)

------
kevin_thibedeau
Makes me long for the day when consumer audio products were proudly labeled as
having a 1-bit DAC. Nowadays you couldn't sell audio hardware that made such
truthful statements about its internals. Gotta have the bits.

~~~
byproxy
Hey, they still do. And it's 'audiophile.' DSD is 1-bit, but with a very high
sampling rate (2.8224 MHz). Though, as far as I know there isn't actually any
quality boost vs PCM.

~~~
kevin_thibedeau
That's my point. The golden ears wouldn't deign to buy a "cheesy" 1-bit DAC.
It has to be recast as something else to fool them.

------
daveheq
Funny that the author says that in the past 100 years of testing we haven't
found any people with truly exceptional hearing, such as a greatly extended
hearing range, so they probably don't exist. Yet I was tested at around 13 as
having a 33KHz upper range, which let me hear a beetle walking over leaves and
grass in a basement window sill from about 20 feet away...

Unfortunately I lost a lot of hearing through concerts, headphones, and
traffic, but I can still tell the difference between an MP3 and CD-quality
music much of the time.

~~~
Applejinx
That's not even an additional octave over 20K. It amazes me when people behave
as if there's a magical hard limit that's really, really precise and applies
to all humans.

I think you probably understand that people get locked into arguing for
victory, and if you tell them you were tested with thus and so range (as a
young person, which is plausible) they will simply call you a liar.

My own experience is this: when I was a kid, I got an ear wax problem, and had
it removed (hasn't recurred). It was a horrible painful nightmare with nasty
tweezers and water squirters, and I was just a little kid… but afterwards,
sound (especially very high frequency sound) was a revelation.

Later, when I was a little older, the advent of digital audio (at first, in
record albums) was a nightmare to me, because I couldn't understand how or why
that stuff sounded SO BAD. And of course my early experience of CDs was pretty
nightmarish: pod people music, with all emotion and humanity weirdly excised.
That's what got me into audio: wanting to understand this, and then later, fix
it.

I did actually succeed: I can produce and mix and process digital audio that
young me would not be horrified by. But especially if I had to meet that
higher bar, I won't be able to do it at less than say 22 bit/80K, well
engineered. If I get to use all my current tricks I could do it at 20 bit/80K:
I can cheat word length easier than I can compensate for a nasty brickwall
filter.

24/96K is widely prevalent and enough, given good converters. I'm not
convinced 192K is at all necessary, but the more people crusade against it,
the more contrarian I get ;) I've got a Beauty Pill album mastered in 24/192K
and it sounds freaking phenomenal. Mind you, I have professional equipment
designed to handle that.

------
donatj
Contrary to what he says, I can most certainly see some IR remotes. Not super
bright but enough to notice. Same goes for the IR lights in my Kinect.

~~~
xiphmont
"The original version of this article stated that IR LEDs operate from
300-325THz (about 920-980nm), wavelengths that are invisible. Quite a few
readers wrote to say that they could in fact just barely see the LEDs in some
(or all) of their remotes. Several were kind enough to let me know which
remotes these were, and I was able to test several on a spectrometer. Lo and
behold, these remotes were using higher-frequency LEDs operating from
350-380THz (800-850nm), just overlapping the extreme edge of the visible
range. "

~~~
donatj
Interesting. I had not read the footnotes. They should update the actual post
rather than simply amending it in the footnotes.

------
lightedman
I want as much resolution as possible, because when you do what I do and
introduce pitch-shifting after the fact (re-tuning a Floyd Rose for every
other song is not really conducive to a jam session), 24/96 just doesn't cut
it at all, and you get bad artifacting beyond a half step either way.

------
sideshowb
Minor niggle: we _can_ see ultraviolet just a little because it makes various
bits of our eyes glow. Perceptually this shows up to us as a little bit of
fuzziness near a UV source. But no, we can't see it with anything like the
resolution we see visible light.

------
insulanus
Excellent writing. Information conveyed concisely, but targeted at non-
audiophiles.

------
chaboud
So, having come from an image-processing background to put in 15 years of
experience at audio processing companies, I've seen the xiph arguments, or
something like them, several times. Heck, I worked for Sony (where that
precious 22KHz cliff was made) for close to ten years... in professional
media... many of those years with Sony professional audio people.

And the "44.1KHz is enough" or "48KHz is enough" people are, sadly, kind of
dumb.

How do I know? Because I was dumb, too.

Being a coding/math/audio/video badass, after a few years of industry
experience, I rattled off some mouthy kid quip, saying "well, I don't know why
we do 24/96, since nobody can hear above 20K anyway..."

And a very talented, very knowledgable, and generally reserved engineer
suddenly perked up one eyebrow and said, incredulously "because temporal and
frequency response are inherently linked..."

That look was in 2001, and I still remember that feeling of dread sinking in,
realizing that I had no idea what he was talking about, no concept of why that
would matter. I knew about Nyquist and could write a quick FFT, but he'd spent
four years getting a degree in pure audio engineering at the most selective
program in the country. That look, which I _completely_ remember today, was
like a deeply disappointed parent after a kid has just been bailed out of
jail, from one of the nicest engineers I know.

It was late, I was brash, and I pressed on, asking him what he meant. "What's
a transient look like, spectrally?" he asked. He waited for my blinks to make
audible sounds (due to the apparent hollowness of my head), then he asked "and
how many channels of audio do we listen to?"

He watched me stand there, like a doofus, for what seemed to me like several
minutes (probably 5-10 seconds), then he went back to coding.

It didn't hit me until weeks later, and I didn't really internalize until
_years_ later, that he was hinting at inter-ear phasing and the _other_
faculties of our auditory systems besides frequency response. Years later, I
read up on
Georg von Békésy's incredible work (including positional acuity), and I worked
as a tech lead at Digidesign on the first-generation Venue live sound system
which operated at 48KHz but with _incredibly low latency_ (processing steps
were 1, 2, or 16 samples) due to the requirements of, for instance, vocalists
using in-ear monitors.

Along the way, I ran across Microsoft engineers who thought that ~10-20ms
inter-channel timing consistency would be okay in Windows Vista (it wasn't),
conducted blind tests between 96KHz and 44.1KHz audio (for people who were
_shocked_ to immediately notice differences), came across plenty of hot-shot
kids who said exactly the same kind of stuff I'd said, and saw postings from
xiph making a mix of valid and grossly sophistic arguments ranging from
"here's how waveform reconstruction from regularized samples work" (good) to
"audio equipment can't even capture signals beyond this range" (dumb). At
times, I thought about setting up refutation articles, then I realized, like
many, that I had actual work to do.

Von Békésy's work points to positional inter-ear phase fidelity of roughly
10µs. What's the sampling interval at 44.1KHz? >22µs? Good luck rebuilding
that stereo field at 44.1...

The trick is that there is a really serious diminishing return on audio
sampling rate. 4KHz to 8KHz is enormous... 8KHz to 16KHz is transformative...
16KHz to 32KHz is eye-opening... 32KHz to 48KHz is appreciable... 48KHz to
96KHz is.... pretty minor, especially in the age of crappy $30 earbuds,
streaming services delivering heavily compressed audio that will be crammed
into a BT headset that may or may not be connected with additional
compression, and all of the convenience that those changes bring. You may
detect it in some audio if you're really listening, if you know what to listen
for, and it may present advantages in system design (converters, processing,
etc). From a data-rate perspective, the low-hanging fruit has already been
picked.

But people who smugly say that there is "no difference", that audiophiles are
buying "snake oil", are letting their ignorance show, and that's including
that kid that I was, 16 years ago.

I've since moved out of pure pro media to consumer devices, where precision
takes a back seat to the big picture a lot of the time. When discussing an
audio fidelity multi-channel problem with a possible vendor last year, I
expressed my concern about the inter-channel timing assurance slipping from
1µs to 50µs in product generations. "Depending on the sampling rate, that's
several samples of audio", I said.

A very senior engineer on our side (Director equivalent) quickly admonished
me, saying "it's _microseconds_ , not _milliseconds_ ", to which I said "I
know... Which is why it's several samples, not several thousand..."

From the look on his face, I'm 100% sure that he didn't understand me at the
time, but I hope he put it together eventually.

In the end, the industry has moved in the opposite direction of 24/192 for a
long time. If we can get back to normalization of CD quality audio, I'll be
happy.

~~~
stephen_g
All that still doesn't explain why in properly controlled double blind trials,
people still haven't been able to demonstrate the ability to distinguish
between different sample rates above 44.1kHz though...

~~~
chaboud
Citations? I'm happy to read any research you have. (You can also come over,
crack a beer, and record balloon pops and acoustic instruments... and an
electric guitar... and electric kazoo.)

The xiph-cited studies I've seen show an identification of difference and a
preference for... MP3. Hey, we want what we want.

Otherwise, go read Von Békésy's work for the foundation, established in the
pre-digital era, but transferable if you understand digital audio.

For recognition of difference in high res audio, see:

Reiss -
[https://qmro.qmul.ac.uk/xmlui/bitstream/handle/123456789/134...](https://qmro.qmul.ac.uk/xmlui/bitstream/handle/123456789/13493/Reiss%20A%20Meta-
Analysis%20of%20High%20Resolution%202016%20Published.pdf?sequence=1)

...And the papers referenced.

It's an interesting meta analysis and a good survey of the last 20 years of
publication on the subject.

If you have some properly controlled double blind trials that show no
discrimination ability, I'd be happy to read them. I'll admit that I haven't
conducted statistically sufficient tests. I have, however, double-blinded (via
software pseudo-random sample randomization).

Like I said, though, I've got work to do. Do some listening. Read some papers.

------
Kiro
Am I the only one who has no idea what 16/44.1 or 16/48 mean? I initially
thought 192 here referred to 192 vs 320 kbps etc., but this is apparently
about something completely different?

~~~
draugadrotten
16/44.1 is 16 bits at a 44.1 kHz sample rate; 16/48 is 16 bits at 48 kHz.
Both refer to the sampled but uncompressed, "raw" audio.

192 and 320 usually refer to the bitrate of MP3-compressed audio, which is
indeed something different. MP3 compression removes more and more detail from
the original audio to fit "inside" this bitrate window, so higher is better
because less of the original content is removed. Only great ears can hear
what was removed from a 320kbps stream.
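The two kinds of numbers can be related with a quick calculation:

```python
# uncompressed PCM bitrate = bit depth x sample rate x channels
bits, rate, channels = 16, 44100, 2
pcm_kbps = bits * rate * channels / 1000
print(pcm_kbps)  # 1411.2 kbps for 16/44.1 stereo (CD audio)

# so a 320 kbps MP3 keeps under a quarter of the raw data rate,
# and a 192 kbps MP3 under a seventh
print(320 / pcm_kbps, 192 / pcm_kbps)
```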

------
abhirag
This seems the best place to plug one of my favorite musicians,
24192([https://m.soundcloud.com/x24192](https://m.soundcloud.com/x24192))

------
mailslot
I don't know. It could be very useful once cellphone makers find out how to
blast audio directly into nerve impulses. New and innovative audio filters.
Humans with custom DNA. Etc.

------
silimike
There's something satisfying about having really fat audio files on the hard
drive.

------
flatfilefan
When I listen to MP3s via 24/192 M-Audio S/PDIF to a Denon 1705, there is a
huge difference versus just optical from the motherboard. Why is that, then?

~~~
anyfoo
There can be dozens of completely different reasons why that might be the
case, without requiring inaudible frequencies to be involved.

------
Applejinx
I work in this industry, and I have produced what is almost certainly the most
high-performance dither to date, which works through noise-shaped Benford
Realness calculations: [http://www.airwindows.com/not-just-another-dither/](http://www.airwindows.com/not-just-another-dither/) I mention this to
say that I can absolutely make 16 bit 'acceptable' or 'listenable', even for
audiophiles. I do that for a living. And yet…

Monty is wrong. To cover the range of human listeners, the required specs even
through use of very insensitive double blind testing (which is geared to
substantially indicate the PRESENCE of a difference between examples if that's
present, and does NOT similarly indicate/prove the absence of a difference
with a comparable degree of confidence: that is a horrible logical fallacy
with realworld consequences) are more like 19-21 bit resolution at 55-75K
sampling.

Beyond this, there's pretty much no problem (unless you are doing further
processing: I've established that quantization exists even in floating point,
which a surprising number of audio DSP people seem not to understand. There's
a tradeoff between the resolution used in typical audio sample values, and the
ability of the exponent to cover values way outside what's required)
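The floating-point quantization point is visible in a couple of lines (this uses Python's `struct` to round values through 32-bit floats; the signal levels are arbitrary):

```python
import struct

def f32(x):
    """Round a Python float (64-bit) to the nearest IEEE 754 32-bit float."""
    return struct.unpack('f', struct.pack('f', x))[0]

# a 32-bit float carries a 24-bit mantissa, so a component sitting
# ~180 dB below a full-scale sample is quantized away entirely
big, tiny = 1.0, 1e-9
print(f32(big + tiny) == f32(big))  # True: the tiny signal is lost
```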

That said, it is absurd and annoying to strive so tirelessly to limit the
format of audio data to EXACTLY the limits of human hearing and not an inch
beyond. What the hell? I would happily double it just for comfort and
assurance that nobody would ever possibly have an issue, no matter who they
were. Suddenly audio data is so expensive that we can't allow formats to use
bytes freely? That's the absurdity I speak of.

Our computers process things in 32-bit chunks (or indeed 64!). If you take
great pains to snip away spare bits to where your audio data words are exactly
19 bits or something, the data will only be padded so it can be processed
using general purpose computing. It is ludicrous to struggle heroically to
limit audio workers and listeners to some word length below 32 bit for their
own good, or to save space in a world where video is becoming capable of 1080p
uncompressed raw capture. Moore's law left audio behind years ago, never to be
troubled by audio's bandwidth requirements again.

Sample rate's another issue as only very nearby or artificial sounds (or some
percussion instruments, notably cymbals) contain large amounts of supersonic
energy in the first place. However, sharp cutoffs are for synthesizers, not
audio. Brickwall filters are garbage, technically awful, and expanding sample
rate allows for completely different filter designs. Neil Young's ill-fated
Pono took this route. I've got one and it sounds fantastic (and is also a fine
tool for getting digital audio into the analog domain in the studio: drive
anything with a Pono and it's pretty much like using a live feed). I've driven
powerful amplifiers running horn-loaded speakers, capable of astonishing
dynamic range. Total lack of grain or any digital sonic signature, at any
playback level.

My choice for sample rate at the extreme would be 96K, not 192K. Why? Because
it's substantially beyond my own needs and it's established. I'm not dissing
192K, but I wouldn't go to war for it: as an output format, I would rather
leave the super high sample rate stuff to DSD (which is qualitatively
different from PCM audio in that the error in DSD is frequency-sensitive: more
noise in the highs, progressively less as frequency drops).

Even with DSD, which is known to produce excessive supersonic noise even while
sounding great, the scaremongering about IM distortion is foolish and wrong.
If you have a playback system which is suffering from supersonic noise
modulating the audio and harming it, I have three words you should be studying
before trying to legislate against other people's use of high sample rates.

"Capacitor", and "Ferrite Choke".

Or, you could simply use an interconnect cable which has enough internal
capacitance to tame your signal. If you have a playback system that's capable
of being ruined just by 192K digital audio, your playback system is broken and
it's wrong to blame that on the format. That would be very silly indeed.

I hope this has been expressed civilly: I am very angry with this attitude as
expressed by Monty.

~~~
Applejinx
I will add that the concerns of transient timing are actually a fallacy: given
correct reconstruction, sampling is more than capable of producing a high-
frequency transient that crosses a given point at a given moment in time
that's NOT simply defined by discrete samples. Reconstruction is key here, and
no special technique is required: sampling and reconstruction alone will
produce this 'analog' positioning of the transient anywhere along a stretch of
time.

The accuracy is limited by the combination of sample rate AND word length: any
alteration of the sample's value will also shift the position of the transient
in time.

But since the 'timing' issue is a factor of reconstruction, you can improve
the 'timing' of transients at 44.1K by moving from 16 to 24 bit. The
positioning of samples will be a tiny bit more accurate, and that means the
location of the reconstructed wave will be that much more time-accurate, since
it's calculated using the known sample positions as signposts.

Positioning of high frequency transients does not occur only at sample
boundaries, so that alone isn't an argument for high sample rates. You can
already place a transient anywhere between the sample boundaries, in any PCM
digital audio system. The argument for higher sample rates is the use of less
annoying filters, and to some extent the better handling of borderline-
supersonic frequencies. For me, the gentler filtering is by far the more
important, and I can take or leave the 'bug killing' super-highs. I don't find
'em that musical as a rule.
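The sub-sample positioning claim can be checked numerically. A sketch in plain Python (the sample count and search-grid resolution are arbitrary) that places a bandlimited click exactly halfway between two 44.1 kHz sample instants and recovers its position from the discrete samples alone:

```python
import math

fs = 44100.0
N = 400

def sinc(u):
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

# samples of a bandlimited click whose peak sits at t0 = 100.5 samples,
# i.e. exactly halfway between two sample instants
t0 = 100.5
x = [sinc(n - t0) for n in range(N)]

def reconstruct(t):
    """Whittaker-Shannon reconstruction from the discrete samples alone."""
    return sum(x[n] * sinc(fs * t - n) for n in range(N))

# hunt for the peak on a grid of 1/100-sample steps around sample 100
grid = [(100 + k * 0.01) / fs for k in range(101)]
peak_t = max(grid, key=reconstruct)
print(peak_t * fs)  # ~100.5: the click is recovered between sample boundaries
```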

------
plg
This is fabulous

------
sp332
(2012) but still an excellent article.

Edit: And the linked "Show & Tell" video is a great way to get some
"intuition" about the sampling theorem.
[https://video.xiph.org/vid2.shtml](https://video.xiph.org/vid2.shtml)

~~~
zymhan
Yeah, this post title should be dated, I definitely read this exact post years
ago.

I mean it references Steve Jobs in the present ffs.

------
sqldba
It really needs a TL;DR because it's all buried in so much fluff I didn't
understand why. (I understand the rest of you all really enjoy that and
wouldn't call it fluff but I'm not an audio engineer).

------
kevinsaptell
Exactly. In 2017 it looks silly, but it has been popular for some time.

------
RachelF
To be replaced soon by "4k video is very silly indeed"

If you look at the angular resolution of the eye, unless you are sitting very
close to the screen, you can't resolve 4k video.

~~~
jaquers
Can't tell if this is sarcasm or not, but I think that's sort of the point.
You want a high enough pixel density so you can't perceive pixels at normal
viewing distance, and I can definitely see pixels on my 42in 1080p screen from
across the room.

~~~
sundvor
I'm going to go for "sarcasm" there.

I use a 65" 4k TV with my HTPC in the living room, but often do desktop-ish
things on it. (Logitech's wireless keyboards are great).

The difference in resolution between 1080 and 2160 is huge. 1080 is just
fuzzy.

~~~
quickthrower2
65 freaking inches! Of course you are gonna need 4k

------
yzhou
The devil is in the DAC and ADC. You just can't convert 16-bit/24-bit data
directly to/from analog without much loss. A voltage divider with 1/65536
accuracy simply doesn't exist.

So you have to up-sample the signal to high rates with fewer bits, like 1 bit
to 6 bits, then do the conversion, and get the best SNR you can.

In this sense, there's simply a lot of advantage to using 24/192, since the
above conversion can result in less loss and higher SNR
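The oversample-with-fewer-bits idea being described is delta-sigma modulation, and a first-order version fits in a few lines (plain Python; the oversampling ratio and test signal are arbitrary, and real converters use higher-order loops and proper decimation filters):

```python
import math

def sigma_delta(x):
    """First-order sigma-delta: a 1-bit stream whose local average tracks x."""
    out, acc = [], 0.0
    for v in x:
        bit = 1.0 if acc + v >= 0 else -1.0  # 1-bit quantizer
        acc += v - bit                       # integrate the quantization error
        out.append(bit)
    return out

# a sine that is slow relative to the bit rate, modulated down to 1 bit
osr = 64  # oversampling ratio used for the crude decimation below
x = [0.5 * math.sin(2 * math.pi * n / 4096) for n in range(16384)]
bits = sigma_delta(x)

# boxcar-average each block of `osr` bits back down to multi-bit samples
dec = [sum(bits[i:i + osr]) / osr for i in range(0, len(bits), osr)]
ref = [sum(x[i:i + osr]) / osr for i in range(0, len(x), osr)]
err = max(abs(a - b) for a, b in zip(dec, ref))
print(err)  # bounded by 2/osr for this first-order loop
```

The 1-bit stream, averaged, tracks the input closely; the quantization error is pushed to high frequencies where the decimation filter removes it.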

~~~
arrrvalue
why was this down-voted too? it's the truth!

------
arrrvalue
and yet, we are nowhere near being able to electronically reproduce a live
acoustic music performance. have you ever walked by a bunch of musical sound
coming out of a room and thought to yourself, "wow, those live musicians sound
great" only to discover it was just a stereo playing? nope.

as engineers we will never solve this problem as long as the "44.1kHz is good
enough" dogma is perpetuated.

here's a question. why are frequency and bit depth the only two variables
under discussion here? how does the human ear locate a sound in space? suppose
I place a series of 20kHz tone generators along a wall (and that I can still
hear 20kHz :) and trigger them at different times, and record the session in
stereo at 44.1kHz with a standard X-Y mic setup. will I be able to reconstruct
the performance?

~~~
romwell
>as engineers we will never solve this problem [reproduce a live acoustic
music performance] as long as the "44.1kHz is good enough" dogma is
perpetuated

It's the opposite. We are never going to solve this problem if we are going to
focus on things that have nothing to do with the problem. Compare and
contrast:

>as engineers we will never solve this problem as long as the "copper wires
are good enough" dogma is perpetuated

Also, please read the article. The author specifically lists advances in audio
tech they think are worthwhile to consider, such as surround sound. This
actually addresses the problem you mentioned (reproducing the live
performance) and the question you asked, i.e.

>here's a question. why are frequency and bit depth the only two variables
under discussion here?

They are not, at least not in the article. Here it's because that's what's in
the title, and not everyone gets to the end of the article.

Some comments do talk about the importance of having a good DAC for a good
sound.

~~~
arrrvalue
interesting viewpoint, however, did you think about the experiment I
presented? without an answer, sample rate and cabling cannot be considered
equivalent distractions on the road to high fidelity.

~~~
function_seven
You’re talking about something orthogonal to the question at hand. It’s like
complaining that the 4K TV sucks at VR.

Of course it does. It’s not meant to provide VR.

Same thing with sampling and bit-depth. Those address digital encoding of
analog signals. They have nothing to say about speaker design, number of audio
channels, room acoustics, or the myriad other factors that go into replicating
a live stage performance.

~~~
arrrvalue
It's not obvious that 2 channels of recorded audio aren't sufficient to
recreate a convincing stereo image; suggesting that I'm seeking the equivalent
of VR is specious.

And you haven't answered my question about the array of 20kHz tone generators.
In fact, NOBODY has, and yet the question has been down-voted! How is that
even possible? Posing a novel experiment which might invalidate the populist
view considered harmful?

TFA's author is not active in the field of advancing man's ability to recreate
live music more convincingly, AFAIK; he writes codecs. He believes people
shouldn't purchase 192kHz downloads. He's certainly right that most consumers
won't be able to tell the difference with their current equipment. But he
makes no mention of the interaural time difference in human auditory
perception, so he's already not telling the whole story. There is more to
learn here, folks, and down-voting a _question_ is an embarrassing failure of
these forums. Why aren't posts in support of music piracy down-voted (read
above)?

~~~
romwell
>TFA's author is not active in the field of advancing man's ability to
recreate live music more convincingly, AFAIK; he writes codecs

As your other questions have been addressed by others, I simply would like to
point out that this seems to be quite an arrogant stance to have.

The development of codecs has a lot to do with understanding of how the humans
perceive sound, and how to effectively encode and reproduce sounds - which is
useful even if you personally never listen to anything but analog recordings
on analog systems.

However, we do live in a digital world, and one where codecs are a necessity.
Codecs made recording, sharing, and distributing digital media at all possible
- and now, they are making it possible to create _better_ recordings by any
metric you choose.

Consider this: the bandwidth and space savings that codecs give you allow you
to record more data with the same equipment at the highest settings. That's why
I don't have to worry about running out of memory when I want to record 4-channel
surround sound on my Zoom H2N (something that definitely goes towards a more
faithful reproduction of being there than, say, bumping the frequency to
192kHz, which, incidentally, is the point of the article).

Unless you are there to record every live show, we'll have to rely on other
people doing that - and guess what, they'll use codecs! How do I know?
Because I do, they do, and the absolute majority of live show
recordings that I've seen were not available in lossless formats. For that
matter, good codecs contribute directly to the quality of the sound you'll
hear.

Therefore, advancing the codecs does advance man's ability to recreate live
music more convincingly.

So please, pause before dismissing other people's work.

>But he makes no mention of the interaural time difference in human auditory
perception

He also doesn't mention how long it would take from Earth to Mars on a rocket,
or the airspeed velocity of an unladen swallow. If you want to make a claim
that this is somehow relevant to the question, you need to argue why, with
sources - or simply ask the author, who might just answer.

>There is more to learn here, folks, and down-voting a question is an
embarrassing failure of these forums. Why aren't posts in support of music
piracy down-voted (read above)?

Not all questions are created equal. Your last question is an example of one
that rightly deserves to be downvoted, as it contributes nothing to the
discussion (of whether 192kHz really does anything for us), appeals to
emotion, and derails the conversation off the topic. Please don't do that.

~~~
arrrvalue
> Therefore, advancing the codecs does advance man's ability to recreate live
> music more convincingly.

Only where bandwidth and storage are constrained. If we're trying to push the
state of the art, it's not going to be with a Zoom H2N.

The best music reproduction systems use lossless compression. Psychoacoustic
compression does NOT get us closer to the original performance. I'm stating
this as someone who gets 5 out of 5 correct, every time, on the NPR test:

[http://www.npr.org/sections/therecord/2015/06/02/411473508/how-well-can-you-hear-audio-quality](http://www.npr.org/sections/therecord/2015/06/02/411473508/how-well-can-you-hear-audio-quality)

(I'm ignoring the Suzanne Vega vocal-only track due to both its absence of
musical complexity and use as test content during the development of the MP3
algorithm.)

While I appreciate xiphmont's codec work, I am dismissive of his open attempt
to steer research and commerce in this area.

Why is his article posted as "neil-young.html"? Is that really fair?

> If you want to make a claim that this is somehow relevant to the question,
> you need to argue why, with sources - or simply ask the author, who might
> just answer.

Please see chaboud's excellent post above, referencing the work of Georg von
Bekesy.

> Your last question is an example of one that rightly deserves to be
> downvoted

You're referring to my array-of-20kHz-tone-generators experiment? Sorry, I
don't know the answer; I haven't done the experiment myself. I was hoping
someone here had! Where's the appeal to emotion, though? If the experiment
shows a higher sample rate is necessary (that's the whole point of the
experiment) it's germane.

~~~
romwell
>Only where bandwidth and storage are constrained

I.e. everywhere in this universe. There is no such thing as unlimited
bandwidth/storage. The gains that codecs give allow us to record information
that otherwise would be lost.

>If we're trying to push the state of the art, it's not going to be with a
Zoom H2N.

I wish I could see the future so clearly!

I only have guesses, and my guess tells me that audio captured from 10 Zoom
H2Ns at 48kHz will store more information than audio from a single microphone
at 480kHz. Current "state of the art" seems to use fewer channels. An advance
in the state of the art in the direction of utilizing _more sources_ seems
more than feasible to me.
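The raw numbers behind that guess can be sketched out (illustrative arithmetic
only; the channel counts and 24-bit depth below are assumptions, not specs
quoted from anywhere):

```python
# Back-of-envelope PCM data rates: many channels at a modest sample rate
# capture more raw data than one channel at an extreme rate.
# (Channel counts and bit depth here are illustrative assumptions.)

def pcm_bytes_per_sec(channels, sample_rate_hz, bits=24):
    """Uncompressed PCM data rate in bytes per second."""
    return channels * sample_rate_hz * bits // 8

# 10 recorders with 4 channels each, at 48 kHz:
multi = pcm_bytes_per_sec(channels=10 * 4, sample_rate_hz=48_000)
# One microphone at 480 kHz:
single = pcm_bytes_per_sec(channels=1, sample_rate_hz=480_000)

print(multi, single)  # the multichannel rig records 4x the raw data
```

Same bit depth, 4x the raw samples per second - and those samples describe the
sound field at 40 points in space rather than one.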

>Psychoacoustic compression does NOT get us closer to the original performance

I think you have missed my point. Lossy-compressed data is obviously not
going to be better than the uncompressed source.

However, _we do not live in a world of infinite resources_. Given the
constraints, compression offers new possibilities.

At the same space/bandwidth, you can have, e.g.:

- uncompressed audio from a single source

- compressed audio from 5x as many sources

- compressed audio from 2x sources, plus some other data which affects the
perception of the sound (???)

This plays right into your question "Why are we only considering
bitrate/frequency?" - we don't. Compression offers more flexibility in making
other directions viable.

This is why I believe that codec research is important for advances of the
state of the art.

>I am dismissive of his open attempt to steer research and commerce in this
area.

In what area exactly? What research? He is not "steering research", he is
educating the less knowledgeable general public. So far, your dismissive
attitude can also be applied verbatim to anyone who explains why super-thick-
golden-cables from MonstrousCable(tm) are a waste of money.

>> Your last question is an example of one that rightly deserves to be
downvoted

>You're referring to my array-of-20kHz-tone-generators experiment?

No, I was referring to this:

>Why aren't posts in support of music piracy down-voted (read above)?

~~~
arrrvalue
xiphmont's primary goal appears to be to stop Neil Young from selling 24/192
audio to the general public; that's why he called the page neil-young.html.
Sure, few buyers have the ears or equipment to pursue anything beyond the
compact disc.

The problem is that many readers of neil-young.html will come away thinking
they understand human hearing and digital sampling, when in fact the article
is far too sparse on details to understand either; there is no discussion of
how sounds are located in 3D space, or of how phase information is recovered.
It is amazing that you can completely cover one ear, rub your fingers together
behind your head, and precisely pinpoint where your fingers are. It is equally
remarkable that "Sampling doesn't affect frequency response or phase," yet
xiphmont doesn't explain why this is so.
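On the phase point, it's worth noting that band-limited sampling does preserve
sub-sample timing, which is easy to check numerically. A minimal sketch (the
tone frequency and delay below are arbitrary choices for illustration):

```python
import numpy as np

# Recover a 5-microsecond delay from a 1 kHz tone sampled at 44.1 kHz.
# The delay is far smaller than the ~22.7 us sample period, yet the
# sampled signal's phase still encodes it.
fs, f, delay = 44_100, 1_000.0, 5e-6
t = np.arange(fs) / fs                     # one second of samples
y = np.sin(2 * np.pi * f * (t - delay))    # delayed, band-limited tone

# Project onto a complex exponential at f to read off the phase,
# then convert phase back to a time delay.
z = np.sum(y * np.exp(-2j * np.pi * f * t))
est = -(np.angle(z) + np.pi / 2) / (2 * np.pi * f)

print(est)  # ~5e-6 seconds: sub-sample delay recovered
```

Whether this fully settles the interaural-time-difference question is a
separate argument, but the sample period itself is not the timing floor.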

And then there's this lovely quote:

"It's true enough that a properly encoded Ogg file (or MP3, or AAC file) will
be indistinguishable from the original at a moderate bitrate."

which is provably wrong. I can very reliably pick the uncompressed WAV each
try when compared against 320kbps MP3.

My attitude is in support of furthering research in the area of live sound
reproduction. As I've said, we are VERY far away right now. It is foolish to
believe we understand human musical perception completely today. We cannot
even replicate a simple cymbal strike with today's recording and playback
technology.

I would encourage the curious to stand in the center of an outdoor arc of 100
horn players, like this (feel free to skip first 48 seconds):

[https://www.youtube.com/watch?v=2EDIDCdy5Es](https://www.youtube.com/watch?v=2EDIDCdy5Es)

Once you experience that live, try to figure out how to replicate the input to
your two ears. You can't, without 100 brass players.

Interestingly, these two examples of trumpet and cymbal have significant
ultrasonic frequency content:

[https://www.cco.caltech.edu/~boyk/spectra/spectra.htm](https://www.cco.caltech.edu/~boyk/spectra/spectra.htm)

I don't believe it's a coincidence.

------
kuschku
This article makes the same mistake that is frequently made with video.

So, let’s look at a similar issue with video. Your display is likely only
720p or 1080p, but a 4K video on YouTube will still look a lot better,
although technically there should be no visible difference.

But the reality is, we don’t get uncompressed video, or uncompressed audio.

We have a choice between audio compressed with lossy codecs at 16bit/44.1kHz
or 16bit/96kHz, or 4:2:0 video at 1080p or at 4K.

And just like you need 4K 4:2:0 mp4 video to get even close to the quality of
uncompressed 1080p 4:4:4 video, you also need far higher sampling rate and
depth of highly compressed audio to get a quality similar to 16bit/44.1kHz
PCM.

That’s the real reason why 24bit/192kHz AAC downloads were even discussed.
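For what it's worth, the chroma arithmetic behind the 4K-on-a-1080p-display
effect can be spelled out (a sketch assuming the common 4:2:0 subsampling,
where color is stored at half resolution in each dimension):

```python
# Effective chroma (color) resolution under 4:2:0 subsampling:
# chroma planes are stored at half the luma resolution per axis.

def chroma_resolution(width, height):
    return width // 2, height // 2

print(chroma_resolution(1920, 1080))  # 1080p 4:2:0 -> (960, 540) chroma
print(chroma_resolution(3840, 2160))  # 4K 4:2:0    -> (1920, 1080) chroma
```

A 4K 4:2:0 stream carries color at full 1080p resolution, which a 1080p 4:2:0
stream cannot - one concrete reason 4K can look better on a 1080p display.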

~~~
mcbits
Suppose you have a 5 Mbps data budget, 1080p display, and 4k source material.
You will get better quality by first downsampling the 4k to 1080p and then
compressing and distributing the result. If you compress and distribute the
4k, followed by downsampling to display at 1080p, you cannot recover the color
and/or motion information that was lost in order to fit all of those pixels
into 5 Mbps.

However, if you have a 20 Mbps budget for the 4k to account for having 4 times
as much original data, then there shouldn't be much of a difference in the
downsampled 1080p video (ignoring peculiarities of the codec).
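The budget argument in numbers (a rough sketch; the 30 fps figure is an
assumption, and real encoders don't spend bits uniformly across pixels):

```python
# Bits available per pixel at a fixed bitrate: quadrupling the pixel
# count without quadrupling the bitrate starves the encoder.

def bits_per_pixel(bitrate_bps, width, height, fps=30):
    return bitrate_bps / (width * height * fps)

print(bits_per_pixel(5e6, 1920, 1080))   # 1080p at 5 Mbps
print(bits_per_pixel(5e6, 3840, 2160))   # 4K at 5 Mbps: 1/4 the bits/pixel
print(bits_per_pixel(20e6, 3840, 2160))  # 4K at 20 Mbps: parity restored
```

At 20 Mbps the 4K stream gets the same bits per pixel as 1080p at 5 Mbps,
which is why the downsampled results should then be comparable.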

All this is not very relevant to the audio issue being discussed. It would be
relevant if it were physically impossible to perceive the difference between
1080p and 4k video, and if watching 4k video potentially caused optical
illusions. In that case, the _only_ reason to prefer the 20 Mbps 4k stream
would be if you planned to edit, mix, or zoom around in the video instead of
simply watching it.

When it comes to audio, since size isn't as much of a concern as video, in
most cases I would say "maybe I'll want to edit it someday" is strong enough
reason to get the 24/192 material at a correspondingly high bitrate if it's
available.

~~~
baddox
Of course your theory is quite sound, but I will point out that in practice
most 4k streaming content uses an HEVC codec, while most 1080p streaming
content uses an AVC codec, so you'll likely have much better results on your
data budget with the 4k signal, even when it's significantly downsampled for
your display.

