There is no point to distributing music in 24-bit/192kHz format.

sjwright · on March 5, 2012

I must say I get rather irritated when people spend time worrying about dubious 'tweaker' methods to improve their audio, when the most under-performing component of most people's sound equipment also has the lowest-hanging fruit: The room itself.

When people ask me where they should spend money to improve the quality of their hi-fi or home theater system, in nearly every case my response will be something like "get a thicker rug" or "put something on this wall to absorb sound reflections, even if it's just a bookshelf."

Beyond that, I'd tend to say something like "stop being so paranoid about what you think you can't hear, and enjoy the damn music."

icarus_drowning · on March 6, 2012

stop being so paranoid about what you think you can't hear, and enjoy the damn music.

I'm a composer who works in film/games. I can assure you this is exactly what I'd like people to do when they listen to my music. I spend 99% of my time trying to create good musical ideas, and I spend 1% of my time getting the mix down. I get criticized (rightly) for this quite a bit, but it is hard to care about someone sitting in a >$10,000 labyrinth of sound equipment when I'd rather write a catchy tune.

Then again, when I write sheet music I have to endure some of the most soul-crushingly awful midi sequencing in order to check my work, so perhaps I'm too tolerant of terrible sound quality. Still, I'd rather people listened to the music, not the sound of it.

davedx · on March 6, 2012

What's wrong with caring about both sides of it?

A good sounding catchy tune is something worth spending that little bit more time on.

icarus_drowning · on March 6, 2012

I don't necessarily mind effort spent on making sure that music is presented properly-- what I mind is when it supersedes all other concerns about the music.

kahawe · on March 6, 2012

> A good sounding catchy tune is something worth spending that little bit more time on.

But in reality most of your customers want a ton-frakk of compression, loudness and filters on that catchy tune so it ultimately sounds HUGE on the tiniest phone, radio and car speakers... so all that audiophile mixing and dynamics are completely lost anyway.

Anechoic · on March 5, 2012

Agree one-hundred percent about the room (although the prescription isn't always as simple as "get a thicker rug" etc).

The other issue regarding high-frequency sound reproduction is that in most cases, the loudspeaker won't be outputting much beyond 22-25 kHz (assuming very good quality loudspeakers, cheap consumer grade units might struggle to hit a -6 dB point at 18 kHz) and even for the speakers that have usable output at that range, the directivity at those frequencies will be so narrow that your head will have to be locked in the perfect "sweet spot" to hear anything.
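
To put rough numbers on the directivity point: a driver starts to beam once the wavelength shrinks to about the size of its diaphragm. A minimal sketch, assuming a speed of sound of 343 m/s and a typical dome tweeter of around 25 mm (both assumptions, not figures from the comment above):

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C


def wavelength_mm(frequency_hz: float) -> float:
    """Wavelength in millimetres of a tone at the given frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz * 1000.0


# Compare against an assumed ~25 mm tweeter dome.
for f in (1_000, 10_000, 18_000, 22_000):
    print(f"{f:>6} Hz -> {wavelength_mm(f):6.1f} mm")
```

At 22 kHz the wavelength comes out shorter than the assumed dome diameter, which is consistent with the narrow "sweet spot" described above.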

sjwright · on March 5, 2012

  > Agree one-hundred percent about the room

With a username like yours, I'm not surprised. :-)

  > although the prescription isn't always as
  > simple as "get a thicker rug" etc

A prescription is only as good as the likelihood that it will be heeded by the patient. A rug is an easy win; acoustic ceiling tiles and bass traps are a bit harder...

marcusf · on March 5, 2012

I might sound/be stupid for asking, but what's the actual physical response from something at 22 kHz+? I have a hard time picking up a pure sine > 17 kHz. I doubt I'd get any aural response from anything at 22 kHz, so what's the deal?

bigiain · on March 5, 2012

The deal is just that you're getting older. Your ears just don't work as well as a 12-year-old's. Neither do anybody else's your age (within the bounds of typical human variation - probably well over 95% of us _never_ heard 22kHz, no matter _how_ "young" our ears were).

nitrogen · on March 5, 2012

I was once in a small, treated room working with some rather large PA speakers. I was curious how far my hearing range actually extended, and did something very unwise: I played a 20kHz tone and very briefly ramped the volume up and down. I definitely heard it, but I also induced quite a lot of pain. I learned two lessons: 1. my threshold of hearing at 20kHz is near or above the threshold of pain, and 2. don't do that ever again.

joshAg · on March 6, 2012

heh, your story reminded me of the bar scene in Good Will Hunting:

WILL: The sad thing is, in about 50 years you might start doin' some thinkin' on your own and by then you'll realize there are only two certainties in life.

CLARK: Yeah? What're those?

WILL: One, don't do that. Two -- you dropped a hundred and fifty grand on an education you coulda' picked up for a dollar fifty in late charges at the Public Library.

marcusf · on March 6, 2012

Yeah I get that I'm getting older. It's just, what's the point of having a stereo that gives perfect playback at 22 kHz if you can't hear it? I'm guessing there must be something since people buy gear like that, or is it just a case of deranged audiophiles?

gbog · on March 6, 2012

You might not hear a pure 22kHz sine, but any sound from, say, a harpsichord will have a lot of these highs, and some think they are a part of the sound that one feels without actually hearing it. I'm not endorsing this view; sound is like wine tasting: a lot of hand waving and little solid ground.

eru · on March 7, 2012

There's blind wine tasting, if you want the real deal. Without the hand waving.

marcusf · on March 5, 2012

...of course I mean audible. Or spectral? Not aural, anyway. English is not my first language. Sorry.

sp332 · on March 6, 2012

That's really good actually, I'm not even sure I know the difference between aural and audible. (<- also, that sentence is a run-on and not good English haha)

cop359 · on March 6, 2012

I don't really know much about this, but wouldn't the 22kHz sounds potentially create beats in the lower frequencies?

nullc · on March 6, 2012

Acoustic "beat tones" aren't "real" tones: you hear them because of non-linearities in the ear-brain system, but you have to hear the initial tones first. (Well, unless you're talking >>130dB SPL levels where the air starts becoming non-linear, but then lower frequency recording would capture it fine.)

If you could hear subharmonic beats from ultrasonics then it would be _very_ easy to demonstrate, alas.

zachrose · on March 6, 2012

Curious, what does non-linear mean in this context?

mseebach · on March 6, 2012

IIRC, linearity is when you put a sound wave frequency into the medium (air) at some point, and you can predict the frequency of the sound wave at some other place using a linear function - meaning that there is no distortion. Non-linear is when the physics of the medium starts screwing with that function.

JadeNB · on March 6, 2012

I believe that it means that the superposition principle (http://en.wikipedia.org/wiki/Superposition_principle) doesn't hold: the net response at a point is not just the weighted sum of the individual responses.
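
The superposition point can be checked numerically. A minimal sketch, assuming two equal-amplitude tones at 40 kHz and 41 kHz and a made-up nonlinearity (a small squared term; the 0.1 coefficient is purely hypothetical): in the linear sum there is nothing at the 1 kHz difference frequency, and a component only appears once the nonlinearity is applied.

```python
import math

SAMPLE_RATE = 192_000  # Hz, high enough to represent the 40/41 kHz tones
N = 19_200             # 0.1 s, so every frequency used fits whole cycles


def tone(freq_hz):
    """Unit-amplitude sine at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(N)]


def amplitude_at(signal, freq_hz):
    """Amplitude of the freq_hz component, via a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


# Linear medium: the two tones simply add.
linear = [a + b for a, b in zip(tone(40_000), tone(41_000))]

# Hypothetical nonlinear medium: output = x + 0.1 * x^2.
nonlinear = [x + 0.1 * x * x for x in linear]

print("1 kHz in linear sum:    ", amplitude_at(linear, 1_000))
print("1 kHz after nonlinearity:", amplitude_at(nonlinear, 1_000))
```

The squared term is what creates the difference tone here; how strong such a term is in real air or in the ear is exactly what the thread is arguing about, and this sketch does not settle it.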

cop359 · on March 6, 2012

so without very high power sounds and the nonlinearity business the whole "Sound from Ultrasound" wouldn't work? Huh, I guess all this time I misunderstood it.

for those curious http://en.wikipedia.org/wiki/Sound_from_ultrasound

marcusf · on March 6, 2012

Well, what I can think of is that of course you need to sample at > 2*max frequency if you do uniform sampling to avoid aliasing (by Nyquist), but that's not the same as playback.
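
The aliasing that Nyquist sampling avoids can be sketched with simple folding arithmetic. This assumes an ideal sampler with no anti-alias filter in front of it, which is an assumption for illustration and not how real converters are built:

```python
def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a pure tone shows up after unfiltered sampling."""
    folded = tone_hz % sample_rate_hz
    # Anything above half the sample rate folds back below it.
    return min(folded, sample_rate_hz - folded)


for f in (10_000, 20_000, 30_000):
    print(f, "Hz sampled at 44100 Hz appears at", alias_frequency(f, 44_100), "Hz")
```

Tones below half the sample rate come through unchanged; the 30 kHz tone folds down into the audible band, which is why the input is band-limited before sampling.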

crististm · on March 6, 2012

Yes, there will be inter-modulations from higher frequencies. There are also some from the audible spectrum, but if the amp is linear enough they will be low.

ErikRogneby · on March 5, 2012

I agree. It should be about the performance, not the sonics. There are plenty of old Motown and even Beatles recordings with distorted vocals, bad edits, etc. Your brain passes right over them because of the emotional content of the music.

phillmv · on March 6, 2012

That's because they focussed on the most likely end-user experience:

>While Motown shortened song to fit into radio time, the company also produced records specifically with car radio audio quality in mind. Motown recording engineers set up car speakers in the studio so that they could simulate and perfect how a song would sound emanating from a car radio

- what's the point of engineering things to a set of conditions virtually none of your target audience possesses?

http://web.wm.edu/amst/370/2005/sp3/machinery_marketing.htm

bigiain · on March 5, 2012

This.

I worked out a long time ago, that I enjoy listening to _music_, not HiFi gear.

My advice to people who ask how to make their system sound better? Buy some music you enjoy more…

I can enjoy a wonderful performance of a great tune played through my laptop speakers - much more than I enjoy test tones or gear-demo-tracks through sound gear worth something north of a new car…

(not that I haven't been "that guy" in my past…)

cuu508 · on March 6, 2012

> My advice to people who ask how to make their system sound better? Buy some music you enjoy more…

Good advice, but you do need some baseline quality equipment to start with. Got my car with one speaker blown out, speakers wired semi randomly (left-right and front-rear faders don't work as they should), also powering line-in source from cigarette lighter results in funky background noise. Sounds great--when a good tune is playing and I'm able to recognize it ;-)

brigade · on March 5, 2012

I'd disagree, at least among the people I know: they all have cheap HTIB systems and the single biggest, most cost-effective improvement you can make from there is to buy better speakers.

Then you can worry about your room.

Anechoic · on March 5, 2012

No, even in that case, the room can still overpower the speaker. A $200 HTIB system in a properly treated room will sound a lot better than $28,000 Wilson Watt Puppies in a bathroom for example.

gbog · on March 6, 2012

Hum, my experience (I was sound engineer in a previous life) is that the first thing to fix bad sound is to flatten the equalizer and to remove bass enhancer. Then I'd put the speakers on a solid table in a relative symmetry regarding the listener, while checking they have the correct phase. All the rest is rarely necessary.

polshaw · on March 5, 2012

Sure, but that's a rather contrived example-- most people have a fairly normal room, and the average joe would be best served by getting a good pair of speakers, and a reasonable amplifier and DAC, before worrying seriously about room acoustics.

sjwright · on March 5, 2012

A fairly normal room is pretty awful these days, particularly with the trend towards timber flooring and sparse furnishing.

dedward · on March 6, 2012

I'd even skip the dac (I mean I might not.... but a decent amplifier, and I just mean decent, like something that still works from the 70's or 80's) and a decent pair of tower speakers (needn't be expensive), and, well, just don't use the shittiest cables you can find (I mean as long as they are thicker than a few human hairs you're okay)- it'll sound far, far better.

Or get some half decent headphones.

nitrogen · on March 5, 2012

You might be surprised how much of a difference EQ can make. As an experiment I once used 12 bands of parametric EQ to adjust the speakers in a cheap, old LCD monitor. Sure, you're not going to get any better bass response than before, but stereo imaging (nonexistent before EQ, perfect phantom localization after), spectral balance, clarity, distortion (due to not exciting resonances in the monitor), etc. were significantly improved. Most people could have added an EQ'd subwoofer to those tinny LCD speakers and been completely satisfied.

One of the many things EQ can't fix, of course, is room reflections, which can be helped by room treatments and speakers with a directivity better suited to the room.

gbog · on March 6, 2012

Of course EQ can help with room reflection! In a square room you'll have a resonance at a given frequency, and you can mitigate this problem a bit with an EQ. But usually EQs are used to add bass and do more harm than good.

nitrogen · on March 7, 2012

By room reflections I mean higher frequency reflections that result in comb filtering and spatial and temporal smearing of the sound, rather than lower frequency resonances that result in the standing waves you mention. EQ can reduce the effect of room resonances, but it still can't fix the extended decay time at those frequencies[0].

[0] DRC can improve this, but only within a small sweet spot.
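
The two effects being distinguished here sit in very different frequency ranges, which a rough calculation shows. A minimal sketch, assuming 343 m/s for the speed of sound, a 4 m room dimension, and a reflection that travels 0.5 m further than the direct sound (all three numbers are illustrative assumptions):

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed


def axial_modes_hz(room_dimension_m, count=3):
    """First few axial standing-wave frequencies between two parallel walls."""
    return [n * SPEED_OF_SOUND_M_S / (2 * room_dimension_m)
            for n in range(1, count + 1)]


def comb_notches_hz(extra_path_m, count=3):
    """First few cancellation frequencies for a single delayed reflection."""
    delay_s = extra_path_m / SPEED_OF_SOUND_M_S
    # Cancellation happens where the delay equals an odd number of half periods.
    return [(2 * k + 1) / (2 * delay_s) for k in range(count)]


print("Axial modes, 4 m wall spacing:", axial_modes_hz(4.0))
print("Comb notches, 0.5 m extra path:", comb_notches_hz(0.5))
```

The modes land in the bass, while the comb notches start in the midrange and repeat upward, and they move whenever the listener or the speaker moves.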

sjwright · on March 5, 2012

I should point out that after room treatment, my next recommendation tends to be an amplifier with Audyssey MultEQ in-room calibration. I've never heard a listening environment that didn't sound unambiguously better with it enabled.

jodrellblank · on March 6, 2012

Did you read the article? Are you sure it wasn't just unambiguously louder? ;)

Terretta · on March 6, 2012

Yes. Because one of the settings shown after the full room color sweeps is dB, and the sweeps will often set various speakers a bit quieter. (The goal of Audyssey calibration is to eliminate frequency resonance hot spots in common listening positions, as well as get flat response from your speaker setup.)

nullc · on March 5, 2012

Audio tweaker with hacker leanings but without carpentry skills? Start your engines: http://drc-fir.sourceforge.net/

b0rsuk · on March 6, 2012

We have a lot in common.

It is said that in most cases 192 kbit .mp3 is indistinguishable from >192, and blind tests support that. Granted, there are instruments like castanets which make it easier to hear the difference. In general though, I can't distinguish 128 from 192 and I listen to music a lot. Also it's unlikely that my hearing is already damaged because I try to keep volume low.

But I've noticed that where I put the speakers makes a huge difference. I can easily tell the difference from speakers on the floor versus speakers on my desk. Where I'm at the moment also matters a lot. If I lie on the floor, floor speakers don't sound as bad anymore.

In the end, I use headphones. Midrange Audio Technica ones, and I'm probably already overpaying a bit. But I bought them for build quality and comfort, and I wasn't disappointed. I can wear them for hours (not healthy I guess, but I'm used to wearing them even with no music being played). Headphones have the advantage that it suddenly stops mattering where your speakers are and where you are relative to them.

tcarnell · on March 6, 2012

Is the 192 the bitrate or the frequency response? I thought 192 in MP3 was the bit rate, not the maximum frequency response...
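
For comparison, the "192" in "192 kbit .mp3" is a bit rate in kbit/s, while the "192kHz" in the article title is a sample rate. A minimal sketch of how the two relate for uncompressed PCM, assuming stereo and no container overhead:

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels=2):
    """Raw PCM bit rate in kbit/s (no compression, no container overhead)."""
    return sample_rate_hz * bit_depth * channels / 1000


cd = pcm_bitrate_kbps(44_100, 16)
hires = pcm_bitrate_kbps(192_000, 24)
print(f"CD audio (16-bit/44.1kHz): {cd:.1f} kbit/s")
print(f"24-bit/192kHz:             {hires:.1f} kbit/s")
print(f"CD vs 192 kbit/s MP3:      {cd / 192:.2f}x")
```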

kahawe · on March 6, 2012

This effect isn't sooo surprising seeing as it even occurs with dumb mono guitar cab speakers and is very, very, VERY clearly audible there, even just moving your head a few cm in or out of the cones' axis.

xiphmont · on March 5, 2012

Amen. Someone got the point of the article.

goblin89 · on March 6, 2012

> Stop being so paranoid about what you think you can't hear, and enjoy the damn music.

Yes. Do a couple of blind tests with your acoustic system first.

> It's true enough that a properly encoded Ogg file (or MP3, or AAC file) will be indistinguishable from the original at a moderate bitrate.

Disagree. This claim seems to be ungrounded compared with others.

I can believe limitations with bit depth and sampling rate (although I'll take a chance to test myself if I get near a good enough acoustic system). However, I definitely could discern in a blind test whether music I listened to was stored using a lossy format with reasonable bitrate. It's usually quite audible with rock music that involves cymbals.

pja · on March 6, 2012

There's a specific "bug" in the mp3 encoding scheme which means that you get a pre-echo effect on fast attack waveforms. It's inherent in the encoding, so it can't be eliminated (although the higher the bitrate, the less obvious it is IIRC). If you know how to listen out for it then you'll spot it immediately.

AAC / Ogg don't have that limitation & at high enough bitrate should be indistinguishable from the source in a blind listening test, as demonstrated in a number of Hydrogen Audio listening tests down the years, unless of course you're using crappy encoders at which point all bets are off...

(Really, LAME is very good indeed these days. I eventually decided that I was going to get with the program and just encode all my CDs (backed up to flac files) as mp3 for portable listening. It's good enough, and I've decided not to listen for the pre-echo artifacts so that I won't notice them :) )

goblin89 · on March 6, 2012

Well, actually it was that long cymbal sounds “fade” quicker and just sound different with lossy music.

IIRC I distinguished an mp3 encoded by iTunes with bit rate 192 or 256 kbps from its original in Apple Lossless (both played on the same cheap acoustic system). I probably should test with AAC or Ogg, too. Although I have a feeling that it's pretty much impossible to keep those cymbals, rich in high frequencies, intact while keeping a compact file size.

> I've decided not to listen for the pre-echo artifacts so that I won't notice them :)

You're much better at controlling your mind. =) After I once verified that the difference is audible even on cheap speakers, I can't switch back to lossy formats. It means constant wondering if that's how it's supposed to sound or not…

That's, by the way, why Apple's idea of having ‘Mastered for iTunes’ label[0] IMO is worthwhile—at least you can be sure that mastering engineer listened to it this way. =)

[0] http://arstechnica.com/apple/news/2012/02/mastered-for-itune...

pja · on March 6, 2012

Might be interesting to try AAC or Ogg Vorbis & see if they're any better. In these days of ever increasing cheap portable storage carrying a bunch of flacs around isn't quite as nuts as it used to be of course.

(Cymbals seem to be a particular bugbear for mp3 encoding; cymbal-heavy tracks tend to suffer the most from obvious encoding artifacts once you know what to listen for.)

TylerE · on March 6, 2012

It's certainly true. Ogg is far less audible than mp3 - it's actually very decent at bit rates as low as 64kbps.

jefftk · on March 6, 2012

"I distinguished"

Was this in a blind test?

goblin89 · on March 6, 2012

Please refer to my comment above[0]. Yes, it was a blind test. The person helping me might've looked up bit rates, so it was not a double-blind experiment, but I could not (nor did I want to) see what's being played, and relied only on hearing.

[0] http://news.ycombinator.com/item?id=3669893

jdietrich · on March 6, 2012

A thousand times this. I never cease to be amazed at the number of people who will vocally argue the benefits of solid-silver wundercable, but who've never heard of mirror points or bass traps. $20,000 hifi systems in rooms with bare wooden floors and bare concrete walls. Subwoofers in untreated cubic rooms. People praising the transient response of their PMC MB2s in a room with chronic flutter echo. It's utterly dispiriting.

agentgt · on March 5, 2012

I agree the damn room matters so much more. Also the distortion of most speakers is already the bottle neck in most people's systems.

libraryatnight · on March 6, 2012

Alan Parsons makes this point, too: http://boingboing.net/2012/02/10/alan-parsons-on-audiophiles...

exiled · on March 5, 2012

Exactly! I could not have put it better myself!

bigiain · on March 5, 2012

" … when the most under-performing component of most people's sound equipment also has the lowest-hanging fruit:"

And I was _so_ sure the next sentence was going to be something like:

"No, do not have any suggestions that will make your sound equipment make Justin Bieber sound better…"

cmer · on March 5, 2012

There's a lot of scientific-sounding content in this, but unfortunately most of it couldn't be further from the truth. I'm an ex-audio engineer and studied digital and analog audio engineering; this has been debated to death over the last 15 years.

Digitally recording a triangle is the best example of why 48kHz is very limiting. The distinct sound of the triangle consists of a high fundamental frequency, ballpark 5kHz, and of many very high-pitched harmonics. Most of these harmonics are above 20kHz. The harmonics are what makes it sound like a triangle, not the frequencies below 20kHz. This is why the triangle is one of the hardest instruments to digitally record. It always sounds like crap.

In theory, it's true that the human ear can't hear above ~18kHz, but it can hear the influence of the very high pitch harmonics on a lower frequency.

EDIT: here's more data backing what I said http://www.cco.caltech.edu/~boyk/spectra/spectra.htm

EDIT 2: typos, frequency mistake

Joeboy · on March 5, 2012

> Digitally recording a triangle is the best example of why 48kHz is very limiting

The article's about distribution, not recording. I don't think anybody disputes the usefulness of higher sampling rates when recording.

> In theory, it's true that the human ear can't hear above ~18kHz, but it can hear the influence of the very high pitch harmonics on a lower frequency.

...and 48kHz audio contains those lower frequencies.

cmer · on March 5, 2012

Stripping frequencies above 20kHz negates the effect on the lower frequencies since those lower frequencies are not "modified" by the higher ones. The human ear can actually hear the very high harmonics when they're combined with a lower fundamental frequency.

For example, the human ear will hear a 30kHz frequency if its fundamental is 10kHz. If it's played at 44.1kHz, the 30kHz frequency is gone and all you'll hear is 10kHz, not a "different sounding" 10kHz.

Anechoic · on March 5, 2012

For example, the human ear will hear a 30kHz frequency if its fundamental is 10kHz

You are going to have to provide me with a citation to back that up because that goes against everything I've learned and experienced in 17 years of working in acoustics.

eatmyshorts · on March 5, 2012

Here is a Wikipedia article on the subject: http://en.wikipedia.org/wiki/Sound_from_ultrasound

Basically, if you produce two ultrasonic frequencies, they will create an interference pattern at a much lower frequency than either of the individual frequencies. Modulate a signal on the difference between two signals, and you can create a directional speaker, since ultrasonic sounds tend to be highly directional (so long as the diameter of the transducer is greater than 1/2 wavelength, which is almost guaranteed with ultrasonic signals). This is how the "sound cannons" that are being deployed for crowd control work.

Anechoic · on March 5, 2012

That article describes heterodyning, which happens because ultrasonic frequencies at high amplitudes interact nonlinearly with air. You are not going to see that effect with sound waves generated near the audible spectrum, and normal loudspeakers aren't going to generate ultrasonic sound waves.

eatmyshorts · on March 5, 2012

Yes, but the effects of interference patterns between multiple ultrasonic frequencies is the same, and definitely does affect the audible spectrum. This is why we must filter the square wave that comes out of a DAC. And the limitations of filters (phase shifts and roll-off) are why modern CD players oversample the signal--so that the filtering can be performed well beyond the audible spectrum.

Anechoic · on March 6, 2012

This:

Yes, but the effects of interference patterns between multiple ultrasonic frequencies is the same, and definitely does affect the audible spectrum

has nothing to do with this:

This is why we must filter the square wave that comes out of a DAC

The only reason that square waves "must" be filtered is to reduce the potential of damaging tweeters. If you want to record a square wave with the purpose of later reproducing the square wave, then you don't want to filter it - once you filter it, it's no longer a square wave.

eatmyshorts · on March 6, 2012

OK, if you say so. I think you're misunderstanding a fundamental concept of digital to analog converters. But if you think it's just to prevent blowing your speakers, that's OK.

The reason that square wave sucks is because it introduces tons of high frequency content (your amp probably won't reproduce the high frequency content anyway, so I don't think most Japanese consumer amps will damage your speakers--that is, the amp will act like a filter anyway). That high frequency content then creates alias effects (think of moire patterns when looking at super high-res photos that are scaled down without anti-aliasing). Those alias effects sound like shit to the human ear.

The point of filtering is to anti-alias the resulting analog signal after conversion from digital to analog. The point of upsampling is to move that filter well beyond the audible range, so you can use a 1st-order filter (gentle slope, but it introduces no phase effects). The fact that a square wave hurts your speakers is inconsequential--the amp will effectively filter the signal anyway. Unfortunately, it will filter the signal without anti-aliasing, which introduces those nasty interference patterns within the audible spectrum (that is, if you feed a straight 44.1KHz sampled square wave to your speakers without upsampling/filtering).

juiceandjuice · on March 5, 2012

Recording music is supposed to be a snapshot (with room for interpretation) of the composition at play.

Trying to record an edge case like this is the same as recording in a room with bad acoustics. So you end up with some weird (but not faithful) representation of the sound which is a snapshot of the microphone's characteristics and directionality of the ultrasonic tones. It's not reasonable to assume any microphone will behave exactly like a human ear. Even if you could, you're going to have to mimic the tiny random movements a normal person would make listening to a sound, movements which would definitely impact the perception of the sound, because microphones are much more stationary than any human would be.

The "different sounding" argument two posts above is silly, because sound is almost never that monochromatic, and if it is, it's usually boring. Also I don't understand how missing out on an odd order harmonic would be a bad thing :) The reality is none of these arguments are based in a reality of what people would hear, and because of that, the arguments aren't practical.

In reality, 20 bits at 48kHz (or 64kHz) would be more than acceptable for even the most discerning of ears and probably the most practical in terms of space and fidelity, but it'd be a weird format to distribute in.
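
The bit-depth figures can be put in rough numbers with the textbook signal-to-noise ratio of an ideal quantizer driven by a full-scale sine, roughly 6.02 dB per bit plus 1.76 dB. A minimal sketch; it assumes ideal converters and ignores dither and noise shaping, so real equipment will measure lower:

```python
import math


def ideal_snr_db(bits: int) -> float:
    """Theoretical SNR of an ideal N-bit quantizer for a full-scale sine wave."""
    # 20*log10(2^N) is the ~6.02 dB per bit; 10*log10(1.5) is the ~1.76 dB term.
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (16, 20, 24):
    print(f"{bits} bits -> {ideal_snr_db(bits):.1f} dB")
```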

pjscott · on March 5, 2012

That's very cool, but it requires pretty high-intensity ultrasound to be noticeable. I doubt that will be the case with ordinary music.

zohebv · on March 6, 2012

> Basically, if you produce two ultrasonic frequencies, they will create an interference pattern at a much lower frequency than either of the individual frequencies.

So the interference pattern will be made up of one low frequency sound and higher frequency harmonics. Once again the higher frequency harmonics are redundant, because you only need to record the lower frequency sound.

The only possible way ultrasound can be picked up by the ear is if the ear has a non-linear response to the input sound. Going by the information in the article linked, it is highly unlikely that any significant non-linearity exists in the ear.

runeks · on March 5, 2012

It's definitely possible for two sounds to be indistinguishable when played separately, but when played together it is revealed that they are in fact different (see link below). Whether this applies for sounds with frequencies above 20kHz I don't know. I'd like to see a citation as well. Doesn't seem like it would be the hardest experiment to set up either.

http://en.wikipedia.org/wiki/Cent_(music)#Sound_files
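
The effect in those sound files is ordinary beating between two audible tones a few cents apart. A minimal sketch of the arithmetic, assuming a 440 Hz reference and a 5 cent offset (both chosen for illustration, not taken from the linked page):

```python
def shift_by_cents(freq_hz: float, cents: float) -> float:
    """Frequency after raising freq_hz by the given number of cents."""
    # 1200 cents make one octave, i.e. a factor of two in frequency.
    return freq_hz * 2 ** (cents / 1200)


base = 440.0
detuned = shift_by_cents(base, 5)
print(f"detuned tone:   {detuned:.3f} Hz")
print(f"beat frequency: {detuned - base:.3f} Hz")
```

Heard one after the other, the two pitches are hard to tell apart; played together, the sum swells and fades a little more than once per second.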

SpiderX · on March 6, 2012

Me and my brother would sing at each other in certain tones such that we created harmonics in both our ears. It wasn't pleasant, but it was interesting. Regardless, I'd smash my equipment if it made harmonics like that.

cmer · on March 5, 2012

Humans will hear the impact > 20kHz frequency has on the lower frequencies, not the 30kHz frequency itself. That's been proven a million times.

bigiain · on March 5, 2012

If that is true, surely in your upthread example of recording a triangle, the "impact on lower than 20kHz frequencies" would already have happened during the recording process in between the triangle and the microphone, and would have been captured perfectly on recording equipment that's proven capable of capturing everything below 20kHz? So we'd "hear" the effect as part of the recording instead of requiring it to happen in our listening room…

smallblacksun · on March 5, 2012

>That's been proven a million times.

Then you should be able to provide at least one citation.

Anechoic · on March 5, 2012

If you're not going to hear the frequency, then there's no reason to record it, so I don't see what your objection is.

eatmyshorts · on March 5, 2012

Yes, but if you sample the frequency to create a step wave, then neglect to filter the results, you will end up reproducing tons of high frequencies. That is why we need to filter the output for signals >20KHz...to remove these harmonics that result from reproducing the square wave.

Of course, filters aren't perfect, and result in phase shift and roll-off. So we over-sample the signal to create a signal with a much higher frequency than 20KHz, so that the filtering occurs well outside the audible band, allowing us to filter out all of these harmonics without affecting the desired signal.

Basically, the end result is that by sampling the signal, you are introducing high frequency content that must be removed prior to playback. This high frequency content is one of the reasons old CD players from the 80s and 90s cause "listener fatigue", although I have no sources to back up that last statement.

dedward · on March 6, 2012

Yup... people need to get very clear in their heads the difference between the recording/sampling/mixing/mastering stages, where high bitrate/width/gear/knowledge is helpful, and playback, which is a completely different thing.

(not for eatmyshorts -you get this I gather) - everyone gets that "upsampling" can't add detail to a recording right? You can't get more than you've got.... no matter what you do. There is no magic. You upsample so you drive harmonics generated in the digital-to-analog process during playback further up in the spectrum so when you get to the analog stage you can use a nice gentle analog filter to filter them out. Without the upsampling, you need a nasty steep analog filter to filter them out, and that can have audible side-effects (or at least measurable) in the audible spectrum. eatmyshorts - correct me if I mis-stated any of that please....
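
The "gentle filter" argument can be made concrete with a back-of-envelope calculation. A minimal sketch, assuming a 20 kHz passband edge, that the first unwanted image of the passband edge sits at the sample rate minus 20 kHz, and a hypothetical target of 60 dB attenuation by that point (the 60 dB figure is an assumption for illustration only):

```python
import math

PASSBAND_EDGE_HZ = 20_000    # top of the audible band, as used in the thread
TARGET_ATTENUATION_DB = 60   # hypothetical target, for illustration only


def first_image_hz(sample_rate_hz):
    """Lowest frequency at which an image of the passband edge appears."""
    return sample_rate_hz - PASSBAND_EDGE_HZ


def required_slope_db_per_octave(sample_rate_hz):
    """Average filter slope needed between the passband edge and the first image."""
    octaves = math.log2(first_image_hz(sample_rate_hz) / PASSBAND_EDGE_HZ)
    return TARGET_ATTENUATION_DB / octaves


for factor in (1, 4, 8):
    rate = 44_100 * factor
    print(f"{factor}x ({rate} Hz): first image at {first_image_hz(rate)} Hz, "
          f"needs ~{required_slope_db_per_octave(rate):.0f} dB/octave")
```

Without upsampling, the filter has only about a quarter of an octave to do all its work; with upsampling it has several octaves, which is the difference between a very steep analog filter and a gentle one.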

eatmyshorts · on March 6, 2012

You got it 100% correct. You upsample simply to move the frequency of the analog filter higher, with a gentle rolloff (and ideally a 1st order filter, so you introduce no phase effects) to get your final signal.

shasta · on March 5, 2012

In other words, your theory is that the superposition principle doesn't hold for sound waves.

kragen · on March 6, 2012

Well, the superposition principle only holds in linear media. Sound waves can propagate in linear media, but they can also propagate in nonlinear media, and any medium that can carry sound will go nonlinear at sufficiently high amplitudes.

surrealize · on March 5, 2012

Note the lack of citation

pragmar · on March 5, 2012

http://en.wikipedia.org/wiki/Overtones

I don't know about the physics of the speaker itself generating the overtone (in cabinet), but it could certainly resonate a wine glass in the room, for example.

surrealize · on March 6, 2012

Yes, overtones exist, and yes, overtones affect the sound, and yes, if you filtered the sound to remove overtones in the audible range then it would sound different. However, if you remove overtones outside the audible range then it will not make an audible difference (this is what xiphmont was saying in TFA).

So no, your wikipedia link is not a citation for the claim that cmer made.

CamperBob · on March 6, 2012

"A/B or GTFO," I believe is the parlance of our times.

DanBC · on March 5, 2012

Are they talking about "beat frequency" type effects?

qq66 · on March 6, 2012

Yes. A 40KHz tone and a 41KHz tone will interfere with each other and can create 1KHz tones that are audible. Edited to correct error, thanks anechoic.

Anechoic · on March 6, 2012

No, Holosonics is not creating sound from beating, they are using heterodyning, which takes advantage of how high-amplitude ultrasonic sound waves interact with the atmosphere, that's different from beating.

nullc · on March 6, 2012

They don't— the air is linear (except at insane sound pressures) so there is no interference. While the ear is not linear, it doesn't respond at those frequencies.

If it really worked that way it would be trivial to demonstrate. Alas, it doesn't.
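
A small numerical sketch of the disagreement above: two ultrasonic tones summed in a linear system contain no 1 kHz component, while a nonlinear system produces one. The 0.1 second-order coefficient is a made-up distortion chosen for illustration; it is not a model of air or of the ear.

```python
import math

def tone_level(signal, freq_hz, sample_rate_hz):
    """Amplitude of the freq_hz component, by correlating with a cosine/sine pair."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

FS = 192_000
N = 19_200  # 0.1 s, a whole number of cycles for 1, 40 and 41 kHz

two_tones = [math.sin(2 * math.pi * 40_000 * i / FS) +
             math.sin(2 * math.pi * 41_000 * i / FS) for i in range(N)]

# Linear path: the tones simply add (superposition holds).
linear_1k = tone_level(two_tones, 1_000, FS)

# Hypothetical nonlinear path: a mild second-order term mixes the tones.
distorted = [x + 0.1 * x * x for x in two_tones]
nonlinear_1k = tone_level(distorted, 1_000, FS)

print(f"1 kHz level, linear:    {linear_1k:.6f}")
print(f"1 kHz level, nonlinear: {nonlinear_1k:.6f}")
```

Whether any real link in the chain is nonlinear enough at ordinary levels for this to matter is the actual question, and the sketch says nothing about that.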

xiphmont · on March 5, 2012

> For example, the human ear will hear a 30kHz frequency if its fundamental is 10kHz.

No. It won't.

texel · on March 6, 2012

I think you're referring to this: http://en.wikipedia.org/wiki/Missing_fundamental

Which, since it's a psychoacoustic phenomenon, wouldn't hold true when the partials involved are above the audible spectrum.

dedward · on March 6, 2012

I know where you're going with this I think, and I'm not disagreeing outright, but wouldn't this be captured during the high-bitrate (or good analog?) recording and mixing phase if the recording/mixing/mastering engineer were doing things right? At least, as well as possible?

dj_axl · on March 5, 2012

> The article's about distribution, not recording. I don't think anybody disputes the usefulness of higher sampling rates when recording.

Didn't read the article, so commenting out of context, however it needs to be said that in sample-based music genres the distributed music gets used as if it were a recording. Maybe then it could be argued that higher sampling/bit rates should be available, if only for those who are sampling.

untangle · on March 5, 2012

> In theory, it's true that the human ear can't hear above ~18kHz, but it can hear the influence of the very high pitch harmonics on a lower frequency.

That may well be true. But those mixed-down harmonics that are heard "live" would then be captured by the 16/44 (or whatever) sampling. IOW, the recording captures what you heard. Those upper harmonics have no emergent properties. Their effect is captured.

Bob

amouat · on March 5, 2012

I'm no sound engineer, but as far as I can tell, the main point of that paper is that some instruments produce harmonics at frequencies greater than 20kHz, not that these frequencies matter to humans. However, section X references other papers that apparently make this claim.

Just because it is difficult to record a triangle does not necessarily mean it is impossible to accurately recreate the sound (to human ears) using 48kHz.

glassx · on March 5, 2012

> I'm no sound engineer, but as far as I can tell, the main point of that paper is that some instruments produce harmonics at frequencies greater than 20kHz, not that these frequencies matter to humans. However, section X references other papers that apparently make this claim.

Yes, you're right.

In fact, some of the section X references don't even mention hearing; they talk about "alpha-EEG rhythms" (in this case "listeners explicitly denied that the reproduced sound was affected by the ultra-tweeter") and "bone-conducted ultrasonic hearing" through the "saccule" ("organ that responds to acceleration and gravity and may be responsible for transduction of sound after destruction of the cochlea").

--

In fact, most of the claims of the article are around the fact that there is energy over 20khz and how it can affect recording process.

This is a well known fact, and this is exactly why engineers filter out sub-sonic and super-sonic frequencies, especially today: stuff that you can't hear (or feel) will just suck your headroom and make you lose the loudness war.

cmer · on March 5, 2012

The only "good sounding" triangles you'll hear are those buried in a mix. Alone, it always sounds weird and "muted".

EDIT: Listen to the triangle at the beginning of Rush's YYZ. It's an old recording, but it sounds significantly worse than the analog version. It's been digitally mastered some time ago so if it was mastered today, it would probably sound better, but still not great. I heard a rumor that Rush is remastering all their albums "for iTunes" at the moment, so hopefully we'll be able to compare soon!

TylerE · on March 5, 2012

Not a very good example, because that's a Crotale (A Flat, ~4" cymbal, basically), not a triangle.

cmer · on March 5, 2012

Wow! I didn't know that! All these years I was convinced it was a triangle just like pretty much everybody I guess. Thanks :)

FrankBooth · on March 5, 2012

It's not like it has been a secret: http://en.wikipedia.org/wiki/YYZ_%28instrumental%29

eatmyshorts · on March 5, 2012

Yes, but our ears only hear 20Hz-20KHz. So, according to Nyquist theory, you can recreate the entire signal that the human ear hears by recording those sonic artifacts that result from interference between supersonic harmonics.

So while it's true that the human ear can't hear well above ~18KHz, and the interference between high order harmonics is audible, it's also true that a properly recorded signal, sampled at 44.1KHz, oversampled, and filtered, can reproduce the exact signal the human ear is capable of hearing. At least according to theory.

The human ear is capable of detecting sound pressure as well as sound intensity, and while playback of the interference between harmonics can be reproduced faithfully in the sound intensity realm, the sound pressure levels will differ, and it is theorized that people may be able to tell the difference between the two. However, as far as I am aware, nobody has been able to demonstrate this reliably in practice.

tzs · on March 6, 2012

What about sound outside 20-20k that affects us via mechanisms other than being directly sensed in the air by our ears? For instance, consider frequencies below 20 Hz that we can feel with our feet as vibrations in the floor, instead of hear with our ears? Or what about the possibility of sound above 20k causing a vibration in something other than our ears, which could have a subharmonic in 20-20k that gets conducted to our ears via bone?

I'd prefer recording technology to err on the side of capturing what we need to reproduce all of that, even if we aren't sure that we need it.

panacea · on March 6, 2012

>I'd prefer recording technology to err on the side of capturing what we need to reproduce all of that, even if we aren't sure that we need it.

Again, this article is about distribution, not recording.

jwatte · on March 6, 2012

Nyquist is true for static signals. Music is not static. Brick wall at 20 kHz and get audible phasing artifacts! (Even if your filter is phase linear)

retrogradeorbit · on March 6, 2012

I'm an audio engineer, too, and I agree that this has been debated to death. And I agree that frequencies above the threshold of hearing are more important than standard dogma (based on Nyquist theory combined with Pure tone audiometry) allows. It helps explain how audio gear with a 100kHz bandwidth sounds clearer than gear with a 20kHz bandwidth even when they measure the same in the audible band.

Have you read the Audio Technology magazine interview with Rupert Neve?

Greg Simmons: Geoff Emerick, the famous British Producer?

Rupert Neve: Yes, he started me off on this trail. A 48 input console had been delivered to George Martin's Air Studios, and Geoff Emerick was very unhappy about it. It was a new console, made not long after I had sold the Neve company in 1977. George Martin called me and said, "please come and make Geoff happy, while he's unhappy we can't do any work".

They'd had engineers from the company there, and so on. The danger is that if you are not sensitive to people like Geoff Emerick, and you don't respect them for what they have done, then you are not going to listen to them. Unfortunately, there was a breed of young engineers in the company (I hasten to say this was after I sold it!) who couldn't understand what he was bitching about. So they went back to the company and just made a report saying the customer was mad and there wasn't really a problem. Leave it alone, forget it, the problem will go away. They were acting like used car salesmen. I was very angry with it. So I went and spent time there, at George Martin's request, and Geoff finally managed to show me what it was that he could hear, and then I began to hear it, too.

Now Geoff was The Golden Ears - and he still is - and he was perceiving something that I wasn't looking for. And it wasn't until I had spent some time with him, as it were, being led by him through the sounds, that I began to pick up what he was listening to. And once I'd heard it, oh yes, then I knew what he was talking about. We measured it and found that in three out of the full 48 channels, the output transformers had not been correctly terminated and were producing a 3dB rise at 54kHz. And so people said, "oh no, he can't possibly hear that". But when we corrected that problem, and it was only one capacitor that had to be added to each of those three channels, I mean, Geoff's face just lit up! Here you have the happiness/unhappiness mood thing the Japanese were talking about.

copy here: http://poonshead.com/Reading/Articles.aspx

nwhitehead · on March 5, 2012

The article doesn't suggest only using 48kHz for recording and mixing. I don't think the author would disagree that recording triangles is difficult. He would argue that once you've decided what final audible frequencies you want to present to the listener, the best way to distribute them is at 16-bit 44.1/48kHz. It's a compelling case.

newman314 · on March 5, 2012

What if you want to sample the song later?

That's one thing I find concerning with the move to digital. With analog media, you can go back, re-record and get an improved result (provided the source is good) but District 9 (which was shot on Red One) will never have improved quality other than resampling because the source is set to a particular digital format with associated data quality.

shabble · on March 5, 2012

There seems to be some strange idea that analogue means 'infinite detail'. In this particular case, there's no significant difference between being limited by the original digital recording resolution and the grain size of a film recording.

"[...] provided the source is good" is begging the question; it's no different from saying "District 9 could be better if they hadn't recorded in 4k (or whatever the Red One was using) and downsampled it for my DVD" The nature of the source is irrelevant, barring the fact that film might provide a higher resolution, if film scanning technology increases, and you can afford to both capture on film, process and store your film properly (archiving film is rather difficult, I believe), and get the best quality digitisation possible.

newman314 · on March 6, 2012

Obviously, I am not claiming infinite detail. There is going to be a limit based on the grain and the size of the film (35mm, Super, IMAX). 65mm film shot is going to be of higher quality than what digital is capable of today.

While I have no doubt that digital will eventually catch up and surpass film, there inevitably is going to be a transition period where quality films were recorded (let's just say at 2k) where the input is constrained and extrapolation will be the only available option.

4k is the current state of the art. It will not be so forever and because it's recorded at 4k, we can't go back and extract more dynamic range due to the limitation of the sensor. Whereas you can go back, redigitize an IMAX film (say Chronos shot in 1985) that is in good condition and get way more info than something shot on 4k yesterday.

TL;DR IMO input still absolutely matters. 35mm is not the upper limit. We went through this with photography and are now doing the same with video/film.

EDIT: After thinking more about it, here's a more extreme example. I purchased a Kodak DC20 back in the 90s (early adopter yay!), even if the camera had decent glass, there's no way I can go back to an image captured by that camera and magically get the equivalent of 22mp 5D camera by resampling. If I had used a film camera, I can get a much improved scan.

EDIT2: Here's a good example. Slumdog Millionaire was mostly shot on a SI-2K which recorded at 2k. You can't go back and get 4k output on the digital portions. So generations later, we will be stuck enjoying an Academy Award winning film at that level of quality.

http://www.siliconimaging.com/DigitalCinema/News/PR_01_31_09...

kalleboo · on March 6, 2012

And we'll never be able to go back and "re-film" "The Texas Chainsaw Massacre" on 35mm, it'll forever be marred by grain and poor low-light performance of 16mm. I guess I'm not sure what your point is. The best digital can present is currently worse than the best film can present, yes. That doesn't mean we shouldn't use it.

newman314 · on March 6, 2012

My original response was to the effect that the output should be high quality so that data is preserved if sampled.

Digital is the future. Hence it behooves us to have the maximal input & output possible at this time. Unfortunately, this is not common now and the price paid is that content created during this period will be stuck at the same quality level.

shabble · on March 6, 2012

I'm entirely in favour of increasing the resolution/bit-depth for video, but I think the general problem is more complicated by external factors.

The cost of renting a red one and recording straight digital vs hiring a film camera, process lab, and all the other parts needed quite possibly means that some films might never have been produced due to filming costs.

What measure of quality can compare X against X, if it was never made?

I imagine (I have very little actual experience here, so it's perfectly possible I'm wrong) that digital recording might make it easier/cheaper to retake shots/scenes repeatedly to get them right as well, offering another 2nd order quality effect.

cmer · on March 5, 2012

I completely disagree with the article having heard the difference many times myself. You can't record at 192kHz and hope to keep the same quality by distributing the final mix in 44.1kHz. It just doesn't work that way.

justincormack · on March 5, 2012

Well there is also the aliasing in that resampling. Recording at 192 for shipping at 48 makes more sense than shipping at 44.1 surely? Some audio seems to do 88.2 but rarely 176.4.
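
To put numbers on the resampling point above, here is a minimal sketch that reduces each conversion to its simplest integer ratio. The helper name `resample_ratio` is made up for illustration, and a simple ratio only indicates a simpler converter design; it says nothing about whether a well-built converter for the awkward ratio is audibly worse.

```python
from fractions import Fraction

def resample_ratio(source_hz, target_hz):
    """Conversion ratio target/source in lowest terms (upsample by the
    numerator, then decimate by the denominator)."""
    return Fraction(target_hz, source_hz)

for source, target in ((192_000, 48_000), (192_000, 44_100),
                       (176_400, 44_100), (88_200, 44_100)):
    r = resample_ratio(source, target)
    print(f"{source} -> {target}: up {r.numerator}, down {r.denominator}")
```

192 kHz to 48 kHz is a plain divide-by-four, while 192 kHz to 44.1 kHz needs a 147/640 ratio; the 88.2 and 176.4 kHz rates exist because they are exact multiples of 44.1 kHz.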

astrange · on March 5, 2012

Would you like to post double-blind test results?

cmer · on March 5, 2012

We actually took those challenges in school :) Lots of fun if you're an audio nerd!

roel_v · on March 6, 2012

OK well then, what were the results?

tompagenet2 · on March 5, 2012

Hey cmer, thanks for posting

I don't think I understand quite what you're saying and wondered if you could explain more. You and the article both say that humans can't hear above about 20kHz. If there are higher frequencies that create a harmonic at a lower frequency (e.g. a 33kHz harmonic that produces a sound at 16.5kHz) then surely that lower harmonic (16.5kHz in this case) will be recorded by the original recording equipment assuming it is recording at a frequency at least twice that of the highest audible frequency (let's say that this would be 48kHz, although there might be other DAC-related reasons to go higher).

I'm possibly being very daft here!

cmer · on March 5, 2012

Let's make things super simple. Let's say you record 4 sine waves at a 192kHz sampling rate: 15kHz, 30kHz, 45kHz and 60kHz. All 4 frequencies will be captured and the 15kHz frequency will sound different to your ear because of its harmonics.

If you take this recording and master it for a CD (44.1kHz), you'll effectively get up to ~20kHz (since there's a low pass filter starting at around 16-18kHz). This means that only our first frequency will be captured: 15kHz. It will be exactly the same as if you only recorded 15kHz alone. The harmonics don't modify the fundamental frequency, they just trick the human ear. But when they're gone, they have no effect whatsoever.

Hope this helps!

EDIT: the frequency numbers I used are actually somewhat of a bad example. Harmonics are never exactly double, triple the fundamental. Those would be mostly inaudible. But you get the idea.
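
A quick sketch of what happens to the four example tones at the CD rate. It shows which tones sit below the Nyquist limit, and where the others would land if the low-pass filter were skipped; the function name and the "no filter" scenario are illustrative assumptions, not part of any real mastering chain.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling with no anti-aliasing filter."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

FS_CD = 44_100
TONES = (15_000, 30_000, 45_000, 60_000)

# Tones below half the sample rate pass through unchanged.
surviving = [t for t in TONES if t < FS_CD / 2]

# Without the low-pass filter, the rest would fold down into the audio band.
aliases = {t: alias_frequency(t, FS_CD) for t in TONES}

print("below Nyquist:", surviving)
for tone, alias in aliases.items():
    print(f"{tone} Hz would appear at {alias} Hz if left unfiltered")
```

This is why the low-pass filter is applied before the rate conversion: it removes the three ultrasonic tones rather than letting them fold into audible frequencies.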

tompagenet2 · on March 5, 2012

I don't think I understand how it could sound different to my ear. My understanding is that my ear doesn't have the sensory equipment to detect signals above ~20kHz - this is what I was told at university, and a decent trawl of the web suggests this is still true. If there is any sound that is in the range 20Hz-20kHz then why doesn't the microphone pick it up?

Or am I wrong, and the ear is able to detect frequencies above 20kHz?

glassx · on March 5, 2012

> Harmonics are never exactly double, triple the fundamental. Those would be mostly inaudible. But you get the idea.

Actually, they are: https://en.wikipedia.org/wiki/Harmonics

cynicalkane · on March 5, 2012

The second half of the statement is wrong, but the first half is right. Harmonics in real-world instruments are not usually exact multiples of the fundamental. A simple diffeq model of a rigid oscillator will show you this mathematically.

An extreme example is present on modern pianos, where the high rigidity of the loud, heavy piano strings can cause tuners to stretch the lowest and highest notes as much as a half-semitone so that their harmonics are in tune with the note the next octave down or up. In other words, the first harmonic on the lowest note of a piano can be as much as 1/2 of a note sharp.

And when your oscillator is no longer one-dimensional, most harmonics aren't even close to integer multiples. The harmonics of bells, cymbals and drums are all over the place. That's what gives them their percussive sound. (Edit: some of these modes of vibration aren't harmonics in the linear sense.)

Anechoic · on March 5, 2012

Harmonics in real-world instruments are not usually exact multiples of the fundamental. A simple diffeq model of a rigid oscillator will show you this mathematically.

That is absolutely incorrect, mathematically speaking, harmonics are by definition "integral multiples of the fundamental." (Fundamentals of Acoustics, Kinsler & Frey).

shabble · on March 5, 2012

People from a musical, non-signals background tend to use 'harmonics' as a synonym for 'overtones' or 'partial tones', which is where the confusion arises, I suspect.

There's a measure -- inharmonicity[1] -- of how far the actual overtones of a particular instrument differ from their theoretical fundamental multiples.

[I suspect you already know this. This reply is probably for others' benefit]

[1] https://en.wikipedia.org/wiki/Inharmonicity
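
For a sense of scale, here is a minimal sketch using the commonly quoted stiff-string approximation, in which partial n sits at n * f0 * sqrt(1 + B * n^2). The inharmonicity coefficient B = 0.0004 and the 110 Hz fundamental are made-up values for illustration, not measurements of any instrument.

```python
import math

def partial_hz(n, fundamental_hz, b):
    """Frequency of partial n on a stiff string with inharmonicity coefficient b."""
    return n * fundamental_hz * math.sqrt(1 + b * n * n)

def cents_sharp(n, b):
    """How far partial n sits above the exact harmonic n * f0, in cents."""
    return 1200 * math.log2(math.sqrt(1 + b * n * n))

B = 0.0004   # hypothetical coefficient
F0 = 110.0   # hypothetical fundamental, Hz

for n in (1, 2, 4, 8, 16):
    print(f"partial {n:2d}: {partial_hz(n, F0, B):8.2f} Hz "
          f"({cents_sharp(n, B):5.2f} cents above {n * F0:.0f} Hz)")
```

With B = 0 the partials are exact integer multiples, which is the textbook definition of harmonics; any positive B pushes the higher partials progressively sharp.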

cynicalkane · on March 5, 2012

Then one would be forced to conclude that many instruments have no harmonics at all, which is obviously not what 'harmonic' is referring to in this thread of discussion. Why be pedantic when it's obvious what everyone is talking about?

Anyway, it's not as though mathematical literature requires you to use a term exactly one way. I had a diff eqs textbook that used the word 'harmonic' in exactly the way I used it above when I made reference to diff eqs...

leephillips · on March 6, 2012

> I had a diff eqs textbook that used the word 'harmonic' in exactly the [incorrect] way I used it above

So you had a textbook with a mistake in it. What book was it?

glassx · on March 5, 2012

> And when your oscillator is no longer one-dimensional, most harmonics aren't even close to integer multiples. The harmonics of bells, cymbals and drums are all over the place. That's what gives them their percussive sound.

But those aren't harmonics, they're inharmonic partials.

kahawe · on March 6, 2012

> The harmonics don't modify the fundamental frequency, they just trick the human ear. But when they're gone, they have no effect whatsoever.

This is the part I really do not understand... either my ear CAN pick up those frequencies, maybe the harmonics are "tickling" the little hairs inside my cochlea and ultimately the frequencies I can actually hear were altered in my perception that way - or I can not hear or sense the harmonics and they physically alter the "original" wave that I end up actually hearing.

Either way, pretty much the exact same thing should happen in a studio microphone. Those all do have frequency limitations and AKG, Royer, Rode, Shure, Sennheiser, Audio Tech, what-have-you pretty much all go up to 15kHz or 20kHz according to specs, if I understand them correctly, but not further than that. If it isn't even recorded, those frequencies I also cannot hear can NOT alter my perception so they HAVE to somehow change the frequencies I can hear and are being recorded... on top of that you are making "room" for frequencies up to, say, 60kHz but I very strongly doubt your mics can go even remotely that high.

leephillips · on March 5, 2012

The linked article was accurate. You are confused.

"I'm an ex-audio engineer"

Hard to believe.

"The distinct sound of the triangle constitutes of a high fundamental frequency, ballpark 10kHz"

That's a pretty high note - higher than the top key on the piano. But an "audio engineer" would know that.

"many very high-pitch harmonics"

Since the next harmonic after the fundamental would be at 20khz, which only young people can hear, and none of the others are audible to any human, I don't understand what you are talking about.

"Most of these harmonics are 20kHz."

OK, you don't either.

"it can hear the influence of the very high pitch harmonics on a lower frequency."

Sure....

leif · on March 5, 2012

You clearly have little to no musical background, and think that your basic math skills are a substitute. The overtones present in a cymbal or triangle are not straight multiples of the fundamental, they are chaotic, and are very important in determining the timbre. Anyone (and I mean that) can easily tell the difference between a cymbal with and without a low-pass filter with the threshold around 22kHz, because these "inaudible" frequencies are lost.

mortil · on March 5, 2012

If anyone can hear it, then surely it must have been verified through a double-blind test. Can you provide a citation?

leif · on March 6, 2012

I don't know of any to point you to. They probably exist, but I haven't read them. Let me know if you stir some up.

cmer · on March 5, 2012

This is a much more polite response than what I had in mind. Better this way I guess :)

jwatte · on March 6, 2012

The undertones created by the high overtones are realized in the anti-aliasing filter during recording. That's not actually the reason 44 kHz sampling isn't enough.

leephillips · on March 6, 2012

1: He said "harmonics", not overtones. 2: You can not hear inaudible frequencies. Because they are inaudible.

leif · on March 6, 2012

1: You're still wrong. One person's typo is not a slight against physics. 2: That's what "sarcastic quotes" are for.

femto · on March 5, 2012

Playing Devil's Advocate...

The statement that frequencies above 20kHz don't matter rests upon the assumption that the ear is linear. If the ear is not linear (I don't know whether it is or not) then frequencies above 20kHz will matter, as the ear will be able to mix higher frequencies down to less than 20kHz. For example, if we have frequencies of 56kHz and 59kHz, the ear MIGHT be able to discern a difference frequency of 3kHz. No doubt this effect could be reproduced by a signal with a sampling rate of 44.1KHz, but only if the analogue systems, before the sampling stage, reproduce any non-linearity in the human ear.

Incidentally, you can get speakers that create a localised beam of sound that the person sitting next to you cannot hear. They work by transmitting frequencies above the audible range. These high frequencies can be beamformed by a relatively small speaker array, so the sound is localised. They then rely on the non-linearity of the ear (or maybe the air around the ear?) to mix the ultrasonic frequencies down to audible frequencies. I guess there must be non-linearity in the human auditory system!

On the subject of 24-bits my understanding is that 16-bits is adequate, provided the levels (scaling) are set correctly in the recording. What 24-bits delivers is the ability to do a crappy job of the mixing, and still end up with the full dynamic range of the human ear. 24-bits is probably a temporary solution though, as manufacturers will engage in the usual Loudness War [1], and push the signal to the top of the dynamic range. Before long 24-bit audio will be equivalent to 16-bits (since the 8 least significant bits will be unused) and the next big thing will be 32-bit audio.

Having said all that, I'd guess that the speakers will be the limiting factor in most sound systems, not the recording format.

[1] http://en.wikipedia.org/wiki/Loudness_war
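
For reference on the 16-bit versus 24-bit point, a minimal sketch of the textbook figure for an ideal quantizer driven by a full-scale sine wave, roughly 6.02 dB per bit plus 1.76 dB. It ignores dither, noise shaping, and real converter noise, so treat the numbers as theoretical ceilings only.

```python
import math

def dynamic_range_db(bits):
    """Theoretical signal-to-quantization-noise ratio of an ideal N-bit
    quantizer for a full-scale sine wave."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

for bits in (16, 24, 32):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")

# Each extra 8 bits adds the same fixed amount of range.
extra_8_bits_db = dynamic_range_db(24) - dynamic_range_db(16)
print(f"extra 8 bits: {extra_8_bits_db:.1f} dB")
```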

waqf · on March 5, 2012

Nonlinearity of the ear is thought to be the explanation for sum and difference tones, which most certainly exist: https://en.wikipedia.org/wiki/Combination_tone.

glassx · on March 5, 2012

> Having said all that, I'd guess that the speakers will be the limiting factor in most sound systems, not the recording format.

Yes. And DACs, which normally have filters too.

femto · on March 5, 2012

Yes, though I tend to think of the reconstruction filters as being part of the recording format.

Here's an interesting article:

  http://news.google.com/newspapers?id=E5guAAAAIBAJ&sjid=d6EFAAAAIBAJ&pg=3183%2C2664048

In 1975, the Canadian Broadcasting Corporation was using a head shaped microphone, which was presumably an attempt to reproduce the non-linearity of the ear. It would be interesting to do such experiments with digital sampling.

Thinking about it, if every person has a different non-linear response, in theory the only way to reproduce sound beyond a certain threshold of fidelity would be to reproduce the ultrasonic components, so each person would hear their own non-linearity. (That would be beyond what I can hear or care about, but it would be fun to play with. Beyond a certain level we also get to the point where we need to ask what it means to hear a sound.)

sjwright · on March 5, 2012

  > the speakers will be the limiting factor
  > in most sound systems

I disagree -- in most sound systems, the room is generally the most limiting factor.

Pardon the reductio ad absurdum, but would you prefer to listen to $1,000 speakers in a dry, padded listening room, or to $100,000 speakers in a tile bathroom? Obviously the room matters; I think most people underestimate by how much.

femto · on March 5, 2012

Probably. I should have left it at "it's not the recording format" and not nominated a limiting factor.

I'd take the bathroom, given that my singing voice sounds less worse there! :-)

davesims · on March 5, 2012

Another "ex audio engineer" here, you can believe or not at your leisure. Many hours spent in high-end recording and mastering environments.

I'm not sure what your background in audio is, but everything he says is correct. High end frequencies well past 15k and up (22.1k actually) are widely acknowledged to influence the lower frequencies and play a huge role in the perception of the quality of a recording. This is an old debate with pros and cons on both sides, but in general you'll find the "Golden Ears" mastering engineers (Stephen Marcussen, Bob Ludwig, etc.) come down on the side of higher sampling rates.

Now, if your original recording was mastered to 16/44.1, then a transfer by way of 24/192 will probably actually hurt the recording. But if you're mastering from an original analog or high-quality digital, in my experience there's no question, higher sampling rates deliver better experiences.

sjwright · on March 5, 2012

I have also spent many hours in high-end recording and mastering environments, and it's my observation that most engineers suffer from confirmation bias just like everyone else on the planet.

I've caught engineers using L1-Ultramaximizer (or similar) to bounce a recording down to 16-bit/44.1khz as part of the mastering process, and they're always surprised when they're completely unable to hear the difference even in the most simple cover-the-screen-and-toggle-bypass test.

davesims · on March 5, 2012

Audio, perhaps like the wine industry, is a vast bastion of confirmation bias and subjectivity, no argument there.

But I know what my ears hear, and IMO there is absolutely a vast difference between 44.1 and 192. I'm not sure how you can even question it. Someone else on the thread was saying it's impossible to hear the difference between 16bit and 24bit. I don't even know what to say to that. It's like telling me the glass of Gallo "Table Red" you're drinking is as good as my '75 Lafite. All I can say is "cheers" and just enjoy.

jff · on March 6, 2012

Regarding your wine anecdote: http://www.theatlantic.com/health/archive/2011/10/you-are-no...

If I gave you a bottle of "Table Red" with a '75 Lafite label, I'm sure you'd tell me how rich and wonderful it was. The problem here is that, as you said, "I know what my ears hear". You know you're listening to 192, wow it sure sounds great!

nitrogen · on March 6, 2012

You just need to consider more plausible explanations for the difference you are hearing, such as low-quality sample rate conversion on the playback devices you are using, clock jitter that is less audible at 192k than 44.1k, no dithering on the 16-bit output resulting in quantization noise on quiet sounds, etc.

If you're listening to quiet music in a quiet room at high volumes on very low noise equipment, you can hear a difference in the noise floor level between dithered 16-bit and 24-bit, but at that volume level if that music (or movie) also has full-amplitude signals you'll be reaching peaks over 110dB SPL.
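
A small sketch of the dithering point above, assuming a test tone of 0.4 LSB amplitude and TPDF dither of +/-1 LSB peak (the difference of two uniform random numbers). The tone level, seed, and helper names are arbitrary choices for illustration.

```python
import math
import random

BITS = 16
LSB = 1 / 2 ** (BITS - 1)

def quantize(x, rng=None):
    """Round x to the 16-bit grid; if rng is given, add TPDF dither first."""
    v = x / LSB
    if rng is not None:
        v += rng.random() - rng.random()  # triangular PDF, +/-1 LSB peak
    return round(v) * LSB

def tone_level(signal, cycles):
    """Amplitude of the component completing `cycles` periods over the signal."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * cycles * i / n) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * cycles * i / n) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

N, CYCLES = 20_000, 200
quiet = [0.4 * LSB * math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]

rng = random.Random(0)  # fixed seed so the run is repeatable
plain_level = tone_level([quantize(x) for x in quiet], CYCLES) / LSB
dithered_level = tone_level([quantize(x, rng) for x in quiet], CYCLES) / LSB

print(f"tone level without dither: {plain_level:.3f} LSB")
print(f"tone level with dither:    {dithered_level:.3f} LSB")
```

Without dither the sub-LSB tone rounds away entirely; with dither it survives underneath a steady noise floor, which is the trade being described.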

sjwright · on March 5, 2012

  > I'm not sure how you can even question it.

With evidence.

xiphmont · on March 5, 2012

As much respect as I have for Bob Ludwig's hard won mastering skills, he also strongly believes in $n,000/foot speaker cable, which is what he has installed at Gateway. So by all means give him well deserved props, but don't assume he's an expert on all aspects of audio theory or practice.

sjwright · on March 5, 2012

I've always thought the most expensive speaker cable sounds a lot better... to the wallet of the salesperson.

colanderman · on March 5, 2012

That data doesn't back your point at all. That data concerns what frequencies are present, not what frequencies can be heard.

Derbasti · on March 5, 2012

He raises a lot of valid points. However...

192 kHz is clearly overkill for listening. Not so for further editing of the data.

Same goes for 16/24 bit, however, the difference between 16 and 24 bit is actually audible.

44100 is not a bad sampling rate, but it necessitates very sharp aliasing filters, which are audibly bad. A bit more headroom is well needed there.

That bit about intermodulation distortion is complete bogus. He talks about problems when resampling high-fs audio data. However, you would never do that. You would digitally process 192kHz all the way. Only your loudspeakers or ears would introduce a high-pass filter, and a rather bening (flat) one at that. There is certainly no aliasing going on there unless you resample (wrongly). Intermodulation distortion is not the fault of the sample rate.

I majored in hearing technology. Calling 192/24 worse than 44.1/16 is total BS. How useful it is is a different debate.

ferongr · on March 5, 2012

>Same goes for 16/24 bit, however, the difference between 16 and 24 bit is actually audible.

This [1] (widely accepted in the scientific audio community) study's conclusions disagree with your assertion.

>44100 is not a bad sampling rate, but it necessitates very sharp aliasing filters, which are audibly bad.

This is not the 1980s, hardware has progressed beyond that point. Modern (i.e. anything from 1995 onwards) DACs do not suffer from aliasing problems. Also see [1]

>That bit about intermodulation distortion is complete bogus. He talks about problems when resampling high-fs audio data.

I did not notice that in the article. It talks about IMD in the context of the analog chain and the transducers following the DAC, and it's possible that high frequencies can increase it.

[1] http://www.aes.org/e-lib/browse.cfm?elib=14195

Derbasti · on March 6, 2012

> Modern (i.e. anything from 1995 onwards) DACs do not suffer from aliasing problems.

True, but they do so using (long, high-quality) high-cut filters. And these filters are pretty sharp, as they have to close within, say, 18-22.1 kHz. You can design them as linear-phase FIR filters with oversampling and all the good stuff, but physics dictates that sharp filters introduce distortion. A sharp filter like that is audible.

_delirium · on March 6, 2012

I'm not aware of any (blind) listening tests actually showing that a modern, high-quality DAC for 44 kHz audio introduces audible distortion compared to a similarly high-quality DAC for, say, 96 kHz audio, though. It's not theoretically impossible that the lowpass would introduce some sort of noticeable distortion, but I haven't run into substantiated evidence that it actually does.

Anechoic · on March 5, 2012

> 44100 is not a bad sampling rate, but it necessitates very sharp aliasing filters,

When you're talking about recording, sure, but in terms of storage and playback, we solved that problem 20 years ago with oversampling.

Derbasti · on March 6, 2012

You will still need an aliasing filter that cuts off between, say, 18 and 22.5 kHz to avoid aliasing noise. That is one sharp filter no matter how you look at it. You can use a high quality, long, linear-phase FIR filter, but you can't cheat physics: sharp filters necessarily introduce distortion, and such a sharp filter so close to the hearing threshold does not go unnoticed.
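
To put a rough number on "sharp", here is a minimal sketch using a commonly quoted rule of thumb for FIR length (often attributed to fred harris): taps ≈ (sample rate / transition width) × (stopband attenuation in dB / 22). The 96 dB attenuation target, the 18 kHz to 22.05 kHz transition band, and the 8x oversampled rate are assumptions picked for illustration, not figures from any particular DAC. The estimate says nothing about audibility, only about filter size.

```python
def estimate_fir_taps(sample_rate_hz, transition_hz, attenuation_db):
    """Rule-of-thumb FIR length: (fs / transition width) * (attenuation dB / 22)."""
    return (sample_rate_hz / transition_hz) * (attenuation_db / 22.0)

TRANSITION_HZ = 22050 - 18000  # assumed 18 kHz to 22.05 kHz transition band
ATTENUATION_DB = 96            # assumed stopband target, roughly the 16-bit range

# Filter running at the base rate versus at an assumed 8x oversampled rate.
taps_at_44k = estimate_fir_taps(44100, TRANSITION_HZ, ATTENUATION_DB)
taps_at_8x = estimate_fir_taps(8 * 44100, TRANSITION_HZ, ATTENUATION_DB)

print(f"approx taps at 44.1 kHz:  {taps_at_44k:.0f}")
print(f"approx taps at 352.8 kHz: {taps_at_8x:.0f}")
```

Under these assumptions the estimate is a few dozen taps at the base rate and a few hundred at the oversampled rate, which is a modest filter by current standards.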

sjwright · on March 6, 2012

I don't see how a sharp filter could be needed if the DAC is oversampling.

Derbasti · on March 6, 2012

Obviously. Analogue audio does not have a sampling rate. The ADC however can oversample all it wants, but if the output is 44.1 kHz, it needs an aliasing filter that cuts off at 22.05 kHz.

sjwright · on March 7, 2012

We appear to be in complete agreement with each other.

hackermom · on March 5, 2012

> Same goes for 16/24 bit, however, the difference between 16 and 24 bit is actually audible

No, the difference is not audible at all. At 16 bits of depth on a normal low-level audio signal (~0.3 volts), we're talking about less than 0.000005 volts per amplitude step. This difference gets lost in the THD already at the DAC in your audio output stage. Then it gets lost again in the amplifier. And again in the cable to your speakers or headphones. And then it gets lost again in the speaker elements. What survives in a normal low-level audio signal is about 14 bits of resolution.
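
The step-size figure is easy to check. This is a minimal sketch that assumes the ~0.3 volts is the full range divided evenly across all codes; if it were a peak or RMS value instead, the result would shift by a small constant factor. The function name is made up for illustration.

```python
def step_volts(full_scale_volts, bits):
    """Voltage covered by one quantization step, assuming an evenly divided range."""
    return full_scale_volts / (2 ** bits)

SIGNAL_VOLTS = 0.3  # the ~0.3 V low-level signal mentioned above (treated as full range)

step_16 = step_volts(SIGNAL_VOLTS, 16)
step_24 = step_volts(SIGNAL_VOLTS, 24)

print(f"16-bit step: {step_16 * 1e6:.2f} microvolts")
print(f"24-bit step: {step_24 * 1e9:.2f} nanovolts")
```

The 16-bit step comes out just under five millionths of a volt, which matches the figure quoted above.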

> 44100 is not a bad sampling rate, but it necessitates very sharp aliasing filters, which are audibly bad. A bit more headroom is well needed there.

44.1khz IS a bad sampling rate for accurately reproducing anything except a triangle wave or square wave above 5khz.

yzhou · on March 5, 2012

Why do you think "This difference gets lost in the THD already at the DAC"? Do you have numbers to back it up? What's the noise floor of a DAC? What's the noise floor of an output stage? Do you have the number?

sjwright · on March 5, 2012

The number is somewhere between the amplitude of an ant pissing on cotton, and an ant not even thinking about pissing on cotton.

yzhou · on March 5, 2012

High dynamic range is not about the lowest volume you can hear, it's about the voltage resolution between this sampling point and the next. Based on your assumption, we can all see black whether we use 16bit RGB color or 24bit RGB color, so what's the point of using 24bit RGB?

hackermom · on March 5, 2012

Many years of building audio equipment (in particular analog synthesizers), and equally many years of being meticulously anal with getting the best components for my circuits, reading specifications of down to every single op-amp I've ever employed, is why I think so.

I am not saying that there aren't any DACs on the planet that can handle five millionths of a volt, but I am saying that five millionths of a volt isn't surviving through the particular DACs and the rest of the electronics used in your PC/living room hi-fi audio equipment.

davesims · on March 5, 2012

Heh, it's funny to see this late-nineties debate get re-hashed here. Also kind of fun.

If it were true that there's no audible difference between 16 and 24 bit, companies like Alesis, Otari, ProTools, etc. wouldn't have spent the last 15 years ditching 16 bit like an old pair of smelly sneakers. (better metaphors welcome).

Seriously, anyone who has sat down in a real listening environment for 5 minutes A/Bing 16 vs 20 bit, 16 vs 24, etc. hears the difference immediately. There's no question. This is why you can buy ADAT 16 bit 'blackfaces' for $100, down from their original $4,000.

shard · on March 5, 2012

Sure, moving up from 16bit recording was an improvement, but having done engineering for a company listed above for over a decade, I can tell you that we went 24bit/192kHz because of market demand, not for any real technical reasons. We thought it was fairly unnecessary ourselves. It was also kind of an arms race with other companies, much like the megapixel arms race for digital cameras.

ferongr · on March 5, 2012

Bigger numbers are better. Right?

It's all marketing, baby!

cdvonstinkpot · on March 5, 2012

...And the new Pro Tools 10 just added the ability to record in 32-bit floating point. http://www.avid.com/US/products/Pro-Tools-Software

hackermom · on March 5, 2012

Yes, and anyone who has ever sat down in front of an LCD flatscreen watching their favorite movie on DVD/BD using a gold-plated $200 HDMI cable instead of a $4.99 Walmart HDMI cable sees the extra sharpness immediately. This is why non-gold plated non-OFC HDMI cables are down to $4.99 a piece from their original $49.99 during introduction.

scott_s · on March 6, 2012

I can't tell if you're being sarcastic or not.

alanh · on March 6, 2012

I’m going to go ahead and say yes, that seems to be blatant sarcasm, or at least, reference to placebo effect / being a sucker.

scott_s · on March 6, 2012

The difficulty I had is that the same person claimed they could hear the difference between 44 kHz and 96 kHz, when the article (and all other comments which cited outside sources) claims that is well outside of human capability.

davesims · on March 6, 2012

That's cute. Obviously you've never recorded a rock band while riding the pre to compensate for 16bit's terrible noise floor and horribly limited headroom. You've never had the joy of ruining a perfectly good take because of that wonderful sound it makes when the volume spikes into digital distortion despite compressing the wazoo out of the input source. Glorious sound, digital distortion. Run a dentist drill through an old Speak & Spell and you'd just about have it.

You've never rented an expensive tube EQ during a mix to cover up 16bit's grating harshness from 10k to 15k. Or tried like mad to make the bass drum sound like a freaking bass drum and not a pie pan slamming against the back of a plastic trash can. And yes, we had good mics and pres, all standard studio stuff. Decent, not brilliant, converters, but it was the 16bit that was the problem. Getting those 20bit XTs for the first time was like walking into the Promised Land.

Sure, there's lots of marketing ploys out there, lots of snake oil. Moving up from 16 bit was not one of them.

rdtsc · on March 6, 2012

It looks like you are jumping in without actually having read the article in question. That's ok, but you are wasting space building a straw man proceeding to vigorously demolish him.

The original article explicitly mentions how 24bit is useful for recording.

davesims · on March 6, 2012

Speaking of jumping in without reading...I wasn't responding to the article. I was responding to the commenter that said you couldn't tell the difference between 16 and 24 bit.

rdtsc · on March 6, 2012

And you cannot tell the difference. The reason to record using 24 bits is so you don't have to be as precise centering the recording level. If that level is centered then you can capture fine with 16 bits (by the way that is also explained in the article).

lambda · on March 6, 2012

Did you read the original article at all?

> Professionals use 24 bit samples in recording and production [11] for headroom, noise floor, and convenience reasons.

...snip...

> Modern work flows may involve literally thousands of effects and operations. The quantization noise and noise floor of a 16 bit sample may be undetectable during playback, but multiplying that noise by a few thousand times eventually becomes noticeable. 24 bits keeps the accumulated noise at a very low level. Once the music is ready to distribute, there's no reason to keep more than 16 bits.

The original article does say that yes, during recording and production, 24 bit audio gives you a lot more room to play with. That doesn't mean that you can hear the difference between 16 and 24 bits for the final recording; just that 24 bits give you more room to keep out of trouble during production.
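
A back-of-the-envelope version of that argument, as a sketch only: it uses the textbook signal-to-quantization-noise figure for a full-scale sine (about 6.02 dB per bit plus 1.76 dB) and assumes each processing step adds an independent, equal-power dose of quantization noise, so N steps raise the noise floor by 10·log10(N) dB. Real effect chains are messier than this, and the function names are made up for illustration.

```python
import math

def sine_snr_db(bits):
    """Theoretical SNR of a full-scale sine after ideal uniform quantization."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

def snr_after_ops_db(bits, n_ops):
    """SNR after n_ops steps, assuming each adds independent equal-power noise."""
    return sine_snr_db(bits) - 10 * math.log10(n_ops)

for bits in (16, 24):
    print(f"{bits}-bit: {sine_snr_db(bits):.1f} dB single pass, "
          f"{snr_after_ops_db(bits, 1000):.1f} dB after 1000 operations")
```

Under those assumptions a thousand operations cost about 30 dB either way; the 24-bit pipeline still ends up well above the single-pass 16-bit figure, while the 16-bit pipeline does not.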

davesims · on March 6, 2012

Did you read the comment thread at all? I wasn't responding to the article, I was responding to a comment:

...snip...

>Same goes for 16/24 bit, however, the difference between 16 and 24 bit is actually audible

No, the difference is not audible at all.

...snip...

Anechoic · on March 5, 2012

For those of you who are interested in just how much of a golden ear you truly are: download Harman's "How to Listen" software for Windows or Mac OS X http://harmanhowtolisten.blogspot.com/ (scroll down).

Harman requires its trained listeners to pass tests based on this software before participating in juries to evaluate Harman products. It doesn't directly address the sample rate/bit depth issues discussed in the linked article, but it does address a lot of the issues brought up in the HN discussion, so you can have a chance to see how much those characteristics really matter.

You may be surprised.

JangoSteve · on March 6, 2012

Even without debating the science and signal processing arguments raised...

In any test where a listener can tell two choices apart via any means apart from listening, the results will usually be what the listener expected in advance; this is called confirmation bias and it's similar to the placebo effect. It means people 'hear' differences because of subconscious cues and preferences that have nothing to do with the audio, like preferring a more expensive (or more attractive) amplifier over a cheaper option.

The human brain is designed to notice patterns and differences, even where none exist. This tendency can't just be turned off when a person is asked to make objective decisions; it's completely subconscious. Nor can a bias be defeated by mere skepticism. Controlled experimentation shows that awareness of confirmation bias actually increases rather than decreases the effect!

Doesn't that completely negate his conclusion, that there is no point to distributing 24/192 music? If people want to pay for 24/192, and even he just admitted that they will legitimately enjoy it more, how can you conclude there is no point?

Life is short. I want to enjoy things. Whether or not my enjoyment can be quantified or scientifically defended, I really don't give a shit. But that's okay, if you don't want to sell me 24/192 music, Amazon will. Between this and DRM-free content, it's no wonder I buy all my music from Amazon these days.

rdtsc · on March 6, 2012

There is a perversion going on at both ends here. And by perversion I mean a distortion of truth in a bid to make a profit. This is not the worst that can happen, but it is worth mentioning. You probably put it more mildly, but I am a bit more harsh. Some people are irrational and spend money on stuff that they don't need, and another group of people are perpetuating the lies and the marketing in an effort to extract the maximum amount of money from the first group (in other words, your basic market setup).

Audiophiles are quite a fascinating group. These are people that can be rather rational in some respects (they could be doing research in some lab somewhere) but when it comes to audio equipment they will shell out $2000 for HDMI cables. The salesmen and manufacturers that make these things ("high end" HDMI cables, 192kHz recordings) know this very well and they aggregate around this target set of clients.

I think that is exactly what is happening here. At some point storage capacity is just good enough and one can distribute 48kHz, 16bit audio to everyone. But what do you do next? Everyone is getting that and it is not new and cool anymore. What to do? Well, increase the frequency and sell everyone a newer, better, higher fidelity thing, even though objectively human ears cannot really hear the difference. Subjectively though, there is a huge difference. If you ask someone who just spent $50 for a 192kHz record if they like it better than say a $20 48kHz one, I bet you 100% of people will confirm that 192kHz sounds better and will be ready to go and buy more.

backprojection · on March 6, 2012

> Doesn't that completely negate his conclusion, that there is no point to distributing 24/192 music? If people want to pay for 24/192, and even he just admitted that they will legitimately enjoy it more, how can you conclude there is no point?

Ultimately, sure. The world is full of products and services which only add value in this weak sense.

If the same wine tastes better if it's priced higher, then it still tastes better. But I think it's only honest that the consumer be aware that the increased utility from being priced higher is due solely to the fact of it being priced higher. Beyond that, I don't care.

One thing we can all agree on is that music is much more enjoyable if you think you're listening to it through good equipment or from a good source. Ultimately it's only the `thinking' part that matters. So I would make two points:

1. One point he's making is that playing audio sampled at 192kHz through regular equipment actively distorts the music in negative ways. So if you know this now, you should enjoy that music _less_.

2. If you're adept at metacognition (maybe that's not the right word), you'll realize a) you can get most of the enjoyment by buying equipment that's `pretty decent', and then not worry about it too much. b) you're probably fooling yourself by spending so much time/money worrying about having the best equipment, so you're probably not getting the maximum utility from the experience anyway. Or maybe it's the experience of trying to get the best equipment itself that's enjoyable, not necessarily the increased audio fidelity.

jpdoctor · on March 6, 2012

> If people want to pay for 24/192, and even he just admitted that they will legitimately enjoy it more, how can you conclude there is no point?

Sorry, no time to reply. I gotta run and write up my biz plan to distribute 32/384 audio.

joshAg · on March 6, 2012

SUCKER! I'm already working on 48/768 audio. It's amazing how clear the recordings are.

cwp · on March 6, 2012

Well, if we accept that argument, then just about any means of signalling "this sounds better!" will work. How about we choose something that doesn't waste bandwidth?

JangoSteve · on March 6, 2012

That's true. It's kind of the Monster Cable model. BTW, I'm not saying that marketing and whatnot should deceive less technical consumers and trick them into spending more money than they should (which is basically what Monster Cable does). But when you explain to technical people why something like 24/192 isn't better (as other people in this thread have pointed out, this isn't totally accurate in the first place), and they understand what you're saying but still prefer it, by all means, let them buy it.

roel_v · on March 6, 2012

This is the same reasoning that somebody used when I was debating with her whether insurance should reimburse homeopathic and other alternative treatments. Her reasoning was 'well if it works, it should be reimbursed, doesn't matter if it's from a placebo effect or not'; my position is that they shouldn't be reimbursed, but quite honestly, I don't really have a rational reason for it (at first I thought I had but it turned out I couldn't formulate it, which is the same as not having it).

So, while I have no option (for now) but to acknowledge your position, I still feel dirty for doing so.

tripzilch · on March 6, 2012

If there's no point arguing against something that people will eat up regardless of evidence or fact,

why are you arguing against the conclusion of an article that has this many upvotes on HN?

untangle · on March 5, 2012

This article is one of the most lucid and accurate that I have read on this topic.

However, one thing that's missing here (and in nearly all other similar pieces) is a full discussion of the prerequisites of the sampling theorem. For example, the signal must be bandwidth-limited (and no finite-time signal can be).

But this is a minor concern, as there are many elements in the analog domain of the recording and playback chains that serve as low-pass filters - starting with the mics. So bandwidth-limiting is effectively achieved.

For a similar reason, I find the discussion of the "harmful" effect of high frequencies on playback electronics and loudspeakers to be a bit overdone IMO. Peruse the excellent lab results of modern audio gear on Stereophile's web site. You'll find that bandwidths exceeding 30kHz are rare.

One last thing. When doing subjective "testing," keep in mind that what some folks are hearing may be limitations of their gear. For example, most DACs derive their clocks for higher sampling rates (88/96/176/192) by clock-multiplier circuits. IOW, 44kHz and 48kHz are the only ones clocked directly by a crystal. These multiplier circuits are often noisy, contributing to jitter. The audible effect of this jitter is hard to predict.

Bob

PS As an avid audiophile, I find the clash of subjectivists and objectivists on this normally-buttoned-down forum to be a bit of a trip.

blackhole · on March 5, 2012

You always record stuff at 24-bit/192 kHz for many reasons, usually involving minimizing analog artifacts and giving you a lot of information to work with. You use 32-bit float wavs to transport stuff around so you don't have to worry about normalizing levels and clipping. Lossless formats drastically improve the quality of transients. But every single objection to this is either ignoring the points of the article, or talking about the benefits of recording at high fidelity, when this entire article is pointing out that once you have _finished a mix_, there is no reason to distribute things in 24-bit/192kHz. Most speakers can't even play above 20kHz anyway, which makes the entire point moot. I don't care if you have a bajillion kHz, the speakers can't play above 20 kHz, so you're screwed.

mfarris · on March 5, 2012

You're getting two entirely different things mixed up.

192 kHz is the sample rate. 192,000 slices per second. It does not refer to the audible sound spectrum.

20 kHz in speakers refers to the cycles per second of the audible waveform. Normal human hearing range is 20 Hz - 20 kHz. For most people, it's less than that.

A speaker can certainly play back music sampled 192,000 times per second. Most of them can't play tones that are higher pitched than 20 kHz, which is fine because mostly only dogs can hear up there anyway.

blackhole · on March 6, 2012

I am not getting these things mixed up, because the sample rate is related to the maximum frequency that can be stored, and lo and behold, look at all these people claiming that those higher frequencies matter. 44.1 kHz sample rate can only encode tones up to about 22 kHz, whereas 192 can encode frequencies of up to 96 kHz, and those people up there are arguing that these higher frequencies are exactly why 192 kHz is superior. Now, if you want to say that sampling a tone at 44100 times per second somehow won't sound as good as 192000 times per second, I'm not saying that isn't possible, but I don't really take that claim seriously at all.

The fact is, simply distributing music in lossless format carries the vast majority of audible improvements. Arguing over whether or not it's 24-bit or 16-bit or making a chunk of sound last 5.2 microseconds instead of 22.67 seems incredibly stupid to me, because you're better off simply improving the mix itself than fiddling over such microscopic differences. These things only become relevant if your mix and performance and recording equipment (or synths) are absurdly close to perfection. This becomes even LESS relevant in an age of indie-musicians.
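
For anyone who wants to check the numbers being thrown around, here is a small sketch of the two relationships involved: the highest frequency a sample rate can represent is half that rate (the Nyquist frequency), and the time one sample covers is the reciprocal of the rate. The function names are made up for illustration.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at a given sample rate."""
    return sample_rate_hz / 2

def sample_period_us(sample_rate_hz):
    """Duration of one sample in microseconds."""
    return 1e6 / sample_rate_hz

for rate in (44100, 48000, 96000, 192000):
    print(f"{rate:>6} Hz: Nyquist {nyquist_hz(rate) / 1000:.2f} kHz, "
          f"sample period {sample_period_us(rate):.2f} microseconds")
```

This reproduces the 22.67 and 5.2 microsecond figures mentioned above for 44.1 kHz and 192 kHz.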

jwatte · on March 6, 2012

The sampling theorem is for static signals and perfect filters. Turns out, music isn't static. Once you have transients in the program, you need higher bandwidth or you will end up with phasing effects (time domain aliasing.) This is plain from the math!

Filters are also not perfect (but good oversampling filters are not the weakest link)

Further, even perfectly dithered 16 bit data can't go 20 dB below the quantization floor, unless you give up on frequency response on the high end. Again, this is plain math.

With a calibrated 105 dB low-distortion sound system, in a quiet room, I can hear imperfections from 16 bit, 44 kHz material, especially in soft flutes and triangle type percussion. Of course, D class amplifiers, and MP3 encoding, do worse things to the signal, so let's start there. But 20 bit, 96 kHz (or at least 64 kHz) are scientifically defensible, when analyzing the math and the physics involved. No snake oil needed!

wickedchicken · on March 5, 2012

For an article containing a lot of "well, if you knew signal processing..." there are two fairly major oversights:

1) Any well-designed system is going to have headroom. Period. Just because 48kHz can theoretically capture the frequencies the human ear can hear, it's always good to have a little wiggle room. This comes into play even more with interactive situations: humans are particularly sensitive to jitter. Having an "overkill" sample rate lets you seamlessly sync things more easily without anyone noticing.

2) 192kHz comes with an additional benefit besides higher frequencies: it also means more granular timing for the start and stop of transients. More accurate reverb would be the obvious example. I don't know if the human ear can discern the difference between 0.03ms and 0.005ms but it's something I don't see mentioned often.

xiphmont · on March 5, 2012

1) 48kHz sampling does include headroom.

2) increased sampling rate does not improve timing. This also has been researched in detail (because it sounds like it could possibly be true given that the ears can phase match to much greater granularity than the sample clock). It was found false in practice, and in retrospect, the sampling theorem explains why. The Griesinger link discusses this with illustrations, and provides a bibliography.
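
A toy illustration of the timing point, offered as a sketch and not as a reproduction of the cited research: a 1 kHz sine is delayed by 5 microseconds (about a quarter of one 48 kHz sample period), sampled at 48 kHz, rounded to 16-bit values, and the delay is then recovered from the samples alone by correlating against a reference sine and cosine. The tone frequency, delay, and block length are arbitrary choices, and a single steady tone is much simpler than real music.

```python
import math

def estimate_delay(samples, tone_hz, sample_rate_hz):
    """Estimate how far a sampled sine lags sin(2*pi*f*t), via quadrature correlation."""
    i_sum = 0.0
    q_sum = 0.0
    for n, x in enumerate(samples):
        theta = 2 * math.pi * tone_hz * n / sample_rate_hz
        i_sum += x * math.sin(theta)  # in-phase component
        q_sum += x * math.cos(theta)  # quadrature component
    phase_lag = math.atan2(-q_sum, i_sum)
    return phase_lag / (2 * math.pi * tone_hz)

SAMPLE_RATE = 48000   # one sample every ~20.8 microseconds
TONE_HZ = 1000        # arbitrary test tone
TRUE_DELAY = 5e-6     # 5 microseconds, well under one sample period
NUM_SAMPLES = 480     # exactly 10 cycles of the tone

# Sample the delayed sine and round each value to a 16-bit integer level.
samples = [
    round(32767 * math.sin(2 * math.pi * TONE_HZ * (n / SAMPLE_RATE - TRUE_DELAY))) / 32767
    for n in range(NUM_SAMPLES)
]

estimated = estimate_delay(samples, TONE_HZ, SAMPLE_RATE)
print(f"true delay:      {TRUE_DELAY * 1e6:.3f} microseconds")
print(f"estimated delay: {estimated * 1e6:.3f} microseconds")
```

The point of the sketch is only that the position of a band-limited signal in time is not restricted to multiples of the sample period.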

nullc · on March 6, 2012

To avoid the trouble of digging up the link: http://www.davidgriesinger.com/intermod.ppt

Slides 29-35 address this point.

rdtsc · on March 6, 2012

> it's always good to have a little wiggle room

48kHz already has enough 'wiggle room'. How many people do you personally know that can hear a 24kHz sine tone?

> more granular timing for the start and stop of transients. ... it's something I don't see mentioned often.

Probably because it doesn't make sense. Human ears cannot hear frequencies above 24kHz and Nyquist tells us that 48kHz is enough to completely capture all the detail of a signal at that frequency and below.

sjwright · on March 5, 2012

  > Having an "overkill" sample rate lets you seamlessly
  > sync things easier without anyone noticing.

You can get the same theoretical benefit by oversampling on playback. And a lot of audio equipment does just that.

  > 192kHz ... also means more granular timing for the
  > start and stop of transients.

Not really, for two reasons -- unless you're talking about glitch music, transients are unlikely to ever be so sudden that the difference between 0.03ms and 0.005ms could possibly matter.

jaylevitt · on March 5, 2012

I'm pretty sure that #2 isn't true; signal processing folks will be able to phrase this better than I can, but I think that if you have enough information to capture the waveform at a given frequency, you also have enough information to precisely place it in time - phasing errors are more likely due to quantization error, which is about bit depth, not sample rate. No?

aiscott · on March 5, 2012

[edited: I was wrong]