When people ask me where they should spend money to improve the quality of their hi-fi or home theater system, in nearly every case my response will be something like "get a thicker rug" or "put something on this wall to absorb sound reflections, even if it's just a bookshelf."
Beyond that, I'd tend to say something like "stop being so paranoid about what you think you can't hear, and enjoy the damn music."
I'm a composer who works in film/games. I can assure you this is exactly what I'd like people to do when they listen to my music. I spend 99% of my time trying to create good musical ideas, and I spend 1% of my time getting the mix down. I get criticized (rightly) for this quite a bit, but it is hard to care about someone sitting in a >$10,000 labyrinth of sound equipment when I'd rather write a catchy tune.
Then again, when I write sheet music I have to endure some of the most soul-crushingly awful MIDI sequencing in order to check my work, so perhaps I'm too tolerant of terrible sound quality. Still, I'd rather people listened to the music, not the sound of it.
A good-sounding catchy tune is something worth spending that little bit more time on.
But in reality most of your customers want a ton-frakk of compression, loudness and filters on that catchy tune so it ultimately sounds HUGE on the tiniest phone, radio and car speakers... so all that audiophile mixing and dynamics are completely lost anyway.
The other issue regarding high-frequency sound reproduction is that in most cases, the loudspeaker won't be outputting much beyond 22-25 kHz (assuming very good quality loudspeakers, cheap consumer grade units might struggle to hit a -6 dB point at 18 kHz) and even for the speakers that have usable output at that range, the directivity at those frequencies will be so narrow that your head will have to be locked in the perfect "sweet spot" to hear anything.
> Agree one-hundred percent about the room
> although the prescription isn't always as
> simple as "get a thicker rug" etc
WILL: The sad thing is, in about 50 years you might start doin' some thinkin' on your own and by then you'll realize there are only two certainties in life.
CLARK: Yeah? What're those?
WILL: One, don't do that. Two -- you dropped a hundred and fifty grand on an education you coulda' picked up for a dollar fifty in late charges at the Public Library.
If you could hear subharmonic beats from ultrasonics then it would be _very_ easy to demonstrate, alas.
for those curious
>While Motown shortened songs to fit into radio time, the company also produced records specifically with car radio audio quality in mind. Motown recording engineers set up car speakers in the studio so that they could simulate and perfect how a song would sound emanating from a car radio
- what's the point of engineering things to a set of conditions virtually none of your target audience possesses?
I worked out a long time ago, that I enjoy listening to _music_, not HiFi gear.
My advice to people who ask how to make their system sound better? Buy some music you enjoy more…
I can enjoy a wonderful performance of a great tune played through my laptop speakers - much more than I enjoy test tones or gear-demo-tracks through sound gear worth something north of a new car…
(not that I haven't been "that guy" in my past…)
Good advice, but you do need some baseline quality equipment to start with.
Got my car with one speaker blown out, speakers wired semi randomly (left-right and front-rear faders don't work as they should), also powering line-in source from cigarette lighter results in funky background noise. Sounds great--when a good tune is playing and I'm able to recognize it ;-)
Then you can worry about your room.
Or get some half decent headphones.
One of the many things EQ can't fix, of course, is room reflections, which can be helped by room treatments and speakers with a directivity better suited to the room.
DRC can improve this, but only within a small sweet spot.
It is said that in most cases 192 kbit .mp3 is indistinguishable from >192, and blind tests support that. Granted, there are instruments like castanets which make it easier to hear the difference. In general though, I can't distinguish 128 from 192 and I listen to music a lot. Also it's unlikely that my hearing is already damaged because I try to keep volume low.
But I've noticed that where I put the speakers makes a huge difference. I can easily tell the difference from speakers on the floor versus speakers on my desk. Where I'm at the moment also matters a lot. If I lie on the floor, floor speakers don't sound as bad anymore.
In the end, I use headphones. Midrange Audio Technica ones, and I'm probably already overpaying a bit. But I bought them for build quality and comfort, and I wasn't disappointed. I can wear them for hours (not healthy I guess, but I'm used to wearing them even with no music being played). Headphones have the advantage that it suddenly stops mattering where your speakers are and where you are relative to them.
Yes. Do a couple of blind tests with your acoustic system first.
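If you want to put a number on the result of such a test, here's a minimal sketch. It assumes an ABX-style setup where every trial is an independent 50/50 guess if you really can't hear a difference. The name `abx_p_value` and the 16-trial run are only my illustration, not part of any standard tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` of `trials` right by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Sum the binomial tail for a fair coin: P(X >= correct), X ~ Binomial(trials, 0.5)
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


if __name__ == "__main__":
    for correct in (8, 12, 14):
        print(f"{correct}/16 correct: p = {abx_p_value(correct, 16):.4f}")
```

Under that assumption, 12 out of 16 lands a little below the usual 5% threshold, while 8 out of 16 is what coin-flipping looks like.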
> It's true enough that a properly encoded Ogg file (or MP3, or AAC file) will be indistinguishable from the original at a moderate bitrate.
Disagree. This claim seems to be ungrounded compared with others.
I can believe the limitations with bit depth and sampling rate (although I'll take the chance to test myself if I get near a good enough acoustic system). However, I definitely could discern in a blind test whether the music I listened to was stored in a lossy format at a reasonable bitrate. It's usually quite audible with rock music that involves cymbals.
AAC / Ogg don't have that limitation & at high enough bitrate should be indistinguishable from the source in a blind listening test, as demonstrated in a number of Hydrogen Audio listening tests down the years, unless of course you're using crappy encoders at which point all bets are off...
(Really, LAME is very good indeed these days. I eventually decided that I was going to get with the program and just encode all my CDs (backed up to flac files) as mp3 for portable listening. It's good enough, and I've decided not to listen for the pre-echo artifacts so that I won't notice them :) )
IIRC I distinguished an mp3 encoded by iTunes at a bit rate of 192 or 256 kbps from its original in Apple Lossless (both played on the same cheap acoustic system). I probably should test with AAC or Ogg, too. Although I have a feeling that it's pretty much impossible to keep cymbals, which are so rich in high frequencies, intact while keeping the file size compact.
> I've decided not to listen for the pre-echo artifacts so that I won't notice them :)
You're much better at controlling your mind. =) Once I had verified that the difference is audible even on cheap speakers, I couldn't switch back to lossy formats. It means constantly wondering whether that is how it's supposed to sound or not…
That's, by the way, why Apple's idea of having a ‘Mastered for iTunes’ label is worthwhile, IMO: at least you can be sure that the mastering engineer listened to it this way. =)
(Cymbals seem to be a particular bugbear for mp3 encoding; cymbal-heavy tracks tend to suffer the most from obvious encoding artifacts once you know what to listen for.)
Was this in a blind test?
And I was _so_ sure the next sentence was going to be something like:
"No, do not have any suggestions that will make your sound equipment make Justin Bieber sound better…"
Digitally recording a triangle is the best example of why 48kHz is very limiting. The distinct sound of the triangle consists of a high fundamental frequency, ballpark 5kHz, and many very high-pitch harmonics. Most of these harmonics are above 20kHz. The harmonics are what makes it sound like a triangle, not the frequencies below 20kHz. This is why the triangle is one of the hardest instruments to digitally record. It always sounds like crap.
In theory, it's true that the human ear can't hear above ~18kHz, but it can hear the influence of the very high pitch harmonics on a lower frequency.
EDIT: here's more data backing what I said http://www.cco.caltech.edu/~boyk/spectra/spectra.htm
EDIT 2: typos, frequency mistake
The article's about distribution, not recording. I don't think anybody disputes the usefulness of higher sampling rates when recording.
> In theory, it's true that the human ear can't hear above ~18kHz, but it can hear the influence of the very high pitch harmonics on a lower frequency.
...and 48kHz audio contains those lower frequencies.
For example, the human ear will hear a 30kHz frequency if its fundamental is 10kHz. If it's played at 44.1kHz, the 30kHz frequency is gone and all you'll hear is 10kHz, not a "different sounding" 10kHz.
You are going to have to provide me with a citation to back that up because that goes against everything I've learned and experienced in 17 years of working in acoustics.
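The "which harmonics make it onto the disc" part of that is simple arithmetic. A rough sketch, assuming an ideal brick-wall filter exactly at half the sample rate (real converters roll off a bit earlier) and perfectly integer harmonics. `surviving_harmonics` is a made-up helper name.

```python
def surviving_harmonics(
    fundamental_hz: float, sample_rate_hz: float, count: int = 5
) -> list[float]:
    """Harmonics (integer multiples of the fundamental) below the Nyquist frequency."""
    nyquist_hz = sample_rate_hz / 2
    return [
        n * fundamental_hz
        for n in range(1, count + 1)
        if n * fundamental_hz < nyquist_hz
    ]


if __name__ == "__main__":
    # 10 kHz fundamental: the 30 kHz harmonic is above 22.05 kHz, so it is dropped.
    print(surviving_harmonics(10_000, 44_100))
    # At 192 kHz the first five harmonics (up to 50 kHz) all fit under 96 kHz.
    print(surviving_harmonics(10_000, 192_000))
```

Note this only says what is stored, not what anyone can hear.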
Basically, if you produce two ultrasonic frequencies, they will create an interference pattern at a much lower frequency than either of the individual frequencies. Modulate a signal on the difference between two signals, and you can create a directional speaker, since ultrasonic sounds tend to be highly directional (so long as the diameter of the transducer is greater than 1/2 wavelength, which is almost guaranteed with ultrasonic signals). This is how the "sound cannons" that are being deployed for crowd control work.
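The half-wavelength condition is easy to sanity-check with back-of-the-envelope numbers. A small sketch, assuming a speed of sound of 343 m/s (dry air at roughly 20 °C); the function names are hypothetical, and the 40 kHz carrier is just an example value.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 degrees C


def wavelength_m(frequency_hz: float) -> float:
    """Wavelength of a sound wave in air, in metres."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def half_wavelength_mm(frequency_hz: float) -> float:
    """Transducer diameter (mm) equal to half a wavelength at this frequency."""
    return wavelength_m(frequency_hz) / 2 * 1000


if __name__ == "__main__":
    for freq_hz in (1_000, 40_000):
        print(f"{freq_hz} Hz: half wavelength = {half_wavelength_mm(freq_hz):.1f} mm")
```

At 40 kHz half a wavelength is only a few millimetres, so almost any practical transducer is wider than that, whereas at 1 kHz it is already over 17 cm.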
Yes, but the effects of interference patterns between multiple ultrasonic frequencies are the same, and definitely do affect the audible spectrum
has nothing to do with this:
This is why we must filter the square wave that comes out of a DAC
The only reason that square waves "must" be filtered is to reduce the potential of damaging tweeters. If you want to record a square wave with the purpose of later reproducing the square wave, then you don't want to filter it - once you filter it, it's no longer a square wave.
The reason that square wave sucks is because it introduces tons of high frequency content (your amp probably won't reproduce the high frequency content anyway, so I don't think most Japanese consumer amps will damage your speakers--that is, the amp will act like a filter anyway). That high frequency content then creates alias effects (think of moire patterns when looking at super high-res photos that are scaled down without anti-aliasing). Those alias effects sound like shit to the human ear.
The point of filtering is to anti-alias the resulting analog signal after conversion from digital to analog. The point of upsampling is to move that filter well beyond the audible range, so you can use a 1st-order filter (gentle slope, but it introduces no phase effects). The fact that a square wave hurts your speakers is inconsequential--the amp will effectively filter the signal anyway. Unfortunately, it will filter the signal without anti-aliasing, which introduces those nasty interference patterns within the audible spectrum (that is, if you feed a straight 44.1kHz sampled square wave to your speakers without upsampling/filtering).
Trying to record an edge case like this is the same as recording in a room with bad acoustics. So you end up with some weird (but not faithful) representation of the sound which is a snapshot of the microphone's characteristics and directionality of the ultrasonic tones. It's not reasonable to assume any microphone will behave exactly like a human ear. Even if you could, you're going to have to mimic the tiny random movements a normal person would make listening to a sound, movements which would definitely impact the perception of the sound, because microphones are much more stationary than any human would be.
The "different sounding" argument two posts above is silly, because sound is almost never that monochromatic, and if it is, it's usually boring. Also I don't understand how missing out on an odd order harmonic would be a bad thing :) The reality is none of these arguments are based in a reality of what people would hear, and because of that, the arguments aren't practical.
In reality, 20 bits at 48kHz (or 64kHz) would be more than acceptable for even the most discerning of ears and probably the most practical in terms of space and fidelity, but it'd be a weird format to distribute in.
So the interference pattern will be made up of one low frequency sound and higher frequency harmonics. Once again the higher frequency harmonics are redundant, because you only need to record the lower frequency sound.
The only possible way ultrasound can be picked up by the ear is if the ear has a non-linear response to the input sound. Going by the information in the article linked, it is highly unlikely that any significant non-linearity exists in the ear.
Then you should be able to provide at least one citation.
Of course, filters aren't perfect, and result in phase shift and roll-off. So we over-sample the signal to create a signal with a much higher frequency than 20kHz, so that the filtering occurs well outside the audible band, allowing us to filter out all of these harmonics without affecting the desired signal.
Basically, the end result is that by sampling the signal, you are introducing high frequency content that must be removed prior to playback. This high frequency content is one of the reasons old CD players from the 80s and 90s cause "listener fatigue", although I have no sources to back up that last statement.
(not for eatmyshorts -you get this I gather) - everyone gets that "upsampling" can't add detail to a recording right? You can't get more than you've got.... no matter what you do. There is no magic. You upsample so you drive harmonics generated in the digital-to-analog process during playback further up in the spectrum so when you get to the analog stage you can use a nice gentle analog filter to filter them out. Without the upsampling, you need a nasty steep analog filter to filter them out, and that can have audible side-effects (or at least measurable) in the audible spectrum.
eatmyshorts - correct me if I mis-stated any of that please....
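To make the "gentle filter vs. nasty steep filter" point concrete, here's a hedged sketch. It assumes a 20kHz audio band, that the first unwanted image starts at (sample rate minus 20kHz), and an arbitrary 96dB attenuation target that I picked purely for illustration. The function names are mine.

```python
from math import log2

AUDIO_BAND_HZ = 20_000.0  # assumption: top of the band we want to keep untouched


def first_image_hz(sample_rate_hz: float) -> float:
    """Lowest frequency of the first spectral image left by the D/A conversion."""
    return sample_rate_hz - AUDIO_BAND_HZ


def required_slope_db_per_octave(
    sample_rate_hz: float, attenuation_db: float = 96.0
) -> float:
    """Average analog filter slope needed between the audio band and the first image."""
    octaves = log2(first_image_hz(sample_rate_hz) / AUDIO_BAND_HZ)
    return attenuation_db / octaves


if __name__ == "__main__":
    for label, rate_hz in (("44.1 kHz", 44_100), ("8x oversampled", 8 * 44_100)):
        slope = required_slope_db_per_octave(rate_hz)
        print(f"{label}: first image at {first_image_hz(rate_hz):.0f} Hz, "
              f"about {slope:.0f} dB/octave needed")
```

With these assumed numbers the non-oversampled case needs a slope of several hundred dB per octave, while 8x oversampling leaves roughly four octaves of room and gets by with a few tens of dB per octave.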
I don't know about the physics of the speaker itself generating the overtone (in cabinet), but it could certainly resonate a wine glass in the room, for example.
So no, your wikipedia link is not a citation for the claim that cmer made.
If it really worked that way it would be trivial to demonstrate. Alas, it doesn't.
No. It won't.
Which, since it's a psychoacoustic phenomenon, wouldn't hold true when the partials involved are above the audible spectrum.
Didn't read the article, so commenting out of context, however it needs to be said that in sample-based music genres the distributed music gets used as if it were a recording. Maybe then it could be argued that higher sampling/bit rates should be available, if only for those who are sampling.
That may well be true. But those mixed-down harmonics that are heard "live" would then be captured by the 16/44 (or whatever) sampling. IOW, the recording captures what you heard. Those upper harmonics have no emergent properties. Their effect is captured.
Just because it is difficult to record a triangle does not necessarily mean it is impossible to accurately recreate the sound (to human ears) using 48kHz.
Yes, you're right.
In fact, some of the section X references don't even mention hearing, they talk about "alpha-EEG rhythms" (in this case "listeners explicitly denied that the reproduced sound was affected by the ultra-tweeter") and "bone-conducted ultrasonic hearing" through the "saccule" ("organ that responds to acceleration and gravity and may be responsible for transduction of sound after destruction of the cochlea").
In fact, most of the claims of the article are around the fact that there is energy over 20kHz and how it can affect the recording process.
This is a well known fact, and this is exactly why engineers filter out sub-sonic and super-sonic frequencies, especially today: stuff that you can't hear (or feel) will just suck your headroom and make you lose the loudness war.
EDIT: Listen to the triangle at the beginning of Rush's YYZ. It's an old recording, but it sounds significantly worse than the analog version. It's been digitally mastered some time ago so if it was mastered today, it would probably sound better, but still not great. I heard a rumor that Rush is remastering all their albums "for iTunes" at the moment, so hopefully we'll be able to compare soon!
So while it's true that the human ear can't hear well above ~18kHz, and that the interference between high-order harmonics is audible, it's also true that a properly recorded signal, sampled at 44.1kHz, oversampled, and filtered, can reproduce the exact signal the human ear is capable of hearing. At least according to theory.
The human ear is capable of detecting sound pressure as well as sound intensity, and while playback of the interference between harmonics can be reproduced faithfully in the sound intensity realm, the sound pressure levels will differ, and it is theorized that people may be able to tell the difference between the two. However, as far as I am aware, nobody has been able to demonstrate this reliably in practice.
I'd prefer recording technology to err on the side of capturing what we need to reproduce all of that, even if we aren't sure that we need it.
Again, this article is about distribution, not recording.
Have you read the Audio Technology magazine interview with Rupert Neve?
Greg Simmons: Geoff Emerick, the famous British Producer ?
Rupert Neve: Yes, he started me off on this trail. A 48 input console had been delivered to George Martin's Air Studios, and Geoff Emerick was very unhappy about it. It was a new console, made not long after I had sold the Neve company in 1977. George Martin called me and said, "please come and make Geoff happy, while he's unhappy we can't do any work".
They'd had engineers from the company there, and so on. The danger is that if you are not sensitive to people like Geoff Emerick, and you don't respect them for what they have done, then you are not going to listen to them. Unfortunately, there was a breed of young engineers in the company ( I hasten to say this was after I sold it !) who couldn't understand what he was bitching about. So they went back to the company and just made a report saying the customer was mad and there wasn't really a problem. Leave it alone, forget it, the problem will go away. They were acting like used car salesmen. I was very angry with it. So I went and spent time there, at George Martin's request, and Geoff finally managed to show me what it was that he could hear, and then I began to hear it, too.
Now Geoff was The Golden Ears - and he still is - and he was perceiving something that I wasn't looking for. And it wasn't until I had spent some time with him, as it were, being led by him through the sounds, that I began to pick up what he was listening to. And once I'd heard it, oh yes, then I knew what he was talking about. We measured it and found that in three out of the full 48 channels, the output transformers had not been correctly terminated and were producing a 3dB rise at 54kHz. And so people said, "oh no, he can't possibly hear that". But when we corrected that problem, and it was only one capacitor that had to be added to each of those three channels, I mean, Geoff's face just lit up ! Here you have the happiness/ unhappiness mood thing the Japanese were talking about.
copy here: http://poonshead.com/Reading/Articles.aspx
That's one thing I find concerning with the move to digital. With analog media, you can go back, re-record and get an improved result (provided the source is good) but District 9 (which was shot on Red One) will never have improved quality other than resampling because the source is set to a particular digital format with associated data quality.
"[...] provided the source is good" is begging the question; it's no different from saying "District 9 could be better if they hadn't recorded in 4k (or whatever the Red One was using) and downsampled it for my DVD" The nature of the source is irrelevant, barring the fact that film might provide a higher resolution, if film scanning technology increases, and you can afford to both capture on film, process and store your film properly (archiving film is rather difficult, I believe), and get the best quality digitisation possible.
While I have no doubt that digital will eventually catch up and surpass film, there inevitably is going to be a transition period where quality films were recorded (let's just say at 2k) where the input is constrained and extrapolation will be the only available option.
4k is the current state of the art. It will not be so forever and because it's recorded at 4k, we can't go back and extract more dynamic range due to the limitation of the sensor. Whereas you can go back, redigitize an IMAX film (say Chronos shot in 1985) that is in good condition and get way more info than something shot on 4k yesterday.
TL;DR IMO input still absolutely matters. 35mm is not the upper limit. We went through this with photography and are now doing the same with video/film.
EDIT: After thinking more about it, here's a more extreme example. I purchased a Kodak DC20 back in the 90s (early adopter yay!). Even if the camera had decent glass, there's no way I can go back to an image captured by that camera and magically get the equivalent of a 22MP 5D camera by resampling. If I had used a film camera, I could get a much improved scan.
EDIT2: Here's a good example. Slumdog Millionaire was mostly shot on a SI-2K which recorded at 2k. You can't go back and get 4k output on the digital portions. So generations later, we will be stuck enjoying an Academy Award winning film at that level of quality.
Digital is the future. Hence it behooves us to have the maximal input & output possible at this time. Unfortunately, this is not common now and the price paid is that content created during this period will be stuck at the same quality level.
The cost of renting a Red One and recording straight digital vs hiring a film camera, process lab, and all the other parts needed quite possibly means that some films might never have been produced due to filming costs.
What measure of quality can compare X against X, if it was never made?
I imagine (I have very little actual experience here, so it's perfectly possible I'm wrong) that digital recording might make it easier/cheaper to retake shots/scenes repeatedly to get them right as well, offering another 2nd order quality effect.
I don't think I understand quite what you're saying and wondered if you could explain more. You and the article both say that humans can't hear above about 20kHz. If there are higher frequencies that create a harmonic at a lower frequency (e.g. a 33kHz harmonic that produces a sound at 16.5kHz) then surely that lower harmonic (16.5kHz in this case) will be recorded by the original recording equipment assuming it is recording at a frequency at least twice that of the highest audible frequency (let's say that this would be 48kHz, although there might be other DAC-related reasons to go higher).
I'm possibly being very daft here!
If you take this recording and master it for a CD (44.1kHz), you'll effectively get up to ~20kHz (since there's a low-pass filter starting at around 16-18kHz). This means that only our first frequency will be captured: 15kHz. It will be exactly the same as if you only recorded 15kHz alone. The harmonics don't modify the fundamental frequency, they just trick the human ear. But when they're gone, they have no effect whatsoever.
Hope this helps!
EDIT: the frequency numbers I used are actually somewhat of a bad example. Harmonics are never exactly double, triple the fundamental. Those would be mostly inaudible. But you get the idea.
Or am I wrong, and the ear is able to detect frequencies above 20kHz?
Actually, they are: https://en.wikipedia.org/wiki/Harmonics
An extreme example is present on modern pianos, where the high rigidity of the loud, heavy piano strings can cause tuners to stretch the lowest and highest notes as much as a half-semitone so that their harmonics are in tune with the note the next octave down or up. In other words, the first harmonic on the lowest note of a piano can be as much as 1/2 of a note sharp.
And when your oscillator is no longer one-dimensional, most harmonics aren't even close to integer multiples. The harmonics of bells, cymbals and drums are all over the place. That's what gives them their percussive sound. (Edit: some of these modes of vibration aren't harmonics in the linear sense.)
That is absolutely incorrect. Mathematically speaking, harmonics are by definition "integral multiples of the fundamental." (Fundamentals of Acoustics, Kinsler & Frey).
There's a measure -- inharmonicity -- of how far the actual overtones of a particular instrument differ from their theoretical fundamental multiples.
[I suspect you already know this. This reply is probably for others' benefit]
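For a stiff string, the usual textbook model is f_n = n * f0 * sqrt(1 + B * n^2), where B is the inharmonicity coefficient. A minimal sketch assuming that model; the value B = 0.0004 is only an illustrative number I picked, not a measurement of any particular piano, and the function names are hypothetical.

```python
from math import log2, sqrt


def stiff_string_partial_hz(n: int, ideal_fundamental_hz: float, b: float) -> float:
    """Frequency of partial n for a stiff string with inharmonicity coefficient b."""
    return n * ideal_fundamental_hz * sqrt(1 + b * n * n)


def cents_sharp(n: int, b: float) -> float:
    """How many cents partial n sits above the exact integer multiple n * f0."""
    # 1200 * log2(sqrt(1 + b * n^2)) simplifies to 600 * log2(1 + b * n^2)
    return 600 * log2(1 + b * n * n)


if __name__ == "__main__":
    B_EXAMPLE = 0.0004  # illustrative value only
    for n in (1, 2, 4, 8):
        print(f"partial {n}: {cents_sharp(n, B_EXAMPLE):.1f} cents sharp")
```

With B = 0 every partial is an exact integer multiple, i.e. a harmonic in the strict sense; any positive B pushes the higher partials progressively sharp.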
Anyway, it's not as though mathematical literature requires you to use a term exactly one way. I had a diff eqs textbook that used the word 'harmonic' in exactly the way I used it above when I made reference to diff eqs...
So you had a textbook with a mistake in it. What book was it?
But those aren't harmonics, they're inharmonic partials.
This is the part I really do not understand... either my ear CAN pick up those frequencies, maybe the harmonics are "tickling" the little hairs inside my cochlea and ultimately the frequencies I can actually hear were altered in my perception that way - or I can not hear or sense the harmonics and they physically alter the "original" wave that I end up actually hearing.
Either way, pretty much the exact same thing should happen in a studio microphone. Those all do have frequency limitations and AKG, Royer, Rode, Shure, Sennheiser, Audio Tech, what-have-you pretty much all go up to 15kHz or 20kHz according to specs, if I understand them correctly, but not further than that. If it isn't even recorded, those frequencies I also cannot hear can NOT alter my perception so they HAVE to somehow change the frequencies I can hear and are being recorded... on top of that you are making "room" for frequencies up to, say, 60kHz but I very strongly doubt your mics can go even remotely that high.
"I'm an ex-audio engineer"
Hard to believe.
"The distinct sound of the triangle constitutes of a high fundamental frequency, ballpark 10kHz"
That's a pretty high note - higher than the top key on the piano. But an "audio engineer" would know that.
"many very high-pitch harmonics"
Since the next harmonic after the fundamental would be at 20kHz, which only young people can hear, and none of the others are audible to any human, I don't understand what you are talking about.
"Most of these harmonics are 20kHz."
OK, you don't either.
"it can hear the influence of the very high pitch harmonics on a lower frequency."
The statement that frequencies above 20kHz don't matter rests upon the assumption that the ear is linear. If the ear is not linear (I don't know whether it is or not) then frequencies above 20kHz will matter, as the ear will be able to mix higher frequencies down to less than 20kHz. For example, if we have frequencies of 56kHz and 59kHz, the ear MIGHT be able to discern a difference frequency of 3kHz. No doubt this effect could be reproduced by a signal with a sampling rate of 44.1kHz, but only if the analogue systems, before the sampling stage, reproduce any non-linearity in the human ear.
Incidentally, you can get speakers that create a localised beam of sound, that the person sitting next to you cannot hear. They work by transmitting frequencies above the audible range. These high frequencies can be beamformed by a relatively small speaker array, so the sound is localised. They then rely on the non-linearity of the ear (or maybe the air around the ear?) to mix the ultrasonic frequencies down to audible frequencies. I guess there must be non-linearity in the human auditory system!
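The 56kHz / 59kHz example can be played with numerically. This is only a toy sketch: the "non-linearity" is a made-up y = x + 0.1 * x^2, which is NOT a model of the ear or of air, and the 400kHz simulation rate and all names are my own assumptions. It just shows that a linear system leaves nothing at 3kHz while a quadratic term creates a difference tone there.

```python
import cmath
import math

SAMPLE_RATE_HZ = 400_000  # simulation rate, high enough that nothing here aliases
NUM_SAMPLES = 4_000       # 10 ms, so every tone involved fits a whole number of cycles


def two_tone(f1_hz: float, f2_hz: float) -> list[float]:
    """Sum of two unit-amplitude sine waves."""
    return [
        math.sin(2 * math.pi * f1_hz * n / SAMPLE_RATE_HZ)
        + math.sin(2 * math.pi * f2_hz * n / SAMPLE_RATE_HZ)
        for n in range(NUM_SAMPLES)
    ]


def quadratic_nonlinearity(signal: list[float], strength: float = 0.1) -> list[float]:
    """Toy non-linear system y = x + strength * x^2 (hypothetical, for illustration)."""
    return [x + strength * x * x for x in signal]


def amplitude_at(signal: list[float], frequency_hz: float) -> float:
    """Amplitude of a single frequency component, via a one-bin DFT."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * frequency_hz * n / SAMPLE_RATE_HZ)
        for n, x in enumerate(signal)
    )
    return 2 * abs(acc) / len(signal)


if __name__ == "__main__":
    tones = two_tone(56_000, 59_000)
    print("linear, 3 kHz:    ", round(amplitude_at(tones, 3_000), 6))
    print("non-linear, 3 kHz:", round(amplitude_at(quadratic_nonlinearity(tones), 3_000), 6))
```

The difference tone appears only after the non-linear stage, so whether it matters for playback depends entirely on where (and whether) such a non-linearity actually exists.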
On the subject of 24-bits, my understanding is that 16-bits is adequate, provided the levels (scaling) are set correctly in the recording. What 24-bits delivers is the ability to do a crappy job of the mixing, and still end up with the full dynamic range of the human ear. 24-bits is probably a temporary solution though, as manufacturers will engage in the usual Loudness War, and push the signal to the top of the dynamic range. Before long 24-bit audio will be equivalent to 16-bits (since the 8 least significant bits will be unused) and the next big thing will be 32-bit audio.
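The "levels set correctly" caveat can be put in numbers. A rough sketch, assuming plain linear PCM where each bit is worth 20 * log10(2), about 6.02 dB, and ignoring dither and noise shaping entirely. `effective_bits` is a hypothetical helper name.

```python
from math import log10

DB_PER_BIT = 20 * log10(2)  # about 6.02 dB per bit of linear PCM


def dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one quantization step, in dB."""
    return bits * DB_PER_BIT


def effective_bits(bits: int, peak_dbfs: float) -> float:
    """Bits actually exercised when the loudest peak sits at peak_dbfs (0 or below)."""
    if peak_dbfs > 0:
        raise ValueError("peak_dbfs must be 0 or negative")
    return bits + peak_dbfs / DB_PER_BIT


if __name__ == "__main__":
    print(f"16-bit: {dynamic_range_db(16):.1f} dB, 24-bit: {dynamic_range_db(24):.1f} dB")
    # A recording whose peaks only reach about -24 dBFS wastes roughly four bits.
    print(f"16-bit peaking at -24 dBFS uses about {effective_bits(16, -24.0):.1f} bits")
```

Under these assumptions a sloppy 24-bit recording with lots of unused headroom still keeps more usable resolution than a carefully levelled 16-bit one, which is the trade-off described above.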
Having said all that, I'd guess that the speakers will be the limiting factor in most sound systems, not the recording format.
Yes. And DACs, which normally have filters too.
Here's an interesting article:
Thinking about it, if every person has a different non-linear response, in theory the only way to reproduce sound beyond a certain threshold of fidelity would be to reproduce the ultrasonic components, so each person would hear their own non-linearity. (That would be beyond what I can hear or care about, but it would be fun to play with. Beyond a certain level we also get to the point where we need to ask what it means to hear a sound.)
> the speakers will be the limiting factor
> in most sound systems
Pardon the reductio ad absurdum, but would you prefer to listen to $1,000 speakers in a dry, padded listening room, or to $100,000 speakers in a tile bathroom? Obviously the room matters; I think most people underestimate by how much.
I'd take the bathroom, given that my singing voice sounds less worse there! :-)
I'm not sure what your background in audio is, but everything he says is correct. High end frequencies well past 15k and up (22.05k actually) are widely acknowledged to influence the lower frequencies and play a huge role in the perception of the quality of a recording. This is an old debate with pros and cons on both sides, but in general you'll find the "Golden Ears" mastering engineers (Stephen Marcussen, Bob Ludwig, etc.) come down on the side of higher sampling rates.
Now, if your original recording was mastered to 16/44.1, then a transfer by way of 24/192 will probably actually hurt the recording. But if you're mastering from an original analog or high-quality digital, in my experience there's no question, higher sampling rates deliver better experiences.
I've caught engineers using L1-Ultramaximizer (or similar) to bounce a recording down to 16-bit/44.1kHz as part of the mastering process, and they're always surprised when they're completely unable to hear the difference even in the most simple cover-the-screen-and-toggle-bypass test.
But I know what my ears hear, and IMO there is absolutely a vast difference between 44.1 and 192. I'm not sure how you can even question it. Someone else on the thread was saying it's impossible to hear the difference between 16bit and 24bit. I don't even know what to say to that. It's like telling me the glass of Gallo "Table Red" you're drinking is as good as my '75 Lafite. All I can say is "cheers" and just enjoy.
If I gave you a bottle of "Table Red" with a '75 Lafite label, I'm sure you'd tell me how rich and wonderful it was. The problem here is that, as you said, "I know what my ears hear". You know you're listening to 192, wow it sure sounds great!
If you're listening to quiet music in a quiet room at high volumes on very low noise equipment, you can hear a difference in the noise floor level between dithered 16-bit and 24-bit, but at that volume level if that music (or movie) also has full-amplitude signals you'll be reaching peaks over 110dB SPL.
> I'm not sure how you can even question it.
192 kHz is clearly overkill for listening. Not so for further editing of the data.
Same goes for 16/24 bit, however, the difference between 16 and 24 bit is actually audible.
44100 is not a bad sampling rate, but it necessitates very sharp aliasing filters, which are audibly bad. A bit more headroom is well needed there.
That bit about intermodulation distortion is completely bogus. He talks about problems when resampling high-fs audio data. However, you would never do that. You would digitally process 192kHz all the way. Only your loudspeakers or ears would introduce a high-cut filter, and a rather benign (flat) one at that. There is certainly no aliasing going on there unless you resample (wrongly). Intermodulation distortion is not the fault of the sample rate.
I majored in hearing technology. Calling 192/24 worse than 44.1/16 is total BS. How useful it is is a different debate.
This (widely accepted in the scientific audio community) study's conclusions disagree with your assertion.
>44100 is not a bad sampling rate, but it necessitates very sharp aliasing filters, which are audibly bad.
This is not the 1980s, hardware has progressed beyond that point. Modern (i.e. anything from 1995 onwards) DACs do not suffer from aliasing problems. Also see 
>That bit about intermodulation distortion is complete bogus. He talks about problems when resampling high-fs audio data.
I did not notice that in the article. It talks about IMD in the context of the analog chain and the transducers following the DAC, and it's possible that high frequencies can increase it.
True, but they do so using (long, high-quality) high-cut filters. And these filters are pretty sharp, as they have to close within, say, 18-22.1 kHz. You can design them as linear-phase FIR filters with oversampling and all the good stuff, but physics dictates that sharp filters introduce distortion. A sharp filter like that is audible.
When you're talking about recording, sure, but in terms of storage and playback, we solved that problem 20 years ago with oversampling.
No, the difference is not audible at all. At 16 bits of depth on a normal low-level audio signal (~0.3 volts), we're talking about less than 0.000005 volts per amplitude step. This difference gets lost in the THD already at the DAC in your audio output stage. Then it gets lost again in the amplifier. And again in the cable to your speakers or headphones. And then it gets lost again in the speaker elements. What survives in a normal low-level audio signal is about 14 bits of resolution.
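The step-size figure is easy to check. A rough sketch, assuming the "~0.3 volts" is treated as the full range covered by the 2^16 steps (the comment does not say whether it means peak, RMS, or peak-to-peak, so the function and variable names here are only illustrative):

```python
def step_size_volts(full_scale_volts, bits):
    """Voltage represented by one quantization step."""
    return full_scale_volts / (2 ** bits)

signal_volts = 0.3  # the "~0.3 volts" figure quoted above
for bits in (16, 24):
    print(f"{bits}-bit: {step_size_volts(signal_volts, bits):.3e} V per step")
```

Under that assumption the 16-bit step comes out just under five millionths of a volt, matching the number above.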
44.1khz IS a bad sampling rate for accurately reproducing anything except a triangle wave or square wave above 5khz.
I am not saying that no DAC on the planet can handle five millionths of a volt, but I am saying that five millionths of a volt isn't surviving through the particular DACs and the rest of the electronics used in your PC/living room hi-fi audio equipment.
If it were true that there's no audible difference between 16 and 24 bit, companies like Alesis, Otari, ProTools, etc. wouldn't have spent the last 15 years ditching 16 bit like an old pair of smelly sneakers. (better metaphors welcome).
Seriously, anyone who has sat down in a real listening environment for 5 minutes A/Bing 16 vs 20 bit, 16 vs 24, etc. hears the difference immediately. There's no question. This is why you can buy ADAT 16 bit 'blackfaces' for $100, down from their original $4,000.
It's all marketing, baby!
You've never rented an expensive tube EQ during a mix to cover up 16bit's grating harshness from 10k to 15k. Or tried like mad to make the bass drum sound like a freaking bass drum and not a pie pan slamming against the back of a plastic trash can. And yes, we had good mics and pres, all standard studio stuff. Decent, not brilliant, converters, but it was the 16bit that was the problem. Getting those 20bit XTs for the first time was like walking into the Promised Land.
Sure, there's lots of marketing ploys out there, lots of snake oil. Moving up from 16 bit was not one of them.
The original article explicitly mentions how 24bit is useful for recording.
> Professionals use 24 bit samples in recording and production  for headroom, noise floor, and convenience reasons.
> Modern work flows may involve literally thousands of effects and operations. The quantization noise and noise floor of a 16 bit sample may be undetectable during playback, but multiplying that noise by a few thousand times eventually becomes noticeable. 24 bits keeps the accumulated noise at a very low level. Once the music is ready to distribute, there's no reason to keep more than 16 bits.
The original article does say that yes, during recording and production, 24 bit audio gives you a lot more room to play with. That doesn't mean that you can hear the difference between 16 and 24 bits for the final recording; just that 24 bits give you more room to keep out of trouble during production.
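To illustrate the "accumulated noise" argument, here is a toy simulation, not a model of any real DAW: a sine is pushed through many tiny gain changes and rounded back to the working word length after each one, then compared with an unrounded reference. The gain sequence, the number of operations, and the function name are all made-up assumptions.

```python
import math

def rms_error_after_processing(bits, operations=1000, samples=64):
    """Apply many small gain changes, rounding to `bits` after each one,
    and return the RMS deviation from an unrounded reference (full scale = 1.0)."""
    scale = 2 ** (bits - 1) - 1          # largest positive integer sample value
    total = 0.0
    for n in range(samples):
        ideal = 0.5 * math.sin(2 * math.pi * n / samples)   # reference, kept as float
        stored = round(ideal * scale)                        # integer sample as stored
        for k in range(operations):
            gain = 1.0 + 0.01 * math.sin(k)                  # hypothetical effect: a tiny gain tweak
            ideal *= gain
            stored = round(stored * gain)                    # requantize after every operation
        total += (stored / scale - ideal) ** 2
    return math.sqrt(total / samples)

err16 = rms_error_after_processing(16)
err24 = rms_error_after_processing(24)
print(f"16-bit accumulated error: {err16:.2e}")
print(f"24-bit accumulated error: {err24:.2e}")
```

In this sketch the rounding errors pile up at both word lengths, but the 24-bit path stays far closer to the reference, which is the production-side benefit the article describes.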
>Same goes for 16/24 bit, however, the difference between 16 and 24 bit is actually audible
No, the difference is not audible at all.
Harman requires its trained listeners to pass tests based on this software before participating in juries to evaluate Harman products. It doesn't directly address the sample rate/bit depth issues discussed in the linked article, but it does address a lot of the issues brought up in the HN discussion, so you can have a chance to see how much those characteristics really matter.
You may be surprised.
In any test where a listener can tell two choices apart via any means apart from listening, the results will usually be what the listener expected in advance; this is called confirmation bias and it's similar to the placebo effect. It means people 'hear' differences because of subconscious cues and preferences that have nothing to do with the audio, like preferring a more expensive (or more attractive) amplifier over a cheaper option.
The human brain is designed to notice patterns and differences, even where none exist. This tendency can't just be turned off when a person is asked to make objective decisions; it's completely subconscious. Nor can a bias be defeated by mere skepticism. Controlled experimentation shows that awareness of confirmation bias actually increases rather than decreases the effect!
Doesn't that completely negate his conclusion, that there is no point to distributing 24/192 music? If people want to pay for 24/192, and even he just admitted that they will legitimately enjoy it more, how can you conclude there is no point?
Life is short. I want to enjoy things. Whether or not my enjoyment can be quantified or scientifically defended, I really don't give a shit. But that's okay, if you don't want to sell me 24/192 music, Amazon will. Between this and DRM-free content, it's no wonder I buy all my music from Amazon these days.
Audiophiles are quite a fascinating group. These are people that can be rather rational in some respects (they could be doing research in some lab somewhere) but when it comes to audio equipment they will shell out $2000 for HDMI cables. The salesmen and manufacturers that make these things ("high end" HDMI cables, 192kHz recordings) know this very well and they aggregate around this target set of clients.
I think that is exactly what is happening here. At some point storage capacity is just good enough and one can distribute 48kHz, 16bit audio to everyone. But what do you do next? Everyone is getting that and it is not new and cool anymore. What to do? Well, increase the frequency and sell everyone a newer, better, higher fidelity thing, even though objectively human ears cannot really hear the difference. Subjectively though, there is a huge difference. If you ask someone who just spent $50 for a 192kHz record if they like it better than say a $20 48kHz one, I bet you 100% of people will confirm that 192kHz sounds better and will be ready to go and buy more.
Ultimately, sure. The world is full of products and services which only add value in this weak sense.
If the same wine tastes better if it's priced higher, then it still tastes better. But I think it's only honest that the consumer be aware that the increased utility from being priced higher is due solely to the fact of it being priced higher. Beyond that, I don't care.
One thing we can all agree on is that music is much more enjoyable if you think you're listening to it through good equipment or from a good source. Ultimately it's only the `thinking' part that matters. So I would make two points:
1. One point he's making is that playing audio sampled at 192khz through regular equipment actively distorts the music in negative ways. So now that you know this, you should enjoy that music _less_.
2. If you're adept at metacognition (maybe that's not the right word), you'll realize a) you can get most of the enjoyment by buying equipment that's `pretty decent', and then not worry about it too much. b) you're probably fooling yourself by spending so much time/money worrying about having the best equipment, so you're probably not getting the maximum utility from the experience anyway. Or maybe it's the experience of trying to get the best equipment itself that's enjoyable, not necessarily the increased audio fidelity.
Sorry, no time to reply. I gotta run and write up my biz plan to distribute 32/384 audio.
So, while I have no option (for now) but to acknowledge your position, I still feel dirty for doing so.
why are you arguing against the conclusion of an article that has this many upvotes on HN?
However, one thing that's missing here (and in nearly all other similar pieces) is a full discussion of the prerequisites of the sampling theorem. For example, the signal must be bandwidth-limited (and no finite-time signal can be).
But this is a minor concern, as there are many elements in the analog domain of the recording and playback chains that serve as low-pass filters - starting with the mics. So bandwidth-limiting is effectively achieved.
For a similar reason, I find the discussion of the "harmful" effect of high frequencies on playback electronics and loudspeakers to be a bit overdone, IMO. Peruse the excellent lab results of modern audio gear on Stereophile's web site. You'll find that bandwidths exceeding 30kHz are rare.
One last thing. When doing subjective "testing," keep in mind that what some folks are hearing may be limitations of their gear. For example, most DACs derive their clocks for higher sampling rates (88/96/176/192) by clock-multiplier circuits. IOW, 44kHz and 48kHz are the only ones clocked directly by a crystal. These multiplier circuits are often noisy, contributing to jitter. The audible effect of this jitter is hard to predict.
PS As an avid audiophile, I find the clash of subjectivists and objectivists on this normally-buttoned-down forum to be a bit of a trip.
192 kHz is the sample rate. 192,000 slices per second. It does not refer to the audible sound spectrum.
20 kHz in speakers refers to the cycles per second of the audible waveform. Normal human hearing range is 20 Hz - 20 kHz. For most people, it's less than that.
A speaker can certainly play back music sampled 192,000 times per second. Most of them can't play tones that are higher pitched than 20 kHz, which is fine because mostly only dogs can hear up there anyway.
The fact is, simply distributing music in lossless format carries the vast majority of audible improvements. Arguing over whether or not it's 24-bit or 16-bit or making a chunk of sound last 5.2 microseconds instead of 22.67 seems incredibly stupid to me, because you're better off simply improving the mix itself than fiddling over such microscopic differences. These things only become relevant if your mix and performance and recording equipment (or synths) are absurdly close to perfection. This becomes even LESS relevant in an age of indie-musicians.
Filters are also not perfect (but good oversampling filters are not the weakest link)
Further, even perfectly dithered 16 bit data can't go 20 dB below the quantization floor, unless you give up on frequency response on the high end. Again, this is plain math.
With a calibrated 105 dB low-distortion sound system, in a quiet room, I can hear imperfections from 16 bit, 44 kHz material, especially in soft flutes and triangle type percussion. Of course, D class amplifiers, and MP3 encoding, do worse things to the signal, so let's start there. But 20 bit, 96 kHz (or at least 64 kHz) are scientifically defensible, when analyzing the math and the physics involved. No snake oil needed!
1) Any well-designed system is going to have headroom. Period. Just because 48kHz can theoretically capture the frequencies the human ear can hear, it's always good to have a little wiggle room. This comes into play even more with interactive situations: humans are particularly sensitive to jitter. Having an "overkill" sample rate lets you seamlessly sync things easier without anyone noticing.
2) 192kHz comes with an additional benefit besides higher frequencies: it also means more granular timing for the start and stop of transients. More accurate reverb would be the obvious example. I don't know if the human ear can discern the difference between 0.03ms and 0.005ms but it's something I don't see mentioned often.
2) increased sampling rate does not improve timing. This also has been researched in detail (because it sounds like it could possibly be true given that the ears can phase match to much greater granularity than the sample clock). It was found false in practice, and in retrospect, the sampling theorem explains why. The Griesinger link discusses this with illustrations, and provides a bibliography.
Slides 29-35 address this point.
48kHz already has enough 'wiggle room'. How many people do you personally know that can hear a 24kHz sine tone?
> more granular timing for the start and stop of transients.
... it's something I don't see mentioned often.
Probably because it doesn't make sense. Human ears cannot hear frequencies above 24kHz and Nyquist tells us that 48kHz is enough to completely capture all the detail of a signal at that frequency and below.
> Having an "overkill" sample rate lets you seamlessly
> sync things easier without anyone noticing.
> 192kHz ... also means more granular timing for the
> start and stop of transients.
Said another way, two band limited pulse signals with different onset times, no matter how arbitrarily close, will result in different sampled signals.
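A small sketch of that claim, using a sinc as the band-limited pulse (an idealized choice; the helper names are hypothetical). The pulse is sampled once with its peak exactly on a sample and once with the peak moved by a tenth of a sample period:

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def sample_pulse(delay_in_samples, count=9):
    """Sample a band-limited pulse (a sinc) whose peak sits `delay_in_samples`
    after the centre sample."""
    centre = count // 2
    return [sinc(n - centre - delay_in_samples) for n in range(count)]

on_grid = sample_pulse(0.0)
shifted = sample_pulse(0.1)   # onset moved by a tenth of a sample period
for a, b in zip(on_grid, shifted):
    print(f"{a:+.4f}  {b:+.4f}")
```

The two columns differ at every sample, so the sub-sample shift is encoded in the sample values rather than lost between sampling instants.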
This is true, but different than what I am arguing. You're saying that a listener over time will be able to tell that the two signals differ. I am saying that a listener will be able to determine this at fractional wavelengths.
It's similar to dithering a high dynamic range signal onto a lower bit depth: more than two samples are required for "evidence" of two different signals, while sampling at a high enough rate will tell you this almost instantly.
Again, I don't know if human ears are able to detect this, just that I haven't seen it addressed in these discussions.
As a thought experiment, let's consider a pulse that has been band-limited to 20kHz. Are you arguing that the analog output of a (filtered, idealized) DAC would look different depending on whether the dac was running at 44.1kHz vs 192kHz? If so, I don't think many people would agree with you.
Any difference in the "timing" of the output wave would have to come from energy that falls above nyquist of the slower sample rate. So, while I agree with you that the timing would be sharper, this is exactly caused by "higher frequencies", not by some other sort of timing improvement.
No. I'm arguing this: take a 44.1kHz signal and upsample it to 192. It's the same signal, same bandwidth and everything. Duplicate the stream and add a 1 sample delay to one of the channels. When you hit play, that delay would be there. If you downsampled the signals to 44.1kHz after applying the delay to one of the channels, you would hear almost the same thing. The difference is that you could not detect the difference between the signals until after a few samples. With the 192kHz stream it would be unambiguous after 2.
Remember, Nyquist-Shannon holds if you have an infinite number of samples. If your ears could look into the future then what you say is perfectly correct, but they need time to collect enough samples to identify any timing discrepancies.
but the question for me is how exact that guessing is.
correct me if i'm wrong but, that interpolation happens twice: when recording by the adc and on playback by the dac.
so a lot of that whole discussion (yeah, finally something about acoustics :) depends on how accurate interpolation works in adcs and dacs.
It turns out that if you reproduce a digital signal using stair steps you get an infinite number of harmonics— but _all_ of them are above the nyquist frequency. The frequencies below the nyquist are undisturbed. Then you apply a lowpass filter to the signal to remove these harmonics— after all, we said at the start that the signal was bandlimited— you get the original back unmolested.
Because analog filters are kinda sucky (and because converters with high bit depth aren't very linear), modern ADCs and DACs are oversampling— they internally resample the signal to a few MHz and apply those reconstruction filters digitally with stupidly high precision. Then they only need a very simple analog filter to cope with their much higher frequency sampling.
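The stair-step claim can be checked numerically. A minimal sketch with made-up demo parameters (an 8 kHz sample rate and a 1 kHz tone, chosen only so that every DFT bin lands on a whole number of kHz): hold each sample for 8 output points and look at where the energy ends up.

```python
import cmath
import math

SAMPLE_RATE_KHZ = 8      # assumed sample rate for the demo
TONE_KHZ = 1             # assumed test tone, well below Nyquist (4 kHz)
HOLD = 8                 # each sample is held for 8 output points ("stair steps")

# One full cycle of the sampled tone, then the stair-step version of it.
samples = [math.sin(2 * math.pi * TONE_KHZ * n / SAMPLE_RATE_KHZ)
           for n in range(SAMPLE_RATE_KHZ)]
stairs = [s for s in samples for _ in range(HOLD)]

def bin_magnitude(signal, k):
    """Magnitude of DFT bin k, normalised by the signal length."""
    n_points = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_points)
              for n, x in enumerate(signal))
    return abs(acc) / n_points

# With 64 points covering 1 ms, bin k corresponds to k kHz.
spectrum = [bin_magnitude(stairs, k) for k in range(len(stairs) // 2)]
for k, mag in enumerate(spectrum):
    if mag > 1e-9:
        print(f"{k:2d} kHz: {mag:.3f}")
```

Below the 4 kHz Nyquist frequency only the 1 kHz tone shows up; everything the stair steps add sits at 7 kHz and above. One caveat: the hold also attenuates the in-band tone very slightly, so the frequency content is undisturbed but the level is not bit-exact until that droop is compensated.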
That's the time it takes sound to travel 8mm. Do you think you could tell if an instrument was positioned differently by 8mm?
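The 8mm figure is consistent with one sample period at 44.1 kHz, though that reading is an assumption since the comment being answered is not shown. A quick check, also assuming a speed of sound of 343 m/s:

```python
SPEED_OF_SOUND_M_PER_S = 343.0   # assumed: dry air at about 20 °C

def distance_per_sample_mm(sample_rate_hz):
    """Distance sound travels during one sample period, in millimetres."""
    return SPEED_OF_SOUND_M_PER_S / sample_rate_hz * 1000.0

for rate in (44_100, 48_000, 192_000):
    print(f"{rate} Hz: {1e6 / rate:.2f} us per sample, "
          f"{distance_per_sample_mm(rate):.2f} mm")
```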
http://en.wikipedia.org/wiki/Sound_localization cites http://web.archive.org/web/20100410235208/http://www.cs.ucc.... that suggests the brain is sensitive to timing differences between ears as low as 10 microseconds, or 0.01ms.
Humans are sensitive to jitter, but jitter isn't a major problem with modern digital electronics and reclocking strategies. This ArsT thread hashed out these issues a couple of months ago: http://arstechnica.com/civis/viewtopic.php?f=6&t=1164451...
Is this too unrealistic to expect? Has something like this been tried before?
Plenty of other artists have as well, but this is the most high profile example I can think of. I agree it would be great if it happened more often.
The Beatles multi-tracks are also available (although they were only recorded 4-track so not every instrument always has its own track), and there have been a handful of artists who have released their samples of one song for remix competitions (Daft Punk, Royksopp, Booka Shade).
1. People would use the tracks to create custom remixes which they would then distribute. What happens when a remix becomes more popular than the original track? Artists generally have to pay other artists to remix their songs (usually via royalties).
2. Creativity. When an artist creates something they want you to hear it the way it was intended. Allowing you to remix it however you like takes away a lot of the creative control from the artist.
And another important remark: some artists are flattered when someone asks them to make a remix for their song. (Imagine you're an artist and your idol asks you to make a remix of his song.)
I write music and have considered releasing separate tracks so people can freely remix it but I prefer just having mixes that are controlled by me. Allowing another artist you know to remix your track still allows you some sort of control (you know their style so have some idea of how the remix will go). Giving up that control is a big step and, I think, an unnecessary one.
Why do you need that control? Someone creating something new with your work, doesn't seem to damage your work in any way.
Closest I've found is to take the .mogg files out of the Guitar Hero games and use those to make new mixes. :-)
Of course, you can very easily get just the vocal track by subtracting the two. Sometimes the "non-vocal" track will still include backing vocals or the like in appropriate places, and just pull out the main vocal track.
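The subtraction trick, sketched on toy data. This assumes both versions come from the same mix, are sample-aligned, and have not been lossy-encoded or processed separately; otherwise the difference leaves residue. The function name and sample values are made up for illustration.

```python
def subtract_tracks(full_mix, instrumental):
    """Sample-by-sample difference of two equally long, time-aligned tracks."""
    if len(full_mix) != len(instrumental):
        raise ValueError("tracks must have the same length")
    return [a - b for a, b in zip(full_mix, instrumental)]

# Toy data standing in for decoded audio samples (hypothetical values).
vocal        = [0.00, 0.25, -0.50, 0.125]
instrumental = [0.50, -0.25, 0.25, 0.00]
full_mix     = [v + i for v, i in zip(vocal, instrumental)]

print(subtract_tracks(full_mix, instrumental))
```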
Some musicians have even released every single track of their work separately; see "Desperate Religion" on http://en.wikipedia.org/wiki/Trilogy_(ATB_album) , intentionally inviting remixes.
For music where the vocal tracks aren't released separately, you can often pull them out nevertheless. The best is if you can get the audio in 5.1 -- vocals are almost always center-panned, which makes extracting them quite easy.
And when you have them, just use something like Ableton Live and that will be it. I think that's what you mean, right?
It would be a great idea to have tracks released as several `layers` so that the user can choose which of them to play and which not, for example the bass/beats layer, layer with melodies, layer with the percussions, layer with the vocals of course, but that sounds like semi-studio production.
As pointed out, mastering has vastly greater effect on the audio quality (and is often pretty poor), and is the reason vinyl records often can sound better than their digital counterpart, despite being an inferior technology. The DAC used also has a massive effect on the sound once you get into decent quality equipment.
Like the author, I'd also love to see some expansion of mixed-for-surround music.
 a lot because of loudness wars, as pointed out in the post, but also just due to a lack of time/care/love(/demand?).
 http://www.hydrogenaudio.org/forums/index.php?showtopic=6175... This thread explores the bit-depth of vinyl records, beginning with a claim of a maximum 11-bit resolution-- limited by the width of a PVC molecule the record is made from.
My advice to you younger guys is to keep the windows rolled up while driving. I have no other explanation why my left ear is much worse than my right.
In my own tests I believed that I couldn't tell the difference between 16/44 and 24/96 on high quality loudspeakers, but I could with high quality headphones. The studies cited all seem to use loudspeakers in testing.
Also worth noting, the article states that obtaining 24/96 source material sometimes means you get better mastered material, which still sounds better after down-sampling back to 16/44.
That said, if Apple also allows high quality recordings to be sold, it will be useful. For example, with acapellas, instrumental tracks or samples, it would be convenient for others who want to remix them, and iTunes would be a platform for this trade.
Also for tracks DJs play. Most compression throws away a lot of the bass which people can't hear, but this is bass you can feel rumbling through your guts on a big sound system and is part of the experience.
For the rest, they were happy with low rate AAC files on the early iPods, they are happy with the sound coming from their crappy little iPod dock, for them it won't make a difference as long as it's a chart music track from a memorable and impressionable time of their life.
Given a 5 minute song, if I have the choice to download an 11MB file (320kbps MP3) or a 330MB file (24/192) I would of course choose the 11MB file. The sound quality is perfectly acceptable and the file size much more convenient to manage (storage, backups, etc.).
In terms of the convenience of managing the file size and sound quality I think 320kbps MP3 is the best compromise.
Here's a file size comparison of a 5 minute stereo song:
MP3 128kbps: 5 MB
MP3 320kbps: 11 MB
Uncompressed 16/44: 50 MB
Uncompressed 24/192: 330 MB
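These figures can be reproduced with a few lines of arithmetic. A sketch assuming constant-bitrate MP3, 44.1 kHz for "16/44", no container overhead, and sizes reported in binary megabytes (which is what the rounded values match); the function names are illustrative.

```python
def pcm_size_bytes(seconds, sample_rate_hz, bits, channels=2):
    """Size of uncompressed PCM audio, ignoring container headers."""
    return seconds * sample_rate_hz * (bits // 8) * channels

def cbr_size_bytes(seconds, kbps):
    """Size of a constant-bitrate stream (1 kbps = 1000 bits per second)."""
    return seconds * kbps * 1000 // 8

SECONDS = 5 * 60
MIB = 1024 * 1024   # the "MB" values in the list match binary megabytes

rows = [
    ("MP3 128kbps", cbr_size_bytes(SECONDS, 128)),
    ("MP3 320kbps", cbr_size_bytes(SECONDS, 320)),
    ("Uncompressed 16/44", pcm_size_bytes(SECONDS, 44_100, 16)),
    ("Uncompressed 24/192", pcm_size_bytes(SECONDS, 192_000, 24)),
]
for name, size in rows:
    print(f"{name}: {size / MIB:.1f} MiB")
```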
When talking about sound quality there is a much more relevant issue: the amplitude compression (distortion) abuse used by mastering engineers and producers that totally destroys the dynamic and life of the sound. That is a real issue. When buying a song there should be two versions to choose from:
A) "Loud", dynamically destroyed / distorted version.
B) Normal, dynamic, non-distorted version.
Today only version A is available to buy.
For a producer and manufacturer the rational approach would be to cater to that craziness and extract as much money from it as possible. In other words if you are selling HDMI cables, spend $2/cable to make it, then sell most for $5 and then re-brand some and sell for $500. It only takes 1 out of 100 people buying that to make the same profit. You know these people are obsessed and irrational so you cater to that. And that's basically how we end up with ridiculously overpriced Monster cables and recordings distributed to customers @ 192kHz.
Footnote, you don't have to have a >$10,000 setup to benefit from higher quality tracks (compared to the downloads that sometimes have 'questionable' quality). I have two systems, a full range stereo (front left and right) setup for nearfield listening at my desk that's +/- 1 dB from 50 Hz to 20 kHz. The other is a stereo setup in my media room; 2 way quarter wave transmission line, +/- 3 dB from 40 Hz to 20 kHz. The point is, there are a lot of people with less than $1200 in audio gear that still want lossless tracks made available. Who cares if the human ear can't discern much of the extra information, we still want it.
Then we started recording other people. I became obsessed with gear, software and all the associated toys that go with any technical pursuit. I'm a programmer, so it's easy to understand how that happens but I totally lost sight of the music, spent way too much money on equipment that was nowhere near being required and generally lost the plot. I was tracking everything 24-bit/96kHz and bemoaning the loss of quality when I mixed down for CD.
Anyway, the TL;DR version of what followed was that we recorded quite a bit, lost interest in making our own music and then the whole adventure came to an end. Now my gear is leaving via eBay and I'm finding my way back to just playing guitar and trying to write good music.
24-bit/192kHz - pointless. Give me a small venue and a guy with an acoustic guitar any day.
> The FLAC file is also smaller than the WAV, and so a random corruption would be less likely because there's less data that could be affected.
At the same time, if you flip a bit on a WAV file, you may hear a "pop" sound. On a FLAC file, the whole encoding block may be inaudible (or worse).
The basilar membrane is a loosely tuned resonator. The hair cells placed on it fire beginning on the positive zero crossing. So, to a first approximation, the ear is in fact a filterbank.
There is a time domain component in that the cochlear nucleus contains nerve cells that watch multiple hair cells at a time and correlate the firing in several different ways. Some attempt to discriminate pitch, some convolve and correlate in-phase firing energy, some look for tones to end, etc. This information is then forwarded on to the brain.
However, getting back to your point, no hair cells will fire if the basilar membrane doesn't move, and it's tuned to a frequency range.
Further, I can hear a difference between 44.1kHz and 96kHz. Whether you can hear that difference is up to you. (The word-length is a red herring - there's no new information contained in a 24-bit recording vs 16.)
IMO anything less than flac and you're missing something. Higher sampling frequencies do add to the sound, but in a way that is almost invisible to the untrained ear. Perhaps these should be distributed at a premium the way SACDs and similar "audiophile" formats were in the past?
The key to reproducing the original signal from the digital signal is a low-pass filter that rejects everything above the sampling rate, correct?
That is to say, what I am getting at is while the original signal can be reproduced, it requires properly tuned, and probably reasonably high performance, hardware to remove the higher frequency components of that square wave. Can you count on consumer grade hardware to do this well?
Typically the technique used inside a DAC is to digitally upsample the signal (by duplicating samples, often to a few MHz, which also allows the use of a low bit-depth DAC) and then apply a very sharp "perfect" digital filter to cut it right to the proper passband (half the sampling rate). The analog output then contains only a tiny amount of ultrasonic aliasing which is so far out that it's easily rolled off by simple induction in the output.
This isn't just theory. Here is a wav file I made at a 1kHz sampling rate, where every other sample is -.25/.25: http://people.xiph.org/~greg/1khz-sampled.wav (so a 500Hz tone, the highest you can represent with 1kHz sampling).
Feed that file to a boring resampler (I used SSRC, but anything should give roughly the same result, at least when not quite so ridiculously close to nyquist; most will attenuate near-nyquist data extensively) and you get this: http://people.xiph.org/~greg/1khz-sampled-to-48khz.wav
Here are the two signals plotted against each other:
As you can see— the 500Hz sinewave is reconstructed perfectly. (Of course, a 500Hz square wave would not be (you'd get a sinewave out) but this is because a 500Hz square wave contains energy far beyond the nyquist of 1kHz sampling).
Here is a spectrograph of the same signal http://people.xiph.org/~greg/1khz-to-48khz-spec.png showing that the tone is indeed pure (the faint background noise is the dither the resampler applies when requantizing its high precision intermediate format back to 16 bits).
If this is the case, then all of the arguments in the world about the maximum audible single frequency are irrelevant. Imagine music composed entirely of these beat frequencies and performed with a pair of oscillators between 25kHz and 35kHz. Without higher resolution encoding, it would be audible IRL but the recording would be silence.
So you'd be right if your mics were head spaced and in the venue. But you'd still have secondary data, with the original lost.
By that standard, the original is always lost unless you have a completely holographic recording. 192kHz doesn't help with that problem at all.
The tone you get from an acoustic beat is not a real tone; it's a perceptual quirk that requires you to be able to hear the tones in the first place.
EDIT: it turns out Audacity won't generate a tone above 20kHz (the UI accepts the value, but when you reopen it the value has been rounded down), so both of my generated tones were actually 20kHz.
EDIT: You can generate higher than 20kHz by increasing the pitch of a tone lower than 20kHz. Upon doing this, I could hear 24kHz and 26kHz together.
I chuckled because this is so true, and yet tell that to the people who buy oxygen free copper 'monster' cables for their speakers, being careful to align the arrows with the direction of the music from the amplifier to the speaker. People, even otherwise reasonable people, will swear up and down they can hear the difference.
People have suggested this. It's been tested in rigorous double blind tests— involving both real music signals as well as special test tones (Linked from the article). The tests were unable to show that people could hear the ultrasonics. Moreover, there isn't any physiological basis to expect people to be able to. You can't expect a stronger result than that.
Common 48KHz audio already goes a bit beyond what adults are known to be able to hear, so you've already got some headroom for "but what if a few people hear better than anyone the researchers have been able to find!".
Another tangent: To me it seems audio engineering should fix the "woofer". That is it seems subwoofers have terrible distortion.
The move from a low end ($150-300) DAC to one much more expensive will be considerably less drastic, and likely won't matter until you've dropped at least $5k in to the rest of your system.
That said, you may already own a DAC without realising it...as long as you're taking the signal out _digitally_ (e.g. S/PDIF or digital coax) to an external receiver, you're already in a pretty decent place.
But with the high end ones that are 24 bit 192khz and cost $1k (Cambridge Audio DacMagic comes to mind), I seriously doubt I'm going to hear it. I really only hear the DAC difference (compared to my laptop and FIIO) when I use headphones.
(yes, the ear-brain system is non-linear too— but apparently it filters out the ultrasonics before they do anything measurable in this regard)
Hotblack Desiato, is that you?
This is a great article but I'm still not convinced people cannot have a sensation of sound outside their hearing range.
I used to be able to see it when I was a kid (it looked very faintly red), but I just tried it and couldn't see it at all. That's actually a little bit disturbing.
These days, the various IR communication protocols have been standardized and virtually all use 920nm, 940nm or 980nm emitters, all of which will be invisible. I mentioned the Apple IR remote specifically because it's a remote most people reading TFA will have, and it's known to be a 980nm emitter.
44.1khz gives too much aliasing distortion, but 192khz is quite the overkill. Ideally, digital audio could sit on 16 bits of depth sampled at 96khz.
The signal reproduced from your 44.1kHz sampled digital input is not a stair-step like some broken waveform editor might display. On output it goes through a matched reconstruction filter (which may, in fact, be digital and involve an oversampled DAC, or it could be analog, though those are harder to build without compromise). After the reconstruction filter the output is _EXACT_, assuming the input only contained energy below the Nyquist frequency (well, and was sufficiently far away from the reconstruction lowpass).
So even a 5khz sine wave is reproduced perfectly with 44.1kHz sampling.
These discussions of audio standards always get sidetracked by people who don't understand or believe this result. (Have to admit, the result is surprising).
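For anyone who wants to poke at the reconstruction claim numerically, here is a minimal sketch. It assumes an ideal sinc reconstruction filter (Whittaker-Shannon interpolation) truncated to a finite number of samples, so the result is very close to the original rather than mathematically exact; a real DAC uses a practical filter instead. The names `HALF`, `reconstruct`, `stair_step` and `worst_error` are made up for this illustration.

```python
import math

FS = 44_100.0   # sample rate in Hz
F0 = 5_000.0    # test tone in Hz, well below the 22.05 kHz Nyquist frequency
HALF = 2_000    # samples kept on each side of zero (finite stand-in for an infinite sum)


def tone_at(x):
    """The original continuous signal, with x measured in sample periods."""
    return math.sin(2.0 * math.pi * F0 * x / FS)


def sinc(x):
    """Normalized sinc, the ideal reconstruction kernel."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


# The only data the "DAC" gets: the tone evaluated at integer sample positions.
SAMPLES = [(n, tone_at(n)) for n in range(-HALF, HALF + 1)]


def reconstruct(x):
    """Whittaker-Shannon interpolation from the stored samples."""
    return sum(value * sinc(x - n) for n, value in SAMPLES)


def stair_step(x):
    """Zero-order hold: the picture a naive waveform editor draws."""
    return tone_at(math.floor(x))


def worst_error(model, points_per_sample=8, span=4):
    """Largest deviation from the true tone, probed between sample instants."""
    worst = 0.0
    for k in range(points_per_sample * span + 1):
        x = k / points_per_sample
        worst = max(worst, abs(model(x) - tone_at(x)))
    return worst


print("sinc reconstruction error:", worst_error(reconstruct))
print("stair-step error:         ", worst_error(stair_step))
```

The sinc error comes only from cutting the sum off at `HALF` samples per side, and it shrinks as `HALF` grows. The stair-step error does not shrink, because the stair-step is not what comes out of a properly filtered DAC in the first place.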
I think there may be problems with the argument in TFA, which is based exclusively on standard linear systems theory.
Of course, the ear and some of its perceptual components may be significantly nonlinear, and thus not covered by the frequency response graphs of TFA.
These graphs assume linear systems, in which you put two frequencies in, and the same frequencies pop out in scaled form. Nonlinear systems can produce new frequencies in response, and this possibility is not discussed in TFA. Probably these effects are quite minor, but may be audible to some listeners on some equipment for some choices of source material.
The TFA does at least make this the-proof-is-in-the-pudding point somewhere in its depths. :)
16 bits is very limiting for music with lots of dynamics (i.e. classical). Very quiet sounds sound quite bad at 16 bits, but since most pop music has about 6-12 dB of dynamic range, it doesn't make much of a difference.
I always thought the sweet spot would be 96-24. But the truth is, the market wants smaller and portable digital files, not higher quality music. Anything MP3 encoded will sound significantly worse than a CD anyways.
Many things are mastered poorly— recording engineers crushing the dynamics in order to get the loudest possible signals— mostly a problem for pop music, but nothing is immune.
It's been observed that the various 'higher-definition' recordings have less brutal mastering— no doubt owing to the different audience they are marketed to. But this isn't a property of 24-bit vs 16-bit distribution.
People actually _believe_ the 20KHz argument that anything above is inaudible. That's hogwash. I know because I can hear (or sense) higher frequencies, and I do not have the absolute best ears I've ever "met."
For example, last week I attended an A/V equipment event with very high-end equipment. It was packed --- over 600 people for one evening. 6 rooms of equipment. I'm sure all six served the same fare according to the 20-20KHz argument of this piece, yet they all sounded quite (or even extremely) different.
The 20 KHz argument is a myth. For people who can't hear the difference, no problem. But please do refrain from ruining or hobbling music for the rest of us... who can hear a wider frequency range.
Yes, some people are color blind. Does that mean the rest of us shouldn't use color? I hope not.
Music is an important wholesome and potentially emotional part of human life. Please do not cap it with "false optimizations".
24-bit/192 KHz is not inferior to CD quality sound. If you don't believe me, try a Linn system sourced on a Klimax DS with some high bitrate Linn classical music (or the Beatles Masters USB release!). If you can't hear the difference compared to low bit-rate (including CD quality) material, I assure you someone can. The low bit-rate will sound flat, hollow, less lively, and/or more coarse. Any number of problems exhibit at inadequate bit levels.
Vinyl is analogue quality (no discrete digital distortion). CD quality is a large step down from vinyl. A/V is just trying to get vinyl like quality from digital. We don't need nay-sayers impeding progress. If you can't hear the difference, please let someone who can hear make the informed decisions.
That's the problem with the theoretical science. When it's false, it's false. Come up with a new hypothesis; this one's false as it pertains to human hearing. There's information theory, and then there's auditory reality. Reality confounds the theory as applied to hearing. I don't know where the fault lies, and I don't really care.
But it's really annoying and frustrating having people nix progress out of idealistic theory, "laboratory" studies, and ignorance. The experiments (my experiences and numerous others) don't lie.
Double-blind is great, but I can already tell the differences between all six rooms of equipment from last week. One of the rooms was so extreme, I wanted to run out of the room due to discomfort (but I was polite and stayed all 30 minutes). In other words, double-blind was unnecessary. Someone whose ears I respect a great deal, loved that room. Even golden ears don't all hear the same. But I don't need double-blind to confirm trivial experience. The proof is already in the listening.
Because it's not a blind study. In audio, claiming something sounds better than something else is low-strength evidence, because it doesn't: 1) distinguish psychological bias (which is very strong in this area) from actual audible results; or 2) distinguish which characteristics of speakers, if any, you may be hearing.
If you can consistently ABX two speakers that have similar characteristics except that one reproduces frequencies over 20 kHz while the other doesn't (with identical performance below 20 kHz), I'd be convinced. One possibility is to use the same speaker but insert a high-quality 20 kHz lowpass in the chain during part of the test; or use the same speaker but with 44 kHz versus 96 kHz source material. I've never seen a controlled, blind case where a human can tell the difference there.
As for the double-blind and high frequencies, I believe I've already done the test. I have had my hearing tested several times. One of them, I recall the tester actually asked me to repeat some tests... it was funny. The testing was at very high frequencies. I believe she thought I was guessing the higher/lower frequencies... and getting lucky. So (I strongly suspect) she wanted to "prove" to herself what you want to prove --- that no one can hear above 20KHz. I disappointed her. I think she even threw in some placebo tests (no frequencies at all). It was funny. She never explained herself. I suspect she just thought I got lucky again.
How to really test this stuff? Get one of the audio designers to test... but they will laugh in the testers' faces. They do this stuff for a living... to build real products... for real live customers who can hear the differences. Dave Wilson was at the A/V show. Try listening to a pair of Wilson Audio speakers. I bet he can hear better than just about anyone... His speakers (when sourced and driven properly) are that good. But he wouldn't waste his time on such tests. He has customers to serve and a business to run.
I doubt lab experiments look to disprove their theories once and for all. That's a social prejudice built into the lab experiments. Fix that, and you'll end up with a better hypothesis.
• 24-bit audio is magical. When I recorded myself playing guitar in 24-bit and played it back through my amp, it sounded like I was still playing. 16-bit sounded like a CD.
• With MP3s, 192 kbps is a huge step up from 128 kbps. 192 doesn't exhibit any of the "swooshiness" heard in the upper range of 128 kbps MP3s for regular rock/pop/hiphop music.
*not the study I was referring to but it's along the same lines: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5291...
I've never been able to enjoy listening to my favorite classical music on headphones or even smaller speakers, and it's largely because of the effect you describe.
At this point I'm resigned to preserving my treasured (and cumbersome) vinyl collections. Maybe if Apple comes up with some snazzy marketing term (e.g. "Retina") for 24/192 or even 24/96, and starts distributing it on iTunes, things might start to change.
Specifically because CD-quality 16/44 audio has midrange distortion present during complex passages that is completely eliminated and non-present in 24/96 sources.
Listen to "Us and Them" off a 16/44 CD version of the Pink Floyd album Dark Side of the Moon. When it kicks into the chorus, it becomes totally distorted and everything in the midrange bleeds into each other. It's a mess.
Then, try listening to the 24/96 Immersion box set copy or a vinyl-sourced 24/96 rip and you'll find it's gone. When the song gets complex and loud, everything remains totally clear, each instrument stands on its own, it doesn't become an awful distorted jumble.
You could argue that it's just the quality of the master that makes the difference; but if you take a copy of the original transcoded to 16/44 and compare it again with the 24/96 copy you can hear the same effect.
Why would anyone argue against high-resolution audio anyway? Sure, most everyone will probably just continue downloading 16/44 MP3s, but at least give us the option to have 24bit FLACs of the stuff we really like. Please and thank you.
"but if you take a copy of the original transcoded to 16/44 and compare it again with the 24/96 copy you can hear the same effect."
I could believe that, but do you mean to do the transcoding yourself? In this case you become the engineer, and the tools you use and all that become vital as well.
Having heard stunningly awesome CDs of DSOTM on a homebuilt Heathkit amp and some old speakers and not believing my ears when I saw what the setup was, I'm skeptical... can't help it.
Human hearing is limited to 20k because frequencies higher than that are perceived as painful? Don't agree with that one.
24 bit doesn't offer any advantages to sound quality? Sheesh.
And the crux of the argument is intermodulation distortion increases when you try to represent more frequencies? Isn't that an argument for a faster power amp?
Yeah, that's a silly one. I disagree with it, too. It's a good thing it appears nowhere in the fine article. Are you actually confused about the difference between frequency and amplitude? Or did you misread the article?
"24 bit doesn't offer any advantages to sound quality? Sheesh."
As brazzy rightly points out, "Sheesh" isn't a reasoned statement. It's an ejaculation. And, it turns out, the author talked about why sound engineers record with 24 bits; it has to do with pragmatic reasons about leaving room for the highest and lowest frequencies in the audio being recorded without clipping, as well as with the author's discussion of Nyquist considerations in the distributed product.
Your post is wrong in so many ways that would have been easily fixed by reading the linked article with even 8th-grade reading skills that the reasonable reader has to wonder if you're being deliberately obtuse. Are you?
You misread the article. It's because there is so little response that being able to hear it would blow your eardrums (and even then, it might still be beyond your ability to hear it). There's no value in that.
> 24 bit doesn't offer any advantages to sound quality? Sheesh.
Not quite what TFA says. According to the article, 16 bits effectively covers the dynamic range of human hearing, so more than that is pointless for music consumed by human beings (hence all the stuff about 24bit being a good idea for mastering & production). If you're storing integers in the 0~16384 range, going from 16 bit integers to 32 bit ones is not going to give you "better ints", it's just going to waste 2 bytes per int. Same thing here.
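To put rough numbers on the bit-depth argument, here is a small sketch using the usual rule of thumb that each bit adds about 6 dB, i.e. the ratio between full scale and one quantization step. It deliberately ignores dither and noise shaping, so treat it as back-of-the-envelope arithmetic rather than a statement about any particular converter. The function name `dynamic_range_db` is made up for this example.

```python
import math


def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in decibels."""
    return 20.0 * math.log10(2 ** bits)


for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
# 16-bit: 96.3 dB
# 24-bit: 144.5 dB
```

The extra 48 dB or so from 24 bits is the headroom that is useful while recording and mixing; whether any of it is audible in a finished release is the question the thread is arguing about.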
24 bit is also extremely easy to hear. Arguably more important during the recording phase when headroom is valuable.
It's just as easy to qualify everything with "placebo effect" as it is to be dismissive.
If you test this and hear something, it's almost certainly because the ultrasonic signal is being distorted by your amplifier and speakers and you're hearing distortion products that are ending up at frequencies you can hear.
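A toy illustration of how that can happen, under loud assumptions: the "amplifier" here is just `y = x + K2 * x**2`, a hypothetical second-order nonlinearity with a made-up coefficient, not a model of any real amp or speaker. Two ultrasonic tones at 24 kHz and 26 kHz go in; since `2*sin(a)*sin(b) = cos(a-b) - cos(a+b)`, the squared term creates a difference tone at 2 kHz, squarely in the audible band.

```python
import cmath
import math

FS = 192_000              # sample rate in Hz, high enough to hold every product below
N = 19_200                # 0.1 s of signal, so every tone lands on an exact 10 Hz bin
F1, F2 = 24_000, 26_000   # two ultrasonic tones
K2 = 0.1                  # hypothetical second-order distortion coefficient


def two_tones(n):
    """Sum of the two ultrasonic sines at sample index n."""
    t = n / FS
    return math.sin(2 * math.pi * F1 * t) + math.sin(2 * math.pi * F2 * t)


def amplitude_at(signal, freq):
    """Amplitude of the component at `freq`, via a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, x in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)


clean = [two_tones(n) for n in range(N)]
distorted = [x + K2 * x * x for x in clean]   # toy nonlinear playback chain

print("2 kHz content, linear chain:   ", amplitude_at(clean, 2_000))
print("2 kHz content, nonlinear chain:", amplitude_at(distorted, 2_000))
```

In the linear case the 2 kHz bin is empty up to rounding error; in the nonlinear case it has amplitude `K2`. So hearing "something" when ultrasonic tones are played does not by itself show that the ultrasonics were heard.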
Ideally audio engineers would take the effort to do good 16-bit conversion for distribution, but I realize that's too much to expect of them.