
Mastered for iTunes: how audio engineers tweak tunes for the iPod age - shawndumas
http://arstechnica.com/apple/news/2012/02/mastered-for-itunes-how-audio-engineers-tweak-tunes-for-the-ipod-age.ars
======
esonderegger
When I saw this headline on HN I thought to myself "Wow, Apple is finally
implementing ReplayGain in iTunes and on iDevices?". Sadly, that is not the
case. ReplayGain could do more for the quality of music on iTunes than any
increase in bit depth or sample rate.

A quick word on bit depth: 16 bit audio gives us a potential signal to noise
ratio of 96dB, which is plenty. The reason why we record in 24 bits is for
increased headroom. If I make a 24 bit recording (potential S/N ratio of
144dB, but 124dB with current converters), but my loudest peak is at -18dB, I
still have a S/N ratio of 106dB. I can then bring it into a DAW, give it 18dB
of digital gain, and the S/N ratio of the 16-bit output file will still be
96dB.
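The headroom arithmetic above can be sketched in a few lines. This is a rough model, not an exact dither calculation; the function name is mine, and the 124 dB converter figure comes from the comment:

```python
import math

def snr_16bit_after_gain(peak_dbfs, converter_snr_db=124.0):
    """Recording with peaks at `peak_dbfs` costs that much SNR at capture;
    make-up gain in the DAW restores peaks to 0 dBFS, and the 16-bit
    delivery file caps the result at roughly 96 dB either way."""
    snr_at_capture = converter_snr_db + peak_dbfs   # e.g. 124 - 18 = 106 dB
    snr_16bit_cap = 20 * math.log10(2 ** 16)        # ~96.3 dB for 16 bits
    return min(snr_at_capture, snr_16bit_cap)
```

So peaks at -18 dBFS on a 124 dB converter still leave more SNR than a 16-bit file can carry, which is the point being made.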

Having a higher-fidelity acquisition format than delivery format is not unique
to audio. It's why photographers may shoot RAW but output to JPEG or TIFF, and
why HD video is edited at 145 Mbit/s but delivered on Blu-ray at 40 Mbit/s. It
allows for some tweaking in post-production without sacrificing the potential
of the delivery format.

As for bit rates, I think most of the negative perceptions of digital audio
date back to when iTunes's default encoding was 128 kbps and ADC technology
was still maturing, plus a knee-jerk reaction to lossy compression in general.
When I make classical recordings available for web release I use LAME at the
V2 setting. Obviously, what bit rate is "good enough" is dependent on
program material, but for me that's a reasonably small file size where I don't
hear compression artifacts. I know they take some crap for it, but I think
Apple choosing 256kbps VBR AAC to be their iTunes plus setting was a good
choice.

I don't attempt to fully understand the business implications of these
decisions, but "mastered for iTunes" appears to be more gimmick than
substance. It may be that Apple is holding on to the high resolution master
files for future ALAC release. The engineers quoted in the article talking
about having to compensate for AAC's losses are almost certainly talking about
128kbps. A great-sounding recording mastered at 16-bit/44.1kHz will still
sound great when properly encoded.

Also, Apple's "Mastered for iTunes" technology brief seems to be written with
hobbyist engineers in mind. I can't imagine any competent mastering engineer
finding it useful. Just further evidence that audio mastering as a craft is on
its way out.

~~~
chrisbolt
_When I saw this headline on HN I thought to myself "Wow, Apple is finally
implementing ReplayGain in iTunes and on iDevices?"._

Isn't that what Apple calls 'Sound Check'?

~~~
jcurbo
Apparently it does the same thing, but doesn't read the same tag field, which
explains why I was getting volume differences on my iPhone even though I had
Sound Check turned on (and ReplayGain'd files, made via conversion in
foobar2000 from FLAC to AAC using the Nero encoder - need to investigate more).

<http://www.vdberg.org/~richard/rg2sc.html>
<http://en.wikipedia.org/wiki/ReplayGain>
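The gain value a ReplayGain-aware player applies is just a dB offset stored in the tag; turning it into a per-sample multiplier is one formula. A minimal sketch (function names are mine):

```python
def replaygain_scale(track_gain_db, preamp_db=0.0):
    """Convert a ReplayGain tag value (in dB) into the linear factor a
    player multiplies each sample by. Negative values turn tracks down."""
    return 10 ** ((track_gain_db + preamp_db) / 20)
```

A tag of -6.02 dB, for instance, halves the sample values.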

------
leeoniya
having done a decent number of mp3/vorbis listening tests, i've found the
curve of diminishing returns in audio quality is very steep... as is the price
of the equipment and the hearing ability needed to detect the difference. 85%
of users will be unable to discern a well-encoded 192kbps VBR from 320kbps CBR
from the original (no matter how high the quality).

a lot of the difference which CAN be noticed results from bad encoders and
generic settings - not strictly related to compression but to other aspects of
the psychoacoustic model. classical music encodes better with one set of
params, heavy metal with another. there were times when 160kbps was
transparent for me against the original, using hi-fi gear, DACs and
headphones. on recording, 24 bit makes a difference vs 16 bit; on output, not
so much with today's nil-dynamic-range records.

telling the difference often requires constantly comparing to the original
master where certain aspects sound just slightly different (not necessarily
worse), and do not justify a 3x increase in compressed size to get perfect.

the biggest disappointment with a lot of lossy music is bad encoders, encoder
settings, poorly (re)mastered originals and non-existent dynamic range at the
source rather than a limited quality distribution format.

~~~
panacea
Do you have any links to 'mp3/vorbis listening tests', or were you rolling
your own from your local media?

The reason I ask is because I bought the WAV version of the most recent
Radiohead album and tried to blindly discern a difference between them and the
MP3s and failed. I've been to plenty of loud concerts, so probably have
degraded hearing, but I'd love to try some more tests.

~~~
leeoniya
unfortunately it's been quite some time. most of the tests done were to
compare different encoders and settings for different types of music. some i
did encode myself from CDs. search around online, it might take a bit of
digging, but there's some stuff around still.

here's a quicky:
<http://www.noiseaddicts.com/2009/03/mp3-sound-quality-test-128-320/>

i was surprised that more than 50% got this one wrong. i got it correct even
on my single shitty LCD-attached speaker at work at safe-for-work volume
levels. the difference between 128 and 320 can be discerned pretty much 97% of
the time in all but the quietest, gentlest of music.

128 sounds flat, like you're listening through a wall, but unless you have the
higher quality version for reference, even this can be hard to tell. kind of
like "perfect pitch" <http://en.wikipedia.org/wiki/Absolute_pitch>

~~~
panacea
Thanks! I guessed wrong for your link.

------
jrmg
_"It was my quest to make the AAC files sound as close to the CD as
possible..."_

 _"I can see that it has the potential for making the AAC encoded masters
sound truer to the CD and LP versions..."_

This is nonsense. If encoding the same data that's on the CD doesn't produce
the AAC file that sounds the closest possible to how the CD sounds, it would
obviously be the encoder that needs to be fixed, not the source data that
needs to change.

------
te_chris
I refuse to buy music off iTunes and generally end up having to pirate
artists' music if it's not available to buy in a lossless format.

I don't understand how bandcamp can offer me a choice of whatever format I
want, yet apple still expects me to pay the cost of a CD for an inferior
substitute. The galling bit will come when they do start selling ALAC:
they'll probably charge more for it, at which point I'll still refuse to pay
for anything from iTunes.

EDIT: Downvoted why? Because I buy music off places where it's offered in good
quality and not from places where it's not? This is not a hard problem to
solve, hell bandcamp solved it ages ago - as did what.cd....

~~~
unimpressive
Minus the pirating bit, this; _a thousand times this!_

This is one of the reasons I'm actually sad to see the CD go. When producers
mastered for CD, you got the best quality they could produce or that the CD
could hold (whichever limit came first). Now that the CD is fading into
irrelevance, there's the worry that digital distribution will make the average
quality of a track _worse_ instead of what it should be: better than CD.
(Though at this point I'd take parity with CD quality.) Yeah, I get it,
lossless tracks take up a lot of storage. They should still be an _option_ to
purchase though. Lossless is the default state for recordings anyway.

I don't even have the equipment required to hear the difference. It just
doesn't sit right with me to support a business that blatantly panders to the
LCD at the expense of everyone else.

"Mastered for iTunes" just reaffirms this for me.

~~~
ugh
I can’t tell the difference. I really can’t. Why do you think you would be
able to?

~~~
georgieporgie
How much have you tried? At lower bitrates (160 and under, mp3) I wouldn't
notice a sound quality issue so much as I noticed that I consistently became
fatigued faster than when I listened to CDs. I honestly don't know if the same
holds true of higher bitrates, since by the time I started encoding those, I
was listening less and the convenience outweighed potential irritation.

------
amitparikh
For reference, here's the Wikipedia article for the Loudness War
(<http://en.wikipedia.org/wiki/Loudness_war>). In my opinion, hip hop
producers like Timbaland and Dr. Dre really ratcheted up the effective
loudness of songs by introducing heavy compression.

It's the (sad) reason why the Beatles' albums have been "re-mastered" and re-
released so many times over the years -- our 21st century ears are so _used_
to compressed music that old Beatles albums sound too quiet to us.

~~~
derefr
> our 21st century ears are so used to compressed music that old Beatles
> albums sound too quiet to us.

I wouldn't mind so much that an album is quiet, if any MP3 player actually
_allowed_ me to turn it up to a reasonable level (letting me sacrifice the
listening experience on my own terms, when and if applicable.) Why is VLC the
only* piece of software to combine volume level and compression-gain into a
single output slider?

* Amusingly, the iPod's software _knows how_ to apply compression just-in-time--but it only does it if you've applied a "volume adjustment" pragma to the song in iTunes. And then whenever the song comes up, you have to turn down your volume slider from 80% to 30% to avoid having your ears fall off, because "loud" on desktop speakers is a very different thing than "loud" on in-ear headphones.
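The single-slider behavior being described can be sketched very simply. This is a toy model of that idea, not VLC's or Apple's actual implementation (all names are hypothetical): below 100% the slider is plain gain, and above it the extra drive is soft-limited so boosted quiet material doesn't hard-clip:

```python
import math

def play_gain(sample, slider):
    """One combined loudness control for a sample in [-1, 1].
    slider <= 1.0: ordinary linear volume control.
    slider  > 1.0: extra gain, tamed by a tanh soft limiter so the
    output never exceeds full scale."""
    driven = sample * slider
    if slider <= 1.0:
        return driven
    return math.tanh(driven)   # compresses peaks instead of clipping
```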

~~~
te_chris
That's because music listening has traditionally been about source
reproduction and I hope this remains the case. Sure you can apply compression
or whatever if you want, but you're going to destroy the sonic quality of the
recording even more, and, as an audio engineer, I hope software manufacturers
keep "limiting" your freedom in this regard (sorry for the pun...).

~~~
barrkel
Reproducing source audio level doesn't help when you're e.g. listening to
something on a plane, with a large amount of background noise, and you want to
actually hear the damn thing. Users' opinions count for something too.

------
NinetyNine
In my understanding, 44.1 kHz was chosen because it's twice the maximum of
human hearing (22 kHz), and thus you can reproduce all audible sounds without
worrying about aliasing (as per the Nyquist-Shannon sampling theorem). What is
the point of going higher?

~~~
colanderman
When downsampling or recording, and when playing back 44.1 kHz audio, the data
must be low-pass filtered to eliminate aliasing effects.

But filters aren't perfect. Even a decent low-pass filter (say 3rd order
Butterworth) requires an order of magnitude of bandwidth to drop the output 60
dB (practically inaudible). This means that with a Nyquist limit of 22 kHz,
you're either attenuating everything above 2.2 kHz (the "knee"), or you're
letting some aliasing noise through.
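The "order of magnitude of bandwidth for 60 dB" figure follows directly from the ideal Butterworth magnitude response; here's a small check of that arithmetic (the function is mine, for illustration):

```python
import math

def butterworth_attenuation_db(f, f_cutoff, order):
    """Attenuation in dB of an ideal Butterworth low-pass at frequency f,
    from |H(f)|^2 = 1 / (1 + (f/fc)^(2n))."""
    return 10 * math.log10(1 + (f / f_cutoff) ** (2 * order))
```

For a 3rd-order filter, one decade above the knee gives almost exactly 60 dB, which is where the 2.2 kHz knee in the argument above comes from.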

With a 192 kHz sampling rate, the filter's knee can rise to 9.6 kHz, and the
stuff between 9.6 kHz and 20 kHz won't be appreciably attenuated.

It's important to note also that this attenuation can't be fixed by simply
boosting the high end -- filters are linear, so such an adjustment (with an
equivalent order filter) would merely cancel out the low pass filtering and
reintroduce aliasing noise.

(Edit: I am not an audio engineer but I have a strong signal theory
background. So actual audio engineers please feel free to correct me.)

~~~
klodolph
You have the right idea, but the wrong numbers. Terribly, terribly wrong
numbers; it's quite clear you're making them up. No filter for an audio ADC
would ever have a cutoff as low as 2.2 kHz, not even for a telephone. (You
have a strong signal theory background? No offense, but really?)

When you use a 44.1 kHz sampling frequency, any frequency above 22.05 kHz will
be "aliased" and recorded as a lower frequency. This sounds _incredibly
nasty,_ like the sounds you'd get out of a broken Commodore 64. So you have to
remove frequencies above 22.05 kHz in order to get a clean recording. But
human ears can hear up to 20 kHz or so depending on age (e.g., NTSC TVs with
cathode ray tubes have a 15 kHz horizontal refresh which drives me nuts, but
my parents can't hear it at all).
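The folding behavior described above is easy to compute: a tone beyond Nyquist gets reflected back into the audible band. A minimal sketch (function name is mine):

```python
def aliased_frequency(f, fs):
    """Frequency a pure tone at f Hz is recorded as when sampled at
    fs Hz: fold f back into the 0..fs/2 (Nyquist) band."""
    f = f % fs                        # sampling can't tell f from f mod fs
    return fs - f if f > fs / 2 else f
```

For example, an ultrasonic 30 kHz tone sampled at 44.1 kHz lands at an audible 14.1 kHz, which is the "incredibly nasty" noise in question.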

The trick then, is to design a filter that will let the 20 Hz - 20 kHz band
through while stopping everything above 22.05 kHz. We call 20 Hz-20 kHz the
pass band" and 22.05 kHz and above the "stop band". We don't really care what
happens to the frequencies between the pass band and the stop band: the range
from 20 kHz to 22.05 kHz can't be heard well enough to be worth preserving,
and it doesn't cause aliasing, so it doesn't need cutting out. This is
difficult because the stop band is only 1.1x the frequency of the pass band --
for you musicians out there, that's less than the difference between C and D
on the western scale. (Just think for a minute: design a filter that lets
middle C through, but completely filters out the D above.)

Heavens no you wouldn't use a Butterworth filter for such a task. We want an
elliptic filter, probably. 3rd order is no good either, it won't give a sharp
enough cutoff. 8th order is better. This will get you a cutoff around 20 kHz
with something like 60 dB attenuation at 22.05 kHz. People need a lot of these
filters, so you can actually go out and buy a 20 kHz low-pass 8th order
elliptic filter as a monolithic chip.

Let's suppose you chose a 48 kHz sampling rate instead. Now the stopband
starts at 24 kHz instead of 22.05 kHz. It sounds like a small difference
(22.05->24 kHz cutoff) but it's actually a factor of 2 (2.05->4 kHz transition
band). This means that with the same components, you can get 80 dB or more
attenuation in the stop band.

Now go to 96 kHz. You have to design a filter that rolls off between 20 kHz
and 48 kHz. That's _easy peasy,_ and you can reduce the ripple, increase the
attenuation, maybe reduce the order (affecting noise) and make all sorts of
design tradeoffs that are much easier.

Now think about 192 kHz. What's the point? What does 192 kHz get you that 96
kHz doesn't have? It's already easy enough to design a very nice system at 96
kHz. I think 192 kHz is a bunch of bunk as far as audio is concerned.

That's recording. Now let's talk about playback.

Playback is very similar, everything goes in the opposite direction. You start
with a digital signal, convert it to analogue, and put it through a low-pass
filter. The aliasing noise is still there, except instead of reflecting high
frequencies to low ones, it reflects low frequencies to high ones. So you get
the same trade-offs.

The difference is that playback requirements are not as difficult as recording
requirements. In particular, the required SNR of a playback system is lower
than that of a recording system. I think 48 kHz is fine for playback.

The problem is these stupid 192 kHz systems have backers with big names who
never bothered to do proper double-blind tests to figure out if the difference
is actually perceptible. You can even get a 384 kHz system these days, which
would be overengineered for dogs and is more than good enough for bats.

~~~
ugh
If I understand you correctly, going from 44.1kHz to 48kHz would be worth it
on the playback side of things?

That wouldn’t seem like it would be all that hard to do. CDs are a legacy
format now and AAC files don’t care about the sampling rate (if you don’t want
to go beyond 96kHz).

What’s stopping that? Do all the audio engineers have pipelines that are only
capable of outputting 44.1kHz? (I imagine someone at Sony Music sitting in a
dark room and ripping CDs all day – probably not true but a funny enough
picture.)

Then again, after doing a blind test (256kbps AAC, CD) and being unable to
tell the difference (yeah, I know, that’s not the same as a difference in
sampling rate) I’m skeptical of all supposed small improvements in audio
quality on the playback side.

~~~
Ryanmf
Last year I bought a 14-input, 24-bit, 44.1/48/88.2/96kHz audio interface for
$100. Granted, it's not _supposed_ to cost so little; no one cares about MSRP,
but it runs twice that on Amazon, more at Guitar Center, and typically no less
than $170 on eBay. I just happened to get a good deal on Craigslist.

Anyway, the issue isn't one of hardware limitations. Call it old guys fearing
technology, call it the Red Book cartel, call it whatever you want: it's
inertia.

------
pkulak
To anyone who thinks they can hear the difference between 48 and 96 kHz
sample rates: I've got some Monster cables that will make it sound even
better. Neil Young probably can't hear anything over 10 kHz, yet he's getting
all worked up over 256kbps AAC? None of this makes any sense.

------
Tichy
I suppose as I don't have gold coated cables, I can safely ignore these issues
for now...

------
aiurtourist
Regarding the headline image for that article -- where'd they find a MacBook
with bright red keys? Is it even a MacBook?

~~~
mgkimsal
Keyboard cover.

<http://www.amazon.com/IVEA-Keyboard-Silicone-Aluminum-Unibody/dp/B002XJN5B2/ref=pd_cp_e_0>

I've got the pink one, and I get groups of them to have on hand for friends -
they're generally less than $2 from amazon (the one above is 26 cents, but
you've got shipping on that in some cases too).

~~~
aiurtourist
Thanks!

