Effects of MP3 Compression on Perceived Emotional Characteristics in Music (aes.org)
53 points by pbowyer 26 days ago | 56 comments

Before anyone jumps to conclusions, the paper is comparing the original audio with MP3s compressed at 112kbps and under, where the compression artifacts are very easy to notice.

was about to say this.

192Kbps VBR is the minimum for transparency for me. i've been moving my whole collection FLAC -> Opus 160Kbps VBR (which is widely regarded as audiophile-transparent).

Same. It's crazy to me that people are saying that they can hear the difference between 24-bit 192khz recordings and something like 320Kbps MP3 or 192 VBR (which is what most of my personal collection is at). Back in the day on [hugely famous private torrent tracker], FLAC and 192k VBR were the gold standards, with most opting for the VBR version. I personally have some VERY nice headphones, speakers, and sources (Senn HD800s, Westone ES60, KEF LS50, RME ADI-2 Pro, etc), and I can't hear the difference between this high-bitrate stuff and the lower encodes. I CAN hear a HUGE difference between bad recordings and good, but that doesn't stop me from enjoying my favorite albums. I have other headphones, speakers, IEMs, and sources for different needs, for example working out...I like my Jaybird Tarah Pros, and even though they're not "audiophile", I've still had tons of enjoyment with them doing things like skateboarding, mountain biking, and riding my motorcycle. Horses for courses. The HD800s is NOT great for bass-heavy music, so in that case, I'll switch to a Shure 1540 or Beyer DT1770, which even though they're more in the higher-end mid-fi range, have each brought me many hours of pleasure over the years.

This is a fun hobby, but it's possible to make it less fun by arguing on the internet with people about it. If it sounds good TO YOU, and it's worth the price TO YOU, then never let someone make you feel bad about it. Some people buy $10k cables (or "worse" imo, cable elevators [e.g. https://forum.audiogon.com/discussions/the-finest-cable-elev... ]). I think that those are ridiculous, but some people swear by them, and that's their choice. I'm the guy with the inner tube under the turntable playing with air pressure, because I love a good hack and have a penchant for DIY. I've built speakers that sound bad, but I've always learned something from every build. If you want TRUE REFERENCE sound, look up exactly what gear your favorite recording studio uses, and buy the same set-up. Otherwise, just have fun :)

I can hear the difference between them but honestly I listen to Spotify on the highest quality setting (320Kbps I think?) for my everyday listening because it's convenient and already sounds great. I'm listening in an imperfect environment anyway, with environmental noise from outside and a non-treated room. My main benchmark is whether the codec gives me listening fatigue, which, for example, Soundcloud and 128Kbps do but 320Kbps doesn't.

As an aside, people who swear by 24-bit audio don't even understand what bit depth does in the context of audio. It's only useful in the studio context.

Edit: Not sure why I'm getting downvoted. I can explain the difference as added sparkle in the transient high end frequencies. My hypothesis is that the presence of frequencies outside of the range of our ears affect the frequencies within our hearing range. It's pretty common to have to re-EQ the whole track after EQing one instrument in a mix, so this concept is just being applied outside our hearing frequency spectrum. Also I would like to point out that higher bandwidth audio doesn't necessarily sound better to me, just different.

I work in film postproduction as a sound designer or re-recording mixer, and also mix music from time to time.

In audio production you can really feel the difference between 16-bit audio and 24-bit. Just like in photography, where RAW only really makes a difference when you start to manipulate the data, you really start to hear the difference between 16 and 24 bit once you work with recordings.

24 bit allows more “shades” of loudness between the loudest signal and absolute silence. This means the above statements depend on the dynamic range of the recording, which is arguably too low these days.

As an output format 16bit should be enough for most scenarios, and except for a few edge cases, hearing the difference between a well-mastered 16bit recording and the same thing in 24 bit should be nearly impossible.

This, however, all refers to uncompressed audio. Depending on your audio gear you can definitely hear a difference between uncompressed/lossless and lossy mp3 compression even at 320kbps. The same is true for 320kbps vs 128kbps mp3.

I must admit that I cannot differentiate between uncompressed and opus 256kbps.

If you do anything that needs audio compression I strongly recommend looking at the Opus codec.

> transient high end frequencies. My hypothesis is that the presence of frequencies outside of the range of our ears affect the frequencies within our hearing range

This is something I wonder about, too, but by my limited knowledge of signal processing the low pass effects of the signal transmission path simply do not admit any such frequencies. I wonder whether that counts for all orders of derivation, though.

I wonder if higher frequencies can cause resonance in lower frequencies, undertones instead of overtones so to speak, maybe in exotic geometries like the inner ear. Echo location famously works better in the ultrasonic range and involves higher order derivatives. We don't talk of constant sinusoids, but ultrasonic sweeps and chirps (not sparkle :D). We can't hear ultrasound, but perhaps we can sense it similarly to how we can sense lowest frequencies as rumble.

I really don't know the maths, what is a really low frequency note played for nanoseconds? High frequency Pulse Width Modulation is used to modulate sound in the audible range, which is quite tricky to get right mathematically and in hardware. The common problem is the introduction of ringing, indeed.

By the way, I really like to listen to music under the shower in the next room. I hear, or rather hallucinate, the most wonderful sounds under the noise.

I upvoted you, but you're wrong. 24bit audio depth matters, especially in jazz and classical recordings. You notice both subharmonics and superharmonics missing. Particularly when their phase is slow in time.

> 24 bit audio depth matters,

Please forgive me but I'm not going to just accept someone's word on this when the science seems to point in the other direction.

Is there any properly conducted test that backs this up?

Case in point. The bit depth has to do with dynamic range, not frequencies.

The reverberation of a single tone can be in phase or out of phase. The result is observed as a beat frequency in amplitude, not in tonal frequency; that's why bit depth matters.

We use higher bit depths during recording to lower the noise floor, in turn giving more headroom in the mix. Once you mix down this becomes irrelevant because the difference between silent and full volume at 16bit is enough to cover the full capability in terms of sound pressure levels of the human ear. 1bit = 6dB which means 16bit is 96dB. As we don't start at absolute zero but with a background noise of the room at 30dB, using the full dynamic range of 16bit at playback risks deafening yourself.
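The "1bit = 6dB" rule of thumb above can be checked with a few lines of Python. This is only a sketch of the textbook figure for undithered linear PCM (20·log10 of the number of levels); the function name is made up for illustration.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Theoretical undithered dynamic range of linear PCM: 20*log10(2**bits)."""
    return 20 * math.log10(2 ** bits)

# Compare the two bit depths discussed in the thread.
for bits in (16, 24):
    levels = 2 ** bits
    print(f"{bits}-bit: {levels:,} levels, about {dynamic_range_db(bits):.1f} dB")
```

Each extra bit doubles the number of levels, which adds roughly 6.02 dB, so the 96dB figure for 16 bit is a slight rounding down.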

In your example, if your single tone is out of phase then 1) how do you control this at any bit depth and 2) this is controlled by an acoustically treated room and a skilled engineer during recording. When you listen on headphones you already eliminate this concern on playback, and unless you are listening in a professionally designed and treated listening studio you will always have some element of phase cancellation. This is the listening character of your room and will occur at any bit depth.

Read this https://www.mojo-audio.com/blog/the-24bit-delusion/

You know - I agree with this article. Thank you for your thoughtful response.

I understand and agree that 24bits is enough for human perception. But something else is happening in MP3 encoding. Some sort of assumptions about human perception of dynamic range that are inaccurate. I am not expert enough to know precisely where. But I generally understand compression and I am an "audiophile".

Nothing in the article stands out as false to me. But I cannot help but believe it might be missing something.

> Particularly when their phase is slow in time

What does this mean?

Compressors variably quantize a frequency band based on what is considered to be imperceptible to human ears. Imagine a standup bass is sustaining a note. That has a bunch of different resonances based on the strings, the wood of the bass, and the geometry of the room. In a live setting, those material properties create interference between the source of the sound and the reflection. But they're very quiet and can reverberate on the order of seconds. So most compressors will quantize the constructive/destructive dynamic range to the same value.

In rock music this doesn't matter so much. But orchestras may even use the room geometry as an effect.

There are always a few telltale signs that audiophiles can pick out. In an analog mix, you should never expect two instruments to share a quantization... But that's what you hear with compression. The cymbal quantization is dependent on the guitar line and the tom/kick bleeds into the bass drum. The attack of the bass bleeds into the guitar, etc. If compressors had access to the raw channels, this would not be so much of a problem - you could keep each band stable. But they don't; they operate on the frequency domain of the entire mix.

yeah, for biking/skiing i use a-JAYS Five. for traveling 1MORE Quad Driver, for at-home audio Sennheiser HD 6XX and for gaming Klipsch ProMedia 2.1. the PC has an Asus Essence STX II 7.1 and onboard Realtek ALC1220 (surprisingly great, btw).

anyone who claims to hear a difference between 24-bit 192khz and a 320Kbps CBR MP3 is drinking some good kool-aid. there's a great quote by Alan Parsons:

"Audiophiles don't use their equipment to listen to your music. Audiophiles use your music to listen to their equipment."

I once went to an audiophile's place. He had one wall covered with equipment that had cost tens of thousands of euros (actually liras, back in the day).

He had twenty records. Ten of them were test signals.

>It's crazy to me that people are saying that they can hear the difference between 24-bit 192khz recordings and something like 320Kbps MP3 or 192 VBR

With some decent headphones and a basic knowledge of compression artifacts, it's not particularly difficult to tell the difference in a blind test. You can diff a FLAC and MP3 recording using phase inversion, which allows you to train your ears to recognise compression artifacts. Modern perceptual encoders are very good, but they're not perfect. It's not a difference I particularly care about, but it is there.
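The phase-inversion diff mentioned above is often called a null test: invert one signal, add it to the other, and listen to what is left. Below is a minimal sketch using synthetic stand-in signals; a real test would load the decoded FLAC and MP3 at the same sample rate and time-align them first (MP3 encoders typically add a delay). The function names and the crude re-quantization used as a fake "lossy decode" are assumptions for illustration only.

```python
import math

def null_test(original, decoded):
    """Add the phase-inverted decode to the original; what remains is the error."""
    return [a + (-b) for a, b in zip(original, decoded)]

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

# Stand-in signals: one second of a 1 kHz tone, plus a coarsely re-quantized
# copy that merely plays the role of a lossy decode (it is NOT an MP3 model).
rate = 44100
original = [0.5 * math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]
decoded = [round(s * 128) / 128 for s in original]

residual = null_test(original, decoded)
print(f"original: {rms_db(original):.1f} dBFS, residual: {rms_db(residual):.1f} dBFS")
```

Identical inputs cancel to silence, so anything audible in the residual is what the encoder changed.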

> 24-bit 192khz

Chances are my soundblaster soundcard does not support either of those.

Similarly, you can't just diff two recordings with different sample rates without conversion, so the conversion would either introduce artifacts if upsampling (and if it wouldn't, you wouldn't hear a difference) or render any advantage obsolete if downsampling the high sample rate. Obviously depends on the algorithm used, but you should target whatever your soundcard supports.

The 24/192 part is irrelevant - there's no audible difference between properly-mastered 16/44.1 and 24/192 audio. No adult human can hear anything above 22kHz; 16 bits provides 96dB of undithered dynamic range and ~120dB of perceived dynamic range with shaped dither.
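For context on the 16/44.1 versus 24/192 numbers, here is a small sketch of the two quantities involved: the highest frequency a sample rate can represent (half the sample rate) and the raw PCM data rate. Stereo is assumed, and the helper names are hypothetical.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

def pcm_bitrate_kbps(bit_depth: int, sample_rate_hz: int, channels: int = 2) -> float:
    """Raw (uncompressed) PCM data rate in kbps; stereo is an assumption."""
    return bit_depth * sample_rate_hz * channels / 1000

# The two formats compared in the thread: 16/44.1 and 24/192.
for depth, rate in ((16, 44_100), (24, 192_000)):
    print(f"{depth}/{rate / 1000:g}: Nyquist {nyquist_hz(rate) / 1000:g} kHz, "
          f"{pcm_bitrate_kbps(depth, rate):g} kbps stereo PCM")
```

CD-quality audio already covers frequencies up to 22.05 kHz; 24/192 spends several times the data on content above that and on extra bit depth.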

You can hear the degradation from 16/44.1 bitstream to 320kbps or 192kbps VBR MP3, but you need good gear and good ears. I'm perfectly happy to listen to properly encoded MP3s because the difference is very slight, but I'd still use uncompressed or lossless formats for archival purposes.

> If you want TRUE REFERENCE sound, look up exactly what gear your favorite recording studio uses, and buy the same set-up. Otherwise, just have fun :)

It's interesting how much audiophiles focus and are ready to spend on gear when acoustic treatment (including planning and building) of the room brings undeniable and immediate impact, and in its basic form can be orders of magnitude cheaper than all those insane RCA and power cables, supporters, pyramids and whatnot.

Just buying the same gear without proper acoustics won't bring you the "TRUE REFERENCE".

This. Just looking at a graph for resonances in a room, or moving around while playing a low-freq tone, is enough to give you an idea of the enormous difference a properly treated room would make.

Do you recommend 160 Kbps for Opus encoding? I've been encoding FLAC to Opus with 140 Kbps setting normally for playback.

I don't move my collection from FLAC to Opus though, just encode for playback purposes (Opus takes less space, so it's good for portable players and so on). It's always good to preserve the lossless original. What if some new Opus-next appears that is even better? Without a lossless source, you won't be able to re-encode.

It depends on your planned listening conditions and your gear, but I strongly suggest going much lower (as low as 96kbps, or even lower) and performing some ABX testing under slightly better conditions than what you're going to listen under. If the results are unsatisfactory, bump the bitrate up 16kbps and repeat. I bet you'll be surprised.
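To decide whether an ABX result is "unsatisfactory", one common approach is to ask how likely the score would be from pure guessing. The sketch below uses a one-sided binomial test; the 5% threshold and the function names are assumptions, while the 16kbps step follows the suggestion above.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` of `trials` right by coin-flip guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def next_bitrate(bitrate_kbps: int, correct: int, trials: int,
                 step_kbps: int = 16, alpha: float = 0.05) -> int:
    """Bump the bitrate one step if the listener reliably told the encode apart."""
    if abx_p_value(correct, trials) < alpha:
        return bitrate_kbps + step_kbps  # difference was audible: go higher
    return bitrate_kbps                  # indistinguishable from guessing: keep it

print(f"12/16 correct: p = {abx_p_value(12, 16):.3f}")
print(f"9/16 correct:  p = {abx_p_value(9, 16):.3f}")
print("next bitrate after 12/16 at 96 kbps:", next_bitrate(96, 12, 16))
```

With 16 trials, 12 correct answers is unlikely to be luck, while 9 correct is entirely consistent with guessing.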

If you're going to test, you might start from 64, Opus is that good. I didn't perform extensive tests and use 128.

If it's transparent, why would you care?

Not sure what you are asking. Why use 140 instead of 160 Kbps? 140 produces smaller file size, and is practically transparent as far as I know.

Or you mean why reencode later from lossless again, if a better codec appears? If it will be better, it will offer some benefits (smaller size, faster decoding and so on). So why not?

I'm similar. I store FLAC on my desktop and then transcode it to opus before putting it on my music streaming server. MP3 doesn't really seem to have much relevance in the modern world other than legacy.
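The keep-FLAC, transcode-for-playback workflow described above could be scripted along these lines. This is a sketch that assumes the `opusenc` tool from opus-tools is installed and on the PATH; the directory layout, function names, and the 160 kbps default (borrowed from earlier in the thread) are illustrative choices, not anyone's actual setup.

```python
import subprocess
from pathlib import Path

def opusenc_command(src: Path, dst_dir: Path, bitrate_kbps: int = 160) -> list:
    """Build an opusenc call that encodes one FLAC file into dst_dir."""
    dst = dst_dir / src.with_suffix(".opus").name
    return ["opusenc", "--bitrate", str(bitrate_kbps), str(src), str(dst)]

def transcode_library(src_dir: Path, dst_dir: Path, bitrate_kbps: int = 160) -> None:
    """Encode every FLAC under src_dir; the lossless originals are left untouched."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    for flac in sorted(src_dir.rglob("*.flac")):
        subprocess.run(opusenc_command(flac, dst_dir, bitrate_kbps), check=True)

# Hypothetical usage:
# transcode_library(Path("~/Music/flac").expanduser(), Path("/srv/stream/opus"))
```

Because the FLAC files stay in place, the whole library can be re-encoded later at a different bitrate or with a newer codec.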

For someone out of the music codec game, is Opus generally the best codec to go with these days?

Any alternatives that work with iOS?

What software do you use to listen to music on your computers?

Is there a good media streaming server with iOS / desktop apps which I could use and then store FLAC on my server and have it transcoded to serve to clients?

Opus can be amazing at lower bit rates. e.g. try encoding some music you know well at 64kbps and compare to a 128k mp3.

The higher bitrates (above 128) aren't so advantageous.

https://wiki.xiph.org/Opus_Recommended_Settings (Dec. 2018)

> The higher bitrates (above 128) aren't so advantageous.

That’s just because by about 128Kbps Opus is already effectively perfect, where it takes closer to 200Kbps for MP3 to achieve that.

Opus is just all-round better than MP3.

> For someone out of the music codec game, is Opus generally the best codec to go with these days?

not for compatibility. mp3 is ubiquitous. my 2017 VW can play mp3 and even flac from an sd card. i doubt it even knows what opus is.

i think you'd be hard pressed to tell a difference between a well-encoded (flac source) 192kbps vbr mp3 and 160 kbps opus. the file size of the latter will be 30% smaller.

> What software do you use to listen to music on your computers?

on windows, foobar2000. it's superb, no bs and has tons of optional plugins.

> on windows, foobar2000. it's superb, no bs and has tons of optional plugins.

Old school. I was using that back in the early 2000s. Had a lot of fun with it. But, I've been over in Apple-land for years now... and mostly Spotify for music.

foobar2000 is excellent on Android too.

JRiver is a good media player which works on iOS, Linux and Raspberry Pis. However, it's not free. But you can do a lot with it and it'll play movies too. I use JRiver on my work PC to stream lossless audio to a couple of Raspberry Pis I have around my house/work. I even have it set up for streaming over the internet for when I go away on holiday. The best thing about this setup is that all my playlists and ratings stay synced, as I'm using just the one library on my main PC.

Opus is the best codec. There is VLC for iOS, which should be able to play Opus files.

VLC leaves quite a bit to be desired in terms of library navigation.

It does have the advantage of being patent free, since they have all expired.

So are Opus and FLAC. Pretty much all modern audio codecs are patent free.

Can this be proven? No chance of submarine patents?

Someone would have sued by now.

> I'm similar. I store FLAC on my desktop and then transcode it to opus before putting it on my music streaming server.

Yep, this seems to be a common practice.

Why wouldn’t you just keep it as FLAC?

filesize. flac is about 10MB/min, opus 160 is 1MB/min
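As a rough check on those per-minute figures, size follows directly from bitrate. The sketch below assumes decimal megabytes and ignores container overhead; the helper name is made up. Note that about 10MB/min corresponds to uncompressed CD audio (1411.2 kbps), and FLAC usually lands somewhat below that depending on the material.

```python
def megabytes_per_minute(bitrate_kbps: float) -> float:
    """File size per minute of audio in decimal megabytes, ignoring container overhead."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return bytes_per_second * 60 / 1_000_000

# Bitrates mentioned in the thread, plus uncompressed 16/44.1 stereo PCM.
for label, kbps in (("Opus 160", 160), ("MP3 320", 320), ("CD PCM", 1411.2)):
    print(f"{label}: {megabytes_per_minute(kbps):.1f} MB/min")
```

So Opus at 160 kbps works out to about 1.2 MB/min, which is in line with the "1MB/min" estimate.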

But you already have it stored - your provisioning is a sunk cost. And disk space is only getting cheaper...

You have it stored on a high-capacity server, not your phone.

Or a portable player. No need to waste space there either.

Maybe slightly off topic but I would be curious to know what the artifact of lower bit rate mp3 encoding is actually doing. It’s not aliasing, but it’s in the 6-15khz range. I could only describe the sound of the artifacts as a flock of seagulls in the distance flying towards you.

Edit: found this which answers a lot of this.


Relatedly, this article nearly a decade(!) ago mentioned that younger people of the time were preferring the sound of MP3 compression artifacts over higher-quality compression or even the original uncompressed audio:


I suppose it's similar to how some people prefer the distortion of "tube sound".

I'm sceptical of this Slashdot classic. Was this ever verified independently?

As of today, if I need to compress, I use M4A/ALAC. This article may have been appropriate >10 years ago.

Most of the time I have good speakers and decently encoded sound, so I do not have any real quibbles.

Check out opus encoding - should be way better

The available research is incomplete.

The Opus "Comparison" cherry-picks the sub-128 kbps bitrates, where the format is evidently strong, but it doesn't include references for high-bitrate music compression.

Actually, even their (partial) diagram hints that there is virtually no difference at bitrates over 160 kbps.

I'm struggling to find (blind) listening tests of high-bitrate music compression lately in general; it seems that the internet crowd's enthusiasm for the subject vanished in the last decade.

What do you mean by incomplete? What information do you lack? If you meant tests of Opus over 128kbps, then there's simply no real need for them. Why would anyone want to add some 50-100% weight to their files for almost no audible increase in quality? In any case, lossy codecs are not meant for storage/critical listening.

Lack of enthusiasm in publishing tests can probably be explained by the ease of performing ABX tests by yourself with your own gear and exact listening conditions you're aiming for.

>Compression effected some instruments more and others less

Affected [1] is correct. You could perhaps say 'Applying compression effected an undesirable change in some instruments...' - the sound of the instruments is affected by the compression, but the change in the sound has been effected by the addition of compression.

1. https://en.oxforddictionaries.com/definition/affected
