How Surround Sound for Headphones Works (hajo.me)
97 points by vinhnx 997 days ago | 31 comments



I didn't submit it yet because the article wasn't 100% ready ... but in any case:

Feel free to ask me about it :)


Hajo, the article reads very well, so don't worry about it not being "100% ready".

I may have missed something that was written, but I wanted to know if you isolate the sound that is _only_ on the given channel before applying your filters? (i.e. isolation through sound inversion to silence everything but the sound that is unique to the channel)

Though it's not exactly related, Alexandre Ratchov and Jacob Meuser (and others) have built the amazingly impressive audio features of OpenBSD including sndio(7), sndiod(1), audio(4), audioctl(7), mixerctl(7) and so forth. I've got a hunch that you might really enjoy studying their work. Warning: If you dig into the OpenBSD sound system, you'll probably like it enough to never run any other OS.

http://www.openbsd.org/cgi-bin/man.cgi?query=sndio&section=7

http://www.openbsd.org/cgi-bin/man.cgi/?query=sndiod&section...

http://www.openbsd.org/cgi-bin/man.cgi?query=aucat&section=1

http://www.openbsd.org/cgi-bin/man.cgi?query=audio&section=4

http://www.openbsd.org/cgi-bin/man.cgi?query=audioctl&sectio...

http://www.openbsd.org/cgi-bin/man.cgi?query=mixerctl&sectio...

http://undeadly.org/cgi?action=article&sid=20120401171457

http://2010.asiabsdcon.org/papers/abc2010-P1B-paper.pdf

http://en.wikipedia.org/wiki/Sndio


Thanks for the OpenBSD links :)

No, currently I do not do any sound isolation. The reason is that I want the virtualization to be as unobtrusive as possible.


Not a question, but I just installed it on my Mac Mini and WOW! I am blown away by the difference. My music sounds much better. I can't wait to try it out with a movie :)


:)


Headphone surround sound (including recordings made with a binaural "head" microphone) can be very convincing until the listener moves his head.

The sounds don't change to match the head motion, and this kills the illusion.


Absolutely correct. I'm also working on real-time compensation with the Oculus Headtracker ;) but that won't be ready until a few months from now ...
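To illustrate what head-tracking compensation involves at its simplest: the virtual speakers should stay fixed in the room, so their angles relative to the head must shift opposite to the head's rotation. The following is a minimal sketch under assumptions (yaw only, angles in degrees, positive azimuth to the listener's left); the function name is hypothetical and this is not the author's implementation.

```python
def relative_azimuth(speaker_azimuth_deg, head_yaw_deg):
    """Angle of a room-fixed virtual speaker as seen from the rotated head.

    Both angles are in degrees; the result is wrapped into [-180, 180).
    """
    return ((speaker_azimuth_deg - head_yaw_deg + 180.0) % 360.0) - 180.0


# A front-left speaker at 30 degrees ends up straight ahead once the
# listener turns their head 30 degrees towards it.
print(relative_azimuth(30.0, 30.0))
```

The filter pair used for each virtual speaker would then be chosen (or interpolated) for this relative angle instead of the fixed one.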


Nice write up. I installed the app to give it a whirl. Admittedly I haven't tried it with movies yet, but it gives a "warmer" sound to my Spotify tracks while using my Polk ANC headphones.

I went to purchase the product, but not sure why it's asking me for my mailing address?

I'd recommend streamlining the workflow (if possible).

Ask for Email address

Ask to select Payment Type

Then, if credit card is selected:

Name as it Appears on the Card

Credit Card Number

Credit Card Expiry

Credit Card CVV

Billing Address

Billing Phone Number

If I chose to use Paypal, why do you need the mailing and phone number?


That's a good question.. and I don't know the answer myself:

The shop site is run by FastSpring because I didn't want to have to deal with all the VAT weirdness in the EU. Maybe that's why they asked you, so they can decide whether to charge you VAT or not.


I'm sorry, but surround sound for headphones (defined as: HRTF implementations) don't work. What does work? Actual binaural recordings made with dummy heads.

Trying to reprocess 5.1 surround into either a) making it sound like I'm there, or b) making it sound like I'm listening to a 5.1 audio system, has never worked for me. Dolby Headphone is the least objectionable one of them, but it just mangles audio.

I wish someone would try to do Ambisonics-like calculations to try to bake 5.1 into working surround sound. I'd even pay for it (for, say, a faux passthrough driver on Windows that outputs to your default sound card).


I thought your comment was unnecessarily downvoted, but I also don't think you provide much information.

A dummy head recording has the downside that it's no longer fully suitable for playback over anything besides headphones. Also, it is dependent on the HRTF of the dummy matching that of the listener.

I'm not sure Ambisonics [1] would work for 5.1 sound. In most cases, 5.1 isn't produced by mic'ing in the directions of the ideal 5.1 speaker placement (at least for movies and music, I don't know anything about 5.1 production for video games). Typically, the L and R channels are used for directional sound, the C is used for dialog and other things that might be corrupted by panning, and the surround and bass channels are used for (weakly directional) ambient sound. So trying to 3-D pan the surround channels will probably just muddle the effect.

Really, the best thing would be to actually record or synthesize in the Ambisonic format, and resynthesizing based on the playback setup. But a 5.1 recording, even one created with a 5.1 rig, is poorly suited for this.

[1] For those who aren't aware, Ambisonics is a format for storage of 3-D recordings, independent of the mic and speaker placements at the time of recording and playback. If the mic positions are known, you can project the raw recorded signals from multiple mics onto a neutral format. Likewise, if the speaker positions are known at playback time, the neutral format can be projected onto the proper signals for the playback setup to best simulate the recorded soundfield. Ideally, the recording and playback setups are optimized to capture the full 3-D soundfield.
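The "neutral format" idea in the footnote can be sketched for the simplest case: horizontal-only, first-order B-format, one mono sample. This is an illustration under assumptions (traditional W/X/Y weighting with W scaled by 1/sqrt(2), and a virtual-cardioid decode, which is only one of several decoding conventions); the function names are hypothetical.

```python
import math


def encode_b_format(sample, azimuth_deg):
    """Project a mono sample arriving from azimuth_deg onto W, X, Y."""
    theta = math.radians(azimuth_deg)
    w = sample / math.sqrt(2.0)      # omnidirectional component
    x = sample * math.cos(theta)     # front-back figure-of-eight
    y = sample * math.sin(theta)     # left-right figure-of-eight
    return w, x, y


def decode_to_speaker(w, x, y, speaker_azimuth_deg):
    """Feed for a speaker at speaker_azimuth_deg (virtual cardioid mic)."""
    phi = math.radians(speaker_azimuth_deg)
    return 0.5 * (math.sqrt(2.0) * w + x * math.cos(phi) + y * math.sin(phi))


# Encode a source straight ahead, then decode for a square speaker layout.
w, x, y = encode_b_format(1.0, 0.0)
for speaker in (45.0, 135.0, 225.0, 315.0):
    print(speaker, round(decode_to_speaker(w, x, y, speaker), 3))
```

Note that the encoded W, X, Y values never mention the speaker layout; that only enters at decode time, which is the independence the footnote describes.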


What someone a while back had suggested (and I can't find the article now) is using virtual playback of 5.1 audio and doing Ambisonic-like math to produce the proper directional cues.

As in, it wouldn't be a cohesive audiofield in and of itself; it would merely be like standing in front of a 5.1 system.

I think the trick was using headphones as a very wide stereo dipole system, and just doing all the timing correctly (i.e., less than a full-scale HRTF). Due to how 5.1 systems are positioned (30 degrees between L and C, C and R, and 60 between L and R), there isn't much you can do to get this correct.
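For a sense of scale of the timing cue mentioned above, here is a rough sketch using Woodworth's spherical-head approximation of the interaural time difference. It is not what the commenter or any product is known to use; the head radius (8.75 cm) and speed of sound (343 m/s) are assumed typical values, and the formula is only meant for sources within 90 degrees of straight ahead.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed, air at about 20 degrees C


def interaural_time_difference(azimuth_deg):
    """Woodworth ITD in seconds for a far source, 0 <= |azimuth| <= 90."""
    theta = math.radians(abs(azimuth_deg))
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


# Front-left speaker of a 5.1 layout at 30 degrees: roughly a quarter
# of a millisecond, i.e. about a dozen samples at 48 kHz.
itd = interaural_time_difference(30.0)
print(itd, itd * 48000)
```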


Yeah, people already do that with 5.1 headphone surround sound. That's not really Ambisonics-like math, because Ambisonics doesn't attempt to do anything with the timing differences. In fact, I haven't really ever heard much about playback of Ambisonics signals over headphones. The whole idea is reproducing the soundfield in open space, which would be experienced by your head within that space.


>surround sound for headphones (defined as: HRTF implementations) don't work

I'm not trying to be an ass, but do you have any hard evidence or citations for that claim?

I have been a skeptic of the technology as well, but just 2 weeks ago I tried a demo [1] that was _very_ convincing. In short, I wore ~$30USD headphones while a device with some positional sensors was moved around my head and music played. The only sound emitted was from the drivers in the headphones, but I do not exaggerate one bit when I say that my brain was tricked into believing the sound originated from the mystery box moving around my head.

Perhaps you have some scientific reason for why in theory (or fact) it doesn't 'work' in the physical sense, but my anecdotal evidence supports that there is a way to do a 'good enough' job where the brain is successfully tricked.

I admit to feeling there's still a bit of a snake-oil salesman aspect to this technology, but the guys at Visisonics spun off their company from research at University of Maryland. They're not just another ragtag shop implementing last decade's positional audio algorithms. They really seem to be pushing the space forward.

[1] http://visisonics.com/


Hard evidence for what? I've tried them all, I tried to like them, and it just butchered the directional cues for me, and also made the sound come out really oddly (some added excessive reverb, others just fucked the EQ all to hell).

Scientific theory doesn't mean shit if the resulting product doesn't work.

And I've heard working audio cues on natural audio, but as I said, they've all been binaural recordings done with dummy heads, so you can't claim it's just something defective with my hearing.

I want headphone downmixing of 5.1 surround to work, but I've yet to find one that works without butchering the audio.


>Scientific theory doesn't mean shit if the resulting product doesn't work.

That's the thing, I'm claiming that the implementation I tried DID work. Now, questionable YouTube videos, however, haven't worked for me before. Nor do the settings in the software for my discrete sound card. But I have experienced it with one particular company's implementation.

What was explained to me is that they have a generic HRTF that works for ~90% of the population, but there's a minority of the population for which a customized HRTF must be determined. They claimed that certain differences in the spatial characteristics result in their generic HRTF not being a good fit; perhaps you are part of that unlucky 10% who would need to be measured in a lab.

This was actually the topic of one of the questions I asked of the presenters, and their answer was to concede that unfortunately those 10% would be out of luck unless they got their personal HRTF derived. Who knows if the numbers are real or not.

Sorry you're getting downvoted for having a dissenting opinion/experience.


Dummy head binaural recording is the physical equivalent of that generic HRTF math. It, too, would work for 90% of the population, but not the other 10%.

Dummy head binaural recordings work perfectly for me. I can hear forwards, backwards, up, and down perfectly.

The only 5.1 -> headphone HRTF implementations I've heard that properly give directional cues mangle the sound so badly that it isn't worth it.


>5.1 -> headphone HRTF implementations

The technology I tried is intended to be used for real-time games. Individual game objects' audio samples and their positions are run through their filter, accounting for position and the material properties of nearby reflective surfaces (hard vs soft, solid vs semi-solid).

Is this fundamentally different to the implementations you've tried? It seems to me you're referring to '3D' implementations that take an already-mixed 4 channel source and try to jam that into a spatial 2 channel mix. That's the kind of technology that I think is pretty lame/useless as well...

If you have a chance to try out Visisonics tech, I think it'd be worth giving it a go if games are your thing. Seems like you definitely know a lot about the subject.


Real-time games are "okay" if the game has the tech built into the sound engine and does audio processing in a way that takes into account the position of your speakers (on head, surround, etc).

Problem is, I rarely meet a game that does (I think maybe I own one or two, but I can't remember which ones, which tells you what sort of impact those games made on me).

Being able to already have positional cues on a per-sound basis is much different than downmixing 5.1. The technology isn't useless when all I want to do is watch a movie and be able to enjoy it with headphones without garbled audio.

Like I said above, you can downmix left, center, and right, and throw in a little sub, and get a stereo/headphone mix, but positional cues provided by surrounds are gone, and you can't just mix those in normally, the mix gets too confused.
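The plain downmix described above can be sketched per sample as follows. The center gain of about 0.7071 (-3 dB) is a commonly used value; the LFE gain is purely an assumption for "a little sub", and the function name is hypothetical. The surround channels are deliberately left out, which is exactly where the positional cues get lost.

```python
def downmix_front_to_stereo(left, right, center, lfe,
                            center_gain=0.7071, lfe_gain=0.25):
    """Fold L, R, C and a bit of LFE into two channels (one sample each).

    center_gain and lfe_gain are assumed values; the surround channels
    are ignored on purpose.
    """
    shared = center_gain * center + lfe_gain * lfe
    return left + shared, right + shared


# Dialog on the center channel lands equally in both ears.
print(downmix_front_to_stereo(0.0, 0.0, 1.0, 0.0))
```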


If one was to do actual binaural recordings with a dummy head, one would get Impulse Responses, which is just another name for the FIR filters that I use. Or to put it another way:

You could use Soundman OKM 2 microphones to measure out your ears and then I can put your measurements into my algorithm and it'll work just fine :)
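To make "impulse responses are FIR filters" concrete: each virtual speaker signal is convolved with one measured impulse response per ear. The sketch below is a direct-form convolution in plain Python with made-up toy impulse responses; it is not the author's algorithm, and a real implementation would use measured responses and a fast (FFT-based) convolution. The function names are hypothetical.

```python
def fir_filter(signal, impulse_response):
    """Direct-form FIR filtering, i.e. full linear convolution."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def virtualize_channel(signal, hrir_left, hrir_right):
    """Render one speaker channel to a (left ear, right ear) pair."""
    return fir_filter(signal, hrir_left), fir_filter(signal, hrir_right)


# Toy impulse responses (assumed, not measured): the far ear hears the
# sound two samples later and quieter than the near ear.
toy_hrir_near = [1.0, 0.0, 0.0]
toy_hrir_far = [0.0, 0.0, 0.5]
ear_left, ear_right = virtualize_channel([1.0, 0.5], toy_hrir_near, toy_hrir_far)
print(ear_left)
print(ear_right)
```

Summing the per-ear outputs of all virtual speakers gives the two headphone signals.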


In my opinion, downvotes are not right for an opinion, only for an incorrect or aggressive post. (I was a sound engineer and owned my own studio 15 years ago.)

For me, surround sound is only amazing when used in video games or virtual reality, and only on headphones, not speakers.

The same goes for 5.1 sound for music, which also falls flat on my ears. The 5.1 mixes are impossible to do well, since the engineer is mixing to his 5.1 setup and yours will certainly be different, especially with sound reflections. On headphones with surround sound it works out much better. I find the 2.0 headphone mixes are usually much better.


I tried the tool and it really produces some kind of surround sound feeling. Weak, I admit, but it's there.


Well I deliberately called my tool "Hajo's Headphone Enhancer" because it is optimized for my head / ear sizes ... that is (in my opinion) also the biggest problem with measuring actual HRTFs with dummy heads: It'll only work if the user has a head that is similar to the dummy head used for recording.


Just wanted to say I was looking for something like this last week and couldn't find one for my Mac. Thanks, this is awesome! I'll def buy a license here soon.


Just wanted to pass on a note, I watched the new Peanuts trailer, and wow, what a massive improvement in sound and bass while using your audio plugin. Going to purchase a license.


does it sound different? yes

does it sound 3d? no

does it sound better? no, unless you pay, then auto suggestion produces perception of better quality

meh


What's obviously missing is a sample, unprocessed and processed, so that people can listen to the effect without running any os x crapps..


You're absolutely right, though "os x crapps" is a bit unfair.



Think you could make one without the heavy equalizer changes? It's hard to compare apples to apples :\


If you search for Dolby Headphone on YouTube, there seem to be a lot of mixes where a presumably similar (commercial) technique is used to encode the audio from music that was recorded in 5.1 (https://www.youtube.com/watch?v=547i8RSveZs).



