Interestingly, after playing with the demo long enough, I could hear both words spoken at the same time.
E.g. if I start on the 'yanny' side and very slowly move it down, I hear 'yanny' all the way down to the 'laurel' side.
If I pause for a few seconds, I finally start hearing 'laurel', and can hear all the way to the other end too. However, moving the slider a bit more breaks the illusion.
And even weirder, I can now choose which to hear! I went back to the original recording off Twitter or wherever and I can now force myself to hear either at that pitch.
Before I couldn't hear Yanny at ALL.
Very very cool.
=> I think I can confirm there is no bug in the demo.
This is maybe an even more extreme phenomenon (from the same subreddit that was apparently involved in making the Laurel/Yanny thing go viral this past week):
With this video, I find that I consistently hear whichever phrase I'm thinking of at the moment! (In this case either "brainstorm" or "green needle".)
Edit: Also, if anyone in this thread likes this stuff then you might really enjoy
if you've never heard it before (or even if you have).
The Yanny clip is in the high register, and the Laurel clip (from dictionary.com) is in the low register, and they've just been overlaid into a single mono clip.
But after using the demo, I can now ONLY hear “laurel”.
It makes sense that we'd occasionally find things like this, if human perception does work somewhat similarly to convolutional neural nets. Everyone trains their own feature detectors, and almost everyone ends up with something that works pretty well. (People who don't end up getting called things like "faceblind" or "tonedeaf" when their features don't work for a particular thing.) So it'd make sense that there are edge cases where two common approaches get different results. You could even argue these count as "adversarial examples".
However, if I listen to the samples at https://twitter.com/xxv/status/996462632998711297 I can hear yanny.
No, I very much doubt this is a doctored recording. Wired did an interview with the person(s) who first discovered it. They played the pronunciation of "laurel" from vocabulary.com on their computer and re-recorded the audio to their phone, then posted it to Instagram. The audio sounds distorted because it was played through laptop speakers and recorded again with a microphone.
The "laurel/yanny" effect is also reproducible with the audio from the original source at . You don't need to doctor the audio to hear it.
It’s interesting how both this and the yanny/laurel example are recordings of low-quality speakers. I guess the added harmonics from the distortion in the playback cause the sounds to be more ambiguous than they would otherwise be.
There is a clear S in there. Also, I can clearly hear it's two syllables - where do you guys hear the third?
Using Audacity and my Sennheisers, a notch filter from about 500 to 1000 Hz reliably changes laurel to yanny in the clip that's being shared on Twitter. There is an intermediate setting where I can hear both at once.
The original source from https://www.vocabulary.com/dictionary/laurel stays laurel no matter what I do.
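For anyone who wants to try the notch idea without Audacity, here is a minimal sketch in plain Python. It is not what Audacity does internally. It is a single biquad notch built from the widely used Audio EQ Cookbook formulas, and the function names (`notch_coefficients`, `biquad`) are made up for illustration. One biquad is a fairly gentle band-stop: it only nulls the centre of the 500 to 1000 Hz band completely, with roughly -3 dB at the edges, so treat it as a starting point rather than a reproduction of the experiment above.

```python
import math

def notch_coefficients(f_low, f_high, sample_rate):
    """Biquad notch coefficients (Audio EQ Cookbook form), normalised so a[0] == 1."""
    f0 = math.sqrt(f_low * f_high)            # geometric centre of the band
    bw_octaves = math.log2(f_high / f_low)    # band width in octaves
    w0 = 2.0 * math.pi * f0 / sample_rate
    alpha = math.sin(w0) * math.sinh(
        math.log(2.0) / 2.0 * bw_octaves * w0 / math.sin(w0)
    )
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def biquad(samples, b, a):
    """Apply a second-order IIR filter (direct form I) to a list of floats."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

def sine(freq, sample_rate, n):
    """Generate n samples of a unit-amplitude sine tone."""
    return [math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(n)]

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

To use it on the actual clip you would decode the audio to a list of float samples first (for example with the standard `wave` module for a 16-bit WAV file), run it through `biquad`, and write the result back out.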
Does anyone know how to recreate this illusion with other sentences?
Top-tier social media 'celebrities' can directly influence the sales of almost anything if they promote or demote it enough, so there's a lot of money and effort moving around in the background to make all that happen.
The sad thing is that, in desperation to keep the hip kids clicking or viewing ads on their sites, all the news outlets just lap up this crap on a daily basis. It's a vicious circle.
If you hear "Yanny", put your hands over your ears. The tighter you hold them down, the more high frequencies are filtered out and you will hear "Laurel".
You can modulate that to hear both. Might work the other way too. Cup your hands and aim them at the speakers so you get more high frequency. Someone give it a try, eh?
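If you'd rather simulate the hands-over-ears trick than do it, a rough sketch is a simple low-pass filter. The one below is a one-pole smoother in plain Python. The 1000 Hz cutoff is a guess for illustration only, since I have no measurement of what hands actually do to the spectrum, and the function names are hypothetical.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Attenuate frequencies above cutoff_hz with a one-pole low-pass filter."""
    # Smoothing factor derived from the cutoff; closer to 0 means heavier filtering.
    k = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out = []
    y = 0.0
    for x in samples:
        y += k * (x - y)   # move the output a fraction k towards the input
        out.append(y)
    return out

def sine(freq, sample_rate, n):
    """Generate n samples of a unit-amplitude sine tone."""
    return [math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(n)]

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

Lowering `cutoff_hz` plays the role of pressing your hands down tighter. A one-pole filter rolls off slowly (about 6 dB per octave), so the effect is milder than a real obstruction may be.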
Surprised there wasn't more discussion here about it that I could find. Seems like this was possibly found in a study of adversarial examples for audio codecs/speech processing systems?
After hearing "yelly" at the rightmost position of the slider I still hear "yelly" as it slides to the left, becoming "laurel" at about one third of the way. After that, as I slide back to the right, I do not hear "yelly" until the rightmost position.
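What's described here is hysteresis: the switch point depends on which direction you came from. A toy model of it, as a sketch only: it assumes the slider runs from 0.0 (left) to 1.0 (right) and reads "about one third of the way" as position 1/3, which may not be what was meant. The thresholds and the name `track_percept` are made up for illustration and say nothing about how hearing actually works.

```python
def track_percept(positions, start, to_laurel_below=1.0 / 3.0, to_yelly_at=1.0):
    """Return what is 'heard' at each slider position, given the starting percept.

    The percept only flips when the slider crosses the threshold for the
    *other* word, so the same position can yield different answers.
    """
    percept = start
    heard = []
    for p in positions:
        if percept == "yelly" and p < to_laurel_below:
            percept = "laurel"
        elif percept == "laurel" and p >= to_yelly_at:
            percept = "yelly"
        heard.append(percept)
    return heard

# Sweep from the right end to the left end and back again.
sweep = [1.0, 0.75, 0.5, 0.25, 0.0, 0.25, 0.5, 0.75, 1.0]
result = track_percept(sweep, start="yelly")
```

In this model position 0.5 is heard as "yelly" on the way down and "laurel" on the way back up, which matches the description above.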
(A story on the origin of this, if you're interested: https://www.wired.com/story/yanny-and-laurel-true-history/ )
Update: The weirdest thing is that now, on my speakers too, I have become a Laurel person at the midpoint.
With the NYT slider I was able to adjust my yanny boundary leftward through some training.