Optimizing Siri on HomePod in Far‑Field Settings (apple.com)
132 points by cift 8 days ago | 71 comments

> Directional noises generated by household appliances such as a vacuum cleaner, hairdryer, and microwave

From what my friends with HomePods tell me, the HomePod is amazingly good at hearing people, even with lots of background noise. Interestingly, Apple's problem may cease being "can the HomePod hear the user?" and become "how loudly should the HomePod respond so that the human can hear the response while a vacuum cleaner is running?"

This is a novel problem because people generally can't hold a conversation with a vacuum running, for example.

It is exceptionally good at hearing people. I tested with our HomePod and our Amazon Echo next to each other and the HomePod outperformed the Echo any time there was outside noise or any time the device itself was playing music. The fact that the Echo can't seem to hear people talking over itself seems like a big issue. The HomePod has to use some kind of audio cancellation that subtracts anything it's playing because it was able to hear me from a room away at normal speaking volume at 80%+ playback volume.
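To make the "subtracts anything it's playing" idea concrete, here is a minimal sketch of acoustic echo cancellation with a normalized LMS (NLMS) adaptive filter. This is a textbook technique, not a description of what the HomePod actually runs; the function name, the tap count, the step size, and the 3-tap "room" are all assumptions chosen for illustration.

```python
import random

def nlms_echo_cancel(playback, mic, taps=8, mu=0.5, eps=1e-8):
    """Return the mic signal with an adaptive estimate of the playback echo removed."""
    weights = [0.0] * taps
    history = [0.0] * taps          # most recent playback samples, newest first
    residual = []
    for x, d in zip(playback, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate       # what is left after subtracting the echo
        norm = sum(h * h for h in history) + eps
        # NLMS update: nudge the weights toward whatever explains the residual.
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual

# Toy demo: the "room" is a made-up 3-tap echo path, and nobody is speaking.
rng = random.Random(0)
playback = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
room = [0.6, 0.3, -0.2]
mic = [
    sum(room[k] * playback[n - k] for k in range(len(room)) if n - k >= 0)
    for n in range(len(playback))
]
residual = nlms_echo_cancel(playback, mic)

def power(samples):
    return sum(s * s for s in samples) / len(samples)

echo_power = power(mic[-500:])          # echo level at the mic
residual_power = power(residual[-500:]) # what remains once the filter has adapted
```

In this toy setup the residual shrinks to a small fraction of the echo once the filter has adapted. A real device also has to cope with a talker speaking at the same time, speaker nonlinearity, and much longer room responses, none of which this sketch handles.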

I briefly had an Echo, and one thing my friends and I found funny was to have it play some annoying song, and then call out "Alexa, volume 10" because once it got that loud, the only way to make it stop was to go over and manually adjust the volume, as it couldn't hear anyone trying to turn it down.

> The fact that the Echo can't seem to hear people talking over itself seems like a big issue

Google Home has the same issue in my experience. It's quite annoying.

The team at Amazon has already figured out how to have it whisper back to you if you whisper at it. Seems like it would be easy to run that backwards.

The point I was trying to make is that HomePod can hear you even when a vacuum is running, but your hearing is not accurate enough to hear HomePod if it replies to you at the same volume level, because you don't have the impressive array of mics. HomePod would have to figure out how far away you are from it, and how far you are from the vacuum, to calculate the appropriate volume to reply with so that you could easily hear it.
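As a rough illustration of that calculation, here is a sketch that assumes idealized free-field spreading (level drops by 20·log10(d) dB relative to 1 m) and a fixed margin above the noise at the listener. The function names, the 6 dB margin, and the example levels are assumptions; real rooms are reverberant, so treat the numbers as ballpark only.

```python
import math

def level_at_distance(level_at_1m_db, distance_m):
    """Free-field point source: level falls by 20*log10(d) dB relative to 1 m."""
    return level_at_1m_db - 20.0 * math.log10(distance_m)

def required_reply_level(noise_at_1m_db, noise_to_listener_m,
                         speaker_to_listener_m, margin_db=6.0):
    """Level (dB at 1 m from the speaker) needed to beat the noise at the listener."""
    noise_at_listener = level_at_distance(noise_at_1m_db, noise_to_listener_m)
    target_at_listener = noise_at_listener + margin_db
    # Undo the spreading loss between the speaker and the listener.
    return target_at_listener + 20.0 * math.log10(speaker_to_listener_m)

# Hypothetical scene: a 70 dB (at 1 m) vacuum 2 m from the listener,
# with the speaker 4 m from the listener.
reply_db = required_reply_level(70.0, noise_to_listener_m=2.0,
                                speaker_to_listener_m=4.0)
```

With these made-up numbers the speaker would need roughly 82 dB at 1 m, about 12 dB louder than the vacuum itself, because the listener is twice as far from the speaker as from the noise.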

I was testing a HomePod as a conference speakerphone in a large open office environment, where we often have all-hands meetings, and it did extremely well, clearly picking up individuals who were quite far from the device and speaking at a normal volume. I would love Apple to enable some sort of "Office Mode" for the HomePod similar to what they've done with the Apple TV (given the name, I doubt it will happen anytime soon). Maybe that's something they could work on with Cisco? It performed better than the Jabra Speak 810, which is almost twice the cost. Has anyone else experienced this or used it in an office environment?

Cisco has the Webex Board (formerly SparkBoard). It has an array of 12 microphones on top and does audio processing to pick out who's talking, cancel the noise, and since it has an idea of where the participant is, the video part zooms into the speaker during a call.

Disclaimer: Yes I'm in that BU


Purchasing the low-end version of the Webex Board and paying for just one year of the subscription would cost as much as 21 HomePods, if my back-of-the-envelope math works out. Not exactly a HomePod competitor...


Amazon has Alexa for Business[1], which is a similar offering. It integrates with your conferencing platform (WebEx, Skype for Business, Exchange dial-in info) and you can use it as a conference phone and to control A/V in the room. It's pretty nifty and has some cool APIs for internal skills. The Echos have similar far-field microphones, so it should be around the same quality as the HomePod.

[1] https://aws.amazon.com/alexaforbusiness/

What is the "office mode" on Apple TV?

I think they refer to conference room display:


It's a mode where the Apple TV displays instructions on how to connect with AirPlay and a PIN code for authentication. It's a really handy feature in combination with Wi-Fi Direct, since it allows anyone in a conference room with a Mac or iDevice to present things.

We use it in meetings (though I prefer hooking up by cable, since AirPlay doesn't do 4K yet for 'screen sharing').

Is it better than a Chromecast? If I remember correctly, anyone with a Chrome browser (and apps with the feature added) can use the screen. Which gets around Apple only.

Last time I tried, Chrome has to be on a wireless network and the computer on the same network. The access point also shouldn't isolate clients.

Apple TV works with Wi-Fi Direct (P2P AirPlay).

Edit: ah Chromecast guest mode works differently. But we are an Apple shop anyway.

A few of my friends have Google Home, and I found it very interesting that they have to shout louder at the device when the music is louder. I've never had a problem with the HomePod hearing me, even if I whisper while music is playing. I have a HomePod in the living room about 15 feet from the bathroom and it can always hear me in the morning. I also find it's good at knowing which HomePod in my house should react, and at handing the request off to the HomePod if my iPhone picks it up. My only complaint is that if my iPhone is upside down, Hey Siri stops working on the HomePod.

I have a Google Home Mini and an Insignia (Best Buy house brand) smart speaker, and both seem to have much better microphones than my standard Google Home.

Both the Echo and HomePod have multi-microphone arrays vs the two on the Home, and this really shows in far-field settings (especially with loud music)

Do you mean if your phone is in your pocket upside down, Siri doesn't work on the HomePod? That's very weird.

No, it has to be lying flat, face down on the table, for this to happen. I presume it's a result of the iPhone's face-down detection, but I'm not sure why it prevents my HomePod from registering.

Does Siri on the phone respond when it is face down? You should contact AppleCare.

It doesn't, but it's not supposed to.

"Unlike Siri on iPhone, which operates close to the user’s mouth,"

I wish they wouldn't make this assumption. Siri would be much more useful sitting on my kitchen counter or coffee table. I now use the Echo for timers, music, weather, movie times, random questions, ...

I find that my iPhone is responsive to Siri requests if I'm within 8-10 feet. It's possible that training "Hey Siri" at this distance could help.

But at the end of the day, Apple won't optimize too much for this use case, since they want you to buy a HomePod, Apple Watch, or AirPods.

Also, there's power consumption to think about. The HomePod is plugged in 24/7, whereas you want your iPhone battery to last a long time.

As an aside, one neat side effect of hey Siri is that often multiple devices can hear you at the same time. In my office I have my XR, iPad, and for dev an iPhone 7 and 8. When I say hey Siri I often hear all of them start to respond, but then all but one cancels listening. I've never really paid attention to which one wins the battle, but it does usually seem to be the device closest to me.

Just thought it was interesting as something they had to accommodate -- multiple devices all phoning home to determine if there's a race, and if so who should field the request.

I think this is all done locally over Bluetooth: https://support.apple.com/en-us/HT208472

I believe it uses the same logic that Handoff does to determine what your "active" device is. If the "active" device hears the Siri request, it responds. Otherwise I think it prioritizes HomePod.

If multiple HomePods hear it, I'm not sure, possibly using volume to determine distance?
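The guess above can be written down as a small ranking rule. This is purely a sketch of the heuristic described in this comment (active device first, then HomePod, then whichever heard the phrase loudest), not Apple's actual arbitration logic; the `Candidate` fields and device names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    kind: str               # e.g. "iphone", "ipad", "homepod"
    recently_active: bool   # hypothetical "user was just using this device" flag
    heard_level_db: float   # how loudly this device heard the trigger phrase

def pick_responder(candidates):
    """Pick one device: active device first, then HomePods, then the loudest capture."""
    def rank(c):
        # Tuples compare element by element, so earlier fields dominate.
        return (c.recently_active, c.kind == "homepod", c.heard_level_db)
    return max(candidates, key=rank)

devices = [
    Candidate("iPhone", "iphone", False, -20.0),
    Candidate("Office HomePod", "homepod", False, -35.0),
    Candidate("Living Room HomePod", "homepod", False, -28.0),
]
winner = pick_responder(devices)   # the Living Room HomePod in this example
```

Ranking by received level rather than by estimated geometric distance means a nearby but acoustically blocked device loses to one that simply heard the request better.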

Since they have an array of six mics, they could triangulate the sound to calculate the distances.
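For a sense of what a mic array can measure, here is a sketch of time-difference-of-arrival estimation between two mics using brute-force cross-correlation. Note that a single compact array estimates direction of arrival much more directly than range, so this computes an angle rather than a distance. The 10 cm spacing, 16 kHz sample rate, and function names are assumptions, and nothing here is taken from the HomePod's implementation.

```python
import math
import random

def estimate_delay(reference, delayed, max_lag):
    """Lag (in samples) at which `delayed` best matches `reference`."""
    def correlation(lag):
        return sum(
            reference[n] * delayed[n + lag]
            for n in range(len(reference))
            if 0 <= n + lag < len(delayed)
        )
    return max(range(-max_lag, max_lag + 1), key=correlation)

def arrival_angle_deg(lag_samples, sample_rate_hz, mic_spacing_m, speed_of_sound=343.0):
    """Far-field angle from broadside implied by the inter-mic delay."""
    path_difference = speed_of_sound * lag_samples / sample_rate_hz
    # Clamp so measurement noise cannot push asin() out of its domain.
    ratio = max(-1.0, min(1.0, path_difference / mic_spacing_m))
    return math.degrees(math.asin(ratio))

rng = random.Random(1)
source = [rng.uniform(-1.0, 1.0) for _ in range(400)]
mic_a = source
mic_b = [0.0] * 3 + source[:-3]     # same sound, arriving 3 samples later
lag = estimate_delay(mic_a, mic_b, max_lag=10)
angle = arrival_angle_deg(lag, sample_rate_hz=16000, mic_spacing_m=0.1)
```

Two devices that each produce a bearing like this could in principle intersect them to locate the talker, but that would require the devices to know their own positions relative to each other.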

Wouldn't a strict volume check be better in general? If I'm physically closer to one HomePod but something is in the way, obscuring audio, then I probably want to talk to the HomePod that's further away but unobscured. Or to put it another way, if I'm standing in my living room next to the door to the Office, the Office HomePod might be physically closer, but I'm probably trying to talk to the Living Room one.

I use this when I need to locate my phone in a room.

"Hey Siri! Where are you?"

Your iPhone is always listening for “hey Siri” in a power-optimized way. It’s just not doing all the fancy things the HomePod is doing because its use case (and hardware) is different.

Yep, and it seems that it's also primed to respond if there's been recent motion, versus sitting on the table.

iPhones are not optimized for far-field use, though (which typically requires a mic array different than what's on iPhone). Echo has seven highly directional mics pointed in all different directions, and software which detects and isolates the source of speech. iPhone has two omnidirectional mics[1], on the bottom of the phone optimized for how you hold it in your hand.

[1]I am ignoring the other two mics on iPhone, which are not relevant: one on the back for video recording, and one in the earpiece for noise cancellation
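One reason more mics help in the far field: with several channels you can align them on the talker and average, which reinforces the speech while unrelated noise partially cancels. Below is a sketch of the simplest such method, delay-and-sum beamforming. It is a generic textbook technique and not a claim about what the Echo or HomePod do; the delays, noise level, and function names are assumptions.

```python
import random

def delay_and_sum(channels, delays):
    """Advance each channel by its known delay (in samples) and average them."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]

# Toy array: one talker reaches three mics 0, 2 and 5 samples late, and each
# mic adds its own independent noise.
rng = random.Random(2)
speech = [rng.uniform(-1.0, 1.0) for _ in range(500)]
delays = [0, 2, 5]
total = len(speech) + max(delays)
channels = []
for d in delays:
    channel = [0.0] * total
    for n, s in enumerate(speech):
        channel[n + d] = s
    channels.append([c + rng.uniform(-0.5, 0.5) for c in channel])

steered = delay_and_sum(channels, delays)

def mean_square_error(estimate, truth):
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)

single_mic_error = mean_square_error(channels[0], speech)
steered_error = mean_square_error(steered, speech)
```

With independent noise on each of N mics, averaging cuts the noise power by roughly a factor of N, so the three-mic output here is noticeably cleaner than any single channel. Two closely spaced mics offer far less of this benefit.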

Yes, I am aware of all this, and that's why I bought a couple of Echo Dots.

My point is that I wish Apple didn't assume I would be holding my iPhone in my hand. I have Xs, for example. It doesn't need to operate from a different room but unless I'm holding my phone, I just don't bother.

Adding an additional microphone, etc would make Siri more useful while it's nearby.

I'm really looking forward to yelling into my kitchen to start my Apple Music on my Echo in a few weeks.

Let me get this straight.

You want Apple to put more mics into the iPhone, so that it can better address a use-case that Apple is already addressing with a well-engineered product built expressly for that purpose?


Let me clarify for you. I would like my $1300 phone to answer me when it’s sitting 3 feet away from me. I wanna start music, I want to pause music, I want to ask it questions, and I want it to work reliably.

When a timer stops, for example, I want to be able to say "hey Siri stop timer."

It’s perfectly acceptable if Apple doesn’t see this as a use case. I’ll simply take my business somewhere else to solve that problem. I'm simply throwing it out as a nice to have.

> I want my $1300 phone to answer me when it’s sitting 3 feet away from me.

Is this a problem you have? I'm able to reliably use Siri on my phone from up to maybe 2-3 feet away in a moderately noisy environment (casual conversation around a table) with almost no issue, and I can easily use it up to 10+ feet away in a quiet environment. I've had this success since the 7+ at least (when I tested it), but I've continued to see such performance on the X and XS Max.

Try saying:

“Hey Siri, create a 10 second timer”

“Hey Siri, play some Rolling Stones”

Find yourself picking up the phone to stop these two actions?

Hm are you sure you have `Listen for "Hey Siri"` on?

Yes. Siri will respond to those two questions.

Did you try each then try to stop them with Siri?

Oh oh I see you're saying the requests you listed work but "hey Siri stop the timer" and "hey Siri stop the music" don't

Which absolutely does work. Not really sure what the OP means here.

Well, the timer does sort of work... but NOT REALLY.

I tried it several times. It worked 4 out of 5 times on the phone. You know what it doesn't do?

It doesn't stop the timer on my Apple Watch, which keeps buzzing even after my phone stops. The Watch stops if I press the button.

I couldn't get the Music to work.

So, can we be done? I have a couple of Amazon Echos. They just work as a hands-free voice activated device.

I just have a HomePod but I have told Siri to start/stop the music literally every morning since I bought it.

I think he's running into the issue that when the phone is playing loud audio through the speakers, it has trouble picking up voice over that. iPhones don't have the same audio separation hardware as HomePods.

Yes, this is why my initial comment was a bit snarky. GP wants his phone to have properties it fundamentally cannot have, and is not expecting that of any other phone.

So your next phone is going to be a non-iPhone if they don't improve "Hey Siri"?

I understand that you would like "Hey Siri" to be better. What is not clear to me is how you think Echo Dots are a "competitor" to your phone. Echo Dots are purpose-built Home Assistant hardware. Just like the HomePod. No phone is going to perform nearly as well as a purpose-built home assistant; it's a hardware limitation. No matter how many mics you put in any phone, the far-field performance is not going to be great. It's like you want your sedan to be just as good at carrying heavy loads as your pickup truck. That's ridiculous. And as far as Home Assistants go, the HomePod's mic/noise setup is hands down the best.

So if you find yourself constantly asking questions of your phone while it's sitting in your living room or kitchen or bedroom or wherever it is you're setting timers and playing music (I'm going to bet it's the kitchen), buy a HomePod for your kitchen. That is the solution to your problem.

Or you could get Google Home products. They are also better than the Echo line. Sounds to me like you're just disappointed you bought the worst Home Assistant products on the market.

Maybe your phone is malfunctioning? My iPhone 7 easily responds to Hey Siri from 10 or so feet away. It's not as good as HomePod, but seems to work well enough.

> Siri would be much more useful sitting on my kitchen counter or coffee table.

That's called "a HomePod".

> I now use the Echo for timers, music, weather, movie times, random questions, ...

"Because the iPhone does not fill the role of 'smart speaker', instead of buying Apple's smart speaker I will purchase someone else's smart speaker."

To each their own, but to me it's like complaining how the Apple Watch makes for a shitty wireless speaker.

I have a HomePod, a google home, several Echos, and a Sonos setup.

The HomePod is incredible for voice recognition of basic commands and that makes a lot of sense based on the work they describe in this blog post. HomeKit has also really shaped up and at this point I use the HomePod as a home hub. In the small apartment I live in I’m able to shout commands from another room for things like switching the lights.

That said, HomePod lacks integration with any services or devices outside of Apple's ecosystem (ignoring HomeKit). You pretty much have to use AirPlay for almost everything unless you want to use that godforsaken Apple Music. That's fine, AirPlay is fantastic, and there are even some good open source implementations too, but it means the voice assistant aspect of HomePod is severely diminished. So I end up using the Echo for almost everything because I can hook one up to a pair of speakers and boom, I’ve turned them into Bluetooth speakers, or I can use the Echo to control Spotify on my Sonos multi-room audio setup (and no, AirPlay on the HomePod is still not as good for this use case imho).

The Google Home is a clever search engine in a speaker, but for some real-world queries it’ll do stupid things like read Wikipedia instead of telling you the thing you want. So at this point I only use the Google Home Mini as a cheap bathroom speaker because it’s better than the old-gen Echo Dot (until my new Echo Dot 3 arrives). The Google Home overall is the least compelling of all of these.

Oh yeah, did I mention HomePod makes a fantastic speaker for an Apple TV (or iPhone/MacBook) connected to a TV for watching video? The low audio network latency of AirPlay makes it a great stand-in for a soundbar.

I have a $400 speaker in my kitchen. I use it with my $30 Echo Dot and not my $1300 iPhone because the Dot is a much better experience.

On an iOS device, Siri will still do its best to work from a distance but there are limits to what can be done with a tiny phone-sized microphone.

You need far field microphones for that. Which is what Echo and HomePod have.

The article is literally about Apple's Echo competitor.

That's a little reductionist - it's a paper in Apple's Machine Learning journal that has a lot of actual technical details, rather than an ad / marketing material.

Uhh I think "journal" is a little generous (if we compare to actual papers/journals), even if that is the term Apple chooses to use. This is a well-written scientific article (maybe short paper, but not a regular-length paper) published on Apple's Machine Learning-focused blog.

My point isn't to be nitpicky. I just think the truth is that this article falls somewhere between an accessible digest article (e.g., Ars Technica) and an actual paper (e.g., a publication in NeurIPS or EMNLP).

For a related (but less complicated) project, see Mozilla's RNNoise project, which uses neural networks to better tune classical signal processing algorithms to reduce noise in recorded speech: https://people.xiph.org/~jm/demo/rnnoise/

Does anybody else wonder how advanced audio surveillance tech is, if commercial off the shelf consumer products are _this good_ at eavesdropping?

Has anyone ever tried putting a HomePod at the focus of a large parabolic (audio) reflector, and pointing it at their neighbours' windows or people across the park?

The parabolic reflector might be unnecessary...

When I was a kid, I had this toy that was a cardboard tube with a 12-14" plastic parabolic dish on it and a microphone at the focus with earphones. It was surprisingly effective at letting you hear conversations from long-ish distances.

Reading this page made me wonder just how good the NSA's version of that toy is these days assuming they've got multi-microphone and signal processing gear at least as good as (and probably an order or two of magnitude better than) what Apple will sell you for a few hundred bucks...

High-gain 2.4/5GHz antennas aren't "necessary" for wardriving, but they make new things possible if you do use them...

In my experience they over-optimized, with HomePod now responding to "hey Sara" instead of Siri. Drives my partner nuts.

"Hey Sara" makes sense to me. Sort of. Because there's at least a "Hey" trigger.

Why my wife's iPhone responds every time she says, "Are you serious?" makes less sense. Especially since the response from Siri doesn't show her inquiry string on the screen, just Siri's response, which is usually, "I'm always serious."

That's because people have accents.

My partner is Korean and I am Australian for example. Two very different approaches to speaking English.

You would hope that the HomePod would be usable by both of us.

If you have a HomePod, try saying "Hey, Hugh!". I'm getting false positives with 'strayan accent.

We've been getting lots of false positives, as well, but it seems to have started happening suddenly. I wonder if it's related to a recent software update.

We often get false positives as well. I've wished there were some way to signal a false positive to the device after Siri is activated, like "Siri, I didn't ask for you," to help it learn/train it. Not sure how feasible such a function would be, but I imagine it would be helpful feedback down the line on the engineering end. Anything to reduce the false positives would be great as they are a) a nuisance and b) further confirmation to friends/family who witness it, that we've made a mistake allowing a "surveillance" device in our home.

I would guess that they are inferring this based on whether the device gets another request in rapid succession.

Have you noticed that certain queries that used to work have stopped giving you the expected response?

I've started noticing more false positives since iOS 12 launched, and there's a couple requests that now don't give me the expected response. The most annoying issue happens to be requesting Siri to set an alarm. I'll ask: "Hey Siri, turn on my 6:15am alarm." to which she'll respond, "Which alarm do you want to turn on..." and then list all the different alarms I've set on the HomePod.

This all changed with iOS 12, and I wonder if I should reach out to Apple or just live with it and hope it improves with a software update??

If you think about what you asked Siri it's a sensible response.

You didn't ask her to set an alarm for 6:15am. You asked her to turn on an existing 6:15am alarm.

Since I assume that alarm didn't exist, Siri asked which of the others you wanted to turn on.

Late response, but here's the thing...that alarm does exist and is listed off when Siri goes through all the existing alarm options that can be turned on...

Haven't noticed certain commands no longer working, personally.

At some point in the future, I'd rather wear a BT headset most of the time in some inconspicuous manner and use that for notifications and requests. The device and headset could know if I'm focused and wait until I come up for air for lower priority notifications. I would always have a mic handy for vocal requests.

The last stop before true integration where I just have to think my request...

Sounds sort of cool, but am I the only one not interested in vocalizing my computer UI? I’d just rather not interact with technology that way. I perceive it as a direct competitor for what I’d rather have “in the air” - music, etc.
