
We did explore using standard protocols in order to support third-party headsets. The most natural solution would be to listen for AT+BVRA (the voice recognition command), which most headsets generate after some button is held down for a couple of seconds. It didn't fit with our desired UX, though. We wanted a hold-while-talking UX, rather than hold for a couple of seconds, wait for a beep, then release and start talking.
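For anyone curious what "listening for AT+BVRA" looks like in practice, here is a rough sketch, not what the Translate team built. In the Hands-Free Profile the headset sends `AT+BVRA=1` to start voice recognition and `AT+BVRA=0` to stop it. The function name `parse_bvra` is made up for illustration, and the sketch only handles those two values and assumes the AT line has already been pulled off the RFCOMM channel as a string.

```python
import re
from typing import Optional

# Matches only the basic activate/deactivate forms of the command.
_BVRA = re.compile(r"^AT\+BVRA=([01])$")


def parse_bvra(line: str) -> Optional[bool]:
    """Return True for activate, False for deactivate, None for any other line.

    `line` is one AT command as received from the headset (hypothetical input;
    reading it from the transport is out of scope here).
    """
    match = _BVRA.match(line.strip().upper())
    if match is None:
        return None
    return match.group(1) == "1"
```

Note that this only yields a single "start" trigger after the long press, which is why it does not map onto a hold-while-talking interaction.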

We thought about listening for AVRCP key events to detect when a certain button was pressed and released -- probably play/pause, which seems to be the most prominent button on most headsets. It would have been hacky, though, and we ran into several problems. For one thing, a lot of headsets power off if the play/pause button is held down for several seconds.
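To make the press/release idea concrete, here is a minimal sketch of the hold-while-talking logic, assuming something upstream already delivers key-pressed and key-released events with timestamps. The class name `HoldToTalk`, the `on_key` method, and the 0.3 second minimum hold are all hypothetical choices for illustration, not anything from the actual product.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class HoldToTalk:
    # Holds shorter than this are treated as ordinary taps (assumed value).
    min_hold_s: float = 0.3
    _pressed_at: Optional[float] = None
    # Completed (start, end) talk segments, in seconds.
    segments: List[Tuple[float, float]] = field(default_factory=list)

    def on_key(self, pressed: bool, t: float) -> None:
        """Feed one key event: pressed=True on press, False on release."""
        if pressed:
            # Ignore repeated press events while the key is already down.
            if self._pressed_at is None:
                self._pressed_at = t
        elif self._pressed_at is not None:
            start = self._pressed_at
            self._pressed_at = None
            if t - start >= self.min_hold_s:
                self.segments.append((start, t))

    @property
    def talking(self) -> bool:
        """True while the key is held down."""
        return self._pressed_at is not None
```

A caller would start capturing audio when `talking` becomes true and hand the segment to the recognizer on release. This does nothing about the power-off problem: if the headset shuts down after a few seconds of holding the button, no amount of host-side logic helps.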

We also had concerns about audio quality with third-party headsets, especially those that didn't support modern versions of SCO (which introduced new codecs with 16 kHz support and other improvements), or that had poor antennas leading to high packet loss (SCO is a lossy protocol, so we still get some speech and attempt to translate it, but accuracy suffers). We were concerned that any accuracy problems would make Google Translate look bad, even if the headset was to blame.

Interesting, thanks for the reply.

What about an app that you interact with rather than something physical on the headphones? Considering the phone is required, anyway...

If I'm understanding you correctly, it's already supported. The Google Translate app has a "conversation mode" where two people can take turns speaking into the phone - https://support.google.com/translate/answer/6142474?co=GENIE...

The problem that the Pixel Buds integration aims to solve is that it gets cumbersome for two people to hand a phone back and forth. With the new UX, one person can use the phone while the other uses Pixel Buds for the duration of the conversation.

I think what he was saying is just move the "button" on the headphones to a button on the phone, because they are already holding it. Then the headphones can just be in a normal "audio call" mode, and the phone button triggers the translation engine to stream data...
