iPhones will be able to speak in your voice with 15 minutes of training (theverge.com)
39 points by mfiguiere on May 16, 2023 | 62 comments



This Verge article is very confusing:

"Apple’s new accessibility features can assist those who’ve lost the ability to speak.... users can create a Personal Voice by reading a set of text prompts aloud for a total of 15 minutes of audio on the iPhone or iPad."

Huh? I presume it's for people who anticipate losing their voice due to a medical condition; nevertheless, the article is really badly put together.


I think this is confusingly phrased in the press release too, so it's probably on Apple rather than the Verge.

> With Live Speech on iPhone, iPad, and Mac, users can type what they want to say to have it be spoken out loud during phone and FaceTime calls as well as in-person conversations. Users can also save commonly used phrases to chime in quickly during lively conversation with family, friends, and colleagues. Live Speech has been designed to support millions of people globally who are unable to speak or who have lost their speech over time.

It's clearer later on that it's for "at risk" users, but I guess "millions of" was a marketing requirement.

> For users at risk of losing their ability to speak — such as those with a recent diagnosis of ALS (amyotrophic lateral sclerosis) or other conditions that can progressively impact speaking ability — Personal Voice is a simple and secure way to create a voice that sounds like them.

[0] https://www.apple.com/newsroom/2023/05/apple-previews-live-s...


Live Speech and Personal Voice are two different features; that's why you're getting confused. You can use Personal Voice to customize the Live Speech voice, but you don't have to.


It is poorly phrased, but "unable to speak" could refer equally to people who are entirely unable to speak or to people who are unable to speak for prolonged periods of time.


Interesting. Stealing a phone in the future will also mean stealing someone's voice, and with it things like bank authorization.


They already do that now. People are getting phished for their SMS codes. As a side note, my local news just reported a big crime spree of people robbing cashless places. They have the manager log into the Clover POS machines and issue refunds onto their cards.

Criminals have been high-tech for a while now.


do they just run to the ATM afterwards?


Probably better to have someone else wait at an ATM ready to withdraw once they get a text message.


This is alarmist at best. Does stealing someone's iPhone today mean that you've stolen their passwords and everything else? No. Why would this change?


People are being held up and forced to enter their passwords, though it's definitely rarer than before because of the technical hurdles Apple has thankfully added.

It's just interesting thinking about edge cases and what a stolen phone will mean for your identity 10-20 years from now.


What kind of banks do authorization by voice?


The Australian tax office and various other entities use a vocal print of "In Australia, my voice identifies me." I wish I was kidding.

https://amp.theguardian.com/technology/2023/mar/16/voice-sys...


‘My voice is my passport please verify me.’

“…You’re right. I can’t kill my friend.”

::turns to goon::

“Kill my friend.”


What if you don't speak English? I guess it wouldn't really matter, but if you were learning English it might not work later once your pronunciation improved.


Schwab, and they are pushing it hard.


It seems like all the brokerages got sold hard on this. I have a friend who did customer support for Fidelity and they also pushed the voice ID thing very hard. I think it was to the point they wouldn't help you until you set it up.

I have had Schwab accounts for over a decade and called them a couple times, and they didn't bug me about it. Would still prefer WebAuthn to log in, however.


Had a Fidelity 401k from work that I was trying to roll over to a Schwab IRA. Schwab mentioned setting up the voice password thingy, and I declined and had no further problem. Fidelity tried to push it on me before I could talk to anyone. I refused and had to dial 0 several times for it to transfer me to a person. Glad I no longer have a Fidelity account; if Schwab ever tries that BS I'll raise hell with them.


"At Schwab, my voice is my password"


Vanguard too.


Fidelity does it too


It's actually becoming more common. I'm not sure how many banks have implemented it, but when I called Spectrum about my internet, they were advertising a "security" feature called Voice ID that lets you verify your identity with your voice.


Wonder if it's "perfect" enough for that or if they'll add some sort of audio fingerprint (like the invisible dots on the printers).


Some "partners and customers" listed here:

https://www.pindrop.com/whos-it-for/banking-finance

In general, financial institutions (such as the many multi-nationals and regionals that use this) don't want to talk about the specific providers they use, so it can be hard to get a comprehensive view of which are using what.

But everyone who is anyone in banking evaluated this provider, for example.


Pretty sure that gov.uk uses or used voice authentication.

Banks certainly do: https://www.vice.com/en/article/dy7axa/how-i-broke-into-a-ba...


HSBC has the option; I have never enabled it even though they continually push me to.


Besides, being able to send voice messages to third parties with someone else's voice is a security nightmare.


Fidelity does it. I've had to explicitly decline 'voice authorization' multiple times.


American banks. Of course.


Now you can steal someone's fingerprints, DNA, phone, and voice all at the same time. Efficiency!


And enable fake ransom calls. What will we be able to trust?


I remember the days of people buying novelty voices for their car's navigation unit. I wonder if this will become a viable business for phones too. I'm sure there's money to be made by making Darth Vader read your messages to you.


Darth Vader now in "Return of the Jamba Spar-Abo" (this only makes sense to Germans, if at all ;)


Jamster, Jamba's international name, may ring a bell, though. :D


Darth Vader in TomTom was the best. https://m.youtube.com/watch?v=o9Oso7199WE


The Jeremy Clarkson one was pretty great, too.


The Alzabo is a monster from Gene Wolfe's 'Book of the New Sun' series that eats its victims and absorbs their thoughts and memories. It then speaks in the voice of its victims to lure the kinfolk of those it's eaten.


Ironically I want an iPhone feature that allows me to speak in a generic voice so scammers don't clone my voice.


Heavy EU accent Siri incoming!


What exactly is an EU accent?


Eastern Europeans pronounce TH and R in a specific way in English.

https://www.youtube.com/watch?v=A5onfpBVs3M


EU != Eastern Europe

EU == European Union


Yikes


Which is exactly what I said to myself when you posted EU accent.


If this is the case, why haven't we trained Echo devices to speak with Majel Barrett-Roddenberry's voice yet?


Between voice.ai and that audiobook John de Lancie and Majel Barrett recorded (Q-in-Law), I think it should be possible to run such a setup privately at least. Amazon would obviously need to get the copyright situation in order first.

Paired with projects like https://github.com/toverainc/willow I think you could recreate the Enterprise at home if you have enough IoT stuff in your home.


My voice is my passport. Verify me.


How many languages are supported?


My voice is my passport.


Ha! The article made me imagine a scene like this one in Sneakers, where over the course of dinner someone tricks their target into saying the complete set of training phrases for one of these iPhone voices. But then I realized that if the goal was to impersonate someone's voice, they'd probably just use a different voice model that was more flexible about its training data.


Verify me.


My first thought on reading this went to capturing the voice of a parent or family member. Imagine being able to talk with them after they pass...


I'm more creeped out by that than anything. I understand that losing a loved one is difficult, but something like this just seems like one of those things that would make moving on much more difficult. Photos/videos of things that actually happened are one thing. Having an artificial computer voice that can lead one to think they are really speaking to someone is just creepy, and I could see where someone with actual mental issues could get really confused.


I'm sure I listened to a sci-fi short story like this recently, but it was just text responses. It's a strange and interesting thought. I know that even listening to the voices of my children when they were little can be really emotional for me.

Having a deceased relative respond would be really odd, but could be soothing. What an odd world we live in now.


With the dead person's agency no longer involved, even without malicious intent, any feelings or behaviour such a construct might generate in its users will be artificial and manipulative.

This type of AI would have the potential to become a true basilisk with poisonous breath and all.


> With the dead person's agency no longer involved, even without malicious intent, any feelings or behaviour such a construct might generate in its users will be artificial and manipulative.

Memories are like that. Do we ever truly know anybody?


We each have our own versions of the world, true, but at least when I think about my grandma I'm only fooling myself.


It's even better when a family member loses the ability to speak. Someone with a stroke may still be able to type words and this can help them deal with the frustration of not being heard.


First a tracking tag to facilitate stalking, now a software kit to help simulate the voice of a person who may or may not consent. Entirely mass-produced. What could go wrong, and why isn't Apple concerned?


Apple’s press release specifically notes that the training lines are randomized. You’d need to get 15 minutes of audio of the person saying the exact lines that the training asks for. Doesn’t seem like a huge risk.


15 minutes.

I doubt the majority of people would sit down to say "The pleasure of Busby's company is what I most enjoy. He put a tack on Miss Yancy's chair, when she called him a horrible boy." just to try out this feature.



15 minutes of randomized phrases. There are better tools for malicious voice cloning. Seems pretty safe. And Apple has put more tools into AirTags to prevent malicious uses than any other tracking device. And to your last point, Apple is concerned; they are running it all on-device and trying to make it as safe as possible while still being useful.



