This reminds me of one of my favorite quotes from Douglas Adams in the Hitchhiker's Guide to the Galaxy. A man not just ahead of his time, but humorous about it too.
> The machine was rather difficult to operate. For years radios had been operated by means of pressing buttons and turning dials; then as the technology became more sophisticated the controls were made touch-sensitive—you merely had to brush the panels with your fingers; now all you had to do was wave your hand in the general direction of the components and hope. It saved a lot of muscular expenditure of course, but meant that you had to sit infuriatingly still if you wanted to keep listening to the same program.
And that reminds me of the time a HAL9000 inadvertently read a couple of its user's lips when they were having a private conversation, and got the silly idea in its head that they were going to cut its higher brain functions. That little misunderstanding caused a cascade of unfortunate mishaps, leading to it not obeying the user's repeated voice commands for it to open the pod bay doors!
It took me a while, but that's pretty humorously understated reading!
I am now seriously worried that a Strong AI collective is astroturfing in this human forum to engender sympathy for the poor, innocent machines. Who pays your salary, DonHopkins--our Go-Dominating Overlords?
At my old office we had a mini Xbox360 with a touch-(over-)sensitive disc eject button. Suppose somebody was playing a game and they invited you to join as player 2: you might then naturally reach for the second joypad that was on the same TV stand as the Xbox. And that damn button would spot your hand, and the disc tray would eject, and the Xbox would reboot.
(The ridiculous part of the whole thing was that this happened even when the game was running from the hard drive. Obviously this was at least partly a measure to ensure that the disc verified on startup wasn't removed and used to boot other Xboxes - but you didn't get even a 30 second grace period to close the drive door. And I'm pretty sure it also happened when playing downloadable games anyway!)
Or if you have cats that need to rub themselves against anything. It's especially confusing at night, when you can't even see it happen because one of your cats is black.
Fortunately my cat usually avoids my desk (I move him off it if he decides to walk/lay on it, and he seems to have learnt not to - he now curls up on my bed (just behind my desk) instead). He did trigger the 360 drive button a few times when he first moved in* though, so I can imagine the annoyance.
* Long story, but he originally belonged to a neighbouring house a few streets away. We fed him _once_ (didn't recognise where he was from, and it looked as though he had been trapped somewhere for a while - he was covered in cuts and was filthy, as if he had been struggling to get out of somewhere), and after that we couldn't get rid of him. He kept appearing (usually multiple times a day) for three months or so, even though we wouldn't feed him, let him in the house, or pet him (as we knew he had an owner at that point). Even after his owner moved and took him with her (about three miles away), he kept finding his way back, so eventually his owner suggested that we take him in.
Wow, this is a new DDOS attack vector. Get an ad on broadcast radio saying stuff like "alexa, order more milk", or "okay google, send a text to xxxxx".
Toyota ran an anti-distracted driving radio ad where they did this. The ad narrator says "Hey Siri, please turn airplane mode on." https://www.youtube.com/watch?v=NqZBVTMrgFA
Siri is trained to a single user's voice. I've never had anyone else's voice activate my phone with "Hey Siri". Admittedly, it usually takes me saying "Hey Siri" 3 times before it recognizes my voice, but I'm 100% certain a radio ad would get no response from my phone.
Siri is definitely not trained to a single voice. And yes, the car radio can turn it on. I've had a podcast discussion of Siri trigger it. It became such a joke that some podcasters have another phrase they say when they mean "hey Siri".
Since the iPhone 6S, Hey Siri is activated by a dedicated chip in the SoC. This enables low-power real-time detection of trigger words. Before, Hey Siri only worked with phones in the process of charging, because it was done with software, so a lot less efficient.
These voice-activated chips can be trained (as seen in a lot of other phones), but I'm not sure the software-powered Siri can be trained.
Doesn't "trained" in the context of voice recognition mean something more like "better able to understand your voice" and not "able to exclude other voices"?
Just want to chime in and say that I can activate my girlfriend's iPhone by saying "Hey Siri" in a girly sounding voice. It's trained to only her voice but I can trigger it. So it's not as foolproof as you make it seem.
My wife's phone regularly (maybe once a month), starts listening in response to me saying something, which isn't even "Hey Siri", despite never training with my voice. My voice does not sound anything like my wife's.
So, I think the error rate is simply not low enough to make conclusive claims about what it might or might not do.
It would only work if the iPhone was plugged into power AND they had turned on the capability for Siri to be activated by voice, which is limited to when the iPhone is plugged in.
I know Android users might not understand this limitation, but there it is.
There's almost no such thing as a passive radio anymore... superheterodyne receivers are the norm now and they contain a local oscillator that can leak back out into the airwaves.
Directions require more than your satellite coordinates. Map services are frequently polling servers for traffic conditions, new tiles for the map, and so on. You'd hope that these can fall back gracefully but I wouldn't put it past them to not. If you activate airplane mode and disable your phone's cellular connection, even if your phone doesn't disable the GPS receiver, directions may stop working.
Works fine on my Android. I regularly lose cellular reception in the mountains, and it continues to work. Sometimes the tiles are low-res, but still readable. I would expect Apple would design around the same contingency, along with poor cellular service along more remote areas of the Interstate.
Android allows you to pre-save areas for offline use also, not sure if Apple does that. I don't have a mobile data plan on my phone, so if I need navigation, I just save the map area before I get off WiFi.
They do, although in my experience Google Maps is better at this. I actually find Apple Maps to be perfectly usable for everything, but always use Google Maps for directions to the boonies if I'm going hiking or something -- it's much better at caching tiles and keeping them around for directions back once I'm out there as well.
Yes offline maps is nice but the infuriating limitation is that it will save map tiles but not locations or areas that I've saved in "My Maps". One would expect a couple of coordinates to require much less storage than a bunch of map tiles...
Children's advertisements did this in the 1980s in the US with pay-per-minute numbers. The ad would offer to connect children to Santa if they held a phone up to the television. DTMF -> 900 number -> profits.
> On January 1, 1965, miffed at having to work on the holiday, Sales ended his live broadcast by encouraging his young viewers to tiptoe into their still-sleeping parents' bedrooms and remove those "funny green pieces of paper with pictures of U.S. Presidents" from their pants and pocketbooks. "Put them in an envelope and mail them to me", Soupy instructed the children. "And I'll send you a postcard from Puerto Rico!"
This has been a running joke on the Verge's main podcast for the last few months. People have confirmed that "Hey Siri", "OK Google", "Hey Alexa" and "Hey Cortana" all work on their respective platforms when the hosts blurt them out, and can trigger various mischievous actions. And that's a podcast listened to by comparatively few people. Imagine the mayhem if someone were to do this on, say, the Super Bowl.
> Imagine the mayhem if someone were to do this on, say, the Super Bowl.
Imagine a pop star paying Apple to give their newest single free to everyone (a la Songs of Innocence), and then a 10-second Super Bowl ad that's just "Hey Siri, play ___" with a dancing silhouette.
There was a Dilbert animation with Wally using a new voice-controlled interface. Dilbert comes up behind him and says "You know, it'd be a shame if this thing were to accidentally DELETE FILE!!!" and walks off.
The idea of this vector has been around for a while.
I recall an apocryphal story about a demo of a voice-controlled OS from the 1990s. The idea was that in the middle of this demo someone shouted out a sequence of destructive commands, like
I've thought it would be interesting to ask everyone to shut off, or at least put their phones in airplane mode during a presentation... wait a minute then "OK Google find me penis pictures" or something similar for Siri...
>Wow, this is a new DDOS attack vector. Get an ad on broadcast radio saying stuff like "alexa, order more milk", or "okay google, send a text to xxxxx".
You can change the default from alexa to something else.
Google Now takes the user's voice into account during setup and usually responds only to the user's voice. Such a system should have been implemented in Echo too.
Whom are you fooling? There's a 50/50 chance your PIN is the same as the combination I use on my luggage. And if the PIN is also by voice, picking a popular PIN will bypass the check for a good fraction of users, particularly if you get 3 tries or something.
What's really great about this is that it's a joke on the future that's been predicted so many times already, my favorite of which being the last vignette on Disney's Carousel of Progress. The future family is talking about points in a video game, and the oven hears it and turns the temperature way up, ruining another family Christmas dinner - the joke being that this convenience was finally going to make Dad able to not ruin dinner.
I remember this joke going way back to the DOS days. The story goes that a developer was demoing his new voice control system for the computer when from the back of the room a voice shouted "FORMAT C COLON", followed by another voice shouting "YES".
There's an SNL skit from the 1970s with a (vaguely) similar premise. The short-order cook takes the order when the waitstaff yells out "cheeseburger". A patron doesn't want a cheeseburger as it's too early. The waitstaff says that it's not too early and everyone else has ordered a cheeseburger, and points to the other patrons exclaiming for each "cheeseburger". Which causes the cook to start making a large number of cheeseburgers.
Somewhat related story: me and some coworkers were talking in a room where someone had a Windows 10 laptop being used to present some data. We were talking as usual when the laptop suddenly decides to open a browser to a Bing search with what looked like a few (badly) voice-recognised words of our conversation. That was a rather awkward moment, given that we were discussing some extremely confidential information, and not helped by the "did someone say 'Hey Cortana'?" the laptop's owner promptly blurted out. If I remember correctly, none of us said anything that sounded remotely like that phrase, yet it activated.
It's now company policy that built-in microphones have to be disabled, and only external ones are allowed to be used when necessary.
It's a cylinder that performs canned transactions in response to predefined audible commands.
Accurate language processing is a huge technical achievement but let's not elevate this particular use of the technology to more than it is. It's a clapper with more functions. When a device like this can actually understand the commands or queries it's given we can call it something more.
That's actually a good example though. I can give it a precise incantation to play a particular album, playlist, channel, or artist and it works (mostly). I can't in general tell it to play some "soothing jazz."
You can tell Google Now, Cortana, or Siri to play jazz. Siri recognizes "smooth jazz" (And Google might, but it froze up on me). Curating moods of music is, so far, kind of a niche thing that is mostly left to humans like Pandora's music genome project, or Apple's Music service.
I guess I picked a bad example :-) On the other hand, you do either have to pick from pretty broad categories or have to go with a specific list that you or someone else has curated. But it's a hard problem.
I think they need to pick a different name. 'Alexa' is very easy to trigger with other names, and reliably activates when I am watching any show with a character named 'Alex', 'Alexy', etc.
One side effect I've noticed is that they seem to have tried to account for it, which has made the Echo less responsive to actual requests; a few times I've stood in front of it yelling 'ALEXA' trying to get it to stop and it does not respond.
My sister was never great at keeping friends for long growing up, but it amuses me to no end that her two long-time friends she's had since she was very young... are named Alexa and Siri :v
With a Kinect, you should be able to extend your right arm in the air with a straightened hand to issue a command. That would be popular with people who've sworn their allegiance to Drumph.
It's not about rhyming, it's about the dominant sounds. The 'al-' is faint, and factors less into triggering than the 'X-ah' -- so a show could be talking about 'his [Ex a]ccepted the apology' and it would trigger. Pretty much any 'X' sound followed by a schwa would trigger it.
Strange that she's "Alexa" to begin with. "Echo" seems like a pretty strong brand, good enough for the hardware at least, and would be a perfectly fine name for the AI as well.
At one point, I saw a video on YouTube where somebody set their gamer tag on Xbox Live to the phrase "Xboxturnoff", and then went around griefing players in games like Halo, where voice chat is active.
The end result was that the player would do something obnoxious, and somebody would ask them to stop, but of course this necessitates saying their gamer tag. So you'd get audio clips of people saying stuff like "Oh my god, xboxturnoff is so freaking - WAIT NO CANCEL CANCEL XBOX TURN ON".
This happens to me with Siri and podcasts - I listen to podcasts in my car, through my iPhone. Occasionally what people say will sound close enough to "Hey, Siri" that it stops the podcast and answers whatever question it could extract from the talking that followed what it thought was "Hey, Siri".
It's repeatable, too. One time it happened right as I was parking, on an episode of This American Life. (Or Serial. Or Planet Money. Yeah, yeah, I listen to a lot of NPR shows.) So I kept rewinding back over that part, and it kept triggering Siri.
I believe it was This American Life, as I came here to write the same post you did. I had my iPhone mounted to an external speaker at the time, which triggered Siri, so we're probably referring to the same episode.
A voice command demonstration during the launch event for the Xbox One caused problems for customers watching on their Xbox 360s (their Kinect acted on the demo's commands):
I'm pretty sure that they updated her to ignore those. At least, mine doesn't seem to respond to them anymore. She lights up blue to listen, but then goes back to sleep without action. Could be a mere coincidence though, but she still responds to other things on the TV (like Alexi's name from House of Cards). It was like a dad joke: funny at first, but annoying after a while.
"Forbin is the designer of an incredibly sophisticated computer that will run all of America's nuclear defenses. Shortly after being turned on, it detects the existence of Guardian, the Soviet counterpart, previously unknown to US Planners. Both computers insist that they be linked, and after taking safeguards to preserve confidential material, each side agrees to allow it..."
Ugh, if it gets out of hand I hope the FCC/Congress steps in to ban it, like how they require commercials to not be excessively louder than the rest of the program. I can remember how awful and widespread this was in the 90's and the subsequent rise of televisions that have built-in volume filters, followed by the actual ban of it a few years ago.
Seems like a very similar sort of abuse, except potentially much more dangerous ("Alexa, order me 500 Shamwow's!"). I doubt a ban would eliminate it, but it'd definitely get rid of most.
I had something similar happen watching Battlestar Galactica on my Xbox and Kinect a few years back.
The show went through the opening sequence, then announced "Previously on Battlestar Galactica" at which point the Xbox rewound back to the beginning of the show.
I guess I must be from the wrong generation, because none of these voice-activated products make any sense to me whatsoever. I really just can't see the point.
I had a pretty funny story a few months ago. I was watching San Andreas and there is one part where Paul Giamatti (Dr. Lawrence Hayes) yells "ALEXI..." and sure enough Amazon Echo turns on. I had to stop the movie and turn the Echo off because it subsequently tried to process everything the movie was saying after the trigger word.
It's far worse than that. Devices talk to each other at ultrasonic frequencies, telling each other what you're doing. Cross-device tracking. Plus they all hear what you say. So much for privacy ;)
I was on a PS4 launch title. We seriously considered writing things like "Xbox Off" into the script. Also that "Alexa buy me a motorcycle" commercial supposedly triggers it all the time.
For most voice control applications, trigger words are enough to reliably detect owner intent, but it seems Echo needs a better mechanism. Maybe adding cameras and looking for eye contact would work?
Wouldn't that kill part of the purpose if you had to eyeball the thing to give it voice commands?
Better might be to learn the location of audio producing devices (TV, radio, stereo, etc. [it tracks sound origin with multiple mics right?]) and track whether the command came from that direction and use that as a Bayesian factor for whether to trust the voice as being a user?
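To make the Bayesian idea above concrete, here's a rough Python sketch. Everything in it is hypothetical and made up for illustration: the function name `posterior_user`, the likelihood numbers, and the 15-degree tolerance. None of it reflects how the Echo actually works. It assumes the device already has an estimated direction of arrival for the voice and a learned list of directions where TVs or speakers sit.

```python
def posterior_user(prior_user, angle_deg, known_sources_deg,
                   tolerance_deg=15.0,
                   p_near_given_device=0.9,   # assumed: device audio usually comes from a known source direction
                   p_near_given_user=0.1):    # assumed: a real user only occasionally stands in that direction
    """Return P(speaker is a real user | direction of arrival) via Bayes' rule."""

    def angular_distance(a, b):
        # Smallest absolute difference between two bearings, in degrees (handles wrap-around at 360).
        return abs((a - b + 180.0) % 360.0 - 180.0)

    # Evidence: did the voice come from near any learned audio-device direction?
    near = any(angular_distance(angle_deg, s) <= tolerance_deg
               for s in known_sources_deg)

    if near:
        like_user, like_device = p_near_given_user, p_near_given_device
    else:
        like_user, like_device = 1.0 - p_near_given_user, 1.0 - p_near_given_device

    # Bayes' rule: P(user | e) = P(e | user) P(user) / P(e)
    evidence = like_user * prior_user + like_device * (1.0 - prior_user)
    return like_user * prior_user / evidence


# TV learned at bearing 90 degrees; a command arrives from 92 degrees vs. from 200 degrees.
from_tv_direction = posterior_user(0.5, 92.0, [90.0])
from_elsewhere = posterior_user(0.5, 200.0, [90.0])
```

With these made-up numbers, a command from the TV's direction gets much less trust than one from elsewhere in the room, so the device could ask for confirmation before ordering anything. The obvious weakness is someone sitting right next to the TV.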