[dupe] Laser-Based Audio Injection Attacks on Voice-Controllable Systems [pdf] (lightcommands.com)
218 points by SirLJ 15 days ago | 107 comments

This is a cool attack, but I'm not sure where you got "$14" from. The laser pointers were $18 for a pack of 3, but you need a separate laser driver like the $300 one the authors used, and probably an audio amplifier as well.

There are some videos of the attack at https://lightcommands.com/

Exactly. The title says "$14" and then shows this as the setup: https://i.imgur.com/SYUSIDs.png

"Google Assistant can be hijacked by a $2 pack of screws!"

So your typical Reddit/Hacker News new computer build.

"$300 for the whole thing. (Note: This doesn't include the GPU, motherboard, case, or monitor that I already had prior to building.)"

If you scroll further down, they show a significantly cheaper setup.

Agree, title should be more like "Light Commands: Laser-Based Audio Injection Attacks on Voice-Controllable Systems"

Particularly as this isn't exclusive to smart assistants. They found the same could be used against phones and tablets (albeit at slightly higher power, due to the lower sensitivity of their microphones).

Yes, but it's targeting the aforementioned software regardless of device. (Not that the HN title isn't terrible, I agree with thsowers on using the title from the actual paper.)

I think it's extendable, like adding extra voices to a phone call.

Stipulating that $14 might be off by as much as two orders of magnitude, cost remains an insignificant barrier for any adversary with access to a credit card.

We changed the title from "Siri, Google Assistant, and Amazon Alexa can be hijacked with $14 laser pointer".

Submitters: please don't editorialize like that. It breaks the site guidelines: https://news.ycombinator.com/newsguidelines.html

I foresee a market for decorative tchotchkes with open centers that sit over the mic to prevent these peripheral attacks.

Clickbait PDF

To be fair, the PDF's title is different, "Light Commands: Laser-Based Audio Injection Attacks on Voice-Controllable Systems".

Maybe we should change this submission's title to the PDF's title.

Modulating a laser at audio frequencies is pretty trivial.

You can probably inject the signal straight from the headphone output of a smartphone into the right pad on the laser pointer with a series current-limiting resistor and a coupling capacitor.

...if you don't care about audio quality and longevity of the laser.
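As a rough illustration of the modulation step (a minimal sketch; the bias and modulation-depth values are made up, not the authors' actual driver settings), the audio simply rides on top of the diode's DC operating current:

```python
import numpy as np

# Minimal sketch: amplitude-modulate audio onto a laser diode's DC
# bias current. BIAS_MA and DEPTH_MA are illustrative assumptions.
FS = 44_100        # sample rate, Hz
BIAS_MA = 30.0     # assumed DC operating current, mA
DEPTH_MA = 5.0     # assumed peak modulation swing, mA

def drive_current(audio):
    """Map audio samples in [-1, 1] to a laser drive current in mA."""
    audio = np.clip(audio, -1.0, 1.0)
    return BIAS_MA + DEPTH_MA * audio

t = np.arange(FS) / FS                 # one second of samples
tone = np.sin(2 * np.pi * 440 * t)     # 440 Hz test tone
current = drive_current(tone)          # stays within 25-35 mA
```

The real constraint, as the comment above notes, is keeping the swing inside the diode's safe operating range, so you neither clip the audio nor kill the laser.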


For those too lazy to RTFA:


In this paper we presented LightCommands, which is an attack that uses light to inject commands into voice-controllable systems from large distances. To mount the attack, an attacker transmits light modulated with an audio signal, which is converted back to the original audio signal within a microphone. We demonstrated LightCommands on many commercially available voice-controllable systems that use Siri, Portal, Google Assistant, and Alexa, obtaining successful command injections at a maximum distance of more than 100 meters while penetrating clear glass windows. Next, we highlight deficiencies in the security of voice-controllable systems, which leads to additional compromises of third-party hardware such as locks and cars. Better understanding of the physics behind the attack will benefit both new attacks and countermeasures. In particular, we can possibly use the same principle to mount other acoustic injection attacks (e.g., on motion sensors) using light. In addition, heating by laser can also be an effective way of injecting false signals to sensors.

I really liked their take on the countermeasures: "It is possible to reduce the amount of light reaching the microphone’s diaphragm using a barrier that physically blocks straight light beams [...] However, we note that such physical barriers are only effective to a certain point, as an attacker can always increase the laser power in an attempt to compensate for the cover-induced attenuation. Finally, in case such compensation is not possible, the attacker can always use the laser to burn through barriers, creating his own light path."

Sure, you can try to defend by putting a lid on the microphone. But we already have a laser here, we'll just burn through that.

If someone is willing to blast your possessions to bits using a laser, I think the security of your smart devices is probably the least of your immediate concerns.

I don't see how that follows? A thief might be willing to break a conventional door lock if it gains them access to high-value, fenceable possessions.

In this hypothetical, the hostile actor isn't interested in the smart speaker itself, they're interested in what the smart speaker can do.

My point is that if someone is directing a powerful laser into your home, you are in imminent mortal danger, which is worse than whatever the smart speaker can do.

For such a thorough and insightful paper, I wonder if they mean this as a joke. Not even a 1W laser is going to burn through any common, real-world materials at a distance.

I wonder if simply having dirty windows would be a more effective countermeasure. See also "Real Genius".

Why is everyone more concerned about someone with too much time and laser-modulation equipment telling their smart speaker to order 10 crates of hand sanitizer than about the physics and possible new applications of influencing microphones via laser light?

The technique of using light to transmit sound has been around for over half a century, although it's more commonly used as a microphone rather than a speaker.


A microphone is just a reverse speaker. For example, if you plug your earbuds into an audio input port, you can use them as a microphone.

This is all relatively basic, just not obvious to most people.

I am not sure it is really as basic and obvious as you imply. The comparison between the "reversibility" of laser speakers/microphones and the reversibility of normal speakers/microphones is a bit off, in my opinion. For starters, laser microphones do not use the photoacoustic effect[1] that the device described relies upon. Whereas the reversibility of normal speakers/microphones is because both devices use the same physical process. That it's possible to build a microphone using lasers and to build a long-range sound generating device using a laser doesn't imply it's obvious (or even possible) to do so for any given audio production or recording technique. E.g., if I can generate audio using a flickering lightbulb, it is not clear that a flickering lightbulb can be used as a microphone.

[1]: https://en.wikipedia.org/wiki/Photoacoustic_effect

They also point out they can use this to bypass security and open "smart locks". This is slightly more concerning than just buying hand sanitizer. I wouldn't be caught dead with one of these smart-lock devices, but I have friends who aren't so cautious.

I have owned several smart locks and every single one has required a secret passcode to actually unlock via a smart assistant. Working backwards: if they didn't require that, the real threat would just be someone shouting through the window.

I forget if it was this article or not, but someone mentioned that a few of the smart locks that do have secret codes don't rate-limit them, so in theory an attacker could brute-force a PIN or password.

I guess the idea is that they think nobody is going to sit there and shout pin codes until it unlocks?

I own a smart lock and my Google home device will only lock the door. You can't ask Google to unlock it.

I don't use a smart assistant, but I think some people use them with smart locks ("unlock the back door").

Alexa doesn't let you unlock, you can only check status and lock it.

Until a vulnerability is discovered.

How about telling their smart speaker to complete a task that might be connected to more sensitive things, like unlocking a door, opening a garage door?

Money is important to people.

Because it's about more than just buying stuff. Potentially you could say "Hey Siri, transfer all of my Monero coin to XYZ address".

One thought is that the radiation is causing the MEMS sensor to physically vibrate. Part of me wonders how much of this is similar to the way the Raspberry Pi would glitch out when hit with a camera flash.

[1] - https://www.raspberrypi.org/blog/xenon-death-flash-a-free-ph...

[2] - https://www.youtube.com/watch?v=SrDfRCi1UV0

Not very similar.

The flash going off exposes the Pi's bare power-regulator die to an intense pulse of light, and the resulting photoelectric currents glitch the chip's output.

The photo-acoustic effect is usually a thermal effect. Material gets rapidly hot from the laser and cools quickly when the laser is turned off.

According to their website the Google Home mini, the one with the cloth mesh top, is surprisingly one of the most resilient to this type of attack, being limited to about 20m in ideal conditions(1).

It's also one of the cheapest voice controlled home assistant devices, being given away by Spotify, Google and others on special promotions.

(1) https://lightcommands.com/

I'm a bit surprised that this works. It appears that this attack targets a single microphone. However, internal to most of these home assistant devices (e.g. Echo or HomePod) there is an array of microphones. A real sound from a spoken word would probably show up on more than one microphone (with an appropriate time and phase offset), although it seems to not be currently required. It seems like it would be complex or impossible for an attacker to target more than one microphone with this attack.

This is covered in the paper. They acknowledge this defense technique while also pointing out that a laser flashlight could be used to illuminate all microphones at once.

The power output would need to be much higher, and I'm not sure it'd work as well through glass; I'm thinking something more like multiple lasers?

Perhaps the "all at once" attack could work against today's hardware. This is because the mics (in devices I know of) are co-planar and the user may be speaking to Alexa (or whoever) from directly above or below the device. In this configuration, it is valid for all mics to be receiving the same audio signal simultaneously.

But in some future rev, one could imagine that if the mics in the array are non-coplanar (e.g. at least 4 mics) and sufficiently far from each other, then there is no possible way for the audio signal to reach them at once (unless it is actually light being measured).

You could add timing difference to individual lasers as necessary. It's not really more complicated than duplicating the laser setups and feeding them the same signal with time delays. Not a huge step.
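The delay bookkeeping really is that simple. A hypothetical sketch (geometry, mic positions, and sample rate all made up for illustration) of computing per-beam delays so each mic sees the arrival time a real sound source would produce:

```python
import math

# Hypothetical sketch: per-laser sample delays so each beam mimics
# the time-of-flight a real sound source would have to each mic.
SPEED_OF_SOUND = 343.0  # m/s, at room temperature

def sample_delays(source, mics, fs=44_100):
    """Delay in samples for each mic, relative to the nearest mic."""
    dists = [math.dist(source, m) for m in mics]
    nearest = min(dists)
    return [round((d - nearest) / SPEED_OF_SOUND * fs) for d in dists]

# A pretend source 2 m in front of three mics spaced 5 cm apart.
mics = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (0.0, 0.05, 0.0)]
delays = sample_delays((2.0, 0.0, 0.0), mics)  # small integer offsets
```

In other words, spoofing time-difference-of-arrival checks is a signal-routing problem, not a physics problem.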

However, non-coplanar mics would work for the opposite reason: If they are on different sides of the device, you couldn't reach all of them from the same distant location. So unless all mics receive (more or less) similar sound signals, you could discard it as manipulation.

What about internal microphones?

You could require a signal from >1 microphone, but then that would make the system less reliable if any of the mics were occluded. And this would be to prevent an attack which is kind of ridiculous in the first place.

Can someone explain the mechanism that causes light to translate to electrical signals in the microphone? Is the heat generated moving the diaphragm? Is the microphone photosensitive?

It's radiation pressure from the pulsed beam of light moving the tiny microphone.

MEMS microphones are tiny capacitors that are vibrated by sounds. In section IV.C of the paper, they test whether the effect is mechanical or photoelectric, and determine that it's acting via mechanical vibrations, since the effect is stopped by gluing the microphone down with a transparent bit of glue.

Think of it as a tiny solar sail-- they're hitting a very small piece of metal with a lot of photons, so even minor deflections are translated effectively.

"The diaphragm is a thin membrane that flexes in response to an acoustic wave. The diaphragm and a fixed back plate work as a parallel-plate capacitor, whose capacitance changes as a consequence of the diaphragm’s mechanical deformations as it responds to alternating sound pressures. Finally, the ASIC die converts the capacitive change to a voltage signal on the output of the microphone."

"As can be seen, the modification decreases the amplitude of the signal detected by the microphone, and the signal after the glue application is less than 10% of the original signal. We thus attribute our light-based signal injection results to mechanical movements of the microphone’s diaphragm, which are in turn translated to output voltage by the microphone’s internal circuitry."

Kind of related, but light can change the resistance of photosensitive resistors, which changes the current, i.e. the electrical signal.

Man this is such a great hack. This reminds me of the old TEMPEST attacks against CRT monitors from the 90s.

Cool stuff!

They made a simplified video explanation here: http://youtu.be/ORji7Tz5GiI

While the attack is technically feasible, its complexity and sophistication don't lend it to wide deployment. Someone with a lot of time, money, and drive (think heist movie, or spy agency) to hack a person of interest might find this attack viable, assuming they find a line of sight to the mic. But if one's dealing with determined hackers, there is likely a multitude of other lower-hanging fruit to pick off first.

Hacks always start as "infeasible". But eventually the technique is refined and there's a kit for $20 that can let anyone do it by following simple instructions.

I don't know if the authors mention it, but it seems like you could combine this with the vishing attack (https://news.ycombinator.com/item?id=21306612) by using silent light commands to get a malicious app installed.

Fun stuff...

Title is editorialized, and in a way that adds incorrect information (the supposed $14 price tag). Mods, the title for this should be "Light Commands: Laser-Based Audio Injection Attacks on Voice-Controllable Systems".

> ... We thus attribute our light-based signal injection results to mechanical movements of the microphone’s diaphragm, which are in turn translated to output voltage by the microphone’s internal circuitry.

Aside from the security aspect, this is a pretty cool example of applied physics.

The paper doesn't appear to report anything else on this effect, but AFAICT, it's new. The Wired story described two hypotheses, one of which involved the laser heating the air immediately above the microphone, simulating a sound wave as the light amplitude is modulated.

Naw, the photoacoustic effect is some 140 years old.

Why did they need to focus the laser through a telephoto lens? I thought the whole point of lasers was that they were already focused into a beam that is moving in nearly exactly one direction?

Lasers are collimated [0], not focused. They might be focusing it to maximize light intensity at the required spot.

[0] https://en.wikipedia.org/wiki/Collimated_beam

Lasers spread just like a flashlight with distance; the ones shined at the mirrors left on the moon by the Apollo missions, for example, were 6.5 kilometers wide by the time they reached the moon.

Personally, I've seen cheap lasers without a good focusing mechanism get house-sized at even a few hundred yards. I bought a 1W laser diode years ago that had no optics to focus it; at a foot the dot was roughly the size of a dime and absurdly bright, and at a few hundred yards the diameter was taller than a door and incredibly dim. We tested it by shining it across a pond at the apartments we were living in one night, aiming where there were no windows, and immediately turned it off due to the massive size of the dot.

I imagine they were just using what they had on hand to attempt to make a collimating/focusing lens.
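For intuition, beam diameter grows roughly linearly with distance for a given divergence angle; a quick back-of-the-envelope sketch (the 1 mrad figure is a typical cheap-pointer assumption, not a measured value):

```python
import math

# Back-of-the-envelope beam spread: diameter grows linearly with
# distance for a given full divergence angle (values illustrative).
def beam_diameter(d0_m, divergence_rad, range_m):
    """Beam diameter after propagating range_m meters."""
    return d0_m + 2 * range_m * math.tan(divergence_rad / 2)

# A 3 mm pointer beam with ~1 mrad divergence is ~1 m wide at 1 km.
print(beam_diameter(0.003, 1e-3, 1000.0))
```

Which matches the lunar example above: even a tiny divergence angle multiplied by the 384,000 km to the moon adds up to kilometers of spread.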

If you have a pair of binoculars, go shine a flashlight into them (from both ends) tonight and you'll see that even a cheap optic can drastically change a light source. It's actually a fun little hack for making a searchlight when you're camping: you'll get a considerably brighter spot of light.

I forget where I read about a very early and simple attack on Amazon Alexas. Some people knew that their neighbors were going on a long vacation. So they just shouted into their house orders for the Alexa through the front door. They ordered tons of expensive shit on Amazon which was delivered to the front door and then ran off with the packages. The family didn’t get the order confirmation emails from Amazon since they were on a remote trip with no Internet access.

I'm curious if you can use the same effect with human/animal eardrums.

I suppose the next generation of MEMS microphones can contain photodiodes to mitigate this attack.

This is cool research, but I hope people aren't viewing this as a real-world worry. By the time someone is breaking into my house to set up hundreds of dollars worth of laser equipment, I probably have bigger problems.

If you'd actually read the paper, you'd see that they were able to inject commands through double-pane glass from 75 meters away.

I know, but I have a fence around my house. Once someone is setting up equipment around my house, why not just smash a window and steal my stuff directly?

Huh? I mean, we have seen the WiFi person-detection hacks through walls that need a lot of calibration, careful setup, and so on, but this was pretty straightforward. Modulate your sound signal onto the laser current and it reproduces more or less directly.

They triggered the device from across the street.

For this attack nobody needs to break into your home. It can be executed from a distance (100m) and through "clear glass windows."

The hardware can be much cheaper. Furthermore, stealing very expensive cars would pay off the cost very quickly.

And the attack may be "open the front door / garage door"...

Blew my mind!

I always felt all these assistants were super creepy anyway. I have a lot of things hooked up in my smart home, but I want to control everything via my phone, which I trust.

The authors seem to be a little confused by the purpose of voice identification in these systems. They repeatedly call it "Voice Authentication", which it is not.

I don't understand why anyone wants these listening devices in their home

Do you really not understand or are you just not accepting the same answers everyone gives every time this question comes up?

They're terribly useful for listening to music, setting timers, alarms, reminders, checking weather, traffic, unit conversions, and just general web lookups.

I suspect what you meant was, "why do people trade their privacy for convenience?" Which of course is ALWAYS the trade-off one makes when it comes to security.

I personally have decided (for now) that the convenience of the devices outweighs the likely harm they will have on me. Would I have one if I were running for political office? Absolutely not.

I use mine to check the weather every morning and frequently listen to the news feature. As well as music during the day sometimes. I love them :) I think other people are just obtuse or paranoid about surveillance.

Most of the security and privacy concerns come from not bothering to press a few buttons on a smartphone.

A smartphone that would be in your hands or pockets anyways.

Yeah, this explains why they've been integrated into every high end headset. ;-)

It's weird that I manage to do that entire list of things without saying a word aloud.

It's luxury tech. That's where the market is.

> It's weird that I manage to do that entire list of things without saying a word aloud.

That's an incredibly broad statement... anything you do can be done in another way which someone could define as luxury. I can churn butter, but I like the luxury of buying it instead; I expect you are the same ;).

If your phone has a mic, it's most likely also listening. Most new phones have the same technology as the smart speakers. And even if it doesn't have the feature, it can still be compromised to listen.

By that logic, a TV remote control is "luxury tech".

VUI is just another new interface for devices that provides convenience.

Totally luxury tech. Just like nice phones, wide-screen TVs, and a million other things that people find useful, convenient, and fun.

> They're terribly useful for listening to music, setting timers, alarms, reminders, checking weather, traffic, unit conversions, and just general web lookups.

In short: it enables people to get even more lazy. No wonder US has an obesity problem.

On the contrary, this is more useful for very active people, who use them for multitasking. People just sitting around browsing HN can already use the device they have at hand.

For me it is because when I am home I don't want to have to use my phone. I want to spend time with my family.

Every time I pull out my phone there is the temptation to check my email, message friends, etc.

But with my echo dot (or google home mini, I currently have both while I decide which to keep) I can control my lights by voice without missing a beat, I can play music for my kids to dance to without fumbling through an app, I can check the weather forecast while I'm digging through the coat closet, all without being distracted from what I'm doing and without any extra fluff.

I do not have any locks or security systems connected to them as I do not trust them that far, but for informational and recreational use they have made things considerably more convenient.

how does not wanting to use your phone have anything to do with using a listening device to turn off your lights?

I don't see how it's not clear. Kiseleon wants the convenience features of various bits of smart home tech without having to use their phone to do it so that they can avoid the attention maw that is the cell phone.

That's not the parent comment's point at all. Parent comment is clearly arguing that the convenience provided by an alexa over having no smart home stuff at all is not worth the privacy invasion. Pointing out that you can accomplish the same thing with your phone only improves that position.

There's already one in your pocket

Also, I believe it will move towards a voice-free, AI-determined system, because at some point in the future, decision-making will be the toughest activity. (Currently, moving and pressing buttons is.)

Yep, predictive decision-making is the next step. An example of this is on newer versions of Android, where, when you pull up the app tray, it predicts and shows which apps it thinks you're most likely to want to use based on date/time, location, connected peripherals, accelerometer (driving), etc. For the connected home it could be the same.

I use mine as a glorified internet radio. My grandparents use it as a glorified Clapper(tm) to switch room lights on and off.

If you can find a standalone internet radio that's as convenient as an Amazon Echo, please let me know. It's an underserved market.

Cause my kids need help spelling and I really suck at it.

Don't worry, neurodiversity is trending

I find voice controls so frustrating that I don't see the appeal.

Maybe once "okay google" works correctly on my phone, I'll start using one of these.

The ads make them look cool. Amazon puts them at the top of the site so it looks like they're popular and everyone wants one. The industrial design makes them super unobtrusive, and the cheerful voice comes off as ultra benign.

Okay, but they’re still the largest global privacy-violating technology in history... you couldn’t pay me to have one of these in my home, and I mean a lot: you could offer me cash every month to keep this in my house and I would still say no.

These devices are as literally Big Brother as it gets.

I personally boycott Google products and do not purchase from Amazon. Google has gone from ‘don’t be evil’ to ‘be evil and pretend we’re still cool’.

Please forgive my tone, but it doesn't take a genius to recognize that you shouldn't have sensitive conversations around a microphone. Most of the things this appliance does are similar to the ones accomplished by a cell phone or computer. And like those things, the microphone works best when the device is connected to a power source.

When it comes down to it, we are all compromised, and it's just a question of what you are comfortable with. Some people choose not to use electricity at home, and yet they go on living. Your choices are your own, and of the billions of people you should take pride in your individuality and be comfortable in your choices.

> it doesn't take a genius to recognize that you shouldn't have sensitive conversations around a microphone

Stop this. You are putting blame on the end user for failing to understand complex security and privacy issues.

Also, this is an incorrect understanding of network economies and society.

"it doesn't take a genius" to carefully check a car for safety features, like having brakes. Yet there are regulations to protect car buyers.

Without the intervention of regulators we would not have safety belts and airbags.

Same for building safety. Food safety and so on.

The same goes for privacy. In a world where any device in your home has cameras and microphone it becomes impossible for a lone individual to push back.

You need a whole society to do that.

>> Please forgive my tone, but it doesn't take a genius to recognize that you shouldn't have sensitive conversations around a microphone.

Then, forgive my tone, but it doesn’t take a genius to understand that you don’t want one constantly monitoring you in your home. Where you have private conversations. This is my point.

These devices are physical spyware - they’re as useful and terrible as Bonzi Buddy but people pay for it.

To add, I’ve actually opened up my 2011 MacBook Pro and physically removed the wire connecting the microphone and video camera from the device.

I use TOR, for everything. Even basic browsing. I’ll take the performance hit any day.

I also disable JS on 90% of sites I visit.

I’ve stayed on an iPhone 6S that I could jailbreak to observe outgoing and incoming network information and block IPs where necessary, and redirect to TOR in the background.

I don’t keep Android devices and let all of my friends know about all the serious security risks involved, and beg them to at least get a refurbished iPhone instead.

I have friends in the psychedelic scene who literally text each other openly about LSD like total idiots when Signal is free.

I have friends who have ended up in prison, not even jail, from this kind of shit.

I talk about psychedelics, I talk about occultism, I talk about revolutionary political ideas, and I talk about private details like my sex life, comfortably, at home, as do many, many of my friends.

Having one of these devices in their home alone could be a ticket to jail or prison for them.

You’re telling me there are advantages for this shit? Because you can turn your lights on without having to grab your phone, or, God forbid, stand up and go flick a light switch, it’s worth it to have corporations listening in to all your shit?

You’ve obviously never had to watch a best friend dragged off by police due to getting nabbed by information grabbed through technology. Good for you! Have a cookie.

Some people are involved in progressing society beyond its pathetic, self destructive status-quo, to move the world forward, and we need privacy to do that.

If you wanna live your little work, wife, and kids life, where you’d never be concerned about a single thing you say while your Echo is around, go for it.

Myself? I actually intend to change the world while I’m on it. So no, I’m not alright with anyone listening in to what I’m doing.

> I’ve stayed on an iPhone 6S that I could jailbreak to observe outgoing and incoming network information and block IPs where necessary, and redirect to TOR in the background.

Do not redirect traffic from applications that have not been designed for Tor into the Tor network! They can leak all sorts of data, making it even worse than non-Tor traffic.

(BTW it's written Tor)

That's...very thorough of you.

I can’t afford to take any chances.

Do you leave your phone on when you're in your home too?
