I've never seen a demographic breakdown of HN, but I'm 37 years old. I'm definitely not the oldest guy here, but I see history repeat itself regularly. I remember when people used location tracking as a reason to not own a cell phone. I'm talking about flip phones, mind you, not even iPhones with GPS receivers. I recall the bulk of complaints in that vein died down around the time Google Maps for phones came out.
If this happened today but with Verizon and someone's location via cell tower records, no one would use that as evidence that no one should own a cell phone. That probably tells you what you need to know about the trajectory of voice assistants.
There's a big difference: having a GPS in my phone means I never have to worry about buying or carrying a map with me ever again, at the cost of potentially revealing my location to someone who I'd rather not know it.
Having a voice assistant means I don't have to spend a few seconds typing, at the cost of having everything I say potentially overheard by someone who I'd rather didn't hear it.
By my quality metric, the first is a big win at a small cost while the second is a tiny win at a huge cost. YMMV.
> Having a voice assistant means I don't have to spend a few seconds typing, at the cost of having everything I say potentially overheard by someone who I'd rather didn't hear it.
You're underplaying it in order to artificially inflate your mileage.
Having a voice assistant means not having to learn to change application contexts inside an operating system followed by a few seconds typing. You are ignoring the hundreds/thousands of hours you've spent learning which software is appropriate for which contexts. Not to mention the latency of instantiating them, or possibly even installing them, migrating data between them, etc.
Voice assistants have contexts but they map fairly closely to natural language. And non-technical users can teach other non-technical users what those contexts are. "Here's how you set an alarm." I've never used Alexa but I'd bet it's something like, "Set an alarm for blah," or even, "Wake me up at blah." And I bet if I asked an Alexa user how to do that I'll remember because it's all natural language. That's a huge win for usability.
Now try teaching a non-technical user about cron, its flags, and its relationship to other shell commands, using natural language. In a decade that bullshit will sound more antiquated than the amplitude modulation on the voice of a Dalek.
>Having a voice assistant means not having to learn to change application contexts inside an operating system followed by a few seconds typing. You are ignoring the hundreds/thousands of hours you've spent learning which software is appropriate for which contexts. Not to mention the latency of instantiating them, or possibly even installing them, migrating data between them, etc.
So, it's interesting. I resisted the iPhone-style smartphone for a long time (I loved my Nokia Communicators) - I mean, I was super into wearable computers; I owned an Xybernaut at one time; I had the Palm V with the CDPD modem... but I resisted the winning form factor for so long.
And... yeah, it was a mistake. Not really a huge one, 'cause the things are designed to be easy to learn over being easy to use (there's usually an either/or there. the easiest to learn process is rarely the most efficient process) but still, a cost; to this day, I'm not all that great at typing long messages on a phone keyboard, like young people seem to be. For that matter, I didn't buy a car GPS until far after they were common, and I am unusually bad at reading maps for a man who started driving before consumer GPS was common, so I would have saved many hours and more than one job interview if I had that technology sooner.
I kind of feel the same way about voice control. I mean, essentially it's a command line interface with a wonky input method, no? and it's a command line interface designed for regular people, not people who are familiar with the command line tradition. I don't think it's super useful now, and I kinda think it will be a fad, like the '80s talking cars.
But, I could be totally wrong. What am I missing out on by resisting the voice controlled garbage?
I mean, if someone does come up with a google maps level 'killer app' that makes you way more effective if you use the voice control, I'll be behind the curve.
"I mean, essentially it's a command line interface with a wonky input method, no?"
This is precisely the feedback we got from salespeople, when we were working on a Natural Language Processing interface for Salesforce. Initially, I got angry and denied the comparison. But after several people made the same comparison, I came to appreciate how true it was. Unless NLP is perfect, it is really just a Command Line Interface with an awkward input device. I talk about this a little towards the end of this story:
NLP will only continue to improve in the years ahead. I've got a few Echos around the house, and yes, the current implementation of NLP is not perfect. It requires some modification of how the user speaks, but the first time you say "Alexa what's the weather today?" and Alexa comes back with the day's forecast is a jaw-dropping moment (at least for me it was). I believe voice-activated devices will ultimately be placed all over the house and we'll communicate with our computers by voice for a lot of tasks - it's just natural for humans. Think of all the mundane tasks you do by typing into a computer - setting calendar appointments in Outlook, for example - and then think how much clicking and typing is required. It's so much easier to just tell a computer using a voice command.
The privacy issue is easily solvable IMO. Google already requires a physical mute button on their devices to disable the mics in hardware not just software. It will require either user trust (admittedly in short supply with tech companies lately) or some government regulations but there is no reason even today for devices to have to "record" your conversations in order to process. As long as you require a "wakeword" to tell the device to start listening to your next commands the device doesn't need to record all audio all the time.
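A minimal sketch of the wake-word gating described above, using text tokens in place of audio frames. The function name `gate_on_wake_word`, the wake word, and the `<pause>` end marker are all hypothetical; a real device would run an on-device keyword detector over audio rather than compare strings.

```python
def gate_on_wake_word(tokens, wake_word="alexa", end_marker="<pause>"):
    """Return only the commands spoken after a wake word; drop everything else."""
    commands = []
    current = None  # None means the device is idle and keeps nothing
    for token in tokens:
        word = token.lower()
        if current is None:
            if word == wake_word:
                current = []  # start buffering the command
            # any other token heard while idle is discarded
        elif word == end_marker:
            commands.append(" ".join(current))
            current = None  # go back to idle
        else:
            current.append(word)
    if current:  # the stream ended in the middle of a command
        commands.append(" ".join(current))
    return commands


stream = ("private chat here <pause> Alexa what's the weather today "
          "<pause> more private chat").split()
print(gate_on_wake_word(stream))
```

Only the words between the wake word and the next pause are retained; everything said before and after is never stored in this sketch.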
Have you ever written an Alexa skill? Because as soon as you try that, you realize the limits of the slot/intent system that Amazon is pursuing. You will notice this especially when it comes to names and letters. We built an Alexa skill that allowed sales executives to use an Amazon Echo to ask Salesforce questions such as "Who was my best sales person last month?" That worked great, but the executives we talked to mostly wanted specifics about particular cases. And Alexa could not figure out the company names. These tripped it up badly:
Avon
IBM
Sinopec
Volkswagen
Alexa could not get these, which made the skill nearly useless.
It is much easier to simply use a keyboard.
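To illustrate the kind of failure described above, here is a toy intent with a single slot. This is not Amazon's API: the intent name, the sample utterance, the closed slot list, and the misheard transcript are all invented for illustration.

```python
import re

# Hypothetical closed list of values accepted by a "company" slot.
COMPANY_SLOT = {"avon", "ibm", "sinopec", "volkswagen"}

# Hypothetical sample utterance for one intent, with one slot.
ACCOUNT_INTENT = re.compile(
    r"^what is the status of the (?P<company>.+) account$"
)


def resolve(transcript):
    """Return (intent, company) if the transcript fills the slot, else None."""
    match = ACCOUNT_INTENT.match(transcript.lower().strip())
    if match is None:
        return None  # the utterance did not match the intent at all
    company = match.group("company")
    if company not in COMPANY_SLOT:
        return None  # the transcribed slot value is not a known company
    return ("AccountStatus", company)


print(resolve("What is the status of the IBM account"))
print(resolve("What is the status of the eye bee em account"))
```

The first call resolves; the second matches the intent but fails on the slot, because the transcription of the name is not in the list. With a keyboard the name arrives exactly as typed.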
And please consider the failed promises we've heard over the last 3 years. Remember "Invoxia will enable Alexa to tell which person is talking":
Hundreds to thousands of hours seems completely reasonable to me.
Have you ever done IT for your family or friends, or seen a coworker completely helpless in the face of an unrecognized file extension? Say they're, I don't know, an engineer or a call center operative or something. 6 hours of computer in a working day, 250 days of work a year, maybe they got out of college in 2005... more than fifteen thousand hours in front of a computer, and, well, https://www.youtube.com/watch?v=fa9DLxDtPtc. Computers are mindbogglingly complicated.
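The arithmetic behind that figure, as a quick sketch. The hours per day and days per year come from the comment; the ten-year span is an assumption, chosen because it is the point at which the total reaches fifteen thousand.

```python
HOURS_PER_DAY = 6          # from the comment
WORK_DAYS_PER_YEAR = 250   # from the comment


def screen_hours(years):
    """Total hours in front of a computer after the given number of working years."""
    return HOURS_PER_DAY * WORK_DAYS_PER_YEAR * years


print(screen_hours(1))    # 1500
print(screen_hours(10))   # 15000 (ten years is an assumed span)
```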
And then there's us, the people that hang out on HN. I can't even think of how many times I've spent fifteen minutes digging around in config files or menus or googling for a better tool. I've probably spent over a hundred hours just on my emacs setup. Easily hundreds, perhaps thousands, of hours of focused effort dedicated solely to high-end computer-use skills, and that's not even counting the things I use the computer for. And I'm still learning things every day.
I agree with you. I know more than one person who bought a new notebook because the operating system stopped working, without ever realizing that reinstalling was an option.
Everything a voice assistant does could be done with a device that has nothing but a keyboard and a text input field. People are not complaining about the great natural language comprehension these things have, they're complaining about always-on microphones in their homes.
But even with typed instead of spoken commands, wouldn't the natural language processing still be done in the cloud? Admittedly, typed instructions are less intrusive than recorded sound, but they are still intrusive because they go over the internet instead of being processed locally.
If it was implemented using something like hitting super/the Chromebook search button and then typing your intentions, it would be a more deliberate choice to send the data than it is using a voice assistant which is constantly recording to look out for the keyword.
Except that voice assistants are not self-discoverable (in addition to being very unreliable and sluggish, which granted could be improved over the long run). And they force me to memorize commands. Not just "switch on x". Say I want to play that song from that band whose name I don't remember, but I remember it's on a blue album, a few songs after that Radiohead song in my playlist. I don't want to have to memorize all these names.
> Now try teaching a non-technical user about cron, its flags, and its relationship to other shell commands, using natural language.
I'm just not sure what you mean with this statement. Are you talking about setting up cron jobs via voice? Or are you just making an analogy about the (perceived?) complexity of cron and teaching it to a non-technical user?
I would assume they are saying that it's a whole lot easier for the average person to say:
"Please remind me every day at 5pm to water the plants", or "Please play this song at medium volume every morning at 7am"
than to learn how to do the equivalent in whatever operating system their device uses, and I'd tend to agree. Cron is certainly a much higher bar than learning to use the alarm app in your phone, but the point remains, particularly in a home context.
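For comparison, this is roughly what the cron side of that reminder looks like. The sketch builds a standard five-field crontab line; the helper name `daily_cron_line` and the `/usr/local/bin/remind` command are hypothetical placeholders.

```python
def daily_cron_line(hour, minute, command):
    """Build a crontab line that runs `command` every day at hour:minute (24-hour clock)."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Field order: minute, hour, day of month, month, day of week, command
    return f"{minute} {hour} * * * {command}"


# "Please remind me every day at 5pm to water the plants"
print(daily_cron_line(17, 0, "/usr/local/bin/remind 'water the plants'"))
```

The user has to know the field order, the 24-hour clock, and what the asterisks mean before writing the line, which is the gap the comment is pointing at.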
But who's using cron to set a reminder or an alarm on their phone?! That can't be it, but if it is, then they are ironically overstating.
I mean you can use the GUI (launching apps etc.), or literally just type "remind me every day at 5pm to water the plants" into the home screen search bar (on Android at least).
The voice recognition and the assistant parts are decoupled.
You two are talking about different groups of people.
crontab/Linux/Win: definitely needs learning
smartphone: a little bit easier to use, open an app and everything is set
virtual assistant: ready to use out of the box.
I taught one of my family members how to use a PC and Android, and found that almost everything needs to be explained. For example:
- What's Google/Microsoft for? (Yes, it's common for some people never to have heard of even the tech giants)
- Browser? And there are many of them? (Not everyone has a computer and Internet access)
- URL? What is that? I cannot even spell it! (This is more complicated and requires people to remember the addresses of tens of commonly used websites, btw, what's a bookmark!?)
- register? with what? bank account or passport? (if someone doesn't have an email account, it's likely that they have no idea about the process of registration and activation.)
- App market? wait a second, what's an app? (not to mention built-in ones and those from 3rd parties)
The thing is, Alexa, Google Home and many others provide a unified, easy-to-use substitute (well, I have to admit these devices are pretty dumb these days, so they are not 100 percent alternatives). The only requirement is being able to speak clearly, which is true for almost everyone.
PS: I don't have any of these devices. As a hard-core programmer, my choice is, without any doubt, *nix, shell, etc. GUI sucks ;-)
> Having a voice assistant means not having to learn to change application contexts inside an operating system followed by a few seconds typing. You are ignoring the hundreds/thousands of hours you've spent learning which software is appropriate for which contexts. Not to mention the latency of instantiating them, or possibly even installing them, migrating data between them, etc.
I can type everything I'd issue as a voice command into my home screen. I have to install and learn just as few/many things and commands as I'd have to using voice recognition. Op might have underplayed something, you just made things up.
How do you manage to do the typing when you're in the bath? Or stepping out of the shower? Or when you've left your laptop in another room? Or if someone else is using your laptop for something important? The point is not that the voice commands are easy to learn or remember, it's that you have with you at all times (even if entirely naked, hey it's your house) the equipment needed to issue these commands...
>I can type everything I'd issue as a voice command into my home screen. I have to install and learn just as few/many things and commands as I'd have to using voice recognition. Op might have underplayed something, you just made things up.
You are not the market that these are targeted at then. The normal person cannot do those things.
I’m sympathetic to this view. I’m not sure it will stay that way. There was a time where internet access on your phone was an extravagance. It would be difficult to appreciate the utility before living with it. If what we are talking about is a difference in perceived utility, I’m not sure how big of a difference that really is.
The utility of VAs is going up. Comfort with them is a normative thing, and history tends to point to the norms changing.
As others have said here, you're overplaying one and underplaying the other:
1. having a map on your phone doesn't necessitate gps; only automatically locating yourself on that map does, which is an added convenience as opposed to the core functionality.
2. even the core functionality of a map—navigation—is a rarer one than the plethora of things one can connect voice control to
3. apart from taxi integration, probably the most common multiplier of the applicability of gps is voice: "find restaurants near me", etc.
In fairness, even with the above, your comparison is probably still correct: gps represents a bigger value add relative to the trade-off, but I just think they're much much closer than you make out.
How does that help me at home? I don't dispute the value of voice recognition on a phone. But we're talking about smart speakers here, in your home, always on.
(BTW, on a phone I can set up common queries like "find restaurants near me" as shortcuts and access them with a single touch.)
> gps represents a bigger value add relative to the trade-off
Yep. Also, I can easily turn my GPS off, restrict apps from accessing it, etc. Home smart speakers are necessarily always on.
Aside from the usefulness: True, my GPS data can be used to trace information about me (where I eat, who I meet, which doctor or prostitute I visit etc.). However, I believe that the conversations I have at home are more private and reveal notably more about me than anything derived from GPS, since at home I might discuss my exact condition or state explicit preferences.
You're much more likely to want to discuss things in your home that are outside the Overton window and that you don't want recorded than to accidentally go somewhere that you wouldn't want people to know about.
I hate talking to computers, so I don't see a voice assistant in my future ever. I'm in my early 50s by the way, and have been coding since I was in high school.
Maps/navigation is the only reason I even have a smartphone.
Compare the workflows for leaving a reminder or note. Before:
* Find phone/computer
* Unlock
* Locate self in interface and navigate to application selector
* Invoke note-taking or reminder app
* Type in note or reminder
* Navigate through tagging/scheduling interface
* Commit note/reminder
Or:
* Locate notebook
* Locate writing utensil
* Open notebook to current page
* Write note
* Remember to check notebook every thirty minutes for the rest of your life
After:
* Be in shouting distance of phone or assistant base station
* "Okay Google/Alexa, note to self/set a reminder for eight pm: <blah blah blah>"
For most people the "before" is a minor inconvenience. These people don't really need organizational tools anyway. For people with memory problems the "before" is a rock wall that renders the vast majority of organizational technology functionally useless. You know how sometimes you walk into a room and realize you have no idea why you needed to be in the room? Or how thirty seconds spent answering email can end with all the eggs you were juggling crashing to the floor and it takes you half an hour to get back "in the zone"? Imagine that, but it happens at the drop of a hat, five times an hour, in the time it takes you to say good morning to someone... or because you had to context-switch to execute the motion to unlock your phone.
There are lives that can be changed by organizational technology. But exactly this population was the one least well-served by classical note-taking and reminder apps. That "before" workflow was the six-inch drop at the end of the wheelchair ramp [1]. Voice assistants are a game-changer.
> * Find phone/computer
> * Unlock
> * Locate self in interface and navigate to application selector
> * Invoke note-taking or reminder app
> * Type in note or reminder
> * Navigate through tagging/scheduling interface
> * Commit note/reminder
I just now took a stopwatch to time how long it took me to do all of this: 30 seconds, and 20 of those were walking to my office. That is not a negligible amount of time, but it's not a huge cost either. But...
> * Be in shouting distance of phone or assistant base station
> * "Okay Google/Alexa, note to self/set a reminder for eight pm: <blah blah blah>"
That's what happens when everything goes right. But based on observing friends who have Alexa, things only go right about two times in three. The rest of the time Alexa gets something wrong, often with comical results. The time it takes to recover from those situations can be a lot longer than 30 seconds. Often my friends just give up and drag out a laptop, or, more often, just decide that whatever it was they were trying to do really wasn't that important after all and give up.
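A rough way to put numbers on that trade-off. The 30-second manual time and the two-in-three success rate come from the comment; the 5-second cost of a successful voice command and the 60-second cost of a failed one are assumptions, and the model treats every failure as a single fixed recovery cost.

```python
from fractions import Fraction


def expected_seconds(p_success, t_success, t_failure):
    """Average time per voice command under a two-outcome model."""
    return p_success * t_success + (1 - p_success) * t_failure


def break_even_failure_seconds(p_success, t_success, t_manual):
    """Failure cost at which voice and manual entry take the same time on average."""
    return (t_manual - p_success * t_success) / (1 - p_success)


p = Fraction(2, 3)  # "about two times in three"
print(expected_seconds(p, 5, 60))            # 70/3, a little over 23 seconds
print(break_even_failure_seconds(p, 5, 30))  # 80
```

Under these assumptions voice still wins on average unless a failed attempt costs more than 80 seconds to recover from, which is the case the comment describes when friends end up dragging out a laptop.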
I think the key difference is the focused attention and brain bandwidth you have to spend. The traditional way is far more efficient but requires more of both: you need to perform a relatively long sequence of actions, all coordinated toward the goal. With voice there is no sequence, and it doesn't matter how many times you have to repeat the same simple phrase - you spend no additional bandwidth with every repeat.
The voice way is like writing a simple add-multiply loop in C, and the UI way is like writing the same loop in assembly (using vector extensions). Assuming you know how to do it very well, writing the assembly for that loop wouldn't even take longer. And you don't have a 100% chance that the compiler will do the right thing with unrolling and vectorizing in this particular case, but most of the time you check the output and try other compiler options until it does the right thing, and only after a bunch of tries do you finally resort to assembly. That's a basic energy-conserving optimization built into the firmware of people's brains.
Yep. A garbled duplicate of a reminder or note doesn't cost anything and even alarms aren't a major issue unless they're set for the middle of the night, so if anything hiccups I just retry.
Your analogy touches on another important factor, which is that the human brain has a titanic amount of hardware dedicated to accelerating and reducing the overhead of voice. It's so good that it even participates in basic cognition, so most of the time the note you want to take is already right there and properly formatted for speech. There's a reason that "thinking out loud" is a thing. :P
Although to be fair, this is an emerging technology which should improve over time.
Anyone using a voice assistant now is almost certainly at the forefront of the technology. In five years or maybe ten they'll probably be a lot more impressive.
> * Find phone/computer
> * Unlock
> * Locate self in interface and navigate to application selector
> * Invoke note-taking or reminder app
> * Type in note or reminder
> * Navigate through tagging/scheduling interface
> * Commit note/reminder
This is just bad UI. Voice assistants and search have been used to paper over a serious decline in UX over the past decade or two. PDAs from the late 90s had much better note taking workflows.
By Typing:
----------
1. Find phone (Is this really an issue? Haven't they become an extension of our body?)
2. Unlock phone with thumb (unless of course you're already using it).
3. Type "make a note to do the thing" into the big text box at the top of your screen.
4. Preview / confirm it's what you wanted.
5. Hit "Checkbox" to confirm.
By Voice:
---------
1. No need to find the phone (assuming it's close, which I'd say is a fair assumption).
2. Say "OK Google".
3. Say "Make a note to do the thing".
4. Depending on current coverage, wait while it goes to the cloud to figure out what you mean (annoying).
5. Listen while it tells you what it did.
6. Say "Yes" to confirm.
My point is that the equivalent workflow exists via typed text. The workflow itself is not coupled to voice.
That being said, I do agree that it's very convenient to use voice for these sorts of tasks. Especially when driving.
I just walk over to my fridge and write it down with the dry-erase marker stuck to it. No issues with voice recognition misunderstanding, and it's trivial to do something more complicated like add to a specific list.
That's a bit odd. What version of Android are you on, does it have any of those junk manufacturer skins, and how old is the phone? If my phone is already unlocked I can spout off the hotword, invocation, and content in a single stream, the assistant pops up in an overlay and prints the note for me to check visually, and then it finishes and hides with no further intervention. Pixel 2 XL, fully patched, it's worked like that since I got the phone.
If you're having trouble with getting the assistant to wake up, and you don't see an improvement after running through the wizard you get when you say "retrain voice model" or "recognize my voice", you should also be able to get the assistant up by long-pressing the middle button in the navbar at the bottom. You additionally shouldn't need to wait for a response after saying "take a note", just dump it all in a single breath.
> when it occurred to me I should remember to put my shoes on
...Are you mocking me? I thought I'd made it pretty clear that I use these tools to help compensate for a memory problem that does, in fact, qualify as a disability. Try "okay Google, remind me at eight pm to reply to $friend about $thing". Or "okay Google, remind me every day at two in the afternoon to make a dentist appointment...".
Sorry, I definitely didn't mean to mock you! It's just that lately, I feel like I might just forget to get dressed and that's why I'd like to use voice assist to build a to-do list. Too many things going on at once.
Yeah, this is the "killer app" for voice for me. It's the only thing I use Siri for on my iPhone, although I do have it set to require a long press instead of wake-on-voice.
For people that are already into the IoT space, I think it's a no-brainer. I have smart bulbs (Lifx). They look great but have little to no support for physical switches, and the app works OK until you want to have two phones controlling them. Suffice to say, voice control has made the experience 10x better, especially for guests. Now, we also have a Roomba and some other smart things. Rather than get a new app for everything, it's all controlled by one interface.
> Rather than get a new app for everything, it's all controlled by one interface.
Makes sense.
But why does that interface have to be the clumsy, always listening crap that we see today?
And why can no one except Apple[0][1] give any guarantees with regards to what they use my data for?
[0]: No, I'm not an Apple fanboy. I just can't stand their UX, seriously, which is sad since I value their current stance on privacy.
[1]: And for what it's worth, Apple seems happy with selling out if the alternative is leaving the Chinese market behind, as documented elsewhere in this thread. Although I'll admit that from what I read they were up front with their Chinese users about the change.
Basic voice dialing has been a thing for at least 15 years. It's progressively gotten better, and Alexa et al. are no leap forward. The old systems, while perhaps flawed, at least did not send recordings of you anywhere.
I can imagine things a digital assistant could do that I'd value quite a lot, but none of them are done by what's currently on the market. (And most of them would involve heavy smart-home integration, which makes their privacy failings even worse.) Meanwhile, the offerings I've actually seen so far look like their maximum utility would be converting tablespoons to cups while my hands are messy from cooking.
Yes, hence: YMMV. (Which, in case you don't know, is an acronym that stands for "Your Mileage May Vary", which in turn is an idiomatic expression meaning: a reasonable person could disagree.)
I am in my 30s too, and I remember reading articles saying that DHS had hundreds of thousands of hours of wiretapped conversations they would never get to because they all needed to be manually reviewed.
When the company selling a suspiciously cheap home speaker ('stocking stuffer') is the same company that boasts of having infinitely scalable computing ability and is the only cloud vendor that meets the Pentagon's procurement requirements, people are justified in thinking that Amazon's ambitions go beyond 'making it easier to order online'.
Just so I'm clear here, you're proposing a conspiracy theory where Amazon, the DHS, and The Pentagon are in collusion to collect and transcribe audio from Amazon Echos -- in such a way that none of the many people who have hacked and extensively reverse-engineered them would be able to tell?
The people who have "hacked and reverse engineered" the Echos don't have access to what happens on the server side. We are facing a situation of either trusting or not trusting, because verification isn't possible.
There should be a word for believing that someone is working for your interests even though you have no reason to, as a counterpart to "conspiracy theory" which implies that you believe that a group is working against your interests without having a good reason to. It's like people have a conspiracy theory where Amazon is conspiring in secret to help them.
I'm not sure I follow - just capturing packets from them tells you how much they're sending to their respective motherships, e.g.[1].
Amazon of course is not exactly conspiring to try and get me to buy more stuff from them (they're very upfront about it), which makes them both more crass and also easier to trust in some ways than Google.
It wouldn't have to be perfectly covert. Also it wouldn't have to be purposeful, spooky collusion between private and public sector from the start. That's an unfair assumption and a bit of a straw-man. What's going to happen is Amazon and others will develop methods to collect ambient data at all times and sift through it efficiently for marketing info--come on, you know for a fact that they are doing that--and eventually there will be another fucking patriot act, some awful legislation that is blatantly unconstitutional but nobody cares and brows are furrowed and think-pieces are typed up but the law goes through regardless--the feds will eventually subpoena the massive cache of data Amazon owns, they'll make a bit of a show of resisting for PR's sake, but eventually law enforcement will get it. And maybe they'll use it to solve a legit case, they probably really will. But forever after that tool will be in the hands of whatever regime or agenda reaches office. It's not ridiculous and people shouldn't dismiss it so easily.
Infosec conspiracy theories are different from normal conspiracy theories because while normal ones are pretty much always bullshit, infosec paranoia is always proven sensible on a long enough timeline. Yes, there will be an incident sometime soon where law enforcement builds a case through evidence gathered with a virtual assistant using AI or something to sift through all that audio. The constitutionality of this evidence-gathering will be superficially questioned but it'll end up getting admitted anyways, or used sneakily via parallel construction techniques. Assuming it hasn't already happened.
They've already done it with phones. Providers and manufacturers used to resist, but after lots of murky "national security" legislation they can't really refuse anymore. Ever read about how the stingray was first discovered, and that law enforcement was sneakily using it extra-legally to build cases? Now it's pretty common knowledge that your phone can be tricked by a fake cell tower and there's nothing you can really do. But does anyone give a single shit? Nah. It's normalized. Every successive technology that is developed and added to the panopticon toolkit will make us more and more accustomed to it. There have been enough overreaches and abuses of technology already that you shouldn't have this blase attitude about it. But that's not how people really work.
If you've been following the news for the past few months or years, you should know that this conspiracy theory has a higher chance of being true than not.
Right?? I feel like I'm losing my mind sometimes with how lax people have gotten with cybersecurity or even the most basic concepts of digital hygiene / common sense internet practices. In the nineties giving any information out online was idiocy. And remember refuseniks? People who just refused to own a phone based on the privacy and social ramifications of always carrying around a sophisticated bug that would make CIA engineers during the cold war spontaneously orgasm?
It's been a slow boil since then; people accept things today on a regular basis that would have caused riots just a couple decades ago. And I get it, there's lots of important and valid improvements to life thanks to smartphones and the modern internet. But I'm not talking about the way things are now. I'm talking about the way things will be soon. Very soon. Like the next 10-20 years up until we all die in the climate wars or whatever, hopefully not but that's a different story I guess.
People were worried about cell phones back when you had to pull up the antenna and checking your email was an incredibly exotic feat of mobile technology. They had no idea about the kind of big data extrapolations that would soon become possible. If they did, they would have been far, far more worried.
Digital assistants, ubiquitous computing, "smart" cities which keep tabs on your life better than you do and other advances in mobile tech are going to be the same way. Those ramifications we can think about now are only the very tip of the iceberg. Completely unexpected uses and exploitations are going to appear out of nowhere like black swans as the years unfold. We can't make informed decisions about what we're doing to our humanity because we really have no idea. This has always been the case with industrialized society, true. But it's happening faster and more dramatically each time.
Totally agree with you. And it's funny to observe that history repeats itself in the comments of this post, like GPS vs voice assistants.
I'm in my 30s too and I can remember how I tried to walk to the library in the city for the first time: a map, tricks to figure out the orientation, and asking passersby. Back then, it wasn't as convenient as navigation with GPS/Google Maps, but people were used to it.
The point is: concerns about privacy have never really been an obstacle for any technology. Because everyone can own a cellphone with GPS, a mic, and gigabytes of storage, people forget that their location is being tracked by cell towers, the OS itself, and possibly some apps. It's entirely possible that in the near future a device the size of earpods will help you learn a foreign language, teach you skills such as singing, and control all your appliances. Then people will say with absolute certainty: my virtual assistant really helps and my life would be a total mess without it, but technology XYZ is too much, it's useless to me, and it tracks my movements.
PS: I use GPS-related apps a lot on a daily basis: finding the nearest restaurant, navigating, and many other location-based apps. But here is an interesting report about taxi drivers relying on their memory instead of GPS:
https://www.thenational.ae/opinion/comment/exercising-your-b...
A mobile phone that doesn't leak your position is impossible, but a voice activated assistant that doesn't spy on you by the orders of some company is pretty much possible.
> but a voice activated assistant that doesn't spy on you by the orders of some company is pretty much possible
You're missing the part where said company has no motivation to not spy on you. There's absolutely nothing preventing companies from doing so today (in the US), while on the upside the opportunity and motivation for them to collect all your data for possible profit gains later is extremely high.
There's nothing stopping a competitor from launching a non-spying version either. People are more eager to live with problems when there is no alternative.
New technology is scary, old is not. With the third party doctrine the only way to really protect yourself is to not participate in modern society in any way.
What makes a “modern society”? Is it modern because of a certain code it uses, or because it has certain qualities which make it more vital than “non modern society”? I understand that one should follow and understand technological change, but being wary toward obvious intrusion of personal space is far from being socially unacceptable.
Or at least don't do anything that would be considered 'suspicious' within that modern society. Of course, it is hard to know what will be considered suspicious....
Whatever's convenient to dispose of people in the way of the status quo. Remember that the feds really started rounding up pot smokers on a large scale for the first time as a way to legally get rid of Vietnam war protestors. It'll happen again, and next time they'll have an infinite pit of dirt to dig through.
Nobody is 100% law-abiding or morally clean. Nobody. We're definitely all vulnerable to prosecution and blackmail. People who think they have nothing to hide aren't using their imagination properly.
Anything that existed before my birth is old and outdated. Ideas and inventions I grew up with really changed and are changing the world. Others are useless and invade my privacy.
I cannot remember either the exact sentences or where I got this from.
It sounds like a quotation from Douglas Adams: "Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works. Anything that’s invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it. Anything invented after you’re thirty-five is against the natural order of things."
Well, I'm in my late 40s and I avoid Google Maps like the plague. Osmand or other off-line navigation apps are really easy to use nowadays. Also, I'm still not happy that my phone is leaking my location through GSM.
Easy to use, but heavy if you travel a lot (the maps are big). Also, it doesn't proactively offer anything in the way of alternative routing for traffic jams or road repairs (so you can end up spending an hour driving back or looping instead of taking a route that is maybe 10 minutes slower from the get go). There are many smaller examples that make OsmAnd not really something I would recommend to anyone else, even though it is what I use (I use it because Google Maps simply won't work on my phone because of how it is set up).
True, I guess. An hour is a lot though; aren't road repairs signposted in a reasonable way where you live? (I.e., using actual road signs, not navigation.) Actually, in the Netherlands I've found some longer-term road works listed in OSM. I don't use navigation, or even drive a car, often enough to compare with Google Maps.
If you're on Android use OsmAnd! It's great. It uses data from OpenStreetMap. It does decent road and walking directions; in fact it tends to do far better at foot paths. You can download whichever regions you want for offline use. The only gap I have found vs Google Maps is transit directions.
It requires the route you will take and the time at which you are estimated to be at certain points along it. I would say that this is even worse than your current exact position.
No, it does not require that at all. You can implement it that way, but you can also just request traffic information for a whole 'cell' without revealing your exact location, just a cell ID, then do the processing locally.
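A minimal sketch of that cell-based approach, just to make it concrete. Everything here is an assumption for illustration: the cell size, the `fetch_cell_traffic` stand-in for a network request, and the segment names are all hypothetical, not how any real navigation app works.

```python
import math

CELL_DEG = 0.1  # hypothetical cell size in degrees (roughly 11 km north-south)

def cell_id(lat, lon, size=CELL_DEG):
    """Map a coordinate to a coarse grid cell; only this ID would leave the device."""
    return (math.floor(lat / size), math.floor(lon / size))

def fetch_cell_traffic(cell, server_data):
    """Stand-in for a network request: the server sees the cell ID and nothing else."""
    return server_data.get(cell, [])

def delays_on_route(route_segments, cell_traffic):
    """Local processing: keep only the incidents that sit on our own route."""
    wanted = set(route_segments)
    return {seg: delay for seg, delay in cell_traffic if seg in wanted}

# Hypothetical server-side data: incidents per cell as (segment_id, delay_minutes).
SERVER_DATA = {
    (523, 48): [("A10-north", 12), ("S100", 3), ("A4-west", 25)],
}

position = (52.37, 4.89)                 # exact position, never sent anywhere
cell = cell_id(*position)                # coarse cell, the only thing "sent"
traffic = fetch_cell_traffic(cell, SERVER_DATA)
my_delays = delays_on_route(["S100", "A10-north"], traffic)
print(cell, my_delays)
```

The trade-off is bandwidth: the device downloads incidents it doesn't need, and in exchange the server never learns the route or the timing along it.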
I'm fairly sure if you have the app on your phone you can download the map data and use it offline unless it changed (It's been a while since I've done it).
Mass privacy invasion is probably one of today's trends that our descendants will look back on and say "what were they thinking?" Another example for comparison would be 1950s-60s mass atmospheric nuclear testing.
Societies regularly undergo bouts of either really extreme political corruption or mass insanity (or sometimes both at once). All this mass surveillance is going to be a blast next time we have a Hitler, Pol Pot, or Stalin. We are already seeing this abroad but Americans still think it can't happen here.
>Mass privacy invasion is probably one of today's trends that our descendants will look back on and say "what were they thinking?"
I'm not sure about that. The slope is not going to change direction, so it's only going to get worse. The newer generations will look back and go "what was the problem the old timers were complaining about, besides getting off their lawn?" I have a kid that just missed out being part of the millennial group. She already thinks I'm a curmudgeon about things. By the time she has a kid, privacy will just be a word to learn the definition to understand what people were talking about as it's simply not going to be a concept they care about. Maybe one more gen later.
These speakers could well be built without the ability to spy on you. However they aren't, and this should tell you all about Google and Amazon's motivations. Also, they could easily charge 2x or 3x the price they currently do and still find plenty of takers. The fact that they chose to sell it as a loss-leader is also telling.
This is just facebook all over again. Why can't people see trouble-tech? I never used facebook and won't ever use an alexa, nor whatever google's thing is.
I don’t know why this is getting downvoted. I feel the same way. Amazon and Google have yet to demonstrate to me that they value my privacy enough to take necessary steps to protect it. They, like Facebook, know most people foolishly don’t care enough to hold these companies' feet to the fire on it.
At the risk of stating the obvious: Facebook has more than demonstrated the opposite.
Is there anything Apple could have done to protect the privacy of its Chinese users better (whilst still having some)? You could say they shouldn't have Chinese users, but that's a different argument, I believe.
This is false. iCloud data doesn't contain phone call contents or iMessage data contents. Yes they gave access to metadata, but this is China where privacy basically doesn't exist, not a western country.
Wrong - Apple moved all iCloud data, which includes text messages, email and other data stored in iCloud (according to Reuters) as well as the access keys. Apple even forced users to sign the updated TOS or drop service.
The Chinese government access is _not_ limited to metadata.
But with Apple's entire ecosystem being closed and intentionally locking you in, is Apple really a better choice? All it takes is one decision from the board that it would be more profitable to start violating privacy.
The board's legal mandate is to make as much money as possible. As soon as privacy is no longer a fad that makes more money and growth than the opportunity cost to not violating it, Apple will flip a switch. That's their duty as a for profit company.
> While it is certainly true that a central objective of for-profit corporations is to make money, modern corporate law does not require for-profit corporations to pursue profit at the expense of everything else, and many do not do so. For-profit corporations, with ownership approval, support a wide variety of charitable causes, and it is not at all uncommon for such corporations to further humanitarian and other altruistic objectives. Many examples come readily to mind. So long as its owners agree, a for-profit corporation may take costly pollution-control and energy-conservation measures that go beyond what the law requires. A for-profit corporation that operates facilities in other countries may exceed the requirements of local law regarding working conditions and benefits. If for-profit corporations may pursue such worthy objectives, there is no apparent reason why they may not further religious objectives as well.
Love your list & know I'm late, but I'd add one item to your list: Make your update mechanism resistant against sending "personalized" update versions to specific clients. A client querying a central server can be overcome with a single NSL, while a peer to peer solution would need to hijack your connection as well.
1 means it's impossible to offer better voice recognition accuracy or change the wake word, meaning they get pilloried by reviewers for offering a device that's strictly worse than the current gen, and they get pilloried by HN for contributing to tech waste.
1&2 get the company pilloried by HN the moment a vulnerability comes out.
3 is a requirement that doesn't actually solve any problems and ruins a legitimate need of ML - comparing performance against a known baseline.
4 is overbroad - most software companies make their money through data analysis of one form or another.
Be honest, even if the manufacturer tells you #1 is true you're not going to believe them.
You're setting up an impossible set of demands that no manufacturer is going to meet and no consumer really cares about (other than perhaps yourself and a few tinfoil hat wearing crypto-anarchists).
The iPhone hardware meets my standards. The secure enclave for Touch ID is detailed in its description and security. The Apple business model is not data collection, and the legal fights they have put up give me a reasonable belief that they take my security and privacy seriously enough for me. They are also printing money so it seems to be workable as part of a business model. I think you are wrong on all counts.
I think you misread my comment. I'm actually a huge fan of the secure enclave and the way that Apple protects customer privacy. What I was commenting on was the distrust a lot of people have, combined with a lack of knowledge about how the hardware works. For example, Amazon's Echo devices have a physical circuit that disconnects the microphone when you mute it. They could have done it in software, but wanted to break the physical circuit so that there was no question in customers' minds whether the device was still listening or not when it was muted.
I trust Amazon as much as I do Apple to protect customer privacy.
Can't you set up Alexa on a Raspberry Pi? I guess you could then have your own trigger word "hello pi" or whatever that you've programmed it to listen for, and only then activate Alexa maybe?
Pain in the ass but you could open source it for the rest of the world.
Yeah, I think you are correct here. I think I would also be happy if someone else was selling me the hardware and I controlled it more, and it just was a hardware pipe to the voice service provider.
Hotword recognition does already exist and works on fairly modest hardware (I've never tested it on a Pi, so I don't know specifically how well that would work, or more accurately how fast). The trouble is that it's an enormous hassle to set up, and while it is open source, it involves some company or other having its grubby paws all over the recognition networks. Which may or may not rub you the wrong way.
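For what it's worth, the gating idea itself is simple. Here's a rough sketch, assuming a local detector you supply yourself: `is_hotword` is a hypothetical callback, and plain strings stand in for audio frames. It is not tied to any real hotword library or to Alexa's actual interface.

```python
def gate_frames(frames, is_hotword, max_forward=3):
    """Yield only the frames that follow a locally detected hotword.

    frames: iterable of audio frames (plain strings here, as a stand-in).
    is_hotword: local detector callback; hypothetical, runs on-device.
    max_forward: how many frames to pass upstream after each trigger.
    """
    remaining = 0
    for frame in frames:
        if remaining > 0:
            # We are inside the window opened by a hotword: pass it upstream.
            remaining -= 1
            yield frame
        elif is_hotword(frame):
            # The hotword frame itself is consumed locally and opens the window.
            remaining = max_forward
        # Any other frame is dropped and never leaves the device.

frames = ["dinner", "plans", "hello pi", "set", "alarm", "seven", "private", "chat"]
forwarded = list(gate_frames(frames, lambda f: f == "hello pi"))
print(forwarded)
```

Only what `gate_frames` yields would ever be handed to the voice service; everything else stays on the box you control. A real setup would probably end the window on silence rather than on a fixed frame count.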
I disagree. Amazon seems to actually care about privacy and takes things like this pretty seriously. One could certainly criticize AMZN data collection, but I think Amazon has shown that it does care about the privacy of this data. Mistakes do happen, which is unfortunate, and I think there is a good argument that no company, no matter its intentions, should record this much data.
On the other hand, this is Facebook's business model.
y'know, I never thought about it, but I think I agree with you that being willing to pay another company for access to private information about you probably counts as not caring about your privacy.
It's not as bad. It's the difference between paying somebody to tell them secrets about you and selling secrets about you. But it's definitely not a thing a person respectful of your privacy would do.
I find it's funny how people these days think the problem is with Facebook.
Literally every company that has access to users' data has in one way or another some kind of business model based on selling that data. Every day I come across 2 or 3 articles from big newspapers giving Facebook some shit about selling user data. But what about Experian? What about phone carriers? They all sell users' data, maybe anonymous maybe not but they are still making profit on people's privacy.
And when some shit goes down, we make a big fuss about it for a few days and then we go back to normal. Equifax anyone?
Honestly, companies in the position of making profit on users' data have been doing it for the better part of 15 years now, I don't understand why we are acting so surprised now.
The thing is that for most people avoiding this kind of tech and platforms is unlikely to have any kind of payoff for them. Life is too short.
Most people's lives are as simple as going to school and work, raising a family, taking vacations, paying bills, and then passing away.
There will never be a moment where they can say “Yes! My avoidance of Facebook and Alexa for all my life has finally paid off! Hahaha!”
The only people who can say that are people who have good reasons to hide: drug dealers with multiple lines of business, professional black hat hackers, home grown terrorists, life insurance fraudsters, etc.
The average person has nothing of interest to offer except more information about them in order to better perceive trends and distribute ads to them so that maybe they buy something.
Your list is missing "pro-democracy advocates", "Muslims", "Christians", "Jews", "atheists", "women who want an abortion", "people trying to escape an abusive relationship", "people with interesting medical conditions", etc, depending on location.
Please don't pretend that both "only criminals have something to hide" and "criminals are by definition bad people" are true statements. Neither is true.
I get what you are saying, and agree with you in principle... but the problem is, I am none of those things. I have none of the qualities that I need to hide from anyone, whether neighbors or the government. While it kinda feels creepy to imagine myself being listened to, I can't imagine any real consequences to my private conversations being heard.
However, I know that there are lots of people who do have those things they need to hide, and if only the people who need to keep things private work to keep things private, then the very fact that they are keeping things private becomes a risk factor.
Just like how part of the value of https everywhere is to prevent encrypted traffic from standing out, everyone maintaining privacy helps prevent people who NEED privacy from standing out.
Of course, as an individual, there is a cost for demanding that privacy - I lose out on the benefits of things like the echo and facebook. And since I am not an individual who needs privacy, there is no individual incentive to maintain it. It is a bit of a collective action problem.
Right, I have a hard time faulting people for deciding that they want to use voice assistants. That's more or less their choice. I am a lot more ambivalent about people who leave the voice assistants running when they have guests, because at that point they are making that choice for the guests too...
I think there's a lot of unexplored "how should/will society work" ground here, or at least explored only in science fiction, and it will be interesting to see how things shake out in the future. Maybe we will move to an "assume everyone is recorded all the time and put 0 weight on these recordings in anything that matters" world, for example... Hard to tell.
If you live under a government where those examples are a reason to hide, then in the eyes of that government you are technically a criminal. Doesn’t describe my government.
Even under the most oppressive governments though, there will still be a sizable majority that doesn’t have anything to hide and can feel secure using most technology, and they’ll be fine.
First, having talked a lot to people who actually lived under a somewhat oppressive government (USSR, and by the 80s it wasn't even that oppressive), most people _did_ in fact have something to hide.
Second, just because it doesn't describe your government now doesn't mean it won't in the future, unfortunately. Consider examples of a 1920s German, or a 1960s Iranian (the old regime was oppressive too, but in different ways), or a 2000s citizen of Poland.
Third, there are certainly places where the danger is not from the government but from your neighbors, including to some of the groups I listed. So thinking about this only from the perspective of governments and criminality is off. Hence my claim that "only criminals have something to hide" is a false statement.
Fourth, I suspect that perception of how much one has to hide is highly age-dependent. As people get older they discover more and more things that they either want to hide or wish they could have hidden. Maybe I'm wrong in my guess as to your age, of course, but if I am not, I strongly urge talking about this sort of thing with people a few decades older. It can be _very_ eye-opening. Certainly was for me.
> Second, just because it doesn't describe your government now doesn't mean it won't in the future, unfortunately. Consider examples of a 1920s German, or a 1960s Iranian (the old regime was oppressive too, but in different ways), or a 2000s citizen of Poland.
This is the risk averse argument against certain technologies. If you believe your government won’t change and are willing to take that bet, you can lean into technologies very hard and reap the benefits.
Of course, if you think there may be a day where your government will change and use technologies against you, you could live out in the boonies and never even touch a computer, and thus get no benefits from emerging tech.
It’s a trade off between being at the bleeding edge and being as secure and private as possible. Personally, I have made my decision. Could it be my undoing one day if government decides I’m guilty of everything? Sure, but I doubt it will happen. I’ll take the bet.
As I said above everyone can make this decision for themselves, and should. It's just important to me to not paint any of the decisions here as "only criminals would do that"! Past that, I figure adults can and should decide these things for themselves.
That said, I also think we should work toward changing the nature of the tradeoff, e.g. by changing how voice assistants work to minimize the privacy issues. I suspect there's a lot of work we could do on that front while not compromising the functionality of the voice assistants.
This mindset seems to be why the US political parties keep handing each other more and more federal power. Nobody can imagine their nation would be so stupid as to ever put the Other Guys back in power again.
I agree with the sibling comment, but just had to respond to this:
> The average person has nothing of interest to offer except more information about them in order to better perceive trends and distribute ads to them so that maybe they buy something.
This is a remarkably dehumanizing way of looking at people. You’re basically saying that people have nothing useful to offer except things we can use to sell them more stuff.
What would you say if the voice recordings were used to offer you differential pricing of items on Amazon? What if they profiled you and charged you 20% more because your speech patterns indicated you had more money? What if they sold this profiling data so that everybody charged you more? What if they shared the data with governments, which ended up resulting in you not being able to get stopover visas, etc. because you are an "undesirable" person to that government?
I'm not actually suggesting that Amazon does this right now, but it would be pretty easy for them to do so. People should be very worried about this technology and if they choose to use it, they should be absolutely sure the data is being used in a way that doesn't screw them over. Your boring data is absolutely interesting to people who want to abuse you. That's why there's such a big price on it!
> The only people who can say that are people who have good reasons to hide: drug dealers with multiple lines of business, professional black hat hackers, home grown terrorists, life insurance fraudsters, etc.
That this flawed and toxic thinking is so often repeated, especially in tech circles, truly horrifies me.
Ok, can I see your credit card statements for the last 5 years? Can I listen to all of your phone conversations? Can I watch you shower and use the bathroom? I mean, you sound like an average person with nothing to hide...../s
This approaches one thing I always wonder when this comes up: Are people with that argument really okay with their co-workers finding out what porn they watch and when they watch it?
It's perhaps a little extreme, but I can totally see it becoming a possibility due to large leaks like this.
You are ignoring the possibility that people saying that don't watch porn. I don't.
Regardless, I disagree with the argument, as things you don't need to hide now can become things you wish you had hidden later (think of Jewish people in Germany in the 1920s whose religion and ethnicity were obvious).
Sure. The only condition is that you don't share the information in public, on pain of massive fines and imprisonment - you can only use it to improve your internal business processes.
I consistently hear this from the same people that own iOS or Android phones. You do realize that the mic on any phone produced in the last few years is also "always listening," right?
A) It depends on if you enable it.
B) Just because you're willing to accept something in one domain doesn't mean you are universally OK with it being applied everywhere.
C) I trust Apple much more than I trust Amazon.
Been using snips in 3 different rooms in my home for about a year now, hooked into home assistant, with many custom actions that I specifically made for my apartment. I LOVE being able to control my apartment without the need for the internet.
Too bad local control seems to be a luxury these days, rather than the standard.
I can't believe that so many people, especially the ones on tech sites like this, bought into the idea that speech recognition is such a hard job that it needs to be run on the supercomputers of tech giants. We had somewhat decent voice dictation software on desktops 20 years ago, when 100 MHz processors and 32 MB of RAM were top of the line, yet now it's supposedly impossible with orders of magnitude more resources.
20 years ago you spoke directly into a microphone and could only use an extremely limited set of supported languages and locales/accents.
You're also missing the point. I don't think anyone has ever claimed that speech recognition can only be done on supercomputers. Your laptop can surely run one of these models (though it would take a long time to train one). But there's a reason why an Echo Dot costs $20 and not $1000.
The main reason this kind of thing is outsourced to the cloud nowadays is because of deep neural network voice recognition technologies we have. Most of these models are too hefty to run inference on-device. Also, online learning allows for STT to get better as it’s used more if it’s centralized in a place like the cloud.
Your iPhone is also a $1k device that's faster than some laptops. And it still can't do convincing on-device text to speech à la Google Tacotron, and its NLU capabilities _even in the cloud_ leave much to be desired.
Much of the cost of the iPhone is in the screen, battery, form factor and fashion accessory premium. Take that away and you're much closer to Raspberry Pi territory.
An FPGA on which it is worthwhile to do deep learning costs more than the iPhone, and consumes a lot more power. Your best option starting next year will be sub-$100 Chinese chips with a TPU-like unit built in. The only one I know of is RK3399Pro, which was supposed to come out this year, but didn't make it, apparently because the die had to be larger than they planned.
It can get data from the internet, I think the point is that it "lives" on-device and hence can be interacted with offline, unlike Siri/Alexa/Cortana which live entirely on the company's servers.
Siri/Alexa/Cortana are assistants, while the part you are talking about is speech recognition. Sure, it’s great that this is done offline, but I don’t think it’s fair to compare it to more services because at that point you do need to get data from the internet just like any other service.
I would say your comment was the unfair one. Sure, it cannot tell the weather without internet. But it can still do plenty of more important functions, namely home automation. No one owns one of these home assistants for it to tell the weather. You get it for controlling other devices with your voice, and this can do that without access to the world wide web.
I believe the intention is providing a voice recognition system you can use to interface with other things, like APIs. Products from Big Data (in my usage like Big Oil, Big Tobacco, etc.) offer convenience at the cost of privacy. For many people, this is a minor trade-off. For others, there are alternatives with perhaps less convenience.
I think it’s misleading to sell this as a “privacy-focused AI” (specifically, similar to Siri or Google Assistant) when it’s really an offline speech recognition tool. This keeps the “privacy” part accurate, and makes it clear that intents are still being sent out from the device (as opposed to audio recordings).
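To make the audio-vs-intent distinction concrete, here's a toy sketch. It assumes the transcript was already produced on-device; the regex rules, the intent names, and the idea that only `get_weather` needs a remote API are all made up for illustration and don't describe how any particular product works.

```python
import json
import re

# Hypothetical: the only intent that needs data from the internet.
REMOTE_INTENTS = {"get_weather"}

def parse_intent(transcript):
    """Turn locally recognized text into a small structured intent (toy rules)."""
    text = transcript.lower()
    match = re.fullmatch(r"what is the weather in (\w+)", text)
    if match:
        return {"intent": "get_weather", "city": match.group(1)}
    match = re.fullmatch(r"turn (on|off) the (\w+) light", text)
    if match:
        return {"intent": "set_light", "room": match.group(2), "state": match.group(1)}
    return {"intent": "unknown"}

def outbound_payload(intent):
    """Return the JSON that would leave the device, or None if nothing is sent."""
    if intent["intent"] in REMOTE_INTENTS:
        return json.dumps(intent, sort_keys=True)
    return None  # handled on the local network; no audio and no intent goes out

weather = outbound_payload(parse_intent("What is the weather in Amsterdam"))
light = outbound_payload(parse_intent("Turn off the kitchen light"))
print(weather, light)
```

So the remote service still learns that someone asked about the weather in a given city, just not what their voice sounded like or what else was said in the room.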
I am baffled by how many people have no problems adding always listening and recording Google and Facebook devices to their living rooms.
Some are considered tech people who should know stuff like this could happen, but they just don't care or don't think it will happen to them.
Somewhere there is an ex-Stasi officer who is thinking "I wish we were that good and had people fooled to voluntarily install listening devices and didn't have to tap phone lines and crawl around basements and rooftops".
1) It's always listening for a keyword, and then it starts recording. It's a voice interface for a computer.
2) Your phone is doing the same thing, except that it follows you around everywhere you go and records your map location, what apps you use, who you are talking to and what your web searches are. Your phone isn't recording just what you say... it's recording what you do and where you do it and when you do it. It's overwhelmingly more pervasive and invasive if you value privacy.
3) The utility of being able to interact with a computer via my voice wildly surpasses the risk. Even if it had been my voice sent to this random other person, all they would get is 3000 recordings of me saying "Alexa, what is the weight of an elephant?" "Alexa, play Magic Sword." "Alexa, what's the weather?" My father-in-law thought these things were the devil until he realized he could ask it to play ANY SONG HE WANTED and it would do it. Now he is always unplugging ours and plugging it back in on the porch so he can use it out there.
Literally every article about home assistants brings out a number of people who continue to say how they can't understand how people could have this in their homes. Great, you can't understand it. Meanwhile, an enormous number of people find immense utility in home assistants otherwise they wouldn't be proliferating so wildly for this long.
Point 1: What triggers the keyword and what is actually said are miles apart, as shown by the Alexa creepy laugh episode. There is no practical way of knowing when Alexa has heard a keyword and recorded content around it. Not to mention the Google Home Mini incident, where a poor wiring job made the devices think the "enable assistant" button was constantly being pressed.
Point 2: My phone, an iPhone, offers a way to turn off the keyword-listening functionality - and presents this switch during initial setup. It also offers a way to turn off all tracking functionality. Apple has no financial gain from lying about this.
Point 3: This comes back to point 1. If you don't know what and when the assistant is listening to, you have no idea what phrases are being collected and shipped off. Also, there are plenty of studies floating about the internet about how "anonymized" search engine queries have been collected, studied, and in many cases paint very personal portraits of the people doing searches.
> There is no practical way of knowing when Alexa has heard a keyword and recorded content around it.
I know this isn't the kind of certainty you're looking for, but Google Home and Alexa both allow you to enable audio cues for when they start and stop recording. I find this helpful from an accessibility perspective, but the privacy-conscious may also appreciate the indicator.
They also have visual cues enabled by default. The light turns on when it is listening. You quickly get an idea of when it is recording and when it is not. And it rarely falsely triggers.
Point 2: Although you’re right about the application processor side of your phone, you cannot disable a significant amount of tracking functionality built into the baseband/modem. Your cell carrier or ISP can track you the same way that Google could on an Android device. On the other hand, iPhone basebands usually don’t have DMA, while other phone manufacturers allow the modem to access application processor RAM.
The iPhone has a lot less tracking built in and has much better privacy controls than Android with Google Play Services, but there are still some avenues for different companies to track you and to have a lot of control over your device.
Re: point 3, I was reading Kevin Murphy's book on Machine Learning recently and in it he says that roughly 20% of Google search queries that come in every day have never been seen before. Reminds me of that report about Russian election interference campaigns that included the tidbit about Google being able to pinpoint the exact search queries that members of some GRU unit (Fancy Bear, maybe? idk) made.
> There is no practical way of knowing when Alexa has heard a keyword and recorded content around it.
This is incorrect. If you want to know what your device has heard and processed (or failed to understand), you can look at the Alexa app on your phone. It will tell you what it thought you asked for and what it told you or played for you, and you can replay the recording of your voice request.
You can opt out of basically every Google login/tracking in Android, except the Play store. I know because I'm logged out of everything and I disable location services. Google Maps keeps working.
What I don't understand is why they have to store voice recordings instead of deleting them once they understand the vocal command. There could be a number of reasons and I don't like any of them.
This is one of those massive privacy violations that I cannot believe has been allowed to exist as long as it has without people making a concerted effort against it. I remember way back in the early days of Android when it was excused for reasons of allowing Google's live traffic mapping feature to work.
That excuse made sense in a time where public trust in these companies was justifiably high but it's turned into a situation where the major tech giants have the power to become a modern day Stasi on steroids (and increasingly show their willingness to play that role). We had to place enormous trust in these companies' ethics (mostly on a subconscious level as they all used to go to great lengths to afford us the ability to implicitly trust them) in order for the smartphone/mobile revolution to occur. Now that these devices are a necessity in modern society, they've all dropped the pretenses that allowed us to trust anything to do with the information they're collecting.
We're living in a cyberpunk dystopia and I don't see an easy way out.
It's to train their speech recognition and intent recognition systems. They probably also use the data to prioritise which queries they add support for.
Even if you do this, on a standard Android device, there is still Google Play Services installed, which has every permission available on your device. You have to install a fresh custom ROM like LineageOS without GApps to avoid this.
Good point. You should install LineageOS first, you can then select the GApps bundle size you need or try and replace them with FOSS alternatives. For me, Yalp Store has made using LineageOS + minimal, permission sandboxed google services a bearable experience.
> What I don't understand is why they have to store voice recordings instead of deleting them once they understand the vocal command. There could be a number of reasons and I don't like any of them.
Not that I have any insider information but a while back someone made a post on reddit about what baffles them about their work and they had a job where they basically listened to very short voice clips and saw something on the screen and they had to decide whether it was the right word on the screen or not.
There are tons of phone systems that use speech recognition and warn you that you are being recorded for quality assurance purposes. I would assume those recordings come from that type of system. Google and Amazon have been clear that humans will not be listening to the recordings.
> Literally every article about home assistants brings out a number of people who continue to say how they can't understand how people could have this in their homes.
It feels weird to feel out of touch. I’ve been a computer/technology enthusiast my whole life, and used to proudly early-adopt all sorts of crazy gadgetry. But the older I get, the fewer things coming out of Silicon Valley appeal to me, and the less I understand the use cases. I just have to shake my head and accept that they are somehow useful to a lot of people and that the idea of a cool useful product is changing.
I would love to have an always-on always-listening home assistant which did not ship my audio stream off to a cloud I don't control.
The fact that all the current listening assistants do ship the audio stream to the cloud has nothing to do with CPU power and everything to do with a desire to collect as much personal data as possible, and as far as I'm concerned they can pound sand.
I get 1) and 3), but I don’t buy 2). Your phone works with Siri/any other assistant off. The connected speakers don’t. So the comparison doesn’t hold in my opinion.
As an example for #2, when I got an Android somehow I was either automatically opted in or accidentally opted into location sharing. Two years later I found that Google Maps could show me every location I had been to for the last 2 years. I could review, day by day, everywhere I had gone, what time I went, how I drove there and where I went to next.
You can just mute the connected speakers and it's the same as having Siri off. Or get one of the ones that requires you to push a button (the Amazon Tap).
1) It's always listening for a keyword, and then it starts recording. It's a voice interface for a computer.
Show me the source code!
...all of it, including what's running the entire cloud behind it! Until then, fuck no, no hot mics in my house. And actually then, still no hot mics unless I've plugged my Les Paul into my amp, in which case I switched them on.
If we are playing the "you can't prove that" game, then you also can't prove that your $12 throwaway phone isn't always recording, or that your new kitchen countertop doesn't have a recording device embedded in it, or that the maker of whatever device you are using to type that comment isn't breaking a plethora of laws to record everything you type or say or do around that device at any time.
It's reasonable to embed motive into these suspicions. My confidence in iPhone not listening to me is linked to the fact that Apple's ad network is a fraction of the size of Amazon's and Google's, and that to the best of our knowledge, it doesn't require extensive access to personal data to run effectively. Same with GE appliances and countertop makers. One of the most reliable ways for me to believe a company will not harvest my data surreptitiously, now or with TOS updates in the future, is for me to be convinced it's not in their best interest to do so.
1- Apple might not be interested in listening to its users but I'm sure other businesses are. Apple users are loyal customers who can pay twice for a piece of technology and some of them will happily spend a night in line to be able to do that. Most advertisers would pay big money to Apple to know who those people are and what they want; that information is like gold.
2- Unfortunately, advertisers aren't the only entity potentially interested in spying on people.
> Most advertisers would pay big money to Apple to know who those people are and what they want; that information is like gold.
I think it's laughable that this money would be tempting to Apple when they have a business that almost sees customers buying 700-1000 dollar plus devices on an annual or bi-annual subscription, especially when the privacy stance is one of the core pillars of the marketing message.
There is likely no commercially viable business model Apple could adopt to sell that data that would be remotely worth risking a 60-70 billion dollar a quarter in revenue money printing machine. That information is practically worthless in the context of iPhone revenue.
Should iPhone sales start to falter, this is absolutely something I would consider a concern, but the iPhone has a long way to fall.
When my throw away phone is transmitting, it gets warm. I also set it next to an amplifier and I can actually hear it bleed into the amp when it chats with the cell site. I can even tell a couple of seconds ahead of time when I am getting a text or a phone call.
And all a hypothetical "bug" needs to do is listen for some key phrases, and send a single byte or a few bytes of data across the wire, even through side channels.
I can imagine a system that throws an extra byte or 2 on those "chats" with the cell tower, or even abuses the timing that it decides to reach out and check the tower as the method of communication.
The actual voice recording and processing tech is small enough that it could be thrown right alongside the actual phone, for pretty cheap, and could run entirely separate from the phone itself, maybe even having its own battery!
Yeah, but you can also do the same with the Alexa device by watching it transmit data over the wire. If you think watching the transmissions is enough to prove your cell phone isn't recording, then it should be enough to prove the Alexa isn't recording.
I vaguely remember rumors (or maybe it was used in a few court cases by the FBI), where carriers would turn on the microphones of 2G dumb phones remotely and silently. So your dumb phone probably has the same issue.
Carriers also keep records of where subscriber devices have been, so even if your phone doesn't have GPS, the carrier has cell tower triangulation logs of the device too.
ISPs such as comcast also save your browsing history to run their advertising networks, so that is also recorded somewhere too. And if/when net neutrality goes away completely, they might just sell that info too:
There's nothing stopping Alexa from recording content locally and transmitting it with a traditional request once activated.
Also, there's no reason for Amazon to want a constant stream of Alexa data hitting their cloud services. The goal is to mine and extract valuable data about users to build profiles, ideally you'd want to do that on-device and send up the most valuable bits.
These are also capabilities they can silently introduce at any time for any device.
You can store around 1000 hours of recorded speech in a gigabyte of flash memory that costs $0.25. For most people, that could hold their entire conversational activity for a year. Storage and off-line upload (perhaps during a software update when packets are flying around anyway) isn't difficult.
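As a sanity check on the 1000-hours-per-gigabyte figure, here is a minimal sketch of the arithmetic. It assumes a gigabyte means 10^9 bytes and that every recorded second is stored; the function name is made up for illustration.

```python
def implied_bitrate_bps(storage_bytes, hours):
    """Average bitrate (bit/s) needed to fit `hours` of audio into `storage_bytes`."""
    return storage_bytes * 8 / (hours * 3600)

GIGABYTE = 10**9  # assumption: decimal gigabyte

rate = implied_bitrate_bps(GIGABYTE, 1000)
print(f"{rate:.0f} bit/s")  # roughly 2.2 kbit/s
```

So the claim only works with a very low-bitrate speech codec, or by skipping silence, which is plausible for conversational audio but far below ordinary music-quality encoding.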
They cannot silently introduce new hardware to these devices. You should look into what kind of hardware these devices actually contain. Try to set up an intelligent, personalized data miner with those constraints.
Steve was not thinking this through. I can batch compress chunks of audio and lazy upload them when I think nobody is around. A listening device could probably even guess when you are not going to be using your internet connection.
You guys are way too paranoid. They're a public company and they are constantly under audit. They have explicitly stated in their white papers that the device has two computers, one that just listens for the "Alexa" keyword that activates the other one for signal processing.
Do you really think they would lie on those white papers? The government would be so far up their ass with fines it would be ridiculous. Not to mention people have actually done tear downs of the device and you can see the two computers and monitor the power input to each one as it runs. The "dumb" computer with minimal memory would be having to store hours of speech data until you say the keyword so it can transmit to the other computer.
I have suffered every kind of audit, many times. I know of many technical facets of every business I worked for that would never have been a part of any of those audits.
Do you know of any published Amazon audits that describe in detail what specifically around Alexa is audited, if anything?
Actually, quite the contrary has been proven to point 1: both for Alexa (the spooky laugh episode) and Google Play (the "always pressed" button episode).
I almost think that the burden of proof that things aren't being sent is a valid expectation at this point. And that proof needs to be re-confirmed with each firmware update to ensure a bug hasn't caused the behavior to change.
Those aren't examples of the contrary, those are examples of imprecise recognition. That's miles away from being the same thing as "always recording".
And your SMS is logged as well as cell towers to which you connect. Is the “phone company” more secure than any other? If you really are someone in need of a tin foil hat, you should ditch the phone completely and communicate solely with dead drops and one-time pads.
SMS is actually clear text over SS7. I would not send anything I care about over it. It is also trivial to spoof.
All you need is 1 SS7 link and you can be any phone, kick anyone off a cell site, override call congestion and bump people off, change the display on your phone to any text, too many things to name.
> Nobody can prove that. It's always listening and you have to assume always recording and transmitting
Uh... packet analyzers, IP logs, bandwidth monitoring tools etc...
If instead of transmitting occasional bursts of data when you say the wake word it's been sending back tens of megabytes an hour, or tens to hundreds of megabytes a day, you can go "hey, it's probably transmitting everything it hears".
There are people that religiously watch what these devices are sending back just to try and catch one of the companies doing something fishy. There are people that religiously watch their network traffic period looking for something nefarious.
These devices aren't like your phone, where simply having it on can have all sorts of services/apps using data in the background or you open a webpage and tens or even hundreds of different servers are contacted and data pulled/pushed to/from them. It's a device that should be periodically checking for an update and sending very small audio files back in nearly-real time. If they were recording regularly, or especially all the time, and transmitting it back to a server it would stand out like a sore thumb.
Never mind the fact that keeping it secret, and not having a whistle blower, would be damn impossible. Someone's going to mention it to a loved one or friend, or flat out go online "I discovered my Amazon/Google device is actively listening and recording and sending to these IPs in these intervals" when they just know it as fact and didn't have to 'discover' it.
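A rough sketch of the kind of check described above, in Python. The record format `(source_ip, destination_ip, bytes)` is an assumption standing in for whatever your capture tool exports, and the 10 MB/hour threshold is just the "tens of megabytes an hour" figure turned into a number, not a measured baseline.

```python
from collections import defaultdict

def upload_totals(records, local_prefix="192.168."):
    """Sum bytes each local device sent to non-local addresses."""
    totals = defaultdict(int)
    for src, dst, size in records:
        if src.startswith(local_prefix) and not dst.startswith(local_prefix):
            totals[src] += size
    return dict(totals)

def heavy_uploaders(totals, hours, threshold_bytes_per_hour=10_000_000):
    """Devices whose average upload rate meets or exceeds the threshold."""
    return sorted(ip for ip, sent in totals.items()
                  if sent / hours >= threshold_bytes_per_hour)

# Hypothetical one-hour capture.
records = [
    ("192.168.1.20", "203.0.113.5", 40_000),       # short burst after a wake word
    ("192.168.1.20", "203.0.113.5", 35_000),
    ("192.168.1.30", "198.51.100.7", 12_000_000),  # sustained upload
    ("192.168.1.30", "192.168.1.1", 500),          # local traffic, ignored
]

totals = upload_totals(records)
print(totals)
print(heavy_uploaders(totals, hours=1))
```

This only catches a device that streams more or less raw volume. It says nothing about what is inside the packets, and a lower threshold would be needed to notice heavily compressed audio.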
Which nobody can do. It is encrypted and unless the device accepts invalid certs, they can queue up, compress, batch upload any data they want. They control the security, not the end-luser.
We know how much storage is on the device, and we could tell through network monitoring if they were actually doing this. There is just enough smarts in the device to only turn on the recordings when the wake word is spoken. That's not to say a malicious player wouldn't hack it and insert their own code to listen all the time, but based on the evidence, I don't think Amazon is that player.
If they batch compress and upload, it won't be a stream. It could be AES-256 encrypted, LZMA2-compressed chunks. Using the right codec, they could likely upload an hour of audio in a few hundred kbytes. I have seen some demos of highly bw optimized codecs.
Here are some starting points. [1] I would go with Opus. My voice chat server uses that.
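For a back-of-the-envelope check of the "few hundred kbytes per hour" figure, here is a small sketch. The two bitrates are my own picks for comparison: 6 kbit/s is the bottom of the range Opus is specified for, and 700 bit/s is the lowest Codec 2 mode (Codec 2 is not mentioned above, it is only here as a reference point). It assumes constant-bitrate encoding with no silence suppression.

```python
def bytes_per_hour(bitrate_bps):
    """Bytes produced by one hour of audio at a constant bitrate."""
    return bitrate_bps * 3600 // 8

codecs = [
    ("Opus at 6 kbit/s", 6_000),       # lowest Opus bitrate
    ("Codec 2 at 700 bit/s", 700),     # assumption: comparison codec, not from the thread
]

for name, bps in codecs:
    print(f"{name}: {bytes_per_hour(bps) / 1000:.0f} kB per hour")
```

Under those assumptions Opus alone lands in the low megabytes per hour, so getting down to a few hundred kilobytes would need a much lower-bitrate codec or dropping the silent stretches.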
> Using the right codec, they could likely upload an hour of audio in a few hundred kbytes.
But at the same time, you're convinced that you would notice if your phone was slipping out these kilobytes as part of its usual communication?
It seems to me that you've just arbitrarily picked a level of paranoia where you've decided that this can definitely happen with the Alexa, but won't happen with your phone. Your proud paranoia isn't applied consistently.
That's possible, but that's quite a bit of processing for actionable speech-to-text. That means either the device risks acting sluggish when you query it, or you have to give it way more processing power than it needs, which increases cost. From my understanding, devices like Amazon's are already being sold relatively near cost because they are intended to get you to spend more money inside the Amazon ecosystem.
>The utility of being able to interact with a computer via my voice wildly surpasses the risk.
How?
Risks:
Willingly putting a proprietary device designed to listen to you and watch you, created by a company that exists only to maximize its shareholders' profits by either selling you things, advertising to you or collecting and selling data about you in order to do those first two things better, into your home to constantly watch and listen to everything that goes on.
Utility:
Control a computer by talking to it...
For me at least... that risk far outweighs whatever utility I might get by being able to speak to a computer and honestly, if I really needed it that bad I would just set up my computer with voice recognition and control software. It's existed for a while now. I remember playing with it when I was a kid... it sucked... but that was more than 15 years ago.
The only real novelty of Alexa is that it's a standalone box that answers you back.
We believe that people want more natural interaction and AI at home, but that it should never be at the cost of their privacy. You never know what a company, or hackers, could do with recordings from your home.
This is why we are creating AI which can work on a Raspberry Pi 3, and works in English, French, German, Japanese, Spanish, Italian (more coming very soon).
Take a look at our blog if you want to build your own smart assistant: https://blog.snips.ai
Hey, your product looks super cool but I’m a bit stumped as to why you’re trying to force a token into it. It looks like it’s a simple payments token, which doesn’t usually turn out to be a great idea if you look at the remnants of blockchain projects that did the same in 2018.
Is this just a way to fundraise or do you have other ideas to make your product ‘decentralized’ aside from using a cryptocurrency?
For the record, everything else about the product looks amazing and I’m really excited to see an open and privacy-conscious alternative in this space :)
Have you found that your users really want a decentralized marketplace vs the one you've built already (which I think is really good!)? Additionally, remember that all transactions on ETH are public, so if anything transactions would be less private than if you were to contain them to a normal web2 marketplace.
Are you working on making actual hardware like smart speakers?
I'm only asking because this would definitely help spread this kind of platform. The Raspberry Pi 3, despite being great, is not something the average Joe will mess with.
There needs to be a physical product with Snips on it that people can put in their home. Something sleek and well designed, not some Raspberry Pi 3 in a case with some mic attached with duct tape.
I've made a fairly well-considered decision to use most of these new services. I keep virtually all of my content on services run by five major corporations because I'd rather have my entire phone's photos published on the front page of the Washington Post than lose all the photos of my kids. I'd rather have Jeff Bezos listening to everything happening in my house than have the more frequent trips to the grocery store that were needed before I had voice-controlled shopping lists. I'm not an ignoramus, I've just decided that certain things that are very important to some are not important to me.
I agree that the decisions that me and others are making are hugely damaging collectively but that is a different issue.
And you're not the only one. Outside of HN, in the real world, the average person doesn't set up a FreeNAS server or homelab for data storage, doesn't know how to do backups, and doesn't have nearly as much concern over privacy as the fear mongers here at HN, Reddit, and tech publications.
Fear mongering is nothing new, folks. The media is incentivized to cater to people in that way; this is why they use scare tactics and write BS articles like this one which boil down to "someone fucked up, sent the data for one customer to another customer". Furthermore, the article implies this mixup would NEVER have happened if the user's had not requested their data be released per GDPR.
The only valid question worth asking is this: are you willing to risk ANY chance of your data being compromised? Google/Apple/et al. are not the enemy. For most users, convenience trumps security, and they will accept the minimal risk that comes with using services like Alexa and Google Photos.
FWIW, I've lost years of photos and data before on my first run-a-round with FreeNAS. It's not something that's just easy for the average user or even average developer to use and deploy reliably. I'm trying FreeNAS again now 4 years later, but it's taken way more work than I expected to understand my options and ZFS, and I'm still not sure I'll ever store anything critical on it.
> the article implies this mixup would NEVER have happened if the user's [sic] had not requested their data be released per GDPR.
OK, cool. So as long as I don't assert my rights under data protection regulations, Amazon probably won't orchestrate a grotesque breach of my privacy?
It's a mistake, man (or woman). Shit happens. ¯\_(ツ)_/¯
We have to expect as users of cloud software there is a chance our data could be compromised by BUGS or human error. I think that should be a given in any scenario. There's a difference between releasing data to third parties (or internal employees having access) as a matter of standard practice, and an unexpected breach or error in process leading to accidental disclosure of personal data.
As a reader, I am uninterested in articles about the latter. It's clickbait / doomsday journalism. The most interesting stories are those about grave errors in judgment/behavior and cases that indicate a company has routinely breached user trust. I do not agree that "employee mistakes" fall in that category.
It's amazing to me that people are actively seeking to erode society as greatly as they are. The fact is, it's not better to have these systems long term. So what if it saves you even 10 minutes a day (which I highly doubt), if eventually someone comes to power that will decide to kill you and your families based on your prior beliefs.
Today we are seeing a more subtle manipulation of politics using all the data these large companies collect. I can't imagine what would happen if there was a concerted effort to "weed out" bad eggs.
There's some weird cognitive bias people have about visible vs invisible technology.
If they can't see it they're fine with it.
For example, if someone is walking in the mall they're being filmed by a dozen cameras.
But if you walk behind them with your own handheld camera photographing them they will get VERY angry and freak out and start yelling at you.
Yet nothing has changed...
Same thing with Facebook. People don't mind giving their data to Facebook but if Zuck was staring over their shoulders reading their email they would be upset.
I can also debug any code running on my machines and have some idea of what they are doing. I can also limit when they are allowed to talk to the internet and to what address.
Maybe if I have the debugging symbols and kernel headers. Even then there are circumstances where it won't help.
For example: Nearly all cell phones check in to their manufacturer's website from time to time. Nearly all of them look for a header that triggers a debug mode that turns on CarrierIQ. (It isn't called this any more, AT&T renamed it to an empty name to stop people from searching for it).
Trying to find obscure functions and understand how they interact with the servers they check into can be very difficult at best. It often leads to more questions than answers, at least based on my experience.
What's funny about that is, I had voice recognition software in 1991 that could start up any application on my computer, reboot my computer, etc. Surely there must be a small device with the processing power of my old 386-DX40.
I believe it was made by a company called Covox or something? There were several other companies that made similar software back then.
Yes. Not very well, but yes. Each person had to say a few sentences. This is fine for home use unless you run a halfway house.
Certainly not as well as the software we implemented to listen to phone calls in the background on wireless networks in the mid 90's. That was the wildfire project and was fun. We abandoned it, since our lobbyists were able to kick the hands-free laws down the road.
I would disagree. I despise voice interfaces and despite working for one of the major tech companies that build these devices and interfaces, I use exactly zero. I do trust my employer to do it correctly based on the processes I witness every single day, but I don't own any because I don't like voice interfaces. Remote control on my phone, sure. Clicking the button to change a song, sure. But no voice, thanks.
Maybe a naive question, but how are they any worse than having phones with non-transparent hardware/radio drivers in your living room? We seem to have accepted that risk, and even carry phones around everywhere.
We use smartphones for the same reason we don't avoid any non-free software like Stallman: a huge amount of additional possibilities. But what Alexa, Siri etc. offer is just a different way to use a limited set of existing features, in exchange for a much higher risk of a privacy breach.
Everyone needs to draw a line somewhere, I just don't see a reason to use voice assistants, definitely not with internet connectivity and where the company keeps all recordings for some reason.
They keep all recordings because they train their voice recognition off them. GOOG-411 kept all recordings too, for the same reason, as do hundreds of other services, but I don't recall anyone caring about them.
Google gives you a UI to view and delete your recordings and suddenly they're monsters.
Has anyone on HN put their phones into Airplane mode with WiFi on and then had Wireshark tracking all outgoing packets from the WiFi devices in their home?
To see exactly what outbound connections were made by which device and which app when either device was being actively used, or laying dormant?
This is how it was discovered in the UK that SmartTVs were sending viewing data....
> I am baffled by how many people have no problems adding always listening and recording Google and Facebook devices to their living rooms.
To be less baffled, think about how many of us have those devices under a different name: Our phone.
Functionally I do not see the devices any differently. Yet, I think everyone I know has an Ok Google, Hey Siri, or Ok Bigsbi (however you spell it lol).
I'm not going to lie, for some reason Alexa makes me more concerned than my phone. Yet, the logical part of me fails to see a meaningful difference between my phone and an Alexa.
Disclaimer: I own an iPhone, I do not own an Alexa/HomePod/etc.
Hey I’m right there with you. I keep voice assistant disabled on my phone FWIW and my girlfriend and I both refuse to have voice assistant devices in our studio. Our upstairs roommates have the whole top floor of the house bugged. They mostly use it to turn on and off the lights. Downstairs we also have Hue lights but we use the app. I’d love a locally hosted assistant but I’m busy with so many projects and it’s low on my list.
> Some are considered tech people who should know stuff like this could happen, but they just don't care or don't think it will happen to them.
It's this thing where you somehow believe that just by being an expert in some field, bad things in this very field can't happen to you, even if it's out of your control. Like a brain surgeon suddenly getting a tumor in his own brain.
This. You're trusting everything you say & do to a company who says they're not going to share your data with others and that have the proper control in place to prevent other people from getting it.
Sad thing is, some current officers are thinking “oh nice. Let's find ways of exploiting this opportunity, by finding security issues on these devices, paying these corporations to allow us to sneak into their devices, or any other means necessary to use these amazing machines.”
I do as well. I used to politely ask they unplug it, but invariably there were more in other rooms. I just explain my concerns, and avoid those people if they cannot grasp or honor them. So far it seems a good strategy, as I have dodged a few 'bullets' both security-wise and socially.
Yeah, I'm extra happy I have my Amazon Echo now, if it also works as a way of preventing paranoid conspiracy-theorist lunatics from interacting with me!
If there is one thing you could have learned from the post Snowden era it is that your paranoid hacker colleagues may (1) know more than you do and (2) tend to be right in the longer term about what you today consider to be ridiculous lunacy and (3) usually are still too mild in their delusions compared to reality.
Ha. Sure. It's going to keep happening, and on a larger scale. Then people will (again) argue it's no big deal. Still relatively early in the conditioning stage.
>I am baffled by how many people have no problems adding always listening and recording Google and Facebook devices to their living rooms.
I am baffled by people that think anyone cares about their conversation about Jeff at work or what Monica and Ross just did on the 75th viewing of an episode of friends, or that they talk to their stuffed rabbit named Bun that they've had since about a week after they were born about how miserable their life is and how they should just pack up and run away together... I mean...
"oh no, big bad google knows I shouted 'oi shut the fuck up' four times instead of three at the neighbors today" and "oh no, Amazon knows I asked what we want for dinner"
I guess I must be weird and other people are having extremely sensitive, if not classified, conversations in their residence but I just don't see what the big deal is.
Most of the people I personally know that find it bizarre people would have such a device have rewards cards, smart phones, take every damn quiz that myspace and facebook have ever had, checked in religiously with foursquare, tag their friends in every photo they upload seven times a day from every place they've been that day, check in everywhere they go on facebook and have 2+ streaming services monitoring every single thing they watch. Like, hellloooooo you're worried about a microphone that can't even get "hey alexa" "hey google" right half the time and actually trigger?
You know, if nobody cares about this, then why are people constantly up in arms whenever yet-another-Facebook-privacy-scandal surfaces?
Why do people object to having e.g. their financial or medical information disclosed?
Yes, there are people who are "sheep" in the sense that they would do anything if there is some voucher or discount for it. But most people would also object to a camera in their bedroom or photos of their kids being posted publicly online for everyone to see.
You are right that most of the stuff is absolutely mundane, especially in isolation. However, with a bit of analysis one could suddenly learn things about you that you would likely want to keep private - e.g. your sexual fetishes, whether or not you have some illness (insurers and some employers would kill for such info!), your political views, ton of data about your interests, which TV shows you are watching, etc. That's an absolute bonanza of data that makes any marketer salivate and see dollar signs. And a gold mine for all sorts of stalkers and creeps too.
"Arguing that you don't care about the right to privacy because you have nothing to hide is no different than saying you don't care about free speech because you have nothing to say."
What's baffling is how many times this same conversation has played out and people like you still don't get it.
There is a difference, though. I agree privacy is important as a principle, and we need to protect the right to it. But privacy is also not important to me, so I can use services that require me to give up some of it. I want people to be able to maintain privacy if they want to or need to, but I might not care about giving up my own.
I am baffled that people like yourself are not concerned in the least.
I grew up in an Eastern European Communist state where this kind of personal surveillance was the norm. People went and wrote down all their interactions with others. You can see these documents now, as the former secret police archives were published (what wasn't destroyed to protect the higher-ups).
It's amazing how the most innocuous remarks can be used against you in unbelievable twists. Just an example:
Person A criticized person C, who was his boss, as being incompetent, to person B, who was secretly an informer. Person B went and reported this tension to the secret police, as he was paid by the number of pages he wrote.
The secret police compiled a weekly report of what happened in each company and sent it up to the party for review. It just so happened that somebody in the party who saw the report had frictions with that Boss, and he used the report to push the Boss around. The Boss thought that person A himself wrote the report, and used his connections to get person A fired and banned, on the party line, from ever having a qualified-work position again - i.e. he could not be an engineer, teacher, or anything, leaving him with just construction work or janitorial jobs.
How often do you bitch about your boss? How would you like it if your boss got reports about what you bitch about him in your own home? How would you like it if what you say, no matter how private you want it to be, were always recorded?
I'm astounded by the fact that people see nothing wrong with having these devices always listening to them.
>I am baffled that people like yourself are not concerned in the least.
>I grew up in an Eastern European Communist state where this kind of personal surveillance was the norm.
Do you feel the same about smartphones? They record location, are capable of recording audio and video, if you have wifi and/or bluetooth on they can record every device ID you come within range of, they can in theory record everything you do in every app you use.
Don't get me wrong, I recognize the surveillance value/implications of such devices. As Pokemon Go was starting to initially gain popularity I even wrote a piece showing how such an app would be extremely useful as a tool for HUMINT.
TLDR: You create a game location, like a gym, in an area you want to surveil, or deploy a rare spawn there. You then publicize it on social media with geo-targeting, and people flock to the area with an app that already chews through data, pointing their phones' cameras all around and giving you video and audio. You can either pull that in real time at a risky data cost, or grab still images and then decide whether to compress the captured video and send it, or leave it uncompressed and upload it when the device next connects to WiFi.
Here's the post https://www.ryanmercer.com/ryansthoughts/2016/7/11/pokmon-go... and please, 'pokemon go funded by the CIA' is not something I believed then or believe now, but there is a direct connection to the U.S. intelligence community via funding John Hanke received for a previous company.
Well, they made an active decision to buy a device specifically meant to record them and provide responses to said recording. It would be another thing to find out a phone records every conversation passively, but these devices are literally made for the purposes of recording you.
That kind of sounds like a self inflicted wound if they actually cared about privacy.
This is all said with the presumption that the devices actually are surreptitiously recording. Which is far from proven but already addressed by other comments here.
OK, by your definition of "recording", a laptop contains several I/O devices that "record" my input, not just webcam and microphone, but keyboard as well. I guess I should just shrug at the revelation that it comes with a keylogger. Because hey, if I didn't want my written thoughts to be recorded, why did I buy a memory-enabled Internet-connected device to do exactly that?
I even used the example of a phone recording you being bad because that wasn't the primary purpose of the device when bought. That argument obviously applies to laptops for the same reason.
But if you as a consumer decide to buy a device whose literal only purpose is to record you, then you don't also get to cry out about it recording you. That's the very definition of a self-inflicted wound.
I don’t understand how this problem of “human error” happened at a company like Amazon/AWS. Perhaps requests for Alexa info are rare enough that the internal interface for servicing such requests isn’t fully automated. But I’d be shocked if the process involved someone (low-level data entry person, or engineer) manually typing in a customer ID number.
There is obviously more to this story than Amazon is telling. https://www.heise.de/downloads/18/2/5/6/5/3/9/6/ct.0119.016-... says that this was a "one-time error" and that "Amazon also claimed that they had discovered the error themselves". It is highly unlikely that either of these is true. The one time they made this error just happens to involve someone who is savvy enough to contact the right journalists to investigate it? And Amazon coincidentally discovered the error themselves after being contacted by c't? Amazon is flat out lying.
If there is ever a case for the max GDPR fine to be imposed, this is it.
I wouldn't be shocked at all. Internal tools are never as streamlined as you would think. On top of that, the tools for doing this are probably not that mature given that GDPR hasn't been around that long, nor do a majority of users take advantage of it.
I'm having trouble getting worked up about this one. Yeah a privacy breach happened, but it was only one person's data exposed, and only to one other person.
The only reason it made the news is because people are already paranoid about voice assistants.
Was the data supposed to be stored in the first place according to the privacy policy/user consent? If not, it'd mean that Amazon stored highly sensitive data (audio recordings from people's bedrooms) illegally and in breach of user's trust.
If you enable Alexa to tune to your voice, it retains recordings. You can listen to the recordings in your Alexa mobile app, so it shouldn’t be that much of a surprise.
It shouldn’t be a shocker if you understand ML and infer what “tuning to your voice” implies. Most people aren’t sophisticated enough to infer that, however.
To me, this is a big demo of the unintended consequences of GDPR. It sounds like a great law, but forcing companies to share everything they know about you increases the risk that a nice package of everything is shared with the wrong party.
I'm not necessarily disagreeing with the thrust of your argument (do you really need to store all that?), but constraining your sample to people paid to talk to Alexa can create huge swathes of bias. You'd need to make sure the people you pay also reflect all the accents and languages of the people who use Alexa. On top of that, without some amount of voice data, how are you to even know what that accent breakdown looks like? That's a near-impossible task.
The only other kind of data I could imagine this kind of reaction to would be NEST cameras or similar. I'm pretty accepting of voice assistants now, but even I would be pretty put out by the thought of video of me in my jammies going to a stranger without my permission.
Browsing your site it is hard to see what you will charge for, and I am confused by the inclusion of some sort of token system. (It seems directly in conflict with the desire not to be beholden to an outside company to use the hardware if they are tied together?)
Mycroft does most of the Alexa parlor tricks, such as weather and Wikipedia lookups, out of the box. If you use the plugins you can integrate it with Home Assistant and Kodi, but I've had mixed luck.
They send the voice to Google for speech to text, but I believe that is configurable and they are working on a personal server which theoretically could operate entirely locally.
I was the TV "remote control" for many years growing up. If you ask my dad about the lights, he would say I had a bug when it came to turning them off.
I built something just for fun a couple of weeks back. I used the Sphinx Open Source Speech Recognition Toolkit from CMU https://cmusphinx.github.io/ Look for the sphinxbase and pocketsphinx packages under Ubuntu.
It'd be simple enough to hook this up to Philips Hue or whatever to do what you want.
There are numerous Sphinx language bindings. I went for Ruby via Isabella https://github.com/chrisvfritz/isabella. I used this because it provided a framework for what I wanted: define a simple grammar (JSGF, Java Speech Grammar Format) and call specified script(s) with the parsed results. The hardest part was probably mapping out the phonemes for the grammar atoms. (If your target language isn't English, you may be out of luck).
This worked really well for what I needed (directing band-in-a-box from the other side of the room) but still needs a little tuning. Even with a leading activation token ("Hey Isabella...") she sometimes gets confused and thinks she's been summoned when it's just some random song playing. Choosing a concise, simple, unambiguous grammar was helpful, along with sensitivity adjustment. There are other knobs to twiddle -- as evidenced by the academic mailing list activity -- but I didn't need to look closer for my simple use case.
It was a fun little project and the kids liked it, especially paired with a text-to-speech module (tts gem under Ruby): "Hey Isabella, am I <adjective>?" (or "Is <sibling> <adjective>?"), and a randomly generated response :)
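For anyone curious what "define a simple grammar and call a script with the parsed results" looks like in miniature, here is a rough Python sketch. To be clear about the assumptions: it uses plain regular expressions as a stand-in for JSGF rules and works on already-transcribed text, it is not Isabella's or pocketsphinx's API, and the rule names, phrases, and handlers are all made up for illustration.

```python
import re

# Hypothetical "grammar": one regex per intent, standing in for JSGF rules.
# Every rule requires the leading activation token ("hey isabella").
RULES = {
    "set_light": re.compile(
        r"^hey isabella turn (?P<state>on|off) the (?P<room>kitchen|bedroom) light$"
    ),
    "play_song": re.compile(r"^hey isabella play (?P<title>.+)$"),
}

# Hypothetical handlers; a real setup would shell out to a script instead.
HANDLERS = {
    "set_light": lambda slots: f"light:{slots['room']}:{slots['state']}",
    "play_song": lambda slots: f"play:{slots['title']}",
}


def parse(utterance):
    """Return (intent, slots) for the first matching rule, or None."""
    text = utterance.lower().strip()
    for intent, pattern in RULES.items():
        match = pattern.match(text)
        if match:
            return intent, match.groupdict()
    return None


def dispatch(utterance):
    """Run the handler for a recognized utterance; ignore everything else."""
    parsed = parse(utterance)
    if parsed is None:
        return None
    intent, slots = parsed
    return HANDLERS[intent](slots)


result = dispatch("Hey Isabella turn on the kitchen light")
ignored = dispatch("just some random song lyric")
```

The small, closed vocabulary is the point: the fewer ways there are to say something, the fewer ways there are to mishear it.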
I ended up using a set of JSGF grammars (one per intent) to generate a statistical language model for use with pocketsphinx. Rhasspy also features a web-based interface for creating custom words -- I have a mapping from Sphinx phonemes to eSpeak phonemes so you can iterate over a pronunciation until it sounds right.
As you mentioned, the wake/hotword stuff with Sphinx isn't terribly robust. I've been Docker-izing Mycroft Precise (https://github.com/MycroftAI/mycroft-precise) to address this.
What's worse is that newer generations will not know what an analog switch is, how powerful and yet simple it is, and how it just works in an instant when you want it to. When I was a kid I used to disassemble things and learned lots from the adventure because it was logical and intuitive (I also had to put them back together or I'd get scolded). Kids these days don't bother because the device of their choice captivates most of their attention and their control is limited.
This is an external service I wrote that does offline speech/intent recognition with pocketsphinx, and forwards structured JSON events into Home Assistant for use in automation scripts.
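As a rough idea of what "forwarding structured JSON events" could look like, here is a minimal Python sketch that only builds the HTTP request. It assumes Home Assistant's REST endpoint for firing events (`POST /api/events/<event_type>` with a bearer token); the host name, token, event type, and payload fields below are placeholders, and this is not necessarily how the service described above does it.

```python
import json
import urllib.request


def build_event_request(base_url, token, event_type, intent, slots):
    """Build (but do not send) a request that fires a Home Assistant event."""
    payload = json.dumps({"intent": intent, "slots": slots}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/events/{event_type}",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values here are hypothetical placeholders.
req = build_event_request(
    "http://homeassistant.local:8123",
    "HYPOTHETICAL_TOKEN",
    "voice_intent",
    "ChangeLightState",
    {"name": "kitchen", "state": "on"},
)

# Sending is left out on purpose; with a real instance it would be:
# urllib.request.urlopen(req)
```

An automation on the Home Assistant side could then trigger on the event type and read the intent and slots from the event data.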
IIRC, that was a case of someone inadvertently activating Alexa, and being misinterpreted. This is a completely different situation than what’s discussed in the posted article, which implies that an Amazon employee sent someone another user’s Alexa data.
The original article stated that they do know how it works but the device malfunctioned and detected a command where there was none without the owner even being aware that it was activated.
>"Using these files, it was fairly easy to identify the person involved and his female companion; weather queries, first names, and even someone’s last name enabled us to quickly zero in on his circle of friends,” according to the report. “Public data from Facebook and Twitter rounded out the picture.”
Please keep in mind the next time you or a friend says "well, what do I care if Amazon knows when I ask to turn the heat up".
I read about a cool Echo Dot hardware hack where it was built into an old phone, with the speaker and microphone muted until the handset is picked up. As a tech person, it boggles my mind that other tech people accept an always-on microphone in their homes. I truly don't get it!
I visited an AirBnB property that had a Dot in the lounge room. I got strange looks from my fellow guests as I unplugged it.
The best way to avoid this information from leaking out to the whole world is to NOT UPLOAD IT IN THE FIRST DAMN PLACE.
I honestly don't care how many pinky-promises Amazon, Apple or Google make -- until there are business-destroying fines akin to HIPAA on steroids for everything these services hold, I'll wait this tech revolution out, thanks.
It's an always on microphone (as opposed to an always on device) if hacked or if Google/Apple change something fundamental, but only then. Echo/Dot have an always on microphone by design.
This is key. Yes, the CIA or Apple or whoever could conceivably hack/change my phone or laptop to turn the microphone and camera on, but there's not much you can really do about that, as hardware switches seem to have gone out of vogue (I've got tape over my laptop camera like most sane people, but microphone?)
However I suspect the majority of iPhones have "Hey Siri" enabled, and probably the same with whatever Android uses
> However I suspect the majority of iPhones have "Hey Siri" enabled, and probably the same with whatever Android uses
IIRC, there's been a lot of coverage here about how Apple put in hardware to detect the "hey siri" phrase without having to go to software and bypass any security. That way it can always listen in "dumb" mode and switch to actively listening/recording and decoding the request when the right phrase is heard. I'll see if I can find the discussion (and whether I'm correct or not) and update this comment with whatever the case is.
Edit: From what I've been able to find without doing a deep dive, Apple notes that in general only local pattern matching is done on the "hey siri" pattern. Once triggered it may utilize other resources (such as online ones, maybe), but prior to that nothing is recorded.[1]
In conjunction with their newer laptop hardware and the new T2 security chip, I think that since the chip handles the "hey siri" matching, audio might not actually hit regular OS software processes until there is a match. It's not entirely clear from what I skimmed.[2]
From the docs on the T2 chip, it looks to be in the newer laptop lineup only.[3] I'm not sure if there's an equivalent bit of hardware in the phones or not, or an older iteration that has the same features in question on older laptops.
I turned Hey Siri off, but found something unsettling.
If I say RED FIRE TRUCK to my iPhone with Hey Siri turned off, and immediately hold the home button down for Siri, “she” sometimes has fire or fire truck pulled up already. This was WAY WORSE with Hey Siri turned on, where it would always have those words and sometimes even more.
So best case, they are always caching a little bit and maybe only upload or process it when I hold the button down - but it’s definitely clear they are doing a little general pre-recording of all words.
If they’re doing a little, it stands to reason they could do a lot. Which is a bummer, but it’s not like I have much say in the matter.
That's interesting. I often have a similar experience with pressing the voice button on Android, where if I was streaming some podcast the last few words would precede my intended voice search in what was actually entered. That's usually when streaming in the car through bluetooth though, and I think there's a couple second delay in audio there anyway, so that may explain it in my case. I can't seem to get it to do it manually right now with the phone's mic.
> So best case, they are always caching a little bit and maybe only upload or process it when I hold the button down - but it’s definitely clear they are doing a little general pre-recording of all words.
That is somewhat troubling to hear. I had some thoughts about how possibly some dedicated hardware is buffering and only handing it off on notification of the manual siri request, but on thinking about it, if the OS can request the info whenever it wants, that's not really any more secure than the OS recording it itself.
On the vast majority of devices it is not always on.
It is not a nitpick. Accidentally recording or even intentionally recording is many orders of magnitude simpler when the device is supposed to always be listening.
The fact that a single employee even could make the mistake mentioned in the articles is one example of why. Even though it also really does hint that the backend isn't up to par at Amazon.
> On the vast majority of devices it is not always on.
Would you consider the device "listening" if a DSP is listening for a particular hotword, but no audio data is being uploaded?
I'm pretty sure that's what happens in either case, whether you're using Alexa, Siri, or Google Assistant, and whether it's a "home" device or a phone with hotword detection. I used to work at Google on some related hardware projects.
If you're suspicious because the home devices seem much more accurate, that's because they have microphone arrays which can facilitate beamforming.
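To make the "listening but not uploading" distinction concrete, here is a toy Python sketch of a hotword gate with a short pre-roll buffer. It is purely an illustration under stated assumptions: words stand in for audio frames, the class and its parameters are hypothetical, and it does not describe how Alexa, Siri, or Google Assistant are actually implemented.

```python
from collections import deque


class HotwordGate:
    """Keeps a tiny rolling buffer on-device; 'uploads' only after the hotword."""

    def __init__(self, hotword, preroll_frames=3):
        self.hotword = hotword
        # Ring buffer: older frames fall off the end and are gone for good.
        self.preroll = deque(maxlen=preroll_frames)
        self.uploaded = []  # stand-in for what would leave the device
        self.active = False

    def feed(self, frame):
        if self.active:
            self.uploaded.append(frame)
            return
        if frame == self.hotword:
            # Hotword matched locally: hand over the pre-roll plus the trigger.
            self.active = True
            self.uploaded.extend(self.preroll)
            self.uploaded.append(frame)
            self.preroll.clear()
        else:
            self.preroll.append(frame)

    def end_request(self):
        self.active = False


gate = HotwordGate("alexa", preroll_frames=3)
for word in "we talked about red fire truck alexa weather today".split():
    gate.feed(word)
gate.end_request()
gate.feed("private")  # spoken after the request ended, stays in the buffer
```

In this toy version the three words right before the hotword ride along with the request, which is the kind of small pre-roll that would explain the "red fire truck" observation upthread, while anything older than the buffer never leaves the device.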
This example doesn't really persuade me to remove Alexa either. I don't care if Amazon knows who I am. They have my credit card and billing address anyway.
I cannot think of 1 single thing I talk about in my own home that I feel uncomfortable with Amazon knowing.
I'm a generally happy Echo owner but am getting increasingly worried these mistakes are only going to ramp up, especially with the new Drop In feature, allowing users to enable the microphone on any Echo -- not limited to their own -- given the appropriate permissions are in place. (It's not clear if these are enforced at all on-device or simply cloud-based permissions.)
I'm just waiting until people become accustomed to these devices and the tech giants reveal their new real estate play with these devices always on and built into the walls of homes.
EDIT: I am definitely feeling what Yanis Varoufakis was talking about when he said that if something isn't done soon, The Matrix will be a documentary.
Does Amazon keep the raw recordings for GDPR compliance reasons?
It seems to me keeping them just creates liability? (Yes data analysis for targeting and such would be valuable, but those can be done with, say transcripts instead of the raw recordings)
I'd imagine that a competent company would keep access to user audio recordings tightly locked down, either because they actually care about privacy or because they care about the bad PR of recordings getting leaked.
But if the GDPR requires them to send the recordings to users upon request, they need to have a way to do that -- either by exposing a service on the web, or by providing access to customer service employees to handle the request. Not sure how I feel about that.
I would assume that if they made a voice print for you personally it would be fine. It should still be recognized as your own personal data, and fall under the same laws.
Nothing in the article suggests they were linked to any kind of user identifier; all we know is that each recording is linked to the transcription Alexa made of it (which makes sense, training- and log-wise).
The users were easy to identify because if I listen to the last twenty queries you made to Alexa, then between what you ask, what other people say in the background, what you talk about, and what noise is in the background, that gives me a lot of information. That's exactly how they were able to identify some of the users from their recordings.
> Nothing in the article suggests they were linked to any kind of user identifier
They were able to provide the customer a bundle of his recordings upon request so they must've maintained such a lookup mapping.
Of course that led to the mix-up reported here. But I'd argue it's safer if, on Amazon's side, they simply can't fulfill such a request themselves due to anonymization.
I suppose that does raise a question of where you draw the line on a company's obligation to anonymize information. Is disassociating a user ID from the data enough? What about data that makes the user identifiable through patterns?
If I say "Alexa, my name is Bob Jones and I have chlamydia," they're not really helping me out by just not associating that audio clip with my username.
Didn't Amazon tell its customers that Alexa constantly records locally, but only sends the words after the "Alexa" magic, including 6 seconds before and 6 seconds after it, to the servers?
You are right. But I remember reading a similar story a week or two ago, where the police found out about the killer in a murder case by listening to Echo recordings from the kitchen. I doubt that this would be possible if there were no constant stream of data to Amazon's servers.
That gave me the impression that they store it all. At least for a while.
I feel for the average, clueless Joe who has limited knowledge, but anyone who has average or above-average knowledge of tech deserves what he/she gets in these cases. How stupid or tech-ignorant would you have to be to add a device that records you in your home?
Imagine: fights/sex with wife or lover, arguments with kids, talk about drug deals, tax evasion, leaving a job, stealing a neighbor's home, cheating on husband/wife... WTF would you want to take the chance that a recording device is even within 10,000 feet of your home?
Still worth it for me. And if they actually recorded all that and used it against me, it would be like living in a real dystopian science-fiction movie, which would be an awesome experience in itself. Similar to how you want to experience a zombie apocalypse. Yes, I think it's that unlikely.
If civilization keeps increasing data production around ambivalent use cases, perhaps we can successfully hide >99% of the lethal data sitting around out there until its utility expires.
I paid full retail for my HomePod (BestBuy in the states has them for 150 usd off) and love it. But I love it more because of the company behind it: Apple is not in the business to sell my data to the highest bidder. And their stance on security and customer privacy allowed me to put a talking tube in my home. There was no way in hell I was going to add a listening device from Google or Amazon.
It wasn't long ago that people were up in arms about the Xbox One sending audio back to Microsoft in order to implement voice commands. In the time since this kind of thing seems to have become acceptable or even mundane. What is it that changed? It's not as if big tech has become more trustworthy in recent years.
The experience was underwhelming. Playlists from Spotify especially were hit and miss: one day no problem, the next day sometimes an excuse, sometimes just a beep, sometimes silence. I would need to turn on the light by myself in the future though.
Very unlikely. In fact, the error came from an actual GDPR Right of Access request. If they inform the affected person and take measures to prevent it from happening again, then they're in compliance with GDPR.
GDPR was created to prevent organizations from hiding data breaches, trading or transferring customer information, avoiding privacy compliance, etc.
This is making the news because it's Amazon, but you would be surprised how many of these breaches happen on a daily basis in all types of organizations.
The first GDPR fine was against knuddels.de because they accidentally stored passwords in plain text [0]:
“By storing the passwords in clear text, the company knowingly violated its duty to ensure data security in the processing of personal data in accordance with GDPR Article 32(1)(a),”
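For contrast with clear-text storage, here is a minimal Python sketch of the usual alternative: store a per-user random salt plus a slow, salted hash, never the password itself. It uses PBKDF2 from the standard library; the function names and the iteration count are my own illustrative choices, not anything taken from the ruling or from what knuddels.de did afterwards.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, iterations, digest); only these are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected):
    """Re-derive the hash from the login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)


salt, iterations, stored = hash_password("correct horse battery staple")
ok = verify_password("correct horse battery staple", salt, iterations, stored)
bad = verify_password("wrong guess", salt, iterations, stored)
```

With this scheme a leaked database exposes salts and digests rather than the passwords themselves, which is the difference the regulator was pointing at.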
When I was in uni, root sent me the unshadowed /etc/passwd file... no idea what that was about. Just logged in one day with an email of 'passwd' and everyone's passwords were there, salted... almost seemed like they were daring me!
What happened to all the layers of security Alexa is supposed to have to prevent data from being sent upstream, isn't it supposed to process everything locally? Eww. Come on guys, Rule 1: Don't make creepy tech.
I'm all against unwarranted surveillance but for voice assistants you basically sell the risk of privacy intrusion for utility. You're making an active choice. I see nothing wrong with that.