This is a bold claim. The VRT journalists were able to re-identify a few data subjects and confirmed with them that their recordings were being listened to by Google contractors. "‘This is undeniably my own voice’, says one man, clearly surprised," wrote the journalist.
> The Google Assistant only sends audio to Google after your device detects that you’re interacting with the Assistant.
Well, this is not what the contractor said. The contractor heard private conversations, sex scenes, and violence including "a woman who was in definite distress".
Well, both claims could be true. Even if the snippets are not directly associated with users, there could still be identifying information in the recording that could be used to identify the person speaking.
I think this is a bit disingenuous and intentionally sensationalist. Nobody has any reason to believe that the Assistant is always listening, or that it will trigger randomly or on demand because Google wants to listen to particular conversations; that doesn't even really make any sense. Also, various parties have access to the source code, so if it were malicious, there would be evidence of it. That there isn't any evidence is important to mention in this context.
In reality it's just that it can randomly misclassify other audio as "Ok Google"; there's nothing malicious about that. And obviously the purpose of the annotator is to look at unrecognized samples, so obviously they are gonna frequently hear those cases. That's sort of the whole point of this.
Maybe "well, duh, how do you think it knows your voice? magical pixie dust?" is a bit of a cop-out, but honestly there are two options: them doing what they are doing right now, or not having Google Assistant.
I would notice because it sat on my computer desk.
To be clear, I don't think it was nefarious that it would, just that I realized I wasn't using it as much as I thought I would and decided to unplug it.
> We just learned that one of these language reviewers has violated our data security policies by leaking confidential Dutch audio data.
Isn’t the whole point that the audio is supposed to be innocuous? “Hey google, play Coldplay” or “Hey google, play Taylor Swift.” If that’s the only audio that Google stores, then there should be no problem with leaking it to the press.
But it isn’t, which is what the translator was trying to show in the first place!
I guess those standards don't include clearly letting end users know that their data will be stored, shared and transcribed by 3rd parties (not that the 1st party in this case isn't bad enough).
The leak is not the problem here (and it's extremely disconcerting that they're focusing on that). Their lack of communication with consumers is the problem. This blog post makes things worse as far as I'm concerned.
As if they need your location and browsing history if they're gonna play that song you asked for.
They claim that their experts can listen to a small number of recordings of human speech directed at their devices. However, there is no telling what people might be speaking to their Google devices. I don't use Google Home, but given that it allows one to set reminders, etc., I would think that people could be saying pretty sensitive things, including phone numbers, addresses, credit card numbers, etc. which combined together can identify them uniquely or at least allow those bits of information to be misused.
Also, embedded in the figure of 0.2%, mentioned for the percentage of spoken interactions listened to, is the assumption that it is a very small number. However, that number implies that 1 out of every 500 interactions is listened to. For a family of 4 owning a Google Home, the number of spoken interactions with it in a year would easily run into the thousands. Therefore, the experts at Google are listening to at least a handful of interactions every year for each family. Given the state of the art in speech-to-text, if these recordings are being converted to text and stored, it would not be hard to group the recorded interactions per family and derive some identifiable information about them.
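To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 0.2% rate is the figure quoted above; the usage level (4 people, 2 interactions each per day) is purely my assumption, and so is the idea that every interaction is sampled independently at that rate.

```python
def expected_reviewed(interactions_per_year: int, review_rate: float = 0.002) -> float:
    """Expected number of interactions a human reviewer hears per year."""
    return interactions_per_year * review_rate


def prob_at_least_one(interactions_per_year: int, review_rate: float = 0.002) -> float:
    """Chance that at least one interaction is reviewed in a year.

    Assumes each interaction is sampled independently (an assumption,
    not something Google has stated).
    """
    return 1 - (1 - review_rate) ** interactions_per_year


# Hypothetical household: 4 people, 2 interactions each per day.
household_interactions = 4 * 2 * 365

print(expected_reviewed(household_interactions))
print(prob_at_least_one(household_interactions))
```

With those assumed numbers the household lands at roughly six reviewed interactions a year, and it is close to certain that at least one gets heard. Change the usage figures and the result scales linearly.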
If by 'conspiracy theorist' you mean someone who suspects the government or corporations of foul play, then don't forget about 'government surveillance', one of the biggest conspiracy theories of the last several decades, until it turned out to be true. (Did you dismiss them as well, before Snowden put hard evidence behind the reasonable speculation?)
Either way, you shouldn't talk down to people who find minimal, available evidence worthy of speculation. It's not conducive to reasonable discussion.
A hobby of mine is seeing how far away from "Hey Google" I can stray. The accessibility feature to beep on wake helps with this.
From my own perspective (and I certainly don't mean this in a way that's dismissive towards you/your comment), given that those companies lie constantly, I'm not sure why I would trust their own PR about their products as a reliable source of information.
This very article. But since you said you don't believe them ("those companies lie constantly") why would referencing them help here? Either you believe them, and it is settled, or you don't and it is worthless.
Hook up Wireshark and you can quite clearly see when a Google Mini or an Echo is sending an audio stream, since it is a substantial packet load relative to normal "keep-alive" comms. The only exceptions to that, from what I've seen, are automatic software updates (but that traffic is moving down, rather than up).
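If you'd rather not eyeball the capture, the same idea can be sketched in a few lines of Python. This assumes you have already exported the capture into `(timestamp_seconds, direction, size_bytes)` records; that record format, the 1-second window and the 8000-byte threshold are all made up for illustration and would need tuning against your own device's keep-alive traffic.

```python
from typing import Iterable, List, Tuple

Packet = Tuple[float, str, int]  # (timestamp in seconds, "up" or "down", size in bytes)


def find_upload_bursts(
    packets: Iterable[Packet],
    window_s: float = 1.0,
    threshold_bytes: int = 8000,
) -> List[float]:
    """Return the start times of windows whose upstream volume exceeds the threshold.

    Downstream traffic (e.g. software updates) is ignored, since only
    uploads can carry audio off the device.
    """
    totals = {}
    for ts, direction, size in packets:
        if direction != "up":
            continue
        bucket = int(ts // window_s)
        totals[bucket] = totals.get(bucket, 0) + size
    return sorted(b * window_s for b, total in totals.items() if total >= threshold_bytes)


# Synthetic example: two small keep-alives, one upload burst, one large download.
capture = [(0.5, "up", 100), (10.5, "up", 100)]
capture += [(20.0 + i * 0.05, "up", 1000) for i in range(20)]
capture += [(30.2, "down", 50000)]

print(find_upload_bursts(capture))
```

On the synthetic capture only the window starting at 20 seconds is flagged; the keep-alives fall below the threshold and the download is skipped because it is moving down, not up.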
You could also look at the device teardowns, for example this article has a good overview of the underlying workings (inc. Wake Word detection):
The only reference that would be credible would be something from the company, so if you don't believe them, then it wouldn't really matter, would it?