Much of this strikes me as a "just because you can, doesn't mean you should" issue. Google clearly loves machine learning and doing cool things but I think lately they've been taking it too far.
For example: after purchasing a book on Amazon recently, I happened to do a Google search on that book, and the first thing I see is, "Your book is scheduled to be delivered on..." Aside from the creepy factor, I'm left wondering what purpose this serves. I just ordered the book. I KNOW it's on its way.
Turns out they just mined my emails from Gmail to provide it in search results.
I'm sure some developer or product manager thought it would be a cool thing to do without giving any consideration to usefulness, much less user privacy. I really don't feel like Google needs to know what I'm buying, thankyouverymuch. Gmail account: closed.
One man's "creepy factor" is another's superbly useful feature. The feeling of creepiness probably stems from being surprised and defaulting to a negative reaction. Remember that GMail was never supposed to be a dumb mailbox. If you want a dumb mailbox, there are tons of alternatives (e.g. almost every other provider and various open-source UI packages).
Honestly, I really enjoy those "creepy" features and want much more. For me, they can, and definitely should.
And please - like Google cares that you ordered that book from Amazon. Until they do.
Abuses of this technology are inevitable, but we haven't seen them yet. It's the source of the magical "how did they do that?" wow factor in technology that touch screens and thin devices used to have.
Maybe I'm weird, but for me none of this (touchscreens too) is "magical", and all of this is "interesting" and then "obvious" when I learn/figure out how they do it. Maybe that's why some people are afraid - because it's more magic to them?
> And please - like Google cares that you ordered that book from Amazon. Until they do.
Well, if Google starts caring that I ordered a book from Amazon (more than they already do - Google Now shows me info on my purchases, including delivery time), then they'll do what exactly? Tell their self-driving cars to kill me because I didn't use Google Play?
Google so far has a stellar history of being helpful and pro-user, quite often pro bono at it. Please apply these levels of scrutiny to someone else first, like every other SV startup running on the investor-storytime model.
I don't think the risk is Google harming you for the books you bought, but disclosing the information to a government who may.
For example, a government (China?) may pass a law to force Google to disclose the list of nationals who bought certain books (political book criticising the Chinese government?), and Google may choose to comply to stay in that market.
Except China tried something like that, and Google abandoned the country. So we have at least one strong data point suggesting they're unlikely to do it now.
But basically, you can draw such arguments about anything. What if the evil government asks my local bookstore for CCTV recordings and credit card receipts? What if they ask my bank?
If your government wants to be evil, it will find a way to do this, regardless of whether people posted their data all over the Internet or not. The problem is with your government and not with the tools it would use in a hypothetical, unlikely scenario of going batshit insane in the near future. It's like a country deciding to destroy all roads and bridges because they can be used by an invasion force to quickly overrun the country. Well, they could be, but since you destroyed them, your enemy will airdrop soldiers on you in the extremely unlikely future when they decide to invade. In the meantime, you have no roads and bridges.
I see your point, but I think the counterpoint is that there is a huge difference between
1. getting a warrant and running down to the book shop to ask them about what a single suspect bought, and
2. having enormous amounts of data collected about hundreds of millions of people, processed and interpreted, essentially sitting in one place. Whether Google voluntarily gives it up or not is almost irrelevant (as we saw, the NSA is happy to tap Google's private fiber without their knowledge even when Google is cooperating on other fronts).
Of course the government will find ways to be evil if it wants (and I think that we're generally lucky because it doesn't seem to want to with any particular intensity). But that's not really the point here.
It's the difference between having many small, complicated little targets that each yield very little information, versus one conveniently enormous target that yields information on everyone.
I agree that it's easier when it's all sitting conveniently in one place. But if we're really to be worried about it, we need to ditch the concept of civilization as it's presently understood, because everything we do to make our society better leads to more information about us being available to more people. You can't have your cake and eat it too.
I structured my previous comment in this way to proactively address the argument that someone always brings up - the story of how Nazi Germany used census data to track down and exterminate Jews. My point being that yes, evil government could use this data to do evil things in an efficient way, but it doesn't mean that we have to proactively stop doing censuses - they have many other, real, actual, positive uses.
Or another example - the best way to prevent a house fire from burning down a neighbourhood is to not build houses near each other. But instead, we invest in firefighters, better materials and procedures, all of which addresses the problem of fire spread. Why? Because we want the houses to be close to each other.
The fact that none of the other commenters above made this connection is a bit strange for a community of Tech/Internet/Web people. Is all of this really that shiny?
What about the fact that this situation has already happened, but played out differently - the government of China asked Google for moar intel, and Google basically responded "fuck it, we're outta here". And that's why today I had to check my private e-mail over a VPN.
That's cool that you feel that way, but I really enjoy these features. I'm never home and never sitting down, so being able to see everything I need to pay attention to, even the things I can't remember, is amazing. Google Now reminding me of meetings, flight times, hotel check-out times, package deliveries, etc., are things a personal assistant would do, but without the associated cost of a yuppie's salary.
Just because you don't see it as convenient or even useful doesn't mean no one else does.
I understand why it might seem creepy, at least until you get used to it. But surely you knew that Google's computers were already reading all mail sent to your gmail account. They filter and check for malware and spam based partly on content, and even serve related advertising in the gmail interface (do they still do this?).
> a "just because you can, doesn't mean you should" issue
To me, this phrase is the essence of much of Google's features. To my discredit, I chuckled when I read the blog's phrase "we've used...deep neural networks to improve...YouTube thumbnails." I am certain this was no easy task, and a resulting technical breakthrough. But doesn't it sound kind of petty?
Of course, what's petty for one is essential for another. I wish every e-mail client had that "undo send" feature, which was just GMail whimsy years ago. Is the line between petty and essential always going to be blurred?
Improving YouTube thumbnails can be a huge usability win.
Consider a series of lectures or DIY videos with a common setting. Pulling out a frame that captures something unique about the video (be it the DIY item being worked on in close-up or an important theorem on a title slide or blackboard) makes it easier for users to separate content and find specific items.
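To make the idea concrete, here's a toy sketch of frame selection. Everything here is hypothetical: the blog's actual system uses a trained deep network to score frames, but even a crude hand-rolled proxy, like picking the frame with the highest pixel-intensity variance (so a blank blackboard loses to a slide full of text), illustrates the "pull out the distinctive frame" mechanic:

```python
# Hypothetical sketch -- NOT Google's method. A learned quality model
# is replaced here by a naive variance score, on the idea that a flat,
# uniform frame is rarely a good thumbnail.

def frame_score(frame):
    """Score a frame (a flat list of grayscale pixel values, 0-255)
    by its intensity variance, a crude proxy for distinctiveness."""
    mean = sum(frame) / len(frame)
    return sum((p - mean) ** 2 for p in frame) / len(frame)

def pick_thumbnail(frames):
    """Return the index of the highest-scoring frame."""
    return max(range(len(frames)), key=lambda i: frame_score(frames[i]))

# A near-uniform frame (blank blackboard) loses to a high-contrast
# one (a slide with a theorem written on it).
blank_board = [128] * 16
busy_slide = [0, 255] * 8
print(pick_thumbnail([blank_board, busy_slide]))  # -> 1
```

A real pipeline would of course work on full RGB frames and replace `frame_score` with a model trained on human thumbnail preferences; the selection step stays the same shape.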
Then what isn't 'petty'? It's not like it's zero-sum, there are also people working on using AI for recognizing cancer cells on medical imaging, or to manage climate change risk. And if we don't try, we'll never know what is 'petty' and what is useful. And also, we can learn a lot from the 'exercise' we get in developing small-scale applications of machine learning, which can then be applied later to more 'worthy' applications.