I appreciate the privacy standards they used (no humans reading your email to develop this), but I'm concerned that it's not enough. As I understand it, with language models overfitting takes the form of returning a sequence of words seen verbatim in the training set. If the model overfits in any part of the response space, that could happen here. Out of a million emails, how many suggested responses are going to substantively resemble another response the original author wouldn't want read by others?
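To make that concrete, here's a toy sketch of the failure mode I mean - made-up emails, with a crude bigram-context model standing in for the real neural network:

```python
from collections import defaultdict

# Toy "training set" of emails; imagine one of them contains something private.
emails = [
    "see you at the meeting tomorrow",
    "thanks for the update see you soon",
    "my password is hunter2 please do not share",  # private, seen only once
]

# Count word-level continuations for each two-word context (a toy stand-in
# for whatever the real model actually learns).
model = defaultdict(lambda: defaultdict(int))
for email in emails:
    words = email.split()
    for i in range(len(words) - 2):
        model[(words[i], words[i + 1])][words[i + 2]] += 1

def suggest(prefix, length=6):
    """Greedily complete a prefix with the most frequent continuation."""
    words = prefix.split()
    for _ in range(length):
        context = tuple(words[-2:])
        if context not in model:
            break
        words.append(max(model[context], key=model[context].get))
    return " ".join(words)

# A context that appears only in the private email gets completed verbatim.
print(suggest("my password"))
# prints: my password is hunter2 please do not share
```

The real model is obviously far more sophisticated, but the worry is the same: a context seen essentially once gets completed with the private text that followed it.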
Much of this strikes me as a "just because you can, doesn't mean you should" issue. Google clearly loves machine learning and doing cool things but I think lately they've been taking it too far.
For example: after purchasing a book on Amazon recently, I happened to do a Google search on that book, and the first thing I saw was "Your book is scheduled to be delivered on..." Aside from the creepy factor, I'm left wondering what purpose this serves. I just ordered the book. I KNOW it's on its way.
Turns out they just mined my emails from Gmail to provide it in search results.
I'm sure some developer or product manager thought it would be a cool thing to do without giving any consideration to usefulness, much less user privacy. I really don't feel like Google needs to know what I'm buying, thankyouverymuch. Gmail account: closed.
One man's "creepy factor" is another's superbly useful feature. The feeling of creepiness probably stems from being surprised and defaulting to a negative reaction. Remember that Gmail was never supposed to be a dumb mailbox. If you want a dumb mailbox, there are tons of alternatives (i.e. almost every other provider and various open-source UI packages).
Honestly, I really enjoy those "creepy" features and want much more. For me, they can, and definitely should.
And please - like Google cares that you ordered that book from Amazon. Until they do.
Abuses of this technology are inevitable, but we haven't seen them yet. It's the source of the magical "how did they do that" wow factor in technology that touch screens and thin devices used to have.
Maybe I'm weird, but for me none of this (touchscreens too) is "magical", and all of this is "interesting" and then "obvious" when I learn/figure out how they do it. Maybe that's why some people are afraid - because it's more magic to them?
> And please - like Google cares that you ordered that book from Amazon. Until they do.
Well, if Google starts caring that I ordered a book from Amazon (more than they already do - Google Now shows me info on my purchases, including delivery time), then they'll do what exactly? Tell their self-driving cars to kill me because I didn't use Google Play?
Google so far has a stellar history of being helpful and pro-user, quite often pro bono at it. Please apply these levels of scrutiny to someone else first, like every other SV startup running on the investor-storytime model.
I don't think the risk is Google harming you for the books you bought, but disclosing the information to a government who may.
For example, a government (China?) may pass a law to force Google to disclose the list of nationals who bought certain books (a political book criticising the Chinese government, say), and Google may choose to comply to stay in that market.
Except China tried something like that and Google abandoned the country. So we have at least one strong data point that they're unlikely to do it now.
But basically, you can make such arguments about anything. What if the evil government asks my local bookstore for CCTV recordings and credit card receipts? What if they ask my bank?
If your government wants to be evil, they will find a way to do this, regardless of whether people posted their data all over the Internet or not. The problem is with your government and not with the tools they would use in a hypothetical, unlikely scenario of going batshit insane in the near future.

It's like a country deciding to destroy all roads and bridges because they could be used by an invasion force to quickly overrun the country. Well, they would be, but since you destroyed them, your enemy will airdrop soldiers on you in the extremely unlikely future when they decide to invade. In the meantime, you have no roads and bridges.
I see your point, but I think the counterpoint is that there is a huge difference between
1. getting a warrant and running down to the book shop to ask them about what a single suspect bought, and
2. having enormous amounts of data collected about hundreds of millions of people, processed and interpreted, essentially sitting in one place. Whether Google voluntarily gives it up or not is almost irrelevant (as we saw, the NSA is happy to tap Google's private fiber without their knowledge even when Google is cooperating on other fronts).
Of course the government will find ways to be evil if it wants (and I think that we're generally lucky because it doesn't seem to want to with any particular intensity). But that's not really the point here.
It's the difference between having many small, complicated little targets that each yield very little information, versus one conveniently enormous target that yields information on everyone.
I agree that it's easier when it's all sitting conveniently in one place. But if we're really to be worried about it, we need to ditch the concept of civilization as it's presently understood, because everything we do to make our society better leads to more information about us being available to more people. You can't have your cake and eat it too.
I structured my previous comment in this way to proactively address the argument that someone always brings up - the story of how Nazi Germany used census data to track down and exterminate Jews. My point being that yes, an evil government could use this data to do evil things in an efficient way, but it doesn't mean that we have to proactively stop doing censuses - they have many other real, actual, positive uses.
Or another example - the best way to prevent a house fire from burning down a neighbourhood is to not build houses near each other. But instead, we invest in firefighters, better materials, and procedures, all of which address the problem of fire spreading. Why? Because we want the houses to be close to each other.
The fact that none of the other commenters above made this connection is a bit strange for a community of Tech/Internet/Web people. Is all of this really that shiny?
What about the fact that this situation has already happened, but played out differently - the government of China asked Google for moar intel, and Google basically responded "fuck it, we're outta here". And that's why today I had to check my private e-mail over a VPN.
That's cool that you feel that way, but I really enjoy these features. I'm never home and never sitting down, so being able to see everything I need to pay attention to, including the things I can't remember, is amazing. Google Now reminders for meetings, flight times, hotel check-out times, package deliveries, etc. are things a personal assistant would do, but without the associated cost of a yuppie's salary.
Just because you don't see it as convenient or even useful doesn't mean no one else does.
I understand why it might seem creepy, at least until you get used to it. But surely you knew that Google's computers were already reading all mail sent to your Gmail account. They filter and check for malware and spam based partly on content, and even serve related advertising in the Gmail interface (do they still do this?).
> a "just because you can, doesn't mean you should" issue
To me, this phrase is the essence of many of Google's features. To my discredit, I chuckled when I read the blog's phrase "we've used...deep neural networks to improve...YouTube thumbnails." I am certain this was no easy task, and that it produced real technical breakthroughs. But doesn't it sound kind of petty?
Of course, what's petty for one is essential for another. I wish every e-mail client had that "undo send" feature, which was just Gmail whimsy years ago. Is the line between petty and essential always going to be blurred?
Improving YouTube thumbnails can be a huge usability win.
Consider a series of lectures or DIY videos with a common setting. Pulling out a frame that captures something unique about the video (be it the DIY item being worked on in close-up or an important theorem on a title slide or blackboard) makes it easier for users to separate content and find specific items.
Then what isn't 'petty'? It's not like it's zero-sum: there are also people working on using AI to recognize cancer cells in medical imaging, or to manage climate-change risk. And if we don't try, we'll never know what is 'petty' and what is useful. We also learn a lot from the 'exercise' we get developing small-scale applications of machine learning, which can then be applied later to more 'worthy' applications.
But you have a lot more to go on here, and the number of possible replies is limited to maybe a few thousand at most. It can quickly determine that it's a scheduling email, check your calendar, and generate responses like "I am available" and "I'm busy". For other emails it can be as simple as "I'll check it out and get back to you". Finally, if you are expected to review the automatically composed response, or choose from several options, it's actually not bad at all. This seems a lot like the iOS feature where, if you miss or decline an incoming call, you can send a quick SMS reply like "I'll call you back", or automatically add a reminder to call back in an hour.
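Something like this - an entirely hypothetical sketch, just to illustrate the kind of constrained pipeline I'm imagining, not how Google actually built it:

```python
import re

# Hypothetical pipeline: detect a scheduling email, consult a (stub) calendar,
# and offer a handful of canned replies for the user to review before sending.

CANNED_GENERIC = ["I'll check it out and get back to you.", "Thanks, sounds good."]
CANNED_SCHEDULING = ["I am available.", "Sorry, I'm busy then."]

def is_scheduling_email(body):
    return bool(re.search(r"\b(meet|meeting|call|schedule|available)\b", body, re.I))

def calendar_is_free(when):
    # Stub: a real system would query the user's calendar here.
    return True

def suggest_replies(body, when=None):
    if is_scheduling_email(body):
        if when and not calendar_is_free(when):
            return ["Sorry, I'm busy then."]
        return CANNED_SCHEDULING
    return CANNED_GENERIC

print(suggest_replies("Can we schedule a call on Friday?", when="Friday"))
# -> ['I am available.', "Sorry, I'm busy then."]
# The user still reviews and picks (or edits) a suggestion before anything is sent.
```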
I'm talking about a slightly different problem. I'm not suggesting that you might accidentally click to send a reply you didn't want to share, so you reviewing it is beside the point. I'm suggesting that by mining all our emails, it might make a suggestion to me based on something you didn't want to share.
E.g. Someone writes me an email about a rare kink you often talk about. You're the main data point on that kink, so it suggests I respond with something you often say when you talk about this topic, maybe including personal details. It's not a totally precise or realistic example, but with large numbers and complex models, unintended things are bound to happen on occasion. Will those things leak information?
As for your comment that the potential replies are limited in number and as structured as you say, I don't get that from the original post, and it doesn't quite fit with my understanding of the model.
You raise a very important point. I'd hope that there is an actual finite (and relatively small) corpus of approved, manually whitelisted answers that is shared across Google accounts. You might get personalized options based on what you write most often, but those would not show up for other people. Would that be enough to satisfy this concern?
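Something along these lines is what I have in mind (a purely hypothetical sketch; the names and data are made up):

```python
# Suggestions shown to anyone come only from a small, manually approved global
# list; anything learned from my own mail is only ever suggested back to me.
GLOBAL_WHITELIST = ["Sounds good!", "I'll get back to you.", "Thanks!"]

# Per-user phrases mined from that user's own sent mail (made up here).
personal_phrases = {
    "alice@example.com": ["Let's sync up Thursday."],
}

def suggestions_for(user):
    options = list(GLOBAL_WHITELIST)           # curated, safe to show anyone
    options += personal_phrases.get(user, [])  # drawn only from this user's own history
    return options

print(suggestions_for("alice@example.com"))  # global list plus Alice's own phrase
print(suggestions_for("bob@example.com"))    # global list only
```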
Yes, it would. If manual curation is too troublesome for gathering a large enough training set, it might be possible to train a smaller classifier on a curated set that flags responses that aren't appropriate (personal info, insults, etc.) and removes them from the training set - tuned conservatively, so it's fine to wrongly exclude safe responses as long as unsafe ones rarely slip through.
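Roughly this kind of pre-filtering, with a trivial keyword/regex stub standing in for the curated classifier (everything here is made up):

```python
import re

# Candidate replies are screened before they ever enter the suggestion model's
# training set; a stub filter stands in for the small curated classifier.
SENSITIVE_PATTERNS = [
    r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b",  # SSN-like numbers
    r"\bpassword\b",
    r"\b(idiot|moron)\b",              # insults
]

def looks_sensitive(reply):
    return any(re.search(p, reply, re.I) for p in SENSITIVE_PATTERNS)

def filter_training_replies(replies):
    """Keep only replies the filter considers safe; over-flagging only costs
    us some training data, not user privacy."""
    return [r for r in replies if not looks_sensitive(r)]

candidates = [
    "Sounds good, see you Friday.",
    "My password is hunter2.",
    "Sure, my SSN is 123-45-6789.",
]
print(filter_training_replies(candidates))
# -> ['Sounds good, see you Friday.']
```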
You could also do something like googlebombing. Have lots of people send each other the same question, and all of them reply with the same/similar response.
Overfitting requires "memorizing" the dataset instead of generalizing from it. I think that's very, very unlikely here. The neural network's parameters can only store so many bits of information, but the dataset is millions of times bigger.
That's why I'm not worried about how it performs in general, but about edge cases. The question isn't whether it's memorizing the whole dataset, but whether it's "memorizing" any particular points it shouldn't. Kinda like when you do a polynomial regression and the ends go wilder than the middle: the predictions in different parts of the space have different variances, and some are determined much more strongly by single data points.
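The polynomial version is easy to see numerically - fit an over-flexible polynomial to a few noisy points many times over and compare how much the predictions move near the edge versus in the middle (illustrative numbers only):

```python
import numpy as np

# Fit a deliberately over-flexible polynomial to a small noisy dataset, many
# times with fresh noise, and compare the spread of predictions in the middle
# of the data versus near its edge.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 12)

edge_preds, centre_preds = [], []
for _ in range(200):
    y = np.sin(3 * x) + rng.normal(scale=0.1, size=x.size)
    coeffs = np.polyfit(x, y, deg=9)             # over-flexible fit
    edge_preds.append(np.polyval(coeffs, 0.98))  # near the edge of the data
    centre_preds.append(np.polyval(coeffs, 0.0)) # in the middle

print("prediction variance near edge:", np.var(edge_preds))
print("prediction variance at centre:", np.var(centre_preds))
# The edge predictions swing far more across resamples: individual points out
# there pull the fit around much harder than points in the middle do.
```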
I have no doubt that in the vast majority of the email space this will do great, but I wonder whether it will leak private information anywhere at all.