I'm often amazed how Google still gives its staff this unique hacker-like approach to their million-dollar projects.
For example, Gmail Labs has had some silly things, like a feature to stop you hitting send when you're drunk, but also some very useful features that might be considered unorthodox, like Gmail telling you when it thinks you wanted to attach a file but forgot.
It's not something I'd generally associate with a very corporate design process, but here they are wanting to add another seemingly silly feature, and who knows how it will turn out? Maybe it will be super useful and then all the other companies will start to copy it.
But the thing is, they keep inventing things that really don't fit a typical product development roadmap. I really like that about them.
I had a similar thought with Apple -- El Cap has a feature where if you shake your mouse around, the cursor will momentarily grow in size so you can find where it is.
That does not feel like a top-down design idea, but a bottom-up feature designed by an engineer who was sick of those moments where he/she had lost the cursor.
Oh wow! That explains the weird artifacts I've been seeing since El Capitan when using my trackball. The cursor randomly grows in size, usually when I move it between screens quickly. I expect it's some random jitter in my trackball movement that triggers it.
I seem to recall the Microsoft mouse drivers for Windows 95 had a keyboard shortcut to make lines radiate out from your cursor. Very useful on the low-refresh-rate LCDs of the time, where the cursor would suddenly appear, like a surfacing submarine, on the other side of the screen.
You can still do this (at least in Windows 7, the last time I checked). Just look in the 'Mouse' section of your Control Panel and select the option that makes pressing <Ctrl> radiate circles around your mouse pointer :)
This feature is awesome, and super-discoverable. I didn't need to change my behaviour to discover it, or to use it. I just wiggle my finger like I used to, and it just works.
While I'm sure some great hackers worked on Smart Reply, I don't think it's the result of the process you imagine. Smart Reply is using really sophisticated machine learning to advance one of Google's core goals, which is the creation of an AI and autonomous agents. A chatbot like the kind that finds a response to email is just the first shoe to drop.
I just tried sending "Attached is my photo". Then "I've been feeling less attached to you". One warns, the other doesn't. This is also more than a single if.
Doing it with a low false positive rate (so that you can enable it for everyone without being annoying) is much harder.
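To make that concrete, here is a toy sketch (my own illustration, definitely not Gmail's actual logic) of why a single keyword check isn't enough:

    import re

    def naive_attachment_warning(body, has_attachment):
        # Warn if the body mentions "attach" but nothing is attached.
        mentions_attachment = bool(re.search(r"\battach", body, re.IGNORECASE))
        return mentions_attachment and not has_attachment

    print(naive_attachment_warning("Attached is my photo", False))                    # True - correct warning
    print(naive_attachment_warning("I've been feeling less attached to you", False))  # True - false positive

Gmail only warns on the first sentence, so it must be doing something more context-aware than this, presumably a classifier tuned for a low false-positive rate.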
Also, Gmail has lots of users who speak languages other than English... has anyone tested whether the forgotten attachment detector works for them too?
I like the attachment recommendation and it can be useful, but what I find ridiculous is that they still have yet to recommend that you use the correct from address when sending an email to your work contacts. I have about 15 email addresses that I can send from. If any recipients are in the x.com domain and I am not sending from my x.com account, it should alert me. Better yet, it should just switch the from address for me. This feature has been requested for years now.
"Gmail telling you when it thinks you wanted to attach a file but forgot" is hardly unorthodox considering Outlook has had it for years. That said, the other experiments do seem very informal and fun.
I appreciate the privacy standards they used (no humans reading your email to develop this), but I'm concerned that it's not enough. As I understand it, with language models overfitting takes the form of returning a sequence of words seen in the training set. If the model overfits in any part of the response space, this could happen. Out of a million emails, how many suggested responses are going to substantively resemble another response the original author wouldn't want read by others?
Much of this strikes me as a "just because you can, doesn't mean you should" issue. Google clearly loves machine learning and doing cool things but I think lately they've been taking it too far.
For example: after purchasing a book on Amazon recently, I happened to do a Google search on that book, and the first thing I saw was, "Your book is scheduled to be delivered on..." Aside from the creepy factor, I'm left wondering what purpose this serves. I just ordered the book. I KNOW it's on its way.
Turns out they just mined my emails from Gmail to provide it in search results.
I'm sure some developer or product manager thought it would be a cool thing to do without giving any consideration to usefulness much less user privacy. I really don't feel like Google needs to know what I'm buying thankyouverymuch. Gmail account: closed.
One man's "creepy factor" is another's superbly useful feature. The feeling of creepiness probably stems from being surprised and defaulting to negative reaction. Remember that GMail was never supposed to be a dumb mailbox. If you want a dumb mailbox, there are tons of alternatives (i.e. almost every other provider and various open-source UI packages).
Honestly, I really enjoy those "creepy" features and want much more. For me, they can, and definitely should.
And please - like Google cares that you ordered that book from Amazon. Until they do.
Abuses of this technology are inevitable, but we haven't seen it yet. It's the source of magical "how did they do that" wow factor in technology that touch screens and thin devices used to have.
Maybe I'm weird, but for me none of this (touchscreens too) is "magical", and all of this is "interesting" and then "obvious" when I learn/figure out how they do it. Maybe that's why some people are afraid - because it's more magic to them?
> And please - like Google cares that you ordered that book from Amazon. Until they do.
Well, if Google starts caring that I ordered a book from Amazon (more than they already do - Google Now shows me info on my purchases, including delivery time), then they'll do what exactly? Tell their self-driving cars to kill me because I didn't use Google Play?
Google so far has a stellar history of being helpful, pro-user, quite often pro-bono at it. Please apply these levels of scrutiny to someone else first, like every other SV startup running on the investor-storytime model.
I don't think the risk is Google harming you for the books you bought, but disclosing the information to a government who may.
For example, a government (China?) may pass a law to force Google to disclose the list of nationals who bought certain books (political book criticising the Chinese government?), and Google may choose to comply to stay in that market.
Except China tried something like that and Google abandoned the country. So we have at least one strong data point they're unlikely to do it now.
But basically, you can make such arguments about anything. What if the evil government asks my local bookstore for CCTV recordings and credit card receipts? What if they ask my bank?
If your government wants to be evil, they will find a way to do this, regardless of whether people posted their data all over the Internet or not. The problem is with your government and not with the tools they would use in a hypothetical, unlikely scenario of going batshit insane in the nearby future. It's like a country deciding to destroy all roads and bridges because they can be used by an invasion force to quickly overrun the country. Well, they would be, but since you destroyed them your enemy will airdrop soldiers on you in the extremely unlikely future when they decide to invade. In the meantime, you have no roads and bridges.
I see your point, but I think the counterpoint is that there is a huge difference between
1. getting a warrant and running down to the book shop to ask them about what a single suspect bought, and
2. Having enormous amounts of data collected about hundreds of millions of people, processed and interpreted, essentially sitting in one place. Whether Google voluntarily gives it up or not is almost irrelevant (as we saw, the NSA is happy to tap Google's private fiber without their knowledge even when Google is cooperating on other fronts).
Of course the government will find ways to be evil if it wants (and I think that we're generally lucky because it doesn't seem to want to with any particular intensity). But that's not really the point here.
It's the difference between having many small, complicated little targets that each yield very little information, versus one conveniently enormous target that yields information on everyone.
I agree that it's easier when it's all sitting conveniently in one place. But if we're really to be worried about it, we need to ditch the concept of civilization as it's presently understood, because everything we do to make our society better leads to more information about us being available to more people. You can't have your cake and eat it too.
I structured my previous comment in this way to proactively address the argument that someone always brings up - the story of how Nazi Germany used census data to track down and exterminate Jews. My point being that yes, evil government could use this data to do evil things in an efficient way, but it doesn't mean that we have to proactively stop doing censuses - they have many other, real, actual, positive uses.
Or another example - the best way to prevent a house fire from burning down a neighbourhood is to not build houses near each other. But instead, we invest in firefighters, better materials and procedures, all of which addresses the problem of fire spread. Why? Because we want the houses to be close to each other.
The fact that none of the other commenters above made this connection is a bit strange for a community of Tech/Internet/Web people. Is all of this really that shiny?
What about the fact that this situation has already happened, but played out differently - the government of China asked Google for moar intel, and Google basically responded "fuck it, we're outta here". And that's why today I had to check my private e-mail over a VPN.
That's cool that you feel that way, but I really enjoy these features. I'm never home and never sitting down, so being able to see everything I need to pay attention to, including the things I can't remember, is amazing. Google Now reminding me of meetings, flight times, hotel check-out times, package deliveries, etc. is what a personal assistant would do, but without the associated cost of a yuppie's salary.
Just because you don't see it as convenient or even useful doesn't mean no one else does.
I understand why it might seem creepy, at least until you get used to it. But surely you knew that Google's computers were already reading all mail sent to your gmail account. They filter and check for malware and spam based partly on content, and even serve related advertising in the gmail interface (do they still do this?).
> a "just because you can, doesn't mean you should" issue
To me, this phrase is the essence of much of Google's features. To my discredit, I chuckled when I read the blog's phrase "we've used...deep neural networks to improve...YouTube thumbnails." I am certain this was no easy task, and a resulting technical breakthrough. But doesn't it sound kind of petty?
Of course, what's petty for one is essential for another. I wish every e-mail client had that "undo send" feature, which was just GMail whimsy years ago. Is the line between petty and essential always going to be blurred?
Improving YouTube thumbnails can be a huge usability win.
Consider a series of lectures or DIY videos with a common setting. Pulling out a frame that captures something unique about the video (be it the DIY item being worked on in close-up or an important theorem on a title slide or blackboard) makes it easier for users to separate content and find specific items.
Then what isn't 'petty'? It's not like it's zero-sum, there are also people working on using AI for recognizing cancer cells on medical imaging, or to manage climate change risk. And if we don't try, we'll never know what is 'petty' and what is useful. And also, we can learn a lot from the 'exercise' we get in developing small-scale applications of machine learning, which can then be applied later to more 'worthy' applications.
But you have a lot more to go on here, and the number of replies is limited to maybe a few thousand at most. It can quickly determine whether it's a scheduling email, check your calendar, and generate responses like "I am available" or "I'm busy". For others it can be as simple as "I'll check it out and get back to you". Finally, if you are expected to review the automatically composed response or choose from several options, it's actually not that bad at all. This seems a lot like the iOS feature where, if you decline an incoming call, you can send a quick SMS reply saying things like "I'll call you back", or automatically add a reminder to call back in an hour.
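A toy sketch of the kind of canned-reply logic I mean (everything here is made-up stand-in code, not how Smart Reply actually works):

    import re

    def looks_like_scheduling_request(text):
        # Crude stand-in for a real intent classifier.
        return bool(re.search(r"\b(meet|meeting|available|schedule)\b", text, re.IGNORECASE))

    class FakeCalendar:
        def __init__(self, busy):
            self.busy = busy
        def is_free_around(self, text):
            # A real system would parse the proposed time; here we just pretend.
            return not self.busy

    def suggest_replies(email_text, calendar):
        if looks_like_scheduling_request(email_text):
            if calendar.is_free_around(email_text):
                return ["I am available", "Works for me", "See you then"]
            return ["I'm busy then", "Could we do another time?"]
        return ["I'll check it out and get back to you", "Thanks, got it"]

    print(suggest_replies("Are you available to meet on Thursday?", FakeCalendar(busy=False)))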
I'm talking about a slightly different problem. I'm not suggesting that you might accidentally click to send a reply you didn't want to share, so you reviewing it is beside the point. I'm suggesting that by mining all our emails, it might make a suggestion to me based on something you didn't want to share.
E.g. Someone writes me an email about a rare kink you often talk about. You're the main data point on that kink, so it suggests I respond with something you often say when you talk about this topic, maybe including personal details. It's not a totally precise or realistic example, but with large numbers and complex models, unintended things are bound to happen on occasion. Will those things leak information?
As for your comment that the potential replies are limited in number and as structured as you say, I don't get that from the original post, and it doesn't quite fit with my understanding of the model.
You raise a very important point. I'd hope that there is a finite (and relatively small) corpus of approved, manually whitelisted answers that are shared across Google accounts. You might get personalized options based on what you write most, but those would not show for other people. Would that be enough to satisfy this concern?
Yes, it would. If curation is too troublesome for gathering a large enough training set, it might be possible to train a smaller curated network with a higher false-negative rate that flags responses that aren't appropriate (personal info, insults, etc) and removes those from the training set.
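Roughly the shape of what I mean, as a sketch (the sensitivity "model" here is a trivial keyword stand-in; the point is simply to err on the side of removing replies from the training set):

    class KeywordSensitivityModel:
        # Trivial stand-in for a real trained classifier, just for the sketch.
        SENSITIVE_WORDS = {"password", "ssn", "salary", "idiot"}
        def score(self, reply):
            words = set(reply.lower().split())
            return 1.0 if words & self.SENSITIVE_WORDS else 0.0

    def build_training_set(candidate_replies, model, threshold=0.2):
        # The low threshold deliberately over-flags: we accept dropping some
        # harmless replies so that personal info, insults, etc. rarely get in.
        return [r for r in candidate_replies if model.score(r) <= threshold]

    print(build_training_set(
        ["Sounds good, see you then!", "My password is hunter2", "You idiot"],
        KeywordSensitivityModel()))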
You could also do something like googlebombing. Have lots of people send each other the same question, and all of them reply with the same/similar response.
Overfitting requires "memorizing" the dataset, instead of generalizing it. I think that's very very unlikely. The neural network parameters can only store so many bits of information. But the dataset is millions of times bigger.
That's why I wouldn't worry about how it performs in general, but in edge cases. The question isn't whether it's memorizing the whole dataset, but whether it's "memorizing" any particular points it shouldn't. Kinda like when you do a polynomial regression and the ends go more wild than the middle. The predictions in different parts of the space have different variances, some determined more strongly by single data points.
I have no doubt that in the vast majority of the email space this will do great, but I wonder whether it will leak private information anywhere at all.
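If it helps, here's the polynomial analogy in a few lines (a generic illustration, nothing to do with Google's actual model): the fit is well constrained in the middle of the data, but behaviour at and beyond the edges is governed by far fewer points, and that's the flavour of edge-case leakage I'm wondering about.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-1, 1, 30)
    y = np.sin(3 * x) + rng.normal(scale=0.1, size=x.size)

    fit = np.poly1d(np.polyfit(x, y, deg=10))  # deliberately over-flexible fit

    print("middle of the data (x=0.0):", fit(0.0))   # close to sin(0) = 0
    print("edge of the data   (x=1.0):", fit(1.0))   # noisier, fewer constraints
    print("just past the edge (x=1.2):", fit(1.2))   # typically blows up entirely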
This kind of stuff is really cool. I imagine the future of Google being the same old search box, but instead of entering a search query, you engage in a conversation with Google so you can delve deeply into a narrow topic and get back tailored responses to your question (as opposed to opening 10 tabs of Stack Overflow links that are maybe related to your question).
I worry a bit about the long-term risks of this kind of training (query / reply). While this is obviously very far from being "high resolution" enough to single out very specific information, at some point these kinds of tools (and assistant AIs that can answer questions) will out of necessity be able to converse around very domain-specific topics. At that point, how do they know what data is private and what is not? It could be tempting to train these AIs on chat logs or e-mail conversations, but that would give them knowledge of very private information which they might leak to others. Even if they're limited to data that is accessible anonymously, they'd be extremely good at picking up information that wasn't intended for the public. For example, you could describe a person called /u/andreasblixt on reddit and leave it up to the bot to put the pieces together that this person is also on Facebook, Twitter, etc., and that obscure forum from 10 years ago. Food for thought.
A final thought on this: these assistant AIs will inevitably have to know something about you. For example, your preferred schedule, food restrictions, preferred airlines, name, family, friends, phone numbers, where you were last night, who you've talked to, etc. Even if we all get our own namespaced AI assistant (i.e., the trained neural network that contains private information is stored and encrypted for your access only), that assistant's "brain" may very well become a prized target, because if you get access to it you can interrogate it for information (you most likely can't just read the information out in any meaningful way; you'd literally have to give it queries to make it return semantic output with the information you're trying to access).
I just started reading Avogadro Corp (http://www.amazon.com/Avogadro-Corp-Singularity-Closer-Appea...) this weekend, and this reminds me quite a lot of the emergent AI that figures heavily in the story (ELOPe). A quick synopsis: developers build a system to "improve" responses from emails. The system at some point is given the ability to send emails on its own, and a poorly issued directive. It's been an engaging read so far, and fairly hilarious since the corporation in the book is very obviously based on Google.
Hey, remember that one time when this feature was Gmail's April Fools Day joke[1]? Kind of funny how it's now a serious and actually useful-looking product.
Gmail itself was a sort of a serious April Fools joke. It was launched on April 1st, 2004 and because it offered 1GB of free storage when other providers were offering ~5MB, a lot of people assumed it was a hoax.
While the technology is very cool, this pushes me that much closer to a "dumber" email service. It's always a conflict for me as I love shiny new things, but I'd rather we let AI loose on someone else's email, especially when the AI's revenue stream is advertising (a business predicated on knowing as much as possible about your audience).
Very cool, and that definitely takes me back to eagerly pursuing/waiting out my beta invite. Unfortunately I would guess that even with a retro interface all of one's email data is still subject to content trawling, but that's a nifty discovery nonetheless.
I don't think I understand your reasoning. You're already tolerating an AI processing your email to produce the ads, so assuming that, why does a cool new feature push you closer towards a "dumber" service?
Because I'm lazy and switching would be time consuming. Each new privacy-related "innovation" motivates me a little more. You're right that the practical effect of this new service is negligible on a technical level, but it still reminds me that I should find an alternative I find acceptable.
So basically, because pretty much every useful data-and-automation-based thing - especially on the social, not individual level - can be considered an erosion of privacy, should we just stop progress? That's the sentiment I feel radiating from yours and others' complaints about privacy here.
Certainly not. I quite enjoy the features I get from having my data run through a lot of systems. However, you don't need to be a Luddite to recognize that sometimes some users of data (like sometimes Google) aren't as responsible or considerate of the privacy implications of their actions as they should be.
When their primary financial incentive is user engagement to sell ads, I have a hard time believing that the latter doesn't influence the use of my data as much as the former.
I still use a variety of Google products all the time, but when they're training machines to understand the meaning of my emails, I begin to think perhaps that's beyond my comfort zone. I think that machine learning has tremendous things to offer society, but I'm not sure I'm super excited about a company using smartish machines to pull meaning and context from my emails so that we can save some percentage of users 10 seconds when they reply to something. I get that in most cases this isn't ever going to have a practical effect, but I would be a lot more comfortable if our society's data stewards were less blase about that side of things.
I see your point better now, thanks for clarifying.
> When their primary financial incentive is user engagement to sell ads, I have a hard time believing that the latter doesn't influence the use of my data as much as the former.
I don't agree with it in case of Google. I may be mistaken, but my impression was that they sort of separate their products into two groups - the ad-related are earning the money, and then the money is spent on funding something else (like GMail), with little direct connection between the two groups. That is, products like GMail certainly help Google earn more money on ads, but are not themselves optimized for ad-related purposes.
> I think that machine learning has tremendous things to offer society, but I'm not sure I'm super excited about a company using smartish machines to pull meaning and context from my emails so that we can save some percentage of users 10 seconds when they reply to something.
I think you're seriously underestimating the potential for productivity gains here. The mobile use case is perfect for replying to e-mails, but the mobile experience totally sucks. This service, if it works as advertised, will probably save not 10 seconds but something like a minute per e-mail. That, plus the friction reduction, has enabling properties that make people do things they didn't do before (e.g. I often don't reply to e-mails on a phone only because it's too slow, opting to browse Facebook instead), and in aggregate that can free up a lot of time.
I would love if companies would explore on-device processing more. While it might not work in this case, it should work fine for e.g. data extraction for flight tracking, hotel bookings, package tracking, etc.
Oh yes, so much this. The cloud business model leads not just to privacy-eroding solutions, but to implementations that are absolutely ridiculous from an engineering point of view and exist only because the company thought it was a good way of extracting rent from the user. See, for instance, most of the work done around IoT, home automation, and hardware startups.
How do you know the poster isn't blocking the ads? (Or, for that matter, using a separate client?) And anyway, ads are stupid awful crap, so who cares if they get that part right? It's nothing like actual email messages.
Advertising becomes scary when this service tries to suggest responses like "Yes, let's have a meeting tomorrow and why not enjoy a refreshing glass of ice-cold Dr.Pepper together!". But if not, I'm not that worried about the ad revenue aspect.
You can make Gmail "dumber" by using the IMAP API and encrypting your email. Unless you use Google Apps for Work, you aren't paying for the service, so you are the product.
Using IMAP + encryption obviates all of the features that keep me on gmail, and would also be time consuming (especially with my non-technical contacts). On a practical level, I think your suggestion is equivalent to using a different email service -- high effort of implementation, loss of gmail's "smart" features, etc. Whether I'm using Google's servers or not, hiding my stuff from their systems amounts to the same effects on my life.
Google does not sell "users", that's why it is inaccurate and misleading.
What would you say of a private highway that sells billboards? What if the business model of a private highway is not tolls, but instead selling ads along the road? "You're not the user, you're the product?" Would you say the highway owner is selling the personal information of the drivers?
The people buying billboards do not receive any personal identifying information about the drivers, they are instead buying access through an auction, like if this particular highway feeds into a sporting arena, then it is likely a good place to put ads for football fashion.
Within Google, Gmail/Inbox are Products. Their success metrics are user happiness and usage statistics. Like many startups, Google builds products that get market share and please users first, and figures out the business model later. Google itself was launched with no real business model. And when Gmail was created, there was probably no plan for how to make money on it. They just wanted to do something awesome, and Gmail was born.
> What would you say of a private highway that sells billboards? What if the business model of a private highway is not tolls, but instead selling ads along the road?
If it's funded primarily by the billboards and not by the drivers, then yes, the billboard purchasers are the real customers. And that could potentially lead to poor results, such as optimizing for billboard viewing time rather than safety and throughput.
Even if I accept your interpretation, it still doesn't justify the description "you're not the user, you're the product". In the case of the private highway example, even if the customers are the advertisers, the product being sold isn't the drivers, it's the billboard space.
Google's users are not its products. You might claim its users are not its customers (users != customers), but that does not imply the users ARE the products. Just like the New York Times readers are not the real business customers, the advertisers are, however no one has ever exclaimed "The New York Times Readers are the Products!"
The Product is the New York Times, because that is the thing _Being Produced_. It is the work product of the journalists.
Likewise, the work product of the engineers, SREs, PMs, managers, ops and support for Gmail is the Gmail service. That is what is created out of the hours they put in. There is an ancillary product that leverages externalities produced by those products (virtual real estate to auction off), but people's data is not the product being sold.
This meme is really tiresome because it doesn't get to the root of the matter, which is whether the incentives of those working on Google's products are aligned with the incentives of the business operations that have to derive a return on those investments.
There are two ways to look at this depending on your level of cynicism:
1. Google is a smart company that looks at the long term and realizes value by the creation of positive externalities. That is, it works from the premise that if you make good products that please your users, you will retain and get more of them, and there will be opportunities to monetize that. That is, retaining users is paramount, ergo, user trust, branding, and generally not pissing off users is very important.
2. Google is a short-sighted company that looks at what it currently has and lets bean counters manipulate its products purely to increase the bottom line by maximizing the amount of stuff they can sell. This means they only consider the wishes of advertisers and act purely to increase the amount of ads that can be shown, regardless of how much it annoys users.
Now, you can decide to believe #2 or #1 depending on how harshly you view the trajectory of Google products over the years. To be sure, they have done things that have annoyed users. But I believe, based on the evidence of actually working here, that among the engineers and product managers who guard the high-level product features, the focus is tipped towards user concerns.
No, the product being sold is viewers and viewing times.
Which means the highway owner will try to make sure that everyone has to use this highway for their commute, no others, and then the owner will try to make sure people drive slower, so they see more ads.
And I can tell you, Google is definitely #2. Many examples in the past have shown that. Like them directly ignoring a law, and then arguing in court, "but throwing away all the data we got when we violated that law would mean we'd have to collect it legally, and that would cost millions!" (Compare: Google vs. the Federal Republic of Germany over Street View.)
Then it would be fair to complain if the company were a) new, or b) this was already a typical thing for them to do. Google keeps proving otherwise; applying the "you're the product" meme to them is nonsense.
Perhaps so, if the company was short-sighted. A company thinking for a longer time-frame would likely weight customer happiness as a very important factor for long-term profits.
The product is the service being offered, in the case of Google it's search, mail, maps, etc. How they make money is unrelated to what the "product" is. E.g. I switched from free ad-supported YouTube to the paid ad-less YouTube Red, does that mean what the "product" is suddenly changed? Also you really are the "customer" or "user" of the many services being offered by Google just like an advertiser is a "customer" of the advertising side of Google. That doesn't imply one type of customer (the advertiser) is more important than another type of customer (the user).
The entire point of the "if you're not paying, you're the product" spiel is that everybody already knows everything you said, but there's a fundamental difference between paying and not paying. You're not going to get that far demanding that people stop thinking there's a difference, nor are people going to stop verbalizing that difference, nor should they. What's cliche on HN is something the vast majority of the rest of the world still has not heard, and in this case, ought to.
The reason this argument is so useless is that it is so non-specific. A useful form would be "This company is treating me poorly in way X because they have an incentive to do so." Instead, X is never specified, just implied via FUD.
Companies only interested in short-term profits might be willing to strip mine their users and treat them poorly. Companies that are interested in the long term have an incentive to treat their users well, i.e. as if they were paying customers.
That was a rational argument. You know as well as I do that if you explained it the way I did above most people would not care. But instead you insist on the "you're the product" FUD nonsense to elicit an emotional response. Spreading FUD is a demagogue's job, I would never condone it.
The point is that it's a plainly obvious rational argument that is the ground state of the discussion, not some brilliant insight that once it is revealed to people they're sure to see it your way.
And it's not FUD to point out where money flows really come from. In fact I find myself wondering why you are so passionate about people not talking about it? It's not an invalid observation, a lie, or something irrelevant, so people aren't going to stop talking about it. Generally jumping up and down and shouting "Stop following the money!" is, well, not how conversations on HN go, shall we say? If you're searching for something to be contrarian about I can provide a few better candidates....
I'm passionate because I'm sick and tired of seeing the same nonsensical dishonest FUD memes such as "you're the product". "You're the product" is absolutely a lie, even with the most generous interpretation it implies something you own is being sold. Can you point out what thing is being sold? I'm not interested in being part of some "true HN" clique that rewards paranoia for imaginary internet points. And I'm not saying that you should "Stop following the money!", it's clear where the money comes from. Just don't make the waters muddy by spreading FUD when it's already clear is all I'm saying.
In the EU, my personal information (my name, my username, my email addresses, my clicks, my metadata, etc.) is mine and there are limits to what you can do with it.
In the US, who knows. There seem to be strict rules around HIPAA, some tax stuff, and some money-laundering stuff. But other than that, selling the data seems to be fine.
EU data laws for everything are on a similar level to HIPAA.
Today I got an email because a website where I had an account was sold (from one SevenOne subsidiary to another), and they explained who owned it before, who will own it after, and that I'd have to allow them to copy my account data to the new company, otherwise it would not be transferred and would thereby be lost.
That’s how you do it.
When US startups I have accounts with get bought – like WhatsApp – I never got an email asking if they could sell my data to Facebook.
Why is your being 'sick and tired' of seeing a valid concern negating the value of discussing a valid concern? Nobody's doing anything about this valid concern, so clearly it hasn't been discussed enough.
> you insist on the "you're the product" FUD nonsense
I like the phrase as it draws attention to the fact that the services offered aren't "free" but rather are paid for via a mechanism that a typical user wouldn't think about. Sure it is shocking, but it is useful.
I'm not sure how it is FUD... your data is being repackaged and sold off to the highest bidder. There's no doubt about that.
> Spreading FUD is a demagogue's job, I would never condone it.
A more accurate statement would be that you are the supplier of several key inputs (both data and ad viewership), that are used in creating the product that they sell to customers paying money, and that, as a supplier, you are paid with the products you consume (e.g., gmail, etc.)
Another (equivalent) accurate way is that you are, in fact, a paying customer for the product you are consuming, but a customer that pays in-kind with things that they use to create another product that they sell to people who pay with money.
He's not wrong and you are. You are not "the product", and saying such is so divorced from the truth to be nothing more than a lie at this point. "You" are not being sold, access to you is.
Look at some ads, get some great web services. Seems a fair tradeoff.
I'm not entirely sure this is a semantic argument worth wading into, but I think there's a wrinkle worth addressing in the "you're the product" mentality. In order for Google to effectively serve you ads (and convince advertisers it's worth their prices), they accumulate vast amounts of information about their users. From some perspective one's identity is largely based on or equivalent to one's purchases, tastes, interests, friendships, knowledge, etc. Advertisers (like Google) try to know as many of these as possible about everyone they can, and I can see how for many people this is equivalent to Google selling "you" -- they have a digital representation of your identity, and their ability to comprehend it is the real service that sits behind the advertising.
I disagree that this is an argument over semantics seeing as how "Google is selling access to your attention based on what they know about you" is much less scary sounding (and more accurate) than "Google is selling you/your information".
The latter makes it sound (willfully) as if random third parties have access to the information Google has collected, which is a pretty blatant falsehood. Anyone is able to go sign up for an Adwords account and see how "targeting" works. At no point does the advertiser get to see anything about the people.
The whole "Google is selling you/You're the product" meme is a breathless, thought terminating cliche that needs to die.
Probably not. I don't have a very absolutist perspective on these privacy issues, but I do think that I should probably be taking my advertising business elsewhere (if anywhere at all), and that I should probably find a way to limit the amount of my data they gobble up. I'm lazy and I would prefer something that works to something which keeps all my information safe, which is why I'm still using gmail even though I'm creeped out.
Arguably, the easiest way to contend with this would be to send Gmail users a private link to the text of your communication, or to encrypt the text.
I've looked previously, and usually thought I would get some bare metal at a trusted colo and run my own server. However this has a variety of issues, including but not limited to getting the big email services (especially gmail, ironically) to trust your server as a non-spam gateway. Part of the problem here is that running a quality email service isn't cheap, and I'm not sure how much I'd have to trust a provider for it to be worth ditching google as my provider. In the past, the "easiest" answer to this in the abstract was for me to just do it myself, but I haven't had time and I'm concerned it would make it difficult for others to communicate with me.
Running self-hosted mail is getting a lot harder because spam engines are much more hostile to it. Also, most well-known mail clients for self-hosters are very dated. There are a few new ones on the way, like Mailpile, but we aren't there yet.
Until the end of the article, I kept imagining it would generate responses based on MY past emails. For example, given "When does your flight leave for Paris?", I was imagining it would pull the info from Google Now to reply "Friday at 9am", but I don't think it is built for this type of specific question.
> The solution was provided by Sujith Ravi, whose team developed a great machine learning system for mapping natural language responses to semantic intents. This was instrumental in several phases of the project, and was critical to solving the "response diversity problem": by knowing how semantically similar two responses are, we can suggest responses that are different not only in wording, but in their underlying meaning.
Does anyone know where I can find out more about this? A paper or something maybe?
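My guess at the flavour of that "response diversity" step, as a toy sketch of my own (plain word overlap standing in for whatever learned semantic-intent similarity they actually use):

    def similarity(a, b):
        # Stand-in for semantic similarity: plain word overlap (Jaccard).
        wa = {w.strip(".,!?").lower() for w in a.split()}
        wb = {w.strip(".,!?").lower() for w in b.split()}
        return len(wa & wb) / len(wa | wb) if (wa | wb) else 0.0

    def diverse_suggestions(ranked_candidates, k=3, max_sim=0.5):
        # Greedily keep candidates that aren't too close to ones already chosen,
        # so the suggestions differ in meaning, not just wording.
        chosen = []
        for reply in ranked_candidates:
            if all(similarity(reply, c) < max_sim for c in chosen):
                chosen.append(reply)
            if len(chosen) == k:
                break
        return chosen

    print(diverse_suggestions([
        "Sure, that works for me!",
        "Sure, works for me.",
        "Sorry, I can't make it.",
        "Let me check and get back to you.",
    ]))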
This looks more powerful than iOS's context sensitive keyboard, but that also looked impressive when demoed. It will be interesting to try this out with real-world emails and see how well it works in practice.
This gets really exciting when implemented on a smartwatch. Single tap responses that are actually useful would make smartwatches significantly more powerful. Speech recognition is great but it's not always appropriate for social environments.
Reading that made me very happy for some reason. I find it exceedingly cute that after churning through what was probably millions of gmail users' email responses, this Google machine decided the most likely response to many new emails is simply "I love you." Makes me feel warm and fuzzy inside to know there's so much love out there in the world for this algorithm to absorb.
I have a concern that is somewhat related to this issue. I've been wanting to play around with doing NLP in the client using the SpeechRecognition API available in Chrome. It typically gives pretty good results for simple recognition purposes, but there is not yet any way to specify an arbitrary grammar for the backend service to use. This is a big deal when someone wants to be able to recognize a name spelled like "Bobbie" rather than "Bobby". The W3C Web Speech spec currently allows for setting arbitrary grammars, but it isn't implemented. I'm just saying it would be cool if some of you AI geniuses at Google could help me out on that front.
You can play around with the dumb little AI app that I've been working on that does a little bit of NLP in the browser. Check out this site with Chrome: https://yc-prototype.appspot.com. His name is Bertie. You should see his face on the bottom of the page. The site works like an OS in your browser.
Interesting. I used a sequence-to-sequence methodology about 10 years ago in a chatbot that won a self-learning competition... of course, I never had the data available that Google has.
A bot like that is a great tool to experiment with NLP and ML ideas. This is really an intriguing area of research.
When you receive a call today, you are already prompted with auto response messages such as 'I'm busy, I'll call you back.' This seems like the natural next step.
> But replying to email on mobile is a real pain, even for short replies.
Imo, this has already been solved. I just whisper replies into my watch for both email and SMS. Google's voice recognition is finally good enough for this. I can't think of anything more convenient.
I'm not sure if AI-ish replies are something I'm interested in. Throwing complex fuzzy logic at what should be a hardware/interface problem seems shortsighted. I prefer a much higher level of granular control, especially if these messages are work related.
dudette gets suggested reply message generated from one of dude's earlier emails to his friends: "mrs.Plinketon has great tits!"
It is amazing to realize this type of technology can perhaps be bootstrapped only once: as soon as automated replies start showing up in the system, the quality of learning from past data will drop, and at some point the snake will be eating its own automated-reply tail.
I imagine this system would benefit greatly from optional additional voice input from the user, something as simple as 1-3 spoken words fed back into the Reply network to help with decoding the reply intent. You get an email, the NN generates three replies it thinks you would like, you have a different idea for a reply, so you click the microphone icon and say "no, maybe next week", and the Reply network rebuilds the suggested answers with your input shaping the reply vector.
I think you have it backwards: before, it read your email to scan for keyword advertising tags; now it actually interprets the email and can provide a coherent response. Next up, auto-responding to pharma spam with orders because you complained to a friend in one previous email about your love life :-) It's just the computer trying to help out; it knows you want to have a better life :-)
The worrisome thing is that while it used to be the privileged who insulated themselves from the "real" world by having staff answer their email, this would let everyone do that, and that might give everyone more tunnel vision.
I understand your concerns, but for most people, reading their email is unlikely to "help them escape from their filter bubble" or something. Most people aren't pundits with rivers of politics in their email. It's mostly spam, offers from mailing lists, more spam, some personal emails, a couple of urban legends, etc. I mean, I guess some of the spam is political, too... did you know that Obama is the actual Anti-Christ predicted in Revelation? But it's not generally of the "mind-changing" variety....
Why do you care if a computer reads through your email? It's not like Larry Page is looking through your search history. AI seems to just be data expressed in some way, i.e. you and I are a physical expression of years of gigabit streams of data. This will only become more strongly felt as we get closer to serious AI.
Sadly, I think it's something you either get used to or you go live in the woods.
Interesting point, but there's a potentially important, albeit somewhat subtle difference. Google collects data which they claim nobody will ever look at. The reason they collect lots of data is so that machines can look at all of it now, while the NSA collects lots of data because they think that a person might need to look at some of it in the future.
If anything, that makes Google's bulk data collection more suspect than the NSA's. Not only do they collect it en masse, they also process it.
(Note that I personally think Google's collection is more ethically defensible than the NSA's, if only because they at least seek implied consent whereas the NSA doesn't even have to pretend)
The most obvious answer for me is that the computer could tell people to take interest if the email matches some criteria. So it isn't the intrusive one, but it could automate finding people to intrude on.
Do you think that this hasn't been happening? As someone posting on a tech site, you should be aware that they scan emails for keywords: probably Google, and certainly the NSA.
Furthermore, training LSTM networks on emails has nothing to do with what you are concerned about.
Privacy issues aside, please don't.
These kinds of features are outright harmful to our society.
Human communication and our social skills in general have already deteriorated a lot from replacing face-to-face conversations with audio only, then text.
This is just further taking away the incentive to sneak in something personal, something human into our written communication every now and then.
IMO, the very reason someone takes the time to type out these kinds of questions as an email message is to make it more personal. Otherwise, they could just use an rsvp/calendar system or a bug tracker for the particular queries in the article, which already enable you to send such generic replies with 1 or 2 taps at most.
Can we stop misusing technology to dumb down our everyday lives and stop transforming our society into a bunch of isolated individuals living on the same planet?
All other concerns aside, for operations/infrastructure/devops work, auto-replying about server downtime or inquiries about performance is the kind of thing I am actively thinking about and starting to learn how to implement in my organization.
"...In developing Smart Reply we adhered to the same rigorous user privacy standards we’ve always held -- in other words, no humans reading your email."
That the humans at Google may read my email is the least of my privacy concerns. That the robots will pass it along to other robots at places other than Google is the real worry. But as no such agency exists, why worry?
The real potential here is eliminating human interaction. On the sending end, the machine should auto-compose, auto-target a mailing list, and send it out. On the receiving end, the machine should scan incoming email, compose a response, and send it out. That would cover 95% of current email traffic.
I'm not sure I trust that this will work that well, because I still get other people's flights added to my calendar automatically when they forward their itinerary to me.
This would seem to be an area where senders could alter their behavior to make things work better with this... maybe by being explicit about the question and the options.
Sounds like the system has no way to take context into account. Each message is encoded into a 'thought' vector, but it isn't difficult to come up with a message whose meaning changes depending on prior messages. I imagine it is this kind of concern that leads people to look into Hierarchical Temporal Memory and similar techniques.
I guess I've already automatically consented to Google using my emails to train their algorithms. How does filing bugs work in the scenario where the machine learning can't figure something out? Is a copy of someone's personal email attached to the bug report?
Gmail does not literally unsend SMTP messages. To give you time to undo, Gmail delays sending the message for a few seconds, so if you don't select "Undo" within the time limit, your message will be sent.
Yes, that was a nice strong statement but you are wrong, and the other guy is right. When you send a message from Inbox, you can cancel it, for a few seconds. It doesn't bother actually sending it until the Undo button has disappeared.
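Right, and a rough sketch of that behaviour (my own illustration, obviously not Gmail's code) is just a cancellable timer in front of the real send:

    import threading

    UNDO_WINDOW_SECONDS = 10

    def send_with_undo(message, smtp_send):
        # Schedule the real send and hand back a cancel function ("Undo").
        timer = threading.Timer(UNDO_WINDOW_SECONDS, smtp_send, args=(message,))
        timer.start()
        return timer.cancel

    # undo = send_with_undo(draft, smtp_client.send)
    # ...user clicks Undo within the window...
    # undo()   # the message was never handed to SMTP at all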
I regret to inform you that actually it isn't really about the algorithms, it's really about the scale. A lot of these algorithms have been sitting on the shelf for many years because they are "that's neat but you'd need an insane amount of power to do that in practice" algorithms. Google connects these otherwise-useless algorithms with insane amounts of power.