Gmail scans, parses, analyzes and catalogs your email

debatem1 · on June 3, 2019

...and, most menacing of all, they may send your email to other people!

More seriously, these things all seem like essentially mandatory features for an email provider today. If you don't provide fast search of my inbox you probably aren't a serious option for me, and a spam filter is an absolute must-have. How can they do those well without all of the above?

FartyMcFarter · on June 3, 2019

I hear that browsers these days parse the HTML content in your private browsing sessions, even the incognito ones.

dmix · on June 3, 2019

True. Although spam/search features could be the only thing you want done with your email.

There's no reason why someone couldn't create a GMail clone with both of those features on parity but with a business model that doesn't involve mining your email for receipts.

joshuamorton · on June 3, 2019

This depends on what you mean. There are features (that many people find useful, even!) that depend on "mining your email for receipts". For example, I get push notifications that I have bill payments due, and flights and events that I register for are automatically added to my calendar. A similar feature, "trip bundles", which would group flight/hotel/car rental information for a logical single trip, was available in Inbox, and was one of the many features that Inbox users complained about losing when Inbox was merged back into Gmail.

And of course, Gmail's business model doesn't really rely on mining your email receipts. Gmail data isn't used to target ads. So unless by business model, you mean the features, Gmail basically meets your requirements ;)

shereadsthenews · on June 3, 2019

Well, the reason is cost. Gmail works well because when you click on it you get a few milliseconds of resources on thousands of computers. The rest of the time those computers are doing something else. The economics aren't great for any organization that doesn't have those boxes lying around already. Also, obviously Google has search technology lying around in their kitchen drawer. If you don't have that you are starting at a disadvantage.

lern_too_spel · on June 3, 2019

> There's no reason why someone couldn't create a GMail clone with both of those features on parity but with a business model that doesn't involve mining your email for receipts.

Implementing the first two features well requires lots of users. Users you won't get if your service is less useful in assistant features for purchases.

SteveNuts · on June 3, 2019

Duh. Anyone who didn't think this was happening is simply not paying attention. Gmail will show a "track your package" link on Amazon order emails, adds flight or hotel booking to your calendar, and shows ads relevant to whatever email you are looking at.

This is how Google makes money, and they're not very subtle about it.

tantalor · on June 3, 2019

> shows ads relevant to whatever email you are looking at

Not since 2017. "Consumer Gmail content will not be used or scanned for any ads personalization" https://blog.google/products/gmail/g-suite-gains-traction-in...

solitus · on June 3, 2019

When I first saw Google Trips I thought that it was so great. It was so useful to have an automated tool that grouped all my trips' bookings and travel arrangements. Then I was like: "Waiiiit a minute! They're reading ALL my emails...OMFG.". I am now migrating towards Fastmail one provider at a time.

cycrutchfield · on June 3, 2019

Why are you anthropomorphizing an algorithm?

culot · on June 3, 2019

Don't you think you could get your point across without being obtuse and patronizing?

"It's not like individual employees are reading your mail, it's just an algorithm that handles it automatically."

That could have been a cooler move.

solitus · on June 4, 2019

Companies are moral persons.

maehwasu · on June 3, 2019

Yup, this was the wake-up call for me. I know everyone here is like "duh of course they look at everything," but sometimes it takes a concrete case to really feel what that means.

I switched to Protonmail soon after that, and honestly I barely miss Gmail at all now. I just changed my workflow to not use email as a catchall for everything that I needed to be able to search.

gowld · on June 3, 2019

Fastmail also has to read your email in order to transfer it from the sender to your browser.

windowshopping · on June 3, 2019

Exactly....I was like, people are surprised by this? How did they think gmail worked? It doesn't seem to anymore, but years ago it had been showing ads in the sidelines relevant to the content of your recent messages.

la_barba · on June 3, 2019

>Exactly....I was like, people are surprised by this? How did they think gmail worked?

The problem is Google continuously changes the terms of the deal. E.g., if in 2006 I looked at what Google did with my data and decided to create a Gmail account, that is not the same trade-off that exists today. It's a never-ending privacy creep.

HALtheWise · on June 3, 2019

As mentioned elsewhere in this thread, Gmail no longer shows ads related to email content, or uses email content at all for advertising purposes. Yes, the deal changes, but in this case it got better since 2006.

fghtr · on June 3, 2019

> Anyone who didn't think this was happening is simply not paying attention.

This is not so simple. Previous replies to similar claims: https://news.ycombinator.com/item?id=20052749 and https://news.ycombinator.com/item?id=20052851

la_barba · on June 3, 2019

"You should have been paying attention" can be said with respect to all such incidents. If we don't have the outcry and register our protest then you can pretty much guarantee that nothing will change. In this case, Google's scanning could be benign, but the fact that they're building large scale systems to do this is what is scary. It only takes one or two economic downturns to make a company like Google turn full-on evil; when the bean counters start ordering the engineers (if that hasn't happened already).

>This is how Google makes money, and they're not very subtle about it.

And us being not very subtle about our protest is how Google has reversed policies in the past.

baxtr · on June 3, 2019

I hope this will create an interesting discussion. I have two hearts. I try to avoid services like Google as much as possible since I understand what their business model is based on and I don't want to support that or put myself at risk. But if any of my friends ask me why they should not use it, the risk stays at exactly that meta level: you should not, because they will "scan, parse, analyze, ...". But then people ask: so what? I do not care. It will not affect my life. And they are probably right. I would love to hear some "real" non-meta level risks associated with carelessness in privacy matters.

RedneckBob · on June 3, 2019

(do not forget that our CIA was an early funder of Google)

--> Tracking Phones, Google Is a Dragnet for the Police <--

"When detectives in a Phoenix suburb arrested a warehouse worker in a murder investigation last December, they credited a new technique with breaking open the case after other leads went cold.

The police told the suspect, Jorge Molina, they had data tracking his phone to the site where a man was shot nine months earlier.

But after he spent nearly a week in jail, the case against Mr. Molina fell apart as investigators learned new information and released him. Last month, the police arrested another man: his mother’s ex-boyfriend, who had sometimes used Mr. Molina’s car."

dictum · on June 3, 2019

> I have two hearts.

I usually don't trust Google, but on this issue I tend to see email as paper mail — it's inherently not private, and the overload of private information that goes through it is a problem regardless of Google's presence in the exchange.

I find the usual alternative, attaching identity to mobile phone numbers/SMS, even worse. Other alternatives, like WhatsApp, are proprietary, restrictive, and also harvest data from users.

In the heyday of paper mail, you wouldn't get a copy of your entire grocery receipt back in the mail. There was a lot of sensitive information (e.g. a credit card bill) but it was less fine grained than the information that leaks from order confirmation emails.

Maybe that's something merchants should be careful about.

fghtr · on June 3, 2019

Check out recent discussion: https://news.ycombinator.com/item?id=20050764

maehwasu · on June 3, 2019

Your entire life is open to Google if you use Gmail. Google has demonstrated a willingness to run algorithms to analyze that life. Google can, at any time, for any reason, ban you from its platform.

If something in your life runs afoul of Google, your identity gets shut down. This wouldn't be a big deal if you weren't using Gmail as the hub for tons of your identity and logons--but many people use it for exactly that.

TLDR; non-meta risk is high and unpredictable if you use Gmail as a life hub, much lower otherwise.

shereadsthenews · on June 3, 2019

The more I think about this the more it upsets me to see easyDNS using these fear tactics to advertise their own terrible email service. Whatever you think Google is doing with your email, I guarantee you that the privacy controls at easyDNS are impoverished compared to those at Google. Are they seriously trying to argue that easyDNS has superior insider risk mitigations compared to Google? If so, let's hear about their architecture. We're all breathlessly awaiting the revelations of easyDNS's superior technology. I'm sure it's not just a giant pile of qmail servers that a bunch of random sysadmins have root access to, right?

avocado111 · on June 3, 2019

The argument is more that Google will try to use this data to its advantage. This may be convenient for you or even sometimes a real advantage, but there will also be situations where they use it in a way which is advantageous for them and a disadvantage for you.

Let's make up an example. Google scans emails with your purchases and their frequency and accumulates a pretty good understanding of your purchasing power and income situation. This can, for example, be used to sell advertising space for ads which are geared towards a specific income strata.

So far so good.

But the data is also in principle open to companies like credit reporting companies. But let's drive it further, your landlord might be interested in the data because it tells him precisely how much he can increase the rent before you move out and rent something cheaper.

A counter-argument which I see often is that companies like Google and FB do not sell personalized data about people because data is in some way capital for them. But cases like Cambridge Analytica show quite clearly that FB was already exchanging personalized information against data. It is only slightly simplified to say that they are already selling data.

tantalor · on June 3, 2019

> Then what are they doing with it or why else would they even bother?

Why? So "you can view all your payments, bookings, and subscriptions in one place". https://support.google.com/accounts/answer/7673989

bhhaskin · on June 3, 2019

I'm not sure why people are being so negative here. This is a good thing. The general public are finally starting to catch on to the true cost of something being free.

lozenge · on June 3, 2019

Perhaps the title should be "Google catalogs purchases of gmail users"? Any email service with spam filtering and search indexing could meet the description of "scans, parses, analyzes and catalogs your email".

everdrive · on June 3, 2019

The technical ability to view the contents of an email and apply a spam-filtering decision is meaningfully different from intentionally retaining that information and parsing it into various marketing databases. I understand where you're coming from, though. Once a provider can view the contents of your email for normal functions (such as spam filtering), the user would never know if that provider was doing something like Google is.

This is a "most caution" approach, and isn't without merit, but it's also not valid to assume all email providers are doing this simply because it's technologically possible.

greiskul · on June 3, 2019

And why is it valid to assume Google is doing this "parsing it into various marketing databases", where the only evidence about that is Google explicitly saying they don't do that? They used to scan email for relevant ads (and only next to the email). They decided that was wrong, and stopped doing that. What evidence do we have that they are lying?

everdrive · on June 3, 2019

I'm confused. Is the posted article not a direct counter-example? Google is scanning your email contents and at least, cataloging your purchases.

joshuamorton · on June 3, 2019

There's no requirement that data that is parsed get fed into a model or algorithm to sell you stuff. Note that this article doesn't make any concrete claims about nefarious actions (that's because there aren't any).

muro · on June 3, 2019

Visible only to you.

falcolas · on June 3, 2019

And Google. To help build out their profile of who you are and what you buy. To increase the value of you as a target for advertisers when you click on a page that uses Google as an ad broker.

greiskul · on June 3, 2019

Ok, any sources on Google doing that? Cause the article doesn't support that at all, and Google says it doesn't do that. Or do you disagree with the statement "also not valid to assume all email providers are doing this simply because it's technologically possible", and just assume Google is always malicious because it could be malicious?

Alex3917 · on June 3, 2019

Every email client does this. Go into any Gmail message and click on "Show Original." That's exactly what your email would look like if it wasn't scanned, parsed, and analyzed.

The fact that Gmail is reasonably good at this is literally why people use it.
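To make the "Show Original" point concrete, here is a minimal sketch using Python's standard `email` package. The message, the addresses, and the `render` helper are all made up for illustration, and this is not a claim about how Gmail itself is implemented. It only shows that turning raw message source into something readable already requires parsing headers and decoding the body.

```python
from email import message_from_string
from email.policy import default

# Hypothetical raw message, roughly what a "Show Original" view exposes.
RAW = (
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com\n"
    "Subject: =?utf-8?q?Caf=C3=A9_receipt?=\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Total: 4=E2=82=AC\n"
)


def render(raw: str) -> dict:
    """Parse raw message source into the fields a mail client would display."""
    msg = message_from_string(raw, policy=default)
    return {
        "from": str(msg["From"]),        # header value as a plain string
        "subject": str(msg["Subject"]),  # RFC 2047 encoded word is decoded
        "body": msg.get_content().strip(),  # quoted-printable + charset decoded
    }


if __name__ == "__main__":
    for field, value in render(RAW).items():
        print(f"{field}: {value}")
```

Without that parsing step, the reader would see `=?utf-8?q?Caf=C3=A9_receipt?=` and `4=E2=82=AC` instead of a subject line and an amount.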

dictum · on June 3, 2019

I don't really understand the mindset of questioning the validity of an article about something you already know.

It's an opportunity to discuss the specifics of a problem or implementation.

Also, it's always the first day of learning about something for someone. It's an opportunity to let them know about other related information.

Demiurge · on June 3, 2019

Exactly, this is the perfect article to send to all your friends who don't follow all the ##l0pht #h4x blogs.

cantrevealname · on June 3, 2019

OK, this brings up a question I've been meaning to ask tech-savvy Gmail users. The typical Hacker News reader is surely aware of the extent to which Gmail is profiling you and surveilling your email. The cost of a commercial privacy-centric email provider is about $50 per year, a trivial amount for the average US-based Hacker News reader, much less than going to a sports event or concert. So why are so many technically sophisticated and privacy-aware people, even here on Hacker News, still using Gmail? I'm not convinced that it's due to Gmail's superior spam filtering. It's been my experience that the spam filtering on commercial email providers works just fine.

Radle · on June 3, 2019

Because that story is bollocks. I don't want to be unfriendly, but part of your comment is bollocks, too.

"the extent that Gmail is profiling you and surveilling your email" - A serious description of the fundamental technology that a spam filter uses.

The purchase history that google shows me clearly says that google is only showing this to me, this information will not be shared.

See screenshot from my account https://ibb.co/jhkCSjK

Well, the truth is, most companies are a lot more hostile with your data than google. Privacy-centric services are incredibly dangerous. We have seen that VPNs always log your data. (And thereby are completely redundant.) You remember lulzsec? https://www.theatlantic.com/technology/archive/2011/09/lulzs...

Overall the purchases page is only there to give me a better overview over my data. And I am fine with google having my data. Most companies are less trustworthy than google. Splitting my data over dozens of other services is a guarantee that one of them will sell my data to the outside.

The price of using google is that google gets free data from you. The pro is that this data is top secure and only used to serve targeted ads and comply with subpoenas.

Most companies are just going to sell your data to anyone willing to pay for it, including bounty hunters. https://www.theverge.com/2019/1/8/18174024/att-sprint-t-mobi...

So yes. Google isn't good, but I can clearly tell what they are doing and what they are not doing. I trust google not to sell my data because they can get more value out of my data using it themselves and I actually don't have a problem with (reasonable) targeted ads.

cantrevealname · on June 3, 2019

> description of the fundamental technology that a spam filter uses

Privacy-centric email providers don't retain any data from your emails after scanning for spam. Gmail does, and in unexpected places that I keep discovering and hearing about, and as far as I know, there is no Gmail switch to say, "Do the spam filtering but don't keep any data".

> google is only showing this to me, this information will not be shared

They can easily change their own policy on that, and they regularly do change their policy. Also, as you already mentioned, they'll share it after a subpoena. I don't want them to have or keep that data at all.

lern_too_spel · on June 3, 2019

> Privacy-centric email providers don't retain any data from your emails after scanning for spam.

Privacy-centric email providers send your emails to /dev/null after they have been scanned for spam? Your definition is absurd.

lern_too_spel · on June 3, 2019

Because collecting data and sharing data are the only actions that affect privacy. Gmail neither collects nor shares any more data than any other email provider. The only difference is that it provides additional value on top of that data through better spam filtering and assistant functionality, neither of which affect the privacy implications of the service.

aantix · on June 3, 2019

I’m privacy aware, but I just don’t think I care. So many things that I’m supposed to be outraged about.

I’m exhausted.

TheSpiceIsLife · on June 3, 2019

With regard to Gmail, privacy might be the wrong metric to be concerned about anyway.

Gmail can unilaterally lock you out of your email at any moment, immediately and forever, with no recourse nor consumer protections.

That, for me, would be mighty annoying.

shereadsthenews · on June 3, 2019

Every email provider can do this to you.

StuntPope · on June 3, 2019

If you run email at your own domain, and sync it offsite even if you use hosted IMAP, you can always switch your MX and be back in control of your email.

TheSpiceIsLife · on June 3, 2019

Some email providers claim to offer support, some even have a phone number you can call.

cantrevealname · on June 3, 2019

> I’m exhausted.

Your answer makes a lot of sense to me. I too am exhausted by the number of creeping privacy violations I can do nothing about. (But switching email providers is something I do have control over.)

misterman0 · on June 3, 2019

I'm totally exhausted too. This morning I googled "migrate from chrome to firefox" because I remembered reading on HN a couple of days ago about a promotional site Mozilla had launched in the spirit of their old "getfirefox.com" campaign but this time it was about how to make the jump from chrome to FF. I needed Mozilla to motivate me to make the jump. I mean, FF is already on all of my PCs. I just can't seem to bring myself to actually make the jump. I just can't convince myself to marry yet another browser, as you do when you sign in and sync to the cloud. It's out of the frying pan into the fire, to me.

Google gave me an instant answer on how to migrate from FF to chrome. I immediately closed my default browser, went and played some Tetris on our Sega arcade machine, and tried to forget I'm both extremely privacy aware and extremely disappointed in Google.

Edited (spelling)

jsnell · on June 3, 2019

Are you sure that was the exact query? Of course it's hard to say anything conclusive these days, but I can't reproduce your results with that exact query. The featured result [0] is for migrating to firefox, and so are all the normal results, all the suggested searches, and two out of the three video results.

I can only see mixed results with queries like "migrate chrome firefox" or "migrate firefox chrome", which have no disambiguation about the direction of the migration at all. For those, the top result I see is migrating to Chrome and the second is migrating from Chrome. Given the ambiguity, that hardly seems malicious.

[0] https://www.howtogeek.com/333047/how-to-migrate-all-your-dat...

Frondo · on June 3, 2019

For me, the issue has a couple of layers:

First, there's little I can do about it, even if I'm not on gmail. So many people are, so Google has copies of most of my email even if I'm not using them.

Second, this is a small wrong in the world; I've literally never heard of political violence linked to someone's use of Gmail, while I have heard this about a variety of other services. It's a little disturbing, but there are "evil corp" things much closer to home that I reserve my change-the-world energy for.

Third, on a basic level, there's nothing wrong with this service, or any of the various privacy-invading services. The problem is that, right now, the mega-corps are all so unrestricted from passing my data around to one another, building giant profiles.

What I want is to be able to share my data freely with anyone, but for them to be heavily restricted from sharing it, under serious penalty, until I say it's okay. I want a cultural and legal shift around privacy to take place, so that companies have to respect my ownership of my data as a primacy, but then I want them to be able to do whatever cool stuff they want with it.

Location sharing? That shit's awesome. If you're meeting up with people in a new city, there's no better way to do it. But my telco and mapping providers had better be under corporate death penalty not to share that with anyone, under any circumstances.

That's the world I want, not one where these things don't exist, just one where I and every other user of the tools is in charge of what we do with them.

andybak · on June 3, 2019

> What I want is to be able to share my data freely with anyone, but for them to be heavily restricted from sharing it, under serious penalty

I think you pretty much described the GDPR there (or at least part of it)

the_jeremy · on June 3, 2019

They aren't as good at the basics or the extensions. I have my own email provider, and I just set it to forward to gmail.

Custom filters and sorting with colored labels as they arrive in my inbox. Adding things automatically to my calendar. Better spam filtering.

muro · on June 3, 2019

1. Everyone already has my Gmail address
2. I actually like those features (e.g. adding flights to calendar)
3. I don't see it as much of a privacy problem

craftyguy · on June 3, 2019

1. No, not everyone has a gmail account.
2. I actually don't like those "features".
3. I definitely do see it as a major privacy problem.

muro · on June 3, 2019

1. That's a different statement than I made
2 & 3. You are welcome to your opinion too.

dageshi · on June 3, 2019

People value utility and cost over privacy. They say they don't, but their continued usage of Gmail says they do. That, and the fact that most people, I think, are not vehemently against ads unless they're really annoying.

barberousse · on June 3, 2019

Dunno about that spam claim, stuff gets through Fastmail's filters

threatofrain · on June 3, 2019

I would appreciate if Apple did such things, categorizing email into "humans" or "advertising" or "bills". That would help a lot of people use email more than a style refresh.

seltzered_ · on June 3, 2019

One thing to remember is that most rewards credit cards also do similar aggregation of purchase data - https://gizmodo.com/amazon-and-chase-are-still-confusingly-o...

shereadsthenews · on June 3, 2019

It's much worse than that. You can call any of the credit card companies and just buy data dumps of transaction histories that are barely even anonymized. Google organizes your purchase history on your behalf. Everyone else is trying to sell it.

kerng · on June 3, 2019

When Microsoft pointed this out many years back (they had some controversial ads pointing out Google reading your mail) they were confronted with criticism.

It's good that the public and media are finally catching up and we have discussions around these issues.

avocado111 · on June 3, 2019

What I find more concerning is that, at least in the UK, gmail seems to include the public IP of the sending device in the "Received" email header. Because I assume that it's possible (not sure if legal in the UK) to buy the mapping from IPs to names / identities, this means that it isn't possible to use an unnamed gmail account in an anonymous way. I guess that most gmail users are not aware of this issue.
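For anyone who wants to check this on their own messages, here is a rough sketch using only the Python standard library. The sample headers are invented and use documentation-reserved addresses, and `received_ips` is a hypothetical helper. The layout of `Received` lines varies between mail servers, so treat the bracket pattern as an assumption, not a guarantee. Paste the source from "Show Original" in place of `RAW` to see what a real message contains.

```python
import ipaddress
import re
from email import message_from_string

# Invented example headers; IPs come from ranges reserved for documentation.
RAW = (
    "Received: from mail.example.net (mail.example.net [198.51.100.7])\n"
    "\tby mx.example.org with ESMTPS id abc123;\n"
    "\tMon, 3 Jun 2019 10:00:01 +0000\n"
    "Received: from [192.168.1.20] (host.example.com [203.0.113.45])\n"
    "\tby mail.example.net with ESMTPSA id xyz789;\n"
    "\tMon, 3 Jun 2019 10:00:00 +0000\n"
    "From: someone@example.com\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

# Assumption: relays write address literals in square brackets.
BRACKETED = re.compile(r"\[([0-9A-Fa-f:.]+)\]")


def received_ips(raw: str) -> list[str]:
    """Return every valid IP literal found in the Received headers, top first."""
    msg = message_from_string(raw)
    found = []
    for hop in msg.get_all("Received", []):
        for candidate in BRACKETED.findall(str(hop)):
            try:
                found.append(str(ipaddress.ip_address(candidate)))
            except ValueError:
                continue  # bracketed text that is not an IP address
    return found


if __name__ == "__main__":
    print(received_ips(RAW))
```

The last `Received` header is the hop closest to the sender, so that is where a client's address would appear if the provider records it.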

tantalor · on June 3, 2019

> We don’t know what else they are scanning for, what else they are parsing out, where they are storing it and what they are doing with it.

You didn't know that before, either; the "purchases revelation" changes nothing. Also, this is true of every website, retail business, method of transportation, government agency, etc.

dragonwriter · on June 3, 2019

This isn't secret; features that openly depend on this are key Gmail selling points.

NikolaeVarius · on June 3, 2019

I don't understand how this is a new realization; using email to personalize ads has been the implication since Day 1. Everything else is just new fluff on top of the core offering.

Lendal · on June 3, 2019

They said they weren't using it to personalize ads. To me, this means they're doing something worse than ad personalization. They're selling the data to third-parties or they're allowing third-parties to analyze the data in return for cash. That's what Fakebook does and so that's my expectation of Google. Especially since they dropped "Don't be evil."

shereadsthenews · on June 3, 2019

What evidence do you have of that? Google just uses this data on your behalf to make their services more useful. Purchase analysis was a core launch feature of Google Inbox and now that Inbox is gone the feature is rolled into Gmail. I personally get a lot of value from this feature. I like getting notifications on my mobile when my orders ship and when they are delivered. It also helps me re-order consumable items, which I did just yesterday when the igniters in my gas oven failed (again). I bought them last in 2016 but it was trivial to find the receipt in my purchase history on Google.

Lendal · on June 3, 2019

My evidence is that they are a publicly-held company which means they're obligated to make as much money as possible for their shareholders. If they have something of value, (your purchase history) they must make money off it. Giving you free stuff without getting anything in return is "bad" in the eyes of shareholders in the current unregulated business climate.

shereadsthenews · on June 3, 2019

That's a pretty dumb point of view. It says right at the top of the page on Google's purchase history list that "Only you can see your purchases". A rational person needs some kind of basis to refute this claim.

Lendal · on June 3, 2019

Personal attacks now. You're out of intelligent arguments. Attacking both intelligence and rationality. I've given my reasons and they are rational. If you can't see that you can go f-k yourself. What an ass.

philips · on June 3, 2019

Is there an alternative that runs everywhere (iOS, Android, webapp, etc) and provides productivity enhancements like email snoozing and cross platform contact integration?

naringas · on June 3, 2019

and my oldest email is from 2004... i.e. earlier email has vanished

rconti · on June 3, 2019

http://www.google.com/search?q=gmail+launch+date

naringas · on June 7, 2019

oops i.e. notice of mistake acknowledgement

shereadsthenews · on June 3, 2019

Every MUA ever written does these four things to your mail.

Havoc · on June 3, 2019

Does this apply to g suite paid products too?

unstatusthequo · on June 3, 2019

And even if you leave GMail, you may email with people that have it, and Google will still have those emails of yours via your email contact.

I’ve been moving to ProtonMail. It’s harder than you expect to de-Google-ize.

jwr · on June 3, 2019

I find this particularly interesting, as everyone seems to be assuming that Google keeps their information "private".

But my YouTube history contains other people's videos, and I've been unable to get them to resolve the mixup. They don't seem to care.

Of course, in the case of YouTube, it is simply amusing to get videos for toddlers in your history or recommendations. It will be much less amusing to get other people's purchase history mixed in with yours.

jsnell · on June 3, 2019

You've probably got malware on your machine, which is using your browser and account for Youtube view boosting.

https://www.reddit.com/r/youtube/comments/a9y59n/rare_malwar...

jwr · on June 3, 2019

Every time I mention this, I get downvoted, and I get the usual suggestions. It seems no one can accept that Google can indeed have bugs in their software.

I do not have malware. I have three devices logged in: a Mac, an iPhone and an iPad. All under my control, managed carefully. I tried logging all of them out, removing through google's online panel, resetting my password, and lots of other things. I reported this to the YouTube team (silence).

The usage patterns of the "other" history are that of a real child: I'm guessing it's a child between 2 and 4 years old. The usage is limited, so it isn't likely to be an automated pattern. Various videos, not from a single author.

I think I'll stop posting about this, because clearly people do not like to hear about Google having a problem and mixing up people's data. And honestly, I'm tired of explaining…

mikejb · on June 3, 2019

Some posts in that thread suggest[1] an application called Gramblr could be responsible for it - an application used to upload content to Instagram.

[1] https://www.reddit.com/r/youtube/comments/ap6brw/gramblr_app...

tantalor · on June 3, 2019

Your account might be signed-in on somebody else's device.

Check "recently used devices": https://myaccount.google.com/device-activity