Absurdly wrong. Marketers already use a unique image URL for each email recipient, and Google has no way to know that all of those point to the same image. So they won't see "a single request from Google"; they'll see one request from Google per successful delivery to an inbox.
Now, an open question is whether Google will make that request when the email is actually opened, which would allow marketers to determine if and when the email was read by the user, or whether Google will make the request as soon as the email is received. The latter would enhance users' privacy at the cost of bandwidth for Google, but early tests indicate that they don't actually do that; they wait for the user to click the email before making the request.
I'd like to add that there's no possibility the Gmail team is stupid enough to not have considered this. They must know full well what they're doing, and marketing this as a privacy enhancement when it's actually detrimental to privacy is willfully dishonest.
I've just tested this.
The image was retrieved when I viewed the email in Gmail.
The tracking info basically comes back as "anonymous" and viewed from an unknown location.
The image was retrieved twice even though I only viewed the email once.
Currently I'd say that seeing the image being viewed is still valuable. I'm sure Google could move to proactively fetching the images in future, destroying even that value.
The image is cached (via browser headers), but isn't aggressively cached (via reverse proxy). Fresh views on different browsers or a later session would still result in a request for the image back to the source server... registering the view again.
My personal view on all of this is that this is a bit Microsoft... you know, convenience and features over security and privacy.
For me, this "feature" leaks data about what I view to 3rd parties where today I block all images and do not leak that info.
That dialog was much along the lines of "We'll now show you images in your email automatically." with a big "OK" button.
I don't recall whether there was a less prominent "No, thanks" as I was only logging on to reply to one question really quickly.
I suspect this is a UX anti-pattern. I've gone back into my settings and changed it back to "Don't display images".
If you accept the new default, it's far easier for them to track you, because Google will kindly be loading their tracking images on your behalf when you open the email, every time.
Gmail has done this enough times already for it to become, to me, a UX anti-pattern in itself.
It's weird to look back to the days when people were begging for invites and Gmail was the best game in town. These days, the few times I'm forced to use Gmail, it almost makes me rage.
The "normal users" I know who are forced to cope with Gmail are constantly and consistently confused, asking me where things they used to know have gone, and why Google is ruining Gmail.
Glad I migrated to something better, somewhere where I know for a fact that I am the customer, not the product.
Once you try using fastmail, you will be surprised by how incredibly lightweight it is. When you first get used to that, trying to use Gmail feels like wrestling a horribly bloated pig. The UX has just become terrible.
As for fastmail or other options... I was very keen on being able to host things 100% myself, preferably using FOSS, because of privacy concerns and being in control of my own data.
After trying out various FOSS (and non-FOSS) solutions I decided none of them were good enough/polished enough for my needs.
In that regard fastmail is a compromise for me compared to what I ultimately want, but it's a compromise I'm more than happy with.
It seems the commenter functionally tested an external URL being requested by Gmail, and found that the request was coincident with when the commenter opened the email. This essentially "leaks" that you've opened the email as the image URL could be unique to your email message.
But even if they did, this is still more information leakage than the old default (don't load images).
Spammers who email via botnets and the like, with false return addresses, don't get bouncebacks to clean their lists.
But if you (or Google, on your behalf) give them a hand by reliably loading their tracking image, that flags your email as a valid one.
If you weren't actually reading the email, that's still a false positive I don't think you'd benefit by giving.
Email from familiar senders would have images prefetched (thus avoiding leaks of user data).
And DDOSing concerns would be reduced because those emails would not be from a familiar source.
The ideal thing would be to just prefetch all the images sent to existing and nonexistent accounts alike. This way there is absolutely no way for a spammer to tell whether an email address exists or not.
The advantages I had in mind were:
1. No leak of user IP address, cookies, etc
2. No leak of timing information (when user opened the email).
It will however leak that the email address is valid, which might be a fine compromise with a selected subset of senders.
There is a bigger benefit in doing this proxying for emails from non-familiar senders.
Google could avoid leaking whether an email address is valid by simply prefetching images even when email is sent to nonexistent accounts.
But that option to not see images is still there, and if you've defaulted it to off you still won't see images. Unless I misread the blog post.
I don't understand. It probably has to do with the Zeitgeist.
The images were originally blocked because of security and privacy concerns.
Rendering of images is potentially insecure because of bugs in the browsers. By proxying the images, Google, as a webmail provider, screens you from your browser bugs. Solved.
Rendering of images is a privacy concern because of tracking.
By fetching the images from another place, the attacker cannot know your location, your OS, language setting, etc... Solved.
Rendering the image allows checking when and if the email is ever opened, which can be useful in marketing^Wspam primarily because they can understand whether an email is active/exists or not. Not Solved.
The latter however is a general problem. There are many other ways to know whether an email account exists or not. Many mail servers respond with bounce emails anyway. They won't bounce on detected spam, but that's not the point; this feature is an additional barrier for image tracking for content that has passed through the spam filter already.
1. Google may cache all images in all emails sent to gmail.com instantly and regardless of the existence of the address. This would remove the possibility for marketers to check user timestamp, remove user data from request and hide user email existence.
2. Google does _not_ need to save each image from each unique URL separately, all they need to do is fetch each image and check against an already existing (mega)array of images they've fetched. This greatly reduces storage needed, but doesn't do much for the bandwidth requirement, but they won't care about bandwidth in all their Googleness.
3. The single most important aspect of this change has been omitted from the article, and from your comment: this change completely eliminates the risk of CSRF attacks by spammers and the like. CSRF attacks are still number 8 on OWASP's list of top 10 attacks.
My three cents ;)
Using a separate tracking pixel is pointless unless you for some reason want to let some third party track the opens (which some people might, e.g. to prove certain open rates).
The entire article is just plain wrong. For instance:
This move will allow Google to automatically display images, killing the "display all images" button in Gmail.
Go ahead and do that, Google, and you'll bring Web bugs back completely. How about this:
1. Marketer embeds a "jpg" file whose filename consists of a GUID that matches back to a user.
2. When you load that "jpg" file, it gives you an image unique to that user - maybe an MD5 hash of their GUID filename or some other thing unique to them that holds no value.
This would uniquely identify that a user reads their email. How would Google stop this? They can't cache it for multiple users. Both the filename and contents are unique to the user. If you think they could otherwise detect it for this particular situation, I could think of 100 other ways to do this that would not be able to be tracked in the same way.
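To make the two steps above concrete, here is a rough sketch of the bookkeeping a sender would need. Everything in it is an assumption for illustration: the `tracker.example` host, the function names, and the use of a random UUID as the GUID. It only shows that a per-recipient filename is enough to tie a request back to a user, whoever performs the fetch.

```python
import uuid

def make_tracking_urls(recipients, base="https://tracker.example/img/"):
    """Return ({token: recipient}, {recipient: url}) with one random token per recipient.

    The base URL is a hypothetical placeholder, not a real service.
    """
    token_to_recipient = {}
    urls = {}
    for recipient in recipients:
        token = uuid.uuid4().hex  # random; carries no meaning on its own
        token_to_recipient[token] = recipient
        urls[recipient] = f"{base}{token}.jpg"
    return token_to_recipient, urls

def recipient_for_request(path, token_to_recipient):
    """Map a requested path such as /img/<token>.jpg back to a recipient, or None."""
    filename = path.rsplit("/", 1)[-1]
    token = filename.split(".", 1)[0]
    return token_to_recipient.get(token)
```

Because each token appears in exactly one email, a proxy that fetches the URL on the user's behalf hides the IP address and headers but still reveals which message triggered the fetch.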
There's no way Google or any other email provider can legitimately automatically load email images and not open the door to web bugs. No way.
Since every result will be a positive - false or not - no information is revealed to the marketer AND the images are displayed.
Sure, in theory that works. In practice, you make an easy way to do a Denial of Service attack against Google or innocent third parties, so in actuality it would never work.
Send a million emails to Gmail accounts that each have fifty links to 1 MB JPG files hosted all over the Internet. The size of the file you send to Google in the email is what - 1k? The size you are making Google download is 50 MB. This is a 50,000:1 attack ratio. You could take Google down with a 56k modem.
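The arithmetic behind that ratio can be checked in a few lines. This is only a back-of-the-envelope sketch; it assumes 1k means 1,000 bytes and 1 MB means 1,000,000 bytes, and the function name is made up for the example.

```python
def amplification_ratio(email_bytes, links_per_email, bytes_per_link):
    """Bytes the fetcher downloads for every byte the sender transmits."""
    return (links_per_email * bytes_per_link) / email_bytes

# Figures from the comment above: a 1k email carrying fifty links to 1 MB files.
ratio = amplification_ratio(email_bytes=1_000,
                            links_per_email=50,
                            bytes_per_link=1_000_000)

# Total bytes fetched if one million such emails are all prefetched.
total_fetched = 1_000_000 * 50 * 1_000_000

print(f"ratio {ratio:,.0f}:1, total {total_fetched / 10**12:,.0f} TB")
```

Under those assumptions the ratio works out to the 50,000:1 quoted above, and a million such emails would mean on the order of 50 TB of fetching.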
You could also launch a denial of service attack against any other target, courtesy of Google. Send a million emails to Gmail users that make Google download JPGs from a target web server. You can even make up the JPG names to be non-existent. Again, figure your email is costing you 1k in transmission and Google is putting tremendous strain on the target server downloading 404 error messages.
There are literally tens of millions of computers infected with malware making them part of a botnet (ex: http://www.csmonitor.com/USA/2011/0629/Biggest-ever-criminal...). The cost to hire 1 million of these computers (all with unique IP addresses) to send emails out is trivial. You'd be shocked how cheap it is.
Now here's the thing... SPAM is pretty easy to spot because someone is trying to sell something. In this case, you're not trying to sell anything. You just need Google to download things. So you send an email with "Vacation Pics" in it. Sure, Google could filter them out but they're all from unique IP addresses from home computers across the country/world and they aren't trying to sell anything. They probably could filter out some of them at the cost of filtering out a lot of false positive legitimate mail.
Internet security is complex - you seem to think otherwise.
As it is right now, spam is just mostly a drain on network and human resources.
This attack would also target storage. Google has enormous amounts of storage, so I've no idea whether it would be effective against them or not. But, it would be very effective against any smaller service providers that tried to do the same image caching thing (which so far seems to be a roundly bad idea, IMO).
It would also open up the possibility of using Google as an attacker to take down other sites. If Google's servers immediately requested the image for caching, then just send out a few million messages to Google addresses, each one with a unique URL pointing to some big image file on some site you want to take down. Most sites will accept a query string after an image URL, so ... bob's yer uncle. As it is now, this isn't an effective attack because you need to get people to actually open up the email message and then click "Display images", all within a short period of time.
And spammers aren't even that smart, they're just numerous and financially motivated.
If somebody wanted to annoy someone else with this, they could.
And, again, even if this doesn't work against Google directly, it certainly would work against other service providers who decide to follow Google's lead.
Heck, just look at the recent popularity of the WhatsApp spam that spread malware to tons of people (including Cryptolocker in some cases), or the "Secure Document" phish that made the rounds in Gmail in September.
Attacks on third parties are trickier, but you can do the same sort of thing today; how often is this tactic used already? And can't you download 404 messages yourself for cheaper than 1k?
If the same, just show a cached version. No more tracking.
(of course, email marketers would thwart this by making each image slightly different. google would respond by checking if the images are almost exactly the same. repeat. whack-a-mole ad infinitum).
Google might save themselves some storage space, but not anyone's privacy.
If Google changed their policy to fetch all images when the email is delivered, that basically delivers a false positive to all marketers/spammers/etc. -- which is better than more accurate info, but it's still worse than just not loading the images.
It's still a confirmation that the email address is valid, plus... who wants to give spammers (or marketers, for that matter) a false positive? The unsophisticated ones won't realize that it's a Google change; they'll just put you on the "interested" list and move on.
So I think they actually could take a pretty good stab at deduping images with unique tracking URLs. It might not be perfect, but even if it works 95% of the time they could still kill the profitability of the unique tracking image technique.
(Also, they don't currently seem to do anything like you're suggesting.)
I'm trying to work out whether this is more useful as a way to get Google to DoS themselves or as a way to get them to DoS arbitrary web sites of others. Either way, isn't this a gift to trouble-makers?
Of course Google would probably develop an automated defence against such attacks quickly if they happened in practice, but it seems any such defence would necessarily involve not caching all the images in advance, which would defeat the original point.
I also strongly suspect that google's crawling infrastructure is more than capable of fetching a bunch of images for every single message gmail receives.
But even if I'm wrong about the above, google is perfectly capable of throttling their fetching to mitigate. (The problem really ends up looking an awful lot like crawling the internet, which is an area that google seems to have a bit of experience)
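For what throttling could look like, here is a minimal sketch of a per-host limit on outgoing fetches. It is not a description of anything Google runs; the class name, the sliding-window approach, and the limits are all assumptions. Keying on the hostname means a unique query string per URL does not get around the limit.

```python
import time
from urllib.parse import urlsplit

class PerHostThrottle:
    """Allow at most `limit` fetches per host within a sliding `window` of seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock          # injectable so the behaviour can be tested
        self.history = {}           # host -> timestamps of recent allowed fetches

    def allow(self, url):
        host = urlsplit(url).hostname or ""
        now = self.clock()
        # Keep only the fetches that are still inside the window.
        recent = [t for t in self.history.get(host, []) if now - t < self.window]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self.history[host] = recent
        return allowed
```

The trade-off is that any fetch the throttle refuses has to be deferred or skipped, so a throttled proxy can no longer promise that every image is fetched at delivery time.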
Google can't tell, a priori, whether or not a series of similar e-mails sent to many thousands of people with Google Mail addresses and containing similar but different image links like the above is a genuine mail going out to someone's list or a DDoS of www.example.com in which Google is about to become an unwitting participant.
By the time they've worked out whichever trick is being used this time (in the same way that they adapt to changing black hat SEO tactics, but probably only make major changes every few months) it's not hard to see a hostile party busting the bandwidth cap for anyone on a basic, low-volume hosting plan.
Not sure if they limit it at some point, but if a server accepts urls such as:
Google would fetch each separately. Send this out to a bunch of people, and it seems problematic. I'm going to be optimistic, and assume they built in some sort of limiting, but who knows.
(and "normal users" do click the show images links)
Edit: ah, codeflo and EGreg are right. I was just thinking about the task of determining that the images serve the same role in each message (which I'm sure google could do a good job of). But (as they point out) in the "Dear <user>" case they'll still have to show the right image to the right user. Although, as Nacraile and jaxn say, if they load all those images eagerly they'll remove the value of those unique tracking images, and impose a cost on the sender.
They could do a "This sender appears to be attempting to track you. We have disabled images as a precaution. Click here to load."
Scary enough to the average user that it'd probably kill the technique very quickly.
I'm not sure about that. It's nothing that desktop clients such as Thunderbird haven't been doing for years. I don't see any remote images in any e-mail until I click to say load them, and this works in much the same way that plug-in elements like Flash and Java are now click-to-show in various browsers. Numerous marketers and mailing list services still use the technique to track an approximation of reader numbers, though.
A particular example of crying wolf that comes to mind is the yellow box that says, "HEY! THIS SENDER ISN'T WHO THEY SAY THEY ARE!", which usually means that someone just forwards their .edu address to Gmail.
Anyway, I think it would be perfectly fine if Google matched up emails with similar content and where there was one image that was unique for everybody just remove it, maybe with a note to the user in that case.
Do NOT have a "click to load images" button. If users can't ever see them then it completely destroys spammers' ability to use them even for a rough sampling.
I would love to see spam as an advertising method be completely destroyed. It won't be, because even without tracking it is still easy and useful to spam out lots of ads, but this would help.
It'd be a gold mine.
My assumption is that it's making the images unique.
Try again. :)
And even if marketers do start customizing images, hasn't Google gotten pretty good at comparing very similar files? Isn't that how Google Music works without having a copy of every single individual upload?
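As a toy illustration of comparing very similar files, here is an average-hash sketch over small grayscale grids. It assumes the images have already been decoded and downscaled to equal-sized grids of brightness values, and it makes no claim about what Google actually uses; the function names are invented for the example.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)

def hamming_distance(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

base = [[10, 10, 200, 200],
        [10, 10, 200, 200],
        [200, 200, 10, 10],
        [200, 200, 10, 10]]
tweaked = [row[:] for row in base]
tweaked[0][0] = 12  # a one-pixel, barely visible customization
checkerboard = [[200, 10, 200, 10],
                [10, 200, 10, 200],
                [200, 10, 200, 10],
                [10, 200, 10, 200]]

near = hamming_distance(average_hash(base), average_hash(tweaked))
far = hamming_distance(average_hash(base), average_hash(checkerboard))
```

A small distance suggests two images are near-duplicates. Note that the comparison can only happen after the image bytes have been downloaded, and images with unrelated content will not match at all.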
An image link in the email to:
will work just fine. So, Google will follow the link and compare the files and... wait a second, the information we care about - that slg read my email - has already been transmitted.
Now, when user_y also receives my spam and I get a request for image_for_user_y.jpg and I just serve the same file, Google is probably gonna deduplicate them on their cache or cdn or w/e, but only after they've sent me the request and confirmed that someone read my email.
I'm not trying to overload google's storage capability here (lol), I'm just interested in the information leak.
So they could learn that for all these emails with largely the same content, this one image has a slightly different URL, but the image is always the same (or similar). So as with spam, the marketers might see the first few "opens", but once Google learns that they are all similar anyway, they won't see any more.
Now, I don't know if that's what they are doing, but it's certainly possible.
No, you just know that the email was delivered to the user's inbox. You don't know if the user looked at it or just trashed it.
People love their e-mail offers. The type of users e-mail marketers want the most - namely the ones that respond to their offers very well - would be up in arms if Gmail made them start missing out on offers.
For image tracking pixels, I'd just start returning PNGs of random dimensions with completely random pixel data. Ta-da, unique images that aren't similar at all (unless they wanna start doing wavelet transforms or something).
Sure, but they'd have to download the file first. At which point the tracking has succeeded.
Succeeded in what, confirming that Google is still in operation? Please note that Google doesn't even have to confirm the receiving email address is valid in order to get the image.
$ telnet gmail-smtp-in.l.google.com 25
Connected to gmail-smtp-in.l.google.com.
Escape character is '^]'.
220 mx.google.com ESMTP nh2si25383829icc.26 - gsmtp
250-mx.google.com at your service, [my.ip.was.here]
MAIL FROM: <firstname.lastname@example.org>
250 2.1.0 OK nh2si25383829icc.26 - gsmtp
RCPT TO: <email@example.com>
550-5.1.1 The email account that you tried to reach does not exist. Please try
550-5.1.1 double-checking the recipient's email address for typos or
550-5.1.1 unnecessary spaces. Learn more at
550 5.1.1 http://support.google.com/mail/bin/answer.py?answer=6596 nh2si25383829icc.26 - gsmtp
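For anyone reading such a transcript programmatically, here is a small sketch that collapses a multi-line SMTP reply like the one above into a code and a message. The function names are made up; the only protocol facts relied on are that every line of one reply repeats the three-digit code, that a hyphen after the code marks a continuation line, and that 5xx codes are permanent failures.

```python
def parse_smtp_reply(lines):
    """Collapse a (possibly multi-line) SMTP reply into (code, text)."""
    code = None
    parts = []
    for line in lines:
        code = int(line[:3])            # every line of one reply repeats the code
        parts.append(line[4:].strip())  # skip the '-' or ' ' separator
        if line[3:4] != "-":            # anything but '-' marks the last line
            break
    return code, " ".join(parts)

def recipient_rejected(code):
    """5xx replies are permanent failures in SMTP."""
    return 500 <= code <= 599
```

Fed the four `550` lines from the session above, this yields code 550, which `recipient_rejected` reports as a rejection, while the `250 2.1.0 OK` reply to MAIL FROM does not count as one.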
python -m SimpleHTTPServer 8080    (Python 2; on Python 3: python3 -m http.server 8080)
creates a webserver serving the current directory, you can then create an email linking to a file in that directory and observe when it gets queried.
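If you want the timestamp and User-Agent of each fetch in one line, a slightly extended version of the same idea is sketched below for Python 3. The handler and helper names are my own, and the log format is arbitrary; it still just serves the current directory.

```python
import datetime
from http.server import HTTPServer, SimpleHTTPRequestHandler

def describe_request(path, client_ip, user_agent, when):
    """Format one log line for an image request."""
    return f"{when.isoformat()} {client_ip} {path} {user_agent or '-'}"

class LoggingHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory, printing one line per GET."""

    def do_GET(self):
        print(describe_request(self.path,
                               self.client_address[0],
                               self.headers.get("User-Agent"),
                               datetime.datetime.now()))
        super().do_GET()

def serve(port=8080):
    """Block forever, logging every GET. Call this yourself to start the server."""
    HTTPServer(("", port), LoggingHandler).serve_forever()
```

Run `serve()` from a directory containing the image, link to it from a test email, and watch which address and User-Agent turn up when the message is opened.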
You've probably noticed that most spam comes from "borrowed" email addresses, not ones the spammer actually controls. If anyone ever sends a ton of spam with your email address on it (this has happened to me) it really drives the point home.
(Also, they might not actually point to the same image, it's very easy to make the images themselves unique if required.)
The TL;DR is that Google does give you back the open event, and they also give you the email address behind the open, since this data is encoded in the URL.
The data that is not accurate is what you would expect: IP address, geo-location and user-agent string.
We'll hopefully write a more extensive blog post shortly.
Also any cookies present in the user's local environment from other actions (images from the same ad network in other emails or from visiting web pages that use the same ad network) are not going to be sent, so tracking you between locations is going to be neutered somewhat.
- Since all images flow through google, phishing and other malware attacks could be subverted.
- Images will be hosted faster in many cases (possibly less cost to run newsletters).
- Fewer connections to your server/CDN: they come from Google alone rather than from multiple sources.
- Still able to identify users legitimately but new users and newsletters will have more trouble getting your information initially.
- Re-views later will not be tracked unless the image is out of cache; it will come from Google the second time if it hasn't been purged (re-views are not big on newsletters anyways)
- Google getting all this data as well as your company
- The obvious 'national security' reasons
- Limited location and meta information
The email already has the image URL. What extra leak is there if Google fetches the image?
Also, google already has this data so why is that a con?
There is no national security reason.
Bottom line: Christmas came early for spammers this year.
I reply to this question here. Sadly, a request is done each time and only when a user loads the image.
Which makes the Ars article even more wrong.
Google has the technology to do that; the question is whether they want to put in the investment of storage required to actually guarantee users' privacy, or if they just want to spend the least amount they can get away with...
/me goes to go find and turn on the "block third party images" feature.
This essentially allows the marketer to track whether the email was opened by default.
And the spammer. Unless they decide to not load images by default from untrusted senders.
If Google retrieves every image in every mail sent to @gmail.com, @googlemail.com etc, then the only thing the retrieve tells you is that you spelled "gmail.com" correctly - nothing about whether there is a mailbox there or whether it was delivered.
But they don't. If they retrieve the image, the account exists (they reject mail at the RCPT TO: stage for nonexistent accounts).
If Google does this with every mail regardless of the inbox it goes to (spam etc) then it doesn't tell you any more information than you learn from not receiving a bounce. However, I could imagine a scenario where the bounce address is wrong anyway (spoof) - is this really that useful for anything?
I mean, presumably most combinations of common first and last name plus two digits go to a registered mailbox. How does being sure that it is registered (but knowing nothing else) without having to be in a position to receive the bounce, mean a compromise?
I'm open to the possibility that it does - but I'm not seeing it.
EDIT: another possible area of concern is that you can get Google to visit an address just by sending mail to firstname.lastname@example.org and calling the link an image. But can't you already do the same thing with the Google bot by including a link that will probably cause it to visit? This could be more instantaneous and hide the actual referring source of the visit behind an email, but I don't really see how this can be used for anything. For example, if an extremely malformed server performs actions on the basis of a simple HTTP GET then I guess you could craft that command into an image URL, send it to any Gmail address, and then Google will do your dirty work of actually visiting that link. But, really, is this a vector that is dangerous for anything? Don't URLs already get random Google traffic?
Anyone can check for the existence of gmail accounts in this manner without screwing around with images.
But you are spot on them being "willfully dishonest". The way I look at it, they are at this point trying to push the barrier and see where users will protest enough that they need to roll-back.
You can of course do simple image processing to identify similar images. If the marketers change only the name (URL) of the file and not one bit of the content, you can trivially compare hashes of the files. Even if the marketers change the content, assuming each image is 90% the same and 10% customized per person, you can borrow techniques from the image compression domain to store this humongous amount of data very efficiently.
"In some cases, senders may be able to know whether an individual has opened a message with unique image links" suggests Google (at least for now) fetches the images upon opening of the email.
They could do content hash-based caching rather than URL-based caching. It would be more private, as the email senders would have to generate a unique image for each recipient.
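A minimal sketch of what content hash-based caching might look like, as opposed to keying the cache on the URL. The class name and the injected `fetch` callable are assumptions made for the example; nothing here is Google's actual design.

```python
import hashlib

class ContentHashCache:
    """Store image bytes once per distinct content, regardless of URL."""

    def __init__(self, fetch):
        self.fetch = fetch          # callable url -> bytes, supplied by the caller
        self.url_to_digest = {}     # which content each known URL resolved to
        self.blobs = {}             # digest -> bytes, one copy per distinct image

    def get(self, url):
        if url not in self.url_to_digest:
            data = self.fetch(url)  # the origin server sees this request
            digest = hashlib.sha256(data).hexdigest()
            self.blobs.setdefault(digest, data)
            self.url_to_digest[url] = digest
        return self.blobs[self.url_to_digest[url]]
```

With a thousand unique URLs that all return the same bytes, this keeps a single stored copy. It does not by itself hide anything from the sender, though: the hash can only be computed after the first fetch of each URL, so each unique URL still produces one request at the origin.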
But if the images are pre-cached before the user opens it, there is still no way of knowing the E-mail was read.
Unless the image is cached upon opening, which is a bit counter-intuitive.
I think the guys who implemented image search have better ways to figure this out.
Do you think copyright law disallows running your desktop in the cloud?
The most important part is at the end:
"In some cases, senders may be able to know whether an individual has opened a message with unique image links. As always, Gmail scans every message for suspicious content and if Gmail considers a sender or message potentially suspicious, images won’t be displayed and you’ll be asked whether you want to see the images."
So Google apparently does not see read receipts as a problem. The privacy and security protections are about preventing other information (like ip, browser headers, cookies) from leaking, rather than read notifications.
If you care about maintaining your privacy, I would recommend disabling the new functionality.
(Why wouldn't they let users combine this behavior with the old one? That is, don't display images by default, but if you choose to display them anyway, get the file from Google's proxy server.)
"The privacy and security protections are about preventing other information (like ip, browser headers, cookies) from leaking, rather than read notifications."
"If you care about maintaining your privacy, I would recommend disabling the new functionality."
The new functionality seems to by default enable read notifications as Google seems to load all images by default. Unless that's false. Then it should have no impact on privacy.
Gmail is pre-fetching the image as part of the message.
I then reverted back to the old image settings but when I open unread messages now, they also have their images loaded automatically and without asking me.
Either the "revert back to original settings" option is broken, or the caching of images is done when I enter my inbox, for all unread messages.
This isn't the privacy and common-sense win you think it is.
I don't think you're being sincere in your concern.
Because of this, any real interaction with an email message, where the user has to interact with the content, could then be counted as an "open". If you get any other interaction with the message - say a user clicks a URL (that is also tracked) - you just look and see if an "open" was also recorded for this message, for this user, and if not, record an open.
You'll still get instances where opens happen in messages that aren't tracked, since no other interaction is done - there's that inaccuracy. But if no other interaction is done, you might as well not count it as an open anyway. Perhaps we should rename the "open" track as "was this person at all engaged, in the slightest?"
You can also then just give that a ranking: opened, clicked x number of links, followed through with a sale, DIDN'T unsubscribe - good rank!
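Roughly, in code, that bookkeeping could look like the sketch below. The function names and the score weights are arbitrary choices for illustration, not anything a real mailing platform is known to use.

```python
def infer_opens(opens, clicks):
    """Return the set of (user, message) pairs to treat as opened.

    A tracked click implies the message was opened, even when the
    image-based open event never arrived.
    """
    return set(opens) | set(clicks)

def engagement_rank(opened, clicks, purchased, unsubscribed):
    """Toy engagement score; the weights are arbitrary."""
    score = 0
    if opened or clicks > 0:
        score += 1              # opened, either observed or inferred from a click
    score += clicks             # one point per tracked link click
    if purchased:
        score += 5
    if unsubscribed:
        score -= 5
    return score
```

For example, a user whose images never loaded but who clicked two links and bought something scores 8 here, while someone who opened and then unsubscribed scores -4.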
Citation? I was not aware of this.
Selling eyeballs: yes.
Selling information: oh good lord no, so very, very no. That information is never ever leaving Google's servers.
Imo it's enough to be a bad thing. We can't rely on good will of companies to fight incentives and money in the name of privacy of their users.
Even the evilest of companies isn't going to sell off its competitive edge, that's just idiotic.
Also you are wrong, Google is not a marketing company. Google is a middle man company. Google matches marketers to consumers. THAT is Google's business. Google is paid to be a match maker, nothing more.
Marketers don't care who _isn't_ reading their mail, they care how to send mail that more people will read.
Who subscribes to a direct email campaign (and doesn't unsubscribe, and doesn't flag as spam), and is still offended by the thought of the sender knowing it was read?
Those people can flip the setting back to prompt-to-show-images.
This is a silly fear, all email clients already do this, as you don't display raw unmodified HTML from emails, you have to scrub it. They are just adding one new kind of scrubbing to the list of things they already must do.
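As a rough picture of that extra scrubbing step, the sketch below rewrites remote image URLs so they load through a proxy. The `imgproxy.example` base is a placeholder, the regex only handles double-quoted `src` attributes, and a real sanitizer would use a proper HTML parser; treat it as an illustration, not as how Gmail does it.

```python
import re
from urllib.parse import quote

# Matches <img ... src="http(s)://..."> with a double-quoted src attribute.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")(https?://[^"]+)(")', re.IGNORECASE)

def proxy_images(html, proxy_base="https://imgproxy.example/fetch?url="):
    """Rewrite remote <img src="..."> URLs so they load through proxy_base."""
    def rewrite(match):
        original = match.group(2)
        # Percent-encode the whole original URL so it survives as a query value.
        return match.group(1) + proxy_base + quote(original, safe="") + match.group(3)
    return IMG_SRC.sub(rewrite, html)
```

After rewriting, the reader's browser only ever talks to the proxy host, which is what keeps the IP address, cookies, and User-Agent away from the sender.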
Read Mailchimp's post (December 6th):
"Image caching still lowers our ability to track repeat opens, but turning those images on means we’ll be more accurate when tracking unique opens. At least, theoretically it should work that way"
I've seen a couple startups that were working on dynamic email marketing - they fed in the content as an image, e.g. a "one-day promotion", but would change the image content server-side for future email opens to reflect current details. I guess that this breaks that functionality.
You wouldn't get the IP address like you would with conventional bugging, but you could still find out how many users read the mail and what time they did so.
One point nobody has raised yet, though, is that there could be valid use-cases for the sneaky stuff people were doing before. Images generated on the server that reflect current (updated or updating) info could be handy. It might even be worth serving different images to different clients based on user-agent strings. I'm skeptical on both counts, though.
Even the same URL can return a different image. That wouldn't be super useful for tracking, but they can only truly dedupe if they read every response.
They'll cache and own even more of your data and keep it out of the hands of spammers - in turn spammers will have to buy into google to get data about you.
This isn't for us, this was done to make money off of us.
The images they're caching aren't mine, anyway, and in many cases they're unsolicited. Sure there's the evil aspect to this (they own advertising), but there is the potential good of obfuscating your actually private data - the IP you check your mail from, when you check it, anything you send back with an HTTP request - from marketers. On solely that note, I'm all for it. But I'm also one of those who like the new Tabs setup, and rarely loaded images for emails from people I don't know.
You always open our offer e-mails? We'll send you more of the same of what you open, and less of what you don't, increasing the chance you'll find something you like.
Stopping web-bugs from the spammers will improve things, but stopping it from legitimate opt-in marketing mails will make the experience worse and less targeted for people.
The company I work for sends millions of e-mails on behalf of customers. All opt-in, and I spend far more time than I'd like making sure we comply with all expectations of the mail providers and ISPs...
But I'm all for Google proxying and hiding IP, cookies etc - I wrote a webmail solution back in the day, and co-founded a company to run it, and frankly I pretty much assumed Google did this already; we did that back in '99 because it was the obvious thing to do.
On the one hand, their proxy solution has a positive effect for privacy, but on the other hand, the load-by-default setting has a negative effect.
Either way, Google already knows which e-mails you're opening, so using an image proxy is not going to give them "even more of your data" that they didn't already have.
As a Gmail user, that's what I want.
> keep it out of the hands of spammers - in turn spammers will have to buy into google to get data about you.
Again, that's what I want. Spammers depend upon an incredibly low cost of sending emails with almost no accountability through botnets and foreign servers. If they need to open Google advertising accounts, provide a credit number, have to get their ads approved, and get charged market rates for their impressions ... I'm all for it.
But none of them will actually do it. Google won't make a dime from them.
I tried to opt-out of external content in Settings > General but unfortunately it's still loading images.
If a piece of content is delivered to me via the mail, I should be able to open a cached version as many times as I want without any request to the remote server.
And the cached version can be built for me by my mail system, which by ALWAYS fetching the resources protects me.
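To make that concrete, here's a minimal sketch of "always fetch at delivery, serve from cache on open". The `fetch` callable and the in-memory dict are purely my assumptions for illustration; nothing here is known to be how Gmail works.

```python
from typing import Callable, Dict, List


class PrefetchingImageCache:
    """Hypothetical mail-side cache: fetch on delivery, serve locally on open."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch  # assumed to perform the real HTTP GET
        self._store: Dict[str, bytes] = {}

    def on_delivery(self, image_urls: List[str]) -> None:
        # Fetch every image whether or not the message is ever opened,
        # so the request tells the sender nothing about reading behaviour.
        for url in image_urls:
            if url not in self._store:
                self._store[url] = self._fetch(url)

    def on_open(self, url: str) -> bytes:
        # Served from the local copy; the remote server sees no request.
        return self._store[url]
```

With this shape the sender sees exactly one request per delivered image, and opening the message ten times produces no further traffic to them.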
As a question of what SHOULD happen, I think read receipts should be voluntary.
Agreed, and not even the server has to do that, any good e-mail client could (as it seems Gmail is now starting to do).
I'd really hate to be in the bulk email business today. Perhaps Google will sell them back the data they used to create themselves.
Alternatively, if you're not the sort of company that naturally engenders an active community, Twitter. It's not perfect, but it's a far better experience for customers who may want to keep up with what you're doing but don't want a new email to delete every week.
On the other hand, if Google (either now or in the future, crucially) alters the behavior to be smart about pre-caching images, then e-mail marketing is screwed. It will likely make sense at Google to do this, since it will improve the user experience to have the images be pre-fetched to the proxy server before they open an e-mail.
In other words, e-mail marketing vis-a-vis Gmail is now in a Schrödinger's cat-like situation. We can't know whether Google pre-caches images fully, partially, never, or will start doing so in the future, so for all intents and purposes e-mail marketing data is both highly accurate and completely worthless at the same time :)
First they started filtering marketing messages into separate tabs, which I'm assuming dramatically cut readership. Now they're going to make it impossible to "bug" emails for read receipts. The only metric left is the "click".
Email marketing just became a whole lot less valuable.
I recognize that there are a couple of potential downsides to this thought.
1) The time/processing it takes to determine the md5 could be problematic on such a large scale.
2) I have no idea how easy it is to change an image to be unique for each user.
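For what it's worth, the md5 idea could look something like the sketch below. It assumes the response bodies have already been downloaded into a dict keyed by URL (the function name and that input shape are made up for illustration), and it only catches byte-identical images.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def group_urls_by_md5(bodies: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Map MD5 hex digest -> list of URLs whose bodies are byte-identical."""
    groups = defaultdict(list)
    for url, body in bodies.items():
        groups[hashlib.md5(body).hexdigest()].append(url)
    return dict(groups)


def shared_image_urls(bodies: Dict[str, bytes]) -> List[List[str]]:
    """Return only the groups where several distinct URLs serve the same bytes."""
    return [urls for urls in group_urls_by_md5(bodies).values() if len(urls) > 1]
```

On point 2: changing a single byte of the file changes the digest, so a sender who varies the image per recipient would slip past an exact-hash check like this one.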
I wonder if the Promotions tab gets a lot of attention from users? I archive everything in there as fast as possible.
That just isn't true on Gmail. The whole service is served over HTTPS and won't pass referrer information.
Not to mention your IP and whatever other information they feel like embedding in links will still be passed along when you click. So there's still some tracking going on, but they miss out on opened-without-action emails (which is of course useful information to marketers).
GMail serves all images from a datacenter in Mountain View, CA, so if your email's images were served from multiple datacenters or a CDN, there is a good chance they will load more slowly, depending on your caching headers. They optimize images on the fly, which may introduce more latency. Their optimizer doesn't take into account whether the optimized image is smaller than the original, so the image they serve is occasionally larger (and/or looks worse) than the original. The maximum image size seems to be about 10MB.
I probably shouldn't have asserted Mountain View, as it's more of an educated guess.
Edit: here's a traceroute from Hong Kong to the Google server that made the proxy request. Is there a flaw in this method?
@hongkong:~# traceroute 126.96.36.199
traceroute to 188.8.131.52 (184.108.40.206), 30 hops max, 60 byte packets
1 220.127.116.11 (18.104.22.168) 1.168 ms 1.130 ms 1.122 ms
2 22.214.171.124 (126.96.36.199) 1.092 ms 1.058 ms 1.052 ms
3 vl902.edge3.hkg1.rackspace.net (188.8.131.52) 1.307 ms 1.205 ms 1.304 ms
4 RHI-0001.gw2.hkg3.asianetcom.net (184.108.40.206) 1.169 ms 1.148 ms 1.123 ms
5 google.gw2.hkg3.asianetcom.net (220.127.116.11) 1.782 ms 1.755 ms 1.741 ms
6 18.104.22.168 (22.214.171.124) 6.716 ms 126.96.36.199 (188.8.131.52) 17.183 ms 184.108.40.206 (220.127.116.11) 2.105 ms
7 18.104.22.168 (22.214.171.124) 81.019 ms 80.999 ms 80.966 ms
8 126.96.36.199 (188.8.131.52) 53.256 ms 53.077 ms 53.060 ms
9 184.108.40.206 (220.127.116.11) 72.528 ms 72.488 ms 18.104.22.168 (22.214.171.124) 80.903 ms
10 126.96.36.199 (188.8.131.52) 150.251 ms 148.859 ms 184.108.40.206 (220.127.116.11) 149.147 ms
11 18.104.22.168 (22.214.171.124) 215.363 ms 215.368 ms 126.96.36.199 (188.8.131.52) 201.955 ms
12 184.108.40.206 (220.127.116.11) 223.694 ms 18.104.22.168 (22.214.171.124) 211.817 ms 212.567 ms
13 126.96.36.199 (188.8.131.52) 212.505 ms 184.108.40.206 (220.127.116.11) 215.874 ms 213.109 ms
14 * * *
15 google-proxy-66-249-88-203.google.com (18.104.22.168) 211.877 ms 212.484 ms 212.371 ms
The true bottleneck would be the pipe between your procedurally generated image server & Google's server.
Penalize mass emails containing unique identifying image URLs for identical images.
Where identical means virtually identical.
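One way to read "virtually identical" is a perceptual hash rather than an exact one. Below is a rough sketch using a plain average hash; that technique is my substitution, not anything Google is known to use. It assumes the images have already been decoded and shrunk to small grayscale grids, and the distance threshold is an arbitrary guess.

```python
from typing import Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> int:
    """Average hash of a small grayscale grid (values 0-255), one bit per pixel."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        # Each pixel contributes 1 if brighter than the mean, else 0.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


def virtually_identical(p1, p2, max_distance: int = 2) -> bool:
    # max_distance is an assumed threshold; a real system would need to tune it.
    return hamming(average_hash(p1), average_hash(p2)) <= max_distance
```

A tiny per-recipient tweak to a pixel leaves the hash unchanged or nearly so, while a genuinely different image lands far away.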
Surely it's this same technology google themselves use more than anyone else to identify users?
JPEG XR actually adds significant features like OpenEXR- and radiance-compatible HDR encoding, whereas WebP is basically the same old 1980s functionality with better compression.
So while there's something slightly sketchy about doing this, I'd say the world would benefit more from MS doing it than from Google doing it...
[I use gmail and other Google stuff, and have an Android phone, and generally hate MS, but it's very hard to be enthusiastic about WebP...]
As has been well discussed, to maintain privacy they would have to cache always and forever. So large images will definitely add up over time.
I also wonder, even if they have a persistent cache, you might still want to check the Last-Modified and Etag of the URI. I don't think many people embed dynamic images like this, and I'm not sure how most clients would handle it, but it's an interesting corner case.
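For that corner case, the standard HTTP mechanism is a conditional request: send the stored validators back and let the origin answer 304 if nothing changed. A small sketch, where the helper names are mine and the cache is assumed to have stored the `ETag` and `Last-Modified` values from the original response:

```python
from typing import Dict, Optional


def revalidation_headers(etag: Optional[str],
                         last_modified: Optional[str]) -> Dict[str, str]:
    """Build conditional-request headers from the validators stored in the cache."""
    headers: Dict[str, str] = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def cached_copy_still_valid(status_code: int) -> bool:
    # 304 Not Modified: the origin confirms the cached body can be reused.
    return status_code == 304
```

Of course a revalidation request is still a request the sender can see, so when the proxy chooses to send it matters just as much as the original fetch.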
Saying that the proxy is enough to require everyone to opt-out of auto-images may be a bridge too far, especially when there are ways to register your domain so that inline-images ARE automatically displayed, which IMO is what they should be encouraging.
Another way at this would be to find a UI widget which helped users actually understand the possible tracking info they would be giving up to the sender.
Going still further in putting control in the hands of the sender would be a data tag on the IMG which told Google they should cache, and in exchange would result in wider image viewership. Tracking opens, actions, and conversions provides the most important metrics for feedback on improving copy; it's devious for a display ad company to fuck with this on shaky privacy grounds. I guess at least they do provide an opt-out, which will be used by ~0.1% of users...
Remote address: 66.249.x.x [any google ip]
Referer: [not set]
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7 (via ggpht.com)
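If you want to separate these hits in your own logs, a minimal sketch could key off the "via ggpht.com" marker in the User-Agent shown above. The function name and the labels are made up, and Google could change that string at any time, so treat it as a heuristic only.

```python
def classify_image_request(user_agent: str) -> str:
    """Label a logged image request based on the User-Agent marker seen above."""
    # Assumption: the proxy keeps advertising itself with "via ggpht.com".
    if "via ggpht.com" in user_agent:
        return "gmail-proxy"
    return "direct"
```

A "gmail-proxy" hit still tells you the image was requested, but the address and browser details belong to Google rather than to the reader.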
You send me a mail - you've no business being able to track if/when/how I open the envelope, unless I explicitly wish to inform you.
I'm not too keen on the idea of Gmail modifying the body of emails sent to me.
it makes it super simple to enumerate valid email addresses.
patch it to fetch the images on valid+invalid email addresses, then we'll talk