Google Webfonts, the Spy Inside? (fontfeed.com)
75 points by plurby on Feb 20, 2015 | 77 comments



This may be an unpopular sentiment, but here goes.

The hyperbole over this kind of reasoning threatens the very fabric of the Web. Snowden did the world a service in revealing all of the NSA hacking going on, but the paranoia that is resulting from this is breaking the original spirit of the Web.

It is, after all, a Web of links, and those links were intended to be not just between siloed content, but between different sites owned by different people. Links, by their nature, permit tracking. All you need is for sites to pool their web logs and collude; you don't even necessarily need fancy JS tracking.
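
To make the log-pooling point concrete, here is a minimal sketch in Python. The log format, the field names, and the two sites are hypothetical; it only illustrates that joining ordinary access logs on IP address and User-Agent is enough to reconstruct a visitor's path across colluding sites, with no JavaScript involved.

```python
from collections import defaultdict

# Hypothetical, simplified access-log entries: (timestamp, ip, user_agent, path).
SITE_A_LOG = [
    (100, "203.0.113.7", "UA-1", "/articles/fonts"),
    (105, "198.51.100.2", "UA-2", "/"),
]
SITE_B_LOG = [
    (160, "203.0.113.7", "UA-1", "/shop/cart"),
    (170, "192.0.2.9", "UA-3", "/about"),
]


def pool_logs(logs_by_site):
    """Group hits from several sites by (ip, user_agent), ordered by time."""
    visitors = defaultdict(list)
    for site, log in logs_by_site.items():
        for timestamp, ip, user_agent, path in log:
            visitors[(ip, user_agent)].append((timestamp, site, path))
    return {key: sorted(hits) for key, hits in visitors.items()}


def cross_site_visitors(logs_by_site):
    """Keep only the visitors that show up on more than one site."""
    pooled = pool_logs(logs_by_site)
    return {
        key: hits
        for key, hits in pooled.items()
        if len({site for _, site, _ in hits}) > 1
    }


trails = cross_site_visitors(
    {"site-a.example": SITE_A_LOG, "site-b.example": SITE_B_LOG}
)
```

Here `trails` ends up holding a single visitor whose hits on both sites are merged into one time-ordered trail. IP plus User-Agent is a crude key, of course; it is only meant to show that the join itself is trivial.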

When Web 2.0 was ushered in, there was an early euphoria in the community, of everyone offering transparent data and APIs to their sites, and people being able to easily compose content and services between multiple actors to make new sites and services.

It is one of the things that makes the Web better than native -- the ability to compose parts of the Web. No need for stuff like OpenDoc, or other notions of document composition, all you need are URLs and semantic elements that import or interface with external resources.

In the pursuit of paranoia levels of "privacy", what will we lose? Will we balkanize everything into content silos?

I'm not against trying to make things more "private by design", e.g., proxying to scrub requestors before the CDN sees the hit, or replicating resources locally. But if we take this to the extreme, we end up making local copies of everything, and the Web loses some of the semantic information from its graph that I think is valuable to retain.


A simple "noreferrer" (or referer if you like) tag on elements or in pages would solve a lot of this. 3rd parties would obviously still get the request, but they wouldn't know what page it comes from.

Interesting that rel="nofollow" got adopted so quickly for spam. So it shouldn't be hard to have a "noreferrer" tag added, right?

Yes, users can install addons to modify header behaviour, but site designers should be able to use third parties without disclosing things, too. Not just privacy, but security. Currently, apps need to implement a bouncer page to hide sensitive referrers.


> So it shouldn't be hard to have a "noreferrer" tag added, right?

It's so not hard that it's actually already a part of HTML5 and supported by several browsers :-) http://www.w3.org/TR/html5/links.html#rel-noreferrer


Seems like it isn't on link, image, or script elements though, which is the way most third party content gets loaded.


I like this proposal. Although as you say, it only cuts the direct link between the site and the resource.


cromwellian, you are not alone in this. I have been using CDNs for years and now, all of a sudden, they are deemed the root of all evil? I get it though, digital fingerprinting, heartbleed attacks, superfish, those are some serious privacy issues. But this blogpost? It feels like the whole debate about not implementing socialism because it allows for a small percentage of people to abuse the system. What about the amount of energy saved by using a CDN, the decrease in latency, the browser support? Doesn't that count for anything?

This is a serious question: What damage does Google Fonts do to the users visiting my website using these CDNs and does it outweigh the benefits? Why should I take this "Hey, CDNs can deliver JS and therefore are able to serve customized code, which theoretically means they are spying on us" seriously?

More and more, I am starting to believe this privacy thing is turning into mass hysteria and that it's being cleverly spun by some organizations in order to gain traction. Maybe I am missing the point, and if so, please enlighten me.


While I agree with your argument regarding the paranoia, I think what bothers users the most is that this is not just some random site, but Google that the resources are coming from.

Understandably, there are users who would prefer to have much less Google in their online experience.


And here we see the contempt for privacy that some employees of Google hold.

What would you regard as private, pray tell, if it's not being able to access a web page without telling Google (and other advertizing companies) that you're doing so? You regard a pursuit for that freedom as "paranoid"?

Linking is the great power of the Web, and is why it is what it is today. That's all. Scripting is sometimes useful, but more often than not, it's used to enable an industry of services-as-software-substitutes ([1]) to thrive. Cross-site resource requests are not important or valuable (I think they're detrimental), and they are totally replaceable anyway, as you mentioned. As HTTP2 becomes more commonplace, cross-site requests will be replaced in favor of same-site requests. I look forward to that.

[1]: https://www.gnu.org/philosophy/who-does-that-server-really-s...


Oh come on, that first sentence is uncalled for.

Cromwellian (while awesome, and someone who never fails to impress me with his writing) is not speaking for Google, or even other Googlers. He is speaking for himself. As is his right, I'd hope you'd agree, even if you (like many) would disagree with some of the things he writes.

(Edit: you changed the first sentence. Which reads better, thank you. Though I would actually still make the case it's far less contemptuous regarding privacy than you suggest. Worth reading deeply, since I think what he is saying is nuanced.)


My first sentence does not imply that he's speaking for Google, or other Google employees. cromwellian is an employee of Google, and I would regard his opinion here as being contemptuous of privacy.

Edit: okay, I see the implication now of me referring to all Google employees in that sentence. I had intended for the plural to refer to "more than one", which I think is a safe bet - but it could also be construed as referring to "all" employees. I've qualified the sentence with "some".


>>> And here we see the contempt for privacy that employees of Google hold

Um... yes it did. You're trying to portray all of the employees of Google as being against privacy, and that's just simply not the case.

Besides that, the argument cromwellian was making is hardly unique to Googlers.


I've held this basic view of the Web far longer than I've been a Google employee (http://timepedia.blogspot.com/2008/05/decentralizing-web.htm...)

I wrote one of the first anonymizing proxy servers for the Web (http://cypherpunks.venona.com/archive/1996/02/msg00885.html) which was later referenced by others (Ian Goldberg references it here: http://www.cs.berkeley.edu/~daw/papers/privacy-compcon97-www...)

In the early days of Cypherpunks, I collaborated with Hal Finney, one of the founders of the technology behind BitCoin (http://cryptome.org/2014/09/hal-finney-cpunks-1992.htm) In fact, I sold a startup in 2000 that was based on HashCash, the forerunner to Reliable Proof Of Work/Blockchain.

I wrote one of the first Shamir sharing utilities for Unix, Cryptosplit. I authored one of the first Remailer 2.0 proposals on Cypherpunks, on ways for networks of PGP remailers to defeat traffic analysis. I wrote anonymous forwarding software and, later, double-blind anonymous mailing list software where neither the recipients of the list nor the address of the mailing list itself are known. (http://cypherpunks.venona.com/archive/1993/09/msg00509.html)

I have been involved in cryptography and privacy since the mid 90s and I care deeply about it. But I am not an extremist. Just like I believe in capitalism, but I am not a libertarian/Objectivist/anarcho-capitalist, and I tend towards progressivism and regulation as reasonable requirements.

There is a fundamental tension between transparency and privacy.

We are heading into a scary world where the cost of cameras, microphones, and networking is going to zero, and the size is tending to zero. That means tracking will be cheap and ubiquitous. We will need to find a way of dealing with the implications of this, without going to live in a log cabin in the woods. Some of that is technological, some of it will be political/legal, and some of it will be cultural.

I love the Web, it's the greatest human invention since the printing press, but I fear for the balkanization of it, and the Internet. We need to tread carefully and not go overboard in being reactionary, lest we hurt the thing we love.

This is not being "contemptuous" of privacy. It's considering the tradeoffs, looking at the threat model, and looking at the cost/benefits of various levels of privacy protection, all the way from "none" to "perfect privacy", and what the repercussions of that might be.


Aside from your abstract commentary, what you have here is a defense of your argument that the web will lose something valuable if more sites stop directing their visitors' browsers to send requests to advertizing companies and CDNs for resources. I think that's baloney - the web will be better off for it, because it will be faster, more private, and simpler.

No semantic information is lost (except for the semantic information in Google's profile graph - let me play my violin). There's no balkanization, because there's no noticeable difference to end-users (which is why cross-site requests for things like fonts are so nefarious).

The web would provide all the value it currently does, because that value is founded entirely on linking.

You seem to maintain that wanting to achieve private browsing is "paranoid". Can you expand on this belief?


In this particular case, I'm not particularly arguing against it, just in general, the way I see things going.

There are lots of other promising ways that people compose Web services beyond this issue with fonts, services like Stripe or Geo, technologies like the upcoming Web Components, embedding media like Tweets, where I don't particularly think we will be served well by a paranoid model.

Your model of blue-links-only almost entirely prevents the kinds of service composition that almost all sites engage in these days.

It's also not clear it's a net win for speed or security. CDN sites are likely significantly more hardened than most regular sites, and most regular sites don't necessarily scale, or don't want to pay to scale, to reach top performance. That means people cut corners.


Your argument parallels the one between static and dynamic linking. Using 3rd party fonts, particularly those of a known personal-metadata hoarder, both makes the web structurally brittle and unduly trades the visitor's metadata to a 3rd party. Pages are only faster, not semantically better, when content is served from a third party.

I too prefer my pages to be statically linked.


I'm not speaking in any official capacity, but to at least get the conversation started off with data, here's Google's public FAQ regarding the Fonts API privacy policy:

  https://developers.google.com/fonts/faq#Privacy
What does using the Google Fonts API mean for the privacy of my users?

The Google Fonts API is designed to limit the collection, storage, and use of end-user data to what is needed to serve fonts efficiently.

Use of Google Fonts is unauthenticated. No cookies are sent by website visitors to the Fonts API. Requests to the Google Fonts API are made to resource-specific domains, such as fonts.googleapis.com, googleusercontent.com, or gstatic.com, so that your requests for fonts are separate from and do not contain any credentials you send to google.com while using other Google services that are authenticated, such as Gmail.

In order to serve fonts as quickly and efficiently as possible with the fewest requests, we cache all requests made to our servers so that your browser only contacts us when it needs to.

Requests for CSS assets are cached for 1 day. This allows us to update a stylesheet to point to a new version of a font file when it’s updated. This ensures that all visitors to websites using fonts hosted by the Google Fonts API will see the latest fonts within 24 hours of their release.

The font files themselves are cached for one year, which is long enough that the entire web gets substantially faster: When millions of websites all link to the same fonts, they are cached after visiting the first website and appear instantly on all other subsequently visited sites. We do sometimes update font files to reduce their file size, increase coverage of languages, and improve the quality of their design. The result is that website visitors send very few requests to Google: we only see 1 CSS request per font family, per day, per browser.

We do log records of the CSS and the font file requests, and access to this data is on a need-to-know basis and kept secure. We keep aggregated usage numbers to track how popular font families are, and we publish these aggregates in the Google Fonts Analytics site. From the Google web crawl, we detect which websites are using Google Fonts, and publish this in the Google Fonts BigQuery database. To learn more about the information Google collects and how it is used and secured, see Google's Privacy Policy.

For further technical discussion of how Google Fonts serves billions of fonts a day to make the web faster, see this earlier tech talk from the Google Developers YouTube channel.
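
The caching numbers in the FAQ text above can be checked with a toy simulation. This is only a sketch in Python: the cache class, the `fonts.example` URLs, and the visit pattern are hypothetical, and a real browser cache has many more rules (eviction, revalidation, private browsing). The two lifetimes are the ones the FAQ states.

```python
DAY = 24 * 60 * 60
CSS_MAX_AGE = 1 * DAY        # stylesheet lifetime stated in the FAQ
FONT_MAX_AGE = 365 * DAY     # font file lifetime stated in the FAQ


class BrowserCache:
    """Toy HTTP cache keyed by URL, honouring only a max-age lifetime."""

    def __init__(self):
        self._expires_at = {}
        self.requests_sent = []  # (time, url) pairs that reached the server

    def fetch(self, url, now, max_age):
        # Serve from cache while the stored entry is still fresh.
        if self._expires_at.get(url, -1) > now:
            return "cache"
        self.requests_sent.append((now, url))
        self._expires_at[url] = now + max_age
        return "network"


def visit_page(cache, now):
    """Load the (hypothetical) stylesheet and font file a page links to."""
    cache.fetch("https://fonts.example/css?family=Open+Sans", now, CSS_MAX_AGE)
    cache.fetch("https://fonts.example/open-sans.woff", now, FONT_MAX_AGE)


cache = BrowserCache()
# Ten page views per day for three days, on any sites using the same font URLs.
for day in range(3):
    for view in range(10):
        visit_page(cache, now=day * DAY + view * 60)

css_requests = [url for _, url in cache.requests_sent if "css" in url]
font_requests = [url for _, url in cache.requests_sent if url.endswith(".woff")]
```

Under these assumptions, thirty page views produce three stylesheet requests (one per day) and a single font file request, which matches the "1 CSS request per font family, per day, per browser" claim.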


This particular issue has come up in previous HN discussions, but I would draw people's attention to innocuous and quite reasonable-sounding phrases like "need-to-know basis." What does that really mean for a company like Google, whose core business model fundamentally depends on extensively data-mining user information? "Need-to-know" could mean almost anything, or whatever Google wants it to mean. This is a classic Google privacy strategy: controlling the debate by defining the terms.

Despite a reassuring policy, you, as the website visitor, don't get to decide these things and to the extent possible, the fact that this is even happening is abstracted away from most non-technical users.

Another example of Google's brilliance in 'controlling the debate by defining the terms': policies like this cleverly (but wrongly) lead the reader to assume that cookies are the only way Google tracks users or correlates their activities. What about TLS-based tracking mechanisms, for example?

But this is a problem that's bigger than Google. When information accumulates in distinct places, the value of exploiting that information always increases. Eavesdroppers naturally move to those places to exploit that information, sometimes with a legal backing (NSA/GCHQ) and sometimes without one (Aurora attacks, and other NSA/GCHQ activities).

Even if you interpret Google's pronouncements charitably, it would be a mistake to assume that using the Google Fonts API can't or won't harm user privacy. Google is a massive target for essentially all eavesdroppers, and the Aurora attacks (and other breaches with lower profiles) show that the accumulation of information--even under reasonable-sounding terms like Google's--can still end up in the wrong hands, and can be an inherently dangerous thing for user privacy.


This isn't my area of ownership, (and I actually agree with a number of things you said), but as far as I know it is exactly why Google goes out of its way to NOT retain those logs, and to explicitly NOT serve this traffic off of a domain that handles user cookies or other PII (i.e., separate by design from search or gmail, etc).

It seems the original author didn't understand this, so it's worth calling out here clearly.


(not talking for google) Two quick points:

- The fonts have to be hosted somewhere. And the more common the hosting site is, the better the browser cache behavior is.

- The cache behavior prevents requests from going out. If the font is cached, then there's no web request going back to google. And there's no web request on the wire for NSA/GCHQ/Verizon to sniff.

As for the terminology, I personally think that there should be some standards for defining the terminology and criteria, so that we can get human-readable privacy policies without getting uselessly vague, into a discussion of how some backend systems work, or into a giant mess of legalese.


It really depends on the relevant counterfactual; yours makes total sense from the vantage point of lots of developers, but I tend to prioritize privacy and autonomy. When I visit catphotos.wordpress.com, my intention is not to leak information to Google even though they have great fonts. My intention is just to visit the website.

So the counterfactual I would frame the discussion with would be something more like self-hosting fonts by default and prioritizing privacy over performance (different strokes for different folks, and I realize it can be a significant performance hit).

To respond to your "more common the hosting site is" comment, Wordpress is also extremely common, and they probably could have devised alternative solutions by making different trade-offs.

Cache behavior resulting in fewer requests can be a double-edged sword, too: if you cache fonts with clients, you're probably also caching a bunch of other things that may decrease your privacy in other ways. There are many layers of indirection, especially with NSA/GCHQ/Verizon.

I wouldn't argue that this and other services offered by Google don't add value for developers and even users (they absolutely do), but my argument is mainly that there are costs--maybe distant/abstract/indirect costs in terms of privacy/autonomy that are difficult to discuss in concrete terms, but costs worth considering nonetheless.

I wish WordPress had been more thoughtful about the trade-offs they made.


> I wish WordPress had been more thoughtful about the trade-offs they made

More accurately, you wish that WordPress had agreed with your priorities. They clearly did think about this and made a different decision and it's unfair to suggest otherwise.


I agree with everything you say. I think that we're still very early in developing acceptable norms for privacy -- we'll sadly have to have real collateral damage before people wake up to it.

I don't know how to proceed in developing the terminology, calculus, and as a result, standards and norms for good privacy without going either "screw it all, your reality is now public information" or "pre-paid gsm phone modem to tor/privoxy". It's the middle bit that has the reasonable space in there, but it's hard to track down and there are certainly different reasonable spaces there for different people.

Ugh.


If you don't want Google to know about catphotos, you can download all the fonts from google every day, so they stay in your cache and don't relate at all to your browsing.


> Google, whose core business model fundamentally depends on extensive data-mining of user information?

Does it? Try Googling from an incognito window on your neighbour's wifi. Use a live distro if you want to be completely sure. Are the results significantly different? Are ads any worse?

I have tried a couple of tests like that (on other people's devices, etc) and the only noticeable use of that trove of data Google has about me is suggested searches from my search history. Google Now can also pull an article of interest every once in a while the same way.

That doesn't mean they don't have the data and won't cough it up on government's request but there seems to be very little effective mining going on.


Have a look: https://www.google.com/settings/u/0/ads

If you don't trust that, then a great way to find out what any advertising company knows about you is to act like an advertiser and look at what user data you can get. I don't mean call up google advertising and pretending that you're head of marketing at $BIGCOMPANY (you could, but it's not what I meant), but try looking at the product pages made for advertisers.


Your comments seem to underscore how Google controls the debate by defining the terms on privacy issues. You seem to be saying "Concerned about privacy? You could check up on Google by acting like one of its advertising customers.." which sort of highlights the parties Google is most interested in being transparent with (for completely understandable business reasons).

But of course in order to do that, you also have to have a login, cookie, etc. for a panel that Google controls, and that exposes only a tiny subset of the information it could most obviously and trivially correlate about user activities.

These just aren't really things people trying to visit wordpress sites should have to consider..


Please see my other comments on this thread about privacy terminology, norms, etc. But yes, I think that when considering the debate, one should look at both sides for an honest understanding of what's going on.

Tinfoil-hatting a login for intel gathering's a stretch. Please use whatever you feel necessary (e.g., TAILS) for keeping your privacy while doing a little recon on what google's offering about its users to advertisers.

Ultimately, wordpress chose to refer to google for font loading. That's their choice and right, and it's what people reading wordpress will have to deal with.


On the face of things, concern over this type of 'privacy violation' seems to be reasonable. However, coming from a page that is loading content from Fontfeed, Twitter, Gravatar, Google APIs, Fontshop and lo...Google Analytics, I think it's a bit of a silly argument.

If you want privacy, don't expect the sites you are hitting to take care of that for you. If you expect others to enforce security for you, well then, you have an entirely different problem.


This doesn't imply that, as a webmaster, you should be fine with asking browsers to load external resources from third parties with potential privacy implications, just because privacy-conscious users should have disabled it by themselves.


I'm not sure, as a webmaster, that I have any better control over my users' data than Google's font service does. If I think I do, I'm probably naive.


A good way for Google to address this would be by enabling CORS and encouraging the use of crossorigin=anonymous to avoid credentials being sent for fonts:

<link href='http://fonts.googleapis.com/css?family=Open+Sans' rel='stylesheet' type='text/css' crossorigin='anonymous'>

Unfortunately, a quick test (http://chris.improbable.org/experiments/browser/webfonts/goo...) shows that this can't be done currently because fonts.googleapis.com doesn't send an Access-Control-Allow-Origin header:

https://redbot.org/?uri=http%3A%2F%2Ffonts.googleapis.com%2F...

(Oddly, the actual fonts are served with "Access-Control-Allow-Origin: *" so it works if you self-host the CSS, which would presumably be a bad idea: https://redbot.org/?uri=http%3A%2F%2Ffonts.gstatic.com%2Fs%2...)

This is the behaviour defined in the HTML5 spec:

https://html.spec.whatwg.org/multipage/infrastructure.html#c...

In some ways, this feels like an oversight in the spec because crossorigin=anonymous is actually better than the legacy behaviour but any use of the crossorigin attribute triggers mandatory full CORS checks.
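
For readers who haven't looked at CORS closely, here is a simplified sketch in Python of the check a browser applies once the crossorigin attribute is present. The function name and the sample header dictionaries are hypothetical (they mirror what this comment reports, not live responses), and the real algorithm has more steps than this.

```python
def cors_allows(response_headers, page_origin, with_credentials=False):
    """Simplified CORS check on a response to a cross-origin request."""
    headers = {
        name.lower(): value.strip() for name, value in response_headers.items()
    }
    allow_origin = headers.get("access-control-allow-origin")
    if allow_origin is None:
        # No header at all: the response is blocked in CORS mode.
        return False
    if with_credentials:
        # Credentialed requests need an exact origin match, never "*".
        return (
            allow_origin == page_origin
            and headers.get("access-control-allow-credentials") == "true"
        )
    # Anonymous requests accept either the wildcard or the exact origin.
    return allow_origin in ("*", page_origin)


# Hypothetical responses shaped like the ones described above.
CSS_RESPONSE = {"Content-Type": "text/css"}
FONT_RESPONSE = {"Access-Control-Allow-Origin": "*"}

css_ok = cors_allows(CSS_RESPONSE, "https://blog.example")
font_ok = cors_allows(FONT_RESPONSE, "https://blog.example")
```

With these inputs the stylesheet is rejected and the font file is accepted, which is the asymmetry the test page ran into.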


Google has little to gain from these.


Google has an interesting position advocating for improved privacy and security. This would be a cheap way for them to back that up at relatively minimal expense.


I use NoScript and Policeman on Firefox, with conservative settings (disallow all active content (scripts, fonts, WebGL), whitelist-only cross-site requests). I've also configured Firefox to block cookies by default; only permitted sites can store cookies for the session, and just a handful I allow permanent cookies.
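
As a rough idea of what a whitelist-only cross-site policy boils down to, here is a small Python sketch. It is not how NoScript or Policeman are implemented; the function names and hostnames are made up, and the same-site test is deliberately naive (a real one would consult the public suffix list).

```python
def is_same_site(page_host, request_host):
    """Naive same-site test: identical host or a subdomain of the page host."""
    return request_host == page_host or request_host.endswith("." + page_host)


def allow_request(page_host, request_host, whitelist):
    """Allow same-site requests; cross-site ones only if explicitly listed."""
    if is_same_site(page_host, request_host):
        return True
    return (page_host, request_host) in whitelist


# Hypothetical rule: this one blog may load assets from its own CDN host.
WHITELIST = {("blog.example", "cdn.blog-assets.example")}

decisions = {
    host: allow_request("blog.example", host, WHITELIST)
    for host in (
        "blog.example",
        "static.blog.example",
        "cdn.blog-assets.example",
        "fonts.thirdparty.example",
    )
}
```

Everything not on the page's own site and not on the list is simply never requested, so the third party never learns the visit happened.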

Web pages load much quicker, Firefox uses fewer resources, my browsing is significantly more secure (see [1] for risk of loading arbitrary fonts), and I can browse the web without Google/Facebook/AdvertizingCorp (and thus the Five Eyes) building a profile of everything I do. It's a nice feeling.

This setup also blocks ads served from third parties, which I feel is an agreeable compromise on web advertising. If I send a request to your website, and you send me a document with embedded images stored on your website, I'll download them and view them alongside the page. However, if you try to tell me "go send 5 insecure requests to each of these three companies you've never heard of, and execute their 20KB of code, to get flashing ads alongside this page" - I'll ignore you.

Sites loading resources from external domains (usually Google) is nothing new. I've been browsing this way for two years now, and I've developed a healthy level of contempt for 95% of web developers. The vast majority of them just don't care for their users; campaigning to get the developers to change their habits is a broken model. Ultimately, you have to take control, and decide for yourself what you want to run on your computer.

I don't know why more people don't browse this way; some actually ridicule this approach ("get with the times"). It boggles the mind.

[1]: https://hackademix.net/2010/03/24/why-noscript-blocks-web-fo...


The real quote there being:

" It really worries me that the FreeType font library is now being made to accept untrusted content from the web.

The library probably wasn't written under the assumption that it would be fed much more than local fonts from trusted vendors who are already installing arbitrary executable on a computer, and it's already had a handful of vulnerabilities found in it shortly after it first saw use in Firefox.

It is a very large library that actually includes a virtual machine that has been rewritten from pascal to single-threaded non-reentrant C to reentrant C... The code is extremely hairy and hard to review, especially for the VM.

"

FreeType's news page (http://www.freetype.org/index.html#news) has something very curious: two fixes for the same CVE, with the second fix coming 9 months later. A look at its CVEs [1] is also interesting, in that they're all memory safety issues (at least, from a quick glance). So in 2014, it's still difficult to read fonts without exposing yourself to code execution vulnerabilities, eh? I'd imagine better languages would help here.

1: http://web.nvd.nist.gov/view/vuln/search-results?adv_search=...


I'm not using an external service for fonts anymore because it's blocking. That means your time to first render is directly impacted by the time it takes your users to download fonts from a third party.

And have you ever landed on a site that is fully rendered except that you can't see the text? Chances are high that it's a third-party font that can't be downloaded for whatever reason.


Another good reason to install Privoxy - http://www.privoxy.org/

Add the following to your actions file (for example user.action) and you'll still be able to retrieve fonts and other shared stuff from Google's servers, but it'll block any tracking cookies and hide the referring site:

  { +crunch-incoming-cookies \
    +crunch-outgoing-cookies \
    +hide-referer{forge} }
  .googleapis.com
  apis.google.com
Unfortunately, it won't help against SSL sites.
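
As a rough illustration of what those three actions do to the traffic, here is a sketch in Python. It is not Privoxy code; the function names and sample headers are made up, and it assumes the forged Referer is the root of the site being requested and that only an existing Referer header is rewritten.

```python
from urllib.parse import urlsplit


def scrub_request(url, headers):
    """Drop outgoing cookies and point any Referer at the target's own root."""
    parts = urlsplit(url)
    forged_referer = f"{parts.scheme}://{parts.netloc}/"
    cleaned = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered == "cookie":
            continue  # crunch-outgoing-cookies
        # hide-referer with "forge": replace the real referring page.
        cleaned[name] = forged_referer if lowered == "referer" else value
    return cleaned


def scrub_response(headers):
    """Drop Set-Cookie so the server cannot store cookies in the browser."""
    return {n: v for n, v in headers.items() if n.lower() != "set-cookie"}


request = scrub_request(
    "http://fonts.googleapis.com/css?family=Open+Sans",
    {
        "Referer": "http://catphotos.example/private-album",
        "Cookie": "id=abc123",
        "Accept": "text/css",
    },
)
response = scrub_response({"Set-Cookie": "id=abc123", "Content-Type": "text/css"})
```

The font server still sees the request and the client's IP address; it just no longer sees which page triggered it or any cookie to tie requests together.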


> Unfortunately, it won't help against SSL sites.

Proxomitron will, although it's not open-source, its author has passed away, and it's only being maintained by the community. Among the things I use it to block are these "unexpected links to Google" and if it's something like jQuery or fonts I can have the proxy host it locally.


Aren't these web fonts just files they can include with their code? Why include anything from any 3rd party, it's a security and privacy issue.


Using font files from a popular public CDN like Google Fonts is a good idea, as they are generally highly available and are often already cached on the user's machine from use on other sites.


Keep in mind the Google CDN is blocked in countries such as China, so your web fonts are not going to render for those visitors, and if you rely on jQuery from the Google CDN, those visitors will experience a broken site. If you have a global reach, this is one reason to self host.


Interestingly, that's true for scripts but it's not for webfonts because @font-face allows multiple sources, including local system fonts, and the browser will keep trying until it finds one which works.

Here's a test page: http://chris.improbable.org/experiments/browser/webfonts/web...

You can see that it first gets a 404 before continuing on to a URL which works:

http://www.webpagetest.org/result/150221_J4_43K/1/details/
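
The fallback behaviour boils down to "walk the source list until something loads". A tiny Python sketch of that idea follows; the source names and the fake loader are hypothetical, and real browsers also check format hints and local font availability.

```python
def first_usable_source(sources, load):
    """Return the first (source, data) pair whose load attempt succeeds."""
    for source in sources:
        data = load(source)
        if data is not None:
            return source, data
    return None, None


# Hypothetical outcomes: no local copy, first URL returns 404, second works.
FAKE_RESULTS = {
    "local(Open Sans)": None,
    "https://blocked-cdn.example/open-sans.woff": None,
    "https://my-site.example/fonts/open-sans.woff": b"font-bytes",
}
attempts = []


def fake_load(source):
    """Record each attempt and return canned data (None means failure)."""
    attempts.append(source)
    return FAKE_RESULTS[source]


chosen, data = first_usable_source(list(FAKE_RESULTS), fake_load)
```

A script tag has no equivalent list of alternatives, which is why a blocked script CDN breaks the page while a blocked font CDN only costs a failed request.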


Steve Souders from Google goes through dealing with this [1], and I was surprised to see how Twitter approached it (basically encouraging putting a synchronous JavaScript link at the top of the page).

[1]: https://www.youtube.com/watch?v=aHDNmTpqi7w


  <script src="//code.jquery.com/jquery-1.9.1.min.js"></script>
  <script>
    window.jQuery || document.write('<script src="/js/jquery-1.9.1.min.js"><\/script>');
  </script>

This has to be pretty standard in 2015 right?


This still has to time out on the first request to trigger, right?


Yes – you could do something creative with async scripts but then you're transferring too much data for everyone to benefit only the subset of users who experience failures on the first script.


Kbar, why do you need "highly available" fonts when you can bundle them in your web site? If the web site is up, fonts will work, if not fonts won't be needed anyway.

Regarding caching, does anyone know how browsers cache content? I.e. if I host my own fonts and someone visits me, then visits another web site with the same fonts, are they retrieved from the cache or downloaded yet again? I'm guessing they are downloaded again, which is unfortunate.


> I'm guessing they are downloaded again which is unfortunate..

Which is exactly the reason to use a web font CDN like Google Fonts. I have Open Sans on my computer, and I'll redownload it again in a year when the cache expires. If each Wordpress blog started including their own copy, I'd have to redownload it each time.


AFAIK, the fonts are hosted as regular URLs with cache policies specified in HTTP headers. So, if you host your own fonts, and someone visits your site and someone else's site with the same fonts, they will download it twice (unless the other site's referring to your site in the URL). The browser doesn't know that they're the same font until after it downloads it (twice).


Fonts are a small subset of a site that makes a conscious decision to use a third-party font. There are many other resources being downloaded, which may or may not be duplicated. If the sites are revisited, they will be cached.


Safari doesn't deal well with web-loaded fonts. I have seen Safari load corrupted webfonts from cache even when turning on "Don't Use Cache for Anything" in the Developer Tools menu. ... so there's always that.


If they're not being loaded from the same url (e.g. from Google), then they're not the same fonts as far as the browser can tell.


It's a pity that there isn't currently a way to leverage cached resources with different URLs for the same content.

In fact this would not be hard: if you could indicate, with the resource URL, the hash of the resource content, the browser could just use the resource with that hash if it has it in cache (no matter from which URL), and otherwise retrieve it, check the hash, and add it to the cache.
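
A minimal sketch of that idea in Python, assuming SHA-256 as the hash. The cache class, the URLs, and the fake download function are all hypothetical; this is not an existing browser feature.

```python
import hashlib


class ContentAddressedCache:
    """Toy cache keyed by the content hash instead of the URL."""

    def __init__(self):
        self._by_hash = {}
        self.downloads = 0

    def get(self, url, expected_sha256, download):
        cached = self._by_hash.get(expected_sha256)
        if cached is not None:
            return cached  # same bytes already fetched, from whatever URL
        body = download(url)
        self.downloads += 1
        actual = hashlib.sha256(body).hexdigest()
        if actual != expected_sha256:
            # Never cache or use content that does not match the declared hash.
            raise ValueError(f"hash mismatch for {url}")
        self._by_hash[actual] = body
        return body


FONT_BYTES = b"pretend this is open-sans.woff"
FONT_HASH = hashlib.sha256(FONT_BYTES).hexdigest()


def fake_download(url):
    """Stand-in for a network fetch; both sites host identical bytes."""
    return FONT_BYTES


cache = ContentAddressedCache()
first = cache.get("https://site-a.example/fonts/open-sans.woff", FONT_HASH, fake_download)
second = cache.get("https://site-b.example/assets/os.woff", FONT_HASH, fake_download)
```

The second site gets the font from cache without any request, even though its URL has never been seen before. The hash check on download is what keeps one site from poisoning the cache for another.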


To be fair, what happens if there's a hash collision?


Use a cryptographic hash function, and there will be no collisions. There are a lot of things that rely on the assumption that hash collisions do not happen.
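
Strictly speaking, collisions are not impossible, just astronomically unlikely. A quick back-of-the-envelope sketch in Python, assuming an ideal 256-bit hash with uniformly random outputs and using the simple union bound over all pairs (it says nothing about deliberate attacks on a broken hash function):

```python
from fractions import Fraction


def collision_upper_bound(num_items, hash_bits=256):
    """Union bound: P(any collision) <= (number of pairs) / 2**hash_bits."""
    pairs = num_items * (num_items - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)


# Even with 2**64 distinct cached resources, the bound stays below 2**-128.
bound = collision_upper_bound(2 ** 64)
below_threshold = bound < Fraction(1, 2 ** 128)
```

So for accidental collisions the assumption is safe in practice; the real requirement is picking a hash function that has no known practical collision attacks.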


No, Google Web Fonts tailors the served CSS to the browser's user agent. Right now, Firefox receives 'woff' with all requested languages combined in a single file, while recent Chrome receives 'woff2' with all available languages in separate files. Try visiting http://fonts.googleapis.com/css?family=Open+Sans:400&subset=... in different browsers.


For the end user, third-party fonts can be blocked via Adblock Plus:

https://secure.fanboy.co.nz/filters.html


I've found ABP's filters to be sporadic in their effectiveness. Open up the network inspector tab and browse to theguardian.com and others - look through what cross-site requests are still getting through.

I detail what I use in this comment [1], but I still set up friends and family with ABP because it's easier to use. I just find I have to manually blacklist a lot of domains to get it to actually work. You only need to make one request to Google for them to know what page you're on.

[1]: https://news.ycombinator.com/item?id=9083895


Or Ghostery, which only blocks trackers, but allows privacy-friendly ads.


"Privacy-friendly ads"? All ads need to track on some level, so it only depends on which company you trust more.


What the hell is wrong with the fonts my browser already has installed?


Going through exactly this when trying to optimize a site for a large company. Designers ended up with a little cursive font for headings. Getting approval from bigcorp takes forever. So many months in, that's the design.

There's no cursive-looking font that is commonly available on all machines. So we either have to use a web font, or render images. Since it's used in more than a couple of places, the font ends up taking less time.

Or I could try fighting the design, customer relationship, corporate branding, etc. teams to convince them the site's better off with just "Sans-Serif". In short: fonts aren't going anywhere.

Now, it would be nice if a dozen or two fonts of varying styles were included by default with browsers. But I imagine that'd just lead to designers raging about how <someone> is trying to limit creativity and enforce conformity.

FFS, many fonts don't even render properly on Firefox on Windows (like on medium.com) so I highly doubt usability is being considered as a main priority here.


Is this only a font problem? Many sites serve jQuery and other files from Google's CDN. Isn't that the same problem? There are also other widely used CDNs that could affect your privacy. Won't you struggle to get very far on the web if you want to avoid all of these?


This seems to be a growing problem, even among so-called 'privacy advocates.' The last time I checked, EFF's Privacy Badger extension was designed around having no qualms about making exactly these types of bad trade-offs on users' behalf.

The cost of your privacy--even in the eyes of the EFF--sometimes is worth little more than reliably serving a font or a copy of jQuery and claiming to respect 'Do Not Track'.


I do find it odd that WordPress would call these fonts from the authenticated section of the site. If it were just a bundled theme or plugin served publicly I wouldn't really see the big deal; it is the web, after all. But in my opinion authenticated sessions are authenticated for a reason, so requesting assets from an unauthenticated resource does seem to be a concern. Just bundle the font!


I still don't understand why the major browsers don't ship with, at the very least, a copy of jQuery installed locally, and then create a way to substitute the locally installed version for that URL. No request made, faster access times. Is there any downside?


If sites are reliably hitting a major CDN (like Google) for jQuery, then you get that advantage through caching anyway. The problem is that they don't, and if they're hosting their own jquery.js, there's no way to know before you download it that the script can safely be replaced with the known jQuery. I can imagine a scheme where the browser sends a hash of what it thinks the file is and the server only sends new content if it's different, but that would be a massive change, probably at the protocol level, to standardize, and doesn't do you any good if the bottleneck is a server that's slow to respond.


You've just reinvented HTTP ETags. :) It's a cookie-like resource caching mechanism where the server can return an arbitrary value (usually a hash or timestamp).

https://en.wikipedia.org/wiki/HTTP_ETag


Well, no. There is no requirement in the standard for how ETags are implemented, so there is no way to use them other than as server-specific opaque tokens. They are useless for the parent's concept.
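To make the distinction in this exchange concrete, here is a toy simulation of an ETag revalidation against a single origin. The Origin class and the hash-derived tag are assumptions for the sketch; only the If-None-Match / 304 Not Modified behaviour is taken from HTTP itself.

```python
import hashlib


class Origin:
    """Toy server for one origin; how it builds ETags is its own business."""

    def __init__(self, resources):
        self._resources = resources  # path -> bytes

    def _etag(self, body):
        # Assumption: this server derives the tag from a content hash.
        # Another server could just as well use a timestamp or a counter.
        return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

    def get(self, path, if_none_match=None):
        body = self._resources[path]
        etag = self._etag(body)
        if if_none_match == etag:
            return 304, etag, b""    # Not Modified: client reuses its copy
        return 200, etag, body


origin = Origin({"/jquery.js": b"/* pretend this is jQuery */"})

status, etag, body = origin.get("/jquery.js")             # first visit
status2, _, body2 = origin.get("/jquery.js", if_none_match=etag)
print(status, status2)  # 200 304
```

The revalidation saves the body transfer, but it still costs a round trip to that specific server, and the tag only means something for that URL on that origin. It says nothing about whether another site's jquery.js is the same file.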


An even better reason to avoid using Google fonts is that they're frequently slow as molasses. Part of the issue is that fonts are frequently used badly and that browsers often don't handle them well, but it's still one of the more annoying things online.


Can you not just put 127.0.0.1 fonts.googleapis.com in your hosts file?


Thank you for making me aware of this insanity. I'll make sure to block those on my sites. The thoughtless denial of privacy is so weird, no one seems to mind letting third parties spy on their visitors. Yes, you, Google, jquery, cloudflare, typekit, gravatar, disqus and whatever your names might be.


The only person that really cares about your privacy is you. It's cognitive dissonance to believe otherwise.


And yet plenty of people with the knowledge and means to ensure their privacy online and elsewhere get spectacularly vocal about the privacy of those who have neither.


I think we should get spectacularly vocal about protecting people-who-don't-know-better's privacy rights. We created this damn thing called the Internet, and sold it to them as this awesome tool. We should at least try to secure it better.

It is my belief that decentralizing services is one possible solution to this problem. At a minimum, having users store their data at home on servers they plug in and turn on may be the solution. We're a ways out from that, but I'd much rather see people pushing the argument toward decentralization than faulting a website owner for using nice-looking fonts for their gluten-free pie crust recipe. Everyone knows what happens when hackers get access to your pie crust. They eat it.


You don't know what a CDN is, do you?


I do know what a CDN is.



