If you use third-party CDNs, please consider implementing a client-side failover strategy so you don't leave out 50% of the Internet "population".
Not sure if that works properly in China, or if it just spins. It might never 'fail' and fall back. I'd need to test that.
src: local('Slabo 13px'), local('Slabo13px-Regular'),
     url(https://fonts.gstatic.com/s/slabo13px/v3/B9U01_cNwYDvIHK04hX...) format('woff2'),
     url(https://fonts.gstatic.com/s/slabo13px/v3/fScGOqovO8xyProgHUR...) format('woff2'),
     url("/fonts/Slabo13px-Regular.ttf");
Unfortunately, with HTTPS I need to bypass HSTS protection, and in Firefox that's annoying (I don't know about Chrome).
For the fonts I tried to create a self-hosted mirror too, but Google does not offer downloads of the exact font files they host.
This is because the loaded resource will 'fail' anywhere from a few seconds to minutes later, or never! In the meantime the user just sees a blank page, or best case, some 80-90% complete page that keeps trying to load something...
I've experienced this myself a couple of times. Most probably my ISP messed something up, taking down whole chunks of the 'internet' :)
I do things like http://stackoverflow.com/questions/7383163/how-to-fallback-t...
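A generic sketch of that kind of client-side failover (the helper name and timeout are my own, not from the linked answer): try candidate URLs in order, and treat a request that hangs as a failure, so a CDN that "just spins" still falls through to the self-hosted copy.

```javascript
// Hypothetical helper: resolve with the first candidate URL whose probe
// succeeds within a timeout; a CDN that is blocked or hangs falls
// through to the next (e.g. self-hosted) candidate.
function firstAvailable(urls, probe, timeoutMs = 3000) {
  return urls.reduce(
    (chain, url) =>
      chain.catch(() =>
        Promise.race([
          probe(url).then(() => url),
          new Promise((_, reject) =>
            setTimeout(() => reject(new Error('timeout: ' + url)), timeoutMs)
          ),
        ])
      ),
    Promise.reject(new Error('no candidates'))
  );
}
```

In a browser, `probe` could be a `fetch()` request or an onload/onerror check; the important part is the explicit timeout, since a blocked resource may never fire an error event at all.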
I still think it’s hypocritical to care about one and not the other. Those corporations are actively working to erode Internet freedoms, which affects everyone, not just a single country (and one that is not even democratic in the first place).
Getting on your high horse over censorship in Asia while merrily including spying in your own code, as a simple convenience no less, is very much indefensible.
It may not be your intention, but you appear to be defending the practice of censorship and demanding that others share your indifference to the matter. That won't work, not because Google and FB tracking everything on the web is good, but because censorship is bad.
EDIT: I'd like to know what made my downvoters do their thing.
The Berkman Center estimates that just 2-3% of users in censored markets use circumvention technologies. So you should be prepared to have your content pirated or your service cloned if it is of interest to a Chinese audience.
That's just the reality of the situation.
Why does HN allow throw away accounts? It seems to go against the idea of "internet karma".
Even if the posts give you karma (my throwaways pretty universally have positive karma), it isn't worth the downvote brigades every time the haters see you, and the occasional death threats.
Throwaway accounts very much tie in with the concept of free speech and the open internet. If I don't want my personal safety tied to a few paragraphs of ranting, there shouldn't be an issue with that. I should be able to speak my mind without fear of repercussions.
Nobody is more interested in "internet karma" than the Chinese govt at the moment. Haven't you heard of the "online credit score" they're implementing? Everything you say and do online is tied to your real identity and has wide-ranging effects on your well-being. Needless to say, there are suggestions that this may affect free speech.
By calling the account "throwaway*", it tells people that this person is a regular who is deliberately using a pseudonym.
I'd like to add that I don't think I judged it in my wording. I've made throwaway accounts, but never on sites where discussion between users is the focus. I just don't really understand.
I use uBlock Origin, Ghostery, Disconnect, and Flash Control. peppercarrot.com shows all zeroes for all three blockers, meaning nothing is blocked because nothing was detected that needs blocking. There are no Flash Control icons, meaning no video or audio was noticed and blocked. Thanks for caring. :)
On the front page of theguardian.com, logged in as me, there's a V icon at the top, meaning that Flash Control has blocked video, probably for some gratuitous menu feature. I have zero trouble using and reading the site.
When I first opened theguardian a few minutes ago, uBlock was blocking 13 requests. It's steadily climbed in those minutes to 32 blocked requests. Ghostery is noticing/blocking 0 trackers. Disconnect is blocking two: nielsen and comscore. Disconnect is also blocking 1 from Facebook and 3 from Google. All three tools may be seeing and blocking some of the same things.
Without these four tools, except for low-/no-commercial technical sites and public-service sites like Wikipedia, my web is all but unusable. With them, my web is fine.
I very rarely have any problems using any site. I had to enable my bank in uBlock to use their popup bill pay feature. I think I had trouble viewing a cartoon at The New Yorker; I forget what I did to view it. Youtube and Flash Control seem to be in a perpetual arms race, as was the case with Flashblock. Youtube is my main motivation for using Flash Control, to prevent automatic video playing.
And yep, I get that sites pay the bills with ads. I $ubscribe to three news sites, and I also get that that doesn't pay the whole bill. The web is either going to have to block me for using a blocker (I've been seeing that very rarely recently, or at least "Unblock us please") or figure out a less dangerous, intrusive and loadsome way to serve ads. (And yep, I just made up the word "loadsome." I can do anything!)
EDIT: I whitelist duckduckgo.com in uBlock.
I just have to say, thank goodness for Moore's Law. Without it, we would never have so many wasted cycles!
Not saying you're the one wasting cycles, but the fact that we have to jump through sooo many hoops to stop all this crap is just disgusting.
How we operate, a good example using a very simple product: https://medium.com/@kevin_ashton/what-coke-contains-221d4499...
"Well a big one: Privacy of the readers of Pepper&Carrot."
Before even thinking about tossing things like Google Fonts or AddThis or whatever, the very first thing you need to do is turn on HTTPS. If you're concerned about privacy, or content injection, or MITM attacks, or name-your-poison-here, you must immediately only serve up pages via HTTPS with strong encryption.
- HTTPS defends against attacks.
- What the article describes is run-of-the-mill tracking by Google etc.
If I am not being attacked, the CDN resources will still allow Google to track me. If I am being attacked the CDN resources will still allow Google to track me.
If I don't have these Google resources (let's just use Google resources for now), I don't think that Google will MITM me.
You are always under attack on the internet.
This isn't really hyperbole. While I'm sure it's possible to find the occasional exception, you really need to assume all internet traffic could be hostile.
- Verizon vandalizes most plaintext HTTP traffic by adding their X-UIDH tracking-id header.
- One of the goals is the privacy of the readers. Google isn't the only attacker, and MitM is only one type of attack. If you aren't encrypting, your requests are being analyzed, probably several times, with DPI. If you aren't encrypting, you are enabling passive surveillance.
That's just some of the obvious stuff.
Time and time again you see stories of people having tracking, ads, and malware injected into their browsing from free wifi, most ISPs, cell providers, hacked wifi routers, or even antivirus software.
Enabling HTTPS is THE baseline, there's no excuse not to have it.
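As a rough sketch of that baseline (assuming nginx; the domain and certificate paths are placeholders), redirecting plaintext traffic and sending an HSTS header looks roughly like:

```nginx
# Redirect all plaintext traffic to HTTPS.
server {
    listen 80;
    server_name example.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    server_name example.com;
    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
    # HSTS: tell browsers to refuse plaintext for this host (~6 months).
    add_header Strict-Transport-Security "max-age=15768000" always;
}
```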
When they went to HTTPS all the problems went away; apparently the culprit was code and HTML injected into their pages by:
1) wifi hotspots - this is REALLY common; how do you think they redirect you to their login page?
2) content filters like SonicWall - I used to work in an IT consulting shop; these are EVERYWHERE. And they don't just filter: they record every page of every site you visit. Expect any non-residential wifi to be reading all of your non-encrypted traffic.
3) crappy ISP's trying to serve ads
4) bad actors on public wifi - this is common at the airport, meeting halls, and conventions. Even encrypted wifi is vulnerable as long as the attacker is on the same network (if they have wireless isolation off, which most places do for printers).
I generally hate when people point out things like "if you really cared about your users like you said you do, you'd implement [unrelated thing]", but in this case it's an extremely small change that would improve the privacy for every single one of their visitors.
You're correct that they are tracking us, but there's a trade-off here that holds tremendous value. If that value of speed isn't a factor, or is low on your list of priorities, then by all means, sever everything.
My needs and wants from a page are rather simple. Render your content, then kindly get out of my way and let me take in the message you are trying to communicate to me. Attempting to import distracting fanciness from CDNs is more likely to cause me to skip your site than an extra 100ms of load time because I'm in Germany and you're in Canada.
On an unrelated note, it looks like I might be taken by another archive binge here soon.
> Then there's other annoyances, like the alignment of the text, pictures, and other content suddenly shifting about, because the font has changed, or the JS from one CDN finally finished processing and decided that, no, actually, pictures should go over there.
This is because the scripts and CSS are so large that it takes time for the browser to apply all the rules set by that site. This can easily be optimized with the right tools, by throwing out dead code, or even by only fetching code specific to that web page. Most developers also just drop `script` tags into their head, and this _kills_ page loading more than anything (unless your CSS file is thousands of lines).
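To illustrate the `script`-tags-in-head point: `defer` (or `async`) keeps script execution off the critical rendering path. The file names here are placeholders:

```html
<head>
  <!-- CSS still blocks rendering, so keep it small and minified -->
  <link rel="stylesheet" href="/css/site.min.css">
  <!-- defer: download in parallel, execute only after parsing;
       async: execute as soon as it arrives (order not guaranteed) -->
  <script defer src="/js/app.min.js"></script>
</head>
```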
CDNs are meant for distributing content across the globe. If you're hosting your own assets without a CDN in the middle and the site is still slow, you're optimizing in the wrong place.
The initial page load is also one of the most important things to optimize for things like, you know, conversion of visitors to paying customers. I've given up on subscribing to new products and services simply because their pages weren't performing well, and I'm sure many others here have done the same.
Even if you're caching/serving static content efficiently it still adds load to a server.
Usually, one would amalgamate the resources to avoid additional requests to any server at all (CDN or not, all requests have unnecessary overhead). Many Web CDNs include frameworks that combine CSS, JS etc. resources to speed up page loading. Add to that SVG inlining and image optimization and you're good speedwise.
What you still miss is the geo-targeting of an Anycast network like CF et al. This will slow down the initial resource request again.
The question is: If you knew that you could live without the aforementioned pros of a CDN, why use it in the first place...
I thought that's what HTTP/2 was for? I'd rather solve this at the connection level than have some third party amalgamate my content and thus silently break it in maybe 5% of cases.
> It'll vary in download rates across the global [sic].
Nobody's CDN-ing CSS for latency when the rest of their assets are served from a single location. As the author says, it's just "laziness" (or 'ease of development').
It is entirely valid, and common, to front your own application code behind a CDN.
Love the sentiment, just wish the terminology was more accurate.
There are two reasons to use a CDN. First is caching (different sites using the same resource from the same CDN will download it only once); second is speed (some browsers restrict the connection count to a single domain, so hosting resources on different domains might improve download time). Caching is better solved by using a checksum as the key instead of the URL. Speed is not an issue with HTTP/2, because there's only one TCP connection. The only remaining advantage of a CDN might be geographically distributed servers, so a user in China would download a resource from a Chinese server instead of a US one. I don't see an easy and elegant way to solve that, but I'm not sure it should be solved at all; HTTP/2 server push should be enough.
I really like this idea! Store your heavy assets in a public DHT with each browser storing a part. Then fetch said assets by content-hash if not already in cache.
Maybe disable serving for mobiles.
The W3C needs to get on this!
In most cases, the client does not even request bytes from the CDN, which is then not able to track the client.
But then again, CDNs could implement tracking based on this lack of requests (which is kind of ironic, and should become infeasible as more clients use this technique, I think).
Actually the other issues are solved by the "DHT" part of this idea:
no centralized party can track which assets are already in your history.
The only tracking I can think of is by your nearest neighbours' browsers.
If such a neighbour N empties your cache (DNS attack?) it will trigger a full fetch from N.
Then N can attempt to fingerprint this asset query against what other pages list.
But then the whole point of this is to cache assets that are used on most pages!
I love this idea. Let's make the Web decentralized again!
(I couldn't resist)
The post seems to say "I don't like where some content is coming from, so I re-created said content by myself".
I know I would rather save a few bucks than make a site work for China. Many sites don't need to work in China.
Also, for anyone with a similar problem, consider backing https://www.kickstarter.com/projects/232193852/font-awesome-... . They're 15 hours from completion, and $38k away from a stretch goal to release SVG icon support in the Open Source version.
You can download the Google Web Fonts and serve them from your host.
You can also download Font Awesome and serve it locally.
And there doesn't seem to be a reason why you can't do it with gravatar either.
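For the self-hosting route, the served CSS boils down to an ordinary `@font-face` rule pointing at your own paths; Lato and the file names here are placeholders for whatever you download:

```css
/* Sketch of a self-hosted replacement for a Google Fonts stylesheet. */
@font-face {
  font-family: 'Lato';
  font-style: normal;
  font-weight: 400;
  src: local('Lato Regular'), local('Lato-Regular'),
       url('/fonts/lato-regular.woff2') format('woff2'),
       url('/fonts/lato-regular.woff') format('woff');
}
```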
I don't get this post honestly. It seems to be about replacing stuff with other stuff instead of replacing CDN with locally served content.
I'll happily use these services for quick POCs and throwaway demos, but once anything starts to become semi-permanent I'll make sure I control my uptime and host these assets myself.
The author doesn't even mention the big players: every FB share or like button, on all that nasty porn you watch (even in incognito mode), straight to FB. They recently changed their policies and signaled that they are going to start using this data for ad targeting, probably in a push to expand FAN and be more competitive with Google.
Something as simple as a share button that some blogger copy and pasted into their blog turned into an ad tech/data company!
I personally love that story and think that's cool and innovative thinking from AddThis.
But I also think more data = better ads, at the expense of privacy (probably not a popular opinion around here).
Isn't there an alternative? A more transparent way to provide users with the source files and still keep the 'cached items' aspect?
As you observe, they do not explicitly answer the question, but their reticence should be taken as an implicit green light, encased in a warning about loading times.
Most Google fonts are merely served from their hardware, and not created by them, so the license selected by the font's creator applies. Think of Google Fonts as an aggregator of free-to-use fonts.
There is also a list of fonts and their licenses available from Google Fonts here: https://fonts.google.com/attribution
If you're really concerned, check who created the font and see if they make the font available under a permissive license on their own website. Lato, for instance, is available from its creator's website and is published under the Open Font License.
Playfair Display - "Copyright (c) 2010-2012 by Claus Eggers Sørensen (firstname.lastname@example.org), with Reserved Font Name 'Playfair'" in the SIL licence right next to the font files.
Therefore yes, you should be able to download them and use them according to the original licence. (Which, by the way, usually requires the font creator to be credited, which Google only does when you select the font, but not in the served CSS; that, I believe, is not fair.)
... I don't even know myself if I'm being sarcastic or not ...
Is looking at some comics website even a privacy problem? Let's say google finds out your user X looks at your website. What possible damage can they do? Sell it to the advertisers so they can target X with some comics ad? If you ran a medical site, I would get it.
Then you have to give up other cool things like Google Analytics.
Some beautiful artwork on your site.
I can't control information they willingly give away, but I don't have to give them additional data on the people who chose not to send everything to google.
> Is looking at some comics website even a privacy problem? Let's say google finds out your user X looks at your website. What possible damage can they do? Sell it to the advertisers so they can target X with some comics ad?
For me it's because it ruins the search results. Just because I looked at a comic doesn't mean I want them ranked higher in search results. I've found the less google knows about me the better the search works.
> Then you have to give up other cool things like Google Analytics.
Personally, I'm about to give up on it anyway. Maybe if you're a big site it's useful. I just want a list of page hits and the referrer URL if possible. Analytics completely fails at this.
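For what it's worth, a list of page hits and referrers can be pulled from a plain access log with no analytics script at all. A rough sketch, assuming the common combined log format (the regex only handles simple cases):

```javascript
// Parse combined-format access log lines into per-path hit counts and
// referrer sets. Only successful GET/POST/HEAD requests are counted.
const LINE =
  /^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (\S+) [^"]*" (\d{3}) \S+ "([^"]*)"/;

function tallyHits(logText) {
  const hits = {};
  for (const line of logText.split('\n')) {
    const m = LINE.exec(line);
    if (!m) continue; // skip lines that don't match the expected format
    const [, , path, status, referrer] = m;
    if (status !== '200') continue;
    hits[path] = hits[path] || { count: 0, referrers: new Set() };
    hits[path].count++;
    if (referrer && referrer !== '-') hits[path].referrers.add(referrer);
  }
  return hits;
}
```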
Isn't that what clicking the Globe icon next to the Cog icon does?
I don't think it matters what the website is. Privacy should be the default. The idea that popular CDNs could passively gather a list of websites I visit is disturbing to me, even if all the sites on that list happen to be mundane. That information could still be used to build an advertising or personality profile on me. It's even more disturbing that leaking user information is commonplace on the web and most web devs don't give user privacy a second thought, though it's nice to see this one does.
That's no reason to hand Google the remainder of your users on a silver platter. (Especially not when you're not even getting paid for it.)
> Is looking at some comics website even a privacy problem?
Why is privacy something that needs to be justified? Why is exhibitionism supposed to be the default?
"Being shiny" is not a justification for mass surveillance.
This is a conscious decision the user has made: to use Chrome and send their URLs to Google. When you visit a website you don't really get much of a say in this; you're at the mercy of the website.
To be clear, I use Chrome (and Gmail and Youtube and Google Search and everything else) and I'm not overly concerned by these concerns, but I can appreciate when webmasters are being responsible with the services that they subject their visitors to.
Not to pick on this one point, but I want to add that I doubt many users are really conscious of it. People on HN are in a bubble; go to the local mall and ask people what they know. Ask them what a web browser is.
That's why people in the IT field have a responsibility to provide safe products to the public. It's like someone on Wall Street saying that a typical person is making a 'conscious decision' to accept all the complex risks of a sophisticated financial instrument, or that I'm making a 'conscious decision' to accept all the risks of an airplane when I get on it - I have little idea how the plane works; it's up to the manufacturer, airline and government to ensure it's safe.
What if they look at something politically unpopular, something that favors unpopular group or ideas? Those can be in comics. How about socially embarrassing things? They also determine you are in a certain place at a certain time. It is standard practice for governments to use those things to persecute people; there is no reason to think it will suddenly cease. I'm talking about western governments too - the U.S. government tried to embarrass and blackmail Martin Luther King; arguably it interfered in the recent election. Is there a reason governments would have changed?
Yes, I fully understand the need for privacy when it comes to political and other social issues.
Charlie Hebdo, http://m.bbc.com/news/world-europe-30710883
Thus I applaud the author of this website. And I'm really bored of the dullards who chip away at my freedom saying "but it's only a tiny transgression, what does it matter?"
That makes zero sense. Sorry, not buying that argument.
I'm talking about this particular comic, not some political manifesto.
Alternatives do exist. If you want advanced tools similar to GA, a local Piwik install will do the job. For much more basic options, there are lots of log processing programs available, I have recently tried and liked GoAccess.
Honest question: What is the benefit of Google Analytics? Can someone share a story where they got actionable insight out of it?
(Context: Being very concerned with privacy, I would never for the live of myself install any analytics on my sites. In fact, I have even disabled the nginx access.log.)
When your goal is to attract an audience, that's just standard boilerplate stuff as far as actionable insight goes.
Your privacy concerns are legitimate, but it's hard to consider any of these issues from a black box. You need data, and Google Analytics is the most common way to get that data.
Single sites are not the issue; the reach is. Alright, what's the matter if somebody knows that I visit a single comics site, right? Well, yeah, but if they also know that I visit that other comics site, and that site too, and hmm, yes, that one too, then somebody starts getting a picture of me. (Disclaimer: this is not a description of me.)
Chrome is now the most popular browser, and for a reason. It's clear that most people value better browser experience over privacy fears.
Plus Google says they don't sell users' data; they only use it to target their own ads.
Google has used its services to aggressively push people towards Chrome with banners, degraded service etc.
While the privacy issues have been actively downplayed, by either ignorant bystanders or astroturfers, I do not even think that is the biggest issue here.
It is the monopoly on every level, that is my biggest concern.