Hacker News
Google penalizes you for using Google Analytics (simpleanalytics.com)
228 points by twapi 12 days ago | 102 comments

Isn't this a good thing? One part of Google made a tool that evaluates performance and they're not giving any preference to their other tools, despite also being made by Google.

If they hadn't objectively penalized you, everyone would be complaining that Google gave preferential treatment to their own products.

It's a good thing that Google don't give preferential treatment to their own products.

It's a bad thing that Google's own analytics product isn't good enough to still get 4x100 in Google's own perf tool. It means that people will give up and accept lower scores because it's "impossible" to be perfect. That harms everyone who uses the web.

Why will people think it's impossible and give up, when competitors like Simple Analytics demonstrate that better performance is possible?

(Disclosure: I work for Google, speaking only for myself)

There is a common fear (and I have to admit I irrationally feel this fear) that your ranking will be impacted negatively by not using Google Analytics. Maybe not intentionally, but perhaps because Google knows less about your site.

I know there's a case for these additional stats being less than useful, but SimpleAnalytics doesn't appear to provide the same statistics that GA does. It's a pretty big hurdle for some people to give up on funnels, time on page, bounce rates etc.

If there's an analytics platform with a closer feature match to GA, but with good page speed scores, that might be more convincing to them.

Maybe small business owners doing this themselves will give up because they don't know about these other analytics services. I agree that a marketer probably wouldn't give up and settle for a lower score, but I can definitely see a marketer being frustrated by the lack of feedback on how to get a perfect score.

They could... Google it.

Not everyone has the time, desire, inclination, or capability to articulate the questions required to chart out the "analytics package" space.

Nor does everyone have the luxury of hiring a PM to do it for them.

If the marketer doesn’t have the time, desire, or inclination to work this out I’m not sure it is a problem for anyone but the marketer.

It's very unlikely that swapping Google Analytics for an alternative will be an acceptable solution to a performance issue. Clients like GA. It's "industry standard", it plays well with other marketing tools, and people know it. Performance is understood to be important but not at the expense of understanding what users are doing. Website owners will readily drop a bit of perf in order to keep GA.

You can get very detailed and interesting statistics directly from your access.log. Zero JS, zero third party tracking, zero performance decrease.

I rely on this more than Google Analytics today, partly because I want to respect my users and partly because I block Google Analytics myself (as many do) and want my data to reflect something reasonably close to reality.

Edit: GoAccess is a favorite of mine: https://goaccess.io/
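
As a rough illustration of the access-log approach, here is a minimal Python sketch that counts successful page views per path. It assumes the "combined" log format that Apache and nginx commonly use; the sample lines and the function name `count_page_views` are made up for illustration, and a tool like GoAccess does far more than this.

```python
import re
from collections import Counter

# Regex for the "combined" log format (an assumption; adjust to your server's format).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def count_page_views(lines):
    """Count successful GET requests per path, skipping unparseable lines."""
    views = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        if match["method"] == "GET" and match["status"] == "200":
            views[match["path"]] += 1
    return views

# Hypothetical sample data standing in for a real access.log.
SAMPLE = [
    '203.0.113.7 - - [10/May/2021:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.8 - - [10/May/2021:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 2326 "https://example.com/" "Mozilla/5.0"',
    '203.0.113.9 - - [10/May/2021:13:56:09 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/7.68.0"',
]

if __name__ == "__main__":
    for path, hits in count_page_views(SAMPLE).most_common():
        print(path, hits)
```

To run it against a real file, pass an open file handle instead of `SAMPLE`, since the function only needs an iterable of lines.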

If these logs link users together, for example, with an IP address, you would still need consent regarding GDPR.

That's configurable, though.
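
One way the "configurable" part might look in practice is truncating addresses before they are stored. This is a hedged Python sketch, assuming the chosen policy is to zero the last octet of IPv4 addresses and keep only the first 48 bits of IPv6 addresses. Whether that is sufficient under GDPR is a legal question this code does not answer, and the function name `anonymize_ip` is hypothetical.

```python
import ipaddress

def anonymize_ip(raw: str) -> str:
    """Zero the host part of an address: /24 for IPv4, /48 for IPv6 (assumed policy)."""
    address = ipaddress.ip_address(raw)  # raises ValueError on malformed input
    prefix = 24 if address.version == 4 else 48
    network = ipaddress.ip_network(f"{raw}/{prefix}", strict=False)
    return str(network.network_address)

if __name__ == "__main__":
    print(anonymize_ip("203.0.113.77"))
    print(anonymize_ip("2001:db8:abcd:12::1"))
```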

It is funny that AMP pages could have GA without being penalized.

Tangent: I find GA to be mostly useless nowadays for any website used by a more tech savvy community. When comparing the GA results to server logs and a separate JS logging script, and already discounting for bots, it's clear GA is only counting about 10% of my visits.

Too many people are blocking that script. I manage about 20 different sites that use it in some form, but I cannot vouch for it anymore.

The scary thing about this is that marketing folks still seem to consider the results from GA as somehow valid. With the result that non-techy demographics get counted more, and therefore marketers assume that non-techy people are more interested in their stuff. It becomes a self-fulfilling prophecy.

Fighting this fight at the moment:

marketing folk - "we're seeing more responses from old people than young people, we should focus on that market"

technical folk - "Are we allowing for the fact that older people are less likely to be blocking GA and therefore most of those untracked clicks are likely to be younger?"

marketing folk - "well, no, but we don't have any information on those, so we can't make any decisions about them."

technical folk - "but we know GA is blocked by ad-blockers. And we know that ad-blockers are used more by younger, more tech-savvy people. And we know that approx 60% of the visits to our site are not registering on GA. So... can we include that in our analysis?"

marketing folk - "...."

technical folk - "...."

marketing folk - "I don't know how to change the pretty graph that GA produces to include that."
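
The technical side of this argument can be made concrete with a back-of-the-envelope correction: divide what GA observed by the share of visitors who do not block it. All numbers below are invented for illustration. The per-group block rates in particular are assumptions that would have to be estimated somehow, for example by comparing GA with server logs.

```python
def estimate_true_visits(observed, block_rate):
    """Scale GA-observed visits up by the share of visitors assumed to block GA."""
    if not 0 <= block_rate < 1:
        raise ValueError("block_rate must be in [0, 1)")
    return observed / (1 - block_rate)

# Hypothetical inputs: GA-observed visits and assumed GA block rates per group.
OBSERVED = {"older": 300, "younger": 100}
BLOCK_RATE = {"older": 0.2, "younger": 0.8}

ESTIMATED = {
    group: estimate_true_visits(OBSERVED[group], BLOCK_RATE[group])
    for group in OBSERVED
}

if __name__ == "__main__":
    for group, visits in ESTIMATED.items():
        print(f"{group}: observed {OBSERVED[group]}, estimated {visits:.0f}")
```

With these made-up inputs, the group that looks three times larger in GA turns out to be the smaller one after correction, which is the whole point of the argument.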

I once had to argue with a company whose website was just blank (white) on Firefox, because a JavaScript error prevented any content from being shown. The response was that, based on their analytics, Firefox isn't used by anybody and thus fixing it was not a priority.

Ohh that's a "computer says no" level of awful. As an aside, I've forgotten the name for this sort of anti-pattern.

Are you thinking of Survivorship bias?

Willful ignorance?

If the marketing people are only running online ad campaigns, then they often believe they can disregard people running adblockers. For non-online campaigns they use bigger product metrics, I've seen. GA is not always the end-all be-all, but for online ad tracking they use that.

I'm also personally shocked at how FEW people, relatively, end up using ad blockers. It's a night and day difference.

It's about 40-50% now depending on country so it's pretty substantial!

What's the source for that number?

Here: https://www.forbes.com/sites/tjmccue/2019/03/19/47-percent-o...

And this is 2 years ago so I assume it's even higher now. It's a lot more than I expected for sure. They only count consumers though.. Perhaps companies don't always allow it, but I always use uBlock at work.

It's surprising that Forbes has an article like that. It's one of the only sites I've opened on a modern iPhone that made it uncomfortably hot.

And don't forget what I call 'the hell path':

Marketing folk - "Wait...if we're getting more responses from old people than young people and old people are less likely to run adblockers than young people, let's increase our spend on Google Ads because only people without adblockers will see them."

Technical folk - Get into woodworking.

Edit - If the technical folk push back, that's when marketing folk will say that 'the law of really big numbers' means that 40% of a big market is still worth a lot. Trust me, woodworking....:)

This is exactly that image of the World War 2 plane hit by bullets isn't it

This is why I proxy GA visits directly through my server. Full accuracy but less privacy issues as my clients don't require JavaScript and I can anon the IP myself. I'm surprised more people aren't doing this.

I'd rather use something like Plausible Analytics behind my own domain than go to extra effort just to use Google.

This is also blocked by many lists. All I want is accurate reporting, so there's no incentive to go with that over Google, especially if Google can keep our data safer than Plausible can.

> All I want is accurate reporting

That's the problem, accurate visitor tracking wasn't on the web design goals.

But if you want an independent track to verify your JS report, the server logs have an almost completely disjoint set of problems.

You already have that in /var/log.

It's not even close. Bot spam is > half of my logs sometimes.
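
A first pass at separating bot traffic in the logs is often a user-agent check. The sketch below is a crude heuristic, not a complete bot detector: the keyword list is an assumption, and bots that spoof browser user agents will pass straight through.

```python
# Substrings that commonly appear in self-identifying crawler user agents.
# This list is illustrative and far from exhaustive.
BOT_MARKERS = ("bot", "crawler", "spider", "curl", "wget", "python-requests")

def looks_like_bot(user_agent: str) -> bool:
    """Return True if the user agent is empty or contains a known bot marker."""
    agent = user_agent.strip().lower()
    if not agent or agent == "-":
        return True
    return any(marker in agent for marker in BOT_MARKERS)

def human_share(user_agents):
    """Fraction of requests whose user agent does not look like a bot."""
    agents = list(user_agents)
    if not agents:
        return 0.0
    humans = sum(1 for agent in agents if not looks_like_bot(agent))
    return humans / len(agents)
```

Running `human_share` over the user-agent column of a log gives a quick sense of how much of the traffic is self-declared automation before any deeper analysis.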

And as a user I appreciate your efforts to do this in an ethical and privacy preserving manner.

Have you written on how to do this/followed a guide somewhere?

The Google Analytics Measurement Protocol documentation was used. We created a middleware which sends data to GA as a POST request (much like our regular logs middleware)
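
For readers wondering what such a middleware might send, here is a minimal sketch of a server-side page-view hit using the Universal Analytics Measurement Protocol (v1). The parameter names (`v`, `tid`, `cid`, `t`, `dp`, `aip`) come from that protocol's documentation. The helper names, the `UA-XXXXX-Y` placeholder property ID, and the choice to fall back to a random client ID are assumptions, not the commenter's actual implementation.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request, urlopen

GA_COLLECT_URL = "https://www.google-analytics.com/collect"

def build_pageview_payload(tracking_id, path, client_id=None):
    """Build the form-encoded body for a Measurement Protocol v1 pageview hit."""
    params = {
        "v": "1",              # protocol version
        "tid": tracking_id,    # property ID, e.g. the placeholder "UA-XXXXX-Y"
        "cid": client_id or str(uuid.uuid4()),  # anonymous client ID
        "t": "pageview",       # hit type
        "dp": path,            # document path
        "aip": "1",            # ask GA to anonymize the sender's IP
    }
    return urlencode(params).encode("ascii")

def send_pageview(tracking_id, path, client_id=None, timeout=2.0):
    """POST the hit to GA from the server; the browser never talks to Google."""
    body = build_pageview_payload(tracking_id, path, client_id)
    request = Request(GA_COLLECT_URL, data=body, method="POST")
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

In a real middleware the call would likely be queued or made in the background, so that a slow response from GA never delays the page itself.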

The newer version of GA tracking doesn't use that protocol. It uses an undocumented one.

How do you do this? Doesn't Google's script load other scripts and post back directly to Google?

Did you write your own scripts to use your proxy URLs with the same API?

Ublock still blocks this

I think OP meant that the info coming to the server gets sent directly to GA. IP Address, etc.

I'm sure OP meant that. Proxy through a server != return a redirect to the client. And as mentioned elsewhere, they're writing directly to GA using a documented API; unless they are sending data from the client to the backend using a query and path that look exactly like GA's (which, why?), there is no way to know what client requests are logging data (and that's assuming they're specific client requests, rather than being logged as part of the actual functional interactions, using a session store, which if Ublock tried to heuristically detect and block, would block actual user facing functionality)

Only if it's DNS level (just a CNAME). Ublock can't detect it if it's really proxied through the server itself.

Yes it can. It blocks the request to your first-party domain based on path/query.

... and there are half a dozen ways to get around that too. You could proxy everything through a function that base64 encodes everything. It's an arms race.

Besides, I was talking about my server directly sending data to Google Analytics without JavaScript on the client side. GA has strong adherence to GDPR so there's no legal or privacy issues I can see with that.

You're not thinking this through. Any request can be proxied. Ublock won't block GET requests for `index.html` or other assets, nor even random POSTs to `/detail` from button clicks and such

Just speaking from first-hand experience.

This isn't detected automatically though, right? Someone would have to reverse engineer your setup and add it to the lists?

uBlock does detect this automatically, without reverse engineering your setup and adding it to their lists.

This assumes you self-host analytics.js and proxy GA API requests through your domain without changing path/query. If you manage to reverse engineer analytics.js and change the path/query AND proxy through your own domain, then this likely wouldn't be detected. But there's a chance that Google will make changes to analytics.js that aren't compatible with your reverse engineering it, and your setup breaks.

Someone needs to remind Mozilla that most of their users are likely blocking telemetry.

This is on point. I run a dev-focused SaaS and it has become clear that any metrics we get from the frontend are severely skewed, probably due to ad blockers etc. Heck, I even use them myself.

We now record key metrics just through our backend. No tracking, no cookies, just aggregate numbers.
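
A sketch of what cookie-free, aggregate-only backend metrics could look like: an in-memory counter keyed by day and event name, with no user identifier stored at all. The class and event names are hypothetical, and a real service would presumably persist the counts somewhere durable.

```python
from collections import Counter
from datetime import date
from threading import Lock

class AggregateMetrics:
    """Counts events per day without storing anything about who triggered them."""

    def __init__(self):
        self._counts = Counter()
        self._lock = Lock()  # request handlers may run on several threads

    def record(self, event, day=None):
        """Increment the counter for (day, event); defaults to today."""
        key = ((day or date.today()).isoformat(), event)
        with self._lock:
            self._counts[key] += 1

    def total(self, event, day):
        """Return how many times `event` was recorded on `day`."""
        with self._lock:
            return self._counts[(day.isoformat(), event)]
```

A request handler would call something like `metrics.record("signup")` at the point where the action succeeds, so the numbers are unaffected by whatever the browser blocks.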

Hi. I have a few content websites that may not be used by a tech-savvy community, but by young people. Is there a way to implement a GA alternative in a few minutes to compare by how much the stats differ? I suspect GA is not counting all my visitors.

Yes: https://httpd.apache.org/docs/2.4/logs.html#accesslog (for apache, obviously) or the equivalent for whichever flavor of httpd you're using.

Consult `man 1 grep` for information on how to query it.

> Consult `man 1 grep` for information on how to query it.

Or something like goaccess to get nice charts.

Worth noting that Google doesn't appear to use Google Analytics, either.

Good riddance. There are many ethical ways to get the stats you need.

I go out of my way to not use any Google services and I don't like when websites negate that choice by using Google analytics, Facebook pixels etc.

Lighthouse / PSI scores are irrelevant to search ranking.

Data from the Chrome UX Report (CrUX) is going to be used in results ranking as part of the page experience update - this comes from real-world usage of Chrome

GA affecting Lighthouse scores may be a good storyline for Simple Analytics (and there are plenty of reasons not to use GA) but you can still use GA and pass all the core web vitals

Google may not be using Lighthouse to perform the tests, but it is understood that Google use performance metrics as an SEO factor.

It's understandable that Simple Analytics would use Lighthouse to measure performance and extrapolate from there. I'm not sure how else they could do the test — as presumably they don't have access to Google's data.

I don't think it's too far-fetched to think that Google will somehow tweak CrUX numbers to counteract the performance drop caused by GA. For example, Chrome could block GA for a small percentage of users, send the CrUX numbers collected from them to the mothership with a special tag, and Google could then use only those when ranking sites.

Possibly, but what they care about really is "What experience will users have when they visit a site". The GA script loading time is totally part of that, so why discount it from the measurements?

Experience of Chrome users when they visit a site.

Hahahaha.... takes deep breath... «… they care about users»? ROFL… Google doesn't, never has, never will. Google cares about money. Period. It's in their interest to rank sites which use GA above sites which use competing products.

Kind of depends on how compelling GA is to the overall organization.

GA makes some money, and maybe helps global tracking.

Otoh, if it's an easy way to get ranking, that's going to be abused. And parts of the org do seem to want fast pages to win, so if GA means slow pages, there's a conflict.

The Chrome UX Report itself uses PSI, and all of its metrics are based on page performance, so I wouldn't say it's irrelevant.

This is kind of a snake pit. Google now has three different but related page performance initiatives.

1. PageSpeed Insights
2. Web Vitals, further subdivided into Core Web Vitals and just Web Vitals
3. Lighthouse

They all work together or are subcomponents of one another. Even more, two of the three Core Web Vitals (Cumulative Layout Shift and First Input Delay) are not really reliably measured in a typical lab environment like PSI and Lighthouse: they require actual user interaction with a page. They are essentially RUM (Real User Monitoring) metrics.

Other Web Vitals like Time to First Byte are much more deterministic.
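
To make the RUM point concrete, here is a small sketch of how field samples might be rated. It assumes Google's published boundaries for the three Core Web Vitals (LCP 2.5 s / 4.0 s, FID 100 ms / 300 ms, CLS 0.1 / 0.25) and the convention of judging a page at the 75th percentile of real-user samples; check the current documentation before relying on these numbers. The function names and the nearest-rank percentile are illustrative choices.

```python
import math

# (good_max, poor_min) per metric; assumed from Google's published guidance.
THRESHOLDS = {
    "LCP": (2.5, 4.0),    # seconds
    "FID": (100, 300),    # milliseconds
    "CLS": (0.1, 0.25),   # unitless layout-shift score
}

def percentile_75(samples):
    """75th percentile using the simple nearest-rank method."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]

def rate(metric, samples):
    """Classify a metric as good / needs improvement / poor at the 75th percentile."""
    good_max, poor_min = THRESHOLDS[metric]
    value = percentile_75(samples)
    if value <= good_max:
        return "good"
    if value > poor_min:
        return "poor"
    return "needs improvement"
```

Because the rating depends on the distribution of real visits, a single lab run in Lighthouse can disagree with it in either direction.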

> this comes from real-world usage of Chrome

Measured by what metric(s)? Core Web Vitals.

Using GA isn't bad, because there are two ways to run it.

The first is to put the GA script in the <head> section. This is the most popular approach, but it lowers CWV scores a little.

The second is to put the GA script anywhere on the page other than <head>, such as inside <body> or just before </body>. This doesn't hurt your CWV scores.

They are using your latter example, here's the page: https://blog.simpleanalytics.com/with-ga-script

It is hurting the score. The GA script is NOT in the <head>.

Yes - 91 on mobile.

Because fonts, not because GTM.

You can also serve the script from cache, which solves this performance score issue as well.


Or use a lightweight alternative: https://github.com/jehna/ga-lite

that's kind of weird, I would think you'd want to measure things like time to dom load, if someone clicks off before that, etc.

Maybe for performance reasons?

Yes - putting it in the head lets it measure DOM loading and interactions as soon as they happen. But there is a performance hit: WebKit-derived browsers (Safari, Chrome, etc.) don't show even a single pixel on screen until they have loaded all resources in the head. And more bad news: Safari and Chrome now partition the HTTP cache, so a shared third-party script is no longer reused from cache across sites.

Putting it in the body - you cannot catch all interactions, but it won't stop rendering.

Everything is a compromise...

Wowie, not a single pixel? So I'm not crazy firefox seems faster than chrome?

Sort of...

But Chrome has a much better JS engine - V8.

Official statement from Google's John Mueller regarding this: https://twitter.com/JohnMu/status/1389322980547833856

"No, it's not the case that we penalize for Google Analytics. We don't special-case Google products in Search, but that goes both ways. The LH score is not what we use in Search, but my ancient WP + GA site is 100 there." John Mueller

BTW, the same is true for Google Fonts. These prove to be the major bottleneck on my sites. (Since I'm hosting fonts locally, these are at 100% performance or close. We may argue that in practice these fonts would probably have been available in cache, but this tends to be less a thing, compare Firefox privacy policies.) This is actually a testament to the neutrality of Lighthouse.

Holy cow, Google Fonts is a great way to say "go away" to people on slow internet connections. Please don't use it. Because you can't predict the metrics for the font until it loads, you either:

a) have the page jump around 30-60 seconds after the page loads (when the reader is probably half way through it) or

b) leave the page blank for the 30-60 seconds it takes for the font to load, if it finishes loading at all.

This has frustrated me so much, as I spend a lot of time optimizing web performance. gtag.js is what Analytics pushes you to use; however, once loaded, it asynchronously loads analytics.js. It is very inefficient, especially for sites that do not do much more than track page views. It is the worst scoring factor on sites I optimize because there’s very little you can do about it without hacks.

If you don't need any of the other features of GTag or GTM, you can omit them and just load analytics.js directly. It'll work fine and save you some data.


I don't think this option exists with GA4, unfortunately.

Looks like they're deploying the Simple Analytics script directly in the page source and the Google Analytics script via Google Tag Manager. Though Google Analytics through Google Tag Manager might be the Google-recommended way to deploy Google Analytics, this isn't exactly an apples-to-apples comparison. Would be interesting to see if Google Analytics directly in the page had a different score than through Google Tag Manager. That said, I would almost always trade the minuscule score penalty for serving something through Google Tag Manager (including Simple Analytics) for ease of maintenance.

I really wish Google would split Analytics into two products. One for "advanced" website operators and one for "simple" users, aka business owners aka real people, not professional data analysts.

The problem is that most of my clients don't care enough about their stats to pay $19/month. So they opt for the "free" option (Google Analytics) which is now being positioned for the high-end market.

GA offers GA360 for "advanced" users who need integration with Ads, Videos, GCP, Salesforce, etc.

I like the idea of Simple Analytics but that pricing model is out of reach for open source projects like PortableApps.com. Separating out by page views is tough when I only need a single user and don't care about support. $600 a year for up to a million page views per month and "contact us" for more means it'd probably be at least a couple thousand dollars a year.

Honest question -- why do you need analytics on an open source project like the one referenced? What information could they possibly get from it that matters to what they're doing?

I've used GA once on an open source focused blog, and the information was entirely "interesting" but I didn't get anything useful out of it other than a vague "hey people are visiting" picture that really changed nothing about what I was doing.

Has analytics changed to be more useful? Who cares what countries people are visiting from? I feel like people don't question their need for analytics as much as they should and just automatically do it.

It's very handy for working to set up partnerships/sponsorships to help keep the project going, help bring other apps on board to use our open source format, and to track where site visitors are coming from to determine what languages we should add to the site (as either just a landing page or a full translation) or the software platform (currently in 56 languages).

I guess I can see that, do potential partners and sponsors not just accept a website counter? I guess an outside company is sort of an auditor on that kind of things. Language localization stuff is about the only reason I could think of, and even then like... don't open source projects almost always rely on passionate users for translations?

Thanks for the explanation in any case, I at least can picture the first working with bigcorp that doesn't know any other way of measuring the impact of what they fund.

You can use an open-source alternative like https://umami.is/

Thanks, I'll check it out.

My biggest complaint is that Lighthouse has a longstanding open bug where out-of-process iframes are counted in the main thread, thus causing third party embeds to wildly degrade the LH score, even when real performance is not affected much. This even happens to Google's own YouTube embed. Add one YT video to a page and see how destructive it is to a LH score. It's really difficult to explain to customers when they track their LH score like a professional bodybuilder tracks fat percentages.

I tried both Simple Analytics & Plausible but both are blocked by adblockers, which made them useless to us.

In the end - ditched Google Analytics and just use log files & sales data.

Hello, this is stupid. Google doesn't penalize you for using GA, but because you're not using it the right way. If you want to use GA without JS overhead, use their API and build your own JS friendly tracker.

That's why I use https://github.com/jehna/ga-lite/. It saves a roundtrip and is quite small.

You should see how much Facebook’s tracker impacts performance! GA does a relatively good job at not impacting performance compared with others. Not defending GA though, just whining about FB. :)

I'm using Google Analytics to track pageview count on a website and it does appear to add around 0.4sec to Time-To-Interactive of the PageSpeed score.

And the GA script size is about 40KB.

The amount of data Google gets off of people using GA is incredible and they don't pay for it. Google should be paying people for using their GA.

GA's users are being paid with access to a product they find useful.

Wait until you measure the impact of showing Google Ads!

Anyone who takes this article seriously (A) has not read How to Lie with Statistics, (B) does not analyze bias in articles based on the person writing them, and (C) is directly contributing to a society of misinformation.

This is an ad.

TL;DR: Google Analytics penalizes you for using Google Analytics (slower page speed), not Google.

Google doesn't penalize sites in any way if you're using their products (like GA or embedding a YouTube video). It is silly and flat-out wrong to say that Google penalizes you for using Google Analytics.

It's kind of like saying that buying Google Ads will boost your website's organic search engine rankings.

I really suggest that SimpleAnalytics.com update the title tag on this, as it's just wrong. Period.

That doesn't mean that using Google Analytics doesn't slow down your site (a bit) and Google should speed it up. I've had that complaint for years now, and Google just hasn't done anything about it that we can noticeably see.
