
Deconstructing the Google Analytics tracking script - BillFranklin
https://billfranklin.svbtle.com/deconstructing-the-ga-script
======
tkazec
Documented, including the official unminified source, in Google's analytics.js
reference:
[https://developers.google.com/analytics/devguides/collection...](https://developers.google.com/analytics/devguides/collection/analyticsjs/tracking-snippet-reference)

~~~
keganunderwood
Would be nice if they included what variable names they use inside google
instead of just i s o g r a m.

~~~
austinjp
I've noticed several minified JavaScripts that have the variables spell words
like the i s o g r a m here. Haven't bookmarked any so I can't find an
immediate example. I thought it was just a curiosity, but I think the
frequency is increasing. I wonder why. Just for giggles? Or is there any
particular reason?

~~~
fenwick67
kanyewest.com replaces the letters with "k,a,n,y,e"

    
    
        !function(k,a,n,y,e){k.GoogleAnalyticsObject=n;k[n]||(k[n]=function(){
        (k[n].q=k[n].q||[]).push(arguments)});k[n].l=+new Date;y=a.createElement('script');
        e=a.scripts[0];y.src='//www.google-analytics.com/analytics.js';
        e.parentNode.insertBefore(y,e)}(window,document,'ga');
    
        ga('create', 'UA-34495711-11', 'auto');
        ga('send', 'pageview');

~~~
che_shirecat
that's hilarious, I wonder who makes/manages celebrity sites like that.

~~~
fancy_pantser
Lane Goldberg in this case.

[http://www.builtbylane.com/](http://www.builtbylane.com/)

------
acdx
This is just an analysis of the snippet, not the script it loads
(www.google-analytics.com/analytics.js).

~~~
CapacitorSet
If there is interest, I could analyze that. (I'm not the author of the
article.)

~~~
ssharp
If you could analyze the old ga.js and figure out a way to _accurately_
duplicate the way the old utmz cookie was created, you could probably make
some money off your effort. I've yet to see a great solution to this.

~~~
teej
How old is "old", and what do you need it for? I've written code to parse the
utmz cookie and separate code to recreate the whole organic/referral/utm
attribution model in GA using first party log data. The only major drawback
AFAIK is that AdWords utm parameters are obfuscated and Google doesn't provide
a way to resolve gclids to their campaign's utm params.

~~~
jgalt212
It's dated, but still the best public resource on de-obfuscating gclids.

[https://deedpolloffice.com/blog/articles/decoding-gclid-para...](https://deedpolloffice.com/blog/articles/decoding-gclid-parameter)

------
jchw
The reason why the arguments spell isogram is because you can't have two
arguments with the same letter. So they chose the most meta isogram,
"isogram." Surprised the author missed that and assumed it was just reference
to the script itself.

~~~
sdoering
Thanks for clarification and also thanks for the explanation of an isogram. Or
the nudge in that direction.

------
untog
Anyone not wanting to use the JS might be interested to know that the Google
Analytics Measurement Protocol is fully documented, and you can create your
own front-end implementation, should you wish:

[https://developers.google.com/analytics/devguides/collection...](https://developers.google.com/analytics/devguides/collection/protocol/v1/)

~~~
rob-olmos
One thing to be aware of is that latent hits, via the Queue Time "qt"
parameter, have a maximum delta of 4 hours:

"Used to collect offline / latent hits. The value represents the time delta
(in milliseconds) between when the hit being reported occurred and the time
the hit was sent. The value must be greater than or equal to 0. Values greater
than four hours may lead to hits not being processed."

Would be nice if that delta time could be greater than 24 hours, e.g. when a
human approves that a low-volume contact form submission isn't spam and that
hit is tied to a GA goal.
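
A rough sketch of what building such a latent hit might look like. The function name `buildLateHit`, the example IDs and the `mayBeDropped` flag are invented here for illustration; only the parameter names (`v`, `tid`, `cid`, `t`, `dp`, `qt`) come from the Measurement Protocol v1 docs, and the four-hour threshold is the one quoted above.

```javascript
// Threshold taken from the quoted documentation.
const FOUR_HOURS_MS = 4 * 60 * 60 * 1000;

// Hypothetical helper: builds the form-encoded body of a v1 pageview hit.
function buildLateHit({ trackingId, clientId, page, occurredAt, now = Date.now() }) {
  const queueTime = Math.max(0, now - occurredAt); // qt must be >= 0
  const params = new URLSearchParams({
    v: '1',          // protocol version
    tid: trackingId, // property ID (placeholder value below)
    cid: clientId,   // anonymous client ID (placeholder value below)
    t: 'pageview',   // hit type
    dp: page,        // document path
    qt: String(queueTime),
  });
  return { body: params.toString(), mayBeDropped: queueTime > FOUR_HOURS_MS };
}

// A hit that happened five hours before it is sent.
const sentAt = Date.UTC(2017, 5, 28, 12, 0, 0);
const hit = buildLateHit({
  trackingId: 'UA-XXXXX-Y',
  clientId: '555',
  page: '/contact/thanks',
  occurredAt: sentAt - 5 * 60 * 60 * 1000,
  now: sentAt,
});
console.log(hit.body, hit.mayBeDropped);
```

The body would then be POSTed to the collect endpoint; the sketch stops short of that so it makes no network calls.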

------
shubhamjain
Something that has always puzzled me is why only GA follows the saner approach
of pushing all the function calls into an array which the async-loaded script
can pick up later. Many popular analytics solutions (like Segment and
Mixpanel) create a factory function that initialises all API calls with a
generic body. It seems rather unnecessary and only adds to the boilerplate.
Take a look at GA's init code and compare it with Segment's [1].

[1]: [https://prnt.sc/fpj9ec](https://prnt.sc/fpj9ec)

~~~
CaveTech
Some libraries (fb, ga) use a single factory method that handles all
commands. Ex: fb('init', {})

The others (segment, heap) use explicit functions for different actions. Ex:
heap.init()

The main benefit is you can detect API violations without loading the async
script. If you call an invalid function with the former you have to wait for
it to execute at some point in the future.

This allows you to surface errors immediately - breaking current execution -
even if its additional libraries haven't loaded.
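
A small sketch of the difference, under the assumption that both styles queue calls until the real library arrives. `makeCommandQueue` and `makeMethodStubs` are names invented for this example, not anything from the actual vendor snippets.

```javascript
// Style 1: one function that queues every command (fb/ga style).
function makeCommandQueue() {
  const fn = function () { fn.q.push(Array.from(arguments)); };
  fn.q = [];
  return fn;
}

// Style 2: one stub per known method (segment/heap style).
function makeMethodStubs(methodNames) {
  const api = { q: [] };
  for (const name of methodNames) {
    api[name] = (...args) => { api.q.push([name, ...args]); };
  }
  return api;
}

const tracker = makeCommandQueue();
tracker('inti', {}); // typo is queued silently; only the real library can reject it later

const client = makeMethodStubs(['init', 'track']);
let failedImmediately = false;
try {
  client.inti({}); // typo: no such stub exists, so this throws right away
} catch (err) {
  failedImmediately = err instanceof TypeError;
}
console.log(tracker.q.length, failedImmediately);
```

So the extra boilerplate buys an early TypeError for misspelled method names, at the cost of listing every method up front.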

------
Jgrubb
I wrote almost the same post, with almost the same title a couple of years ago
-
[https://www.ignoredbydinosaurs.com/posts/239-deconstructing-...](https://www.ignoredbydinosaurs.com/posts/239-deconstructing-the-google-analytics-tag)

I like mine better :)

------
spullara
It is pretty obvious that the snippet loads the real script. I expected this
article to be about what the actual google analytics tracking script does
rather than the tracking script loader.

~~~
negativ0
+1

i opened the article and "oh, really?"

------
shreve
I'm amused by the fact he thinks `array.push(arguments)` inside the nested
function evaluates and pushes the arguments from the parent function.
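
For anyone unsure which `arguments` gets pushed, here is an unminified sketch of the queueing stub. `installStub` and `fakeWindow` are stand-ins made up so the example runs outside a browser; the real snippet does this inline against `window`.

```javascript
// Simplified version of the snippet's stub; `win` stands in for `window`.
function installStub(win, name) {
  win[name] = win[name] || function () {
    // `arguments` here belongs to this inner function, i.e. to each
    // individual ga(...) call, not to installStub(win, name).
    (win[name].q = win[name].q || []).push(arguments);
  };
}

const fakeWindow = {};
installStub(fakeWindow, 'ga');
fakeWindow.ga('create', 'UA-XXXXX-Y', 'auto');
fakeWindow.ga('send', 'pageview');

// Each queue entry is the arguments object of one ga(...) call.
const queued = fakeWindow.ga.q.map((args) => Array.from(args));
console.log(queued);
```

The queue ends up holding the `create` and `send` calls, and nothing from the outer function's parameter list.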

~~~
lucideer
Yeah, I'm surprised no one had pointed this out.

For all those here mentioning that he's only "deconstructed" the snippet, and
not the actual main linked analytics script, you're the first to observe that
he's deconstructed the snippet incorrectly.

Don't get me wrong, it's really great and heartening to see someone new to JS
getting back to basics and trying to understand things in proper minute detail
(instead of doing another tutorial about a boilerplate React app), but this is
hardly HN front-page material? It's a very simple JS snippet.

------
iakh
Tried to search but nothing obvious showed up. What's the purpose of the 1 in
'1 * new Date()'?

Edit: found it. Gets the timestamp[1]

1\.
[https://stackoverflow.com/questions/24182317/multiplication-...](https://stackoverflow.com/questions/24182317/multiplication-with-date-object-javascript)

~~~
rubyfan
Right, back to the point of saving characters/bytes at scale: it's a cheaper
way to call getTime().

~~~
jackmoore
And if you don't need to support IE8 or lower, you can use Date.now()
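
A quick sketch showing that the variants mentioned in this subthread agree with each other. The fixed date is arbitrary and only there to keep the comparison stable.

```javascript
const d = new Date(2017, 5, 28); // arbitrary fixed date

const viaMultiply = 1 * d;       // multiplication coerces the Date to a number
const viaUnaryPlus = +d;         // unary plus does the same with one character less
const viaGetTime = d.getTime();  // the explicit spelling

console.log(viaMultiply === viaGetTime, viaUnaryPlus === viaGetTime);

// Date.now() skips creating a Date object altogether.
const nowMs = Date.now();
console.log(typeof nowMs);
```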

------
mrschwabe
On a related note, there is an open source Google Analytics alternative in the
works:

[https://github.com/vesparny/fair-analytics](https://github.com/vesparny/fair-analytics)

Hope it comes along; there is surprisingly not much else in the way of Node/JS
based open source web stats/analytics solutions.

~~~
extra88
Piwik is an open source GA alternative [0]. It doesn't share many of the goals
of Fair Analytics. I also don't see much actual analytics in Fair Analytics,
just data collection.

[0] [https://piwik.org](https://piwik.org)

~~~
greenhouse_gas
The thing with Piwik is that it:

1\. Has a large footprint (won't run on tiny instances)

2\. Relies on blockable user-side JS.

Is there something like that that just analyzes server logs?

~~~
extra88
Back when we walked 5 miles uphill both ways to get to school, server log
analysis was all there was. I think AWStats [0] was the most prominent.

But Piwik can use methods other than JavaScript in the browser, including
importing server-side logs [1].

[0] [https://awstats.sourceforge.io](https://awstats.sourceforge.io)

[1] [https://piwik.org/faq/new-to-piwik/#faq_63](https://piwik.org/faq/new-to-piwik/#faq_63)

------
chrismorgan
I have minimised it in my own case to just this (using ga.js instead of
analytics.js because this part is shorter—regardless of the fact that ga.js
itself is longer—and not worrying about any fanciness that I don’t need):

    
    
      <script>_gaq=[['_setAccount','UA-????????-?'],['_trackPageview']]</script><script async src=//ssl.google-analytics.com/ga.js></script>
    

With analytics.js:

    
    
      <script>ga={q:[['create','UA-????????-?','auto'],['send','pageview']],l:+new Date}</script><script async src=//www.google-analytics.com/analytics.js></script>

------
throwaway2016a
The reason it is a function that calls itself is to not pollute the global
namespace. The rewritten expanded example leaks "gaScript" to the global
namespace which is a little rude if you are injecting your script onto a
third-party page and using a random or long function name is not as clean a
solution since it still leaves dirt and takes more bytes.
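
A sketch of the leak, assuming the expanded version is run as a top-level classic script. Indirect eval is used here only to simulate that outside a browser, and the object literals are placeholders for the real script element.

```javascript
// Indirect eval runs code in the global scope, like a top-level classic <script>.
const runAsTopLevelScript = (source) => (0, eval)(source);

// Expanded style: the helper variable becomes a property of the global object.
runAsTopLevelScript("var gaScript = { src: 'analytics.js' };");

// IIFE style: the same value lives in a parameter, local to the function.
runAsTopLevelScript("(function (a) { a = { src: 'analytics.js' }; })();");

const leakedGlobal = typeof globalThis.gaScript;  // the expanded style leaked
const leakedParameter = typeof globalThis.a;      // the IIFE parameter did not
console.log(leakedGlobal, leakedParameter);
```

That is also why the minified snippet declares extra parameters instead of using `var`: they stay local for free.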

------
j_s
Another interesting aspect of Google Analytics is understanding how fraudsters
sneak by, and the discussions here on HN whenever this comes up.

Examples:

· Hackers Make $5M a Day by Faking 300M Video Views |
[https://news.ycombinator.com/item?id=13219871](https://news.ycombinator.com/item?id=13219871)
(6 months ago)

· Uncovering an advertising fraud scheme |
[https://news.ycombinator.com/item?id=2333824](https://news.ycombinator.com/item?id=2333824)
(6 years ago)

Discussions:

· Alleged $7.5B fraud in online advertising |
[https://news.ycombinator.com/item?id=9796102](https://news.ycombinator.com/item?id=9796102)
(2 years ago)

· Inside Google's Secret War Against Ad Fraud |
[https://news.ycombinator.com/item?id=9628967](https://news.ycombinator.com/item?id=9628967)
(2 years ago)

------
Exuma
Here's something fun for people who want to customize what it says instead of
'isogram' (a fun little easter egg)

[https://isogrammer.com/](https://isogrammer.com/)

------
xg15
> _Interestingly, the arguments passed to the function spell out i, s, o, g,
> r, a, m, is a “term for a word or phrase without a repeating letter”
> (source), which I guess makes sense, given that the script looks like it has
> the bare minimum of characters possible._

I thought that was a kind of recursive joke (not necessarily a very witty one)
: if they want to make the parameters spell out a word, the word _has_ to be
an isogram, otherwise you'd define a parameter twice. And since "isogram" is
an isogram, maybe the temptation was too great..

------
GrayShade
Looks like the insertBefore call got dropped along the way.

------
ggambetta
> It seems like a and m are optional arguments.

> Now those unused parameters a and m come in handy. No writing var in this
> script.

Precisely. A trick to reduce the character count, nothing more. Love doing
this kind of thing
([http://gabrielgambetta.com/tiny_raytracer.html](http://gabrielgambetta.com/tiny_raytracer.html)).

> Making a.async truthy ensures

"Truthy"???

~~~
kevinmannix
In JavaScript, "falsy" is null, undefined, 0, '', or false. Everything else is
truthy, including empty objects and strings containing only whitespace.

Many bugs have been caused by incorrect uses of truthy and falsy.

~~~
venning
Also `NaN`.
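
Putting the two comments together as a quick check. The list of "surprises" is just a handful of values that commonly trip people up, not an exhaustive set.

```javascript
// Values that coerce to false in a boolean context.
const falsyValues = [false, 0, -0, 0n, '', null, undefined, NaN];

// Values that look empty or negative but are truthy.
const truthySurprises = [{}, [], ' ', '0', 'false'];

const allFalsy = falsyValues.every((v) => !v);
const allTruthy = truthySurprises.every((v) => Boolean(v));
console.log(allFalsy, allTruthy);

// The snippet's `a.async = 1` relies on 1 being truthy.
const asyncFlagIsTruthy = Boolean(1);
console.log(asyncFlagIsTruthy);
```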

------
janneklouman
A tool that generates tracking codes with a customised isogram as parameters:
[https://github.com/shinnn/isogram](https://github.com/shinnn/isogram)

Example: (function(i, s, o, g, r, a, m) {...}) -> (function(y, c, o, m, b, i)
{...})

~~~
pc86
I can't think of anything less useful/productive/meaningful.

~~~
packetslave
this comment, for a start.

------
jagthebeetle
I'm not sure if it's actually intentional, but given the command-queue nature
of the ga function, i[r].q's resemblance to IRQ could be a cute reference.
(Assuming i and r were fixed, p=1/26 for using q... science!)

(Originally noted in some old article that I can't find right now.)

------
johop
I also liked this deconstruction:
[http://code.stephenmorley.org/javascript/understanding-the-g...](http://code.stephenmorley.org/javascript/understanding-the-google-analytics-tracking-code/)

------
krallja
> An unminified and less convoluted version of the script might look like
> this:

(references to undefined variables `a` and `m` in some sort of IIFE)

------
dna_polymerase
Sherlock here discovered that it asynchronously loads the actual script, which
he conveniently did not bother to explore further.

~~~
nobleach
Yes, and when I was 5, I took apart my parents rotary telephone and figured
out how it worked. I'm sure there was a document somewhere that already
explained it... but for me, actually "discovering" it led to a new level of
understanding. I'm confused by your use of "Sherlock" as a pejorative. I
actually like the author's use of deduction as a learning tool... instead of
just vomiting knowledge read elsewhere.

~~~
Ajedi32
> I'm confused by your use of "Sherlock" as a pejorative.

It's not a pejorative, it's sarcasm. You could replace "Sherlock" with "This
genius" in that sentence for essentially the same effect.

I'm not really sure why, but for some reason "Sherlock" has become very common
to use in a sarcastic context like that. Kinda like how "fat chance" and "slim
chance" now mean basically the same thing due to "fat chance" almost always
being used sarcastically.

~~~
nobleach
Regardless... even though what the author discovered is common knowledge to
those who've done some work with GA tracking (or perhaps simply researched
interesting ways to add elements to the DOM), I applaud the curiosity that
led to this blog post.

~~~
vlasev
Nah. The author did the equivalent of saying they would tell us how the iPhone
works, but got as far as just opening the box it comes in and powering it on.

