The origin story of Google Analytics (urchin.biz)
397 points by auston on Nov 18, 2016 | 53 comments

That was a great read! I've been doing this long enough that I still complain about how much I hate GA compared to the old Urchin. I used Urchin for years and loved it. I was all excited to see what it had become as it changed to GA, and it was just... blah. It went from this super incredibly powerful tool that gave me access to all sorts of stats I needed, to this... I dunno... seemingly random assortment of things that seemed like they would only excite someone who worked at Google or had AdWords on their site. That was, of course, probably the goal, so I suppose they succeeded, but damn, I still miss the old Urchin numbers I'd see every day. They were useful for webmasters and sysadmins. GA is useful for those trying to see more ads (not that there's anything wrong with that, but that ain't me).

Funny, I love the history of Urchin, but I think the Measurement Protocol is one of the best things that has happened to GA. Also, the User Explorer interface is great for understanding how individual users interact with your site.

Can you give any examples?

Also the reason why you sometimes see utm_campaign=... (and other utm_*) appearing in URL query strings.
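For anyone curious what those parameters look like to a script, here is a minimal Python sketch that pulls the utm_* keys out of a query string. The URL and the helper name `extract_utm` are made up for illustration; only the `utm_` prefix comes from the thread.

```python
from urllib.parse import urlsplit, parse_qs


def extract_utm(url):
    """Return the utm_* query parameters of a URL as a plain dict."""
    query = urlsplit(url).query
    params = parse_qs(query)
    # parse_qs maps each key to a list of values; keep the first one.
    return {k: v[0] for k, v in params.items() if k.startswith("utm_")}


# Hypothetical example URL, not taken from the article.
example = (
    "https://example.com/landing"
    "?utm_source=newsletter&utm_medium=email&utm_campaign=launch&ref=hn"
)
print(extract_utm(example))
```

Anything that doesn't start with `utm_` (like `ref` here) is simply ignored.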


I like the different acronym mistakes people always (and very authoritatively) make for utm. "It stands for Universal Task Module" "Nope, it stands for Universal Tracking Monitor!" ... inevitably the old guy jumps in, "Kids, let me tell you about back in the day when we used Urchin"

For a while the JS that Google handed off to you was "urchin.js"

For years, actually. I kind of chuckled when they finally changed it.

btw, they still have the urchin.js file on GA servers


Urchin 4 continued our tradition of supporting way, way too many random platforms (Google still has Urchin 4 help: check out the OS support… ever heard of Yellow Dog Linux?).

Heck yeah! Yellow Dog was one of the first GNU/Linux distros I attempted to use, back in 2k2 (Or was it SUSE 6.4? Both ran too sluggishly for desktop use on my 5400/120, tho.)

Here's the trial Urchin 5 for RedHat 6, btw. https://web.archive.org/web/20060223041140/http://download.u...

While many web users don’t know that utm in a URL stands for Urchin Traffic Monitor, there are also Red Hat users who don’t realise that yum stands for Yellowdog Updater, Modified.


I not only remember it, it still makes me sick that they've replaced "yum", a project with a name that rolls smoothly off the tongue like no other, with "dnf", a project name that has all the grace of a running camel. (Also, it makes me sad that Seth's lovely project will be forgotten now that he's gone.)

I remember Yellow Dog Linux ONLY because it was the official Linux Distribution supported by the PS3.

Didn't use it enough to know how it distinguished itself from other distros beyond that, but that's enough of a distinction for me!

They live on in a small tool they built... The Yellowdog Updater, Modified.

Now just known as yum.

Now superseded by dnf on Fedora, sadly.

Does anyone prefer the name dnf? I will likely never agree that it should have been renamed to dnf, even though it represents a significant amount of rewriting. yum is an incomparably better name. "dnf" is even worse than "apt".

DNF always makes me think of the fact that in racing, DNF stands for Did Not Finish.

yum wasn't renamed dnf -- dnf is a new, yum (mostly) command line compatible replacement.

They share quite a bit of code.

I've made that mistake in our products. Hell, I'd nearly finished Gentoo support before realizing I was going down a stupid rabbit hole and the only prize was supporting the product for a half dozen grouchy users, and only one of them actually ever pays for software.

We support three distros now (the obvious three), and politely encourage users of other distros who really don't want to use one of the big three to fork it and add support themselves (with our blessing and encouragement).

Ah, Yellow Dog... I remember getting that running on a PowerBook way back in the Mac OS 9 days. What a pain, and X11 was always a bit glitchy.

The lessons learned at the bottom are fantastic. This is a must-read.

I wonder what Analytics uses to store data these days. That's an awful lot of data they've got coming in.

It is, but it can also easily be sharded by customer, so it's not as scary.
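To illustrate what sharding by customer could mean in practice, here is a rough Python sketch that maps a customer ID to one of a fixed number of shards with a stable hash. This is a generic technique, not a claim about how Analytics actually stores data; `NUM_SHARDS`, `shard_for`, and the sample IDs are all hypothetical.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count


def shard_for(customer_id, num_shards=NUM_SHARDS):
    """Map a customer ID to a shard index in [0, num_shards)."""
    # Use a cryptographic hash rather than Python's built-in hash(),
    # which is randomized per process and so not stable across runs.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical customer IDs; every hit for one customer lands on one shard.
for cid in ["customer-1", "customer-2", "customer-3"]:
    print(cid, "->", shard_for(cid))
```

Since each customer's reports only ever touch that customer's data, no query has to cross shards, which is presumably why the volume is less scary than it sounds.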

big table.

Or Spanner?

Nope, big table.

Nah, Bigtable is a row store (too much overhead). They probably use a column store. Do you have a link?

> But the real killer feature IMHO, which was released in Urchin 6, was individual visitor history drill-down. If this sounds potentially, um, sensitive, that’s because it is. Google wouldn’t touch this feature and it was summarily axed, never to return.

Any word on what this feature was? And do others currently offer an equivalent?

> And do others currently offer an equivalent?

And this here is why I use ad blockers.

This is what marketing automation (MA) software does today (Marketo, Pardot, etc).

It cookies every user and then collates site traffic data per cookie, so you can look up a single visitor and follow all their activity on your site. In contrast, software like Google Analytics collates data only by non-personal factors like URLs, browser, referrer, etc.
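A toy Python sketch of the difference, assuming hits arrive as (cookie ID, timestamp, URL) tuples. The records and function names are invented for illustration and don't reflect any particular product's schema.

```python
from collections import defaultdict

# Hypothetical hit records: (visitor_cookie_id, ISO timestamp, url)
hits = [
    ("cookie-a", "2016-11-18T10:00:00", "/pricing"),
    ("cookie-b", "2016-11-18T10:01:00", "/"),
    ("cookie-a", "2016-11-18T09:58:00", "/"),
    ("cookie-a", "2016-11-18T10:03:00", "/signup"),
]


def visitor_histories(hits):
    """Per-visitor view: every page each cookie saw, in time order."""
    by_visitor = defaultdict(list)
    for cookie_id, ts, url in hits:
        by_visitor[cookie_id].append((ts, url))
    # ISO 8601 timestamps in one format sort correctly as strings.
    return {cid: sorted(events) for cid, events in by_visitor.items()}


def pageview_counts(hits):
    """Aggregate view: hits per URL, with the visitor identity dropped."""
    counts = defaultdict(int)
    for _, _, url in hits:
        counts[url] += 1
    return dict(counts)


print(visitor_histories(hits)["cookie-a"])
print(pageview_counts(hits))
```

The first function is the "follow one visitor" drill-down; the second is the aggregate-only style of reporting.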

Guessing it showed all the touch points of a given user. Most companies with deeper analytics have this first party data stored in their DB somewhere. Many CRMs can also provide this.

It might take the form of a time stamped list of event data, page view data and categorized referred data.

No one touches PII

I'm not so sure that's true, especially when you leave the ad-tech arena and move on to the app monitoring arena, where none of the PII data would leave the house (or there is certainly less of an incentive for the PII data to leave the house).

Piwik offers per user session tracking.

I really like this kind of post-mortem story; it doesn't do that thing that so many entrepreneurs do where they make it seem like they knew what they were doing all along. This makes it clear that for every great idea or success they had, there were setbacks that made it plausible that less steadfast individuals might have let it fold before they reached the goal.

It's also nice to see a story with a realistic timeline; most acquisitions happen something like 7-10 years out. Overnight successes are extraordinarily rare, but get a lot of attention, because they make it seem easy, and people like to think that if they have a good idea, they can get rich overnight.

And I enjoy that the company was always small enough that everyone knew each other, even at the end, even though they had a gazillion installs.

> Warren Buffett lent Goldman Sachs $5 BILLION on a hand-written note.

Is this actually true? I googled it but couldn't find a source.

Token ring in 1995? I guess a lot changes in 20 years. Networking was very different then, ethernet cards were like $600 in '93


I'm a little dubious that it was actually Token Ring, based on the description. From "coaxial cable with fun twist-lock fittings" it sounds more like thin-wire Ethernet-- 10Base-2. That would be a lot lower cost and definitely more common than token ring in '95.

I supported a token ring network in 1998, but that was the emergency room network at a hospital in Arkansas.

You're right - it sounds like BNC.

Token ring delivered 16 Mbit/s, 6 Mbit/s more than regular 10Base-2 coaxial or twisted-pair Ethernet at the time.

Yeah, that was coax Ethernet.

Thin-wire Ethernet didn't use vampire taps, however. Everything was terminated in BNCs.

"Now that I’ve moved out of San Francisco, I wear Google shirts a lot more often."

This made me laugh.

Have they fixed the referrer spam problem yet?

You can add a view and add a filter to that view which filters requests which have domains of your choice in them. Add the domain(s) your site is hosted on and you'll get rid of most referrer spam; it seems like they call the GA API directly with your UA code, without providing a destination host.

Caveat: the filter did not apply retroactively to my data and only started working from the date of its inception.
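The logic of that hostname filter, sketched in Python over some made-up hit records. The field names and `ALLOWED_HOSTS` are assumptions for illustration; the real filter is configured in the GA admin UI, not in code.

```python
# Hypothetical hit records; real GA hits carry many more fields.
hits = [
    {"hostname": "www.example.com", "referrer": "news.ycombinator.com"},
    {"hostname": "(not set)", "referrer": "spam-site.example"},
    {"hostname": "example.com", "referrer": "google.com"},
    {"hostname": "", "referrer": "free-traffic.example"},
]

# The domain(s) your site is actually served from (assumed values).
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def keep_own_hosts(hits, allowed=ALLOWED_HOSTS):
    """Keep only hits whose hostname is one of your own domains."""
    return [h for h in hits if h.get("hostname", "").lower() in allowed]


for hit in keep_own_hosts(hits):
    print(hit["referrer"])
```

Spam hits sent straight to the collection endpoint tend to have a missing or bogus hostname, so an include-list on hostname drops them without having to enumerate spammy referrers.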

If you do it with a segment instead of a filter, the effect is retroactive.


Why all this sudden google news outburst? Why?

My best guess is to drown out the story of them nuking the Pixel seller's account (which was just reinstated because of the bad press).

It was a big reminder to people how much trust they put in Google with their data and how little recourse they really have against a faceless algorithm.

Great read! Thanks!

Long path up to just getting blocked by no-script

Yeah, go figure right..
