

Why we'll use Google Universal Analytics over Mixpanel and KISSMetrics - duwip
http://blog.fleex.tv/post/56429461771/universal-analytics-and-user-centric-analytics-why

======
snide
As someone who has built several top-1000 trafficked websites over the past
decade, here is what the publishing industry definitely needs out of an
analytics program.

1\. Please give me a report that can prove that my user traffic is real.

2\. Please give me a report that can prove that the traffic is healthy.

I know that I can get this from analytics now, but it needs to be the focus.

For a decade I've competed against content websites that for the most part
game SEO traffic, build click traps and generally pollute the Internet with
secondary source content. I've always had fairly large audiences on my sites,
with healthy 50% returning visitor rates. However, when it comes to getting ad
dollars, I always lost to competitors who had much larger volume, mostly
because they were either buying meaningless inbound links or using some other
scam like click trap "we recommend this hot girl talking about prostate
cancer" photos to goose their numbers. Meanwhile we'd create quality content
and my sites would have hundreds of comments, while theirs would have very
few. It didn't matter that my audience was more engaged; advertisers bought
volume.

I just need something that I can show to an advertiser (or even better, that
they have access to and can compare) that says... hey, this website isn't a
constructed fabrication made to fake volume and take your money you sucker.
This is a real website.

A lot of the industry right now is based upon buying links from aging front
door portals (Yahoo, MSN, AOL) which still do ungodly amounts of traffic with
a mostly Internet illiterate audience. Sites buy these links, convert them
into CPM click traps on their targeted magazine sites and sell their inventory
to advertisers who don't know that the whole thing is a shell game. They think
they're buying ads on a hot new site with explosive growth.

~~~
alexatkeplar
Hey snide - I think we can probably help you at Snowplow Analytics. We
warehouse all your atomic event data (including page views and in-page pings,
which are very hard to fake) with IP address, browser fingerprint, 1st party
cookie, optional 3rd party cookie, optional business-defined user ID, user
timezone, browser features, useragent... If that sounds useful for proving
your audience to advertisers, get in touch!

~~~
corin_
After seeing Snowplow mentioned a few times on HN in the last week, each time
I've thought "hmm, looks maybe interesting... but I don't have time to figure
out what it is or how to use it". I've finally just now seen the starting
guide, so I will probably play around with it sometime soon.

So one piece of feedback is to maybe try and make it easier/more obvious how
to go from "this might be interesting" to "what can this do for me?" (I'm
still not 100% sure).

------
joevandyk
I started saving all my page views in a postgresql database. Schema is pretty
simple.

I have the following tables:

    sessions
      session_id (uuid type)
      created_at

    page_views
      page_view_id
      session_id
      created_at
      site_id
      path
      query_string (hstore)
      user_agent
      referral_url
      ip_address
      user_id
      http_method (get, post, etc)
      details (hstore, used to tag page views/actions)

This allows me to simply query all my page views against data in my live
database. I can see the path a user took to place an order. I can easily
integrate a/b tests. If someone uses a coupon on the site and we want to see
if they later came back and viewed/purchased more, we can easily write a sql
query to figure that out. We can simply figure out lifetime customer value,
even if not logged in. If we're getting a large amount of traffic from a
certain affiliate, we can alert our staff.
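
For anyone who wants to see the shape of this, here is a minimal sketch of the "path a user took to place an order" query. It is not the setup described above: it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, keeps only a handful of the columns, and the helper names (`build_db`, `record_view`, `path_to_order`) and sample paths are made up for illustration.

```python
import sqlite3
import uuid


def build_db():
    """Create an in-memory database with a trimmed-down version of the schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sessions (
            session_id TEXT PRIMARY KEY,
            created_at TEXT NOT NULL
        );
        CREATE TABLE page_views (
            page_view_id INTEGER PRIMARY KEY,
            session_id   TEXT NOT NULL REFERENCES sessions(session_id),
            created_at   TEXT NOT NULL,
            path         TEXT NOT NULL,
            user_id      INTEGER
        );
        """
    )
    return conn


def record_view(conn, session_id, created_at, path, user_id=None):
    """Create the session on first sight, then store the page view."""
    conn.execute(
        "INSERT OR IGNORE INTO sessions (session_id, created_at) VALUES (?, ?)",
        (session_id, created_at),
    )
    conn.execute(
        "INSERT INTO page_views (session_id, created_at, path, user_id) "
        "VALUES (?, ?, ?, ?)",
        (session_id, created_at, path, user_id),
    )


def path_to_order(conn, session_id):
    """Return the ordered list of paths viewed in one session."""
    rows = conn.execute(
        "SELECT path FROM page_views WHERE session_id = ? "
        "ORDER BY created_at, page_view_id",
        (session_id,),
    )
    return [row[0] for row in rows]


conn = build_db()
sid = str(uuid.uuid4())
record_view(conn, sid, "2013-07-26T10:00:00", "/")
record_view(conn, sid, "2013-07-26T10:01:00", "/products/42")
record_view(conn, sid, "2013-07-26T10:03:00", "/checkout", user_id=7)
print(path_to_order(conn, sid))
```

Because the page views sit next to the rest of the application data, the same pattern extends to joins against orders or coupons.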

It's really awesome to be able to have your data in the same place. Having
analytics data spread out to GA made it difficult to match that data against
ours. If we need to scale out to multi-terabytes, postgres_fdw will make
querying against the analytical database simple.

Since we're also tracking affiliate purchases to pay out commissions, I also
have another table that stores additional information about a page view
if they came from an affiliate site (click id, the affiliate network, etc).

Here's the plpgsql function I use for saving the sessions and page views:
[https://gist.github.com/joevandyk/f63523cdd1a3aa75d0ec](https://gist.github.com/joevandyk/f63523cdd1a3aa75d0ec)

~~~
duwip
Yeah, we do that kind of stuff as well. At least you know what your data
means. But when you start getting millions of hits a day, you won't
necessarily want to spend some time scaling your system... In that case
leaving it to the pros and focusing instead on your product may prove the most
sensible move.

~~~
joevandyk
It should be pretty easy to scale out a simple set of data like this.

"Leaving it to the pros" means you don't control your data and you can't
easily combine it with your other data about products, orders, whatever.

------
mikeknoop
The last paragraph is important. I spent some time on this earlier this week
when I learned about Universal Analytics, but quickly discovered that UserID
tracking hasn't shipped yet.

Can anyone on the GA team speculate about a release date for the uid bits?

~~~
hu_me
The userId bit has been there from the start in Universal Analytics. It's
called custom dimensions and can be used to send any property about the user
into GA and then link it to a User or a specific Visit.

[https://developers.google.com/analytics/devguides/collection...](https://developers.google.com/analytics/devguides/collection/analyticsjs/custom-dims-mets)

~~~
mikeknoop
OP (and the article) refer to the uid tracking mentioned here:
[https://groups.google.com/forum/m/#!msg/google-analytics-mea...](https://groups.google.com/forum/m/#!msg/google-analytics-measurement-protocol/rE9otWYDFHw/7ThqY5swy3oJ)

As of Jun 27 it hasn't shipped according to a GA team member.

~~~
hu_me
I had missed that, thanks for sharing. I have been using a custom dimension
for the uid in our client projects and it's worked out well, though a
dedicated API method is always welcome.

~~~
duwip
Custom variables don't let you consolidate on users though? In the visitor
count, for instance, I don't think there's a way to tell GA to use a custom
var to distinguish between visitors.

------
j_s
I was not aware that the new analytics would track users. One interpretation
of section 7 of the Google Analytics Terms of Service is that tracking
individuals is not allowed:

[http://www.google.com/analytics/terms/us.html](http://www.google.com/analytics/terms/us.html)

    
    
      > You will not [...] use  the Service to track, collect or 
      > upload any data that personally identifies an individual 
    

[http://productforums.google.com/forum/#!topic/analytics/tTaq...](http://productforums.google.com/forum/#!topic/analytics/tTaqssN7sY8)

    
    
      > you cannot store names or ip addresses in a custom var, 
      > but you can store ids that need your backend to resolve 
      > into a person identification

~~~
Brandon0
Tracking an individual is different from storing personally identifiable
information. As a way to track you, I can assign you an arbitrary (or
seemingly arbitrary) userID that is unique to you but does not personally
identify you. This arbitrary userID is meaningless to any third parties. What
I cannot assign you is your name, email address, or even IP address as a way
to track you, since anyone who sees that information could figure out who it
belongs to.
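
A rough sketch of one way to produce such a "seemingly arbitrary" ID, assuming you already have an internal numeric user ID on your backend. The `SECRET_KEY` value and the `opaque_user_id` name are hypothetical, and nothing here is a claim about what the Terms of Service do or do not allow; it only illustrates an identifier that your backend can recompute for a known user but that reveals nothing by itself.

```python
import hashlib
import hmac

# Hypothetical server-side secret; it must never be sent to the analytics service.
SECRET_KEY = b"replace-with-a-server-side-secret"


def opaque_user_id(internal_user_id: int) -> str:
    """Derive a stable, non-reversible identifier from an internal user ID."""
    digest = hmac.new(
        SECRET_KEY,
        str(internal_user_id).encode("utf-8"),
        hashlib.sha256,
    )
    # Truncate the hex digest to keep the identifier compact.
    return digest.hexdigest()[:32]


print(opaque_user_id(42))
```

The same input always maps to the same ID, so visits can be tied together, while turning the ID back into a person requires your own backend and the secret.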

------
jamiequint
This article is really making a big deal out of nothing. All the "major
issues" brought up here only create problems in edge cases. When you're trying
to drive growth or understand your users (the purpose of metrics at the end of
the day) you should not be focused on edge cases.

In most cases the reason you care about tracking logged-out -> logged-in
behavior is to measure onboarding behavior, understanding what the user does
pre-signup so you can do a better job of driving signups. Signup is not a
multi-client process in the common case so being able to track multi-client
behavior pre-signup doesn't really matter at all.

~~~
duwip
Agreed, these are edge cases. They did create a lot of questions for me
though, and made the whole thing rather confusing as a user.

As to how much of an issue these edge cases represent, I find it hard to get a
real sense of it. I guess it really depends on the situation, what you want to
measure and the user experience you offer to your visitors.

------
taf2
My gripe about Google Universal Analytics (analytics.js vs ga.js) is broken
backwards compatibility: cookie data is no longer stored in the same way, and
this was an interface many add-ons/systems have used and depended on since
the days of Urchin.

Otherwise, the new interface is pretty slick, the features look good, and the
API to send data server side is so much nicer.

Broken compatibility just kinda sucks, though.

------
jdangu
> For one, there can’t be 2 [clientID, userID] couples with the same userID:
> with the way mixpanel does things, this is essentially a technically
> impossible scenario (...) And yet one user can access your site through
> different clients, leading to a systematic overestimation of the number of
> visitors hitting your site.

Really? Can anyone confirm this behavior? I'm pretty sure KissMetrics doesn't
have this limitation.
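
To make the overestimation in the quote concrete, here is a toy sketch with made-up hits. Each hit is a hypothetical `(clientID, userID)` pair, and the two functions only contrast counting visitors per client with counting them per known user; they say nothing about how any of these products count internally.

```python
# Made-up hits: (clientID, userID); userID is None for logged-out traffic.
hits = [
    ("client-laptop", "user-1"),
    ("client-phone", "user-1"),   # same person, second device
    ("client-tablet", "user-2"),
    ("client-kiosk", None),       # never logged in
]


def count_by_client(hits):
    """Count visitors as distinct clients (browsers/devices)."""
    return len({client for client, _ in hits})


def count_by_user(hits):
    """Count visitors as distinct users, falling back to the client when unknown."""
    return len({user if user is not None else client for client, user in hits})


print(count_by_client(hits), count_by_user(hits))
```

Here the per-client count is one higher than the per-user count, because user-1 shows up on two devices.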

~~~
losvedir
Indeed, and this is why we ended up choosing KM over MP. With KM you just
"identify" a visitor whenever you want and if there's already another
anonymous cookie, it'll tie together all events retroactively. We couldn't
find an easy way to do this with MP when we looked at it.
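
For illustration only, the retroactive tie-in can be sketched like this. This is a toy in-memory model and not KM's actual implementation; the `EventStore` class and its `track`/`identify` methods are names invented for the example.

```python
from collections import defaultdict


class EventStore:
    """Toy event store that merges anonymous history into a known user."""

    def __init__(self):
        self.events = defaultdict(list)  # identity -> list of event names
        self.alias = {}                  # anonymous id -> user id

    def resolve(self, identity):
        """Map an anonymous id to its user id if it has been identified."""
        return self.alias.get(identity, identity)

    def track(self, identity, event):
        """Record an event under the resolved identity."""
        self.events[self.resolve(identity)].append(event)

    def identify(self, anonymous_id, user_id):
        """Tie an anonymous id to a user and move its past events over."""
        if anonymous_id == user_id:
            return
        self.alias[anonymous_id] = user_id
        self.events[user_id].extend(self.events.pop(anonymous_id, []))


store = EventStore()
store.track("cookie-a", "viewed landing")
store.identify("cookie-a", "user-1")   # first device
store.track("cookie-b", "viewed pricing")
store.identify("cookie-b", "user-1")   # second device, same user
store.track("cookie-b", "purchased")
print(store.events["user-1"])
```

Because `identify` can be called for more than one anonymous ID, events from several clients end up under a single user.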

~~~
duwip
Yep, it would seem that KISSMetrics has a better implementation where aliasing
can be called several times (as stated here:
[http://support.kissmetrics.com/apis/common-methods.html](http://support.kissmetrics.com/apis/common-methods.html)).

The fact that it links accounts retro-actively though can be dangerous, in the
scenario of publicly-accessed devices. I'll have to admit though, this is not
the common case.

I guess my personal gripe with what MP and KM are doing boils down to: if you
can't infer stuff about who is visiting my website, be honest about it and
don't.

------
KaoruAoiShiho
Last I checked, user-based analytics is directly against the Google TOS. You
are not supposed to store any identifying information about specific users,
probably because Google has been under privacy scrutiny. So not only is Google
not built for user-based tracking, they prohibit it, making them a real
non-starter in any case.

~~~
duwip
Check out the Google I/O video I mention in the article if you need
convincing. As far as not collecting user data for privacy reasons, I think
Brandon0's comment says it all.

