

Ask HN: How do you capture and analyse usage patterns on your RIA? - swombat

Hello everyone,

I want to answer the question: "What is the user activity most likely to lead to the user upgrading?"

What are the best practices for capturing potentially large amounts of usage data (e.g., capturing every click on every button), storing it cost-effectively without swamping your main app db, and then running reports/analysis on that data?

Is it best to log this stuff to text and then run a parser to load it into a separate db? To use external services (which ones work best, and why)? To store it in your main db? Is it best to capture ALL the data and then figure out what you want to extract from it later? Is that practical/realistic? Or is it better to focus on specific hypotheses and record only those specific data points?

I'm particularly interested in these questions in the context of a Rich Internet Application, i.e. something that looks more like a desktop app than a website. Website analytics are fairly well documented, with plenty of articles, tools, etc., but I haven't found much on doing this kind of tracking for complex RIAs that don't have simple and/or obvious conversion funnels.

For example, if you built an application like Huddle, how would you record data to figure out interesting usage patterns and which ones are more likely to lead to a purchase?

Thanks for any input/insight!
======
patio11
I'm kind of a metrics junkie.

One thing I've learned over the years: tracking data is easy. You can generate
a virtual firehose of noise just by snapping your fingers. Tracking data which
actually drives decisions meaningful for the business is a bit trickier.

I use a mix of Mixpanel, the DB, and key/value stores. I have log files, too,
but log files are where data goes to die.

In general my core activity loop is generating a hypothesis, asking what I
need to answer the hypothesis, either building just what I need to capture
that data or a more generalized system (particularly when I keep asking the
same kind of questions -- this is how I ended up writing my own A/B testing
framework), then analyzing the data and trying a few things in response to it.
The hardest part about drowning in data is understanding, in your bones, that
if it doesn't drive a decision at the end of the day, you're just wasting your
time. Even though the graph is pretty.

I generally get more bang for my buck out of simple hypotheses than complex
ones. Simple ones are easy to describe, easy to instrument, easy to test, easy
to improve, and likely to apply to a large portion of my users/business/etc.
Complex ones are, well, the exact opposite. For example, while it is within my
capabilities to target behavior at Mac-owning Firefox users who are interested
in Catholicism... I'm probably going to spend time working on buttons seen by
substantially everyone.
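
As an illustration of how small a "simple hypothesis" check can be, here is a minimal sketch that compares upgrade rates between users who performed some activity and users who did not. The function name, the `(user_id, activity_name)` event shape, and the set of upgraded users are all assumptions made for the example, not anything described above.

```python
def upgrade_rate_by_activity(events, upgraded_users, activity):
    """Compare upgrade rates of users who did `activity` vs. those who did not.

    events: iterable of (user_id, activity_name) pairs (hypothetical shape)
    upgraded_users: set of user_ids known to have upgraded
    """
    all_users = set()
    did_activity = set()
    for user_id, name in events:
        all_users.add(user_id)
        if name == activity:
            did_activity.add(user_id)
    did_not = all_users - did_activity

    def rate(group):
        # Fraction of the group that upgraded; 0.0 for an empty group.
        return len(group & upgraded_users) / len(group) if group else 0.0

    return {"did": rate(did_activity), "did_not": rate(did_not)}
```

A gap between the two rates is only a correlation, so it is a prompt for an experiment rather than a conclusion.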

I generally do the first cut of analysis by playing around in my Rails
console. When I find something interesting in playing around, I often promote
it to a graph somewhere on the backend.

I think if you count all the data points I've got filed away somewhere you'd
come up with a number in the low tens of millions, so my chief bottleneck is
less technical/scaling (3 million entries in a key/value store? Oh well,
iterate over all of them.) and more that the time, creativity, and attention
to ask the right questions are very limited.

------
bemmu
I record everything to the Google App Engine datastore, then have a script,
run from cron, that exports the data into a simple text file on my machine.
Then I can run Python scripts on my own computer to just go through the text.
I get about 60MB of events / day. Soon I will have to start deleting old
events from App Engine, because I have to pay Google for data store entities
stored.

At first I was planning on doing some clever incremental thing to get all the
stats on App Engine, but it's a lot simpler if you can just have everything
locally and rerun any analytics quickly when you make mistakes. It only takes
a few minutes to load a week's worth of data and go through it, so it doesn't
make sense to spend days trying to be really clever about doing it on App
Engine.
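
A rough sketch of what such a local pass over the exported text might look like. It assumes one JSON object per line with a `name` field, which is a guess at a convenient format and not necessarily the one used here.

```python
import json
from collections import Counter

def count_events(lines):
    """Tally events by name from an iterable of JSON lines (assumed format)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        counts[event["name"]] += 1
    return counts
```

Because an open file is itself an iterable of lines, something like `with open("events.txt") as f: count_events(f)` would process the export (the file name is hypothetical).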

My favorite thing to extract is kind of a life story of users. When a user
joins, what do they do on average in the first 24 hours? How about the next
day? This is a pretty different point of view from Google Analytics, because
it follows each user's own timeline rather than the history of the whole app.
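
One possible way to compute such a "life story", sketched under assumptions: events are `(user_id, event_name, unix_timestamp)` tuples and join times are available in a dict. Both shapes, and the function name, are made up for the example.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400

def life_story(events, joined_at):
    """Average count of each event per user, bucketed by days since joining.

    events: iterable of (user_id, event_name, unix_timestamp)
    joined_at: dict of user_id -> unix timestamp of joining
    Returns {day_index: {event_name: average count per joined user}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for user_id, name, ts in events:
        if user_id not in joined_at:
            continue  # no join time known for this user
        day = int((ts - joined_at[user_id]) // SECONDS_PER_DAY)
        if day < 0:
            continue  # event recorded before the join time; ignore
        totals[day][name] += 1
    # Note: this divides by ALL joined users, including users who have not
    # yet been around for `day` days, so later days are underestimated.
    n_users = len(joined_at)
    return {
        day: {name: count / n_users for name, count in names.items()}
        for day, names in totals.items()
    }
```

Day 0 of the result is the first 24 hours after joining, day 1 the next 24 hours, and so on.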

So does this point of view give you some new insight? It does, at least in our
case. For example, we are now testing which of two profile box designs is
better on our MySpace app. A simple stat on its own would show that the new
design has a better clickthrough rate. BUT, when I look at the whole story, it
seems that users are more likely to remove the app when they have this new
design, so the end result may be worse.
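
A minimal sketch of looking at both numbers side by side, assuming each user record carries a `variant` label plus `clicked` and `removed` flags. Those field names are invented for the example.

```python
def variant_summary(users):
    """Per-variant click-through and removal rates.

    users: iterable of dicts with keys 'variant', 'clicked', 'removed'
    (hypothetical field names; the flags are booleans).
    """
    summary = {}
    for u in users:
        s = summary.setdefault(
            u["variant"], {"users": 0, "clicked": 0, "removed": 0}
        )
        s["users"] += 1
        s["clicked"] += bool(u["clicked"])
        s["removed"] += bool(u["removed"])
    return {
        variant: {
            "click_rate": s["clicked"] / s["users"],
            "removal_rate": s["removed"] / s["users"],
        }
        for variant, s in summary.items()
    }
```

Reporting the removal rate next to the click rate makes it harder to declare a winner on clickthrough alone.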

Additionally, I use Google Analytics event tracking, but I end up looking more
at the life story than at those events, because it is difficult to pull out
how many events happen per visit, or how many events happen per visit in the
first 24 hours after a user joined, etc.

~~~
swombat
Would love more detail about how you do this... how exactly do you "record
everything to GAE data store"? Did you create a GAE REST app? Why GAE? What do
you record? Can you post an example of a few events that you record?

~~~
bemmu
The app server is already in App Engine, so it was easy to just add an Event
model and create an Event entity every time the user does anything that I
might want to track. This is what it looks like to browse the events in App
Engine admin panel: <http://i.imgur.com/yJTeZ.png>

The model is just 10 lines of code, and storing things there is even easier.
For events that happen on client side I have to do a silent ajax request to
the server to let it know what happened, similar to what Google Analytics
does, but mostly everything that happens in the app already is known by the
server.
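
For readers wondering what an event record of that size could look like, here is a sketch using a plain Python dataclass as a stand-in. It is not the actual App Engine model from this app, and every field name is an assumption.

```python
import json
import time
from dataclasses import dataclass, asdict, field

@dataclass
class Event:
    """Stand-in for a tracked event; all fields are hypothetical."""
    user_id: str
    name: str                      # e.g. "steal", "invite_sent"
    properties: dict = field(default_factory=dict)
    created_at: float = field(default_factory=time.time)

    def to_line(self):
        # Serialize as one JSON object per line for later export/analysis.
        return json.dumps(asdict(self), sort_keys=True)
```

Creating one is then a one-liner at the point where the action happens, e.g. `Event("u1", "steal", {"points": 5})`.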

------
geoffc
I'm using jQuery and Google Analytics to track this. Tie the interface actions
into different analytics tags and use Google to store and display the activity.

<http://www.thewhyandthehow.com/tracking-events-with-google-analytics/>

~~~
daveying99
Agreed. Basically, even when there are no page loads, GA allows you to
register events, actions, etc. to be tracked. You can send dynamic names to
Google Analytics using JavaScript, and everything will be visible when you
check your stats, in addition to the normal metrics like page views, etc.

------
ynniv
<http://mixpanel.com> (YC '09)

Basically, when something interesting happens, you make an AJAX call to a
logging service. Mixpanel then graphs events and conversions over time and
provides user profiling similar to Google Analytics.

You can also create virtual page views in Google Analytics by calling
pageTracker._trackPageview.

~~~
swombat
The problem with this is, what if you don't know what your funnels are? What
if you don't know what your key events are?

Recording _everything_ into Mixpanel would get very expensive very quickly.

~~~
ynniv
Hmm, what is the order of magnitude of your action volume? Worst case, you
might write them sequentially to log files via a trivial web service and deal
with them later, a la "tin" [ <https://nosqleast.com/2009/#speaker/anglade> ]
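
A sketch of what such a trivial web service could look like, written as a bare WSGI app using only the standard library. The JSON-per-line log format and the `make_logging_app` name are assumptions, and this is unrelated to how "tin" itself works.

```python
import json
import time

def make_logging_app(log_path):
    """Return a WSGI app that appends each POSTed JSON body to log_path."""
    def app(environ, start_response):
        if environ.get("REQUEST_METHOD") != "POST":
            start_response("405 Method Not Allowed",
                           [("Content-Type", "text/plain")])
            return [b"POST only\n"]
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)
        try:
            event = json.loads(body)
        except ValueError:  # covers malformed JSON and bad encodings
            start_response("400 Bad Request",
                           [("Content-Type", "text/plain")])
            return [b"invalid JSON\n"]
        record = {"received_at": time.time(), "event": event}
        # Append one line per event; parsing is deferred to a later batch job.
        with open(log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
        start_response("204 No Content", [])
        return []
    return app
```

For local experiments it can be served with `wsgiref.simple_server.make_server("", 8000, make_logging_app("events.log")).serve_forever()`. With several server processes writing to the same file, lines could interleave, so a real deployment would likely want one log file per process or a proper log collector.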

------
stevejalim
Why not try Google Analytics for Flash's event tracking? I've used that in the
past and it's been interesting

~~~
bemmu
It seems great in theory, but I've had difficulties getting out the info I
need. I'd like to know how many times per visit users trigger certain events.
For example, I have a feature in my app that lets users steal points from
other users. I can see the aggregate stats for a certain time period just fine
for the whole event category: <http://i.imgur.com/22kxL.png>

But if I click on a certain event, "steal" in this case, I get this:
<http://i.imgur.com/mXUtQ.png> It says total events: zero, even though the
graph shows something very nonzero. And there doesn't seem to be a way to see
how many times users used the "steal" function per visit. Or what percentage
of users use the "steal" function during their first month of using the app?
Maybe it's possible, but I haven't quite figured out how to get these answers
from GA.
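
With raw events stored outside GA, the first-month question becomes a short script. This is a sketch that assumes `(user_id, event_name, unix_timestamp)` tuples and a dict of join times; both shapes and the function name are hypothetical.

```python
def share_using_feature(events, joined_at, feature, within_days=30):
    """Percentage of joined users who triggered `feature` within
    `within_days` days of joining.

    events: iterable of (user_id, event_name, unix_timestamp)
    joined_at: dict of user_id -> unix timestamp of joining
    """
    window = within_days * 86400  # window length in seconds
    users = {
        user_id
        for user_id, name, ts in events
        if name == feature
        and user_id in joined_at
        and 0 <= ts - joined_at[user_id] < window
    }
    return 100.0 * len(users) / len(joined_at) if joined_at else 0.0
```

Each user is counted once no matter how many times they used the feature, which is what separates this from a raw event total.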

~~~
swombat
My concern is the same: there are a lot of questions I can ask my current,
relatively rudimentary system that I couldn't ask of GA.

