

Ask YC: How do sites like Compete, Alexa collect traffic estimates? - ashishk

I've always been curious.
======
ecaron
Compete & Alexa heavily rely on browser toolbars
(<http://www.compete.com/help/s11>) to estimate their data. They work hard to
normalize their data, but you can probably imagine that the demographic of
people with that kind of toolbar installed is heavily skewed (and some people
argue loudly against it: [http://venturebeat.com/2007/02/22/traffic-measuring-
continue...](http://venturebeat.com/2007/02/22/traffic-measuring-continued-
why-compete-doesnt-work-and-why-quantcast-does/))

Other sites, like Comscore, use various methods like router-level analysis &
client-side monitoring software (like Nielsen w/ TVs).

In the end, no method is 100% accurate and the best analysis comes from
looking at various sources & taking it all with a grain of salt.
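
To make the "normalize their data" part concrete, here is a minimal sketch of one common way to correct a skewed panel: post-stratification weighting. This is an illustration only, not Compete's or Alexa's actual method. The group names, the population shares, and the `estimate_visitors` function are all made up for the example.

```python
from collections import Counter

def estimate_visitors(panel, population_share, population_size, site):
    """Extrapolate visitors to `site` from a demographically skewed panel.

    panel: list of (group, set_of_sites_visited), one entry per panelist
    population_share: group -> assumed share of the real internet population
    """
    group_sizes = Counter(group for group, _ in panel)
    estimate = 0.0
    for group, share in population_share.items():
        n = group_sizes.get(group, 0)
        if n == 0:
            continue  # no panelists in this group, so it contributes nothing
        visitors = sum(1 for g, sites in panel if g == group and site in sites)
        # visit rate inside the group, scaled to the group's real-world size
        estimate += (visitors / n) * share * population_size
    return estimate

# Hypothetical panel: toolbar users over-represent the "tech" group (8 of 10).
panel = (
    [("tech", {"example.com"})] * 6
    + [("tech", set())] * 2
    + [("general", {"example.com"})]
    + [("general", set())]
)
population_share = {"tech": 0.25, "general": 0.75}  # assumed, not real data

weighted = estimate_visitors(panel, population_share, 1000, "example.com")
naive = sum(1 for _, sites in panel if "example.com" in sites) * 1000 / len(panel)
print(weighted, naive)  # 562.5 700.0
```

The naive extrapolation overstates the site because the over-sampled group visits it more often. The weighted number is only as good as the assumed population shares, which is one reason the results still need that grain of salt.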

~~~
ashishk
I remember seeing their browser toolbars; I didn't know that was the only
source of data.

It seems like other signals, such as search engine queries or the number of
inbound links to the site, might also be used. No?

~~~
ecaron
There's no way for them to get that interstitial data without having code
analyzing/reporting per-page on the site (either backend Apache-log analysis or
something like Google Analytics). Alexa did this for a while with their "embed
your site's score", but that was too easily gamed so it was discontinued as a
metric source.

------
ironkeith
<http://blog.compete.com/where-do-these-numbers-come-from/>

------
gaius
In the UK there is the ABC (Audit Bureau of Circulations IIRC) to whom you can
send your Apache logs and they'll give you an official number. This is where
magazine readership figures come from too. Having a neutral third party do it
is important for advertisers.
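
For a rough idea of what an auditor (or you) could pull out of those logs, here is a small sketch that counts daily page views and distinct client IPs from Apache access-log lines. It assumes the common/combined log format, and it treats a distinct IP as a "visitor", which is crude (proxies, NAT, and bots all distort it). The `summarize` function and the sample lines are made up for illustration and are not ABC's procedure.

```python
import re
from collections import defaultdict

# Matches the start of a common/combined format line:
# ip ident user [day:time zone] "METHOD path protocol" status size ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)

def summarize(lines):
    """Return {day: (page_views, distinct_ips)} for successful GET requests."""
    views = defaultdict(int)
    visitors = defaultdict(set)
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m["method"] != "GET" or not m["status"].startswith("2"):
            continue  # skip unparseable lines, non-GETs, and errors/redirects
        views[m["day"]] += 1
        visitors[m["day"]].add(m["ip"])
    return {day: (views[day], len(visitors[day])) for day in views}

# Hypothetical sample lines (documentation-range IPs).
sample = [
    '203.0.113.5 - - [20/Mar/2009:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '203.0.113.5 - - [20/Mar/2009:10:00:05 +0000] "GET /about.html HTTP/1.1" 200 2048',
    '198.51.100.7 - - [20/Mar/2009:11:30:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '198.51.100.7 - - [20/Mar/2009:11:30:02 +0000] "GET /missing HTTP/1.1" 404 512',
]
print(summarize(sample))  # {'20/Mar/2009': (3, 2)}
```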

------
HouseTrip
A good way of understanding the popularity of a website is Google Trends. This
article on TechCrunch shows an example of obviously bad data from Hitwise (the
same criticism can be applied to Alexa and Compete as well):
[http://www.techcrunch.com/2009/03/20/new-hitwise-stats-
show-...](http://www.techcrunch.com/2009/03/20/new-hitwise-stats-show-how-bad-
hitwise-data-is/)

At the end of the day, everyone who owns a blog or a website (myself included)
has been able to experience first hand how flawed the data from such companies
is by simply comparing it with their own Google Analytics data :-)

~~~
mg1313
Every web hosting company offers analytics software. Besides Google
Analytics, you can use that too in your comparison.

~~~
ecaron
But the point of these services is so that you can know how your competition
is doing.

------
pearlS
They get a lot of their data by purchasing ISP traffic logs and parsing them.
The browser data only serves as a much smaller "focus group" pool of users.

