
HN front page, 16000 visitors in a day, how many actually read the article? - plam
http://www.quantisan.com/post-gone-viral-16000-visitors-in-a-day-how-many-actually-read-the-article/
======
gnosis
I rarely read even the articles with interesting titles that make it to HN's
front page.

This is because when I did read more of them, they usually turned out to be a
lot less interesting than the ensuing discussion on HN.

So now I use the HN discussion as a proxy for article quality. In the HN
discussion I can often find a good summary of the article and get a sense of
whether the article is likely to be worth reading or not.

Only maybe 1 out of 10 articles, or fewer, that look interesting to me on HN
wind up being ones that I actually bother to click through. And of the ones I
click through, only 1 out of 10 wind up deserving to be read rather than skimmed.

Some years back, there were a couple of "HN Full Feed" type RSS feeds, that
would send the contents of the entire linked article, so I could read them
without even bothering to go on to the web site.

I valued these services not only because they were more convenient in that it
made clicking through and waiting for the article to load no longer necessary,
but also because there'd be less tracking of my interests this way.

I also have javascript disabled for 99% of the sites I visit, and am
considering starting to use TOR for more of my browsing. It's really nobody's
business what I'm reading, and it's a real pity the Internet wasn't built with
more inherent privacy and anonymity features.

~~~
JoeAltmaier
Yet that discussion is littered with uninformed pot-shots at the topic. We can
hope that more than 6% of the HN comments come from folks who read/understand
the topic I guess.

Btw how did you accumulate your '1 in 10' stats? Off the cuff? Can you think
of a way to measure this? Because I don't think 1 in 10 articles 'deserve to
be read'. Somebody DID read them, took the time to post them here. So for some
audience at least they were meaningful.

~~~
gnosis
_"Btw how did you accumulate your '1 in 10' stats? Off the cuff?"_

Yep. Just a rough estimate based on my sense of how many articles I actually
bother clicking through to. Having a more accurate estimate of my own click-
through rate would not be valuable to me, so I never bothered to try to find
out.

 _"Can you think of a way to measure this?"_

If I was interested in gathering such stats for myself, I suppose I could use
a browser add-on to measure my HN use.

Alternatively, HN could start using indirect links. But I'm not sure if I'd
stay with HN if they started doing that. I hate being spied upon, which is one
major reason I stopped using Google, and would probably drop HN as well if
they started going down that road. Not that HN really needs to do that, since
they already know which HN discussion pages I open and what I write (which are
reasons for me to start making myself a bit more anonymous in my HN use).

 _"Because I don't think 1 in 10 articles 'deserve to be read'. Somebody DID
read them, took the time to post them here. So for some audience at least they
were meaningful."_

It all depends on who the audience is, doesn't it? If you aim for the lowest
common denominator, you'll probably get a bigger audience. This is a major
reason for much of the mainstream media content being such utter garbage (from
my perspective).

Also, just because someone clicked through on an HN link doesn't mean that
they liked what they found when they got there. The same goes for tracking of
people clicking on "Like" buttons or even sending links to their friends.

People could have all sorts of reasons for clicking "Like" buttons that have
nothing to do with them enjoying or even reading the content. And I can't
count the number of times I've forwarded unread articles to friends because I
thought it might be something they might be interested in, but that I had no
interest in myself.

------
lenazegher
These data are not being interpreted correctly. Analytics calculates time-on-
page based on the time between loads of the google JS embedded in your pages.
Any visitor who 'bounces' - that is, only visits a single page - only loads
the JS once, so their time-on-page is recorded as 0 seconds, regardless of how
long they actually spent on the page.

~~~
stopcyring
what if i told you they use the onUnload event?

~~~
tehwebguy
Do they?

------
cosmie
There seems to be a lot of confusion about how GA tracks user engagement,
which is understandable as even the Support article linked in another comment
doesn't accurately explain what happens with single page visits.

First off, the metric by definition will always be skewed lower than reality.
For multi-page visits, GA takes the time of the first hit and time of the
second hit to calculate time on page (and will chain these together to get
time on site). Since the page the user leaves on doesn't have a "second hit",
that time is never included.[1]

For single page visits, as blog posts tend to be, the calculation is slightly
different.[1]

    
    
       Time on Page = (time of last “engagement hit” on page) – (time of first hit from page)
    

If you set up Event Tracking to trigger as a user scrolls to predetermined
lengths of your article, it'll trigger these 'engagement hits' and give you a
better approximation of time on site. If you just throw in a standard tracking
code that fires off a _trackPageview() event on page load, then GA will never
see a second engagement and will not be able to calculate any approximation of
time on page/site, so it'll default into the "less than 10 seconds" bucket.
Depending on what blogging platform you're using, there are some add-ins that
provide such functionality.[2]
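
A minimal sketch of the calculation above, assuming all we have per page is a list of hit timestamps. The function name `time_on_page` and the sample timestamps are made up for illustration and are not part of any GA API:

```python
from typing import List, Optional


def time_on_page(hit_times: List[float]) -> Optional[float]:
    """Seconds between the first and the last hit sent from one page.

    hit_times holds the timestamps (in seconds) of every hit the page sent:
    the initial pageview plus any later engagement hits such as scroll events.
    Returns None when there is only one hit, since there is nothing to
    subtract the first hit from.
    """
    if len(hit_times) < 2:
        return None
    return max(hit_times) - min(hit_times)


# Pageview only: the reader may have stayed five minutes, but no duration
# can be computed from a single hit.
bounce = time_on_page([0.0])

# Pageview at t=0 plus hypothetical scroll events fired further down the article.
tracked = time_on_page([0.0, 40.0, 95.0, 210.0])

print(bounce, tracked)
```

With only the pageview hit the duration is unknown rather than genuinely short, which is why such visits pile up in the lowest bucket.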

[1] [http://cutroni.com/blog/2012/02/29/understanding-google-
anal...](http://cutroni.com/blog/2012/02/29/understanding-google-analytics-
time-calculations/)

[2] [http://www.analytics-ninja.com/blog/2012/06/google-
analytics...](http://www.analytics-ninja.com/blog/2012/06/google-analytics-
bounce-rate-demystified.html)

------
happyshadows
Analytics calculates the time on page by the time difference _between_ page
hits. One hit: 0 seconds on site. Because of this, it isn't an accurate metric
to measure engagement for a single blog post.

~~~
plam
I assumed they had some fancy JavaScript thing to take care of that. What
would you suggest using to estimate actual readership?

~~~
happyshadows
I would stick with GA and just trigger an event via javascript once the body
of text is scrolled through.

If I didn't know how to do that, I would probably use a scrollmap tool like
CrazyEgg.

------
RivieraKid
Most long articles are just a waste of time; the actual information can be
condensed into a single paragraph or less and the rest is just redundancy or
useless information.

------
kijin
As other commenters have mentioned, 0 seconds doesn't mean anything.
Meanwhile, those 18 visitors who spent more than 1800 seconds on your page?
Probably they just opened the page in a new tab and only got around to reading
it a few hours (or days) later. So data at both extremes are useless.

If we ignore the 0-second anomaly, it looks like we've got a nice bell curve
peaking between 180-600 seconds, probably closer to 180 than to 600. That
sounds about right for a 670-word article.

------
edent
Interesting, that ties in with my observations of around 700 visitors per hour
on the front page - [http://shkspr.mobi/blog/2012/11/whats-the-front-page-of-
hack...](http://shkspr.mobi/blog/2012/11/whats-the-front-page-of-hackernews-
worth/)

While I didn't track engagement time, I looked at number of comments (both
here and on my posts, and shares on Twitter and Facebook) to try and figure
out how much of it was "real" traffic.

------
olalonde
Here are my stats for two of my blog posts that made the front page:

[http://syskall.com/how-to-roll-out-your-own-javascript-
api-w...](http://syskall.com/how-to-roll-out-your-own-javascript-api-
with/index.html/)

    
    
        3051 visits
        00:00:16 average visit duration
        98.9% less than 10 seconds
    

[http://syskall.com/yc-w12-startups-hosting-
decisions/index.h...](http://syskall.com/yc-w12-startups-hosting-
decisions/index.html/)

    
    
        3920 visits
        00:00:14 average visit duration
        99.1% less than 10 seconds
    

Somewhat depressing...

(edit: according to lenazegher's comment the average visit duration stats
might not be as bad as they look since my bounce rate was pretty high, ~95% on
both posts)

------
johnpowell
Interesting. I posted a link to my shithole of a site in a comment a hundred
deep in a post that had reached the top of the frontpage and had fallen to
pretty much the bottom when I posted in the thread.

I got an extra 300 visitors that day and about 50 the next. The average
visitors per day is around 25 so this is a big and noticeable spike.

I guess I am kinda shocked a random link in the middle of a dying thread
generated that much traffic while something hitting the frontpage only
generated about 53 times more traffic.

~~~
dasil003
You should put your site in your profile so you can see how many people come
after you post an offhand reference to the fact that you may have once written
something interesting elsewhere.

~~~
johnpowell
Well I thought it was interesting that my shitty comment that linked to
something I wrote got so many hits while getting on the frontpage got so
little in comparison.

I could get your angst if I had linked to it again or actually linked to it in
my profile. But I did neither. So now I just think you are a dick.

~~~
bennyg
I had a Show HN that made no headway on the frontpage (like 3 karma then it
disappeared into the depths), but I posted that same link, relevantly, as a
comment somewhere in the middle of the thread and it absolutely blew up. It's
now my highest starred repo on Github and I started the phone interview
process with Hulu because of it. Don't underestimate the comments here. A TON
of people browse these posts too.

------
e12e
This discussion highlights exactly why I don't consider GA a very useful tool
- there is no real transparency as to what and how data is
collected/measured/filtered [That I've been able to find, anyway].

So in the end, the only useful information you get from GA data, is the rate
of change (which is useful for many things) -- but not, for instance, the
actual number of visits to your pages -- because you have no idea what is
counted and what isn't -- and what is considered a visit.

------
feniv
If the article is long, I usually add it to instapaper or pocket to read
later. The time spent on the actual site is low but I still engage with the
content.

~~~
plam
good point, I'll keep an eye on returning visitor metric over the next few
days

------
Smerity
I was just discussing this with a friend today. We both had front page stories
on HN recently. He reported 6k of 6.8k[1] leaving within 10 seconds and I saw
that 11k of 13.5k[2] left within 10 seconds.

If these numbers aren't accurate due to Google Analytics, I'd be interested to
know a way to get the accurate numbers.

The other annoying thing was that HTTPS pages never send referrers to plain
HTTP sites. Hence, not a single one of my visits said it was from Hacker News.

I know, you don't want to leak the referrer in most circumstances when it's
HTTPS, but it just seems so vital. The Internet was made and understood by
referrals and links, lacking an ability to see referrers seems quite
unfortunate, especially if all the Internet ends up HTTPS.

Google and Facebook are the only ones who would be able to stitch together
significant portions of referral traffic due to Google Analytics or Facebook
Like / Connect. Everyone else is just left stumbling around blind.

[1]: <https://twitter.com/taybenlor/status/326622962377695232>

[2]: <https://twitter.com/Smerity/status/333534743670951936>

~~~
jseliger
_I was just discussing this with a friend today. We both had front page
stories on HN recently. He reported 6k of 6.8k[1] leaving within 10 seconds
and I saw that 11k of 13.5k[2] left within 10 seconds._

This is probably minority behavior, but I will often use Instapaper to
bookmark articles for later, and then read a batch together. For most in-depth
articles, I probably spend less than ten seconds deciding whether I should
click "Read Later" and then leave, even though I do in fact read later.

~~~
Smerity
I did consider the possibility of Instapaper, Readability or other similar
apps, but as you say I couldn't imagine they'd be the majority, even on the
relatively tech savvy Hacker News.

As an example of alternate HN clients, hckrnews.com had 124 referrals,
ihackernews.com had 85, HackerWeb had 77 and PulseWeb had 48. That's a grand
total of around 300 out of 13.5k.
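
As a quick check of that arithmetic (a rough sketch: 13,500 stands in for the rounded "13.5k" figure, so the share is approximate):

```python
# Referral counts per alternate HN client, as listed above.
referrals = {
    "hckrnews.com": 124,
    "ihackernews.com": 85,
    "HackerWeb": 77,
    "PulseWeb": 48,
}
total_visits = 13_500  # assumption: "13.5k" taken as exactly 13,500

client_visits = sum(referrals.values())
share = client_visits / total_visits

print(f"{client_visits} visits, {share:.1%} of all visits")
```

The exact sum is 334, so the share is roughly 2.5% of the total.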

I'd imagine Readability and Instapaper to be big but probably only some small
multiple of that at best.

~~~
gwern
> As an example of alternate HN clients, hckrnews.com had 124 referrals,
> ihackernews.com had 85, HackerWeb had 77 and PulseWeb had 48. That's a grand
> total of around 300 out of 13.5k.

I see similar proportions for submissions of my pages as well.

------
RogerDodger_n
I suspect that if you only make one page view, Google will assume your visit
was 0 seconds. Most HN visitors will read the article and close the tab.

------
gwern
Here's a bit of data: last week my page
<http://www.gwern.net/Google%20shutdowns> was submitted to Hacker News and hit
the front page for a while, racking up thousands of visitors. As it happens, I
was running an A/B test on fonts, where a JavaScript timer sleeps 40 seconds
and then fires, telling Google Analytics that a reader has 'converted'. (This
hopefully avoids the bouncing distortion of the 'time on page' metric.) So,
what percentage of readers stayed on the page long enough for the timer to
fire after 40 seconds? (The Markdown source is somewhere around 12k words, so
it's not the quickest read in the world.)

~18%

(See
[http://dl.dropboxusercontent.com/u/85192141/Analytics%20www....](http://dl.dropboxusercontent.com/u/85192141/Analytics%20www.gwern.net%20Referral%20Traffic%2020130503-20130512.pdf)
)

~~~
BCM43
What happens if a user is running noscript?

~~~
gwern
Then they won't be counted either in the page load (Analytics was never run)
or conversion figures (both Analytics and the conversion trigger will never
run).

------
plam
At this rate, I'm now hoping that I can do an analysis of the analysis of a
viral post that has also gone viral. How awesome would that be? :)

------
brudgers
If I read your page on my desktop, JavaScript was off - unless I had forgotten
to revoke temporary permissions after whitelisting Google Analytics on another
page.

Which illustrates that Google analytics reports something, but what it reports
is what it reports. To put it another way, Google Analytics records
information useful to Google. What it reports back to the datapoint is
designed to _appear_ useful to the datapoint. The purpose of the information
provided is solely to encourage the datapoint to keep using Google Analytics
so that Google can keep using the datapoint's website to track people on the
internet.

------
gojomo
A post that estimated how many HN visitors block Google Analytics would be
useful.

~~~
trhiawd
It may surprise you, but HN visitors are pretty low on the scale of privacy-
concerned. /. and /g/ are far more active in using privacy tools.

~~~
jonknee
That's not too surprising as a decent percentage of HN visitors are currently
making tools that destroy privacy.

~~~
trhiawd
s/destroy/monetize/

~~~
qu4z-2
Pretty sure more of them monetize the lack of privacy, and destroy privacy to
do it. Just sayin'.

------
hispanic
For me (echoing some of what gnosis has stated), the real value that HN brings
is the discussion. Frequently, I find the comments, insights, opinions, and
tangents elicited by HN submissions to be more interesting and thought-
provoking than the submissions themselves. I typically browse through the
discussion a good bit before ever clicking through to the article/site which
initially drove the discussion. There are plenty of "show-and-tell" mechanisms
on the Web. What sets HN apart, in my mind, is the round-table that develops
in response to a lot of those submissions.

------
thauck
As has been mentioned by many, this is an incorrect interpretation... and one
that is quite common to see on blogs or single page sites. Although technically
it's not dependent on pageviews, but on interactions (so pageviews or events).

So, one common way to handle this on blogs is to use setTimeout in conjunction
with an event. Basically you fire an event after 15 or so seconds which will
then count as an interaction.

------
tylerneylon
As another data point, I recently had a 60% read rate on a post that was on
HN's front page for a while.

It was a post on medium.com about Pac-Man. Medium tracks number of views and
number of reads per day. I think they use a metric that's not just time-on-
page to differentiate between views and reads. My post had about 26k views and
about 15k reads the day it was on the front page.

------
huhtenberg
Paul, you may want to compare raw web server logs to the numbers you get from
GA. I wouldn't be surprised if there's a big discrepancy, especially when
there's HN in the mix. Moreover, those who are nerdy enough to surf with
tracking scripts blocked might be the ones who actually read the article ;)
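
A rough sketch of the server-log side of that comparison, assuming the access log is in Common/Combined Log Format. The path `/post/`, the helper name `unique_visitors`, and the sample lines are placeholders invented for illustration:

```python
import re
from typing import Iterable, Set

# Matches the start of a Common/Combined Log Format line:
# client IP, identity, user, [timestamp], "METHOD path protocol", status
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3})')


def unique_visitors(lines: Iterable[str], path: str) -> Set[str]:
    """Return the distinct client IPs with a successful GET for `path`."""
    ips = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group(2) == "GET" and m.group(3) == path and m.group(4) == "200":
            ips.add(m.group(1))
    return ips


# Made-up sample lines; in practice you would iterate over the real log file.
sample = [
    '203.0.113.5 - - [12/May/2013:10:00:00 +0000] "GET /post/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [12/May/2013:10:04:10 +0000] "GET /post/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [12/May/2013:10:01:30 +0000] "GET /post/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '192.0.2.9 - - [12/May/2013:10:02:00 +0000] "GET /missing HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
    '192.0.2.9 - - [12/May/2013:10:02:05 +0000] "GET /about/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

print(len(unique_visitors(sample, "/post/")))
```

Distinct IPs are only a proxy for readers: crawlers and shared addresses skew the count, so treat the gap against the GA number as an estimate.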

------
koshak
Can anyone count those who read translations without a link to the original
article or to the HN discussion?...

Do these stats make any sense at all?

Rephrase: can anyone count the positive effect of the articles mentioned on HN
and of the further discussions of them?

------
iM8t
I tend to quickly go through the front page and bookmark the articles that I'm
interested in. Then, when I have the time and I'm on my tablet - I read them.

It may be that I'm not the only person that does this kinda thing.

------
petercooper
Here's another way to measure engagement for longer content, scroll depth:
<http://robflaherty.github.io/jquery-scrolldepth/>

------
MasterScrat
A tool to record which portion of the screen was visible for how long would be
interesting.

Using something like this for example: <http://larsjung.de/fracs/>

------
NathanKP
The most traffic a domain controlled by me ever received from HN was about
3000 uniques: 2000 the first day, 1000 the next.

------
propelledjeans
I add a lot of these articles to my Pocket and read them later. I wonder how
GA reflects that.

------
tonylemesmer
Anything to do with preloading by Chrome?

------
ForFreedom
16K is not the real number

------
ronaldx
tl;dr

