
news.yc leaders' average points per submission - brett

======
brett
I was curious:

danielha        :   3.52

danw            :   3.24

brett           :   6.10

python_kiss     :   4.24

mattculbreth    :   5.34

sharpshoot      :   7.06

jwecker         :   3.88

staunch         :   4.57

amichail        :   2.30

Harj            :   5.74

Alex3917        :  11.77

joshwa          :   4.62

far33d          :   4.84

nostrademons    :   7.50

jamiequint      :   7.26

Sam_Odio        :   6.06

Elfan           :   5.00

domp            :   3.44

zaidf           :   5.93

dfranke         :   6.40

Readmore        :   5.00

paul            :  15.67

blader          :   7.56

phil            :   7.44

mattjaynes      :   5.72

herdrick        :   6.24

palish          :  29.60

veritas         :   4.08

bootload        :   2.18

BioGeek         :  10.06

~~~
brett
the code:

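require 'rubygems'
require 'active_support'
require 'net/http'

Net::HTTP.start('news.ycombinator.com', 80) do |http|
  # Collect the user ids linked from the leaders page.
  # scan with a capture group returns nested arrays, so flatten them.
  leaders = http.get('/leaders').body
  users = leaders.scan(/user\?id=([^"]+)/).flatten
  users.each do |user|
    # Fetches a single page of submissions per user.
    submitted = http.get("/submitted?id=#{user}").body
    points = submitted.scan(/(\d+) points? by/).map { |m| m.first.to_f }
    # Skip users with no scored submissions to avoid dividing by zero.
    next if points.empty?
    puts "%-15s : %6.2f" % [user, points.sum / points.size]
  end
end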
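
For anyone who wants to poke at the parsing step without hitting the site, here is a rough sketch of just the scan-and-average part. The helper name `average_points` and the sample fragment are made up for illustration; the fragment is not real site markup, only something the same regex will match.

```ruby
# Hypothetical helper: average the "N points by" scores found in a page body.
# Returns nil when the page has no scored submissions, avoiding a 0/0 result.
def average_points(html)
  points = html.scan(/(\d+) points? by/).flatten.map(&:to_f)
  return nil if points.empty?
  points.inject(0.0) { |sum, p| sum + p } / points.size
end

# Made-up fragment standing in for a /submitted page; not real site markup.
SAMPLE = <<-HTML
<span>12 points by someuser</span>
<span>1 point by someuser</span>
<span>5 points by someuser</span>
HTML

avg = average_points(SAMPLE)
puts "%-15s : %6.2f" % ["someuser", avg]
```

The `points?` in the pattern is what lets the singular "1 point by" line count too.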

~~~
ralph
It looks to me like the submitted page lists at most 50 posts. Here's my
script which shows the problem. I've hex-encoded it to get it through the
posting system unbuggered.

python -c 'print "77676574202d714f202d20687474 703a2f2f6e6577732e79636f6d6269
6e61746f722e636f6d2f6c656164657273 207c0a74722027223e2720275c6e5c6e27207
c0a736564202d6e2027732f 5e757365723f69643d2f2f7027207c0a7 8617267732
02d726920776765 74202d714f202d20276874747 03a2f2f6e6577732e79636f6d62696e6
1746f722e636f6d2f 7375626d6 9747465643f69643d7b7d27
207c0a747220273e2720275c6e27207c0a736 564202d6e2027732f5e5c285b302d395d5b302d
395d2a5c2920706f696e74732 a206279202e2a3d 5c282e2a5c 292e242f5c32205c312f7
027207c0a61776b20277b635b243 15d2b2b3b20735b24315d 202b3d20243
27d0a20202020454e4420 7b666f7220286e20 696e206329 207072696e74 662022252d32
307320253564202 535642025372e32665c6e22 2c206e2c20635b6e 5d2c20735
b6e5d2c20735b6e5d2 02f20635b6e5d7d27207c0 a736f7274202 b336e720a".replace(" ",
"").decode("hex"),'

Cheers, Ralph.
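
The one-liner above relies on Python 2's `.decode("hex")`, which Python 3 strings do not have. As a rough alternative, here is a small Ruby sketch that does the same whitespace-stripping and hex decoding. The helper name `decode_hex` is made up, and the sample is only the first few bytes of the dump, not the whole script.

```ruby
# Hypothetical helper: decode a hex dump that may contain spaces or line breaks.
def decode_hex(hex)
  # Strip all whitespace, then let pack turn the hex digits back into bytes.
  [hex.gsub(/\s+/, "")].pack("H*")
end

# First few bytes of the dump above, including one of its stray spaces.
sample = "77676574202d714f202d20687474 703a2f2f"
puts decode_hex(sample)
```

Pasting the full dump into `sample` should recover the original pipeline, assuming no hex digits were lost in posting.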

~~~
ralph
And the same code as hex-encoded above but with various missing characters,
etc., introduced by the posting system.

wget -qO - <http://news.ycombinator.com/leaders> |

tr '"' '\n\n' |

sed -n 's/^user?id=//p' |

xargs -ri wget -qO - '<http://news.ycombinator.com/submitted?id={}'> |

tr '' '\n' |

sed -n 's/^\\([0-9][0-9] _\\) points_ by . _=\\(._ \\).$/\2 \1/p' |

awk '{c[$1]++; s[$1] += $2} END {for (n in c) printf "%-20s %5d %5d %7.2f\n",
n, c[n], s[n], s[n] / c[n]}' |

sort +3nr

Cheers, Ralph.

~~~
ralph
Hmm. It seems some _italic_ text slipped in there. What other mark-up __works
__? _Software Tools_ is an excellent book. The /leaning/ /tower/ of /Pisa/.
Disappearing: lt= amp= & star= _(becomes italic)_ question=? hash=#.
Recognised: ralph@inputplus.co.uk <http://google.com/> Breaking lines: abc\
def\ ghi. Nope, how about abc\c def\c ghi. abc \ def \ ghi?

~~~
bootload
Definitely, there needs to be some sort of markup for pasting code (especially
Python). See, Guido: there is an instance where Python and whitespace fail.

~~~
ralph
I've added a post in the "improvements" thread. Please vote up if you think
it's useful.

<http://news.ycombinator.com/comments?id=13271>

Cheers, Ralph.

------
mattculbreth
You should add in comments as well, I think. Remember there are karma points
here for discussions, as opposed to Reddit's practice.

~~~
brett
Yeah. I got lazy. The problem with comments is that many good ones only get 1
or 2 points so the average is less meaningful (not that there aren't all sorts
of wacky aberrations in the submissions averages).
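
To make the worry concrete, here is a rough sketch comparing the mean with the median on a list of comment scores. The scores are made up for illustration, not taken from the site, and the helper names are hypothetical; the median is just one possible alternative summary, not something the script above computes.

```ruby
# Arithmetic mean of a list of scores.
def mean(xs)
  xs.inject(0.0) { |sum, x| sum + x } / xs.size
end

# Median: the middle value, or the mean of the two middle values.
def median(xs)
  sorted = xs.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Made-up comment scores: mostly 1s and 2s plus one popular comment.
scores = [1, 1, 2, 1, 2, 1, 40]
puts "mean   = %.2f" % mean(scores)
puts "median = %.2f" % median(scores)
```

With scores like these, one popular comment pulls the mean well above what a typical comment gets, while the median stays at the typical value.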

------
yaacovtp
And what happens to the rankings when you filter out the top 10 domains being
submitted?

