

Hnweekly: weekly top stories from Hacker News - xtimesninety
http://hnweekly.watdahel.com/

======
siong1987
Cool idea. I created a Top Stories Directory 2 days ago too. It records
every number-one story, starting from 2 days ago. It lost track of some
stories due to a few bugs, but I have already fixed them all.

<http://hn.siong1987.com>

~~~
xtimesninety
nice work too :)

~~~
siong1987
How often do you crawl Y Combinator? Do you actually crawl every page
on YC?

~~~
xtimesninety
I scrape only the front page once every ~4hrs (minimum)

Nearlyfreespeech.net doesn't have cron yet, so I'm using onlinecronjobs.com.
If you want to scrape it on demand just go to
<http://hnweekly.watdahel.com/index/scrape>

~~~
siong1987
But if you only scrape the front page, how can you know the points for all the
past stories that are not on the front page anymore?

~~~
xtimesninety
It's not 100% accurate, but it's close. I built the website so I
can catch up with the good stories that were submitted earlier during the week
(I usually read during weekends).
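
A minimal sketch of why front-page-only scraping gets close but not exact, assuming each scrape yields (story_id, title, points) tuples. The record shape and the function names merge_snapshot and top_stories are hypothetical, not the site's actual code. Each story keeps the highest score seen while it was on the front page, so any points gained after it drops off are missed.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (story_id, title, points) from one front-page scrape.
Story = Tuple[str, str, int]


def merge_snapshot(store: Dict[str, Story], snapshot: Iterable[Story]) -> None:
    """Fold one front-page scrape into the store, keeping the highest points seen."""
    for story_id, title, points in snapshot:
        previous = store.get(story_id)
        if previous is None or points > previous[2]:
            store[story_id] = (story_id, title, points)


def top_stories(store: Dict[str, Story], limit: int = 10) -> List[Story]:
    """Return stories sorted by last known points, highest first."""
    return sorted(store.values(), key=lambda s: s[2], reverse=True)[:limit]
```

A story that leaves the front page simply stops being updated, which is where the inaccuracy comes from.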

------
ggruschow
It looks like it sorts by points?

Is that the best way to determine top stories, or would something like a time-
at-position score be better?

(Thanks - I kept meaning to do this myself. Would you do proggit & friends
too, please?)
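
One way a time-at-position score could work, as a rough sketch: assume the front page is scraped at equal intervals and each snapshot is a list of story ids in rank order. Every snapshot then awards a story more credit the higher it sits. The function name, the page_size parameter and the linear weighting are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def time_at_position_scores(
    snapshots: Sequence[List[str]], page_size: int = 30
) -> Dict[str, int]:
    """Sum a per-snapshot credit of (page_size - position) for each story.

    Assumes snapshots are taken at equal intervals, so one snapshot
    stands for one unit of time spent at that position.
    """
    scores: Dict[str, int] = defaultdict(int)
    for ranking in snapshots:
        for position, story_id in enumerate(ranking[:page_size]):
            scores[story_id] += page_size - position
    return dict(scores)
```

Unlike raw points, this rewards stories that stay near the top for a long time, and it only needs the front page itself.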

~~~
xtimesninety
yep it's sorted by points. reddit already has this:
<http://www.reddit.com/r/programming/top/?t=week> not sure about the other
sites, I'm just usually on hn, reddit and stuff on my google reader

------
jasoncartwright
Good work. Needs an RSS feed.

~~~
xtimesninety
now with rss feeds!

------
tome
Doesn't seem to handle non-ASCII very well: "Does Gdel matter?"

~~~
xtimesninety
I might have problems with unicode in PHP (I really don't know much here yet)
so I opted to remove them (for now) using regex.

btw I'm using <http://simplehtmldom.sourceforge.net/> to scrape/parse which is
great, it lets you use jQuery-style selectors on HTML.

~~~
delano
In that case, it should replace ö with o, etc.
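
A short sketch of that substitution in Python (the site itself is PHP, so this is only an illustration), using the standard unicodedata module. The function name to_ascii is made up. It decomposes accented letters into a base letter plus combining marks, then drops whatever is not ASCII.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Replace accented letters with their base letters; drop anything else non-ASCII."""
    # NFKD splits a character such as o-umlaut into "o" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" discards the combining marks and any
    # character that has no ASCII decomposition.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Does G\u00f6del matter?"))
```

Characters with no decomposition (for example the German sharp s) are removed rather than transliterated, so a lookup table would still be needed for those.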

~~~
rtw
I have a screenscraping thing in place that targets an ASCII-only environment
(LambdaMOO mud). Manually replacing unicode stuff like this has been a pain in
the butt. Luckily that was all just for fun and didn't need to be perfect.

Is there a good library out there (in any language) that does good unicode -->
ASCII substitutions for major languages?

~~~
xtimesninety
I once used latin1_to_ascii (The Unicode Hammer) in Python (works great):
<http://code.activestate.com/recipes/251871/>

~~~
rtw
Awesome thanks. Two coincidences: this fun mud thing is in Python (works via
RPC)... and the entry method name is "hammer," heh.

------
xtimesninety
I've been looking for something like this but didn't find one. btw I only
started scraping like 2 days ago :) so it's not yet a week's worth of data.

fyi: there's also <http://news.ycombinator.com/lists> which shows the top
stories in 2 weeks

~~~
TweedHeads
Lists are almost cool.

Allow us to choose: Today, Yesterday, Week, Month, Year

------
smanek
Pretty useful.

It would be nice to be able to see the top stories over any arbitrary time
period as well. I imagine an interface with a draggable timeline to specify
the period (something similar to what google finance uses, perhaps) could be
even better.

~~~
xtimesninety
thanks, i'm planning to add that too. you can also try this (to see more
recent links): <http://hnweekly.watdahel.com/?days=1>

------
epi0Bauqu
For a feed of all the #1 HN stories, use
<http://feeds.feedburner.com/HNWatrcoolr>

For more top stories from hacker feeds, use <http://hacker.watrcoolr.us/>
(feed: <http://feeds.feedburner.com/HackerWatrCoolr>)

------
rokhayakebe
Nice and very useful.

------
TweedHeads
Allow it to sort by number of comments

~~~
abossy
Agreed. It is often useful to see which stories are the most-discussed.

~~~
xtimesninety
alright :) <http://hnweekly.watdahel.com/?order=comments>
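
For illustration only, here is how the days and order query parameters could map onto a filter and a sort. The story fields (first_seen, points, comments), the 7-day default and the function name select_stories are assumptions, not the site's actual implementation.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List


def select_stories(
    stories: List[Dict[str, Any]],
    now: datetime,
    days: int = 7,
    order: str = "points",
) -> List[Dict[str, Any]]:
    """Keep stories first seen within the last `days` days, sorted by `order`."""
    if order not in ("points", "comments"):
        raise ValueError("order must be 'points' or 'comments'")
    cutoff = now - timedelta(days=days)
    recent = [s for s in stories if s["first_seen"] >= cutoff]
    return sorted(recent, key=lambda s: s[order], reverse=True)
```

Under these assumptions, ?days=1 corresponds to days=1 and ?order=comments to order="comments".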

