

Hn-api: a simple, ad-hoc Python API for Hacker News - scottjackson
http://github.com/scottjacksonx/hn-api

======
iamelgringo
PG has asked a number of times that people avoid scraping HN, because it
thrashes the server.

HN is also running really pretty slow today.

~~~
scottjackson
> HN is also running really pretty slow today.

Just my luck. As soon as HN took a little while to load for me, I _knew_ that
someone would mention it in this thread.

I made hn-api understanding that scraping HN is frowned upon, but also hoping
that programmers wouldn't do dumb things with it. As I [said][1], I minimised
the number of requests hn-api makes to HN.

In the readme, I say that hn-api is unofficial and unauthorised. I also tell
people to be gentle with making requests to HN. Is there some other action I
should take? Ideally I'd like to keep my work up, but if pg tells me to take
it down, I will.

Great, now I feel all dirty like The Pirate Bay.

[1]: <http://news.ycombinator.com/item?id=1112267>

------
jacquesm
Best run that with some delays in there or you will definitely hit the
velocity checks and you'll find your IP banned.
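
A minimal sketch of what such a delay could look like, assuming a fixed minimum gap between requests is enough. The `Throttle` class and the 30-second interval are illustrative assumptions; they are not part of hn-api, and the actual threshold of HN's velocity check is not stated here.

```python
import time


class Throttle:
    """Enforce a minimum gap between successive calls to wait().

    Hypothetical helper, not part of hn-api.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Assumed interval; pick something conservative.
throttle = Throttle(min_interval=30)
```

Calling `throttle.wait()` immediately before every request to HN keeps the requests spaced out, however quickly the surrounding loop runs.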

~~~
scottjackson
I hear you -- in fact, I think I'll put something in the readme about that.

~~~
jacquesm
See this thread:

<http://news.ycombinator.com/item?id=789469>

~~~
scottjackson
Thanks for the link.

hn-api only makes one request when it gets stories or one request when you get
a user's karma -- that doesn't seem like too much. I knew there was a velocity
check on IPs (and that I was making an _unofficial_ API), so I made it as
light on HN as possible.

~~~
jacquesm
Can't you cache the user's state? It's not like you need karma updates every
time you do a request to the API.

~~~
scottjackson
> It's not like you need karma updates every time you do a request to the API.

Obviously.

When a request is made to get stories from HN, no HackerNewsUser objects are
created (and thus, no karma gets updated). hn-api will only update karma when
you, the coder, make a HackerNewsUser object or call the
HackerNewsUser.refreshKarma() method.

Like I said -- one request when it gets stories _or_ one request when you get
a user's karma.

~~~
jacquesm
I get that, I meant cached as in persistent across runs :)

Then you can specify some kind of time-out (say 3 days or so), after which it
does get refreshed.
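
Roughly what I have in mind, as a sketch only. `KarmaCache`, the JSON file layout and the `fetch` callback are all made up for illustration and are not part of hn-api; `fetch` stands in for whatever call actually requests the karma from HN.

```python
import json
import os
import time

THREE_DAYS = 3 * 24 * 60 * 60  # time-out in seconds


class KarmaCache:
    """Karma values kept in a JSON file so they survive across runs.

    Hypothetical helper, not part of hn-api.
    """

    def __init__(self, path, max_age=THREE_DAYS, clock=time.time):
        self.path = path
        self.max_age = max_age
        self._clock = clock
        self._data = self._load()

    def _load(self):
        # Start empty if no cache file has been written yet.
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self._data, f)

    def get(self, username, fetch):
        """Return cached karma, calling fetch(username) only when stale."""
        entry = self._data.get(username)
        now = self._clock()
        if entry is not None and now - entry["fetched_at"] < self.max_age:
            return entry["karma"]
        karma = fetch(username)  # the only place a request to HN would happen
        self._data[username] = {"karma": karma, "fetched_at": now}
        self._save()
        return karma
```

With something like this, a second run within three days reads the karma from disk and makes no request at all.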

~~~
scottjackson
Oh, sorry. I didn't even think of that :)

I'll look into it. Or, if some intrepid volunteer would like to contribute to
the project, maybe they could.

