

Python API for Hacker News - karangoeluw
https://github.com/thekarangoel/HackerNewsAPI/blob/master/README.md

======
eeadc
That library is in many ways deprecated and broken. First, it uses only
old-style classes because it doesn't inherit from object explicitly. Furthermore,
it uses print in a method; it would be more "Pythonic" to return a str object
formatted using str.format.
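
To illustrate both points, here is a minimal sketch. The `Story` fields and the `describe` method name are hypothetical and not taken from the library itself.

```python
class Story(object):  # inheriting from object gives a new-style class in Python 2
    def __init__(self, title, points):
        self.title = title
        self.points = points

    def describe(self):
        # Return the formatted text and let the caller decide whether to print it.
        return "{0} ({1} points)".format(self.title, self.points)


story = Story("Python API for Hacker News", 21)
summary = story.describe()
```

The caller can then log, print, or test `summary` as needed.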

I think the future is Python 3, and new implementations in Python 2 syntax are
simply unnecessary. I would suggest the usage of Python-3-style syntax, which
is also valid in Python 2.7 (which isn't hard).
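
As a rough illustration of what I mean, the snippet below should run unchanged on both 2.7 and 3. The `parse_points` helper is made up for this example.

```python
def parse_points(text):
    """Return the leading integer of text such as '152 points', or 0."""
    try:
        return int(text.split()[0])
    except (ValueError, IndexError) as err:  # 'except ... as' works on 2.6+ and 3
        # str.format instead of the % operator; the error is reported, not raised
        message = "could not parse {0!r}: {1}".format(text, err)
        return 0


# A print call with one parenthesized argument behaves the same on 2.7 and 3.
print(parse_points("152 points"))
```

For multi-argument print calls, adding `from __future__ import print_function` at the top of the module keeps the behaviour identical on 2.7.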

~~~
karangoeluw
> First, it uses only old-style classes because it doesn't inherit from object
> explicitly.

Please explain this further.

> usage of Python-3-style syntax, which is also valid in Python 2.7

Will do this

~~~
mapleoin
See [http://docs.python.org/2/reference/datamodel.html#new-style-...](http://docs.python.org/2/reference/datamodel.html#new-style-and-classic-classes) for the distinction.

~~~
karangoeluw
Alright. Fixed.

------
mapleoin
I tried building a REST API once for a challenge if anyone is interested:
[https://github.com/mapleoin/newhackers](https://github.com/mapleoin/newhackers)

------
gpsarakis
Nice effort. Just a few remarks:

\- You should certainly use Requests: [http://docs.python-requests.org/en/latest/](http://docs.python-requests.org/en/latest/)

\- The Story class seems somewhat redundant. You could possibly use
collections.namedtuple as a container for properties or simply a dictionary.
The print_story method could just be the __str__ special method.

\- JSON output would be useful.
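
A sketch of the last two remarks combined, assuming a story has only the four fields shown (the field names and the `to_json` helper are hypothetical, not the library's actual attributes):

```python
import json
from collections import namedtuple


class Story(namedtuple("Story", ["title", "link", "points", "num_comments"])):
    """Immutable container for one story; the field names are assumptions."""

    __slots__ = ()  # keep instances as lightweight as a plain namedtuple

    def __str__(self):
        # Takes the place of a print_story method: str(story) yields the text.
        return "{0} ({1} points, {2} comments)".format(
            self.title, self.points, self.num_comments
        )

    def to_json(self):
        # _asdict() maps field names to values, which json can serialize.
        return json.dumps(self._asdict(), sort_keys=True)


story = Story("Example story", "http://example.com", 152, 58)
```

With this, `print(story)` gives the human-readable line and `story.to_json()` gives the machine-readable one.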

~~~
karangoeluw
I will try and implement these. Thanks for the suggestions.

------
thejosh
Does it use [https://www.hnsearch.com/api](https://www.hnsearch.com/api) ?

~~~
Xeoncross
Is that an official API? How long has it been around?

~~~
karangoeluw
Completely unofficial. I started creating it a month ago.

~~~
dholowiski
Wow, that's great. I use another one and it's quite unreliable. Thanks!

~~~
karangoeluw
You can use mine and compare the two, and based on your feedback either I or
any other dev can improve it.

------
Sharma
I think screen scraping is not allowed by HN. A few tries with these APIs might
get your IP banned!

~~~
sprizzle
The robots.txt file doesn't seem to disallow scraping.
[https://news.ycombinator.com/robots.txt](https://news.ycombinator.com/robots.txt)

~~~
karangoeluw
Scraping the listing pages seems allowed though.

------
addflip
I don't get why you're using a try except block for the num_comments variable.
You shouldn't be casting to an int if it doesn't have the attribute.

~~~
karangoeluw
The meta text on any page can be this:

> 21 points by johns 15 minutes ago | discuss

or

> 152 points by ar7hur 3 hours ago | 58 comments

If the regex matches (case 2), then I cast it to an int. Otherwise (case 1),
the story has 0 comments.
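
The same logic can also be written with an explicit check instead of try/except. This is only a sketch: the pattern and the `parse_num_comments` name are illustrative, not the code currently in the library.

```python
import re

# Matches "58 comments" or "1 comment" in the meta text.
COMMENTS_RE = re.compile(r"(\d+)\s+comments?")


def parse_num_comments(meta_text):
    """Return the comment count from a story's meta text, or 0 if none is shown."""
    match = COMMENTS_RE.search(meta_text)
    if match is None:
        # Case 1: the text ends in "discuss", so there are no comments yet.
        return 0
    # Case 2: the captured digits are safe to cast.
    return int(match.group(1))
```

Here the cast only happens when the pattern has matched, so no exception handling is needed.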

------
sprizzle
It's silly to use BeautifulSoup to parse the page when you could use a simple
RegEx:

<td class="title"><a href="(.*?)"(.*?)>(.*?)</a>(.*?)</td>
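
Roughly how that pattern could be applied, assuming the title cell looks like the made-up `sample` fragment below (the real markup may differ, and the pattern breaks as soon as the attribute order or nesting changes):

```python
import re

# Lazy groups: 1 = link target, 2 = extra attributes, 3 = title, 4 = trailing text.
TITLE_RE = re.compile(r'<td class="title"><a href="(.*?)"(.*?)>(.*?)</a>(.*?)</td>')

# Hypothetical fragment for illustration, not copied from the live site.
sample = (
    '<td class="title"><a href="http://example.com/post">Example story</a>'
    " (example.com)</td>"
)

# Collect (link, title) pairs for every title cell found.
matches = [(m.group(1), m.group(3)) for m in TITLE_RE.finditer(sample)]
```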

~~~
kaeawc
"HTML tags lea͠ki̧n͘g fr̶ǫm ̡yo​͟ur eye͢s̸ ̛l̕ik͏e liq​uid pain"

[http://stackoverflow.com/questions/1732348/regex-match-open-...](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags)

~~~
michaelmcmillan
I am willing to sacrifice my soul and everything that is holy.

