

Meta: Paginated Comments - etcet

I've just noticed that comments are now paginated. In some cases, like "Lighten Up" (http://news.ycombinator.com/item?id=3736037), only the first comment and its children are displayed.<p>I noticed this because I'm working on an extension that does various things on news.ycombinator.com, and one of them is detecting new comments. It's no longer practical to find the last comment id on a page, so that feature is broken. I'm very hesitant to implement pagination because I really don't want to, and it's a bad solution.<p>Besides the problems for certain extensions, top comments are now even more privileged than before. The incentive to piggyback on the top comment is greatly increased, because otherwise you'll end up on the next page of comments.<p>edit: Not to mention that 'More' (/x?) links can't be relied on to still exist by the time you get round to following them.
======
chaosprophet
You can use a convoluted bit of JavaScript to determine whether there is a
'More' link on the comments page. If there is, fetch that page with Ajax and
append it to the current page, then rerun your JavaScript to check whether the
newly appended page has a 'More' link of its own. Repeat until you reach a page
that does not have one. This is what I did for my HN iPad app.
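For illustration only, here is a rough sketch of that loop. It is not the code from the iPad app. It assumes the 'More' link is an anchor whose visible text is exactly "More", and `fetchPage` is a hypothetical caller-supplied function that takes an href and resolves to that page's HTML. The `maxPages` cap is also an assumption, added so a malformed page cannot cause an endless loop.

```javascript
// Return the href of the 'More' link in an HTML string, or null if none.
// Assumption: the link is an <a> tag whose text is exactly "More".
function findMoreHref(html) {
  const match = /<a\s[^>]*href="([^"]+)"[^>]*>\s*More\s*<\/a>/i.exec(html);
  return match ? match[1] : null;
}

// Follow 'More' links until a page has none, collecting each page's HTML.
// fetchPage is hypothetical: (href) => Promise resolving to an HTML string.
async function collectAllPages(firstHtml, fetchPage, maxPages = 50) {
  const pages = [firstHtml];
  let href = findMoreHref(firstHtml);
  while (href !== null && pages.length < maxPages) {
    const html = await fetchPage(href);
    pages.push(html);
    href = findMoreHref(html); // rerun the check on the newly fetched page
  }
  return pages;
}
```

Inside an extension you would more likely query the DOM than use a regular expression, and a comment that itself contains a link reading "More" would confuse this check.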

Also, comments pages have been paginated for quite some time now. I believe it
started around 9 months to a year back.

~~~
etcet
It seems like threads only become paginated when they become closed for
comments. I think this makes sense in some ways. I noticed it because I was
testing, went to sleep, woke up and refreshed the thread I was testing with,
and everything broke.

Your approach is what I was thinking of doing. It's good to hear that it's
working for you. Do you perform any queueing or waiting? I was worried that
the reportedly aggressive rate limiter might block someone who opened a few
large threads at once.
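To make the queueing idea concrete, here is a minimal sketch of what I have in mind: requests run one at a time with a pause after each. `createThrottledQueue` and `minDelayMs` are names made up for this example, and I don't know what delay the site's rate limiter actually tolerates, so any value passed in is a guess.

```javascript
// Build a queue that runs tasks one at a time, pausing minDelayMs after each.
// A task is any function returning a value or a Promise (e.g. a page fetch).
function createThrottledQueue(minDelayMs) {
  let tail = Promise.resolve();
  return function enqueue(task) {
    // Start this task only after everything queued before it has finished.
    const result = tail.then(task);
    // Pause after the task settles, even if it failed, before the next one.
    tail = result
      .catch(() => {})
      .then(() => new Promise((resolve) => setTimeout(resolve, minDelayMs)));
    return result;
  };
}
```

Every page fetch from every open thread would go through the same `enqueue`, so opening a few large threads at once spreads the requests out instead of sending them in a burst.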

~~~
chaosprophet
I'm not really using it to scrape comment threads in bulk; I'm just using it
on one thread at a time. So I haven't had any problems with rate limiting.

Also, unless you are hitting hundreds of threads at the same time, I don't
really think you will be caught by the rate limiter. I'm not sure about this,
but it has been discussed before on HN. Try searching for something like
'HN crawl rate' and you will get more info.

