Show HN: RewindHN - Go back in Hacker News history (rewindhn.com)
250 points by kami8845 on Nov 11, 2012 | 51 comments



Hey. I built RewindHN.

I created this for the HN community and I'm very happy to see so many people enjoying it :)

If you have any questions about how I built this or if you'd like to suggest something new, let me know!


Your app is astoundingly fast. What's your secret?


His secret is open source https://github.com/doda/rewindhn


Buffering [0] and Caching [1].

The front-end specifically asks for pages in chunks of 200. So whether you slide to page 299, 300 or 385, it will request 200-400 from the API. This means I can very easily serve these requests out of Redis [1], and during usage spikes requests never hit the disk :)

[0] https://github.com/doda/rewindhn/blob/master/static/rewind/j...

[1] https://github.com/doda/rewindhn/blob/master/server.py#L25
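
For readers who don't want to dig through the linked source, the chunk alignment described above can be sketched roughly as follows. This is only an illustration under assumptions: the function names, the `CHUNK_SIZE` constant and the plain dict standing in for Redis are hypothetical and not taken from the RewindHN code.

```python
CHUNK_SIZE = 200  # chunk size mentioned in the comment above


def chunk_bounds(page, size=CHUNK_SIZE):
    """Return the (start, end) bounds of the chunk that contains `page`."""
    start = (page // size) * size
    return start, start + size


# Hypothetical stand-in for Redis: a plain dict keyed by chunk bounds.
_cache = {}


def get_chunk(page, load_from_disk):
    """Serve the chunk for `page`, calling `load_from_disk` only on a cache miss."""
    key = chunk_bounds(page)
    if key not in _cache:
        _cache[key] = load_from_disk(*key)
    return _cache[key]
```

Under these assumptions, pages 299, 300 and 385 all map to the same key, so only the first request for that range would reach the disk.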


Mind if I ask who you're hosting with?


hetzner.de

i7 with 16 gig RAM


I see that you are using pyquery to scrape the content. I was using BeautifulSoup for my previous projects, but it seems pyquery is a better choice due to its compatibility with jQuery, so I am planning to switch too. Are there any downsides to pyquery, though?


It's difficult to follow changes. Step up and down the ladder of abstraction.

http://worrydream.com/LadderOfAbstraction/


Congrats, it's a very good app!


Awesome work, thank you


Nice. I'm also a fan of http://hnrankings.info/ which is useful to see when a story has been flagged and suddenly drops off the front page.


Interesting. Why do some stories get flagged off the front page?


Interesting one.

I would love to see a list of all the Hacker News "meta tools".

I posted my own last week: http://www.HnEasy.com sorts all HN posts and comments by upvotes, from the last day to the last 5 years.


Here's how I see it if I don't resize my window: http://i.imgur.com/WeV5K.png


Wow. Thanks for the feedback.


Very nice. I actually thought of implementing this a few months ago but a quick Google search brought up http://hackerslide.com/ (also open source), which works fine as well.


Yeah, http://hackerslide.com/ is mine. This new one has a more precise interface with the keyboard shortcuts, though mine continues to run OK two years on with no changes. Here's the thread from when it launched, which also topped HN: http://news.ycombinator.com/item?id=1794614 and I wrote a post about the reason for putting it together and how it works: http://peterc.org/blog/2010/334-hacker-slide-anatomy-of-a-4-...

Perhaps one minor bonus to HackerSlide for now is anyone can take the data collected. The URLs for the JSON archives are formatted like: http://hackerslide.com/data/2012-11-01-23.json (YYYY-MM-DD-HH) although even better long term are the once a day versions, e.g.: http://hackerslide.com/data/2012-11-01.json
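
As a rough sketch of the URL scheme just described, the archive URLs could be generated like this. The helper names are hypothetical, and it assumes the hour part is zero-padded and runs from 00 to 23, which the single example above does not confirm.

```python
from datetime import date, datetime, timedelta

BASE_URL = "http://hackerslide.com/data/"


def hourly_url(moment):
    """URL of the hourly snapshot (YYYY-MM-DD-HH) for a datetime."""
    return BASE_URL + moment.strftime("%Y-%m-%d-%H") + ".json"


def daily_url(day):
    """URL of the once-a-day snapshot (YYYY-MM-DD) for a date."""
    return BASE_URL + day.strftime("%Y-%m-%d") + ".json"


def hourly_urls_for_day(day):
    """All 24 hourly snapshot URLs for one day (assumes hours 00-23)."""
    start = datetime(day.year, day.month, day.day)
    return [hourly_url(start + timedelta(hours=h)) for h in range(24)]
```

For example, `hourly_url(datetime(2012, 11, 1, 23))` reproduces the first URL quoted above, and `daily_url(date(2012, 11, 1))` the second.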

I've learnt two things from this project in particular. First, that most similar projects don't seem to stick around very long (the Reddit one it was based on disappeared after a few months as have several others - http://hckrnews.com/ is an exception I can recall). Second, these tools seem to be popular at first but then rarely used over time. Luckily I still find it useful to catch up after vacations, etc ;-)


"Perhaps one minor bonus to HackerSlide for now is anyone can take the data collected."

Indeed, may I make a suggestion here:

1) Why not make a datadump so people wouldn't need to scrape ~800*24 json files individually?

2) OP ought to load this data into his version so the timeline goes back further

3) It seems quite a few people get the urge to tinker like this with HN. I'm sure pg doesn't mind the scraping, but it strikes me as vastly more efficient if some sort of shared resource were set up and perhaps added to the footer, in the vein of HNsearch, so people don't waste time getting their own crawling set up.

I'm sure somebody else has a dataset just like yours that goes back further still. :)

Also, thank you for making this and OP for making his. Fun.


"1) Why not make a datadump so people wouldn't need to scrape ~800*24 json files individually?"

Publicly available JSON files were just a side effect of the implementation. But it's easy to tar and gzip it all up, so there's now such a file at http://secretshenanigans.s3.amazonaws.com/hnfrontpages.tar.g... (32MB).
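
For anyone wanting to build a similar dump of their own scraped files, a minimal sketch using Python's standard `tarfile` module might look like the following. The function name and directory layout are assumptions for illustration; this is not how the archive above was actually produced.

```python
import tarfile
from pathlib import Path


def bundle_json(data_dir, archive_path):
    """Write every *.json file in data_dir into a gzip-compressed tarball."""
    files = sorted(Path(data_dir).glob("*.json"))
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # Store only the file name so the archive holds no absolute paths.
            tar.add(str(path), arcname=path.name)
    return len(files)
```

The function returns the number of files it bundled, which is a cheap sanity check against the expected count of snapshots.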


  >>  most similar projects don't seem to stick around very long ...

  >> ... http://hckrnews.com/ is an exception I can recall)

Much improved way for me to quickly skim HN.

I would be sad if it disappeared.


I really like the UX. I implemented something like this some years ago for HN and other sites, but not as granular, at http://rrrewind.com


Nice. I made something like this for reddit called redditrewind: http://www.redditrewind.com


That's awesome, but I don't like the UI. A slider would be nice!


We had a slider at one point, but we thought it wasn't as good as this method. I'll look back into the options soon.


This looks amazing! Good job! If I may ask (yes, I have taken a look at your code, but unfortunately my Python is not too good), what's the "secret" to grabbing HN's historic data? Also, would it be possible to go back beyond one month, maybe a year or even a few years (I don't know if it's just me, but I can only go back as far as October 9th, 2012)?

Thanks!


https://github.com/doda/rewindhn/blob/master/scrape.py#L26

This is where the scraping happens. The code is a little uglier than I'd like, but that's largely because the data is hard to extract from HN's markup. I'll look at adding more data soon.


There was something very similar to this for reddit that was very handy and useful. It was at redditsnapshot.sweyla.com but it shut down at some point. I still think there is an opportunity for someone to index reddit's front page and have it hour by hour and maybe a premium version of all the subreddits you subscribe to.


When I saw that sweyla's stuff had disappeared, I asked them in an email whether it would be back. I didn't get an answer, so I implemented my own version: http://redditsnapshots.com/ (running since March 2012)


Awesome. One can also use it to analyse various trends on HN. For example, look at this story: http://news.ycombinator.com/item?id=4739649 from November 4th, 6:40 pm to 11:00 pm. It shows you how eager HN'ers are to help each other.


Great work! Please try to change control-arrows to something else, as Macs map it to switching desktops :)


Loved this. I would love a feature where I could "highlight" (add a glow for effect) any particular story and watch it move up and down as I play with the arrows. Right now everything moves, so this would help spot the trend for a particular story.

All in all, Awesome.


This is awesome, and really well done. The keyboard bindings are very handy. Would be great if it had older snapshots.


That's a good idea! I'll look at incorporating some "historic" snapshots from archive.org


Great idea! The left and right arrow key functionality doesn't appear to be working for me though (FF 16, Fedora 17).


Hm. That's weird. I'm using http://github.com/madrobby/keymaster to hook up the keys to the same function also used by the slider itself.

Does the keymaster live demo at http://madrobby.github.com/keymaster/ work for you?


Blown away by how fast this website is. I kept the left arrow key pressed and it didn't stutter once.


Awesome. How about animating it so you can see the posts actually moving up and down?


I'm ashamed to say I had too many already-visited links from the past months.


Great work! Typing date/time directly would be a nice extra feature.


Doesn't show ANYTHING without javascript....


Ha, you're right. I just put up a notice for <noscripters>


lol! Don't do that, show some real content instead. :-)


Sorry :( The application as it exists currently heavily relies on EJS (embedded JavaScript) templating.

I originally just dumped the raw scraped HTML into the DOM but that proved to be too space-inefficient (even with gzip).


Seems like there's an API if you want to get some data out of it. Otherwise, you might want to try enabling JavaScript on trusted websites; it often actually enhances the pages.


How do I know a web site is trusted?


This is awesome!


Good one kami


very well done. ux is great.


Very nice


Very cool!


works great. I love it.



