I have emailed Reddit about the legality of this project. (Have not received a response yet) So far it seems likely that I will have to take down the site.
So consider this a small preview I suppose. I plan to port this to a Chrome Extension.
Do you guys find this format much more readable than the current completed IAmA?
I've wanted to build something like this for a long time. The problem I encountered (and couldn't solve) is how to effectively and automatically find not just the questions and answers, but the discussions that stem from them.
A good AMA is one that creates discussion between the subject and the commenters, so the majority of good content is buried underneath multiple comments. For example, I did an AMA and it had a few hundred comments. The "top" question was pretty mediocre, but it had a reply that made it a valuable question, and that reply was the most "popular" (most replies, most votes) thing in the AMA. Your site would not have picked this up.
If you can solve the problem of dealing with all the nested comments and trying to work out the value of each reply (taking into account votes, length, number of replies and possibly the "relevant" words?) then your site will absolutely be worth using. Instead of just being a different way to display the data, create extra value. Good AMAs are not just Q&As, they're discussions.
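To make the scoring idea concrete, here is a minimal Python sketch. The `Comment` shape and the weights are assumptions made up for illustration (not reddit's API schema and not anything the site actually does), and "relevant" words are left out. It scores every comment in the tree on votes, length and number of nested replies, then ranks them all together so a strong reply can beat its mediocre parent.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    # Hypothetical record shape, not reddit's actual API schema.
    author: str
    body: str
    votes: int
    replies: List["Comment"] = field(default_factory=list)


def count_descendants(comment: Comment) -> int:
    """Count every reply nested anywhere below this comment."""
    return sum(1 + count_descendants(r) for r in comment.replies)


def score(comment: Comment) -> float:
    """Blend votes, length and reply count. The weights are arbitrary guesses."""
    vote_part = math.log1p(max(comment.votes, 0))
    length_part = math.log1p(len(comment.body.split()))
    reply_part = math.log1p(count_descendants(comment))
    return 2.0 * vote_part + 0.5 * length_part + 1.5 * reply_part


def rank_all(roots: List[Comment]) -> List[Comment]:
    """Flatten the whole tree and sort by score, highest first."""
    flat: List[Comment] = []
    stack = list(roots)
    while stack:
        current = stack.pop()
        flat.append(current)
        stack.extend(current.replies)
    return sorted(flat, key=score, reverse=True)
```

The logarithms are only there so that one huge vote count does not drown out everything else. Any real version would need tuning against actual threads.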
> So far it seems likely that I will have to take down the site
Traditionally reddit is very good towards community projects, they pride themselves on their "different" approach to community, unless you're causing harm to reddit I don't see why they would require you to take this down. There are multiple tools that already exist that extend reddit and provide extra value, none of which have been taken down -- from what I've seen anyway.
Specific to your current implementation of the idea, your font-size is way too low. The focus of the site is the words, so make them bigger! The size you have now isn't very readable. Check out some blogs for examples of what sizes to use for text.
The problem of dealing with nested comments is very tricky as you mentioned. I did my best to grab the "best" comments, but I know that it does miss some in a deep thread. I will try my best to further enhance its accuracy and ability to continue a discussion.
I will also work on the font sizes and perhaps even modify the font choices to further enhance readability.
It looks like you grabbed every comment by the OP along with the comment it was a response to.
In at least one instance, I see a situation where someone asked a question, I (not the OP) provided an answer, and the OP then commented on my answer to confirm that my answer was correct. The original question doesn't appear on your website.
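One possible way to keep the original question in a case like that is to walk up from each OP comment all the way to the top-level comment, instead of stopping at the direct parent. A minimal Python sketch, assuming a hypothetical flat dict of comments keyed by id with `author` and `parent` fields (the field names are invented for the example):

```python
from typing import Dict, List, Optional

# Hypothetical flat store: comment id -> {"author": ..., "parent": ...}.
# "parent" is None for a top-level comment.
CommentStore = Dict[str, Dict[str, Optional[str]]]


def chain_to_root(comments: CommentStore, comment_id: str) -> List[str]:
    """Return the ids from the top-level comment down to comment_id."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = comment_id
    # Stop at the root, at an unknown parent, or if the data has a cycle.
    while current is not None and current in comments and current not in seen:
        seen.add(current)
        chain.append(current)
        current = comments[current]["parent"]
    chain.reverse()
    return chain


def op_threads(comments: CommentStore, op: str) -> List[List[str]]:
    """One root-to-comment chain for every comment written by the OP."""
    return [
        chain_to_root(comments, cid)
        for cid, record in comments.items()
        if record["author"] == op
    ]
```

With a question `q`, my answer `a` and the OP's confirmation `c`, `chain_to_root(comments, "c")` gives `["q", "a", "c"]`, so the question comes along with the exchange.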
I ran into the same problem. I started a pet project that involved scraping reddit (though for a different purpose than AMAs). Their robots.txt and an admin writeup from somewhere on their site made me realize that I'd probably just have to take it down and/or my scraper would just get blacklisted. It's a bummer because there's 1001 great ideas out there for filtering, categorizing, and viewing reddit's data in different ways. And it seems like they encourage 3rd party interaction to some extent with their API and all, yet scraping is kind of needed in most cases.
I do like the format for sure. The only thing I would consider is maybe nesting the Q/A divs (.qitem) for threads because a lot of times the Q/A content is contextual to past Q/As. You already order them that way and that helps a lot but on one of the ones I was reading it got confusing on whether they were speaking in the context of a thread or if it was a fresh Q/A. Maybe set it as a view option to toggle or something (maybe have it be a carousel where each frame contains all the Q/A divs in a thread starting with the root level, and keep it displayed flat like they are now).
I made sure to be nice to reddit. The scraper is set to crawl reddit once every 12 hours for new "top monthly iamas".
Very good suggestion for nested threads. A good example of a reply to a question is on the westboro-baptist-church thread. I think it's possible to implement this suggestion. Will fool around with it on localhost and see what I come up with.
Yeah, I had a limiter put in mine as well so that it only made a request every 6 or 8 seconds.
No worries on the suggestion. Those threaded comments can be tricky sometimes.
Hey, if you do hear back from them about their stance on this sort of thing, I'd really appreciate if you could let me know what they say. I sort of halted my project after a certain point because I had the fear I'd just have to take it down as soon as I completed it.
User comments, but going by users rather than threads. That way you could get a profile where someone posts, or turn it around and see what prolific posters existed in a given subreddit.
The thing is it wouldn't sweep everything. Instead a user would only get scraped if a request was made to my app, and I had a tool that would go through a request queue (storing to my own DB) in a metered way so that reddit only experienced a handful of requests from me per minute.
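Roughly, the metered queue idea looks like the Python sketch below. This is not my actual code: the class name, the `fetch` callback and the 6 second interval are placeholders, and the clock and sleep functions are injectable only so it can be tried without really waiting.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional


class MeteredQueue:
    """Drain queued usernames with at least min_interval seconds between fetches."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._jobs: Deque[str] = deque()
        self._last_run: Optional[float] = None

    def enqueue(self, username: str) -> None:
        # Skip duplicates so repeated requests for one user cost one fetch.
        if username not in self._jobs:
            self._jobs.append(username)

    def drain(self, fetch: Callable[[str], None]) -> int:
        """Run fetch on each queued username in order; return how many ran."""
        done = 0
        while self._jobs:
            if self._last_run is not None:
                wait = self.min_interval - (self._clock() - self._last_run)
                if wait > 0:
                    self._sleep(wait)
            self._last_run = self._clock()
            fetch(self._jobs.popleft())
            done += 1
        return done
```

Usage would be something like `queue = MeteredQueue(6.0)`, then `queue.enqueue(name)` whenever a request comes in, and a worker calling `queue.drain(scrape_user)`, where `scrape_user` is whatever function does the fetching and stores the result.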
Nonetheless it still breaks robots.txt and, if I could dig it up, admins have said in the past that they don't want automated/batched requests hitting their site.
Very very awesome! I love IAmA's, but the signal to noise ratio is relatively low, so that I can only bear it for about 1 page. This does a fantastic job in separating out the great questions and answers!
I'm pretty sure you'll be fine. Erik's a Pretty Cool Guy(TM); I can't see him trying to take down a site whose express purpose is to make the content on reddit more accessible.
Absolutely. If you add link references, like related news, events or uncommon knowledge, this will be perfect.
I always imagine someone making a book about the IAmAs. There is just so much value in the posts; they are an amazing human view on people. Sometimes amazing, sometimes heartbreaking, mostly just fascinating.
This is MUCH better than the reddit version. If they're okay with you doing this, I whole-heartedly support your efforts here and wish you the best of luck with this endeavor
Yes, I much prefer this format, great job. The only thing that I don't like is that the questions are hard for me to read. Maybe because they're large blocks of text in bold.
Ah, I hadn't clicked any of the links. I thought you had just linked back to their site.
By the way, I like your site. It makes it easy to cut through the cruft that always appears in reddit threads. Do you just grab all responses by the poster and the parent comment?
If they have a full content RSS feed, then this use should be okay. But I hope it survives their scrutiny since this is vastly more readable than the native view.
The concept is good and it's easier to scan than IAMA, but the design/typography just hurts to read. I would probably fix that asap. Make it bigger, easier to read and change the font.
In regards to your concerns about legality: Reddit seems to be ok with scrapers, bots and the like as long as you don't make more than 30 requests per minute.
That looks fantastic. I actually tried to get a designer on board to help with the front-end, but unfortunately he wasn't able to help out. So I did the best I could. :)
I will definitely implement these font changes. :)
A couple months ago, reddit started refusing requests from my web scraper. Figured out they started checking the user agent and refusing connections that didn't look like they came from a user's browser. Unless I missed an announcement somewhere, it doesn't seem like they're overly friendly about allowing web scrapers.
They are fine with it as long as you abide by their terms. They have a subreddit dedicated to reddit development and the reddit api, which has discussion of scraping: http://www.reddit.com/r/redditdev
Thank you!! Serious time saver when you cut the ding-dong-ping-pong Reddit karma whoring replies. Just enjoyed this AMA with a former Rosetta Stone employee:
To expand on that, you could create a stylesheet or such that modified the look of it. Something similar to the mobile HN sites where banner ads are preserved could be considered as well.
Minor problem: Zero based indexing, combined with not showing current page in navigation, is confusing. At main page I saw links to page 1, 2. Thought that 1 was current page, 2 was next page.
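A possible fix, sketched in Python (the function name and the bracket style are just for illustration, I don't know how the site builds its navigation): keep zero-based indexes internally, but show one-based labels and mark the current page.

```python
from typing import List


def page_labels(current_index: int, page_count: int) -> List[str]:
    """Build one-based labels from zero-based indexes, current page in brackets."""
    labels = []
    for index in range(page_count):
        label = str(index + 1)
        labels.append(f"[{label}]" if index == current_index else label)
    return labels
```

On the main page of a three-page listing, `page_labels(0, 3)` gives `["[1]", "2", "3"]`, so it is clear that 1 is where you are and 2 is next.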
I like this, although I do think it would be better if each comment had a permalink back to the original so that we could see all the responses, not just the one by the person who created the thread.
I don't know if this is a feature or a bug, or if it hasn't been crawled recently.
It would be great to see these top-level comments!
------------
Linking to an answer's context would be super-useful as well. A great deal of the fun and value reddit supplies is the community commentary and responses to an AMA's answers. This can range from the funny to the insightful to the scary - something topiama.com doesn't capture. Which is great - sometimes you don't want the peanut gallery.
There used to be a comment bot/person who put all the reddit AMA questions into a tabular format - I recently discovered it has a pretty useful subreddit: http://www.reddit.com/r/tabled
Also, links to questions in their original context would be nice.