Reddit Insight (redditinsight.com)
51 points by gdi2290 on July 14, 2013 | hide | past | favorite | 54 comments



So, two big design issues: if you scroll down from the wheel of terms you go wobbly, and clicking the down arrow below that last wheel is also wobbly. If you use one of the tabs to go to a graph, you create a 'back loop' where the back button takes you to the start, then back again takes you back to the graph, then back to the start, then back to the graph ....

The data is interesting but I expect to see a bit of analysis in projects like this. Without it the project becomes a sort of data Rorschach test where the viewer projects their perceptions into it.


We are aware of a lot of the issues on the home page and plan to completely redesign it soon. However which graph are you talking about? If it's the post tracking we will definitely look into it.


I had clicked on the rightmost tab (labeled Graphs), which shows the Subreddit graph, and then got stuck in the back-page loop. Although I just did it again and it didn't get stuck in a loop, so you may have fixed that one already.


Thanks for your feedback. At the moment I'm not able to reproduce your issue, but I'll add it to our GitHub issues.


I can't even read this on my iPhone. All of the text is forced into a tiny column so small that it wraps after nearly every word. A lot of the text is pushed right off the right edge of the page.

Please don't alter the viewport, you're doing it totally wrong.


We updated the landing page for better mobile support, although we can't say the same for some of our d3 visualizations.


> "Dogs" occurs more times in titles in r/aww than "cats" or "kittens"

That can be a fact.

> Despite their internet popularity cats are not submitted nearly as many times on this cuddly SubReddit.

Or dogs are rare enough that they're worth naming, because cats are default. Seriously, run whatever calculations you want, but be careful about what conclusions you draw from the numbers.


Yeah, there are probably more cat pictures just in the "If I fits, I sits" category than dogs altogether.


Good point. We like to stir the pot a bit though.


So you're knowingly misrepresenting the data?


Nobody skews cat data on the internet and gets away with it.


What do you mean by "stir the pot"? The purpose of this site seems to be data-driven analysis of reddit. If the conclusions being drawn aren't rooted in that data, I don't see the point.


By purposefully lying?


Nothing like a bunch of comments tearing apart a website built by people who, presumably, learned to code a few months ago and threw it together over the course of a few weeks. Yeah, it has a few issues, but it's significantly better than the first things I ever built and sent off to the world.

Keep it classy, HN.


Examples of "tearing apart"? At its worst, the feedback is terse. If all one wants are warm and fuzzies, one wouldn't submit here.


Constructive feedback rather than warm fuzzies is suggested.


What's the point in submitting this to HN if not to get feedback?


To get mindful feedback.


It's fundamentally horrible. The whole design approach is horrible. It's the contemporary equivalent of GeoCities, just using more recent but equally mindless design cliches.

In fact, it's so horrible it makes PowerPoint look good.


Your criticism is fundamentally horrible. The whole rhetorical approach is horrible. It's the contemporary equivalent of a flame on USENET, just using more recent but equally mindless rhetorical cliches.

In fact, it's so horrible it makes YouTube comments look good.


Rhetoric goes back centuries so it's unlikely I've come up with anything new. Otherwise, it contains genuine criticism that is rarely found in YouTube comments. The site is a patchwork of trendy design cliches that make the information less accessible than it would be in, for example, PowerPoint. And what we know about the mindless use of design cliches (see GeoCities) is that they date badly.

In sum, I think my complaint has more to it than your parodic reply.

There is in fact a more sensible response above: "We are aware of a lot of the issues on the home page and plan to completely redesign it soon."


Genuine criticism would simply explain the design cliches you didn't like and move on. Calling something horrible is not effective criticism. You're a horrible person! How are you meant to grow out of that? Comparisons to GeoCities and PowerPoint without explaining yourself are not doing much either.

The point of criticism is that you give something actually useful to the person you are criticizing. Unless your goal is simply to put down the other party. It's a cliche to just dismiss something out of hand, the same way I dismissed your arguments without justifying myself. The reason I replied like that was to demonstrate how fill-in-the-blanks it was. You cannot do that with real criticism. For example, this critique of your rhetorical cliches could not be applied to the website in question.


I was being brief. The average HN user is easily smart enough to get the point. And at least it addressed the point at issue, which made it more of a contribution than your content-free response.

By the way, you're also ignoring the original context. Anybody trying to offer helpful criticism surely wouldn't have done that, would they?


Sorry, I genuinely don't understand. How am I ignoring the original context (in my followup reply)?

I'll agree that my response was content-free. It was even aggressively hostile, and I could have been kinder about it. I'm sure that it was not fun to have someone flagrantly mocking your words.


My original was a response to https://news.ycombinator.com/item?id=6039938 and the context was that the site was being criticized.

> I'm sure that it was not fun to have someone flagrantly mocking your words.

Seriously? You must have missed Usenet in the 1980s.


Oh I see. Yeah, I never really considered it. I like to think that we live in a more... enlightened age than when Usenet was around, and that was a genuine apology.


None needed, really. I didn't take offense at your comment, and I didn't think it was unfair. I'm not carrying any grudges ;-)

(Belated response because I've just spent 24 hours travelling, and my brain is still something like a wet dishrag....)


(For parties that wanted to downvote this comment more than once: you can do so by clicking "link", then clicking "flag".)


From your profile: "Cofounder of a dev training program called Hack Reactor"

Hack Reactor is where this website originated, and you're behaving this way? You've been an HNer for over 2,000 days. You should know better.


Site was borderline unusable with Chrome on android. Text was overlapping, animations to display new text were being triggered after I scrolled past them and in some cases the text just flew in one side and right off the other side of the screen before I even got a glimpse of it.


We updated the landing page to better reflect your suggestions.


> r/Technology leads all SubReddits, with an average of 2,027 Karma per Post

Is this saying that all posts in r/technology get an average of 2027 karma?


My prior on that is extremely low, so my guess is that's like, of the posts that make front-page? Or of the top posts. What this really reflects, therefore, is something more like how up-votey people are.


Our data was largely from posts that were above 0 Karma, which I think is reflected in the analysis that we did.


Average is probably the wrong characterization of the distribution of karma per post.
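To make that concrete: karma per post is heavy-tailed, so a single viral post drags the mean far away from what a typical post earns. A toy sketch with made-up numbers:

```python
from statistics import mean, median

# Hypothetical karma scores: nine ordinary posts plus one viral outlier.
karma = [5, 12, 3, 8, 20, 7, 4, 9500, 6, 11]

print(mean(karma))    # 957.6 -- pulled up by the one 9500-karma post
print(median(karma))  # 7.5   -- much closer to a "typical" post
```

A median (or a percentile breakdown) would characterize "Karma per Post" far better than the mean for this kind of distribution.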


This is great and all, but if there is one tool I'd like to see, it's a personal data exporter.

I want to get all the posts and messages I have ever posted, going back many years, and beyond the 1000 cutoff in their comment history feeds. I'd like to run some keyword analysis on my own data, search it, access it however I see fit. As it stands now, there is no way to retrieve it.
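For anyone who wants to try anyway: Reddit's listing endpoints page through results with an `after` cursor, and stop serving new pages at roughly 1,000 items per listing, which is exactly the cutoff described above. A sketch of the pagination loop (the `fetch_page` callback and `make_fake_pages` helper are hypothetical stand-ins for real HTTP calls to e.g. a user's comments.json listing):

```python
def fetch_listing(fetch_page, cap=1000):
    """Walk a Reddit-style listing via its `after` cursor.

    `fetch_page(after)` is a hypothetical callback returning
    (items, next_after); Reddit listings stop at roughly `cap` items.
    """
    items, after = [], None
    while len(items) < cap:
        batch, after = fetch_page(after)
        items.extend(batch)
        if not batch or after is None:  # listing exhausted early
            break
    return items[:cap]

# Fake backend simulating 1,200 stored comments served 100 at a time.
def make_fake_pages(total, page_size=100):
    data = list(range(total))
    def fetch_page(after):
        start = 0 if after is None else after
        batch = data[start:start + page_size]
        nxt = start + page_size if start + page_size < total else None
        return batch, nxt
    return fetch_page

print(len(fetch_listing(make_fake_pages(1200))))  # 1000 -- the cap bites
```

No matter how the loop is written, everything past the cap is simply unreachable through the public listing API, which is why a true exporter would need Reddit's cooperation.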


Thanks for your feedback, but Reddit's API doesn't provide what you're asking for.


How much of "the Reddit" did you actually use to calculate these stats? Is this based on a one-time snapshot from the API (which limits you to 1,000 posts per type of query), or is it a longitudinal crawl (and how was the data produced in that sample)?


One-time snapshot. Given more time on this project, we would love to do a larger analysis.


You should know this site isn't usable/viewable on iPhone Safari or Chrome.


Sorry about that; it's our initial prototype.


For how long did you collect data? During previous crawls, I've found that one of my spider bots can scrape through about 11.5k submissions and 51k comments per day (while observing Reddit's API access rules).


Side note: everyone on our team is available for hire.


What is your contact info?


email us at TeamReddit@gdi2290.com


This looks awesome. Next step is to make it "real-time."

I worked for an analytics company and got to build some pretty awesome visualizations using D3 and one of the problems I always ran into was that while the visualizations are cool, you rarely get any actionable information from the charts. I feel like this would be a lot better if at the end, there was some call to action.


Real-time analytics is tricky due to API limits (unless you can accept a "real time" of minutes/hours per update).

Example: Twitter's search API is limited to 15 queries of 100 Tweets every 15 minutes. Do you query 100 Tweets every minute, or 1500 Tweets every 15 minutes?
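One simple answer is to spread the budget evenly: 15 calls per 900-second window works out to one call per minute, which smooths out "real time" into a steady trickle instead of a burst followed by a 14-minute gap. A minimal pacing-helper sketch (the class and its names are ours, not any library's):

```python
class RateBudget:
    """Pace requests to `calls` per `window` seconds by spacing them
    evenly, e.g. 15 search calls per 900 s => one call every 60 s."""

    def __init__(self, calls, window):
        self.interval = window / calls
        self.next_ok = 0.0  # earliest time the next call may go out

    def wait_time(self, now):
        """Return seconds to sleep before the next request is allowed,
        and reserve that slot."""
        delay = max(0.0, self.next_ok - now)
        self.next_ok = max(now, self.next_ok) + self.interval
        return delay

budget = RateBudget(calls=15, window=900)
print(budget.wait_time(0.0))  # 0.0  -- first call goes immediately
print(budget.wait_time(0.0))  # 60.0 -- next must wait a minute
```

Bursting (1,500 Tweets every 15 minutes) maximizes freshness right after each burst; the even spread keeps worst-case staleness down to one interval. Which is better depends on what the dashboard promises.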


One other challenge is that some of the presentation (especially portions of the front page) isn't data-driven. It can be tricky to figure out which information to highlight, what the clustering topics should be, etc.


I've got an experimental, realtime comment search engine over at commentfindder.com. Perhaps some useful visualizations could be created off of it. Do you have any ideas for what kind of actionable chart would be interesting?


We should do one for Hacker News too.


It's much harder to get data from Hacker News, as HN doesn't have a public API (and HNSearch's API limits make it infeasible).


>(and HNSearch's API limits make it infeasible)

I searched around and couldn't find information on API limits. Are these documented, or just discovered by running into them?

The Hacker News karma tracker posted this past January seems to be able to get the information that would be necessary for writing something like this.


HNSearch has a limitation of 1000 maximum results returned (see [1]).

You can take a snapshot of the Front Page / New Posts every so often a la [2], but it's limited per the robots.txt. It's possible, but more time-intensive (as in the linked post).

[1]: http://api.thriftdb.com/api.hnsearch.com/items/_search?q=med...

[2]: http://mayank.lahiri.me/writing/hackernews/index.html
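The snapshot approach works because repeated snapshots accumulate well past any single query's 1,000-result cap; the only bookkeeping needed is deduplication by item id. A sketch (the dict-based records are illustrative, not any real API's schema):

```python
def accumulate(snapshots):
    """Merge repeated front-page snapshots, keyed by item id.
    Later snapshots overwrite earlier ones, so each item keeps its
    most recently observed score."""
    seen = {}
    for snap in snapshots:
        for item in snap:
            seen[item["id"]] = item
    return seen

day1 = [{"id": 1, "score": 10}, {"id": 2, "score": 40}]
day2 = [{"id": 2, "score": 90}, {"id": 3, "score": 5}]
crawl = accumulate([day1, day2])
print(len(crawl))         # 3 unique items across both snapshots
print(crawl[2]["score"])  # 90 -- the latest observation wins
```

Over weeks of polling this yields a longitudinal dataset no single API query could return, at the cost of crawl time and the polling-interval blind spots.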


Thanks for the information (including the links).

Based on other comments on this post, it looks like HNSearch and Reddit APIs have the same limitations, so doing something like this should be feasible for Hacker News. Also, another post indicates that the data for Reddit came from single 1000-limit queries, so the time-intensive method shouldn't be needed to produce equivalent results (although it could provide more accurate ones).

Finally, for those looking for something like this for Hacker News right now, check out these sites:

http://www.hntrends.com/, http://hntrends.jerodsanto.net/, Hacker News hiring trends (https://github.com/adamw523/hackernewshires)



