Two big design issues: if you scroll down from the wheel of terms, it goes wobbly, and clicking the down arrow below that last wheel is also wobbly. And if you use one of the tabs to go to a graph, you create a 'back loop' where the back button takes you to the start, then back again takes you back to the graph, then back to the start, then back to the graph ....
The data is interesting but I expect to see a bit of analysis in projects like this. Without it the project becomes a sort of data Rorschach test where the viewer projects their perceptions into it.
We are aware of a lot of the issues on the home page and plan to completely redesign it soon. However, which graph are you talking about? If it's the post tracking, we will definitely look into it.
I had clicked on the rightmost tab (labeled Graphs), which shows the Subreddit graph, and then got stuck in the back-button loop. Although I just tried it again and it didn't get stuck in a loop, so you may have fixed that one already.
I can't even read this on my iPhone. All of the text is forced into a tiny column so small that it wraps after nearly every word. A lot of the text is pushed right off the right edge of the page.
Please don't alter the viewport, you're doing it totally wrong.
> "Dogs" occurs more times in titles in r/aww than "cats" or "kittens"
That can be a fact.
> Despite their internet popularity cats are not submitted nearly as many times on this cuddly SubReddit.
Or dogs are rare enough that they're worth naming, because cats are default. Seriously, run whatever calculations you want, but be careful about what conclusions you draw from the numbers.
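For what it's worth, the raw comparison itself is easy to reproduce; a minimal sketch of title keyword counting (the sample titles and the plural folding here are made-up illustrations, not the site's actual data or method):

```python
from collections import Counter
import re

def keyword_counts(titles, keywords):
    """Count how many titles mention each keyword.

    Case-insensitive, whole-word, with naive plural folding
    ("cats" -> "cat") -- good enough for a rough comparison.
    """
    counts = Counter()
    for title in titles:
        words = {w[:-1] if w.endswith("s") else w
                 for w in re.findall(r"[a-z]+", title.lower())}
        for kw in keywords:
            if kw in words:
                counts[kw] += 1
    return counts

# Hypothetical sample titles -- not real r/aww data.
titles = [
    "My dog meets snow for the first time",
    "Rescued this kitten today",
    "Cats being cats",
    "Dog and cat napping together",
]
print(keyword_counts(titles, ["dog", "cat", "kitten"]))
```

The count tells you which word appears more, but nothing about why it appears, which is exactly the parent's point about dogs being worth naming.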
What do you mean by "stir the pot"? The purpose of this site seems to be data-driven analysis of reddit. If the conclusions being drawn aren't rooted in that data, I don't see the point.
Nothing like a bunch of comments tearing apart a website built by people who, presumably, learned to code a few months ago and threw it together over the course of a few weeks. Yeah, it has a few issues, but it's significantly better than the first things I ever built and sent off to the world.
It's fundamentally horrible. The whole design approach is horrible. It's the contemporary equivalent of GeoCities, just using more recent but equally mindless design cliches.
In fact, it's so horrible it makes PowerPoint look good.
Your criticism is fundamentally horrible. The whole rhetorical approach is horrible. It's the contemporary equivalent of a flame on USENET, just using more recent but equally mindless rhetorical cliches.
In fact, it's so horrible it makes YouTube comments look good.
Rhetoric goes back centuries so it's unlikely I've come up with anything new. Otherwise, it contains genuine criticism that is rarely found in YouTube comments. The site is a patchwork of trendy design cliches that make the information less accessible than it would be in, for example, PowerPoint. And what we know about the mindless use of design cliches (see GeoCities) is that they date badly.
In sum, I think my complaint has more to it than your parodic reply.
There is in fact a more sensible response above: "We are aware of a lot of the issues on the home page and plan to completely redesign it soon."
Genuine criticism would simply explain the design cliches you didn't like and move on. Calling something horrible is not effective criticism. You're a horrible person! How are you meant to grow out of that? Comparisons to GeoCities and PowerPoint without explaining yourself are not doing much either.
The point of criticism is that you give something actually useful to the person you are criticizing, unless your goal is simply to put down the other party. It's a cliche to just dismiss something out of hand, the same way I dismissed your arguments without justifying myself. The reason I replied like that was to demonstrate how fill-in-the-blanks it was. You cannot do that with real criticism. For example, this critique of your rhetorical cliches could not be applied to the website in question.
I was being brief. The average HN user is easily smart enough to get the point. And at least it addressed the point at issue, which made it more of a contribution than your content-free response.
By the way, you're also ignoring the original context. Anybody trying to offer helpful criticism surely wouldn't have done that, would they?
Sorry, I genuinely don't understand. How am I ignoring the original context (in my followup reply)?
I'll agree that my response was content-free. It was even aggressively hostile, and I could have been kinder about it. I'm sure that it was not fun to have someone flagrantly mocking your words.
Oh I see. Yeah, I never really considered it. I like to think that we live in a more... enlightened age than when Usenet was around, and that was a genuine apology.
The site was borderline unusable with Chrome on Android. Text was overlapping, animations to display new text were being triggered after I had scrolled past them, and in some cases the text just flew in one side and right off the other side of the screen before I even got a glimpse of it.
My prior on that is extremely low, so my guess is that's, like, of the posts that make the front page? Or of the top posts. What this really reflects, therefore, is something more like how up-votey people are.
This is great and all, but if there's one tool I'd like to see, it's a personal data exporter.
I want to get all the posts and messages I have ever posted, going back many years, and beyond the 1000 cutoff in their comment history feeds. I'd like to run some keyword analysis on my own data, search it, access it however I see fit. As it stands now, there is no way to retrieve it.
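Reddit's listing endpoints do page through results with an `after` cursor, but as noted they stop around the 1000-item cap. A rough sketch of the pagination loop (`fetch_page` here is a stand-in for an authenticated API call, not a real client; the fake fetcher just exercises the loop):

```python
def fetch_all(fetch_page, limit=100):
    """Walk a Reddit-style listing by following the `after` cursor.

    `fetch_page(after, limit)` must return (items, next_after); iteration
    stops when next_after is None -- which, on Reddit, happens around
    the ~1000-item listing cap mentioned above.
    """
    items, after = [], None
    while True:
        page, after = fetch_page(after, limit)
        items.extend(page)
        if after is None:
            break
    return items

# Fake fetcher standing in for the API, pretending the listing caps at 250.
def fake_fetch(after, limit):
    start = 0 if after is None else after
    end = min(start + limit, 250)
    return list(range(start, end)), (end if end < 250 else None)

print(len(fetch_all(fake_fetch)))  # 250
```

The cap is the whole problem: once the listing stops returning an `after` cursor, older posts are simply unreachable through the API, which is why an official exporter would help.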
How much of "the Reddit" did you actually use to calculate these stats? Is this based on a one-time snapshot from the API (which limits to 1000 posts per type of query), or a longitudinal crawl (and if so, how much data did that sample produce)?
For how long did you collect data? During previous crawls, I've found that one of my spider bots can scrape through about 11.5k submissions and 51k comments per day (while observing Reddit's API access rules).
This looks awesome. Next step is to make it "real-time."
I worked for an analytics company and got to build some pretty awesome visualizations using D3. One of the problems I always ran into was that, while the visualizations are cool, you rarely get any actionable information from the charts. I feel like this would be a lot better if, at the end, there were some call to action.
Real-time analytics is tricky due to API limits (unless you can accept a "real time" of minutes or hours per update).
Example: Twitter's search API is limited to 15 queries of 100 Tweets every 15 minutes. Do you query 100 Tweets every minute, or 1500 Tweets every 15 minutes?
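Either pacing spends the same 1500-Tweet budget; the tradeoff is freshness per minute versus one consistent snapshot per window. A toy scheduler contrasting the two strategies (using the numbers from this comment, not necessarily current Twitter limits):

```python
def schedule(requests_per_window, window_seconds, spread=True):
    """Return send times (in seconds) for one rate-limit window.

    spread=True  -> space requests evenly across the window
    spread=False -> fire the whole budget at the start of the window
    """
    if spread:
        step = window_seconds / requests_per_window
        return [i * step for i in range(requests_per_window)]
    return [0.0] * requests_per_window

# 15 queries per 15-minute window:
evenly = schedule(15, 15 * 60)                 # one query per minute
burst = schedule(15, 15 * 60, spread=False)    # all 15 up front
print(evenly[:3], len(burst))
```

Spreading keeps the feed fresh for "real-time" display; bursting gives you 1500 Tweets captured at roughly the same moment, which is better for snapshot-style analysis.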
One other challenge is that some of the presentation (especially portions of the front page) isn't data-driven. It can be tricky to figure out which information to highlight, what the clustering topics should be, etc.
I've got an experimental, realtime comment search engine over at commentfindder.com. Perhaps some useful visualizations could be created off of it. Do you have any ideas for what kind of actionable chart would be interesting?
I searched around and couldn't find information on API limits. Are these documented or just found via experiencing them?
The Hacker News karma tracker posted this past January seems to be able to get the information that would be necessary for writing something like this.
HNSearch has a limitation of 1000 maximum results returned (see [1]).
You can take a snapshot of the Front Page / New Posts every so often a la [2], but that's limited per the robots.txt. It's possible, just more time-intensive (as in the linked post).
Based on other comments on this post, it looks like HNSearch and the Reddit API have the same limitations, so doing something like this should be feasible for Hacker News. Also, another comment indicates that the Reddit data came from single 1000-limit queries, so the time-intensive method shouldn't be needed to produce equivalent results (although it could provide more accurate ones).
Finally, for those looking for something like this for Hacker News right now, check out these sites: