

Show HN: Post-mortem of an Internet Accident - pravj
http://pravj.github.io/blog/post-mortem-of-an-internet-accident/

======
minimaxir
Since I've been taking a data-driven approach to GitHub Archive data myself,
I have a few comments:

1) A blog post is not a Show HN topic:
[https://news.ycombinator.com/showhn.html](https://news.ycombinator.com/showhn.html)

2) If you're plotting a density scatterplot in ggplot2, you _must_ use a
reduced alpha, otherwise there is no indication of density. (e.g. the
conclusion of "Almost 90% of our repositories have less than 20,000 stars and
20 languages." is not apparent from the chart)

3) Why did you use a "repository index" for the second chart? Why did you sort
it descending? Why are you using a scatter plot instead of a histogram?
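
To make points 2 and 3 concrete, here is a minimal sketch in plain Python (not ggplot2, and not the code behind the post). It assumes standard "over" alpha compositing, where n overlapping points of opacity alpha show a combined opacity of 1 - (1 - alpha)^n, and it bins values into fixed-width buckets the way a histogram would. The function names and the star counts are made up for illustration.

```python
from collections import Counter


def stacked_opacity(alpha: float, n: int) -> float:
    """Opacity seen where n points of the same alpha overlap ("over" compositing)."""
    return 1.0 - (1.0 - alpha) ** n


def histogram_counts(values, bin_width):
    """Count values per fixed-width bin; each key is the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical star counts, invented for this example only.
stars = [12, 40, 95, 130, 180, 220, 4100, 15000, 15500, 31000]

if __name__ == "__main__":
    # With alpha = 1.0 one point is already fully opaque, so density is invisible.
    # With alpha = 0.1 the darkness keeps growing as more points pile up.
    for n in (1, 5, 20):
        print(n, stacked_opacity(1.0, n), round(stacked_opacity(0.1, n), 3))

    # A histogram shows the distribution directly, with no index column needed.
    print(histogram_counts(stars, 10000))
```

With full opacity, one point and twenty stacked points look identical, which is why the density claim cannot be read off the chart. The binned counts give the same "most repositories are in the lowest bucket" picture without a synthetic X axis.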

~~~
pravj
First of all, let me say this: Max, you're the inspiration :)

1) Sorry for not following the guidelines; it slipped my mind. Lesson
learned, I won't repeat it.

Actually, this is the first time I'm doing something practical with data, so
I'll gradually learn and try to minimize my mistakes.

2) Yes, you're right. I'll change it.

3) Honestly, the X-axis labels were getting all jumbled because I was
plotting 'repository name' on the X-axis. As a workaround, I added a new
column of numeric indices, which fits in the image.

