Growth of the Scientific Boundary (cole-maclean.github.io)
1 point by cole-maclean on Nov 5, 2015 | 1 comment

Hi all,

I created this as part of Udacity's Data Science nanodegree. It uses metadata from arXiv's API to build a categorization model for the scientific disciplines: each paper's title is used to build a "bag of words" vectorized model, and each paper is then allocated to a category based on the contents of its summary abstract. Each category grows out of a "parent category", sorted by a simple difference in the words of the category labels.
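For anyone curious what the bag-of-words step looks like in practice, here is a minimal sketch in plain Python. It is not the project's actual code: the toy titles, the cosine-similarity matching, and the helper names (tokenize, bag_of_words, classify, label_difference) are all assumptions made for illustration, and label_difference is only one possible reading of the "difference in words" between category labels.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical toy data: a few made-up paper titles per category.
titles = {
    "quantum physics": [
        "Quantum entanglement in photon pairs",
        "Photon spin measurement",
    ],
    "machine learning": [
        "Neural network training methods",
        "Learning with gradient descent",
    ],
}

# The vocabulary comes from the titles only.
vocabulary = sorted({w for ts in titles.values() for t in ts for w in tokenize(t)})

# One count vector per category, built from all of its titles.
category_vectors = {
    category: bag_of_words(" ".join(ts), vocabulary)
    for category, ts in titles.items()
}


def classify(abstract):
    """Assign an abstract to the category whose title vector is most similar."""
    vector = bag_of_words(abstract, vocabulary)
    return max(category_vectors, key=lambda c: cosine(vector, category_vectors[c]))


def label_difference(label_a, label_b):
    """Number of words that appear in one category label but not the other."""
    return len(set(tokenize(label_a)) ^ set(tokenize(label_b)))
```

Calling classify("We measure entanglement between photon pairs") picks the toy "quantum physics" category, since only that category's titles share words with the abstract. A smaller label_difference between two labels could then be used to decide which existing category a new one grows out of.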

Part of the project is to publish the visualization to gain feedback from users, so if you have any suggestions or feedback, please let me know! It's certainly not perfect, but as my first data visualization and my first exposure to programming with a JavaScript framework (d3.js), I'm quite proud of it.

I plan on writing up a blog post about developing this, mostly so I can reflect on the (sometimes frustrating!) personal growth experience, though hopefully others will find it interesting or informative. For now, I need to focus on finishing up the nanodegree (A/B testing next!).

