Ask HN: Do any of you do data science as a hobby?
38 points by RealAnalysis on Aug 22, 2017 | 17 comments
I work in mergers & acquisitions (math major undergrad, Python classes) but enjoy looking at interesting financial and non-financial datasets in what little free time I have.

What project(s) have you worked on? Can a self-taught hobby data scientist produce solid/meaningful statistical analysis/results products?




While working as a Software QA Engineer at Apple, I wrote a large number of statistical blog posts to get a breadth of skills outside of my role (story: http://minimaxir.com/2017/05/leaving-apple/).

I did have a statistical background before starting at Apple, however. Most of my projects were self-taught and done as self-research (I am not a fan of the take-a-million-MOOCs strategy everyone likes).


Thanks for the link!

I'm mostly interested in self-research.

How'd you know you were doing sound stats?


Oh that's easy ... put money on it, it'll make you a better statistician in no time.


With a team of data-scientists-as-a-hobby, we are building www.17-56.cl, which displays insights from data about Chile. It is in Spanish, but feel free to look around. For example, I created a map that displays the voting results in the Chilean primary elections: http://fernandoi.cl/mapascomunales/primarias/primarias.html. Simple, but informative and fun.

Probably as a hobby you will not be able to write an analysis that will get published in a paper, but there is a lot of descriptive analysis out there that can be very interesting.


Awesome, I saw that we have so much data published by the government but it was always buried in endless spreadsheets. Thanks for working on this!


Yes, it is mostly government data! Also, this year there is a presidential election in Chile, so there is a bias towards that topic.


I really like it :-)

Never underestimate descriptive analysis. After all, that's the first step you take before digging further in the data.


How do you know your statistical analysis is good?

I'm rereading my intro to stats book and would like to use those skills to make good analysis.


It depends on the type of analysis you are doing. It is very difficult to generalize. If you are looking for causation, I would say to not even try it unless you are writing a paper.


Yup, from time to time. I also use data science in one side of my business (online marketing), but as a hobby I periodically do data analysis of the computer game DOTA2.

There's a large (huge, actually) available dataset, and lots of interesting information you can mine comparatively easily.


I like doing data science courses for fun. These days I am reading the CS 229 mathematics and the book Python Machine Learning. Once in a while I get tired of coding in JS or adding things to side projects, and then I like doing simple learning, just for the sake of learning and trying to understand (no strings attached). I like the maths part of data science and maybe would pursue it as a theoretical endeavor. Or just repeat what people have posted on GitHub and see the results, like kids soldering circuits. It would be fun to take part in some Kaggle competitions next year, around healthcare. Back in 2014, I spent some time working on an SVM classifier for news.


My project is https://finintelligence.com

It provides access to public/private company financials, document search and some custom reports. I have tons of ideas in mind: a configurable stream of company events, alerts, intelligent search for company data, etc.

I launched it just about a month ago, so it is still in a semi-stealth prototype mode, but I am happy to receive any feedback, feature requests and first real users :)


Built a model and app (Splunk app) to predict Medicare and opioid prescription fraud based on published Medicare claims datasets at data.cms.gov.


How did you know which claims were fraudulent?


I'm really just wanting to do my own self-research on topics and datasets that seem interesting, but I also want to ensure the analysis I'm doing is sound.


Learn to cross-validate, typically by predicting something out-of-sample, or by sense-checking what your model says about things you're sure about.
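
For anyone who hasn't done this before, here is a rough sketch of what out-of-sample checking can look like. Everything in it is an assumption for illustration: the data is synthetic, the model is a plain least-squares line, and the helper names (fit_line, k_fold_mse) are made up for this comment, not taken from any library. The idea is just to fit on part of the data, score on the part the model never saw, and compare against a naive "always predict the mean" baseline.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def k_fold_mse(xs, ys, k=5, seed=0):
    """Mean squared error where every point is predicted by a model
    that was fitted without it (k-fold cross-validation)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint groups of indices
    errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Score only on the held-out points.
        errors.extend((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    return mean(errors)


# Synthetic data (an assumption): a known linear trend plus unit noise.
rng = random.Random(42)
xs = [float(i) for i in range(100)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1.0) for x in xs]

cv_mse = k_fold_mse(xs, ys)

# Naive baseline: squared error of always predicting the overall mean.
y_bar = mean(ys)
baseline_mse = mean((y - y_bar) ** 2 for y in ys)

print(f"cross-validated MSE: {cv_mse:.2f}")
print(f"mean-only baseline:  {baseline_mse:.2f}")
```

If the cross-validated error is not clearly better than the baseline, the model probably isn't telling you much. The second check is the sense check: here the data was generated with a slope of 0.5, so a fitted slope far from that would be a red flag.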


Absolutely. I've been doing it for years with stock market data and Python, developing trading algos, and very successfully, I might add. I could actually retire tomorrow, but every year I postpone it because of the team I lead...



