
Show HN: An Enterprise-Grade Data Science Case Study (With Docs and Code) - cdl
https://github.com/calvdee/end-to-end-lead-scoring
======
cdl
As aspiring data scientists or data scientists new to the field (and working
in a business-centric rather than a research-centric role), it is easy to be
blinded by the technical aspects of data science such as exploratory data
analysis and machine learning, failing to see the forest for the trees.

My goal with this post and accompanying GitHub project is to provide a more
accurate example of an enterprise-grade data science project that includes
what is often missed in these sorts of examples: the rigorous positioning of a
business problem and scoping of a data science solution.

The majority of the time spent to complete this project was allocated to the
understanding and definition of the business problem rather than optimizing a
machine learning model. As with any other complex problem, spending some time
upfront to work through the various aspects of a business problem in a
systematic way will reduce the risk of coming up with the wrong solution,
which clearly wastes time (read: opportunity loss) and money.

