
Foundations of Data Science [pdf] - Anon84
https://www.cs.cornell.edu/jeh/book%20no%20so;utions%20March%202019.pdf
======
codesushi42
Looks like a mess. Use this instead:

[https://d2l.ai/](https://d2l.ai/)

------
therobot24
this is a weird collection of ML/computer vision/data processing topics

wavelets as its own chapter, deep learning only has GANs as a single
subsection, graphical models several chapters after the ML chapter...just
weird arrangement choices

~~~
mlevental
it's a theory-heavy intro to a wide variety of analytical techniques. it's a
common refrain that deep learning has very little in the way of theory, and in
fact GANs, with their optimization scheme framed as a game played until
equilibrium, are one of the few pieces that have some. that's also interesting.

------
crimsonalucard
If I read and master most of the stuff in this book, am I qualified to do data
science?

~~~
soVeryTired
It depends what you mean by "do data science". If you've read and mastered
what's in the book, you'll have a good chance at writing a nice PhD in machine
learning. But there's a lot of academic fluff in there that isn't too useful
outside of university.

Most data scientists are _consumers_ of algorithms, not _producers_ of
algorithms. The rules are a bit different if you're at a bigco, but most data
scientists don't do active research. It's nice to have a solid theoretical
understanding of machine learning, but most data scientists' day-to-day
consists of chaining together libraries and building nice dashboards.

~~~
crimsonalucard
By do data science, I mean the day to day stuff. So actually it's not that
hard then if all they are doing is chaining together libraries.

Is there anything this book would be missing for the day-to-day stuff?

~~~
joker3
> By do data science, I mean the day to day stuff. So actually it's not that
> hard then if all they are doing is chaining together libraries.

Anyone who believes this is dangerous and shouldn't be allowed anywhere near a
data science project. The programming is easy, but working with data and
making good inferences is very hard.

~~~
tekkk
Sounds a bit like gatekeeping to me. I mean it isn't magic, it's still 0s and
1s which yield results that are of some use to someone. Sure, it would be nice
if everyone were a master of DS, but in reality people make mistakes and,
sadly or luckily depending on how you look at it, you can still get useful
results even when some of your assumptions about the data are wrong.

But I understand what you mean, and completely agree that poseur amateurs who
treat DS as just easy "number fluff" and expect fancy ML frameworks to solve
all their problems are cancerous and ruin the reputation of DS as a field.

Yet the more I have worked with regular people at work, the more I have moved
from the camp of always doing the "right thing" to hoping that people would
just "do something". I can't fix every problem so maybe I'll just deal with
the reality as it is and try to make the best of it. I won't start quizzing
people at work about DS know-how, but maybe I'll quietly guide them towards
understanding what they are doing instead of driving them out of the room and
keeping them somewhere far away from the data science team.

------
devicetray0
I was hoping there would be a section on, or at least a mention of, data
privacy, but a CTRL+F revealed no occurrence of the term.

