Advanced Data Analysis from an Elementary Point of View (2017) [pdf] (cmu.edu)
168 points by mindcrime on Feb 19, 2018 | 26 comments

I did take this class at CMU in 2012; I liked it a lot and it's good to see the curriculum hasn't changed much since.

That said, working with highly bespoke R packages made life crazy. I'm thankful that, post-graduation, ggplot2 was an option for R data visualization, and that the development of the rest of the tidyverse packages after I graduated lets me use R in a much less messy manner.

`This book does not presume that you once learned but have forgotten the material from the pre-requisites; it presumes that you know that material and can go beyond it. The book also presumes a firm grasp on linear algebra and multivariable calculus, and that you can read and write simple functions in R. If you are lacking in any of these areas, now would be an excellent time to leave.`


This probably reflects the fact that the text is an only partially rewritten set of lecture notes. It would be an absolutely vanilla thing to say at the beginning of the first lecture of a crowded course.

Pretty much par for the course for graduate level maths.

Linear Algebra, Multivariable Calculus

They underpin most of what is practical in maths, or so it seems to me.

I took a quick look, and they start at very very basic stuff.

> now would be an excellent time to leave

What does the author gain from this statement? Now, if I were to be interested in this book, I would now be not interested and would encourage others to avoid it.

Quite rude.

On the contrary, I like that the author spells out clear guidance on what is necessary preparation for the material. It lets me know right away if I should keep reading or look for more introductory material.

I didn't read it as rude. Rather a "hey you better know what you're signing up for here".

I don't think the author gains anything. On the contrary, as a reader I gain something from this statement. It's not "now would be an excellent time to leave and you should feel bad you stupid person", it's "now would be an excellent time to leave so that you don't waste your time trying to build a house on a junk foundation". If anything, it's courteous.

The author doesn’t gain anything, it’s entirely to help the reader understand their own readiness for the material. It’s not rude, it’s straightforward and matter of fact.

If you don’t meet the prerequisites, you should avoid the material. Not because the author has stated the obvious, but because you’re not prepared for it. If you’d like, you can interpret it as rudeness, but really it’s intended to help people save time instead of wasting it.

The first thing I do when I open a new textbook is flip to the preface material to see what the prerequisites are and how firm they are. Sometimes “passing familiarity” is enough, sometimes “mathematical maturity” is enough. But there are many treatments where that is not the case, and it’s better for everyone who’s serious about the material to be upfront about it.

Then the correct approach is "you are likely not ready for this material, I recommend you do XYZ and come back".

There's no problem with telling a reader they are not ready for a text - in fact, it can save them time. The way in which you do it though can encourage or discourage a student.

I don't think it's rude at all.

From the title, I'm very interested in the topic since a portion of my job is some form of data analysis and aggregation.

However, it's been 15 years since my Linear Algebra course and I haven't done Multivariate Calculus since high school, which even then was just a very basic introduction to the topic. I'm too far removed from that level of math at this point in my life. Further, the data I'm looking at doesn't often lend itself toward linear regressions or mathematical analysis. At least, not in a meaningful sense. It's rare that even a simple standard deviation is even particularly useful.

I appreciate that the author tells me up front that the type of paper being presented isn't the type of paper I'm interested in reading.

> What does the author gain from this statement? Now, if I were to be interested in this book, I would now be not interested and would encourage others to avoid it.

That's not very rational. That your feelings are hurt has no impact on how good or bad the book is.

That is a great example of a previous conversation about feelings here on HN. https://news.ycombinator.com/item?id=16384453

> That's not very rational.

Don't need to be rational. Human beings aren't rational.

> That your feelings are hurt has no impact on how good or bad the book is.

It could be a great book, but if somebody is rude then (some are suggesting that the comment is tongue-in-cheek but meh) I don't really need to engage with that person. There are plenty of great books that cover this material so I would just suggest others to look elsewhere instead. Ultimately - I don't reward poor behavior. There are very few people who could make a significant enough contribution that I would give a pass on rude behavior towards others, particularly if they are condescending or mean for no reason.

I also found it peculiar that you would believe that my feelings would be hurt. Why would they be? I just found out about this particular book and have no emotional attachment to it, nor the author whatsoever. Maybe you're defensive because you believe the book is good and worthy of being read despite the author's rudeness? That would certainly better explain your need to attempt to attack my "rationality" and suggest that my "feelings were hurt" alongside posting a link that wasn't worth reading.

> Don't need to be rational. Human beings aren't rational.

Some are more than others.

> Why would they be?

I don't know why they would be, but the fact that you say "that's so rude I'm not reading your book" is a clear indication that they are.

> Maybe you're defensive because (blablabla)

What the hell are you talking about? I was just pointing out that you were letting your emotions interfere with making good decisions. I was doing this for your benefit. If you don't care about that fact, I don't either. Now, have a nice life!


The commenter isn’t trolling, you’re just being incredibly literal in your interpretation and responses, so much so that you’re being obtuse. He said “your feelings were hurt” because you said it’s rude - you don’t need to start a diatribe litigating the commenter for assuming your feelings were hurt. If you find something rude, you took offense to it in some sense, and if you take offense to something, reasonable people could say your feelings were hurt in some sense.

On a similar note, when the commenter said what you’re saying isn’t rational, you took the opportunity to point out that humans as a species aren’t rational. That feels like you’re deliberately missing the point, but in case you’re not, here’s that point restated: It doesn’t make sense for you to call an author rude when the author is matter of factly informing the reader of the necessary prerequisites.

For what it’s worth, it’s apparent that the other commenters who responded to you, myself included, agree with the substantive meaning you’re attacking so literally.

His writing is somewhat tongue in cheek.

For whatever it’s worth, he seems to be a dedicated teacher who posts self criticisms of his courses publicly online. The course this book is based on has grown quite successful as well.

I’m not sure if you’re just trolling, but encouraging others to avoid this book might be a mistake. It’s quite good.

The fact that the book isn't meant for you, and the annoyance of people constantly emailing you "I read your thing and I don't get it help".

The author has a pretty interesting homepage, especially the notebooks: http://bactra.org

Second that. Some of these notebooks are excellent literature reviews, highlighting really neat connections between results.

I call myself a data professional or whatever, but looking at this table of contents I realise there is so much I can still learn!

This author has an amazing writing style. I just learned about the history of the term "regression" as used in statistics. Definitely worth looking up: it has nothing to do with curve fitting in general, but comes from a canonical example that first inspired drawing a fit.

This is my favorite statistics factoid! Regression/reversion to the mean is the idea that if you observe an extreme value and remeasure it, it will tend toward the average value on the second observation. For example, students who have a very high or very low score on a test will have a more average score if you retest them.

Sir Francis Galton demonstrated the idea of regression to the mean by inventing linear "regression." He plotted the heights of children vs the heights of their parents and showed that very tall parents tend to have children who are, on average, not as tall.

The fact that we call linear regression "regression" has nothing to do with reverting to the mean other than, coincidentally, linear regression was first used to illustrate the concept.
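
To make the retest example concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption for illustration and none of it comes from the book: each score is modeled as a fixed "ability" plus independent noise, and the names `simulate_retest` and `ols_slope`, the sample size, and the standard deviations are all made up. It checks two things: the students in the top 10% on the first test score closer to the overall average on the retest, and the least-squares slope of second score on first score comes out below 1, which is the pattern Galton saw in heights.

```python
import random
import statistics


def simulate_retest(n=20000, ability_sd=10.0, noise_sd=10.0, seed=0):
    """Hypothetical model: score = stable ability + independent test-day noise."""
    rng = random.Random(seed)
    ability = [rng.gauss(100.0, ability_sd) for _ in range(n)]
    test1 = [a + rng.gauss(0.0, noise_sd) for a in ability]
    test2 = [a + rng.gauss(0.0, noise_sd) for a in ability]
    return test1, test2


def ols_slope(x, y):
    """Slope of the least-squares line of y on x: cov(x, y) / var(x)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


test1, test2 = simulate_retest()

# Pick the students who scored in the top 10% on the first test.
cutoff = sorted(test1)[int(0.9 * len(test1))]
top = [(a, b) for a, b in zip(test1, test2) if a >= cutoff]

overall_mean = statistics.fmean(test1)
top_first = statistics.fmean(a for a, _ in top)   # their first-test average
top_second = statistics.fmean(b for _, b in top)  # same students, retested
slope = ols_slope(test1, test2)

print(f"overall mean on test 1:      {overall_mean:.1f}")
print(f"top 10% on test 1, test 1:   {top_first:.1f}")
print(f"top 10% on test 1, test 2:   {top_second:.1f}")
print(f"slope of test 2 on test 1:   {slope:.2f}")
```

With equal ability and noise variances, as assumed here, the slope should land near 0.5: about half of an extreme first score is luck, and the luck does not carry over to the retest.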

From the book's website:

"This is a draft textbook on data analysis methods, intended for a one-semester course for advanced undergraduate students who have already taken classes in probability, mathematical statistics, and linear regression. It began as the lecture notes for 36-402 at Carnegie Mellon University."

There is also an overview of tools for "complex systems", by the same author: https://arxiv.org/abs/nlin/0307015

Love this book, it's a great intro to statistics-based learning.

