MIT OCW: Statistics for Applications (2016) (ocw.mit.edu)
278 points by tosh on July 19, 2020 | 37 comments

I got the chance to casually work through this course while stuck at home during COVID-19. I highly recommend it to anybody who has a surface-level understanding of statistics and wants to dive deep. This is NOT a course for absolute beginners. It is rigorous and thorough. Overall a great resource, especially the notes, but the lecturer is entertaining too.

> This is NOT a course for absolute beginners

I disagree with that. As long as you have some mathematics background (calculus, a bit of linear algebra) and an understanding of probability theory (which can be taken from the prerequisite course https://www.edx.org/course/probability-the-science-of-uncert...), this course is self-contained and does not need prior knowledge of the subject.

I was a complete beginner in the subject and I am able to follow the course without too much difficulty.

I took a look at the material (the slides on Method of Moments, in particular) and my feeling is that it is a particularly mathematically-heavy treatment of statistics. As it states in its goals, it aims to introduce the mathematical theory of statistics. On the course's main page, it is listed as a senior undergraduate/graduate course. The style tends toward being expository rather than pedagogical -- it's a very French/European approach to teaching mathematics.

It does seem to require mathematical maturity beyond the basics, and in my opinion this is likely not accessible to most beginners without some advanced mathematical training.

If you find it accessible as a beginner then I congratulate you on your mathematical prowess.

"Basic" is ill-defined, but I think sloonz is right about it only requiring calc + linalg, which most CS / Eng majors will have taken.

So I've taught undergrad courses, and my sense is, the average US college engineering sophomore with linalg and say Calculus II (but no Real Analysis) might struggle with this material somewhat. They may know the material for linalg and calculus (and may have gotten As), but my feeling is that many would not have reached the mathematical maturity to truly internalize concepts.

I would place this course maybe at the senior level (with graduate level cross-registration)...400-500 level elective.

What is your sense?


Side note: it's interesting that in other countries, e.g. France, the math curriculum is so darned advanced. In undergrad Year 1 at École Polytechnique, real analysis and variational methods are already covered in common courses.


Functional analysis in Year 2.


Then again the top French schools filter out non-math folks via classes prépas and exams.

I'm in the demographic you describe, yet I've had a hard time finding resources to develop that "mathematical maturity" short of going back to graduate school. Which I'd love to do, but am at a point in my career where that would be devastatingly expensive to my future since these are the prime earning and wealth building years.

I wish there was more of a self directed way to achieve this.

I just finished an MS in math and statistics a long time after doing a non-mathematical undergrad degree. I feel a lot of what is called mathematical maturity is actually getting comfortable doing proofs, which I think is hard to learn while also learning more advanced math. I would recommend working through "Mathematical Proofs" by Polimeni/Chartrand/Zhang. Unlike math at an earlier level, you can't just check your answers against the official ones to see if you made a mistake. Writing proofs is more like writing essays: the grammar is the easy bit, and it's the process of putting the arguments together in the right detail and the right order that's important and hard to do without feedback. So you also need to get feedback from mathematicians on your proofs if possible. The best way to do that if you don't have a buddy who happens to be a mathematician is to learn to use LaTeX and ask questions on math.stackexchange.com.

An alternative is to do the proof and abstract algebra courses via (asymmetric) distance learning at https://westcottcourses.com/courses.html

I know someone who took these courses and felt like they got good feedback on their homeworks from the profs running the course.

Feel free to get in touch if you want to chat, I spent a long time trying to self-learn this stuff before starting my math MS, so happy to help in any way I can!

Since it seems like you were already motivated and interested in learning Math on your own, how would you describe what your learnings were before you enrolled in formal studies? In other words, if you could travel back in time to talk to yourself before you made the decision to enroll in a Master's program, what would the younger you have asked the older you and what would be the response? For example, I'm thinking a reply might be like "well you're going to miss out on opportunities xyz by committing to a Master's program, but because I know you and know you wouldn't be happy without satisfying your desire to learn Math in a more formal study the trade-off is worth it. And you don't know this yet but when you start learning about P,Q,R you'll really get a kick out of it" :-)

Thank you for the resources, it is greatly appreciated!

Just curious, the topic above aside: what would you say is the main reason behind why you'd like to strive for "mathematical maturity"?

(from what you wrote, I gather it's not for reasons of vocation?)

I'm currently doing MITx's Fundamentals of Statistics MOOC, which seems fairly similar to this one. The course material doesn't require too much understanding of real analysis, though the instructor does make cursory acknowledgements of the "technical details" he's glossing over, presumably for the more advanced students. The only analysis concepts we've used are continuity, differentiability, and convergence. It isn't a proof-based course: the instructor does prove some theorems in class, but the psets are all just computation. I do agree that it'll be harder to internalize some of the concepts w/o a background in analysis, but I think you can get a reasonable amount out of the course regardless.

That said, I did study analysis (but not measure theory) in college, and I don't entirely remember what I learned from calc vs analysis classes, so I may be a bit off here.

The "undergrad Year 1 at École Polytechnique" is really the junior year, since the freshman/sophomore years of university education would have been done in prépas. It is undergrad, and it would be their first year at the school, but it is quite misleading to call it "undergrad Year 1". Given that undergrad is three years in France, "undergrad Year 1 at École Polytechnique" means "last year of undergrad".

Ah what you say is true... upon examination, this curriculum is for the 4-year ingénieur polytechnicien program, which culminates in a Master's degree (diplôme d'ingénieur).

Note that this is unusual in that it lasts four years, not three.

Standard curriculum is three years of bachelor's, then two years of master's.

In the prépa - engineering school track, it is two years of prépa, then three years of engineering school that gives out a master's in engineering.

Thus the first year of engineering school maps to a (third) last year of undergrad, the second to a first year of a master's and the last to the last year of a master's.

For those who might speed through 18.650, the natural next step is [0] 18.655 (Mathematical Statistics) followed by the new course [1] 9.521 (Non-Asymptotic Perspectives in Statistics).

[0] https://ocw.mit.edu/courses/mathematics/18-655-mathematical-...

[1] http://www-math.mit.edu/~rigollet/IDS160/notes.html

I actually agree with your point on prerequisites. I think by absolute beginner I meant to say no experience with probability theory either. With the prerequisites you outlined, I think a determined person could do well with the material. Mainly posting this follow-up to correct myself so people curious about whether they'd be able to follow the course are better informed.

"I disagree with that, as long as you have some mathematics background (calculus, a bit of linear algebra), and an understanding of probability theory"

right, so it's not for beginners.

What would you recommend for someone who doesn't have the calculus and linear algebra?

Gilbert Strang's linalg lectures on MIT OCW are amazing.

I agree with this. I'm currently taking the Statistics and Data Science MicroMasters from MIT on the edX platform. When you see a course titled "Introduction to .... ", it's never really introductory material. They go really deep. I've taken "Probability - The Science of Uncertainty and Data". Granted, it starts by introducing very basic concepts, but it quickly delves into very advanced topics. I must admit, Prof. John Tsitsiklis has done an awesome job in the course and even a beginner can take it. It's not an easy course to complete under deadlines.

How was your experience? I am thinking of taking the MicroMasters. Please do mention your background, what your goal was before taking the course, and what you achieved. Thanks.

Hey mate.. the probability course was really rigorous and I found it hard, especially with a full-time job. I failed the first time, but mostly because I didn't have enough time to study. It's an awesome deep dive into the world of probability and lays a solid foundation to get into statistics. I'm currently taking the "Data Analysis for social scientists" course. It's ambitious and has a lot of content. However, the material only covers the subject at a very shallow level and expects a lot from the students when it comes to homework exercises. Also, make sure you learn R before you start the course, that'd help a lot.

I completed my bachelor's in Electrical Engineering 14 years ago. I'd been looking to get some formal education in Statistics as I work as a data analyst. The course has been really good and I'm learning a lot of the theoretical aspects of data analysis that I otherwise would never have learnt through DataCamp or Udemy or others. I'm glad to have started it; however, it's not practical and it won't help you get a job immediately. It's very much academic in nature, but you can learn the practical stuff from other sources if you need to, or learn it on the job. Depends on what you're after. I'm not doing it to get a job, rather doing it to get into academia. Hope that helps.

Anybody have a recommendation for courses geared toward absolute beginners? Maybe one that would be a good segue to this course?

I would be interested in a beginner course too.

Maybe Stats 110 from Harvard? YouTube videos:


Professor Blitzstein summarizes top 10 ideas on Quora:


Cheatsheet for the class: http://www.wzchen.com/probability-cheatsheet

Book is also online: https://drive.google.com/file/d/1VmkAAGOYCTORq1wxSQqy255qLJj...

There is an EdX version of it (with problem sets) : https://www.edx.org/course/fundamentals-of-statistics

I am taking the course right now. It is very good but it is also very hard. You need to spend between 10 and 20 hours each week on the lectures and homework.

The EdX course is very good and is organized by the staff from the OCW course.

Looks like a very nice list of topics. Glad to see GLMs in there -- I think if more folks had exposure to GLMs, then we could dispel the myth that statistics is a bag of tricks. GLMs are a very awesome expansion of regression that connect many themes in statistics.

Not only are GLMs there, this is far and away the clearest explanation of GLMs I've ever seen. If you've ever wanted to learn the theory behind logistic regression, the last 3 lectures are a must-watch.

I sorta disagree with the order in which things are taught. I don't particularly like that likelihood is taught before regression, when there's a really easy (and intuitive!) reason to learn it.

I like the way that I was taught when I took biostats in graduate school. We first covered the t-test and ANOVA as a way to intuitively compare differences among groups. From that, we then generalized our approach, and learned that linear regression is a generalized form, which happens to have a closed form solution. From linear regression, we then learned about logistic regression and odds ratios, and about the log-link formulation. But drats! We don't have a closed form solution for calculating our Betas. But we can estimate a good set of them using likelihood!
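
A rough sketch of that progression, in plain Python with no external libraries. The data, the function names `ols_fit` and `logistic_fit`, and the choice of Newton-Raphson as the optimizer are all assumptions made for illustration; none of them come from the course or from the comment above. The first part shows that regressing on a 0/1 group indicator recovers the difference in group means through a closed-form formula. The second fits a logistic regression by iteratively maximizing the likelihood, since no closed form exists in general.

```python
import math


def ols_fit(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def logistic_fit(x, y, iters=25):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # negative Hessian (information matrix)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, with the 2x2 inverse written out
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
    return b0, b1


# Hypothetical data: group indicator (0 = control, 1 = treated).
group = [0, 0, 0, 0, 1, 1, 1, 1]

# Continuous outcome: the regression slope is the difference in group means,
# which is the quantity a two-sample t-test compares.
measurement = [4.0, 5.0, 6.0, 5.0, 7.0, 8.0, 9.0, 8.0]
intercept, slope = ols_fit(group, measurement)
print("OLS intercept, slope:", intercept, slope)

# Binary outcome: exp(b1) is the estimated odds ratio between the groups.
event = [0, 0, 0, 1, 1, 1, 0, 1]
b0, b1 = logistic_fit(group, event)
print("logistic b0, b1, odds ratio:", b0, b1, math.exp(b1))
```

With a single binary predictor, the fitted slope should agree with the log of the sample odds ratio, which makes a handy sanity check on the optimizer. With continuous or multiple predictors there is no such shortcut, and the iterative likelihood fit does the work.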

Statistics is often fraught with nuance, and I think that convincing people to think harder about what we're trying to accomplish with a statistical analysis, and how to prove it, deserves more emphasis than just "let's make an alphabet soup out of our slide decks!". That's not to say you shouldn't have mathematical rigor, but mathematical rigor without an intuitive understanding potentially gives the impression that more complex analysis == better analysis.

How does this (or its EdX version) compare with ESL? (https://amzn.com/0387848576)

I took this class in-person before reading ESL. I'd say there's more overlap between this class (18.650) and the class textbook All of Statistics (Wasserman) than ESL.

That said, ESL is a better companion than Wasserman if you want to apply the statistics to ML and don't plan on studying the graduate-level statistics courses. ESL + 18.650 + 9.520 (Statistical Learning Theory, Poggio and Sasha Rakhlin) covers 95% of the math and statistics I've seen in ML research.

I had a negative experience with this course. I have taken a lot of statistics courses on Coursera and edX. I tried this course on edX and found it not well organized or reasoned out. I don't remember specific instances to give as examples; this is just my opinion. FYI: I have very high respect for MIT, its professors, and OCW courses.

Which courses do you recommend?

What's a good online course that will help provide me a refresher on my maths?
