I think Bayesian linear regression is cool, but I'm not a big fan of Bayes Factor tests, since they are very sensitive to the choice of priors.
If we just switch from p-values (with 0.05 cutoff) to Bayes Factors (with hard cutoffs), then we'll have many of the same problems stats currently has...
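Here's a quick sketch of what I mean by prior sensitivity (a toy setup of my own: a normal-mean test with known sigma and simulated data, not anything from the article), showing how the Bayes Factor for H0: mu = 0 changes when the only thing you vary is the width of the prior on mu under H1:

```python
import numpy as np
from scipy import stats

# Toy data: n observations with known sigma (made-up numbers)
rng = np.random.default_rng(0)
n, sigma = 50, 1.0
x = rng.normal(loc=0.2, scale=sigma, size=n)
xbar = x.mean()
se = sigma / np.sqrt(n)  # standard error of the mean

# H0: mu = 0   vs   H1: mu ~ Normal(0, tau^2)
# Marginal distribution of xbar under H1 is Normal(0, se^2 + tau^2),
# so BF_01 = N(xbar; 0, se) / N(xbar; 0, sqrt(se^2 + tau^2)).
for tau in [0.1, 1.0, 10.0, 100.0]:
    bf01 = stats.norm.pdf(xbar, 0, se) / stats.norm.pdf(xbar, 0, np.sqrt(se**2 + tau**2))
    print(f"tau = {tau:6.1f}   BF_01 = {bf01:.3f}")
```

Same data, very different Bayes Factors: widening the prior on mu pushes the Bayes Factor toward the null (the Lindley/Bartlett effect), which is why a hard BF cutoff inherits many of the problems of a hard p-value cutoff.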
A couple of years ago I discovered the macOS accessibility feature "Speak selection" under "Accessibility > Spoken Content." I've set up a keyboard shortcut for it, which allows me to take any ebook in PDF format, select a bunch of text (usually one chapter), press the key, and let the computer read it to me. Basically, it turns any text into an audio book, which I then listen to while exercising, doing the dishes, cleaning the house, or simply when I lie down and need a break.
I was never a literary person (despite everyone else in my family reading a lot), but ever since this discovery I've been catching up on a lot of fiction, philosophy, and psychology books, and pretty much anything that doesn't have code or equations. Highly recommended.
You can also use the `say` command, which can produce an audio file that you can load into a music player app; this might be more useful since the player can remember your place. https://ss64.com/mac/say.html
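For example (the file names here are just placeholders), you can script the conversion from Python:

```python
import subprocess

# macOS only: read a plain-text chapter with the built-in `say` command and
# write the speech to an audio file instead of playing it aloud.
# "chapter1.txt" / "chapter1.aiff" are placeholder file names.
subprocess.run(
    ["say", "-f", "chapter1.txt", "-o", "chapter1.aiff"],
    check=True,
)
```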
Abstract: In the coming decades, developments in automated reasoning will likely transform the way that research mathematics is conceptualized and carried out. I will discuss some ways we might think about this. The talk will not be about current or potential abilities of computers to do mathematics—rather I will look at topics such as the history of automation and mathematics, and related philosophical questions.
That was wonderful, thank you for linking it. For the benefit of anyone who doesn't have time to watch the whole thing, here are a few really nice quotes that convey some main points.
"We might put the axioms into a reasoning apparatus like the logical machinery of Stanley Jevons, and see all geometry come out of it. That process of reasoning are replaced by symbols and formulas... may seem artificial and puerile; and it is needless to point out how disastrous it would be in teaching and how hurtful to the mental development; how deadening it would be for investigators, whose originality it would nip in the bud. But as used by Professor Hilbert, it explains and justifies itself if one remembers the end pursued." Poincare on the value of reasoning machines, but the analogy to mathematics once we have theorem-proving AI is clear (that the tools and the lie direct outputs are not the ends. Human understanding is).
"Even if such a machine produced largely incomprehensible proofs, I would imagine that we would place much less value on proofs as a goal of math. I don't think humans will stop doing mathematics... I'm not saying there will be jobs for them, but I don't think we'll stop doing math."
"Mathematics is the study of reproducible mental objects." This definition is human ("mental") and social (it implies reproducing among individuals). "Maybe in this world, mathematics would involve a broader range of inquiry... We need to renegotiate the basic goals and values of the discipline." And he gives some examples of deep questions we may tackle beyond just proving theorems.
I've been working on an introductory STATS book for the past couple of years and I totally understand where the OP is coming from. There are so many books out there that focus on technique (the HOW), but don't explain the reasoning (the WHY).
I guess it wouldn't be a problem if the techniques being taught in STATS101 were actually usable in the real world. A bit like driving a car: you don't need to know how internal combustion engines work, you just need to press the pedals (and not endanger others on the road). The problem is that z-tests, t-tests, and ANOVA have very limited use cases. Most real-world data analysis requires more advanced models, so the STATS education is doubly problematic: it teaches you neither useful skills nor general principles.
I spent a lot of time researching and thinking about the STATS curriculum and choosing which topics are actually worth covering. I wrote a blog post about this[1]. In the end I settled on a computation-heavy approach, which allows me to do lots of hands-on simulations and demonstrations of concepts. That will be helpful for tech-literate readers, but I think also for non-tech people, since it's easier to learn Python+STATS together than to learn STATS alone. Here is a detailed argument about how Python is useful for learning statistics[2].
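To give a flavor of the hands-on simulations I have in mind (this particular snippet is an ad-hoc illustration, not taken from the book): estimate the standard error of the mean by re-simulating, then compare with the sigma/sqrt(n) formula.

```python
import numpy as np

# Simulate many samples of size n from a skewed population and look at the
# spread of the sample means, instead of just quoting sigma/sqrt(n).
# The distribution and sample size are arbitrary choices for the demo.
rng = np.random.default_rng(42)
n, reps = 30, 10_000

sample_means = np.array(
    [rng.exponential(scale=2.0, size=n).mean() for _ in range(reps)]
)

print("simulated standard error:", sample_means.std())
print("formula sigma/sqrt(n):   ", 2.0 / np.sqrt(n))  # sigma = 2 for this exponential
```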
If you're interested in seeing the book outline, you can check this google doc[3]. Comments welcome. I'm currently writing the last chapter, so hopefully will be done with it by January. I have a mailing list[4] for people who want to be notified when the book is ready.
This paper probably seems obvious to a lot of people, but I found, when I gave talks and when I read and reviewed papers, that people typically didn't know basic things like why you might leave some data out as a test set, why some models work better than others, or when to use logistic regression versus linear regression.
Yeah, I tried to make the point that even in the case of multilevel models you should still consider their ability to predict, because otherwise how can you trust that the model captures the underlying correlation structure of your data? Many people had been advocating for these models dogmatically while presenting very poor fit statistics (R^2 < 0.2) and making big claims. Since I finished my PhD I've calmed down a bit haha. Now I just run workshops and conferences instead. And I try to present statistics and machine learning as building a lab apparatus: once the model is built, you can ask it research questions. But simply building the model is not research.
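For anyone who hasn't seen the held-out-data idea in practice, here is a minimal sketch on synthetic data (plain linear regression rather than a multilevel model, just to keep it short): fit on one part of the data and report R^2 on the part the model never saw.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Made-up data: three predictors with known coefficients plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=1.0, size=200)

# Hold out 25% of the rows, fit on the rest, and score on the held-out part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = LinearRegression().fit(X_train, y_train)
print("held-out R^2:", r2_score(y_test, model.predict(X_test)))
```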
I recently learned[1] that he didn't write this book. He was telling stories to Ralph Leighton, who then wrote the books, and there is reason to believe many of the stories were embellished/enhanced. I'm kind of disappointed now that he didn't write the book himself, but I remember enjoying it when I read it as an undergrad.
I like the idea of math as "self help." People don't realize it, but the more math you learn, the easier/simpler life becomes... In other words, people learn math not because they like complexity, but because they are lazy and don't want complexity in their life.
For a specific example, consider some complicated arithmetic expression involving a dozen numbers and repeated operations +/-/*/÷. A person who knows high school algebra could introduce some structure into the expression (e.g., by defining variables), then use the rules of algebra to simplify it, and end up doing much less arithmetic overall to compute the answer.
The more math you know (as in abstraction and modelling), the less math you'll have to do (as in arithmetic)!
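A toy illustration with made-up numbers: naming the common factor and applying the distributive law turns several multiplications into one.

```python
# Computing 7*38 + 7*12 + 7*50 term by term takes several operations, but
# once you notice the shared factor, the distributive law leaves almost no
# arithmetic: 7 * (38 + 12 + 50) = 7 * 100.
a = 7
terms = [38, 12, 50]

brute_force = a * terms[0] + a * terms[1] + a * terms[2]   # three multiplications
algebraic   = a * sum(terms)                               # one multiplication
assert brute_force == algebraic == 700
```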
That's not what I call laziness. Laziness is doing a half-assed, inefficient job that you don't have to concentrate on. Putting underwear away by throwing it. Spilling your drink on yourself because you didn't want to sit up in bed. Trying repeatedly to flick a switch with your toes rather than bend down. Communicating in grunts because thinking of words is demanding. What you're describing is cleverness, and it's a strain.
Sure, that's one way to define it. But to people who think work (and effort) is in itself a virtue, optimization is lazy, because it reduces the total work you'll need to do.
Even some of your examples I'm not convinced are lazy according to your own definition. Am I lazy for flicking a switch at ankle level with my toes if I can do it well enough, when bending down would take so much more effort? I think by your standard I should be commended for my foot dexterity! When does the desire to reduce effort cease to be vicious and become virtuous?
> What is probabilistic programming actually useful for?
You can think of a probabilistic programming language as a set of building blocks for building statistical models. In the olden days, people used very simple frequentist models based on standard reference distributions like the normal, Student's t, chi2, etc. The models were simple because the computational capabilities were limited.
In modern days, thanks to widespread compute and better inference algorithms, you can "fit" a much wider class of models, so researchers now tend to build bespoke models adapted to each particular application they are interested in. Probabilistic programming languages are used to build those "custom" models.
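As a minimal sketch, here is what such a "custom" model could look like in PyMC (my choice of example PPL, with made-up data and priors):

```python
import numpy as np
import pymc as pm

# Made-up observations: we pretend we don't know the true mean and spread.
rng = np.random.default_rng(1)
y = rng.normal(loc=3.0, scale=1.5, size=100)

with pm.Model():
    mu = pm.Normal("mu", mu=0, sigma=10)        # prior on the mean
    sigma = pm.HalfNormal("sigma", sigma=5)     # prior on the noise scale
    pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)  # likelihood
    idata = pm.sample(1000, tune=1000)          # posterior samples via MCMC
```

The building-block nature is the point: swapping the likelihood, adding group-level parameters, or changing priors is a one-line edit rather than a new derivation.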
The story I came up with for the first term is that in a sequence of length L, you need to choose L1 locations that will get the symbol x1, so there are choose(L,L1) ways to do that. Next you have L-L1 remaining spots to fill, and L2 of those need to have the symbol x2, hence the choose(L-L1,L2) term, etc.
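A quick numeric sanity check of this counting argument, with arbitrary counts L1=2, L2=3, L3=5:

```python
from math import comb, factorial

# Choosing positions symbol by symbol gives the same count as the
# multinomial coefficient L! / (L1! * L2! * L3!).
L1, L2, L3 = 2, 3, 5
L = L1 + L2 + L3

step_by_step = comb(L, L1) * comb(L - L1, L2) * comb(L - L1 - L2, L3)
multinomial  = factorial(L) // (factorial(L1) * factorial(L2) * factorial(L3))

assert step_by_step == multinomial == 2520
```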