Hacker News
Suggest good sites/books on probability
46 points by pm90 on July 3, 2011 | 25 comments
I have always been quite confused by both the concept of probability and, more so, by the plethora of resources that seek to "introduce" probability. Can you please suggest online resources for learning both beginning and advanced probability concepts?

There is only one book I would recommend:

Title: Philosophical Essay on Probabilities

Author: Pierre Simon Marquis de Laplace

Year: 1814 (and still applicable)

Amazon Search: http://www.amazon.co.uk/s/ref=nb_sb_noss?url=search-alias%3D...

Google Books: http://books.google.com/books/about/A_philosophical_essay_on...

Sure, you can find more current texts, but this one introduces probability from a cold start and makes the foundational understanding almost complete. It's a pretty stunning piece of work: the essay is a popular introduction, not the underlying mathematical lectures it was based on.

Downloadable pdf and other links are available here - http://www.archive.org/details/philosophicaless00lapliala

Thanks buro9 and vinutheraj!

really nice! Thanks!

I found this intuitive tutorial in basic Bayes to be great. Has good interactive examples too.


For applications to machine learning, I strongly recommend Elements of Statistical Learning. The text is eye-opening. A PDF is freely available on Rob Tibshirani's website: http://www-stat.stanford.edu/~tibs/ElemStatLearn/

Used for engineering at leading universities: S. Ross, A First Course in Probability, 8th Edition, Pearson Prentice Hall, 2010.

Suggested by a prof as being good for learning: Introduction to Probability, D. P. Bertsekas and J. N. Tsitsiklis, Athena Scientific, 2002. ISBN 1-886529-40-X. This is an excellent book for further reading and for understanding some of the material.

I took 6.041; it was pretty good.


We followed a book by the same authors.

Introduction to Probability with R: "Based on a popular course taught by the late Gian-Carlo Rota of MIT, with many new topics covered as well."


Thanks! I wouldn't qualify that as a proper review, although it's hardly surprising that Radford Neal wouldn't spend more time on a book he clearly considers inadequate.

You can download A Treatise on Probability by John Maynard Keynes from Project Gutenberg: http://www.gutenberg.org/ebooks/32625

Though I also agree with the Laplace book mentioned already.

I thoroughly enjoyed Kai Lai Chung's "Elementary Probability Theory With Stochastic Processes" when I was in college:


I should have linked to the 4th and latest edition of the book:


There are a lot of lectures on Khan Academy. Once you're done with those, you can also get the book 'A First Course in Probability' by Sheldon Ross.

Why is NY_USA_Hacker's helpful comment dead?


Apparently some brain-dead, angry HN 'administrator' has essentially banned posts of


at least to this thread.

Posts made invisible to others include



Sounds like some HN 'administrator' doesn't want to hear about good answers to the question of this thread, that is, the serious side of probability!

Apparently they don't like Breiman at Berkeley, Cinlar at Princeton, Karr at UNC, Wierman at Hopkins, Dynkin at Cornell, McKean at Courant, Karatzas at Columbia, Shreve at CMU, etc. That's a lot of the cream of US applied math not to like! HN is sinking to a new low! We're talking brain-dead here, folks!

What is here is an attack that is emotional and personal and not rational or objective.

Paul: Chip in here and explain this 'hidden censorship' or face a big hole in the credibility and objectivity of HN.

All of your posts since http://news.ycombinator.com/item?id=2698286 are dead actually. A mistake, I hope.

Apparently now all posts by user NY_USA_Hacker are within a few hours automatically marked as "dead". Someone at HN really does NOT like NY_USA_Hacker!

The shame is on HN and Paul. As the HN community figures this out, the 'community' of HN will fall. Paul is playing fast and loose with objectivity and open discussion.

"The Black Swan" by Taleb is not strictly about probability, but it touches on it, among other topics.

'Fooled by Randomness' would be much better if you want to learn something about probability via a popular author like Taleb, IMO.

That book is also good; 'The Black Swan' grew out of a section within it.

I would suggest A First Course in Probability, 8th Edition by Sheldon Ross

"Can you please suggest on-line resource to learn great beginning and advanced probability concepts?"

Yup! The subject at the level you ask is a major topic in applied math but is not very popular in US universities. So, you will not get many good answers. In particular, the 'computer science' community, with 'machine learning' and 'artificial' this, Bayesian that, likely won't have good answers.

Your "on-line" part is asking a bit much; I can give you references to books but not all on-line. There may be some PDF files on-line, from TeX, that have such material; try some Google searching with the keywords used here.

The intuitive foundations of probability go back to gambling.

'Probability' is a field of 'applied' math and as such is well defined:

About 100 years ago, E. Borel's student H. Lebesgue invented 'measure theory', which essentially rewrote classic calculus, especially the part about integration. For the simple cases, Lebesgue's integral gives the same numerical values as the classic Riemann integral. The difference is that in theoretical work Lebesgue's integral is much more general and much better 'behaved'.

But 'measure theory' has to do with, intuitively, 'length', 'area', 'volume' and various generalizations of these. Well, in probability, for probability P and event A, the 'probability' of A is P(A) and is a number in [0,1] and acts much like the 'area' of A. The connection is so close that, in the end, we have to accept that the foundations of probability are measure theory.
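In symbols, the standard requirements on a probability measure P over a sigma-algebra F of subsets of a sample space are sketched below; they are exactly those of a positive measure with total mass 1:

```latex
P(A) \ge 0 \;\;\text{for all } A \in \mathcal{F}, \qquad P(\Omega) = 1,
\qquad
P\!\left(\bigcup_{n \ge 1} A_n\right) = \sum_{n \ge 1} P(A_n)
\;\;\text{for pairwise disjoint } A_n \in \mathcal{F}.
```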


A. N. Kolmogorov, 'Foundations of the Theory of Probability, Second English Edition', Chelsea Publishing Company, New York, 1956. English translation of the original German 'Grundbegriffe der Wahrscheinlichkeitsrechnung', 'Ergebnisse der Mathematik', 1933.

applied Lebesgue's work to make probability a solid field of math. Since then, Kolmogorov's foundations have been nearly the only ones taken seriously in any 'modern' or 'advanced' work in probability, stochastic processes, or mathematical statistics.

A good early text on stochastic processes was:

J. L. Doob, 'Stochastic Processes', John Wiley and Sons, New York, 1953.

Doob was long at University of Illinois. One of his students was P. Halmos who was later an assistant to von Neumann at the Institute for Advanced Study and in about 1942 wrote the first version of the still standard:

Paul R. Halmos, 'Finite-Dimensional Vector Spaces, Second Edition', D. Van Nostrand Company, Inc., Princeton, New Jersey, 1958.

Later he wrote:

Paul R. Halmos, 'Measure Theory', D. Van Nostrand Company, Inc., Princeton, NJ, 1950.

with at the end a NICE introduction to probability and stochastic processes based on measure theory.

Likely the first rock solid, quite comprehensive, highly polished presentation of 'modern' probability was the first edition of:

M. Loeve, 'Probability Theory, I and II, 4th Edition', Springer-Verlag, New York, 1977.

Loeve was long at Berkeley. One of his students did:

Jacques Neveu, 'Mathematical Foundations of the Calculus of Probability', Holden-Day, San Francisco, 1965.

and another did:

Leo Breiman, 'Probability', ISBN 0-89871-296-3, SIAM, Philadelphia, 1992.

Either of these two can be regarded as a more succinct presentation of the more important material in Loeve. Breiman is the more 'practical' and 'accessible'; Neveu is a crown jewel of elegance and succinctness but not always easy to read.

Other good presentations of much the same material include:

Kai Lai Chung, 'A Course in Probability Theory, Second Edition', ISBN 0-12-174650-X, Academic Press, New York, 1974.


Yuan Shih Chow and Henry Teicher, 'Probability Theory: Independence, Interchangeability, Martingales', ISBN 0-387-90331-3, Springer-Verlag, New York, 1978.

In total, those texts nail down 'probability' at all four corners and make it a rock solid topic in applied math. Good knowledge of, say, Breiman is a necessary and sufficient condition for knowing 'probability' at a serious level, that is, without being watered down for 'general audiences'.

For more, proceed with stochastic processes, stochastic optimal control, mathematical statistics, etc.

For learning probability, minimal prerequisites (more would be helpful) would be abstract algebra, linear algebra, 'analysis', measure theory, and functional analysis.

For abstract algebra, there are many texts. Sufficient is

I. N. Herstein, 'Topics in Algebra', Blaisdell, New York, 1964.

but you could also consider S. Lang, etc.

For linear algebra, there are many texts, but the old:

Paul R. Halmos, 'Finite-Dimensional Vector Spaces, Second Edition', D. Van Nostrand Company, Inc., Princeton, New Jersey, 1958.

remains a good 'second' text. It is also an introduction to functional analysis.

The main text for 'analysis' is just:

Walter Rudin, 'Principles of Mathematical Analysis', McGraw-Hill, New York.

in whatever is the latest edition.

Can skip the material on exterior algebra or just get it from the original source, now in English:

Henri Cartan, 'Differential Forms', ISBN 0-486-45010-4, Dover, Mineola, NY, 2006.

For measure theory and functional analysis, the standards are:

H. L. Royden, 'Real Analysis: Second Edition', Macmillan, New York.

and the first (real) half of:

Walter Rudin, 'Real and Complex Analysis', ISBN 07-054232-5, McGraw-Hill, New York.

Actually, Loeve also covers much of this material. Neveu slips in quite a lot.

Then you will be ready for Breiman, Neveu, Chung, Chow and Teicher, or Loeve.

You will discover:

The start of probability theory is a 'probability space'. For that, there are three parts, (1) the 'sample space', (2) the 'events', and (3) the 'probability measure'.

The 'sample space' is just a set of points; to support any reasonably serious work in probability, the sample space has to be uncountably infinite. In all our work, we do some one 'trial', and that corresponds to some one point in the sample space. The other points in the sample space are what 'might' have been our trial.

An 'event' is a set of points in the sample space. We ask that the collection of all events have some 'closure' properties that make the collection a 'sigma-algebra'. So, briefly, the collection of events has as one event the whole sample space and is closed under relative complements and countable unions.

In terms of measure theory, the sample space and the sigma algebra of events form a 'measurable space'. A 'probability measure' is just a positive measure with total 'mass' 1. Thank you Lebesgue and measure theory.
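On a finite sample space the whole triple can be sketched in a few lines. This is only a toy illustration (a fair die roll; all names are made up), not the general measure-theoretic construction, where the sample space is uncountable:

```python
from itertools import chain, combinations

# Toy sample space: one roll of a fair die.
omega = frozenset({1, 2, 3, 4, 5, 6})

# On a finite space the power set is trivially a sigma-algebra: it
# contains omega and is closed under complements and unions.
events = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(omega), r) for r in range(len(omega) + 1))]

def P(A):
    """Probability measure: uniform 'mass' 1/6 per point, total mass 1."""
    return len(A & omega) / len(omega)

even = frozenset({2, 4, 6})
print(P(omega))          # total mass is 1
print(P(even))           # 0.5
print(P(omega - even))   # complement rule: 1 - P(even)
```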

On the reals R, we can take the open sets and, then, the Borel sets, that is, the smallest sigma algebra that contains the open sets. A 'random variable' is a function from the probability space to the reals that is 'measurable', that is, the inverse image of a Borel set is an event.

As an alternative, we can ask that the inverse image of the Lebesgue measurable sets of R be events.

If X is such a random variable, then its 'expectation' E[X] is just the Lebesgue ('abstract') integral of X over the probability space. For E[X] to exist, we need only that X be measurable and that the integral of both the positive and the negative parts of X not be infinite (we don't want to subtract one infinity from another). E.g., Lebesgue integration is very 'general': All we need is measurability and not subtract one infinity from another.
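For a random variable taking finitely many values, the Lebesgue integral reduces to a weighted sum, which is enough to show the idea (a toy example with made-up numbers):

```python
# Sample space: one fair die roll; X is a function of the outcome.
omega = [1, 2, 3, 4, 5, 6]
P = {w: 1 / 6 for w in omega}        # uniform probability measure

def X(w):
    """A random variable: a function from the sample space to the reals."""
    return w * w

# E[X] as the (here, finite) Lebesgue integral of X over the sample space:
EX = sum(X(w) * P[w] for w in omega)
print(EX)   # (1 + 4 + 9 + 16 + 25 + 36) / 6 = 91/6

# Change of variable: the same number via the distribution X induces on R.
dist = {}
for w in omega:
    dist[X(w)] = dist.get(X(w), 0) + P[w]
EX2 = sum(x * p for x, p in dist.items())
```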

By a change of variable, we can also write the expectation as a Lebesgue integral over the real line with respect to the 'distribution' of X on R. Or, X 'induces' a probability measure on R.

We can discuss convergence of random variables -- in distribution, probability, mean square, and almost surely.

Then we can define independence for events, sigma algebras, and random variables. E.g., with this approach, if X and Y are independent random variables and f and g are functions, then f(X) and g(Y) are independent random variables. Don't try to prove this the elementary way!
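With discrete random variables the claim about f(X) and g(Y) can be checked exactly by brute force; a sketch, where f and g are arbitrary made-up functions and X, Y are independent by construction (product measure):

```python
from itertools import product

vals_X = [1, 2, 3]
vals_Y = [0, 1]
pX = {x: 1 / 3 for x in vals_X}
pY = {y: 1 / 2 for y in vals_Y}

f = lambda x: x % 2        # arbitrary functions of X and of Y
g = lambda y: 1 - y

# Joint distribution of (f(X), g(Y)) under the product measure:
joint = {}
for x, y in product(vals_X, vals_Y):
    key = (f(x), g(y))
    joint[key] = joint.get(key, 0) + pX[x] * pY[y]

# Marginals of f(X) and g(Y):
pf, pg = {}, {}
for (a, b), p in joint.items():
    pf[a] = pf.get(a, 0) + p
    pg[b] = pg.get(b, 0) + p

# Independence: the joint probability factors into the product of marginals.
ok = all(abs(joint[(a, b)] - pf[a] * pg[b]) < 1e-12 for (a, b) in joint)
print(ok)   # True
```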

With independent random variables, we can cover the classic limit theorems -- central limit theorem, weak and strong laws of large numbers, and the law of the iterated logarithm.
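A quick simulation (standard library only, fixed seed) illustrates both the law of large numbers and the CLT's 1/sqrt(n) scaling for fair coin flips; the sizes here are arbitrary:

```python
import random
import statistics

random.seed(0)

# Law of large numbers: the mean of many fair coin flips approaches 1/2.
n = 100_000
flips = [random.random() < 0.5 for _ in range(n)]
mean = sum(flips) / n
print(round(mean, 3))   # close to 0.5

# Central limit theorem: sample means over k flips are roughly normal,
# with standard deviation scaling like sqrt(p(1-p)/k) = sqrt(0.25/400).
m = 1_000
k = 400
means = [sum(random.random() < 0.5 for _ in range(k)) / k for _ in range(m)]
sd = statistics.pstdev(means)
print(round(sd, 3))     # near 0.025
```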

Using the Radon-Nikodym theorem of measure theory, for random variables X and Y we can define their 'conditional expectation' E[Y|X].
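In the discrete case, E[Y|X] can be computed directly from the joint distribution (and agrees with the Radon-Nikodym definition there), and the tower property E[E[Y|X]] = E[Y] checked exactly. The joint probabilities below are made-up numbers summing to 1:

```python
# Joint distribution of (X, Y) on a small grid.
joint = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.25, (1, 1): 0.15,
    (2, 0): 0.05, (2, 1): 0.25,
}

# Marginal of X:
pX = {}
for (x, y), p in joint.items():
    pX[x] = pX.get(x, 0) + p

def cond_exp_Y(x):
    """E[Y | X = x] = sum_y y * P(X=x, Y=y) / P(X=x)."""
    return sum(y * p for (xx, y), p in joint.items() if xx == x) / pX[x]

# Tower property: E[E[Y|X]] = E[Y].
lhs = sum(cond_exp_Y(x) * px for x, px in pX.items())
rhs = sum(y * p for (_, y), p in joint.items())
print(abs(lhs - rhs) < 1e-12)   # True
```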

With such 'conditioning' we can discuss Markov processes, martingales, and the martingale convergence theorem.

We can also move on to ergodic theory and, say, Poincare recurrence (keep stirring the coffee and the cream will separate back, close to where it was when it was poured in).

Those are the previews of coming attractions.

Note: This material is not very popular in US universities.
