
Suggest good sites/books on probability - pm90
I have always been quite confused both by the concept of probability and, even more so, by the plethora of resources that seek to "introduce" probability. Can you please suggest online resources to learn great beginning and advanced probability concepts?
======
buro9
There is only one book I would recommend:

Title: Philosophical Essay on Probabilities

Author: Pierre Simon Marquis de Laplace

Year: 1814 (and still applicable)

Amazon Search: [http://www.amazon.co.uk/s/ref=nb_sb_noss?url=search-alias%3D...](http://www.amazon.co.uk/s/ref=nb_sb_noss?url=search-alias%3Ddigital-text&field-keywords=philosophical+essay+on+probability&x=0&y=0)

Google Books:
[http://books.google.com/books/about/A_philosophical_essay_on...](http://books.google.com/books/about/A_philosophical_essay_on_probabilities.html?id=WxoPAAAAIAAJ)

Sure, you can find more current texts, but this one introduces the subject as a cold introduction and makes your foundational understanding almost complete. It's a pretty stunning piece of work: the essay is a popular introduction, not the underlying mathematical lecture it was based on.

~~~
vinutheraj
Downloadable pdf and other links are available here -
<http://www.archive.org/details/philosophicaless00lapliala>

~~~
pm90
Thanks buro9 and vinutheraj!

------
revorad
Probability Theory: The Logic of Science by E. T. Jaynes

[http://www.amazon.com/Probability-Theory-Science-T-Jaynes/dp...](http://www.amazon.com/Probability-Theory-Science-T-Jaynes/dp/0521592712)

<http://bayes.wustl.edu/etj/prob/book.pdf>

<http://omega.albany.edu:8008/JaynesBook.html>

~~~
pm90
really nice! Thanks!

------
PeteBrighton
I found this intuitive tutorial on basic Bayes to be great. It has good interactive examples too.

<http://yudkowsky.net/rational/bayes>

------
bhickey
For applications to machine learning, I strongly recommend Elements of
Statistical Learning. The text is eye-opening. A PDF is freely available on
Rob Tibshirani's website: <http://www-stat.stanford.edu/~tibs/ElemStatLearn/>

------
namank
Used for engineering at leading universities: _S. Ross, A First Course in Probability, 8th Edition, Pearson Prentice Hall, 2010._

Suggested by my prof as being good for learning: _Introduction to Probability, D. P. Bertsekas and J. N. Tsitsiklis, Athena Scientific, 2002. ISBN 1-886529-40-X. This is an excellent book for further reading and understanding some of the material._

------
xcode
I took 6.041; it was pretty good.

[http://ocw.mit.edu/courses/electrical-engineering-and-comput...](http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-041-probabilistic-systems-analysis-and-applied-probability-fall-2010/)

We followed a book by the same authors.

------
chl
Introduction to Probability with R: "Based on a popular course taught by the
late Gian-Carlo Rota of MIT, with many new topics covered as well."

<http://www.ccs.neu.edu/home/kenb/stochas/index.html>

~~~
imurray
A recent critical review of that book:
[http://radfordneal.wordpress.com/2011/06/18/two-textbooks-on...](http://radfordneal.wordpress.com/2011/06/18/two-textbooks-on-probability-using-r/)

~~~
chl
Thanks! I wouldn't qualify that as a proper review, although it's hardly surprising that Radford Neal wouldn't spend more time on a book he clearly considers inadequate.

------
Yipster
You can download A Treatise on Probability by John Maynard Keynes from Project Gutenberg: <http://www.gutenberg.org/ebooks/32625>

Although I also agree with the Laplace book mentioned already.

------
olifante
I thoroughly enjoyed Kai Lai Chung's "Elementary Probability Theory With
Stochastic Processes" when I was in college:

<http://www.amazon.com/dp/0387903623/>

~~~
olifante
I should have linked to the 4th and latest edition of the book:

<http://www.amazon.com/dp/1441930620/>

------
jeez
There are a lot of lectures on Khan Academy. Once you're done with those, you can also get the book 'A First Course in Probability' by Sheldon Ross.

------
szany
Why is NY_USA_Hacker's helpful comment dead?

<http://news.ycombinator.com/item?id=2723941>

~~~
NY_Entrepreneur
Apparently some brain-dead, angry HN 'administrator' has essentially banned posts of NY_USA_Hacker, at least in this thread.

Posts made invisible to others include

<http://news.ycombinator.com/item?id=2733913>

<http://news.ycombinator.com/item?id=2733801>

Sounds like some HN 'administrator' doesn't want to hear about good answers to
the question of this thread, that is, the serious side of probability!

Apparently they don't like Breiman at Berkeley, Cinlar at Princeton, Karr at
UNC, Wierman at Hopkins, Dynkin at Cornell, McKean at Courant, Karatzas at
Columbia, Shreve at CMU, etc. That's a lot of the cream of US applied math not
to like! HN is sinking to a new low! We're talking brain-dead here, folks!

What is here is an attack that is emotional and personal and not rational or
objective.

Paul: Chip in here and explain this 'hidden censorship' or face a big hole in
the credibility and objectivity of HN.

~~~
szany
All of your posts since <http://news.ycombinator.com/item?id=2698286> are dead
actually. A mistake, I hope.

~~~
NY_Entrepreneur
Apparently now all posts by user NY_USA_Hacker are within a few hours
automatically marked as "dead". Someone at HN really does NOT like
NY_USA_Hacker!

The shame is on HN and Paul. As the HN community figures this out, the
'community' of HN will fall. Paul is playing fast and loose with objectivity
and open discussion.

------
gallerytungsten
"The Black Swan" by Taleb is not strictly about probability, but it touches on
it, among other topics.

~~~
localhost3000
"Fooled by Randomness" would be much better if you want to learn something about probability via a popular author like Taleb, IMO.

~~~
gallerytungsten
That book is also good, and "The Black Swan" grew out of a section within it.

------
kaa2102
I would suggest A First Course in Probability, 8th Edition by Sheldon Ross

------
NY_Entrepreneur
"Can you please suggest on-line resource to learn great beginning and advanced
probability concepts?"

Yup! The subject at the level you ask is a major topic in applied math but is
not very popular in US universities. So, you will not get many good answers.
In particular, the 'computer science' community, with 'machine learning' and
'artificial' this, Bayesian that, likely won't have good answers.

Your "on-line" part is asking a bit much; I can give you references to books
but not all on-line. There may be some PDF files on-line, from TeX, that have
such material; try some Google searching with the keywords used here.

The intuitive foundations of probability go back to gambling.

'Probability' is a field of 'applied' math and as such is well defined:

About 100 years ago, E. Borel's student H. Lebesgue invented 'measure theory',
which essentially 'rewrote' classic calculus, especially the part about
integration. For the simple cases, what Lebesgue did gets the same numerical
values as the classic Riemann integral. The difference is that in theoretical
work Lebesgue's integral is much more general and much better 'behaved'.

But 'measure theory' has to do with, intuitively, 'length', 'area', 'volume'
and various generalizations of these. Well, in probability, for probability P
and event A, the 'probability' of A is P(A) and is a number in [0,1] and acts
much like the 'area' of A. The connection is so close that, in the end, we
have to accept that the foundations of probability are measure theory.

Then

A. N. Kolmogorov, 'Foundations of the Theory of Probability, Second English Edition', Chelsea Publishing Company, New York, 1956. English translation of the original German 'Grundbegriffe der Wahrscheinlichkeitsrechnung', 'Ergebnisse der Mathematik', 1933.

applied Lebesgue's work to make probability a solid field of math. Since then
Kolmogorov's foundations have been nearly the only ones taken seriously in any
'modern' or 'advanced' work in probability, stochastic processes, or
mathematical statistics.

A good start on a good text in stochastic processes was:

J. L. Doob, 'Stochastic Processes', John Wiley and Sons, New York, 1953.

Doob was long at University of Illinois. One of his students was P. Halmos who
was later an assistant to von Neumann at the Institute for Advanced Study and
in about 1942 wrote the first version of the still standard:

Paul R. Halmos, 'Finite-Dimensional Vector Spaces, Second Edition', D. Van
Nostrand Company, Inc., Princeton, New Jersey, 1958.

Later he wrote:

Paul R. Halmos, 'Measure Theory', D. Van Nostrand Company, Inc., Princeton,
NJ, 1950.

with, at the end, a NICE introduction to probability and stochastic processes based on measure theory.

Likely the first rock solid, quite comprehensive, highly polished presentation
of 'modern' probability was the first edition of:

M. Loeve, 'Probability Theory, I and II, 4th Edition', Springer-Verlag, New
York, 1977.

Loeve was long at Berkeley. One of his students did:

Jacques Neveu, 'Mathematical Foundations of the Calculus of Probability',
Holden-Day, San Francisco, 1965.

and another did:

Leo Breiman, 'Probability', ISBN 0-89871-296-3, SIAM, Philadelphia, 1992.

Either of these two can be regarded as a more succinct presentation of the
more important material in Loeve. Breiman is the more 'practical' and
'accessible'; Neveu is a crown jewel of elegance and succinctness but not
always easy to read.

Other good presentations of much the same material include:

Kai Lai Chung, 'A Course in Probability Theory, Second Edition', ISBN
0-12-174650-X, Academic Press, New York, 1974.

and

Yuan Shih Chow and Henry Teicher, 'Probability Theory: Independence,
Interchangeability, Martingales', ISBN 0-387-90331-3, Springer-Verlag, New
York, 1978.

In total, those texts nail down 'probability' at all four corners and make it
a rock solid topic in applied math. Good knowledge of, say, Breiman is a
necessary and sufficient condition for knowing 'probability' at a serious
level, that is, without being watered down for 'general audiences'.

For more, proceed with stochastic processes, stochastic optimal control,
mathematical statistics, etc.

For learning probability, minimal prerequisites (more would be helpful) would
be abstract algebra, linear algebra, 'analysis', measure theory, and
functional analysis.

For abstract algebra, there are many texts. Sufficient is

I. N. Herstein, 'Topics in Algebra', Blaisdell, New York, 1964.

but one can also consider S. Lang, etc.

For linear algebra, there are many texts, but the old:

Paul R. Halmos, 'Finite-Dimensional Vector Spaces, Second Edition', D. Van
Nostrand Company, Inc., Princeton, New Jersey, 1958.

remains a good 'second' text. It is also an introduction to functional
analysis.

The main text for 'analysis' is just:

Walter Rudin, 'Principles of Mathematical Analysis', McGraw-Hill, New York.

in whatever is the latest edition.

One can skip the material on exterior algebra, or just get it from the original source, now in English:

Henri Cartan, 'Differential Forms', ISBN 0-486-45010-4, Dover, Mineola, NY,
2006.

For measure theory and functional analysis, the standards are:

H. L. Royden, 'Real Analysis: Second Edition', Macmillan, New York.

and the first (real) half of:

Walter Rudin, 'Real and Complex Analysis', ISBN 07-054232-5, McGraw-Hill, New
York.

Actually, Loeve also covers much of this material. Neveu slips in quite a lot.

Then you will be ready for Breiman, Neveu, Chung, Chow and Teicher, or Loeve.

You will discover:

The start of probability theory is a 'probability space'. For that, there are
three parts, (1) the 'sample space', (2) the 'events', and (3) the
'probability measure'.

The 'sample space' is just a set of points; to support any reasonably serious
work in probability, the sample space has to be uncountably infinite. In all
our work, we do some one 'trial', and that corresponds to some one point in
the sample space. The other points in the sample space are what 'might' have
been our trial.

An 'event' is a set of points in the sample space. We ask that the collection of all events have
some 'closure' properties to make that collection a 'sigma algebra'. So,
briefly, the collection of events has as one event the whole sample space and
is closed under relative complements and countable unions.

In terms of measure theory, the sample space and the sigma algebra of events
form a 'measurable space'. A 'probability measure' is just a positive measure
with total 'mass' 1. Thank you Lebesgue and measure theory.
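
A minimal Python sketch of those three parts, using a toy finite sample space (serious work needs an uncountable one, as noted above, but the definitions read the same; the coin-flip model is my own choice):

```python
import itertools

# Toy probability space for two fair coin flips: sample space, the
# sigma-algebra of events (here, the full power set), and a measure P.
omega = frozenset(["HH", "HT", "TH", "TT"])
events = [frozenset(s) for r in range(len(omega) + 1)
          for s in itertools.combinations(omega, r)]

def P(event):
    # Uniform probability measure: each sample point has mass 1/4.
    return len(event) / len(omega)

# Closure properties of a sigma-algebra (trivial for a power set):
assert omega in events                                # whole space is an event
assert all((omega - e) in events for e in events)     # closed under complements
assert all((a | b) in events for a in events for b in events)  # unions
assert P(omega) == 1.0                                # total mass 1
```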

On the reals R, we can take the open sets and, then, the Borel sets, that is,
the smallest sigma algebra that contains the open sets. A 'random variable' is
a function from the probability space to the reals that is 'measurable', that
is, the inverse image of a Borel set is an event.

As an alternative, we can ask that the inverse image of the Lebesgue
measurable sets of R be events.

If X is such a random variable, then its 'expectation' E[X] is just the
Lebesgue ('abstract') integral of X over the probability space. For E[X] to
exist, we need only that X be measurable and that the integral of both the
positive and the negative parts of X not be infinite (we don't want to
subtract one infinity from another). E.g., Lebesgue integration is very
'general': All we need is measurability and not subtract one infinity from
another.
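
A quick Monte Carlo sketch of the expectation-as-integral idea (the sample size and tolerance are my own choices, not from any of the texts above):

```python
import random

# For X uniform on [0, 1], E[X^2] is the Lebesgue integral of x^2 over
# [0, 1], which is 1/3.  A Monte Carlo estimate converges to that value.
random.seed(0)
n = 200_000
estimate = sum(random.random() ** 2 for _ in range(n)) / n

assert abs(estimate - 1 / 3) < 0.01   # close to the exact integral, 1/3
```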

By a change of variable, we can also write the expectation as a Lebesgue
integral over the real line with respect to the 'distribution' of X on R. Or,
X 'induces' a probability measure on R.
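
A toy illustration of the change of variable: averaging over the sample space and integrating over R against the induced distribution give the same number, just regrouped (the die model is my own):

```python
import random

# Integral over the probability space vs. integral over R against the
# induced distribution: for a fair die, both give the mean 3.5.
random.seed(4)
n = 60_000
rolls = [random.randint(1, 6) for _ in range(n)]

sample_space_avg = sum(rolls) / n                     # average over 'trials'
dist = {v: rolls.count(v) / n for v in range(1, 7)}   # induced measure on R
induced_avg = sum(v * p for v, p in dist.items())     # integral over R

assert abs(sample_space_avg - induced_avg) < 1e-9     # same sum, regrouped
assert abs(induced_avg - 3.5) < 0.05                  # near the true mean
```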

We can discuss convergence of random variables -- in distribution,
probability, mean square, and almost surely.

Then we can define independence for events, sigma algebras, and random
variables. E.g., with this approach, if X and Y are independent random
variables and f and g are functions, then f(X) and g(Y) are independent random
variables. Don't try to prove this the elementary way!
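
An empirical check (a simulation, not a proof): for independently drawn samples, f(X) and g(Y) show near-zero sample covariance. The particular f and g here are arbitrary choices of mine:

```python
import random

# Empirical: for independently drawn X and Y, the sample covariance of
# f(X) and g(Y) should be near zero, consistent with independence.
random.seed(1)
n = 100_000
xs = [random.random() for _ in range(n)]
ys = [random.random() for _ in range(n)]   # drawn independently of xs

def f(x): return x * x
def g(y): return abs(y - 0.5)

fx = [f(x) for x in xs]
gy = [g(y) for y in ys]
mean = lambda v: sum(v) / len(v)
cov = mean([a * b for a, b in zip(fx, gy)]) - mean(fx) * mean(gy)

assert abs(cov) < 0.005   # near zero, consistent with independence
```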

With independent random variables, we can cover the classic limit theorems --
central limit theorem, weak and strong laws of large numbers, and the law of
the iterated logarithm.
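
A small simulation of the central limit theorem (sample sizes and tolerances are my own choices):

```python
import math
import random

# Central limit theorem: standardized sums of i.i.d. Uniform(0, 1) draws
# (mean 1/2, variance 1/12) look approximately N(0, 1).
random.seed(2)

def standardized_sum(k):
    s = sum(random.random() for _ in range(k))
    return (s - k * 0.5) / math.sqrt(k / 12)

zs = [standardized_sum(30) for _ in range(20_000)]
mean = sum(zs) / len(zs)
var = sum(z * z for z in zs) / len(zs) - mean ** 2
within_1 = sum(abs(z) < 1 for z in zs) / len(zs)   # ~0.683 for N(0, 1)

assert abs(mean) < 0.03
assert abs(var - 1) < 0.05
assert abs(within_1 - 0.683) < 0.02
```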

Using the Radon-Nikodym theorem of measure theory, for random variables X and
Y we can define their 'conditional expectation' E[Y|X].

With such 'conditioning' we can discuss Markov processes, martingales, and the
martingale convergence theorem.
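
For a discrete X, E[Y|X=x] reduces to averaging Y within the group where X = x; a sketch checking the 'tower property' E[E[Y|X]] = E[Y] (the toy model, a die roll plus noise, is my own):

```python
import random
from collections import defaultdict

# For discrete X, E[Y|X=x] is the average of Y over outcomes with X = x.
# The tower property E[E[Y|X]] = E[Y] then holds exactly in-sample,
# because the conditional averages just regroup the overall sum.
random.seed(3)
n = 50_000
data = [(x, x + random.random())                 # Y = X + Uniform(0,1) noise
        for x in (random.randint(1, 6) for _ in range(n))]

groups = defaultdict(list)
for x, y in data:
    groups[x].append(y)
cond_exp = {x: sum(ys) / len(ys) for x, ys in groups.items()}  # E[Y|X=x]

e_y = sum(y for _, y in data) / n                              # E[Y]
tower = sum(len(groups[x]) / n * cond_exp[x] for x in groups)  # E[E[Y|X]]

assert abs(tower - e_y) < 1e-9                   # tower property
assert all(abs(cond_exp[x] - (x + 0.5)) < 0.02 for x in cond_exp)
```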

We can also move on to ergodic theory and, say, Poincare recurrence (keep
stirring the coffee and the cream will separate back, close to where it was
when it was poured in).

Those are the previews of coming attractions.

Note: This material is not very popular in US universities.

