Title: Philosophical Essay on Probabilities
Author: Pierre Simon Marquis de Laplace
Year: 1814 (and still applicable)
Amazon Search: http://www.amazon.co.uk/s/ref=nb_sb_noss?url=search-alias%3D...
Google Books: http://books.google.com/books/about/A_philosophical_essay_on...
Sure, you can find more current texts, but this one works even as a cold introduction and leaves your understanding of the foundations nearly complete. It's a pretty stunning piece of work: the essay was a popular introduction, not the underlying mathematical lectures it was based on.
Suggested by prof as being good for learning:
Introduction to Probability, D. P. Bertsekas and J. N. Tsitsiklis, Athena Scientific, 2002. ISBN 1-886529-40-X. This is an excellent book for further reading and for understanding some of the material.
We followed a book by the same authors.
Although I agree on the Laplace book mentioned already as well.
Sounds like some HN 'administrator' doesn't want to hear about good answers to the question of this thread, that is, the serious side of probability!
Apparently they don't like Breiman at Berkeley, Cinlar at Princeton, Karr at UNC, Wierman at Hopkins, Dynkin at Cornell, McKean at Courant, Karatzas at Columbia, Shreve at CMU, etc. That's a lot of the cream of US applied math not to like! HN is sinking to a new low! We're talking brain-dead here, folks!
What is here is an attack that is emotional and personal and not rational or objective.
Paul: Chip in here and explain this 'hidden censorship' or face a big hole in the credibility and objectivity of HN.
The shame is on HN and Paul. As the HN community figures this out, the 'community' of HN will fall. Paul is playing fast and loose with objectivity and open discussion.
Yup! The subject at the level you ask is a major topic in applied math but is not very popular in US universities. So, you will not get many good answers. In particular, the 'computer science' community, with 'machine learning' and 'artificial' this, Bayesian that, likely won't have good answers.
Your "on-line" part is asking a bit much; I can give you references to books but not all on-line. There may be some PDF files on-line, from TeX, that have such material; try some Google searching with the keywords used here.
The intuitive foundations of probability go back to gambling.
'Probability' is a field of 'applied' math and as such is well defined:
About 100 years ago, E. Borel's student H. Lebesgue invented 'measure theory', which essentially 'rewrote' classic calculus, especially the part about integration. For the simple cases, what Lebesgue did gets the same numerical values as the classic Riemann integral. The difference is that in theoretical work Lebesgue's integral is much more general and much better 'behaved'.
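One standard example of the difference: the indicator function of the rationals on [0,1] is not Riemann integrable at all, but Lebesgue handles it immediately.

```latex
% The Dirichlet function: 1 on the rationals, 0 elsewhere on [0,1].
f(x) = \mathbf{1}_{\mathbb{Q}}(x), \qquad x \in [0,1].
% Every upper Riemann sum is 1 and every lower Riemann sum is 0,
% so the Riemann integral does not exist.  But the rationals are
% countable, hence of Lebesgue measure zero, so
\int_{[0,1]} f \, d\lambda \;=\; \lambda\bigl([0,1] \cap \mathbb{Q}\bigr) \;=\; 0 .
```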
But 'measure theory' has to do with, intuitively, 'length', 'area', 'volume' and various generalizations of these. Well, in probability, for probability P and event A, the 'probability' of A is P(A) and is a number in [0,1] and acts much like the 'area' of A. The connection is so close that, in the end, we have to accept that the foundations of probability are measure theory.
A. N. Kolmogorov, 'Foundations of the Theory of Probability, Second English Edition', Chelsea Publishing Company, New York, 1956. English translation of the original German 'Grundbegriffe der Wahrscheinlichkeitsrechnung', 'Ergebnisse der Mathematik', 1933.
applied Lebesgue's work to make probability a solid field of math. Since then, Kolmogorov's foundations have been nearly the only ones taken seriously in any 'modern' or 'advanced' work in probability, stochastic processes, or mathematical statistics.
A good start on a text in stochastic processes was:
J. L. Doob, 'Stochastic Processes', John Wiley and Sons, New York, 1953.
Doob was long at University of Illinois. One of his students was P. Halmos who was later an assistant to von Neumann at the Institute for Advanced Study and in about 1942 wrote the first version of the still standard:
Paul R. Halmos, 'Finite-Dimensional Vector Spaces, Second Edition', D. Van Nostrand Company, Inc., Princeton, New Jersey, 1958.
Later he wrote:
Paul R. Halmos, 'Measure Theory', D. Van Nostrand Company, Inc., Princeton, NJ, 1950.
with at the end a NICE introduction to probability and stochastic processes based on measure theory.
Likely the first rock solid, quite comprehensive, highly polished presentation of 'modern' probability was the first edition of:
M. Loeve, 'Probability Theory, I and II, 4th Edition', Springer-Verlag, New York, 1977.
Loeve was long at Berkeley. One of his students did:
Jacques Neveu, 'Mathematical Foundations of the Calculus of Probability', Holden-Day, San Francisco, 1965.
and another did:
Leo Breiman, 'Probability', ISBN 0-89871-296-3, SIAM, Philadelphia, 1992.
Either of these two can be regarded as a more succinct presentation of the more important material in Loeve. Breiman is the more 'practical' and 'accessible'; Neveu is a crown jewel of elegance and succinctness but not always easy to read.
Other good presentations of much the same material include:
Kai Lai Chung, 'A Course in Probability Theory, Second Edition', ISBN 0-12-174650-X, Academic Press, New York, 1974.
Yuan Shih Chow and Henry Teicher, 'Probability Theory: Independence, Interchangeability, Martingales', ISBN 0-387-90331-3, Springer-Verlag, New York, 1978.
In total, those texts nail down 'probability' at all four corners and make it a rock solid topic in applied math. Good knowledge of, say, Breiman is a necessary and sufficient condition for knowing 'probability' at a serious level, that is, without being watered down for 'general audiences'.
For more, proceed with stochastic processes, stochastic optimal control, mathematical statistics, etc.
For learning probability, minimal prerequisites (more would be helpful) would be abstract algebra, linear algebra, 'analysis', measure theory, and functional analysis.
For abstract algebra, there are many texts. Sufficient is
I. N. Herstein, 'Topics in Algebra', Blaisdell, New York, 1964.
but one can also consider S. Lang, etc.
For linear algebra, there are many texts, but the old Halmos 'Finite-Dimensional Vector Spaces' cited above remains a good 'second' text. It is also an introduction to functional analysis.
The main text for 'analysis' is just:
Walter Rudin, 'Principles of Mathematical Analysis', McGraw-Hill, New York.
in whatever is the latest edition.
You can skip the material on exterior algebra, or just get it from the original source, now in English:
Henri Cartan, 'Differential Forms', ISBN 0-486-45010-4, Dover, Mineola, NY, 2006.
For measure theory and functional analysis, the standards are:
H. L. Royden, 'Real Analysis: Second Edition', Macmillan, New York.
and the first (real) half of:
Walter Rudin, 'Real and Complex Analysis', ISBN 07-054232-5, McGraw-Hill, New York.
Actually, Loeve also covers much of this material. Neveu slips in quite a lot.
Then you will be ready for Breiman, Neveu, Chung, Chow and Teicher, or Loeve.
You will discover:
The start of probability theory is a 'probability space'. For that, there are three parts, (1) the 'sample space', (2) the 'events', and (3) the 'probability measure'.
The 'sample space' is just a set of points; to support any reasonably serious work in probability, the sample space has to be uncountably infinite. In all our work, we do some one 'trial', and that corresponds to some one point in the sample space. The other points in the sample space are what 'might' have been our trial.
An 'event' is a set of points in the sample space. We ask that the collection of all events have some 'closure' properties that make the collection a 'sigma algebra'. So, briefly, the collection of events has as one event the whole sample space and is closed under complements (relative to the sample space) and countable unions.
In terms of measure theory, the sample space and the sigma algebra of events form a 'measurable space'. A 'probability measure' is just a positive measure with total 'mass' 1. Thank you Lebesgue and measure theory.
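In symbols, what Kolmogorov asks of a probability space can be summarized as:

```latex
% A probability space (\Omega, \mathcal{F}, P):
%   \Omega      -- the sample space, a set of points;
%   \mathcal{F} -- a sigma algebra of subsets of \Omega (the events):
\Omega \in \mathcal{F}; \qquad
A \in \mathcal{F} \;\Rightarrow\; \Omega \setminus A \in \mathcal{F}; \qquad
A_1, A_2, \ldots \in \mathcal{F} \;\Rightarrow\; \bigcup_{n=1}^{\infty} A_n \in \mathcal{F};
%   P -- a positive measure on \mathcal{F} with total mass 1:
P(A) \ge 0, \qquad P(\Omega) = 1, \qquad
P\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \sum_{n=1}^{\infty} P(A_n)
\quad \text{for pairwise disjoint } A_n \in \mathcal{F}.
```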
On the reals R, we can take the open sets and, then, the Borel sets, that is, the smallest sigma algebra that contains the open sets. A 'random variable' is a function from the probability space to the reals that is 'measurable', that is, the inverse image of a Borel set is an event.
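Written out, the measurability condition on a random variable is:

```latex
% X : \Omega \to \mathbb{R} is a random variable (is measurable) when
X^{-1}(B) \;=\; \{\,\omega \in \Omega : X(\omega) \in B\,\} \;\in\; \mathcal{F}
\quad \text{for every Borel set } B \subseteq \mathbb{R},
% where the Borel sets \mathcal{B}(\mathbb{R}) form the smallest sigma
% algebra on \mathbb{R} containing the open sets.
```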
As an alternative, we can ask that the inverse image of the Lebesgue measurable sets of R be events.
If X is such a random variable, then its 'expectation' E[X] is just the Lebesgue ('abstract') integral of X over the probability space. For E[X] to exist, we need only that X be measurable and that the integrals of the positive and negative parts of X not both be infinite (we don't want to subtract one infinity from another). That is, Lebesgue integration is very 'general': all we need is measurability and to avoid subtracting one infinity from another.
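Concretely, the construction runs through the positive and negative parts:

```latex
% Split X into its positive and negative parts:
X^{+} = \max(X, 0), \qquad X^{-} = \max(-X, 0), \qquad X = X^{+} - X^{-}.
% The expectation is the abstract Lebesgue integral over the probability space:
E[X] \;=\; \int_{\Omega} X \, dP
      \;=\; \int_{\Omega} X^{+} \, dP \;-\; \int_{\Omega} X^{-} \, dP,
% defined whenever the two integrals on the right are not both infinite,
% so that we never subtract one infinity from another.
```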
By a change of variable, we can also write the expectation as a Lebesgue integral over the real line with respect to the 'distribution' of X on R. Or, X 'induces' a probability measure on R.
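That change of variable, spelled out:

```latex
% X induces a probability measure (its 'distribution') on the Borel sets of R:
\mu_X(B) \;=\; P\bigl(X^{-1}(B)\bigr) \;=\; P(X \in B),
\qquad B \in \mathcal{B}(\mathbb{R}).
% The change of variable rewrites the expectation as an integral on the line:
E[X] \;=\; \int_{\Omega} X \, dP \;=\; \int_{\mathbb{R}} x \, d\mu_X(x).
```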
We can discuss convergence of random variables -- in distribution, probability, mean square, and almost surely.
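The four modes, in the usual symbols (F_n and F denote the distribution functions):

```latex
% For random variables X_n and X, as n \to \infty:
% (1) in distribution:
F_n(x) = P(X_n \le x) \;\to\; F(x) = P(X \le x)
\quad \text{at every continuity point } x \text{ of } F;
% (2) in probability:
P\bigl(|X_n - X| > \epsilon\bigr) \;\to\; 0
\quad \text{for every } \epsilon > 0;
% (3) in mean square:
E\bigl[(X_n - X)^2\bigr] \;\to\; 0;
% (4) almost surely:
P\bigl(\{\,\omega : X_n(\omega) \to X(\omega)\,\}\bigr) \;=\; 1.
```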
Then we can define independence for events, sigma algebras, and random variables. E.g., with this approach, if X and Y are independent random variables and f and g are functions, then f(X) and g(Y) are independent random variables. Don't try to prove this the elementary way!
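The definitions that make that painless:

```latex
% Events A and B are independent when
P(A \cap B) \;=\; P(A)\,P(B).
% Random variables X and Y are independent when, for all Borel sets A, B,
P(X \in A,\; Y \in B) \;=\; P(X \in A)\,P(Y \in B).
% For Borel measurable f, g, note (f(X))^{-1}(B) = X^{-1}\bigl(f^{-1}(B)\bigr),
% so the events generated by f(X), g(Y) sit inside those generated by X, Y,
% and independence of f(X) and g(Y) follows almost for free.
```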
With independent random variables, we can cover the classic limit theorems -- central limit theorem, weak and strong laws of large numbers, and the law of the iterated logarithm.
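The laws of large numbers are easy to see numerically. A minimal sketch in Python (the function name `sample_mean` is just for illustration, not from any of the texts above): the sample mean of n i.i.d. fair coin flips approaches the expectation 1/2, with fluctuations shrinking on the order of 1/sqrt(n), the scale the central limit theorem describes.

```python
import random

def sample_mean(n, rng):
    """Mean of n fair coin flips (1 = heads, 0 = tails)."""
    return sum(rng.randint(0, 1) for _ in range(n)) / n

if __name__ == "__main__":
    rng = random.Random(0)
    # As n grows, the sample mean settles near the expectation 0.5.
    for n in (100, 10_000, 1_000_000):
        m = sample_mean(n, rng)
        print(n, m, abs(m - 0.5))
```

Seeding the generator makes the runs reproducible; the deviations printed in the last column shrink roughly like 1/sqrt(n).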
Using the Radon-Nikodym theorem of measure theory, for random variables X and Y we can define their 'conditional expectation' E[Y|X].
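The defining property, which Radon-Nikodym makes possible:

```latex
% E[Y | X] is the sigma(X)-measurable random variable Z, unique up to
% sets of probability zero, with
\int_{A} Z \, dP \;=\; \int_{A} Y \, dP
\quad \text{for every } A \in \sigma(X),
% where \sigma(X) is the sigma algebra generated by X, that is, the
% collection of inverse images under X of the Borel sets.
```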
With such 'conditioning' we can discuss Markov processes, martingales, and the martingale convergence theorem.
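For instance, the martingale property and its convergence theorem read:

```latex
% A sequence X_1, X_2, \ldots adapted to sigma algebras
% \mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \cdots is a martingale when
E\bigl[\,|X_n|\,\bigr] < \infty
\quad \text{and} \quad
E[X_{n+1} \mid \mathcal{F}_n] = X_n \ \text{almost surely, for all } n.
% Martingale convergence theorem: if \sup_n E[|X_n|] < \infty, then
% X_n \to X_\infty almost surely for some integrable X_\infty.
```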
We can also move on to ergodic theory and, say, Poincare recurrence (keep stirring the coffee and the cream will separate back, close to where it was when it was poured in).
Those are the previews of coming attractions.
Note: This material is not very popular in US universities.