As you dive deeper into Probability you may come across the phrases "Rigorous Probability with Measure Theory" or "Measure Theoretic Probability". What exactly is all this talk about Measure Theory? Normally the discussion of Measure Theory and Probability is left to graduate-level coursework, if it is touched on at all. Because of this, it is nearly impossible to find any discussion of Measure Theoretic Probability that does not require a very sophisticated background in abstract mathematics. Why would anyone without a background in pure math be interested in Measure Theory and Probability?
Personally I have found Measure Theoretic Probability to be very useful in helping to understand deeper issues in Probability Theory. For example, the posts on Expectation and Variance are both written from a Measure Theoretic perspective. Measure Theoretic Probability offers a very generalized view of probability. Using sets rather than distributions represented by either discrete or continuous functions, it allows for complex problems to be understood more simply... if you can get past the rigorous math!
So today we start looking at Measure Theoretic Probability from a view that is much less rigorous than your average graduate textbook, but hopefully will allow you to take away some of the treasures this approach has.
Measure for Measure
Okay, so what is Measure Theory all about!? Luckily it is one of those well-named areas of mathematics. Measure Theory is the formal theory of things that are measurable! This is extremely important to Probability because if we can't measure the probability of something then what good does all this work do us?
One of the major aims of pure Mathematics is to continually generalize ideas. When talking about "measure" our first introduction is the idea of length: "How tall are you?", "What shoe size do you wear?", "How far to the gas station?" All of these are questions about measuring something in one dimension. The next idea is usually area: "How many square feet is that house?", "How many acres is the farm?" This is measurement in two dimensions. Then of course volume: "How many gallons of milk do you need?", "How many cubic yards of rocks to fill that hole?" This of course just extends the idea of measurement into three dimensions. Here we can see some general idea of spatial measure start to form.
But that's not the most general idea of measurement! We also have weight, time, velocity, income, age, etc. These are all used to "measure" things. It turns out that mathematically, if we are not careful, sometimes we can come up with things that are NOT measurable!
What would it mean to be non-measurable? Imagine that you were to build a wall with Lego and then took this wall apart and were somehow able to build two identical walls from only the bricks in the first! If we visualize this situation we can clearly see that this is physically absurd:
However, this is a very similar problem to what happens in the Banach-Tarski Paradox which is done without the aid of any
Measure Theory and Probability
The entire point of Probability is to measure something. Unlike length and weight we have very specific values we care about, namely the interval \([0,1]\). The most basic point of probability is that you are measuring the likelihood of events on a scale from 0 to 1. This measurement of events from 0 to 1 is the Probability Measure (we'll dive much more deeply into this in the next post!).
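To make the idea of "measuring events on a scale from 0 to 1" a bit more concrete, here is a minimal sketch in Python. It assumes a toy example that is not part of this post: a single roll of a fair six-sided die, where every outcome is equally likely. The names OMEGA, prob and all_events are made up for illustration. The sketch only checks the basic behaviour we expect of a probability measure on a small finite set; it says nothing about the hard cases that the real theory exists to handle.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical sample space (an assumption for this sketch):
# the six faces of a fair die.
OMEGA = frozenset(range(1, 7))


def prob(event):
    """Measure an event (a set of outcomes) on a scale from 0 to 1.

    Assumes every outcome in OMEGA is equally likely.
    """
    event = frozenset(event)
    if not event <= OMEGA:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(OMEGA))


def all_events(omega):
    """List every subset of a finite sample space (every possible event)."""
    items = sorted(omega)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


evens = {2, 4, 6}
one = {1}

# The whole sample space measures 1, the empty event measures 0.
whole = prob(OMEGA)
nothing = prob(set())

# Disjoint events add up, just like lengths of non-overlapping segments.
additive = prob(evens | one) == prob(evens) + prob(one)

# Every event lands somewhere in the interval [0, 1].
in_unit_interval = all(0 <= prob(e) <= 1 for e in all_events(OMEGA))
```

On a six-element set everything can be measured, so nothing surprising happens here. The mathematical difficulty shows up only when the sample space is much larger, for example all the points in an interval.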
The tricky and mathematically challenging part is how we actually show that you can measure this! This is the point where the mathematics required really ramps up. I'm going to skip all that math in this post. If mathematical rigor does excite you I heartily recommend A First Look At Rigorous Probability Theory. Future posts will definitely dive deeper into these topics, but likely not enough to please a serious mathematician.
So Why Again?
If we're forgoing the actual rigorous proofs regarding the measurability of Probability, what's the point? There are several really useful ideas that come out of Measure Theoretic Probability that are sadly obscured from those without a deep mathematical background.
First, Measure Theoretic Probability dispenses with the idea of using solely discrete or continuous functions in favor of using sets. In our first encounter with the idea of probability, most of us have the notion that we are thinking in terms of "events that could happen". When we imagine all the things that could happen we're really imagining a 'set' of events. This is much more intuitive and more general, as both discrete and continuous probability distributions can be described in terms of sets as well!
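As a rough illustration of how one "set" view can cover both the discrete and the continuous case, here is a small Python sketch. It rests on assumptions that are not in this post: a fair coin for the discrete case, and a uniform distribution on the interval from 0 to 1 for the continuous case, where an event is given as a list of non-overlapping intervals. The function names discrete_prob and uniform_prob are hypothetical.

```python
from fractions import Fraction


def discrete_prob(event, weights):
    """Probability of a set of outcomes: add the weight of each outcome in it."""
    return sum((weights[w] for w in event), Fraction(0))


def uniform_prob(intervals):
    """Probability of a set built from intervals under a uniform distribution.

    Assumes the intervals are disjoint and lie inside [0, 1], so the
    probability of the set is simply its total length.
    """
    return sum((Fraction(b) - Fraction(a) for a, b in intervals), Fraction(0))


# Discrete case: the event "heads" is the set {"H"}.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
heads = discrete_prob({"H"}, coin)

# Continuous case: the event "the value lands between 1/4 and 3/4".
middle = uniform_prob([(Fraction(1, 4), Fraction(3, 4))])

# A single point is also a set, and in the continuous case it has length 0.
single_point = uniform_prob([(Fraction(1, 2), Fraction(1, 2))])
```

In both cases the question has the same shape: hand over a set, get back a number between 0 and 1. Only the way the set is measured changes.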
This trend towards generalization means that many of the pitfalls of specific approaches to probability can also be avoided. For example: in the post on Expectation we discussed that Expectation should be defined as "the sum of the values of a Random Variable weighted by their probability". Traditionally Expectation is thought of as being some value 'expected' from the distribution, such as the return on the dollar for gambling. But this type of reasoning only works for specific conditions. When we think about probability rigorously and generally we avoid common errors that occur by assuming the whole universe behaves like one common case.
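Here is a minimal sketch of that definition of Expectation in Python, under assumptions of my own: a fair six-sided die and a made-up game that pays 6 dollars on a six and costs 1 dollar otherwise. The names payoff and expectation are hypothetical. The Random Variable is written as an ordinary function from outcomes to numbers, and the Expectation is the sum of its values weighted by the probability of each outcome.

```python
from fractions import Fraction

# Assumed toy setup: a fair die, each face with probability 1/6.
P = {face: Fraction(1, 6) for face in range(1, 7)}


def payoff(outcome):
    """A Random Variable: a function from outcomes to numbers.

    Hypothetical game: win 6 dollars on a six, lose 1 dollar otherwise.
    """
    return 6 if outcome == 6 else -1


def expectation(random_variable, measure):
    """Sum of the Random Variable's values weighted by their probability."""
    return sum(random_variable(w) * p for w, p in measure.items())


expected_payoff = expectation(payoff, P)

# The set of payoffs that can actually occur in a single game.
possible_payoffs = {payoff(w) for w in P}
```

In this toy game expected_payoff works out to 1/6 of a dollar, which is not in possible_payoffs: no single game ever pays that amount. That is one small reason to prefer the "weighted sum" definition over the intuition of a value you should 'expect' to see.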
Close up, rigor can be very confusing, but with perspective, rigor adds clarity. Again, from the post on Expectation: Random Variables are very confusing if you think about them too hard (what does it mean for a variable to be random?). In Rigorous Probability
Finally, Rigorous Probability with Measure Theory opens up the doors to many more sophisticated and extremely interesting topics such as Stochastic Processes and Stochastic Calculus.
This post is intended to serve as a basic introduction to the idea of Measure Theory in relation to Probability Theory. Despite being a mathematically intense topic, you'll notice that this post has no equations! In future