Probability Spaces: An Illustrated Introduction

Introducing Probability Spaces

To define something in Probability as measurable we need to be able to mathematically define a Probability Space. A Probability Space is also referred to as a Probability Triple and consists, unsurprisingly, of 3 parts:

The Sample Space \(\Omega\) - This is just the set of outcomes that we are sampling from. 

The Set of Events \(\mathcal{F}\) - A \(\sigma\)-algebra (pronounced "sigma algebra", also known as a "sigma field", based on whichever scares your audience more). This sounds complicated but \(\mathcal{F}\) answers the question we've asked in several posts: "What are the sane questions I can ask about this probability distribution?"

The Probability Measure \(\mathbf{P}\) - This is a function that maps each event in \(\mathcal{F}\) to a number in the interval \([0,1]\), with \(\mathbf{P}(\Omega) = 1\) and with the probabilities of disjoint events adding up.


An important note for people who are computationally minded: these are abstract mathematical ideas that we need in order to reason about probabilities that are measurable. \(\mathcal{F}\) can quite easily be infinite (even uncountably so), and therefore you can't always write code that implements \(\mathcal{F}\).
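When \(\Omega\) is finite, though, we can write \(\mathcal{F}\) down. The sketch below assumes we take \(\mathcal{F}\) to be the set of all subsets of \(\Omega\) (the power set), which is one common choice and not the only one. The helper name power_set is mine, purely for illustration.

```python
from itertools import chain, combinations


def power_set(omega):
    """Return every subset of a finite sample space, each as a frozenset."""
    items = list(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


coin_events = power_set({"head", "tail"})
die_events = power_set(range(1, 7))

print(len(coin_events))  # 4 events: {}, {head}, {tail}, {head, tail}
print(len(die_events))   # 64 events, i.e. 2 ** 6
```

The number of events doubles with every outcome we add to \(\Omega\), which hints at why this brute-force approach stops working long before \(\Omega\) becomes infinite.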

The main goal of this post is to make the intimidating triple of symbols below, \((\Omega, \mathcal{F}, \mathbf{P})\), into something much easier to understand.

A simple triplet describing a probability space.

The Sample Space

Probability is often introduced using Discrete and Continuous functions to represent the behavior of various Probability Distributions. Rigorous, Measure Theoretic Probability uses Sets to represent the possible events rather than functions. In addition to being more mathematically rigorous and robust, I find Sets to be much more intuitive.

The Sample Space \(\Omega\) represents the set of possible outcomes we can sample from. In the earlier posts on Random Variables and Variance the "Sampler" robot was a fill-in for \(\Omega\). For a single coin toss \(\Omega\) would be \(\{\text{head},\text{tail}\}\).

The Sample Space can also be infinite. We may be picking out a number or range of numbers from 0 to 1, in which case \(\Omega = [0,1]\). All of the really interesting and mathematically tricky questions about Measure Theory and Probability only come up when we have sets with infinitely many possibilities.
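Here is a rough sketch of the difference between the two kinds of Sample Space. The function names are made up for this example. One caveat: Python's random.random() draws from the half-open interval [0.0, 1.0), so it only approximates sampling from \([0,1]\).

```python
import random

# A finite sample space can be written out in full.
COIN_OMEGA = ["head", "tail"]


def sample_coin(rng):
    """Draw one outcome from the finite sample space {head, tail}."""
    return rng.choice(COIN_OMEGA)


def sample_unit_interval(rng):
    """Draw one outcome from (an approximation of) the interval [0, 1].

    We cannot list the outcomes here; we can only describe the set and
    draw individual points from it.
    """
    return rng.random()


rng = random.Random(42)  # seeded only so repeated runs behave the same
print(sample_coin(rng))
print(sample_unit_interval(rng))
```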

Omega in our Probability Space represents all possibilities you can imagine.

To visualize \(\Omega\), imagine you have some unexplained aches and pains and you spend the night online looking up all the terrible things they could be. The mental list of all those outcomes, from a cold to Ebola, is your \(\Omega\). Anyone who has done this knows that this set of possible ailments contains only the information about what is possible and has nothing to do with what is probable!

Events and sigma-algebras 

In a few posts now we've posed the question "Is this a reasonable question to ask about?" regarding probabilities and events. For example, when rolling a six-sided die the answer to the question "What is the probability of rolling a 7?" could be 0. However, it seems a better answer is "That question doesn't make sense, there are 6 sides numbered 1-6". Likewise rather than saying that the probability of getting 1.5 heads in 10 coin tosses is 0, we want to say that flipping 1.5 heads is as nonsensical as flipping a coin and getting a goat.

\(\mathcal{F}\) answers this question and is probably the most important part of our Probability Space as far as Measure Theory goes. \(\mathcal{F}\) is a \(\sigma\)-algebra over \(\Omega\). Being a \(\sigma\)-algebra means that \(\mathcal{F}\) is a collection of subsets of \(\Omega\) such that \(\mathcal{F}\):

  • contains \(\Omega\) and the empty set \(\emptyset\)
  • is closed under complement
  • is closed under countable union
  • and is closed under countable intersection (this one follows from the previous two by De Morgan's laws)
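For a finite \(\Omega\) these properties can be checked by brute force. The function below is a minimal sketch under that assumption, with a name I made up for the example. With only finitely many events, checking unions and intersections of pairs is enough, since any countable union or intersection collapses to a finite one.

```python
def is_sigma_algebra(omega, events):
    """Check the sigma-algebra properties for a finite collection of events."""
    omega = frozenset(omega)
    events = {frozenset(e) for e in events}

    # Must contain the whole sample space and the empty set.
    if omega not in events or frozenset() not in events:
        return False
    # Every event must be a subset of the sample space.
    if any(not e <= omega for e in events):
        return False
    # Closed under complement.
    if any(omega - e not in events for e in events):
        return False
    # Closed under union and intersection (pairs suffice when finite).
    for a in events:
        for b in events:
            if a | b not in events or a & b not in events:
                return False
    return True


die = {1, 2, 3, 4, 5, 6}
even_odd = [set(), {2, 4, 6}, {1, 3, 5}, die]
broken = [set(), {1}, die]  # missing the complement {2, 3, 4, 5, 6}

print(is_sigma_algebra(die, even_odd))  # True
print(is_sigma_algebra(die, broken))    # False
```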

If we can construct a \(\sigma\)-algebra then we know that the questions we're asking about our Sample Space are measurable. This formal definition mathematically allows us to answer our question from before: how do we formalize the idea of being measurable? Don't get too concerned about the details; just know that the purpose of the \(\sigma\)-algebra is to answer this annoying question that keeps popping up about what kind of questions we can ask. The flip side is that if we cannot construct a \(\sigma\)-algebra, or our question cannot be contained in a \(\sigma\)-algebra over \(\Omega\), then we are talking nonsense, not Probability.

Keeping with the metaphor from the previous section, the list of questions you want to ask your doctor about what could be wrong with you, drawn up while you sit in the waiting room, is your \(\mathcal{F}\).

The Sigma-algebra can be understood as all the valid questions you can ask about Omega.

Now, this may seem like a trivial set of things to construct, but it can get complicated when our \(\Omega\) is not made up of simply a finite set of discrete events. Additionally, there is a difference between asking annoying, overly anxious questions such as "could it be Ebola?" and questions that downright don't make any sense, such as "am I a broken toaster?"

Not every question is sensible; our Sigma-algebra ensures that we only ask sensible questions.

Asking about Ebola when you have a cough may get you an eye roll from your doctor, but asking about being a toaster will have the doctor seriously questioning your sanity. In the same way \(\mathcal{F}\) is about determining if our questions are sane or not (or, mathematically speaking, measurable or not).

The Probability Measure

The Probability Measure, \(\mathbf{P}\), is our measuring stick that goes from 0 to 1 and simply tells us how probable an event from \(\mathcal{F}\) is. Following our example \(\mathbf{P}\) is the doctor who gives us reasonable probabilities for all the questions we have.
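As a small sketch, here is what \(\mathbf{P}\) might look like for a fair six-sided die. I'm assuming every outcome carries weight 1/6 and that an event's probability is the sum of the weights of its outcomes; the names weights and prob are just for this example. Fractions keep the arithmetic exact.

```python
from fractions import Fraction

die = frozenset(range(1, 7))
# Assumption: a fair die, so each outcome gets the same weight.
weights = {outcome: Fraction(1, 6) for outcome in die}


def prob(event):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum((weights[o] for o in event), Fraction(0))


evens = frozenset({2, 4, 6})
one = frozenset({1})  # shares no outcomes with evens

print(prob(die))          # 1
print(prob(frozenset()))  # 0
print(prob(evens))        # 1/2
# Disjoint events add up:
print(prob(evens | one) == prob(evens) + prob(one))  # True
```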

P is the set of answers to all the questions in our sigma-algebra.

By this point we've already dealt with all the tricky Measure Theory parts and are back asking how certain we are about events.


Putting it all together

This is the basic thing that Probability Spaces and Measure Theory say:

A much better way to visualize our Probability triplet!

To meaningfully talk about Probability you have to imagine what can happen (\(\Omega\)), be able to formulate sane questions about those ideas (\(\mathcal{F}\)), and finally you need to have answers to those questions (\(\mathbf{P}\)). If you can do these 3 things, then you're all set, but if for some reason you can't, then what you are talking about doesn't quite make sense.
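To make this concrete, here is a toy sketch of the whole triple for a die where the only questions we allow are "was the roll even?" and "was the roll odd?". The ProbabilitySpace class and its probability method are hypothetical names for this illustration, not part of any library, and the whole thing only works because \(\Omega\) is finite.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class ProbabilitySpace:
    omega: frozenset    # what can happen
    events: frozenset   # the sane questions (a sigma-algebra over omega)
    measure: dict       # the answer to each sane question

    def probability(self, event):
        """Answer a question, but only if it is one we are allowed to ask."""
        event = frozenset(event)
        if event not in self.events:
            raise ValueError(f"{set(event)} is not an event in F")
        return self.measure[event]


omega = frozenset(range(1, 7))
evens = frozenset({2, 4, 6})
odds = frozenset({1, 3, 5})

space = ProbabilitySpace(
    omega=omega,
    events=frozenset({frozenset(), evens, odds, omega}),
    measure={
        frozenset(): Fraction(0),
        evens: Fraction(1, 2),
        odds: Fraction(1, 2),
        omega: Fraction(1),
    },
)

print(space.probability({2, 4, 6}))  # 1/2

try:
    space.probability({7})
except ValueError as err:
    print("Not a sane question:", err)
```

Asking about rolling a 7 does not return 0 here. It raises an error, which is the code version of "that question doesn't make sense".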

When approaching Rigorous Probability with Measure Theory the Probability Space is our foundation. Though these ideas can initially be very confusing when laid out mathematically, they correspond to reasonable ideas that help clarify how we think about Probability.

If you enjoyed this post please subscribe to keep up to date and follow @willkurt!