The Lebesgue Integral: A Curious Tea Party (Part 1)

Now that we've covered a few ways to understand the Integral, along with the basics of Measure Theory and Probability Spaces, we are ready to put all of that together and introduce a very exciting topic: the Lebesgue Integral. Like everything else we've discussed recently, this topic is typically reserved for advanced study, but the basic idea is relatively simple to understand and very practical for dealing with more sophisticated models of probability. This topic is going to be broken up into two parts. In this first post we'll set up a situation in which the Integrals we've already talked about aren't going to work.

A Curious Tea Party

Suppose you're observing a rather interesting tea party. What makes this party worthy of note is that it has very strict rules on the usage of sugar. This party is attended only by game theorists and economists, so the rules are always obeyed. The setup is this:

On each table, there are two sugar bowls: one filled with 5 sugar cubes (each cube being one teaspoon of sugar) and the other filled with a total of 5 teaspoons of granulated sugar. The sugar in the second bowl is so finely ground that each grain can be considered infinitesimally small. The party's rules for sugar consumption are pretty straightforward: before you can use any of the granulated sugar, you must first use all 5 of the sugar cubes.

Sometimes Random Variables can be both discrete and continuous.

Let's assume that you've recorded sugar consumption among all the tea party goers for quite a long time and finally arrived at a distribution describing that consumption. It looks roughly like this:

Like our data, our distribution has both discrete and continuous components

Now let's move on to thinking probabilistically about this!

Sugar Probability Space

If you aren't already familiar with formal Probability Spaces, now would be a good time to read this illustrated post covering the topic. Given the problem above, if we want to talk about our tea party in a Measure Theoretic and Rigorous manner, we need to define the triple \((\Omega,\mathcal{F},\mathbf{P})\).

The Sample Space

When building our Probability Space, the first thing we need to define is our Sample Space \(\Omega\). In this case, we have a rather interesting distribution. A tea drinker can either use a number of sugar cubes from \(\{0,1,2,3,4,5\}\), or use all 5 cubes and then have anywhere from infinitesimally more than 5 up to 10 teaspoons of sugar (i.e. \((5,10]\)).
Our \(\Omega\) is:
$$\Omega = \{0,1,2,3,4,5\} \cup (5,10]$$
The Sample Space combines both discrete values and an uncountably infinite interval. This example is the first case in this blog where we're looking at a probability distribution that is not simply discrete or continuous; it's either, depending on what question we're asking!

The Sigma Algebra

Speaking of asking questions, the next thing we need to talk about is our \(\sigma\)-algebra, \(\mathcal{F}\). Remember, \(\mathcal{F}\) is the set of sane questions we can ask about our Sample Space \(\Omega\). We're not going to go over a detailed description of constructing \(\mathcal{F}\), as it would take us far away from the world of 'simple' explanations of Measure Theoretic Probability. What we do need to do is realize the kinds of questions that \(\mathcal{F}\) is going to permit if we assume we can construct it.

For starters, it needs to allow us to ask questions about both discrete values (e.g. "What is the probability of a table consuming 5 teaspoons of sugar?") and about intervals (e.g. "What is the probability of a table consuming between 6.5 and 9 teaspoons of sugar?"). If you think about how we answer questions in Probability using Calculus, this should start to seem a bit concerning. Another important thing to realize is that we cannot ask questions about all fractional amounts of sugar. A question about using 5.5 teaspoons of sugar is perfectly fine, but asking about 3.2 is not. Why is this? Remember, \(\mathcal{F}\) is constructed using subsets of our original sample space \(\Omega\). \(\Omega\) does contain 5.5, because it contains all of \((5,10]\), but \(\Omega\) does not contain the value 3.2; between 3 and 4 it contains only the values 3 and 4 themselves.
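As a quick sanity check on which amounts of sugar are outcomes at all, here is a minimal Python sketch of membership in \(\Omega\). The function name `in_omega` is made up for this illustration, and it only tests single outcomes; it says nothing about how \(\mathcal{F}\) itself is constructed.

```python
def in_omega(x):
    """Return True if x teaspoons is an outcome in Omega = {0,...,5} U (5, 10]."""
    # Discrete part: a whole number of sugar cubes.
    is_cube_count = x in {0, 1, 2, 3, 4, 5}
    # Continuous part: all 5 cubes plus some granulated sugar, the interval (5, 10].
    is_granulated = 5 < x <= 10
    return is_cube_count or is_granulated


print(in_omega(5.5))  # a point inside (5, 10]
print(in_omega(3.2))  # falls between two cube counts
```

The first call reports that 5.5 is a valid outcome and the second reports that 3.2 is not, matching the reasoning above.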

The Probability Measure

As is often the case, the Probability Measure \(\mathbf{P}\) is the easiest to understand (at least at first). We can use our distribution from earlier to imagine this function. Want to know the probability of 3 teaspoons? Just look it up on that graph! Want to figure out the probability of a table using between 7 and 8 teaspoons? We'll just integrate!
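To make the "look it up or integrate" idea concrete, here is a rough Python sketch. The numbers are invented for illustration, since the graph's exact values aren't given here: `POINT_MASS` assigns made-up probabilities to 0 through 5 teaspoons, and `DENSITY` assumes the remaining probability is spread evenly over \((5,10]\). The function names are likewise hypothetical.

```python
from fractions import Fraction

# Hypothetical probabilities for exactly 0..5 teaspoons (assumed values).
POINT_MASS = {
    0: Fraction(10, 100),
    1: Fraction(15, 100),
    2: Fraction(20, 100),
    3: Fraction(15, 100),
    4: Fraction(10, 100),
    5: Fraction(5, 100),
}

# Hypothetical constant density on (5, 10]: the remaining 0.25 of
# probability spread evenly over 5 teaspoons of granulated sugar.
DENSITY = Fraction(5, 100)


def prob_point(x):
    """Probability of exactly x teaspoons: a lookup, zero if x is not 0..5."""
    return POINT_MASS.get(x, Fraction(0))


def prob_interval(a, b):
    """Probability of more than a and at most b teaspoons, i.e. (a, b]."""
    # Discrete part: add up the point masses that fall inside (a, b].
    discrete = sum((m for k, m in POINT_MASS.items() if a < k <= b), Fraction(0))
    # Continuous part: integrating a constant density is just
    # density times the length of the overlap of (a, b] with (5, 10].
    lo, hi = max(a, 5), min(b, 10)
    continuous = DENSITY * max(Fraction(0), Fraction(hi) - Fraction(lo))
    return discrete + continuous


print(prob_point(3))         # look it up
print(prob_interval(7, 8))   # integrate
print(prob_interval(-1, 10)) # everything: should be 1
```

Notice that `prob_interval` has to stitch a sum and an integral together by hand. That works for this toy example, but it is a workaround rather than a single definition of what the calculation means.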

This brings us to an interesting problem. Given the graphical representation of our distribution, it appears easy to see how we can come up with probabilities. In pure math terms, though, it is not clear how we would explicitly express the calculation. Normally, in a Probability Density Function, any single point, such as 3, has a probability of 0. In PDFs we can only meaningfully talk about ranges of values. Similarly, in a Probability Mass Function we talk exclusively about discrete values, such as 3, but have no way to say anything meaningful about truly continuous ranges of values. Sometimes with a PMF we can talk about things like the probability of getting 3.3, but that's only when we treat all values between 3 and 4 as the same.
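To spell out the PDF half of that: suppose, purely for illustration, that sugar use \(X\) were described only by a density \(f\) (both symbols are assumptions introduced here, not taken from the graph). Then the probability of exactly 3 teaspoons would be an integral over an interval of zero width:
$$\mathbf{P}(X = 3) = \int_{3}^{3} f(x)\,dx = 0$$
whereas a range such as 7 to 8 teaspoons gets
$$\mathbf{P}(7 \le X \le 8) = \int_{7}^{8} f(x)\,dx$$
which can be positive. A density alone therefore has no way to give 3 teaspoons the nonzero probability our graph shows.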

So we've intuitively solved this problem quite easily, but mathematically we're in a bit of a tricky place!

Integration?

In past posts, we've discussed both the Fundamental Theorem of Calculus and the Riemann Integral as different ways of viewing the process of integration, with the Riemann Integral being the more robust of the two. Now we've encountered a new problem with integration, and this problem has come about because we're no longer integrating over nicely behaved distributions, but rather over interesting sets of events.

To solve this problem, we'll be introducing a definition of the Integral that is new (for us) and much more robust: the Lebesgue Integral!

In the next post, we'll continue the discussion of our tea party. We'll explore how both the FToC and the Riemann Integral fail to solve our problem, and take a look at the very powerful Lebesgue Integral.

If you enjoyed this post, please subscribe to keep up to date and follow @willkurt!