# LOTUS {#lotus}
## Motivating Example {-}
In Lesson \@ref(ev-infinity), we analyzed the St. Petersburg Paradox. There, we calculated the expected
amount we win, $E[W]$, by first deriving the p.m.f. of $W$.
However, we know that the amount we win, $W$, is related to the number of tosses, $N$, by
\[ W = 2^{N-1}. \]
Furthermore, we know that $N$ follows a $\text{Geometric}(p=1/2)$ distribution. Can we calculate
the expected amount we win, $E[W] = E[2^{N-1}]$, from the p.m.f. of $N$, without deriving
the p.m.f. of $W$?

## Theory {-}
In this lesson, we will learn how to calculate expected values of _functions_ of random variables.
That is, we will calculate expected values of the form $E[g(X)]$. There are two ways to do this:

1. Calculate the p.m.f. of $Y = g(X)$, then calculate $E[Y]$ from the usual formula \@ref(eq:ev).
2. Use the **Law of the Unconscious Statistician** (LOTUS), described below.
```{theorem lotus, name="LOTUS"}
Let $X$ be a random variable with p.m.f. $f_X(x)$. Define $Y = g(X)$ for some function $g$. Then,
$E[Y] = E[g(X)]$ is
\begin{equation}
E[g(X)] = \sum_x g(x) \cdot f_X(x).
(\#eq:lotus)
\end{equation}
```
Theorem \@ref(thm:lotus) allows us to calculate the expected value of $Y = g(X)$, without first
finding its distribution. Instead, we can just use the known distribution of $X$.
This result is called the "Law of the Unconscious Statistician" because many people intuitively
assume it is true. Remember that $E[g(X)]$ represents the "average" value
of $g(X)$. To calculate the average value of $g(X)$, it makes sense to take a weighted average of the
possible values $g(x)$, where the weights are the probabilities $f_X(x)$.
Let's start with a simple example where $E[g(X)]$ is easy to calculate, to understand why LOTUS works.
```{example, name="Random Circle"}
We toss a fair coin twice. Let $X$ be the number of heads. Then, the p.m.f. of $X$ is
\[ \begin{array}{r|ccc}
x & 0 & 1 & 2 \\
\hline
f_X(x) & .25 & .50 & .25
\end{array} \]
Now, suppose we sketch a circle whose radius is $X$ (in feet), the random number we just generated by tossing
the coin. Then, the area of this circle is a random variable $A = \pi X^2$ (in square feet).
What is the expected area $E[A] = E[\pi X^2]$? Clearly, the only possible values of $\pi X^2$ are
\begin{align*}
\pi \cdot 0^2 &= 0, & \pi \cdot 1^2 &= \pi, & \text{ and } \pi \cdot 2^2 &= 4\pi,
\end{align*}
and their probabilities are just the probabilities of $0$, $1$, and $2$, respectively. That is,
the p.m.f. of $A$ is
\[ \begin{array}{r|ccc}
a & \pi \cdot 0^2 & \pi \cdot 1^2 & \pi\cdot 2^2 \\
\hline
f_A(a) & .25 & .50 & .25
\end{array} \]
Therefore, the expected area must be
\begin{align}
E[\pi X^2] &= (\pi \cdot 0^2) \cdot .25 + (\pi \cdot 1^2) \cdot .50 + (\pi \cdot 2^2) \cdot .25 \\
&= 1.5 \pi. (\#eq:area-circle)
\end{align}
Notice that we weighted the values of $g(x) = \pi x^2$ by the p.m.f. of $X$ to calculate
$E[g(X)]$. This is exactly what LOTUS (Theorem \@ref(thm:lotus)) said we should do!
Notice that we get a different answer if we first evaluate the expected radius and then calculate
the area of a circle with that radius:
\[ \pi E[X]^2 = \pi \cdot 1^2 = \pi. \]
An average circle is not the same as a circle with an average radius!
In general, $E[g(X)] \neq g(E[X])$. To calculate $E[g(X)]$, you have to use LOTUS;
to calculate $g(E[X])$, you first calculate the expected value and then apply the function
$g$ to the result.
```
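To see these calculations concretely, here is a short R sketch (an illustrative check, not part of the original lesson) that evaluates $E[\pi X^2]$ by LOTUS and by simulation, and contrasts both with $\pi E[X]^2$.

```{r circle-lotus-check, eval=FALSE}
# Possible values and p.m.f. of X, the number of heads in two tosses of a fair coin
x <- c(0, 1, 2)
f <- c(0.25, 0.50, 0.25)

# The function applied to X: the area of a circle with radius x
g <- function(x) pi * x^2

# LOTUS: weight the values g(x) by the p.m.f. of X
sum(g(x) * f)                  # 1.5 * pi, about 4.71

# A circle with the *average* radius has a different area
g(sum(x * f))                  # pi * 1^2, about 3.14

# Simulation check: average the areas of many random circles
radii <- sample(x, size = 100000, replace = TRUE, prob = f)
mean(g(radii))                 # close to 1.5 * pi
```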
Now let's look at the question posed at the beginning of the lesson.
```{example, name="St. Petersburg Paradox Revisited"}
We know that the p.m.f. of $N$, a $\text{Geometric}(p=0.5)$ random variable, is
\[ f_N(n) = (1-0.5)^{n-1} 0.5 = 0.5^n, \qquad n = 1, 2, \ldots. \]
By LOTUS,
\begin{align*}
E[2^{N-1}] &= \sum_{n=1}^\infty 2^{n-1} \cdot (0.5)^n \\
&= \sum_{n=1}^\infty (0.5) \\
&= 0.5 + 0.5 + 0.5 + \ldots \\
&= \infty,
\end{align*}
which matches the answer we got in Lesson \@ref(ev-infinity).
```
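The divergence is easy to check numerically: every term of the LOTUS sum equals $0.5$, so the partial sums grow without bound. Here is a minimal R sketch (illustrative, not part of the original lesson).

```{r st-petersburg-partial-sums, eval=FALSE}
# Each term of the LOTUS sum is 2^(n-1) * 0.5^n = 0.5
n <- 1:50
terms <- 2^(n - 1) * 0.5^n
head(terms)       # 0.5 0.5 0.5 0.5 0.5 0.5

# The partial sums increase by 0.5 with every term, so the series diverges
cumsum(terms)
```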
Here is a more complex application of LOTUS. This particular expected value
may seem unmotivated, but it will come in handy later when we talk about variance.
<iframe width="560" height="315" src="https://www.youtube.com/embed/jxAPRL8iO3k" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
```{example binomial-lotus}
Let $X$ be a $\text{Binomial}(n, N_1, N_0)$ random variable, where $N = N_1 + N_0$. In Example \@ref(exm:binomial-ev),
we showed that $E[X] = n \frac{N_1}{N}$. Now, we calculate $E[X(X-1)]$ by applying LOTUS
\@ref(eq:lotus) to the binomial p.m.f.
\[ f(x) = \binom{n}{x} \frac{N_1^x N_0^{n-x}}{N^n}, \qquad x=0, 1, \ldots, n. \]
\begin{align*}
E[X(X-1)] &= \sum_{x=0}^n x (x-1) \cdot \binom{n}{x} \frac{N_1^x N_0^{n-x}}{N^n} \\
&= \sum_{x=2}^n x(x-1) \cdot \binom{n}{x} \frac{N_1^x N_0^{n-x}}{N^n},
\end{align*}
where the only change from the first line to the second is that the sum now starts at $x=2$
instead of $x=0$. (We can do this because the summand is 0 when $x=0$ and $x=1$.
Try plugging in $x=0$ and $x=1$ if you do not see this.)
Next, we replace $x(x-1) \cdot \binom{n}{x}$ using the combinatorial identity
\[ x(x-1) \binom{n}{x} = n (n-1) \binom{n-2}{x-2}. \]
Here is a story proof of this identity: imagine selecting a committee of $x$ people from $n$, where
one person is the chair and another is the vice-chair. We can either:

1. select the committee first ($\binom{n}{x}$) and then select the chair and vice-chair ($x(x-1)$), or
2. select the chair and vice-chair first ($n(n-1)$) and then select the rest of the committee ($\binom{n-2}{x-2}$).

Since these are two equivalent ways of selecting a committee with a chair and a vice-chair, the two
expressions must be equal. Substituting this identity into the sum above, we continue:
\begin{align*}
&= \sum_{x=2}^n n (n-1) \binom{n-2}{x-2} \frac{N_1^x N_0^{n-x}}{N^n} \\
&= n(n-1)\sum_{x=2}^n \binom{n-2}{x-2} \frac{N_1^x N_0^{n-x}}{N^n} & (\text{pull $n(n-1)$ outside the sum}) \\
&= n(n-1) \sum_{x'=0}^{n-2} \binom{n-2}{x'} \frac{N_1^{x' + 2} N_0^{n - 2 - x'}}{N^n} & (\text{apply substitution $x' = x - 2$}) \\
&= n(n-1) \frac{N_1^2}{N^2} \sum_{x'=0}^{n-2} \underbrace{\binom{n-2}{x'} \frac{N_1^{x'} N_0^{n - 2 - x'}}{N^{n-2}}}_{\text{p.m.f. of $\text{Binomial}(n-2, N_1, N_0)$}} & (\text{pull factors of $N_1$ and $N$ outside the sum}) \\
&= n(n-1) \frac{N_1^2}{N^2} & (\text{sum of p.m.f. over all possible values is 1})
\end{align*}
```
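To make the algebra concrete, here is a short R check (a sketch with arbitrarily chosen values of $n$, $N_1$, and $N_0$; not part of the original lesson) that evaluates $E[X(X-1)]$ directly by LOTUS, compares it with the closed form $n(n-1) \frac{N_1^2}{N^2}$, and spot-checks the combinatorial identity used above.

```{r binomial-factorial-moment-check, eval=FALSE}
# Arbitrary values for illustration
n  <- 10
N1 <- 3
N0 <- 7
N  <- N1 + N0

# Binomial(n, N1, N0) p.m.f., i.e., a binomial with success probability N1 / N
x   <- 0:n
pmf <- dbinom(x, size = n, prob = N1 / N)

# LOTUS: E[X(X-1)] as a weighted sum over the p.m.f. of X
sum(x * (x - 1) * pmf)               # 8.1

# The closed form derived above
n * (n - 1) * N1^2 / N^2             # 8.1

# Spot check of the identity x(x-1) choose(n, x) = n(n-1) choose(n-2, x-2)
x0 <- 4
x0 * (x0 - 1) * choose(n, x0)        # 2520
n * (n - 1) * choose(n - 2, x0 - 2)  # 2520
```
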
## Essential Practice {-}
1. Suppose we generate a random length $L$ (in inches) from the p.m.f.

    | $\ell$    | 1  | 2  | 3  |
    |----------:|:--:|:--:|:--:|
    | $f(\ell)$ | .2 | .5 | .3 |

    and draw a square with that side length. Calculate $E[L]^2$ and $E[L^2]$.
    Are they the same? Which one represents the expected area of the square we drew?
2. Let $X$ be a $\text{Poisson}(\mu)$ random variable. Calculate $E[X(X-1)]$.
3. Let $X$ be a $\text{Geometric}(p)$ random variable. Let $t$ be a constant. Calculate
$M(t) = E[e^{tX}]$ as a function of $t$. Statisticians call this the
_moment generating function_ of $X$, while engineers may recognize this function as the
_Laplace transform_ of the p.m.f. of $X$.

## Additional Exercises {-}
1. Another resolution to the St. Petersburg Paradox is to consider expected
utility $U$ rather than expected wealth $W$. ("Utility" is the term that economists use for "happiness".)
Because of diminishing marginal utility, the first million dollars is worth more than the next million dollars.
One way to model diminishing marginal utility is to assume that $U = \log(W)$.
Show that the expected utility of the St. Petersburg game is finite, even though the expected winnings
are infinite.
2. Let $X$ be a $\text{Hypergeometric}(n, N_1, N_0)$ random variable. Calculate $E[X(X-1)]$.
3. Let $X$ be a $\text{Poisson}(\mu)$ random variable for $0 < \mu < 1$. Calculate $E[X!]$.
4. Let $X$ be a $\text{NegativeBinomial}(r, p)$ random variable. Calculate $E[(X+1)X]$.