forked from dlsun/probability
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathbayes.Rmd
143 lines (106 loc) · 7.39 KB
/
bayes.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
# Bayes' Theorem {#bayes}
## Motivating Example
The ELISA test is used to screen blood for HIV.
- When the blood contains HIV, it gives a positive result 98\% of the time.
- When the blood does not contain HIV, it gives a negative result 94\% of the time.
The prevalence of HIV is about 1\% in the adult male population.
A patient has just tested positive and wants to know the probability that he has HIV.
What would you tell him?
The solution to this problem involves an important theorem in probability and statistics called
**Bayes' Theorem**. This video covers some of the intuition and the history behind Bayes' Theorem.
Don't worry about the details for now. This video is meant to be more inspiring than informative.
<iframe width="560" height="315" src="https://www.youtube.com/embed/R13BD8qKeTg" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
## Theory
The conditional probabilities $P(A | B)$ and $P(B | A)$ are not the same. For example,
let $A$ be "currently a Cal Poly student" and $B$ be "went to high school in California".
- $P(B | A)$ is very high, about $0.85$, since most Cal Poly students are in-state.
- $P(A | B)$ is very low, as only a very small fraction of people who went to high school in CA are currently in college, much less
at Cal Poly.
Bayes' Theorem is a way to convert probabilities of the form $P(B | A)$ into probabilities of the form $P(A | B)$. This switcheroo
is surprisingly common in probability and statistics. For example,
- Doctors know the probability that a patient tests positive ($B$) given that they have the disease ($A$), but a patient
is more interested in the probability that he has the disease ($A$) given that he tested positive ($B$).
- E-mail providers can collect data on the probability that an e-mail contains a certain word ($B$) given that it is
spam ($A$), but a spam filter needs the probability that the e-mail is spam ($A$) given that it contains the word ($B$).
Don't be deceived by its simplicity; Bayes' Theorem
is one of the most important and powerful results in all of probability and statistics.
```{theorem, name="Bayes' Theorem"}
\[ P(A | B) = \frac{P(A) P(B | A)}{P(B)}. \]
```
```{proof}
By the definition of conditional probability (\@ref(def:conditional)) and the multiplication rule (\@ref(thm:multiplication-rule)):
\begin{equation}
P(A | B) = \frac{P(A \textbf{ and } B)}{P(B)} = \frac{P(A) P(B | A)}{P(B)}.
\end{equation}
```
The next video provides intuition about this proof, connecting it with concepts from the past few lessons, including
conditional probability and independence.
<iframe width="560" height="315" src="https://www.youtube.com/embed/U_85TaXbeIo?start=13" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
To solve most problems, you will need to combine Bayes' Theorem with the Law of Total Probability (\@ref(thm:ltp)).
The solution to the HIV testing example from above demonstrates some of the common tricks.
```{solution}
Let's represent HIV positive by $H$ and a positive ELISA test by $T$.
The problem statement tells us that
\begin{align*}
P(H) &= 0.01 & P(T | H) &= 0.98 & P(\textbf{not } T | \textbf{not } H) &= 0.94.
\end{align*}
By the Complement Rule, we can infer that
\begin{align*}
P(\textbf{not } H) &= 0.99 & P(\textbf{not } T | H) &= 0.02 & P(T | \textbf{not } H) &= 0.06.
\end{align*}
Since we already have $P(T | H)$ and we want to know $P(H | T)$, this is a job for Bayes' Rule:
\[ P(H | T) = \frac{P(H) P(T | H)}{P(T)} = \frac{0.01 \cdot 0.98}{P(T)}. \]
Unfortunately, we do not know $P(T)$, the probability of testing positive overall. However, we do know
its probability, conditional on HIV status. So we apply the Law of Total Probability, partitioning
by HIV status:
\[ P(T) = P(H) P(T | H) + P(\textbf{not } H) P(T | \textbf{not } H) = 0.01 \cdot 0.98 + 0.99 \cdot 0.06. \]
Finally, we plug this result into Bayes' Rule:
\[ P(H | T) = \frac{0.01 \cdot 0.98}{0.01 \cdot 0.98 + 0.99 \cdot 0.06} = .142. \]
So even though the patient tested positive for HIV, he only has a 14.2% chance of actually having HIV!
```
The low probability is surprising at first. But it makes sense if we think about the problem geometrically. The figure below partitions
adult males based on their HIV status and test status. The shaded area represents all people who tested positive. The people
who actually have HIV make up only a tiny fraction of all the people who tested positive because there are so many more people who do not
have the disease, that the false positives overwhelm the true positives.
```{r, echo=FALSE, engine='tikz', out.width='60%', fig.ext='png', fig.align='center', fig.cap="Intuition for Bayes' Rule"}
\begin{tikzpicture}[scale=.6]
\draw (0, 0) rectangle (10, 10);
\draw (0.5, 0) -- (0.5, 10);
\node[above] at (0.25, 10) {{\scriptsize $H$}};
\node[above] at (5.5, 10) {{\scriptsize $\textbf{not } H$}};
\node[below] at (0.25, 0) {{\small 1\%}};
\node[below] at (5.5, 0) {{\small 99\%}};
\draw[fill=gray!50] (0, 0) rectangle (0.5, 9.5);
\node at (0.25, 5) {$+$};
\node at (0.25, 9.65) {$-$};
\node[above,rotate=90] at (0, 5) {{\small 99\%}};
\node[above,rotate=90] at (0, 9.75) {{\small 1\%}};
\draw[fill=gray!50] (0.5, 0) rectangle (10, 1.4);
\node at (5.5, 0.7) {$+$};
\node at (5.5, 6.5) {$-$};
\node[below,rotate=90] at (10, 0.7) {{\small 6\%}};
\node[below,rotate=90] at (10, 6.5) {{\small 94\%}};
\end{tikzpicture}
```
The next video will give you deep insights about Bayes' rule. You should be able to follow most of it, but you may want to come
back to this video after trying some of the examples below.
<iframe width="560" height="315" src="https://www.youtube.com/embed/HZGCoVF3YvM" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
## Examples
1. The rare mineral unobtanium is present in only 1% of rocks in a mine. You have an
unobtanium detector, which never fails to detect unobtanium when it is present.
Otherwise, it is still reliable, returning accurate readings 90% of the time when
unobtanium is not present.
a. What is $P(\text{unobtanium detected} | \text{unobtanium present})$?
b. What is $P(\text{unobtanium present} | \text{unobtanium detected})$?
(**Hint:** One of these probabilities can be read off directly from the question prompt.
The other needs to be calculated using Bayes' Rule.)
<iframe width="560" height="315" src="https://www.youtube.com/embed/1csFTDXXULY" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
2. One application where Bayes' Theorem has been extremely successful is spam filtering. From historical data,
80% of all e-mail is spam, and the phrase "free money" is used in 10% of spam e-mails. That is,
$P(\text{"free money"} | \text{spam}) = 0.1$.
The phrase is also used in 1% of non-spam e-mails. A new e-mail has just arrived which contains the phrase
"free money". Given this information, what is the probability that it is spam,
$P(\text{spam} | \text{"free money"})$?
3. A certain disease afflicts 10% of the population. A test for the disease is 90% accurate for patients with the disease
and 80% accurate for patients without the disease. Suppose you test positive for the disease.
What is the probability that you actually have the disease?