References:
- Probabilistic Machine Learning by Kevin P. Murphy
- Think Bayes by Allen B. Downey
import pandas as pd
Definitions
Joint and Conditional Probability

$P(A, B) = P(A \mid B)\,P(B)$, where $P(A \mid B) = \frac{P(A, B)}{P(B)}$ is the conditional probability of A given B.

Chain Rule

$P(A_1, A_2, \ldots, A_n) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1, A_2) \cdots P(A_n \mid A_1, \ldots, A_{n-1})$

Two events are independent iff
- $P(A, B) = P(A)\,P(B)$

Similarly, when $P(B) > 0$,
- $P(A \mid B) = P(A)$

Conditional independence of events, given a third event C
- $P(A, B \mid C) = P(A \mid C)\,P(B \mid C)$

Events are often dependent on each other, but may be rendered independent if we condition on the relevant intermediate variables.

Union Probability

$P(A \cup B) = P(A) + P(B) - P(A, B)$

If 2 events are mutually exclusive, $P(A, B) = 0$, so the union reduces to $P(A \cup B) = P(A) + P(B)$.
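A quick numeric sanity check of the identities above (a sketch; the 0.5 probabilities are illustrative choices, not from the text):

```python
# Two independent events A and B, e.g. two fair coin flips.
p_a, p_b = 0.5, 0.5
p_a_and_b = p_a * p_b                 # independence: P(A, B) = P(A) P(B)
p_a_or_b = p_a + p_b - p_a_and_b      # union: P(A or B) = P(A) + P(B) - P(A, B)
print(p_a_and_b, p_a_or_b)            # 0.25 0.75
# If A and B were mutually exclusive instead, P(A, B) = 0
# and the union would simply be P(A) + P(B).
```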
Joint Distribution over sets of related random variables
Suppose, to start, that we have two random variables, X and Y. We can define the joint distribution of the two using p(x, y) = p(X = x, Y = y) for all possible values of X and Y. If both variables have finite cardinality, we can represent the joint distribution as a 2D table, all of whose entries sum to one. For example, consider the following example with two binary variables:
| P(A, B) | A = 0 | A = 1 | Marginal |
|---|---|---|---|
| B = 0 | 0.2 | 0.3 | 0.5 |
| B = 1 | 0.3 | 0.2 | 0.5 |
| Marginal | 0.5 | 0.5 | 1.0 |
Given a joint distribution, the marginal distribution of a random variable is obtained by summing over all possible states of the other:

$P(A = a) = \sum_b P(A = a, B = b)$

This is called the rule of total probability.
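As a minimal sketch, the 2×2 table above can be held in a pandas DataFrame, with both marginals falling out of row and column sums:

```python
import pandas as pd

# The joint distribution from the table above; rows are states of B,
# columns are states of A.
joint = pd.DataFrame({"A = 0": [0.2, 0.3], "A = 1": [0.3, 0.2]},
                     index=["B = 0", "B = 1"])
p_b = joint.sum(axis=1)   # marginal P(B): sums over the states of A
p_a = joint.sum(axis=0)   # marginal P(A): sums over the states of B
print(p_a.tolist(), p_b.tolist())   # [0.5, 0.5] [0.5, 0.5]
```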
Bayes' rule and inference
Inference: the act of passing from sample data to generalizations, usually with a calculated degree of certainty.
Bayes Rule

$P(H = h \mid Y = y) = \dfrac{P(H = h)\,P(Y = y \mid H = h)}{P(Y = y)}$

- P(H): what we know about the possible values of H before we see any data; the prior distribution. If H has K possible values, then P(H) is a vector of K probabilities that sum to 1.
- P(Y | H = h): for each hypothesis h, the distribution over the possible outcomes Y we expect to see; the observation distribution.
- P(Y = y | H = h): the observation distribution evaluated at the data y we actually observed; viewed as a function of h, this is called the likelihood.

Multiplying the prior P(H = h) by the likelihood P(Y = y | H = h) for each h gives the unnormalized joint distribution P(H = h, Y = y). Summing these products over all h gives the marginal likelihood P(Y = y); dividing each by that normalizer yields the posterior P(H = h | Y = y).
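Each example below spells this recipe out as a small table. As a sketch, the same steps could be packaged into one helper (`bayes_table` is a hypothetical name, not from the references):

```python
import pandas as pd

def bayes_table(hypotheses, prior, likelihood):
    """Hypothetical helper: one Bayes update laid out as a table.

    prior      -- P(H = h) for each hypothesis h
    likelihood -- P(Y = y | H = h) for the observed outcome y
    """
    df = pd.DataFrame(index=hypotheses)
    df["prior"] = prior
    df["likelihood"] = likelihood
    df["unnorm"] = df["prior"] * df["likelihood"]        # unnormalized joint P(H = h, Y = y)
    df["posterior"] = df["unnorm"] / df["unnorm"].sum()  # normalize by P(Y = y)
    return df
```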
Examples
Cookies
Suppose there are two bowls of cookies.
- Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies.
- Bowl 2 contains 20 vanilla cookies and 20 chocolate cookies.
Now suppose you choose one of the bowls at random and, without looking, choose a cookie at random. If the cookie is vanilla, what is the probability that it came from Bowl 1?
df = pd.DataFrame(index=["bowl 1", "bowl 2"])
df["prior"] = 0.5
df["observed"] = [3/4, 1/2]
df["likelihood"] = df["prior"] * df["observed"]
df["marginal likelihood"] = df["likelihood"] / df["likelihood"].sum()
df
| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| bowl 1 | 0.5 | 0.75 | 0.375 | 0.6 |
| bowl 2 | 0.5 | 0.50 | 0.250 | 0.4 |
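As a sanity check, the same 0.6 falls out of Bayes' rule by hand:

$P(\text{Bowl 1} \mid V) = \dfrac{P(\text{Bowl 1})\,P(V \mid \text{Bowl 1})}{P(\text{Bowl 1})\,P(V \mid \text{Bowl 1}) + P(\text{Bowl 2})\,P(V \mid \text{Bowl 2})} = \dfrac{0.5 \cdot 0.75}{0.5 \cdot 0.75 + 0.5 \cdot 0.5} = \dfrac{0.375}{0.625} = 0.6$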
Testing for Covid-19 in Seattle
Suppose you think you may have contracted COVID-19, which is an infectious disease caused by the SARS-CoV-2 virus. You decide to take a diagnostic test, and you want to use its result to determine if you are infected or not. Let H = 1 be the event that you are infected, and H = 0 be the event you are not infected. Let Y = 1 if the test is positive, and Y = 0 if the test is negative. We want to compute p(H = h | Y = y) for h ∈ {0, 1}, where y is the observed test outcome. (For brevity, we will write the distribution of values [p(H = 0 | Y = y), p(H = 1 | Y = y)] as p(H | y).) We can think of this as a form of binary classification, where H is the unknown class label and y is the feature vector.
If you tested positive, what is the probability that you are infected?
- Sensitivity = TPR = P(Y = 1 | H = 1) = 0.875
- Specificity = TNR = P(Y = 0 | H = 0) = 0.975, so the false positive rate is P(Y = 1 | H = 0) = 1 - 0.975 = 0.025
- Assumed prior (prevalence): P(H = 1) = 0.1
df = pd.DataFrame(index=["H=0", "H=1"])
df["prior"] = [0.9, 0.1]
df["observed"] = [1 - 0.975, 0.875]
df["likelihood"] = df["prior"] * df["observed"]
df["marginal likelihood"] = df["likelihood"] / df["likelihood"].sum()
df
| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| H=0 | 0.9 | 0.025 | 0.0225 | 0.204545 |
| H=1 | 0.1 | 0.875 | 0.0875 | 0.795455 |
There is a 79.5% chance you are infected.
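The answer depends heavily on the prior. As a sketch, rerunning the same update with an assumed prevalence of 1% (a made-up figure, not from the text) shows the posterior dropping sharply:

```python
import pandas as pd

# Same test characteristics, but an assumed prevalence of 1% instead of 10%.
df = pd.DataFrame(index=["H=0", "H=1"])
df["prior"] = [0.99, 0.01]
df["likelihood"] = [1 - 0.975, 0.875]                # P(Y=1 | H): FPR, then TPR
df["unnorm"] = df["prior"] * df["likelihood"]
df["posterior"] = df["unnorm"] / df["unnorm"].sum()
print(df["posterior"].round(3))                      # H=1 posterior falls to ~0.261
```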
Monty Hall Problem
The Monty Hall Problem is based on a game show called Let’s Make a Deal. If you are a contestant on the show, here’s how the game works:
The host, Monty Hall, shows you three closed doors—numbered 1, 2, and 3—and tells you that there is a prize behind each door. One prize is valuable (traditionally a car), and the other two are less valuable (traditionally goats). The object of the game is to guess which door has the car. If you guess right, you get to keep the car. Suppose you pick Door 1. Before opening the door you chose, Monty opens Door 3 and reveals a goat. Then Monty offers you the option to stick with your original choice or switch to the remaining unopened door.
To maximize your chance of winning the car, should you stick with Door 1 or switch to Door 2?
So, after Monty opens Door 3, we need to compute:

P(Prize = Door 1 | Opened = Door 3) and P(Prize = Door 2 | Opened = Door 3)
df = pd.DataFrame(index=["Prize = Door 1", "Prize= Door 2", "Prize = Door 3"])
df["prior"] = 1/3
# This is the tricky part. Keep in mind that host doesn't open the door where the prize is.
# P(Opened door 3 | Prize = Door 1) and P(Opened door 3 | Prize = Door 2)
df["observed"] = [1/2, 1, 0]
df["likelihood"] = df["prior"] * df["observed"]
df["marginal likelihood"] = df["likelihood"] / df["likelihood"].sum()
df
| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| Prize = Door 1 | 0.333333 | 0.5 | 0.166667 | 0.333333 |
| Prize = Door 2 | 0.333333 | 1.0 | 0.333333 | 0.666667 |
| Prize = Door 3 | 0.333333 | 0.0 | 0.000000 | 0.000000 |
We see that switching to Door 2 is the rational but unintuitive thing to do: it doubles your chance of winning the car, from 1/3 to 2/3.
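A quick Monte Carlo check of the 2/3 answer (a sketch; the door labels, trial count, and seed are arbitrary choices):

```python
import random

random.seed(0)                                   # arbitrary seed, for reproducibility
trials = 100_000
wins_by_switching = 0
for _ in range(trials):
    prize = random.choice([1, 2, 3])             # car placed uniformly at random
    pick = 1                                     # you always pick Door 1
    # Monty opens a door that is neither your pick nor the prize door.
    opened = random.choice([d for d in (1, 2, 3) if d not in (pick, prize)])
    switched = next(d for d in (1, 2, 3) if d not in (pick, opened))
    wins_by_switching += (switched == prize)
print(wins_by_switching / trials)                # ~0.667: switching wins 2/3 of the time
```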