References:
- Kevin P. Murphy, Probabilistic Machine Learning
- Allen B. Downey, Think Bayes

# pandas is used for all of the worked examples below
import pandas as pd

Definitions

Joint and Conditional Probability

- Joint probability: P(A, B), the probability that both A and B occur.
- Conditional probability: P(A | B) = P(A, B) / P(B), the probability of A given that B has occurred.

Chain Rule

- P(A, B) = P(A | B) P(B)
- More generally, P(A1, A2, ..., An) = P(A1) P(A2 | A1) ... P(An | A1, ..., An-1)

2 events are independent iff
- P(A, B) = P(A) P(B)

Similarly, an equivalent condition (when P(B) > 0) is
- P(A | B) = P(A), i.e., knowing B tells us nothing about A

Conditional independence of events
- P(A, B | C) = P(A | C) P(B | C)

Events are often dependent on each other, but may be rendered independent if we condition on the relevant intermediate variables. For example, a person's height and vocabulary size are dependent in the general population, but approximately conditionally independent given age.

Union Probability

- P(A or B) = P(A) + P(B) - P(A, B)

If 2 events are mutually exclusive, P(A, B) = 0, so
- P(A or B) = P(A) + P(B)

Suppose, to start, that we have two random variables, X and Y. We can define the joint distribution of two random variables using p(x, y) = p(X = x, Y = y) for all possible values of X and Y. If both variables have finite cardinality, we can represent the joint distribution as a 2D table, all of whose entries sum to one. For example, consider the following table for two binary variables:

P(A, B)     P(A = 0)   P(A = 1)   Marginal
P(B = 0)    0.2        0.3        0.5
P(B = 1)    0.3        0.2        0.5
Marginal    0.5        0.5        1.0

Given a joint distribution, the marginal distribution of a random variable is obtained by
summing over all possible states of the other variable, e.g. P(A = a) = Σ_b P(A = a, B = b).
This is called the rule of total probability.
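
As a quick numerical companion (a sketch of our own, not from the references), the following pandas snippet builds the joint table above, recovers both marginals via the rule of total probability, and checks the independence condition from the definitions:

import pandas as pd

# Joint distribution P(A, B) from the table above: rows are states of B, columns states of A
joint = pd.DataFrame(
    [[0.2, 0.3],
     [0.3, 0.2]],
    index=["B=0", "B=1"],
    columns=["A=0", "A=1"],
)

p_a = joint.sum(axis=0)  # marginal P(A): sum over the states of B
p_b = joint.sum(axis=1)  # marginal P(B): sum over the states of A

print(p_a)  # A=0: 0.5, A=1: 0.5
print(p_b)  # B=0: 0.5, B=1: 0.5

# Independence check: P(A=0, B=0) = 0.2 but P(A=0) * P(B=0) = 0.25, so A and B are dependent.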

Bayes' rule and inference

Inference: the act of passing from sample data to generalizations, usually with calculated degrees of certainty.

Bayes' Rule

P(H = h | Y = y) = P(H = h) P(Y = y | H = h) / P(Y = y)

  • P(H): what we know about the possible values of H before we see any data; the prior distribution. If H has K possible values, then P(H) is a vector of K probabilities that sum to 1.
  • P(Y | H = h): the distribution over the possible outcomes Y we expect to see if H = h; the observation distribution.
  • P(Y = y | H = h): the observation distribution evaluated at the actual observed data y. Viewed as a function of h, this is called the likelihood.
    Multiplying the prior P(H = h) by the likelihood P(Y = y | H = h) for each h gives the unnormalized joint distribution P(H = h, Y = y).
  • Summing the unnormalized joint over all h gives the marginal likelihood P(Y = y) = Σ_h' P(H = h') P(Y = y | H = h'); dividing by it normalizes the joint into the posterior P(H = h | Y = y). (See the helper sketch after this list.)
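
These steps can be wrapped in a small table-building helper. Below is a minimal sketch in the spirit of the Think Bayes update tables; the function name bayes_table is our own, not from either reference:

import pandas as pd

def bayes_table(hypotheses, priors, likelihoods):
    """Bayes' rule as a table: prior times likelihood, normalized into a posterior."""
    df = pd.DataFrame(index=hypotheses)
    df["prior"] = priors
    df["likelihood"] = likelihoods                        # P(Y = y | H = h) at the observed y
    df["unnorm"] = df["prior"] * df["likelihood"]         # unnormalized joint P(H = h, Y = y)
    df["posterior"] = df["unnorm"] / df["unnorm"].sum()   # divide by marginal likelihood P(Y = y)
    return df

The three examples below carry out exactly these column operations by hand.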

Examples

Cookies

Suppose there are two bowls of cookies.

  • Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies.
  • Bowl 2 contains 20 vanilla cookies and 20 chocolate cookies.

Now suppose you choose one of the bowls at random and, without looking, choose a cookie at random. If the cookie is vanilla, what is the probability that it came from Bowl 1?

df = pd.DataFrame(index=["bowl 1", "bowl 2"])
df["prior"] = 0.5
df["observed"] = [3/4, 1/2]
df["likelihood"] = df["prior"] * df["observed"]
df["marginal likelihood"] = df["likelihood"] / df["likelihood"].sum()
df
        prior  likelihood  unnorm  posterior
bowl 1    0.5        0.75   0.375        0.6
bowl 2    0.5        0.50   0.250        0.4

The vanilla cookie is evidence in favor of Bowl 1: its posterior probability rises from the prior 0.5 to 0.6.


Testing for Covid-19 in Seattle

Suppose you think you may have contracted COVID-19, which is an infectious disease caused by the SARS-CoV-2 virus. You decide to take a diagnostic test, and you want to use its result to determine if you are infected or not. Let H = 1 be the event that you are infected, and H = 0 be the event you are not infected. Let Y = 1 if the test is positive, and Y = 0 if the test is negative. We want to compute p(H = h | Y = y), for h ∈ {0, 1}, where y is the observed test outcome. (We will write the vector of values [p(H = 0 | Y = y), p(H = 1 | Y = y)] as p(H | y), for brevity.) We can think of this as a form of binary classification, where H is the unknown class label and y is the feature vector.

If you tested positive, what is the probability that you are infected? Assume:

  • Specificity = TNR = P(Y = 0 | H = 0) = 0.975
  • Sensitivity = TPR = P(Y = 1 | H = 1) = 0.875
  • Prior prevalence = P(H = 1) = 0.1

df = pd.DataFrame(index=["H=0", "H=1"])
df["prior"] = [0.9, 0.1]
df["observed"] = [1 - 0.975, 0.875]
df["likelihood"] = df["prior"] * df["observed"]
df["marginal likelihood"] = df["likelihood"] / df["likelihood"].sum()
df
     prior  likelihood  unnorm  posterior
H=0    0.9       0.025  0.0225   0.204545
H=1    0.1       0.875  0.0875   0.795455

There is a 79.5% chance you are infected.
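
A negative test is handled by the same table; only the likelihood column changes (this variation is our own, following the same recipe):

df = pd.DataFrame(index=["H=0", "H=1"])
df["prior"] = [0.9, 0.1]
df["likelihood"] = [0.975, 1 - 0.875]                # P(Y=0 | H): specificity, false negative rate
df["unnorm"] = df["prior"] * df["likelihood"]
df["posterior"] = df["unnorm"] / df["unnorm"].sum()
df  # posterior: H=0 ~ 0.986, H=1 ~ 0.014 -- a negative test makes infection very unlikely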

Monty Hall Problem

The Monty Hall Problem is based on a game show called Let’s Make a Deal. If you are a contestant on the show, here’s how the game works:

The host, Monty Hall, shows you three closed doors—numbered 1, 2, and 3—and tells you that there is a prize behind each door. One prize is valuable (traditionally a car), the other two are less valuable (traditionally goats). The object of the game is to guess which door has the car. If you guess right, you get to keep the car. Suppose you pick Door 1. Before opening the door you chose, Monty opens Door 3 and reveals a goat. Then Monty offers you the option to stick with your original choice or switch to the remaining unopened door.

To maximize your chance of winning the car, should you stick with Door 1 or switch to Door 2?

So we need to compute:
P(Prize = Door 1 | Opened = Door 3) and P(Prize = Door 2 | Opened = Door 3)

df = pd.DataFrame(index=["Prize = Door 1", "Prize= Door 2", "Prize = Door 3"])
df["prior"] = 1/3
# This is the tricky part. Keep in mind that host doesn't open the door where the prize is.
# P(Opened door 3 | Prize = Door 1) and P(Opened door 3 | Prize = Door 2)
df["observed"] = [1/2, 1, 0]
df["likelihood"] = df["prior"] * df["observed"]
df["marginal likelihood"] = df["likelihood"] / df["likelihood"].sum()
df
                   prior  likelihood    unnorm  posterior
Prize = Door 1  0.333333         0.5  0.166667   0.333333
Prize = Door 2  0.333333         1.0  0.333333   0.666667
Prize = Door 3  0.333333         0.0  0.000000   0.000000

We see that switching to Door 2 doubles the chance of winning, from 1/3 to 2/3: the rational but unintuitive thing to do.
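
As a sanity check, here is a minimal Monte Carlo sketch (our own addition, not from the references) that plays the game many times and estimates the win rate for sticking versus switching:

import random

def simulate(n_trials=100_000, seed=0):
    """Estimate win probabilities for stick vs. switch in the Monty Hall game."""
    rng = random.Random(seed)
    stick_wins = switch_wins = 0
    for _ in range(n_trials):
        prize = rng.randrange(3)   # door hiding the car
        pick = rng.randrange(3)    # contestant's initial choice
        # The host opens a door that is neither the pick nor the prize
        opened = rng.choice([d for d in (0, 1, 2) if d != pick and d != prize])
        # The one remaining closed door the contestant could switch to
        switched = next(d for d in (0, 1, 2) if d not in (pick, opened))
        stick_wins += (pick == prize)
        switch_wins += (switched == prize)
    return stick_wins / n_trials, switch_wins / n_trials

print(simulate())  # roughly (0.333, 0.667)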