Joint probability#
Two or more random variables#
A single random variable is a useful way to describe a sample space and the probabilities assigned to the values on the number line that correspond to outcomes. But more complicated scientific questions often require several random variables and rules for how they interact. We will explore how to generate multiple random variables on the same sample space. Then we will develop an approach to defining probabilities for combinations of random variables: the joint probability mass function and the joint cumulative distribution function.
A new type of sample space#
Suppose we design a sample space \(\mathcal{S}\) and build two random variables, \(X\) and \(Y\), on this space. Because \(X\) and \(Y\) are random variables, we can think of the support of \(X\) and the support of \(Y\) as two new sample spaces. Let's assume that the outcomes in \(supp(X) = \{x_{1},x_{2},x_{3},\cdots,x_{N}\}\) and that the outcomes in \(supp(Y) = \{y_{1},y_{2},\cdots,y_{M}\}\). Then we can define a new set of outcomes in the space \(supp(X) \times supp(Y) = \{(x_{i}, y_{j})\}\) for all combinations of \(i\) from \(1\) to \(N\) and \(j\) from \(1\) to \(M\).
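As a quick sketch of this construction (using small hypothetical supports, not the trial example below), the product space \(supp(X) \times supp(Y)\) can be enumerated directly:

```python
from itertools import product

# Hypothetical supports for X and Y (so N = 3 and M = 2)
supp_X = [-1, 0, 1]
supp_Y = [1, 2]

# All (x_i, y_j) pairs in supp(X) x supp(Y)
outcomes = list(product(supp_X, supp_Y))
print(outcomes)       # [(-1, 1), (-1, 2), (0, 1), (0, 2), (1, 1), (1, 2)]
print(len(outcomes))  # N * M = 6
```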
This new space maps outcomes in \(\mathcal{S}\) to a tuple \((x_{i},y_{j})\), where the outcomes were mapped by \(X\) to the value \(x_{i}\) and by \(Y\) to the value \(y_{j}\). If we assign the probability of \((x_{i},y_{j})\) to be the sum of the probabilities of all outcomes in \(\mathcal{S}\) that \(X\) maps to \(x_{i}\) and \(Y\) maps to \(y_{j}\), then we call this a joint probability distribution.
Example A pharmaceutical company launches a clinical trial to study adverse events from a novel medicine. The trial enrolls patients and randomizes them to receive the novel drug or optimal medical therapy. The trial expects three potential adverse events which we call \(\text{ae}_{1}\), \(\text{ae}_{2}\), and \(\text{ae}_{3}\). We may choose to define a sample space \(\mathcal{S} = \{(\text{nov},\text{ae}_{1}),(\text{nov},\text{ae}_{2}),(\text{nov},\text{ae}_{3}),(\text{omt},\text{ae}_{1}),(\text{omt},\text{ae}_{2}),(\text{omt},\text{ae}_{3})\}\) and further build a random variable \(X\) that maps outcomes to values 0 when an adverse event was experienced by a control patient and 1 when an adverse event was experienced by a novel drug patient, and a random variable \(Y\) that maps adverse event 1 to the value 1, adverse event 2 to the value 2, and adverse event 3 to the value 3.
| Event | X |
|---|---|
| (nov, ae₁) | 1 |
| (nov, ae₂) | 1 |
| (nov, ae₃) | 1 |
| (omt, ae₁) | 0 |
| (omt, ae₂) | 0 |
| (omt, ae₃) | 0 |
| Event | Y |
|---|---|
| (nov, ae₁) | 1 |
| (nov, ae₂) | 2 |
| (nov, ae₃) | 3 |
| (omt, ae₁) | 1 |
| (omt, ae₂) | 2 |
| (omt, ae₃) | 3 |
A joint probability distribution would assign probabilities to all possible pairs of values for \(X\) and \(Y\).
| X | Y | prob |
|---|---|---|
| 1 | 1 | 0.10 |
| 1 | 2 | 0.05 |
| 1 | 3 | 0.23 |
| 0 | 1 | 0.05 |
| 0 | 2 | 0.30 |
| 0 | 3 | 0.27 |
We write \(P(X=x, Y=y)\) to denote the joint probability that the random variable \(X\) equals the value \(x\) and the random variable \(Y\) equals the value \(y\). With the above example in mind, \(P(X=0,Y=2) = 0.30\), and this probability could be described as the probability that an OMT patient experiences adverse event 2.
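A minimal sketch of this lookup in Python, storing the joint distribution from the table above as a dictionary keyed by \((x, y)\) pairs:

```python
# Joint probability mass function from the table above
joint_pmf = {
    (1, 1): 0.10, (1, 2): 0.05, (1, 3): 0.23,
    (0, 1): 0.05, (0, 2): 0.30, (0, 3): 0.27,
}

# P(X=0, Y=2): the probability an OMT patient experiences adverse event 2
print(joint_pmf[(0, 2)])  # 0.30

# A valid joint distribution sums to one over all (x, y) pairs
assert abs(sum(joint_pmf.values()) - 1.0) < 1e-9
```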
Visualization#
Joint probability distributions are often visualized as a table, with one random variable's values labeling the rows and the second random variable's values labeling the columns.
| X \ Y | Y = 1 | Y = 2 | Y = 3 |
|---|---|---|---|
| X = 0 | 0.05 | 0.30 | 0.27 |
| X = 1 | 0.10 | 0.05 | 0.23 |
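One way to render this table programmatically (a sketch using matplotlib, which later code cells in this section also use) is as an annotated heatmap, with rows for the values of \(X\) and columns for the values of \(Y\):

```python
import numpy as np
import matplotlib.pyplot as plt

# Joint distribution from the table above: rows are X = 0, 1; columns are Y = 1, 2, 3
joint = np.array([[0.05, 0.30, 0.27],
                  [0.10, 0.05, 0.23]])

fig, ax = plt.subplots()
im = ax.imshow(joint, cmap="viridis")
ax.set_xticks(range(3))
ax.set_xticklabels(["Y = 1", "Y = 2", "Y = 3"])
ax.set_yticks(range(2))
ax.set_yticklabels(["X = 0", "X = 1"])
# Write each joint probability inside its cell
for i in range(joint.shape[0]):
    for j in range(joint.shape[1]):
        ax.text(j, i, f"{joint[i, j]:.2f}", ha="center", va="center", color="white")
fig.colorbar(im, ax=ax, label="P(X = x, Y = y)")
plt.show()
```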
Marginal and conditional probability#
A joint distribution can be used to compute the probability that the random variable \(X\) equals the value \(x\) regardless of the value of \(Y\), written \(P(X=x)\), and also the probability that the random variable \(Y\) equals the value \(y\) regardless of the value of \(X\), written \(P(Y=y)\). These probabilities are called marginal probabilities.
The marginal probability that \(X\) equals \(0\), or \(P(X=0)\), is computed by summing the joint probabilities where \(X=0\). We can use the same procedure to compute the marginal probability that \(X=1\).
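Using the joint table above,

\[
P(X=0) = P(X=0, Y=1) + P(X=0, Y=2) + P(X=0, Y=3) = 0.05 + 0.30 + 0.27 = 0.62
\]

\[
P(X=1) = 0.10 + 0.05 + 0.23 = 0.38
\]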
The same procedure can be done for the random variable \(Y\) to find marginal probabilities:
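\[
P(Y=1) = 0.10 + 0.05 = 0.15, \qquad P(Y=2) = 0.05 + 0.30 = 0.35, \qquad P(Y=3) = 0.23 + 0.27 = 0.50
\]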
Joint distributions need not be restricted to only two random variables. We can build joint distributions of any number of random variables and can compute marginal probabilities in the same way that we do for a joint distribution of two random variables.
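As a small sketch (with a made-up three-variable distribution, not one from this chapter), storing a joint distribution as an array makes marginalizing over any subset of variables a sum over the corresponding axes:

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up joint distribution for three random variables,
# stored as a 2 x 3 x 4 array indexed by their value indices
joint = rng.random((2, 3, 4))
joint /= joint.sum()  # normalize so all probabilities sum to one

# Marginal of the first variable: sum out the other two
p_first = joint.sum(axis=(1, 2))

# Marginal of the first two variables: sum out the third only
p_first_two = joint.sum(axis=2)

print(p_first, p_first.sum())  # a valid marginal distribution; sums to 1
```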
The rules, laws, and theorems of probability that we learned in chapter one carry over to random variables. This is because we can consider a new sample space of values that our random variables can take. Suppose we build two random variables: \(X\), which maps outcomes to the values -1, 0, 1, and \(Y\), which maps outcomes to the values 1, 2, 3. We can define a new sample space \(\mathcal{S} = \{(x,y) \mid x \in supp(X) \text{ and } y \in supp(Y) \}\). With our new sample space, we can now discuss statements like \(P(X = x \mid Y = y)\).
Example If we continue with our pharmaceutical example, we can build a new sample space \(\mathcal{S} = \{(0,1),(0,2),(0,3),(1,1),(1,2),(1,3)\}\) and compute, for example, \(P(X=1 \mid Y=2) = \frac{P(X=1,Y=2)}{P(Y=2)} = \frac{0.05}{0.35} \approx 0.14\).
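A minimal sketch of the same computation in Python, reusing the dictionary representation of the joint table:

```python
joint_pmf = {
    (1, 1): 0.10, (1, 2): 0.05, (1, 3): 0.23,
    (0, 1): 0.05, (0, 2): 0.30, (0, 3): 0.27,
}

# P(Y=2): marginalize the joint distribution over X
p_y2 = sum(p for (x, y), p in joint_pmf.items() if y == 2)

# P(X=1 | Y=2) = P(X=1, Y=2) / P(Y=2)
p_x1_given_y2 = joint_pmf[(1, 2)] / p_y2
print(round(p_x1_given_y2, 2))  # 0.14
```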
We can define the conditional probability of the value of one random variable given the value of another as

\[
P(X=x \mid Y=y) = \frac{P(X=x, Y=y)}{P(Y=y)}
\]
and so define statistical independence between two random variables, \(A\) and \(B\), as

\[
P(A=a \mid B=b) = P(A=a) \text{ for every } a \in supp(A) \text{ and } b \in supp(B).
\]
We can also translate the law of total probability from events to random variables. The events \([Y=y_{1}], [Y=y_{2}], [Y=y_{3}], \cdots, [Y=y_{n}]\) partition the sample space, so the event \([X=x]\) is the disjoint union of the events \([X=x, Y=y_{i}]\), and

\[
P(X=x) = \sum_{i=1}^{n} P(X=x, Y=y_{i}) = \sum_{i=1}^{n} P(X=x \mid Y=y_{i})\,P(Y=y_{i}).
\]
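For instance, applying this to the joint table above with the partition \([Y=1], [Y=2], [Y=3]\),

\[
P(X=1) = \sum_{j=1}^{3} P(X=1 \mid Y=y_{j})\,P(Y=y_{j}) = 0.10 + 0.05 + 0.23 = 0.38.
\]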
By thinking of the values of a random variable as outcomes in a new sample space, we can apply our past intuition and the mechanics of events and sets to a collection of random variables.
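The same ideas extend to continuous random variables, where a joint probability density function plays the role of the joint probability mass function. The code below plots the joint density \(f(x, y) = 4xy\,e^{-(x^{2} + y^{2})}\) for \(x, y \geq 0\) as a surface.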
```python
import numpy as np
import matplotlib.pyplot as plt

# Define the joint PDF
def f(x, y):
    return 4 * x * y * np.exp(-(x**2 + y**2))

# Create a grid over [0, 3] x [0, 3]
x = np.linspace(0, 3, 300)
y = np.linspace(0, 3, 300)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

# Plot the surface
fig = plt.figure(figsize=(10, 6))
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z, cmap="viridis", edgecolor='none')
ax.set_title(r"$f(x, y) = 4xy e^{-(x^2 + y^2)}$")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("f(x, y)")
plt.tight_layout()
plt.show()
```

```python
import numpy as np
import plotly.graph_objects as go
import plotly.io as pio

# Use a compatible renderer; try "iframe" or "notebook_connected" if needed
pio.renderers.default = "notebook"

# Define the joint PDF
def f(x, y):
    return 4 * x * y * np.exp(-(x**2 + y**2))

# Grid over [0, 3] x [0, 3]
x = np.linspace(0, 3, 150)
y = np.linspace(0, 3, 150)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

# Interactive surface plot
fig = go.Figure(data=[go.Surface(z=Z, x=X, y=Y, colorscale='Viridis')])
fig.update_layout(
    title=r"$f(x, y) = 4xy e^{-(x^2 + y^2)}$",
    scene=dict(xaxis_title='x', yaxis_title='y', zaxis_title='f(x, y)')
)
fig.show()
```
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

# Grid for all plots
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
pos = np.dstack((X, Y))

# Covariance configurations to compare
covariances = [
    ([[1, 0], [0, 1]], "ρ = 0"),
    ([[1, 0.8], [0.8, 1]], "ρ = 0.8"),
    ([[1, -0.8], [-0.8, 1]], "ρ = -0.8"),
    ([[2, 0.6], [0.6, 1]], "σ₁ ≠ σ₂, ρ = 0.6"),
]

# Create a 2x2 grid of contour plots
fig, axes = plt.subplots(2, 2, figsize=(10, 10))
for ax, (cov, title) in zip(axes.flat, covariances):
    rv = multivariate_normal([0, 0], cov)
    Z = rv.pdf(pos)
    ax.contour(X, Y, Z, levels=10, cmap='viridis')
    ax.set_title(title)
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_aspect('equal')
plt.tight_layout()
plt.suptitle("Iso-density Contours of Bivariate Normal Distributions", fontsize=16, y=1.02)
plt.show()
```

Exercises#
Consider the following joint distribution of random variables \(Q\) and \(R\) that are mapped from the sample space \(\mathcal{G}\):

| Q | R | prob |
|---|---|---|
| 2 | 1 | 0.01 |
| 2 | 2 | 0.075 |
| 2 | 3 | 0.13 |
| 1 | 1 | 0.1 |
| 1 | 2 | 0.05 |
| 1 | 3 | 0.17 |
| 0 | 1 | 0.05 |
| 0 | 2 | 0.30 |
| 0 | 3 | 0.115 |
1. What is the implied support of \(Q\)?
2. What is the implied support of \(R\)?
3. Compute \(P(Q=1, R=2)\)
4. Compute the marginal probabilities for \(Q\)
5. Compute the marginal probabilities for \(R\)
6. Compute \(P(R=1 \mid Q=0)\)
7. The random variable \(Q\) is called statistically independent from \(R\) if for every value \(q \in supp(Q)\) and for every value \(r \in supp(R)\) the following is true: \(P(Q=q \mid R=r) = P(Q=q)\). Is \(Q\) statistically independent from \(R\)? Why or why not?
8. For two events \(A\) and \(B\) that are statistically independent, we found that \(P(A \cap B) = P(A)P(B)\). Please derive an equivalent expression for two random variables by applying the definition of conditional probability and statistical independence to random variables.
Consider again the joint distribution of random variables \(Q\) and \(R\) that are mapped from the sample space \(\mathcal{G}\):

| Q | R | prob |
|---|---|---|
| 2 | 1 | 0.01 |
| 2 | 2 | 0.075 |
| 2 | 3 | 0.13 |
| 1 | 1 | 0.1 |
| 1 | 2 | 0.05 |
| 1 | 3 | 0.17 |
| 0 | 1 | 0.05 |
| 0 | 2 | 0.30 |
| 0 | 3 | 0.115 |
1. Please compute \(\mathbb{E}(Q)\)
2. Please compute \(\mathbb{E}(R)\)
3. Please compute \(V(Q)\)
4. Please compute \(V(R)\)
Kimberly Palaguachi-Lopez (Class of 2024): Consider the following joint distribution of random variables \(Z\) and \(L\) that are mapped from the sample space \(\mathcal{G}\):
| Z | L | prob |
|---|---|---|
| 1 | 0 | 0.05 |
| 2 | 0 | 0.15 |
| 3 | 0 | 0.19 |
| 2 | 1 | 0.02 |
| 1 | 1 | 0.12 |
| 3 | 1 | 0.04 |
| 2 | 0.5 | 0.10 |
| 3 | 0.5 | 0.13 |
| 1 | 0.5 | 0.20 |
1. Please compute \(supp(Z)\)
2. Please compute \(supp(L)\)
3. Solve for \(P(Z=1, L=1)\)
4. Solve for the respective marginal probabilities of \(Z\)
5. Solve for the respective marginal probabilities of \(L\)
6. Is the probability distribution for the random variable \(Z\) a valid probability distribution? Why or why not? Is the probability distribution for \(L\) a valid probability distribution? Why or why not?