Homework06
Please write Python code that simulates 8000 super experiments. For each super experiment, you will ask the computer to draw 20 samples from a Poisson distribution with rate \(\lambda = 3\). For each super experiment, please compute and store the mean. You will finish your simulation with a list of 8000 means called M. For each value in M, subtract 3 and divide by the square root of 3. Call this list Z. Please plot a histogram of Z. Use the Central Limit Theorem and properties of the Normal distribution to justify the shape, location, and scale of this empirical density.
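One way to set up the simulation described above is sketched here. It is a minimal sketch, not a reference solution: the function names `simulate_means`, `standardize`, and `plot_histogram`, the fixed seed, and the number of histogram bins are all assumptions made for illustration.

```python
import numpy as np


def simulate_means(n_super=8000, n_samples=20, lam=3.0, seed=0):
    """Return one sample mean per super experiment (hypothetical helper)."""
    rng = np.random.default_rng(seed)  # seed fixed only for reproducibility
    # One row per super experiment, one column per Poisson draw.
    samples = rng.poisson(lam=lam, size=(n_super, n_samples))
    return samples.mean(axis=1).tolist()


def standardize(M, lam=3.0):
    """Subtract lam and divide by sqrt(lam), exactly as the exercise states."""
    return ((np.asarray(M) - lam) / np.sqrt(lam)).tolist()


def plot_histogram(Z, bins=40):
    """Draw a density-scaled histogram of Z (matplotlib imported lazily)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.hist(Z, bins=bins, density=True)
    ax.set_xlabel("z")
    ax.set_ylabel("empirical density")
    return fig


M = simulate_means()
Z = standardize(M)
# Summary statistics to compare against what the CLT predicts.
print("mean of Z:", np.mean(Z), "std of Z:", np.std(Z))
```

Calling `plot_histogram(Z)` produces the histogram. The printed mean and standard deviation of Z can then be compared with the location and scale that the Central Limit Theorem predicts.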
Assume that \((X_{1}, X_{2}, \cdots, X_{10}) \sim \text{Binomial}(20,p)\). That is, we sample \(10\) random variables that are all Binomially distributed, with the number of trials equal to 20 and the probability of success on a single trial equal to \(p\).
Write down the distribution of \(\bar{X}\) using the central limit theorem.
Using the properties of the Normal, write down the distribution of \(\bar{X} - \mathbb{E}(X)\).
Using the properties of the Normal, write down the distribution of \(\frac{\bar{X} - \mathbb{E}(X)}{\frac{\sigma}{\sqrt{10}}}\).
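As a reference for the three parts above, the general Central Limit Theorem approximation can be written out as follows. This is a sketch under stated assumptions: the samples are independent and identically distributed, and the symbols \(\mu = \mathbb{E}(X)\), \(\sigma^{2} = \text{Var}(X)\), and \(n\) for the sample size are introduced here for illustration. For a \(\text{Binomial}(20,p)\) random variable, \(\mu = 20p\) and \(\sigma^{2} = 20p(1-p)\), and in this problem \(n = 10\).

```latex
\begin{align*}
\bar{X} &\overset{\text{approx}}{\sim} N\left(\mu, \frac{\sigma^{2}}{n}\right) \\
\bar{X} - \mu &\overset{\text{approx}}{\sim} N\left(0, \frac{\sigma^{2}}{n}\right) \\
\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} &\overset{\text{approx}}{\sim} N(0, 1)
\end{align*}
```

The second and third lines follow from the first because shifting a Normal random variable by a constant shifts its mean, and dividing by a constant \(c\) divides its variance by \(c^{2}\).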
Suppose \(Y \sim N(0,1)\) is a random variable with a standard normal distribution. Further, suppose that we wish to find a value \(v\) such that \(P( -v < Y < v ) = P( Y < v ) - P(Y < -v) = 0.95\).
Write down \(P( -v < Y < v )\) as an integral statement.
Use the quad function from the scipy.integrate module to create a function called find_value. The function find_value takes as input a value v and outputs \(P( -v < Y < v )\).
Finally, iterate through values of v from 0.1 to 3 by 0.01 until you find a value \(v\) such that \(P( -v < Y < v ) = 0.95\). Please report that value.
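The search described above might be sketched as follows. The helpers `standard_normal_pdf` and `search_v` are hypothetical names introduced for illustration. Because no value on a 0.01 grid gives a probability of exactly 0.95, this sketch assumes that "find" means "the first grid value whose probability reaches at least 0.95".

```python
import numpy as np
from scipy.integrate import quad


def standard_normal_pdf(y):
    """Density of N(0, 1) evaluated at y."""
    return np.exp(-(y ** 2) / 2) / np.sqrt(2 * np.pi)


def find_value(v):
    """Return P(-v < Y < v) for Y ~ N(0, 1) by numerical integration."""
    prob, _abs_error = quad(standard_normal_pdf, -v, v)
    return prob


def search_v(target=0.95, start=0.1, stop=3.0, step=0.01):
    """Return the first grid value v with find_value(v) >= target.

    Assumption: exact equality is replaced by 'reaches at least target'.
    """
    n_steps = int(round((stop - start) / step))
    for k in range(n_steps + 1):
        v = round(start + k * step, 2)  # rounding avoids float drift on the grid
        if find_value(v) >= target:
            return v
    return None  # target not reached on this grid


v_star = search_v()
print("v =", v_star, "P(-v < Y < v) =", find_value(v_star))
```

Building the grid from an integer counter, rather than repeatedly adding 0.01, keeps the candidate values free of accumulated floating-point error.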
We will use the simulation below to explore the Law of Large Numbers (LLN) and how the LLN can be used to estimate parameters for a supposed sequence of random variables.
Please build a function called experiment that takes as input the value \(n\) and returns the mean of a sample \((X_{1}, X_{2}, \cdots, X_{n}) \sim \text{Geom}(1/4)\).
Please plot the mean for \(n\) values from 5 to 5000 by 5. What do you observe happening?
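A minimal sketch of this function and the sweep over \(n\) is given here. It assumes the geometric distribution counts the number of trials up to and including the first success (support starting at 1, expected value \(1/p\)), which is the convention NumPy uses. The default argument `p`, the fixed seed, and the helper `plot_means` are assumptions added for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed fixed only for reproducibility


def experiment(n, p=0.25):
    """Return the mean of n draws from Geom(p) with support 1, 2, 3, ..."""
    return rng.geometric(p, size=n).mean()


def plot_means(n_values, means, expected=4.0):
    """Plot sample mean against n (matplotlib imported lazily)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(n_values, means, lw=0.8)
    ax.axhline(expected, ls="--", color="black")  # 1/p for p = 1/4
    ax.set_xlabel("n")
    ax.set_ylabel("sample mean")
    return fig


n_values = np.arange(5, 5001, 5)  # 5, 10, ..., 5000
means = np.array([experiment(n) for n in n_values])
```

Calling `plot_means(n_values, means)` draws the curve together with a dashed reference line at 4.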
Please build a function called experiment2 that takes as input the value \(n\) (the number of mini experiments, in other words \(\mathcal{D} = [x_{1}, x_{2}, \cdots x_{n}]\)) and the value \(m\), the number of super experiments. The function will output a list of \(m\) values, each equal to \( | \bar{\mathcal{D}} - 4 |\).
Please create a 2x2 figure that plots a histogram of the outputs from experiment2 for (\(n=10\) mini experiments and \(m=100\) super experiments); (\(n=100\) mini experiments and \(m=100\) super experiments); (\(n=500\) mini experiments and \(m=100\) super experiments); (\(n=5000\) mini experiments and \(m=100\) super experiments). What do you observe happening?
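The function and the 2x2 figure could be sketched as follows. This sketch assumes each mini experiment is a draw from \(\text{Geom}(1/4)\), as in the previous part, so that the subtracted value 4 equals \(1/p\). The `p` and `seed` arguments and the helper `plot_grid` are hypothetical additions for illustration.

```python
import numpy as np


def experiment2(n, m, p=0.25, seed=None):
    """Return a list of m values |mean(D) - 1/p|, each D holding n Geom(p) draws."""
    rng = np.random.default_rng(seed)
    # One row per super experiment, one column per mini experiment.
    samples = rng.geometric(p, size=(m, n))
    return np.abs(samples.mean(axis=1) - 1 / p).tolist()


def plot_grid(cases=((10, 100), (100, 100), (500, 100), (5000, 100)), seed=0):
    """Draw one histogram per (n, m) pair on a 2x2 grid (matplotlib imported lazily)."""
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(2, 2, figsize=(8, 6))
    for ax, (n, m) in zip(axes.flat, cases):
        ax.hist(experiment2(n, m, seed=seed), bins=15)
        ax.set_title(f"n={n}, m={m}")
        ax.set_xlabel("|mean - 4|")
    fig.tight_layout()
    return fig


# Average absolute deviation for the smallest and largest n.
small_n = np.mean(experiment2(10, 100, seed=1))
large_n = np.mean(experiment2(5000, 100, seed=1))
print(small_n, large_n)
```

Calling `plot_grid()` produces the four panels. Each panel keeps its own x-axis scale, so the tick labels are what show the change in the size of the deviations across panels.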
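The law of large numbers (LLN) is the following mathematical statement: Given a sequence of independent and identically distributed random variables \(X_{1}, X_{2}, \cdots X_{n}, \cdots\) with expected value \(\mathbb{E}(X)\), for every \(\epsilon > 0\)
\[ \lim_{n \to \infty} P\left( \left| \bar{X}_{n} - \mathbb{E}(X) \right| > \epsilon \right) = 0, \]
where \(\bar{X}_{n} = \frac{1}{n} \sum_{i=1}^{n} X_{i}\).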
Intuitively, the LLN says that if you collect a large enough sample, then the empirical mean can be made arbitrarily close to the true expected value of a sequence of random variables. The LLN acts as a bridge between theoretical probability/statistics and empirical modeling. Let's see how.
Let \((Y_{1}, Y_{2}, \cdots, Y_{n}) \sim \text{Geom}(p)\). The LLN states that \(\bar{Y}\) will approach the true expected value of a random variable with a \(\text{Geom}(p)\) distribution. This means that an estimate of the true expected value is the empirical mean, or
\[ \mathbb{E}(Y) \approx \bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_{i}. \]
Suppose we are measuring the number of days an individual is contagious with RSV once infected. We measure the time from the date of infection to the date that the individual clears the infection for 100 individuals. Our statistical setup will assume that the number of days contagious follows a geometric distribution, or \((Y_{1}, Y_{2}, \cdots, Y_{n}) \sim \text{Geom}(p)\).
(1) Please simulate 100 geometric random variables with \(p = 1/10\).
(2) Compute the mean from this sample and describe how this answer relates to the LLN.
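Steps (1) and (2) might be sketched as follows. The sketch assumes the geometric distribution has support starting at 1, so that its expected value is \(1/p\). The function name `simulate_contagious_days` and the fixed seed are hypothetical choices for illustration.

```python
import numpy as np


def simulate_contagious_days(n=100, p=1 / 10, seed=0):
    """Draw n values from Geom(p); each value stands for days contagious."""
    rng = np.random.default_rng(seed)  # seed fixed only for reproducibility
    return rng.geometric(p, size=n)


p = 1 / 10
days = simulate_contagious_days(n=100, p=p)
sample_mean = days.mean()

# Under the support-from-1 convention the true expected value is 1/p.
print("sample mean:", sample_mean, "true expected value:", 1 / p)
```

Rerunning with a different `seed`, or with a larger `n`, shows how much the sample mean varies around \(1/p\) at a given sample size.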