Chapter 14: Sampling variation and quality

STAT 1010 - Fall 2022

Learning outcomes

Find kurtosis (\(K_4\))
If \(n > 10 |K_4|\), where \(n\) is the sample size, then a normal model adequately approximates the distribution of the sample mean \(\bar{X}\).
If the data are known to come from a normal distribution, the sample mean \(\bar{X}\) is normally distributed for any sample size.
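
The sample size condition can be checked numerically. The following is a minimal sketch in R, assuming \(K_4\) is computed as the average fourth power of the standardized values minus 3. The helper names `kurtosis_k4` and `normal_model_ok` and the simulated data are illustrative and not part of the course materials.

```r
# K4: average fourth power of the standardized values, minus 3
# (assumed definition; note that sd() uses the n - 1 denominator)
kurtosis_k4 <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^4) - 3
}

# Sample size condition: n > 10 * |K4|
normal_model_ok <- function(x) {
  length(x) > 10 * abs(kurtosis_k4(x))
}

set.seed(1)                # arbitrary seed, for reproducibility only
x <- rexp(50, rate = 1)    # simulated right-skewed data (illustrative)
kurtosis_k4(x)             # estimated K4
normal_model_ok(x)         # TRUE if n is large enough for the normal model
```

Skewed or heavy-tailed data give a larger \(|K_4|\), so they need a larger sample before the normal model for \(\bar{X}\) is adequate.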

\[ SD(\bar{X}) = SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} \]

If \(X \sim N(\mu_X, \sigma_X^2)\), then the mean of a sample of size \(n\) satisfies

\[ \bar{X} \sim N(\mu = \mu_X, \sigma^2 = \frac{\sigma_X^2}{n}) \]

Let \(Y \sim N(\mu = 5, \sigma^2 = 16)\). Find the distribution of the mean of repeated samples of size 4.
- \(\bar{Y} \sim N(\mu = 5, \sigma^2 = 4)\), since \(\sigma^2 / n = 16 / 4 = 4\)
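
The example above can be checked with a short simulation. This is a sketch only: the seed and the number of replications are arbitrary choices, and the variable names are illustrative.

```r
mu_y    <- 5          # population mean of Y
sigma_y <- sqrt(16)   # population SD of Y
n       <- 4          # sample size

se_ybar  <- sigma_y / sqrt(n)   # standard error of the sample mean
var_ybar <- se_ybar^2           # variance of the sample mean

# Draw many samples of size n and record each sample mean
set.seed(2022)
ybars <- replicate(10000, mean(rnorm(n, mean = mu_y, sd = sigma_y)))

mean(ybars)   # should be close to mu_y
var(ybars)    # should be close to var_ybar
```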

If mean production falls outside certain values, we may need to stop and recalibrate machinery.
These values are called control limits.
The process is treated as in control when \(\mu - L \leq \bar{X} \leq \mu + L\).
\(\mu - L\) and \(\mu + L\) are the control limits.

False positives - type 1 error
- act when you should not
- probability of occurrence denoted \(\alpha\)
False negatives - type 2 error
- don’t act when you should
- probability of occurrence denoted \(\beta\)
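
Both error probabilities can be computed from a normal model for \(\bar{X}\). The sketch below uses made-up numbers throughout (the in-control mean, standard error, limit half-width, and shifted mean are all assumptions chosen for illustration): \(\alpha\) is the chance of falling outside the limits when the process is in control, and \(\beta\) is the chance of staying inside the limits after the mean has shifted.

```r
# Illustrative values only (assumptions, not from the notes)
mu0 <- 100   # in-control process mean
se  <- 2     # standard error of the sample mean, sigma / sqrt(n)
L   <- 5     # half-width of the control limits
mu1 <- 104   # hypothetical mean after the process has shifted

lower <- mu0 - L
upper <- mu0 + L

# alpha: P(X-bar outside the limits | process in control)
alpha <- pnorm(lower, mean = mu0, sd = se) +
  pnorm(upper, mean = mu0, sd = se, lower.tail = FALSE)

# beta: P(X-bar inside the limits | mean has shifted to mu1)
beta <- pnorm(upper, mean = mu1, sd = se) - pnorm(lower, mean = mu1, sd = se)

c(alpha = alpha, beta = beta)
```

Widening the limits (larger `L`) lowers `alpha` but raises `beta`, which is the trade-off behind the choice of control limits.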

If \(\bar{X} \sim N(\mu = 12, \sigma^2 = 2.3)\) how can we find the control limits?


We want \(\alpha\) to be \(0.025\), split equally between the two tails (\(0.0125\) in each).
- \(Pr(\bar{X} < \mu - L \textrm{ or } \bar{X} > \mu + L) = 0.025\)

Lower control limit:
- qnorm(0.0125) \(\approx -2.241403\)
- \(\begin{aligned}-2.241403 &= \frac{\bar{x} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{x} - 12}{\sqrt{2.3}} \\ \bar{x} &= -2.241403\sqrt{2.3} + 12 \approx 8.600744 \end{aligned}\)
- qnorm(0.0125, mean = 12, sd = sqrt(2.3))

Upper control limit:
- qnorm(1 - 0.0125) \(\approx 2.241403\)
- \(\begin{aligned}2.241403 &= \frac{\bar{x} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{x} - 12}{\sqrt{2.3}} \\ \bar{x} &= 2.241403\sqrt{2.3} + 12 \approx 15.39926 \end{aligned}\)
- qnorm(1 - 0.0125, mean = 12, sd = sqrt(2.3))
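
The two control limits computed above can be obtained together and checked in R. This is a small sketch; the variable names are illustrative.

```r
mu_xbar <- 12          # mean of the sample mean
sd_xbar <- sqrt(2.3)   # SD (standard error) of the sample mean
alpha   <- 0.025       # desired false positive probability

# Put alpha / 2 in each tail
lower <- qnorm(alpha / 2, mean = mu_xbar, sd = sd_xbar)
upper <- qnorm(1 - alpha / 2, mean = mu_xbar, sd = sd_xbar)
L     <- upper - mu_xbar   # half-width, so the limits are mu - L and mu + L

# Check: probability of falling outside the limits when in control
p_outside <- pnorm(lower, mean = mu_xbar, sd = sd_xbar) +
  pnorm(upper, mean = mu_xbar, sd = sd_xbar, lower.tail = FALSE)

c(lower = lower, upper = upper, L = L, p_outside = p_outside)
```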

We are not testing once, but multiple times. Assuming independence:

\(\begin{aligned}P(\textrm{within limits for 10 days}) &= P(\textrm{within limits for day 1}) \cdot P(\textrm{within limits for day 2}) \cdot \dots \cdot P(\textrm{within limits for day 10})\\ &= 0.975^{10} \approx 0.7763296 \end{aligned}\)
The probability of at least one false positive in 10 days is \(1 - 0.7763296 = 0.2236704\), or about 22.4%.
Management must decide whether a signal is a false positive by checking for mechanical errors and inspecting equipment.
Adjust (lower) the \(\alpha\) used for each test to address this.
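
The effect of repeated testing, and one way to adjust \(\alpha\), can be sketched in R. Both helper functions are hypothetical names introduced here, the calculation assumes independent tests as above, and the 5% overall target is an illustrative assumption rather than a value from the notes.

```r
# P(at least one false positive in k independent tests at level alpha)
false_alarm_rate <- function(alpha, k) {
  1 - (1 - alpha)^k
}

# Per-test alpha that gives a chosen overall false positive rate over k tests
per_test_alpha <- function(target, k) {
  1 - (1 - target)^(1 / k)
}

false_alarm_rate(0.025, 10)              # the 10-day rate from the example

alpha_day <- per_test_alpha(0.05, 10)    # 0.05 overall target (assumed)
alpha_day
false_alarm_rate(alpha_day, 10)          # recovers the overall target
```

A smaller per-test \(\alpha\) widens the control limits, so it lowers the false positive rate at the cost of a higher \(\beta\).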

X-bar charts are slow to detect underfilling or overfilling.