# Chapter 15: Confidence intervals

STAT 1010 - Fall 2022

# Learning outcomes

• Building confidence intervals for the population proportion
• Building confidence intervals for the population mean
• Understanding the $t$ distributions
• Interpreting confidence intervals
• Understand the margin of error and how to use it to compute sample sizes

# Confidence interval for proportion

• What percent of customers will refinance their loans?
• What percentage of employees are in favor of a uniform change?
• What percentage of voters will vote for Fetterman?

## Variability

• dependent on $p$ - fatter in middle, skinnier near 0 and 1
• dependent on $n$, number of samples, bigger sample, less variability

## Variability - math

$\hat{p} \sim N(\mu = p, \sigma^2 = \frac{p(1-p)}{n})$

• numerator of the variance is similar to that of a Binomial distribution
• as sample size increases, variance decreases

## CI - math

$\hat{p} \pm z_{\alpha/2} \cdot \sqrt{\hat{p}(1-\hat{p})/n}$ For $95\%$ CI, $z_{\alpha/2} = 1.96$

As confidence level increases, interval widens

In R: qnorm(0.025)

## Assumptions

• SRS condition The observed sample is SRS from the appropriate population. If population is finite, less than 10% of total population
• Sample size condition (for proportion) both $n\hat{p}$ and $n(1-\hat{p})$ are larger than 10.

## Example 1

• Let $\hat{p}= 0.14$ and $n = 350$, find a $95\%$ CI for the proportion

• \begin{aligned} se(\hat{p}) &= \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} &= \sqrt{\frac{0.14(1-0.14)}{350}}\end{aligned} \approx 0.0185

• lower value: $\hat{p} - 1.96 \cdot 0.0185 \approx 0.10374$

• upper value: $\hat{p} + 1.96 \cdot 0.0185 \approx 0.17626$

## Example 2

An auditor checks a sample of 225 randomly chosen transactions from among the thousands processed in an office. Thirty-five contains errors in crediting or debiting the appropriate account

a. Does this situation meet the conditions required for a $z$-interval for the proportion?

b. Find the $95\%$ confidence interval for $p$, the proportion of all transactions processed in the this office that have these errors.

c. Managers claim that the proportion of errors is about $10\%$. Does that seem reasonable?

## Example 2 - solns

1. yes, $\hat{p} = 35/225 \approx 0.156$ so $n\hat{p}, n(1-\hat{p})> 10$ and we assume sampled $<10\%$

2. \begin{aligned} \hat{p} \pm z_{\alpha/2} \cdot \sqrt{\hat{p}(1-\hat{p})/n} &= 0.156 \pm 1.96 \sqrt{0.156(1-0.156)/225}\\ &= 0.156 \pm 1.96 \cdot 0.0242 \\ &= 0.156 \pm 0.047\\ & [0.109, 0.203]\end{aligned}

3. No $10\%$ is too low, with $95\%$ confidence it is higher

# Confidence interval for mean

• Average weight for filled ice cream containers
• Average purchases from a website in US## Variability • standard deviation of distribution • higher sd -> wider CI • sample size • smaller sample size -> wider CI ## CI - mean $\bar{x} \pm t_{\alpha/2, n-1} \cdot \frac{s}{\sqrt{n}}$ ## T - distribution • $Z = \frac{X-\mu_X}{\sigma_X}$ • $\bar{X} \sim N(\mu = \mu\_X, \sigma^2 = \frac{\sigma_X^2}{n})$ • $T_{n-1} = \frac{\bar{X}-\mu_X}{S/\sqrt{n}}$ • $t$ distribution has more density in the tails to account for smaller sample sizes ## Variability • dependent on $confidence \text{ }level$ - as confidence level increases so does the width of the interval • dependent on $n$, number of samples, bigger sample, less variability ## Assumptions • SRS condition The observed sample is SRS from the appropriate population. If population is finite, less than 10% of total population • Sample size condition the sample size is larger than 10 times the absolute value of the kurtosis, $n>10|K_4|$. ## Example 1 • Let $\bar{x}= \3285$, $n = 150$, and $s = \238$ find a $95\%$ CI for the mean • \begin{aligned} se(\bar{x}) &= s/\sqrt{n} &= 238/\sqrt{150}\end{aligned} \approx 19.43262 • qt(.025, df = 149) $\approx -1.976$ • lower value: $3285 - 1.976 \cdot 19.43262 \approx \3,320$ • upper value: $3285 + 1.976 \cdot 19.43262 \approx \3,247$ • 3285 + qt(.025, df = 149)* 238/sqrt(150) ## Example 2 Office administrators claim that the average amount on a purchase order is6,000. A SRS of 49 purchase orders averages $\bar{x} = \4,200$ with $s = \3,500$.

1. What is the relevant sampling distribution?
2. Find the 95% confidence interval for $\mu$, the mean of purchase orders handled by this office during the sampling period.
3. Do you think the administrators claim is reasonable?

## Example 2 - solns

1. $N(\mu, \sigma^2/49)$

2. \begin{aligned} 4200 \pm 2.01 \cdot 3500/\sqrt{49} &= 4200 \pm 2.01 \cdot 500 \\ &= 4200 \pm -1005.317 \\ &\approx [3195, 5205]\end{aligned}

3. No $\6,00$ is way above the $95\%$ confidence interval

# Interpreting CI

“We can be 95% confident that the average purchase order handled by this office lies between $3195 and$5205.”

# Manipulating CI

In example from means. How could we update the CI if we know we had 10 offices with similar purchase patterns.

• If $[L \text{ to } U]$ is a $100(1 – \alpha)%$ confidence interval for $\mu$

• then $[c \cdot L \text{ to } c \cdot U]$ is a $100 (1 – \alpha)%$ confidence interval for $c \cdot \mu$

• and $[c + L \text{ to } c + U]$ is a $100(1 – \alpha)%$ confidence interval for $c + \mu$.

## Example 2 - solns

$10 \cdot [3195, 5205] = [31950, 52050]$

# Margin of error

$\bar{x} \pm \frac{2s}{\sqrt{n}}$

• Margin of error = $\frac{2s}{\sqrt{n}}$

## Sample size - mean

$n = \frac{4s^2}{\text{(Margin of error)}^2}$

• For $p$ use $p = 0.5$ for biggest variance