The R functions we have discussed in these weeks:

 functions qnorm() pnorm() cov() cor() mutate() View() tibble() slice() exp() lfactorial() dbinom() dpois() slice_sample() c()

# Lecture 7

$$\mu = E(X)$$ $$= x_1p(x_1) + x_2p(x_2) + ... + x_np(x_n)$$

How to define a pdf!

-$$E(X \pm c) = E(X) \pm c$$

-$$SD(X \pm c) = SD(X)$$

-$$Var(X \pm c) = Var(X)$$

## Multiplication rules

-$$E(cX) = cE(X)$$

-$$SD(cX) = |c|SD(X)$$

-$$Var(cX) = c^2Var(X)$$

-$$E(cX \pm a)=cE(X) \pm a$$

- $$Var(cX \pm a)=c^2Var(X)$$

# Lecture 8

## Describing scatter plots

• Homo and heteroskedastic

## Covariance look at a plot and know if positive or negative

$$cov(x, y) = \frac{(x_1 - \bar{x})(y_1 - \bar{y}) + (x_2 - \bar{x})(y_2 - \bar{y}) + \ldots + (x_n - \bar{x})(y_n - \bar{y})}{n-1}$$

## Correlation - look at a plot and approximate

$corr(x, y) = \frac{cov(x, y)}{s_x \cdot s_y}$

1. Referred to as $$r$$
2. Strength of linear association
3. $$r$$ is always between $$-1$$ and $$+1$$, $$-1 \leq r \leq 1$$.
4. $$r$$ does not have units

## Fitting a line

$$m = \frac{r \cdot s_y}{s_x}$$

$$b = \bar{y} - m \bar{x}$$

## Predicting

library(tidymodels)

## Assign the least
## squares line
least_squares_fit <-
linear_reg() %>%
set_engine("lm") %>%
fit(price ~ carat, data = diamonds)
## NOTE: outcome first, predictor second

## Find the prediction
predict(least_squares_fit, tibble(carat = c(1, 2, 2.5, 4)))
# A tibble: 4 × 1
.pred
<dbl>
1  5500.
2 13256.
3 17135.
4 28769.

## Spurious correlations

birth order associated with increased risk of downs syndrome, etc…

# Lecture 9

## The Sharpe ratio

$$S(X) = \frac{\mu_X - r_f}{\sigma_X}$$ - higher better - how to compute with a calculator from a pdf and also with just $$\mu$$ and $$\sigma$$ - computing in R

## Joint pdfs

Find probability given values Are they independent or not? What do increasing and decreasing joint pdfs look like?

$$E(X+Y) = E(X) + E(Y)$$ regardless of independence

## Covariance of RV

$$Cov(X, Y) = E((X - \mu_X)(Y - \mu_Y))$$ $$Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)$$ If $$X$$,$$Y$$ are independent, then $$Cov(X, Y) = 0$$, opposite not true If $$X$$, $$Y$$ are independent, $$Cov(X, Y) = 0$$, so $$Var(X+ Y) = Var(X) + Var(Y)$$

## Correlation

$$corr(x, y) = \frac{cov(x, y)}{s_x \cdot s_y}$$ $\rho = Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \cdot \sigma_Y}$ ## IID variables $$E(aX + bY + c) = aE(X) + bE(Y) + c$$ $$Var(aX + bY + c) = a^2Var(X) + b^2Var(Y) + 2abCov(X, Y)$$

# Lecture 10

## Bernoulli trial

expectation and variance

## Binomial

$$Y = B_1 + B_2 + ... + B_n$$, where $$B_1, B_2, ..., B_n$$ Bernoulli trials FIST expectation and variance binomial pdf limiting of $$p$$ approaches Poisson distribution use R to find these probabilities

## Poisson

RIPS expectation and variance use R to find these probabilities

# Lecture 11

As $$n$$ increases binomial approaches normal distribution

## Shifts and scales of normal distribution

Z score and standardization Use symmetry to find probabilities, finding them in R Percentiles to Z values, finding them in R

# Lecture 12

## Sampling methods

SRS, stratified, cluster, census, voluntary response, convenience benefits and drawbacks of all and where to apply each

# Lecture 13

distribution of the sample mean, standard error, control limits and types of errors, error rates with repeated testing, control charts for variation