Study list for exam 2

The R functions we have discussed in these weeks:

functions
qnorm()
pnorm()
cov()
cor()
mutate()
View()
tibble()
slice()
exp()
lfactorial()
dbinom()
dpois()
slice_sample()
c()

Lecture 7

\(\mu = E(X)\) \(= x_1p(x_1) + x_2p(x_2) + ... + x_np(x_n)\)

How to define a pdf!

Addition rules

-\(E(X \pm c) = E(X) \pm c\)

-\(SD(X \pm c) = SD(X)\)

-\(Var(X \pm c) = Var(X)\)

Multiplication rules

-\(E(cX) = cE(X)\)

-\(SD(cX) = |c|SD(X)\)

-\(Var(cX) = c^2Var(X)\)

-\(E(cX \pm a)=cE(X) \pm a\)

- \(Var(cX \pm a)=c^2Var(X)\)

Lecture 8

Which variable on which axis and what are the names for each variable?

Describing scatter plots

  • Homo and heteroskedastic

Covariance look at a plot and know if positive or negative

\(cov(x, y) = \frac{(x_1 - \bar{x})(y_1 - \bar{y}) + (x_2 - \bar{x})(y_2 - \bar{y}) + \ldots + (x_n - \bar{x})(y_n - \bar{y})}{n-1}\)

Correlation - look at a plot and approximate

\[corr(x, y) = \frac{cov(x, y)}{s_x \cdot s_y}\]

  1. Referred to as \(r\)
  2. Strength of linear association
  3. \(r\) is always between \(-1\) and \(+1\), \(-1 \leq r \leq 1\).
  4. \(r\) does not have units

Fitting a line

\(m = \frac{r \cdot s_y}{s_x}\)

\(b = \bar{y} - m \bar{x}\)

Predicting

library(tidymodels)

## Assign the least
## squares line
least_squares_fit <- 
  linear_reg() %>% 
  set_engine("lm") %>% 
  fit(price ~ carat, data = diamonds) 
## NOTE: outcome first, predictor second

## Find the prediction
predict(least_squares_fit, tibble(carat = c(1, 2, 2.5, 4)))
# A tibble: 4 × 1
   .pred
   <dbl>
1  5500.
2 13256.
3 17135.
4 28769.

Spurious correlations

birth order associated with increased risk of downs syndrome, etc…

Lecture 9

The Sharpe ratio

\(S(X) = \frac{\mu_X - r_f}{\sigma_X}\) - higher better - how to compute with a calculator from a pdf and also with just \(\mu\) and \(\sigma\) - computing in R

Joint pdfs

Find probability given values Are they independent or not? What do increasing and decreasing joint pdfs look like?

\(E(X+Y) = E(X) + E(Y)\) regardless of independence

3 Rules of independence

Covariance of RV

\(Cov(X, Y) = E((X - \mu_X)(Y - \mu_Y))\) \(Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)\) If \(X\),\(Y\) are independent, then \(Cov(X, Y) = 0\), opposite not true If \(X\), \(Y\) are independent, \(Cov(X, Y) = 0\), so \(Var(X+ Y) = Var(X) + Var(Y)\)

Correlation

\(corr(x, y) = \frac{cov(x, y)}{s_x \cdot s_y}\) \[\rho = Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \cdot \sigma_Y}\] ## IID variables \(E(aX + bY + c) = aE(X) + bE(Y) + c\) \(Var(aX + bY + c) = a^2Var(X) + b^2Var(Y) + 2abCov(X, Y)\)

Lecture 10

Bernoulli trial

expectation and variance

Binomial

\(Y = B_1 + B_2 + ... + B_n\), where \(B_1, B_2, ..., B_n\) Bernoulli trials FIST expectation and variance binomial pdf limiting of \(p\) approaches Poisson distribution use R to find these probabilities

Poisson

RIPS expectation and variance use R to find these probabilities

Lecture 11

As \(n\) increases binomial approaches normal distribution

Shifts and scales of normal distribution

Z score and standardization Use symmetry to find probabilities, finding them in R Percentiles to Z values, finding them in R

68 - 95 - 99.7 rule

Normality tests and qqplots

Skewness and Kurtosis

Lecture 12

Sampling methods

SRS, stratified, cluster, census, voluntary response, convenience benefits and drawbacks of all and where to apply each

Lecture 13

distribution of the sample mean, standard error, control limits and types of errors, error rates with repeated testing, control charts for variation