Chapter 10: Association between random variables

STAT 1010 - Fall 2022

Learning outcomes

By the end of this lesson you should:

  • Understand the Sharpe ratio and know how to compute it to compare stocks that are not independent
  • Understand and use a joint probability distribution function
  • Know the 3 rules of independence
  • Understand how covariance, correlation, and independence intertwine
  • Explain how independent and identically distributed random variables impact upon expectation, variance, and standard deviation
  • Compute expectation and variance of weighted sums

Revision 1

The Sharpe Ratio

\[S(X) = \frac{\mu_X - r_f}{\sigma_X}\]

  • \(r_f\) is the return on a risk-free investment
  • \(\mu\) is the mean of a random variable \(X\) that measures the performance of an investment
  • \(\sigma\) is the standard deviation of a random variable \(X\) that measures the performance of an investment
  • all inputs must be measured over the same time period (ie. yearly, monthly, …)

Compute the Sharpe ratio

Company Random Variable Mean per month SD
Apple A 2.45% 13.3%
McDonalds M 1.14% 6.2%

assume \(r_f\) is the risk-free rate of interest is \(0.1\%\)

Compute the Sharpe ratio

  • \[\begin{aligned} S(A) &= \frac{\mu_A - r_f}{\sigma_A}\\ &= \frac{2.45 - 0.1}{13.3}\\ &= 0.177 \end{aligned}\]

  • \[\begin{aligned} S(M) &= \frac{\mu_M - r_f}{\sigma_M}\\ &= \frac{1.14 - 0.1}{6.2}\\ &= 0.168 \end{aligned}\]

We prefer Apple because it has a higher Sharpe ratio \(0.177 > 0.168\)

Revision 2

Instead of knowing \(\mu\) and \(\sigma\) we have a pdf

IBM stock IBM stock Microsoft Microsoft
\(x\) \(P(X = x)\) \(y\) \(P(Y = y)\)
Increases $5 0.11 $4 0.18
No change 0 0.80 0 0.67
Decreases -$5 0.09 -$4 0.15

Sharpe ratio by hand

IBM IBM IBM IBM IBM IBM IBM
\(x\) \(P(X = x)\) \(\mu_X\) \((x-\mu_X)^2 \cdot p_X(x)\) \(Var(X)\) \(sd(X)\) Sharpe ratio
Increases $5 0.11
No change 0 0.80
Decreases -$5 0.09

Sharpe ratio by hand

IBM IBM IBM IBM IBM IBM IBM
\(x\) \(P(X = x)\) \(\mu_X\) \((x-\mu_X)^2 \cdot p_X(x)\) \(Var(X)\) \(sd(X)\) Sharpe ratio
Increases $5 0.11 0.1 2.6411 4.99 2.23 0.03805123
No change 0 0.80 0.1 0.0080 4.99 2.23 0.03805123
Decreases -$5 0.09 0.1 2.3409 4.99 2.23 0.03805123

Sharpe ratio by hand

MCSFT MCSFT MCSFT MCSFT MCSFT MCSFT MCSFT
\(y\) \(P(Y =y)\) \(\mu_Y\) \((y-\mu_Y)^2 \cdot p_Y(y)\) \(Var(Y)\) \(sd(Y)\) Sharpe ratio
Increases $4 0.18
No change 0 0.67
Decreases -$4 0.15

Sharpe ratio by hand

MCSFT MCSFT MCSFT MCSFT MCSFT MCSFT MCSFT
\(y\) \(P(Y =y)\) \(\mu_Y\) \((y-\mu_Y)^2 \cdot p_Y(y)\) \(Var(Y)\) \(sd(Y)\) Sharpe ratio
Increases $4 0.18 0.12 2.709792 5.2656 2.29469 0.04575782
No change 0 0.67 0.12 0.009648 5.2656 2.29469 0.04575782
Decreases -$4 0.15 0.12 2.546160 5.2656 2.29469 0.04575782

The Sharpe ratio - in R

# Input the data and the pdf
stock <- tibble(x = c(5, 0, -5), 
                p_x = c(0.11, 0.8, 0.09), 
                y = c(4, 0, -4), 
                p_y = c(.18, .67, .15))


Sharpe_ratio <- # name this the Sharpe ratio
  stock %>% # use the data from above
  mutate(part_mean_x = x*p_x, # find mean for each part of x 
         part_mean_y = y*p_y, # now for y
         mean_x = sum(part_mean_x), # find mean x 
         mean_y = sum(part_mean_y),# now for y
         part_var_x = p_x * (mean_x - x)^2, # find var for each part of x 
         part_var_y = p_y * (mean_y - y)^2,# now for y
         var_x = sum(part_var_x), # find var of x 
         var_y = sum(part_var_y),# now for y
         sd_x = sqrt(var_x), # find sd of x 
         sd_y = sqrt(var_y)) %>% # now for y
  summarise(S_x = (mean_x - 0.015)/sd_x, # find Sharpe ratio of x with rf = 0.015
            S_y = (mean_y - 0.015)/sd_y) %>% # now for y
  slice(1)

Joint probability distributions

\(X\) \(X\) \(X\)
\(x = -5\) \(x = 0\) \(x = 5\) \(p(y)\)
\(Y\) \(y=4\) \(0.00\) \(0.11\) \(0.07\) \(0.18\)
\(Y\) \(y=0\) \(0.03\) \(0.62\) \(0.02\) \(0.67\)
\(Y\) \(y=-4\) \(0.06\) \(0.07\) \(0.02\) \(0.15\)
\(p(x)\) \(0.09\) \(0.80\) \(0.11\) \(1\)
  • What is the probability that \(x = 0\) and \(y =4\)? - \(11\%\)

  • What is the probability that \(x = 5\) and \(y =4\)? - \(7\%\)

  • What is the probability that \(x = 0\) and \(y =0\)? - \(62\%\)

  • What outcome will never occur? - \(x = -5\) and \(y =4\)

Expected value of \(X + Y\)

\(X\) \(X\) \(X\)
\(x = -5\) \(x = 0\) \(x = 5\) \(p(y)\)
\(Y\) \(y=4\) \(0.00\) \(0.11\) \(0.07\) \(0.18\)
\(Y\) \(y=0\) \(0.03\) \(0.62\) \(0.02\) \(0.67\)
\(Y\) \(y=-4\) \(0.06\) \(0.07\) \(0.02\) \(0.15\)
\(p(x)\) \(0.09\) \(0.80\) \(0.11\) \(1\)
  • \(E(X+Y) = E(X) + E(Y)\)
  • \(E(X+Y) = E(X) + E(Y) = 0.1 + 0.12 = 0.22\)

3 Rules of independence

If the probability of one event occuring has no impact on another event occuring, they are independent.

  • If \(X\) and \(Y\) are independent then \(p(x,y) = p(x) \cdot p(y)\) for all pairs \((x,y)\)

  • If \(p(x,y) = p(x) \cdot p(y)\) for all \((x, y)\) pairs, then \(X\) and \(Y\) are independent.

  • It follows that if \(X\) and \(Y\) are independent, then \(E(XY) = E(X) \cdot E(Y)\)

Are \(X\) and \(Y\) independent?

\(X\) \(X\) \(X\)
\(x = -5\) \(x = 0\) \(x = 5\) \(p(y)\)
\(Y\) \(y=4\) \(0.00\) \(0.11\) \(0.07\) \(0.18\)
\(Y\) \(y=0\) \(0.03\) \(0.62\) \(0.02\) \(0.67\)
\(Y\) \(y=-4\) \(0.06\) \(0.07\) \(0.02\) \(0.15\)
\(p(x)\) \(0.09\) \(0.80\) \(0.11\) \(1\)
  • \(p_x(-5) \cdot p_y(4) = 0.18 \cdot 0.09 \neq 0.00\)

  • \(p_x(-5) \cdot p_y(-4) = 0.09 \cdot 0.15 = 0.0135 \neq 0.06\)

  • NO! NO! NO!

Covariance of Random Variables

  • \(cov(x, y) = \frac{(x_1 - \bar{x})(y_1 - \bar{y}) + (x_2 - \bar{x})(y_2 - \bar{y}) + \ldots + (x_n - \bar{x})(y_n - \bar{y})}{n-1}\)

  • assumes each \((x, y)\) pair appears once

  • \[Cov(X, Y) = E((X - \mu_X)(Y - \mu_Y))\]

Covariance of sum

  • \((a+b)^2 = a^2 + 2ab +b^2\)
  • it follows that
  • \(Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)\)

Correlation

  • \(corr(x, y) = \frac{cov(x, y)}{s_x \cdot s_y}\)

  • \[\rho = Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \cdot \sigma_Y}\]

As \(X\) increases, what happens to \(Y\)?

\(X\) \(X\) \(X\)
\(x = -5\) \(x = 0\) \(x = 5\) \(p(y)\)
\(Y\) \(y=4\) \(0.00\) \(0.11\) \(0.07\) \(0.18\)
\(Y\) \(y=0\) \(0.03\) \(0.62\) \(0.02\) \(0.67\)
\(Y\) \(y=-4\) \(0.06\) \(0.07\) \(0.02\) \(0.15\)
\(p(x)\) \(0.09\) \(0.80\) \(0.11\) \(1\)

Independence & Cov

How does independence impact upon correlation and covariance?

\[ E((X-\mu_X)(Y-\mu_Y))\]

  • \[\begin{aligned} Cov(X,Y) &= E((X-\mu_X)(Y-\mu_Y))\\ &= E(X - \mu_X)E(Y-\mu_Y)\\ &= 0 \end{aligned}\]

  • If \(X\),\(Y\) are independent, then \(Cov(X, Y) = 0\)

  • The opposite is not true (if \(Cov(X, Y) = 0\), then \(X\),\(Y\) are independent)

Independence & Var

\(Var(X) + Var(Y) = Var(X) + Var(Y) + 2Cov(X, Y)\)

  • If \(X\), \(Y\) are independent, \(Cov(X, Y) = 0\), so

  • \[Var(X+ Y) = Var(X) + Var(Y)\]

Sharpe ratio of a sum

\[ S(X + Y) = \frac{(\mu_X + \mu_Y) - 2r_f}{\sqrt{(Var(X+Y))}} \]

Sharpe ratio of sum - R

# input x & y
x <- c(-5, 0, 5)
y <- c(4, 0, -4)

# input data
data <- bind_cols(expand_grid(x, y), 
                  probs = c(0, .03, .06, 
                            .11, .62, .07, 
                            .07, .02, .02), 
                  mu_x = rep(0.1, 9), 
                  mu_y = rep(0.12, 9), 
                  var_x = rep(4.99, 9), 
                  var_y = rep(5.2656, 9))

data %>% 
  mutate(cov_xy_part = (x - mu_x)*
           (y - mu_y)*probs, # multiply together
         cov_xy = sum(cov_xy_part), # sum them
         var_xy = var_x + var_y + 2*cov_xy) %>% # using the rule
  summarise(S_r = (mu_x + mu_y - 2*0.015)/
                      (sqrt(var_xy))) %>% # find the Sharpe value
  slice(1) # only the first
# A tibble: 1 × 1
     S_r
   <dbl>
1 0.0497
  • The Sharpe ratio for \(X\) is \(0.038\), for \(Y\) it’s \(0.046\), investing in both gives a better return \(0.050\)

Double the investment

  • Invest twice as much for one day
  • Invest in one stock for two subsequent days
  • Which is better? Why?

Double for one day

\[S(2X) = \frac{2\mu_X - 2r_f}{\sqrt{Var(2X)}}\]

  • \(\mu_X = 0.1\)

  • \(Var(X) = 4.99\)

  • \(r_f = .015\)

  • \[\begin{aligned} &= \frac{2 \cdot 0.1 - 2 \cdot 0.015}{\sqrt{4 \cdot 4.99}}\\ &= \frac{.2 - 0.03}{\sqrt{19.96}}\\ &= 0.038 \end{aligned}\]

  • This is the same as before, how do we compute if we invest for two subsequent days?

IID

If we invested in a stock on 2 subsequent days instead of investing in 2 stocks on one day the return on those two days are:

independent and identically distributed (IID)

  • identically distributed - the outcomes on each day are likely to be different, but the probability of the outcomes is the same

  • independent - very common assumption for stocks (part of the reason for the 2008 financial crises)

Addition rules for IID variables

Important

If \(n\) random variables \((X_1, X_2, ..., X_n)\) are iid with mean \(\mu_X\) and standard deviation \(\sigma_X\), then

\[ E(X_1 + X_2 + ... + X_n) = n \cdot \mu_X \] \[ Var(X_1 + X_2 + ... + X_n) = n \cdot \sigma^2_x \] \[ SD(X_1 + X_2 + ... + X_n) = \sqrt{n} \sigma_X \]

Sharpe ratio - two days

\[ S(X_1 + X_2) = \frac{2\mu_X - 2r_f}{\sqrt{2 \sigma^2_x}}\]

  • \[\begin{aligned} &= \frac{0.20 - 0.03}{\sqrt{9.98}}\\ &= 0.054 \end{aligned}\]
  • you expect more variance if you have two stock for one day b/c anything that happens that day is magnified
  • if you have one stock for two days you reduce the variance

Weighted sums

If we decide to leave money in the bank, there will be interest that accrues. How can we include this in our model?

Important

\[E(aX + bY + c) = aE(X) + bE(Y) + c\]

\[Var(aX + bY + c) = a^2Var(X) + b^2Var(Y) + 2abCov(X, Y)\]

Expectation of weighted sum

\[ E(2X + 4Y + 0.06)\]

    • \[\begin{aligned} &= 2E(X) + 4E(Y) + 0.06\\ &= 2(0.10) + 4(0.12) + 0.06\\ &= \$0.74 \end{aligned}\]

Variance of weighted sum

\[ Var(2X + 4Y + 0.06)\]

  • \[\begin{aligned} &= 2^2Var(X) + 4^2Var(Y) + 2 \cdot(2 \cdot 4) \cdot Cov(X,Y)\\ &= 4(4.99) + 16(5.27) + 16(2.19)\\ &= 139.32 \end{aligned}\]

Your turn

Click here or the qr code below