library(tidyverse) # for data manipulation and data visualization
library(tidymodels) # for model fitting and to get residuals
hw-4-instructions optional
Introduction
For this homework assignment, we will explore the diamonds dataset. This document, which introduces the R commands needed for this homework, will be indispensable. Please review it thoroughly.
Learning goals
In this assignment, you will…
- Find confidence intervals and perform hypothesis tests for proportions in R.
- Interpret confidence intervals and hypothesis tests for proportions.
- Fit a linear model to data in R.
- Find and plot residuals in R.
Getting started
Log in to RStudio
Open the Stat1010.Rproj project that we created on the first day of class.
Go to File ➛ New File ➛ Quarto Document, name the document hw-4, and click Create.
Packages
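The following packages will be used in this assignment: tidyverse (for data manipulation and data visualization) and tidymodels (for model fitting and to get residuals).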
Data: Diamonds
The diamonds dataset, from the ggplot2 package (part of the tidyverse set of packages), contains information on 53,940 round diamonds from the Loose Diamonds Search Engine.
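Before starting the exercises, it can help to look at the data. The following is a minimal sketch, assuming the tidyverse is installed; it uses only the diamonds data and its cut column, and the name `cut_summary` is a hypothetical one chosen for illustration.

```r
library(tidyverse)

# diamonds ships with ggplot2, which library(tidyverse) attaches
glimpse(diamonds)

# Tabulate cut and add each category's sample proportion
# (cut_summary is a hypothetical name used only in this sketch)
cut_summary <- diamonds |>
  count(cut) |>
  mutate(prop = n / sum(n))

cut_summary
```

The `prop` column holds sample proportions only; the exercises ask for inference about the corresponding population proportions.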
1. In R, compute a 97% confidence interval for the population proportion of diamonds that have an Ideal cut and interpret it in context.
2. In R, test the hypothesis that the population proportion of Fair diamonds is not equal to 3.5% and interpret the result. Include all hypotheses and the test statistic.
3. In R, test the hypothesis that the population proportion of diamonds with Very Good cut and the best colour is equal to the population proportion of diamonds with Premium cut and the best colour. Include all hypotheses and the test statistic, and interpret the test in context.
4. Use R to fit a least squares line of best fit to predict the price of a diamond using x (the diamond's length in mm). Interpret \(b_0\), \(b_1\), and \(R^2\) in context, then plot the residuals and comment on the plot. Find the predicted value of price for \(x = 1.5\) and comment on the validity of this prediction. If a diamond's length increased by 5 mm, what increase in price does our model predict?
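The exercises above draw on two kinds of R commands: inference for proportions and simple linear regression. The sketch below illustrates both on choices the exercises do not ask about (the clarity level IF, a 90% confidence level, a null value of 0.05, the Good and Ideal cuts, and carat as the predictor). All of these choices and all object names are assumptions made for illustration. The sketch uses base R's `prop.test()` and the tidymodels interface; the course's reference document may use different functions, in which case follow that document.

```r
library(tidyverse)
library(tidymodels)

# Part 1: one proportion (illustrative variable: clarity == "IF")
n_total <- nrow(diamonds)
n_if    <- sum(diamonds$clarity == "IF")

# 90% confidence interval, no continuity correction.
# Note: prop.test() returns a Wilson score interval, which can differ
# slightly from the usual z-interval formula.
ci_if <- prop.test(n_if, n_total, conf.level = 0.90, correct = FALSE)
ci_if$conf.int

# Two-sided test of H0: p = 0.05 vs. HA: p != 0.05 (0.05 is an arbitrary null value).
# prop.test() reports X-squared, which is the square of the z statistic.
test_if <- prop.test(n_if, n_total, p = 0.05,
                     alternative = "two.sided", correct = FALSE)
test_if$statistic
test_if$p.value

# Part 2: two proportions (illustrative groups: Good vs. Ideal cut)
two_groups <- diamonds |>
  filter(cut %in% c("Good", "Ideal")) |>
  group_by(cut) |>
  summarise(successes = sum(clarity == "IF"), n = n())

# H0: p_Good = p_Ideal vs. HA: p_Good != p_Ideal
test_two <- prop.test(two_groups$successes, two_groups$n, correct = FALSE)
test_two

# Part 3: simple linear regression (illustrative predictor: carat)
price_fit <- linear_reg() |>
  set_engine("lm") |>
  fit(price ~ carat, data = diamonds)

tidy(price_fit)    # estimates of b0 (intercept) and b1 (slope)
glance(price_fit)  # the r.squared column holds R^2

# augment() adds .pred, and .resid because price is present in new_data
price_aug <- augment(price_fit, new_data = diamonds)

# Residuals against predicted values, with a reference line at zero
ggplot(price_aug, aes(x = .pred, y = .resid)) +
  geom_point(alpha = 0.1) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(x = "Predicted price", y = "Residual")

# Predicted price for a hypothetical 1-carat diamond
predict(price_fit, new_data = tibble(carat = 1))
```

Swapping in the variables, confidence level, and null value each exercise specifies should follow the same pattern. For the predicted change in price over a given change in the predictor, multiply the slope estimate by that change.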
Component | Points |
---|---|
1 | 2 |
2 | 4 |
3 | 3 |
4 | 10 |