HW 7 - Modern statistical inference (one group)

Due Monday, June 13, 9:00pm on Gradescope

This assignment needs to be completed with RMarkdown and submitted as a PDF on Gradescope. Feel free to re-use the template provided for HW1.

When submitting your work on Gradescope, please assign a page for each question.

Problem set (10 points)

  • 9.2 – parts b and d (1 point)
  • 9.4 (4 points) – Hint: use the symbol $ to write mathematical equations; for instance, to write the fraction \(\frac{e^{\mu}}{\beta_0}\) simply use $\frac{e^{\mu}}{\beta_0}$.
  • 9.8 (3 points)
  • 24.8 – parts b-c with a percentile bootstrap interval, the usual bootstrap interval that we have constructed in class (2 points)

For the remaining exercises, use the birth data set

library(readr)
d <- read_csv("https://rmorsomme.github.io/website/projects/training_set.csv")

Confidence intervals via bootstrap (18 points)

Construct (5 points) and interpret in the context of the problem (1 point) a 95% confidence interval using 10000 bootstrap samples for

  • the proportion of female newborn (6 points)

  • the mean weight of newborns (6 points)

  • the slope of gestation week in a simple linear regression model for newborn_birth_weight (6 points)

Hypothesis test via simulation (21 points)

Conduct a hypothesis test using simulation for the following cases. State the hypotheses in words and in mathematical symbols (2 points) and use 10000 simulated samples and the significance level \(\alpha=0.05\) (5 points).

  • You want to determine whether exactly half of the newborns are female (7 points)

  • You want to determine whether the mean weight of newborns is 3500 grams (7 points)

  • You want to determine whether newborn_birth_weight is independent of gestation week (7 points)

Reproducibility (1 point)

A seed is set with the command set.seed before any code with a stochastic component.