HW 8 - Modern statistical inference (two groups)

Due Thursday, June 16, 9:00pm on Gradescope

This assignment needs to be completed with RMarkdown and submitted as a PDF on Gradescope. Feel free to re-use the template provided for HW1.

When submitting your work on Gradescope, please assign a page for each question.

Problem set (23 points)

  • 13.4 (4 points)
  • 14.4 (2 points) – parts a and c
  • 16.16 (1 point) – part e
  • 16.20 (5 points)
  • 16.22 (4 points)
  • 16.24 (2 points)
  • 16.28 (2.5 point)
  • 16.30 (2.5 point)

For the remaining exercises, use the birth data set

library(readr)
d <- read_csv("https://rmorsomme.github.io/website/projects/training_set.csv")

Confidence intervals via bootstrap (12 points)

Construct (5 points) and interpret in the context of the problem (1 point) a 95% confidence interval using 10000 bootstrap samples for

  • the difference in proportion of female newborns among women with and without risk factor (6 points)

  • the difference in the mean weight of newborns among women with and without risk factor (6 points)

Hypothesis test via simulation (14 points)

Conduct a hypothesis test using simulation for the following cases. State the hypotheses in words and in mathematical symbols (2 points) and use 10000 samples simulated samples and the significance level \(\alpha=0.05\) (5 points).

  • You want to determine whether women with and without risk factor have the same proportion of female newborns (7 points)

  • You want to determine whether women with and without risk factor have newborns with the same mean weight (7 points)

Reproducibility (1 point)

A seed is set with the command set.seed before any code with a stochastic component.