AE 11: Hypothesis Testing

Hypothesis testing review

\[H_0: p = 0.1 \text{ vs }H_a: p < 0.1\]

Generate the null distribution

null_dist <- organ_donor %>%
  specify(response = outcome, success = "complication") %>% #<<
  hypothesize(null = "point", 
              p = c("complication" = 0.10, "no complication" = 0.90) 
              ) %>% 
  generate(reps = 100, type = "simulate") %>% 
  calculate(stat = "prop")

Calculate p-value

null_dist %>%
  filter(stat <= (3/62)) %>%
  summarise(p_value = n()/nrow(null_dist))

## # A tibble: 1 x 1
##   p_value
##     <dbl>
## 1    0.17

Clone a repo + start a new project

Clone the ae-11 repo on GitHub and start a new project in RStudio. Be sure to configure git in the RStudio console, so you can so you can push your results back up to GitHub.

library(usethis)
use_git_config(user.name= "github username", user.email="your email")

We will be using the asheville dataset. You may load in the dataset with the following code (be sure to set eval to be TRUE in the following R chunk!):

library(tidyverse)
library(infer)

Exercise 1 (Zoom poll)

Suppose you are interested in whether the mean price per guest per night is actually less than $80. Choose the correct null and alternative hypotheses.

$H_0: \mu = 80 \text{ vs. }H_a: \mu \neq 80$
$H_0: \mu = 80 \text{ vs. }H_a: \mu < 80$
$H_0: \mu = 80 \text{ vs. }H_a: \mu > 80$
$H_0: \bar{x} = 80 \text{ vs. }H_a: \bar{x} \neq 80$
$H_0: \bar{x} = 80 \text{ vs. }H_a: \bar{x} < 80$
$H_0: \bar{x} = 80 \text{ vs. }H_a: \bar{x} > 80$

Exercise 2

Let’s use simulation-based methods to conduct the hypothesis test specified in Exercise 1. We’ll start by generating the null distribution.

Fill in the code and uncomment the lines below to generate the null distribution.

set.seed(060921)

null_dist <- asheville # %>%
  #specify(response = ______) %>%
  #hypothesize(null = ______, mu = ______) %>%
  #generate(reps = 100, type = "bootstrap") %>%
  #calculate(stat = _____)

Exercise 3

Fill in the code and uncomment the lines below to calculate the p-value using the null distribution from Exercise 2.

mean_ppg <- asheville # %>% 
  # summarise(mean_ppg = _____ ) %>%
  # pull()

#null_dist %>%
  #filter(______) %>%
  #summarise(p_value = ______)

Exercise 4

Use the p-value to make your conclusion using a significance level of 0.05. Remember, the conclusion has 3 components

How the p-value compares to the significance level
The decision you make with respect to the hypotheses (reject $H_0$ /fail to reject $H_0$)
The conclusion in the context of the research problem

Exercise 5

Suppose you are interested in whether the median price per guest per night is equal to or less than $80. Carry out a similar analysis to that undertaken in Exercises 1 - 4 to test these hypotheses.