- Understand fundamentals of hypothesis testing
  - Formulating hypotheses
  - Calculating test statistics
  - Interpreting P-values
- Perform randomization tests for comparisons
- Readings
  - ISRS: ch. 2.1-2.4
Associate each hypothesis with a test statistic, i.e. a quantity calculated from the data
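library(tidyverse)  # provides tibble() and the %>% pipe
# Each replicate: 100 fair 0/1 draws (p = 0.5), summarised by their mean
sim = tibble( iter = 1:1000,
              value = replicate( 1000,
                sample( 0:1, 100, replace = TRUE ) %>% mean() ) )
sim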
## # A tibble: 1,000 x 2
## iter value
## <int> <dbl>
## 1 1 0.47
## 2 2 0.5
## 3 3 0.48
## 4 4 0.44
## 5 5 0.46
## 6 6 0.55
## 7 7 0.51
## 8 8 0.48
## 9 9 0.52
## 10 10 0.52
## # ... with 990 more rows
| Range | Compatibility with \(H_0\) |
|---|---|
| P-value > 0.10 | no evidence against \(H_0\) |
| 0.05 < P-value < 0.10 | weak evidence against \(H_0\) |
| 0.01 < P-value < 0.05 | moderate evidence against \(H_0\) |
| 0.001 < P-value < 0.01 | strong evidence against \(H_0\) |
| P-value < 0.001 | very strong evidence against \(H_0\) |
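The cut-offs in the table can be encoded in a small helper. This is only a minimal sketch: the function name `evidence_label` is hypothetical, and P-values that fall exactly on a boundary (which the table leaves unassigned) are placed in the stronger-evidence category here as an arbitrary choice.

```r
# Hypothetical helper: map P-values to the evidence categories in the table.
# Intervals are closed on the right, so e.g. p = 0.05 is labelled "moderate".
evidence_label = function( p ) {
  as.character( cut( p,
    breaks = c( 0, 0.001, 0.01, 0.05, 0.10, 1 ),
    labels = c( "very strong", "strong", "moderate", "weak", "no evidence" ),
    right = TRUE, include.lowest = TRUE ) )
}

evidence_label( c( 0.04, 0.2, 0.0005 ) )
```

For instance, a P-value of 0.04 is labelled "moderate" and a P-value of 0.2 is labelled "no evidence".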
sim %>% summarise( mean( value >= .59 ) ) %>% pull()
## [1] 0.04
P-value of 0.04 is below \(\alpha = 5\%\): reject \(H_0\) at the 5% significance level \(\Rightarrow\) conclude the proposal will pass (go with \(H_A\))
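As a cross-check on the simulation, the same one-sided P-value can be computed exactly from the binomial distribution instead of by resampling. This is a sketch, not the method used above: it assumes the observed result was 59 supporters out of 100 (inferred from the .59 cut-off and the sample size of 100 in the simulation code), and the variable names are illustrative.

```r
# Assumed observed data: 59 "yes" answers out of n = 100 (inferred, not stated)
n = 100
x_obs = 59

# Under H0 the count is Binomial(n, 0.5).
# P(X >= x_obs) equals the upper tail beyond x_obs - 1.
p_exact = pbinom( x_obs - 1, size = n, prob = 0.5, lower.tail = FALSE )
p_exact
```

The exact value should land close to the simulated 0.04; any gap reflects Monte Carlo error from using only 1000 replicates.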
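set.seed(123)
# 1000 simulated samples of size 500 under H0 (p = 0.5); keep each sample proportion
sim = replicate( 1000, sample( 0:1, 500, replace = TRUE ) %>% mean() )
# Two-sided P-value: share of simulated proportions at least as far from .5 as 276/500
(P_value = mean( abs(sim - .5) >= (276/500 - .5) ))
## [1] 0.02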
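Base R's `binom.test()` offers an exact counterpart to the simulated two-sided P-value. The sketch below assumes, as the simulation code suggests, 276 successes in 500 trials and a null proportion of 0.5; the object name `exact` is illustrative.

```r
# Exact two-sided binomial test of H0: p = 0.5
# (counts 276 and 500 are taken from the simulation code)
exact = binom.test( x = 276, n = 500, p = 0.5, alternative = "two.sided" )
exact$p.value
```

The exact P-value should be of the same order as the simulated 0.02, so the conclusion at the 5% level does not depend on which of the two approaches is used.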
Idea: if the two populations have the same distribution, then the group labels carry no information about the response
Randomization/Permutation test: approximate the sampling distribution of the test statistic under \(H_0\) by repeatedly shuffling the group labels at random and recalculating the difference between groups
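The shuffling idea can be sketched by hand in base R before turning to a package. Everything below is illustrative: the data are simulated (hypothetical groups "A" and "B"), the helper `diff_means` is not from any library, and the difference in group means is just one possible choice of test statistic.

```r
set.seed(1)  # for reproducibility of this illustration only

# Hypothetical data: a numeric response and a two-level group label
y = c( rnorm( 30, mean = 10 ), rnorm( 30, mean = 11 ) )
g = rep( c( "A", "B" ), each = 30 )

# Test statistic: difference in group means (B minus A)
diff_means = function( y, g ) mean( y[g == "B"] ) - mean( y[g == "A"] )
obs = diff_means( y, g )

# Under H0 the labels are exchangeable: shuffle them and recompute each time
perm = replicate( 2000, diff_means( y, sample( g ) ) )

# Two-sided P-value: share of shuffled differences at least as extreme as observed
(P_value = mean( abs( perm ) >= abs( obs ) ))
```

Calling `sample( g )` with no size argument returns a random permutation of the labels, so the group sizes stay fixed while the assignment of observations to groups changes.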
coin package
independence_test() for hypothesis test
Example: compare hrlyearn between sexes
library(coin)
# Keep rows with educ == 5, recode sex (1, 2) as a factor (M, F), then run the test
lfs %>% filter( educ == 5 ) %>%
  mutate( sex = factor(sex, levels = 1:2, labels = c("M","F")) ) %>%
  independence_test( hrlyearn ~ sex, data = .,
                     alternative = "two.sided", distribution = "approximate" )
##
## Approximative General Independence Test
##
## data: hrlyearn by sex (M, F)
## Z = 15.834, p-value < 2.2e-16
## alternative hypothesis: two.sided