- Understand fundamentals of hypothesis testing
  - Formulating hypotheses
  - Calculating test statistics
  - Interpreting P-values
- Perform randomization tests for comparisons
- Readings
  - ISRS: ch. 2.1-2.4
Associate the hypothesis with a test statistic, i.e. a quantity calculated from the data, and compare it with the values the statistic would take if \(H_0\) were true

    library(tidyverse)  # provides tibble() and %>%

    # 1000 simulated sample proportions, each from 100 fair 0/1 draws (p = 0.5)
    sim = tibble(
      iter = 1:1000,
      value = replicate(1000, sample(0:1, 100, replace = TRUE) %>% mean())
    )
    sim
    ## # A tibble: 1,000 x 2
    ##     iter value
    ##    <int> <dbl>
    ##  1     1  0.47
    ##  2     2  0.5
    ##  3     3  0.48
    ##  4     4  0.44
    ##  5     5  0.46
    ##  6     6  0.55
    ##  7     7  0.51
    ##  8     8  0.48
    ##  9     9  0.52
    ## 10    10  0.52
    ## # ... with 990 more rows

Range | Compatibility with \(H_0\) |
---|---|
P-value > 0.10 | no evidence against \(H_0\) |
0.05 < P-value < 0.10 | weak evidence against \(H_0\) |
0.01 < P-value < 0.05 | moderate evidence against \(H_0\) |
0.001 < P-value < 0.01 | strong evidence against \(H_0\) |
P-value < 0.001 | very strong evidence against \(H_0\) |
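
The table can be turned into a small lookup, which may help when reporting many P-values. This is only a sketch: `evidence_label()` is a hypothetical helper (not from the slides), and since the table does not say which row a boundary value such as exactly 0.05 belongs to, the code assumes it takes the weaker label.

```r
# Hypothetical helper: map P-values to the evidence labels in the table above.
# Assumption: boundary values (e.g. exactly 0.05) get the weaker label.
evidence_label = function(p) {
  cut(
    p,
    breaks = c(0, 0.001, 0.01, 0.05, 0.10, 1),
    labels = c("very strong", "strong", "moderate", "weak", "none"),
    right = FALSE,         # intervals are [lower, upper)
    include.lowest = TRUE  # with right = FALSE this closes the last interval, so p = 1 is labeled
  )
}

# Label a few example P-values
as.character(evidence_label(c(0.04, 0.0005, 0.2)))
```
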

    # P-value: proportion of simulated values at least as large as 0.59
    sim %>% summarise(mean(value >= .59)) %>% pull()
    ## [1] 0.04

The P-value (0.04) is below \(\alpha = 5\%\), so reject \(H_0\) at the 5% significance level \(\Rightarrow\) conclude the proposal will pass (go with \(H_A\))
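
    set.seed(123)
    # 1000 simulated sample proportions, each from 500 fair 0/1 draws
    sim = replicate(1000, sample(0:1, 500, replace = TRUE) %>% mean())
    # Two-sided P-value: proportions at least as far from 0.5 as 276/500
    (P_value = mean(abs(sim - .5) >= (276/500 - .5)))
    ## [1] 0.02
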
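
As a rough cross-check, the simulated P-values can be compared with exact binomial tail probabilities under \(H_0: p = 0.5\). The sketch below assumes the observed results are 59 successes out of 100 (one-sided) and 276 out of 500 (two-sided); the first count is inferred from the cutoff 0.59 and the simulated sample size, not stated in the notes.

```r
# Exact binomial P-values under H0: p = 0.5
# Assumption: observed counts are 59 of 100 and 276 of 500

# One-sided: P(X >= 59) for X ~ Binomial(100, 0.5)
p_one_sided = pbinom(58, size = 100, prob = 0.5, lower.tail = FALSE)

# Two-sided: binom.test() adds up the probabilities of all outcomes
# that are no more likely than the observed count
p_two_sided = binom.test(276, n = 500, p = 0.5,
                         alternative = "two.sided")$p.value

round(c(one_sided = p_one_sided, two_sided = p_two_sided), 3)
```

The simulated values will differ slightly from the exact ones, because 1000 replications only approximate the sampling distribution.
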
Idea: if populations are similar, then group information does not matter
Randomization/Permutation test: approximate the sampling distribution under \(H_0\) by repeatedly shuffling the group labels at random and recalculating the difference between groups
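
The shuffling idea can be sketched by hand in base R. The data here are simulated purely for illustration (the groups `A` and `B`, their sizes and their means are assumptions, not from the notes), and the difference in group means is just one possible test statistic.

```r
set.seed(1)  # for reproducibility

# Simulated illustration data (an assumption, not from the source)
group = rep(c("A", "B"), each = 30)
y = c(rnorm(30, mean = 10), rnorm(30, mean = 11))

# Test statistic: difference in group means
diff_means = function(y, g) mean(y[g == "A"]) - mean(y[g == "B"])
observed = diff_means(y, group)

# Shuffle the group labels 1000 times; under H0 the labels do not matter,
# so the shuffled differences approximate the sampling distribution
shuffled = replicate(1000, diff_means(y, sample(group)))

# Two-sided P-value: share of shuffled differences at least as extreme as observed
(P_value = mean(abs(shuffled) >= abs(observed)))
```

Calling `sample(group)` without further arguments returns a random permutation of the labels, so group sizes stay fixed across shuffles.
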
Use the `coin` package's `independence_test()` for hypothesis tests, e.g. to test whether `hrlyearn` differs by `sex`

    library(coin)
    lfs %>%
      filter(educ == 5) %>%
      mutate(sex = factor(sex, levels = 1:2, labels = c("M", "F"))) %>%
      independence_test(
        hrlyearn ~ sex, data = .,
        alternative = "two.sided", distribution = "approximate"
      )
    ##
    ##  Approximative General Independence Test
    ##
    ## data:  hrlyearn by sex (M, F)
    ## Z = 15.834, p-value < 2.2e-16
    ## alternative hypothesis: two.sided