- Examine hypothesis testing more closely
- Understand Error Types
- Use Power & Effect Size
- Identify hypothesis testing pitfalls
- Prevent Data-Snooping
- Employ best practices
- Readings
- ISRS: ch. 2.1-2.4
P-value: probability of observing a test statistic as extreme as, or more extreme than, the one observed, assuming \(H_0\) is true
Reject \(H_0\) if the P-value is smaller than a preset cutoff \(\alpha\), called the significance level
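As a minimal sketch of this decision rule (the simulated sample, \(\alpha = 0.05\), and the null value `mu = 0` are all assumptions for illustration):

```r
# Sketch: the reject / fail-to-reject decision at a preset significance level.
# Data are simulated; mu = 0 is the hypothetical null value.
set.seed(1)
x <- rnorm(30, mean = 0.5)           # sample whose true mean is actually 0.5
res <- t.test(x, mu = 0)             # test H0: mu = 0
alpha <- 0.05                        # preset significance level
p <- res$p.value
decision <- if (p < alpha) "Reject H0" else "Fail to reject H0"
```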
There is a trade-off between the two error types: lowering \(\alpha\) to reduce Type I errors makes Type II errors more likely
E.g. the same P-value can arise from a large \(n\) with a small \(\hat{\mu}\), or a small \(n\) with a large \(\hat{\mu}\)
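This can be made concrete with a back-of-the-envelope z-test calculation (the values of \(n\), \(\hat{\mu}\), and the unit standard deviation are made up for illustration):

```r
# Two hypothetical studies with identical P-values but very different n and effect:
#   study 1: large n, small estimated mean; study 2: small n, large estimated mean.
n1 <- 4000; mu_hat1 <- 0.05
n2 <- 40;   mu_hat2 <- 0.50
s <- 1                               # assume unit standard deviation for simplicity
z1 <- mu_hat1 / (s / sqrt(n1))       # standardized statistics come out identical
z2 <- mu_hat2 / (s / sqrt(n2))
p1 <- 2 * pnorm(-abs(z1))            # two-sided P-values
p2 <- 2 * pnorm(-abs(z2))
```

Both P-values are about 0.0016, yet the estimated effects differ by a factor of ten, which is why effect size should be reported alongside the P-value.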
```r
library(effsize)
library(dplyr)

lfs %>%
  filter(educ == 5) %>%
  mutate(sex = factor(sex)) %>%
  cohen.d(hrlyearn ~ sex, data = .)
```

```
## 
## Cohen's d
## 
## d estimate: 0.3361159 (small)
## 95 percent confidence interval:
##     lower     upper 
## 0.2947876 0.3774442
```
| Size | Effect |
|---|---|
| \(0.0 - 0.2\) | Negligible |
| \(0.2 - 0.5\) | Small |
| \(0.5 - 0.8\) | Medium |
| \(0.8+\) | Large |
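For reference, Cohen's d is just the difference in group means divided by the pooled standard deviation; a minimal hand-rolled version on simulated data (the group means and sizes are made up) might look like:

```r
# Sketch: Cohen's d computed by hand on simulated two-group data.
set.seed(7)
g1 <- rnorm(200, mean = 1.3)         # hypothetical group 1
g2 <- rnorm(200, mean = 1.0)         # hypothetical group 2
pooled_sd <- sqrt(((length(g1) - 1) * var(g1) + (length(g2) - 1) * var(g2)) /
                  (length(g1) + length(g2) - 2))
d <- (mean(g1) - mean(g2)) / pooled_sd   # population d here is 0.3, i.e. "small"
```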
At 5% significance, expect 1 in 20 (independent) tests to Reject \(H_0\) even when it is true!
Snoop around data long enough, and you are almost guaranteed to find significant results at the 5% level
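A quick simulation illustrates the point: repeat a test many times under a true \(H_0\) and count how often it rejects at the 5% level (the number of tests and the sample size are arbitrary choices):

```r
# Sketch: false rejections under a true H0 -- what data-snooping exploits.
set.seed(123)
n_tests <- 1000
pvals <- replicate(n_tests, t.test(rnorm(30), mu = 0)$p.value)  # H0 true each time
false_rejection_rate <- mean(pvals < 0.05)   # should land near 0.05
```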
How to prevent hypothesis testing misuse
Study pre-registration: specify research questions & methodology prior to data collection \(\rightarrow\) prevent data-snooping
Computational Reproducibility: publish all data & code for analysis \(\rightarrow\) prevent errors/manipulation
Report both positive and negative results \(\rightarrow\) prevent publication bias