9.5 Chapter 5
Bootstrapping: Resampling our sample (drawing observations from it with replacement) mimics the process of sampling from the full target population, using our best stand-in for the population: our sample. We bootstrap our sample in order to 1) estimate the variability of the statistic and 2) get a range of plausible values for the true population parameter.
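As a rough illustration of the two goals of bootstrapping, here is a minimal Python sketch using only the standard library. The data values, the function names `bootstrap_means` and `percentile_interval`, the number of resamples, and the choice of the mean as the statistic are all assumptions made for this example, and the percentile interval shown is only one simple way to get a range of plausible values.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=2000, seed=0):
    """Resample `sample` with replacement; return the mean of each resample."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample has the same size as the original sample.
    return [statistics.mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

def percentile_interval(values, level=0.95):
    """Return the endpoints of the middle `level` share of the sorted values."""
    ordered = sorted(values)
    tail = (1 - level) / 2
    lo = ordered[int(tail * len(ordered))]
    hi = ordered[int((1 - tail) * len(ordered)) - 1]
    return lo, hi

# Hypothetical sample of 10 measurements (made-up numbers for illustration).
sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 4.4]

boot_means = bootstrap_means(sample)
standard_error = statistics.stdev(boot_means)  # 1) variability of the statistic
interval = percentile_interval(boot_means)     # 2) plausible values for the parameter
print(standard_error, interval)
```

The spread of `boot_means` plays the role of the spread of the sampling distribution, which we cannot observe directly because we only have one sample.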
Null Hypothesis: A hypothesis that is conservative/not interesting/does not elicit action, typically of the form “there is no real relationship” or “there is no real difference.”
Parameter: A numerical summary of the population that is typically unknown but of interest.
Randomization Variability: If we could repeat the random assignment of units to treatment groups in an experiment, each treatment group would be slightly different each time, and so the numerical summaries of those groups would vary as well.
Randomization Test: Assume there is no real difference between treatment groups (the null hypothesis). Then randomly shuffle the treatment group labels to repeat the randomization process under that assumption, and record the difference between groups. Repeat many times to see how large a difference we would expect under the null hypothesis, and compare that with the difference actually observed.
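The shuffling procedure of a randomization test can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the outcome values are made up, the function name `randomization_test` is hypothetical, the statistic is a difference in group means, and "as extreme" is taken to be two-sided.

```python
import random
import statistics

def randomization_test(treatment, control, n_shuffles=5000, seed=0):
    """Shuffle group labels under the null and compare with the observed difference."""
    rng = random.Random(seed)
    observed = statistics.mean(treatment) - statistics.mean(control)
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    null_diffs = []
    for _ in range(n_shuffles):
        # Under the null hypothesis the labels are arbitrary, so reassign them at random.
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_treat]) - statistics.mean(pooled[n_treat:])
        null_diffs.append(diff)
    # Share of shuffles with a difference at least as extreme as the observed one.
    extreme = sum(abs(d) >= abs(observed) for d in null_diffs)
    return observed, null_diffs, extreme / n_shuffles

# Hypothetical outcomes for two groups of six units (made-up numbers).
treatment = [12.1, 14.3, 13.8, 15.0, 13.2, 14.7]
control = [11.0, 12.4, 11.9, 12.8, 10.7, 12.2]

observed, null_diffs, p_value = randomization_test(treatment, control)
print(observed, p_value)
```

A small share of shuffled differences at least as extreme as the observed one suggests that the observed difference would be unusual if the treatment had no real effect.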
Sampling Distribution: The distribution of values of a sample statistic across all possible random samples from the population (has a center, spread, and shape).
Sampling Variability: If we could repeat the random sampling process, each sample would be slightly different, and so the numerical summaries (statistics) of those samples would vary as well.
Simulating Randomization into Groups: Since we cannot repeat the entire experiment, we can simulate the randomization into groups and utilize the observed data by assuming the treatment has no real impact on the outcome. We can estimate the variability in the group differences under this assumption.
Simulating Sampling from a Population: If we had the population, we would simulate many possible random samples and estimate the variability in the sample statistics across random samples.
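To make the idea of simulating sampling from a population concrete, here is a minimal Python sketch. It assumes we somehow have the full population, which is here a synthetic list of 10,000 values generated only for illustration. The function name `simulate_sample_means`, the sample size of 25, and the use of the mean as the statistic are likewise assumptions.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of 10,000 values (synthetic, for illustration only).
population = [rng.gauss(50, 10) for _ in range(10_000)]

def simulate_sample_means(population, sample_size, n_samples, rng):
    """Draw many random samples (without replacement) and return each sample's mean."""
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

sample_means = simulate_sample_means(population, sample_size=25, n_samples=1000, rng=rng)

# The simulated sample means approximate the sampling distribution of the mean.
center = statistics.mean(sample_means)   # close to the population mean
spread = statistics.stdev(sample_means)  # estimated variability across samples
print(center, spread)
```

The list `sample_means` approximates the sampling distribution, so its center, spread, and shape can be inspected directly, for example with a histogram.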
Statistic: Any numerical summary of sample data.