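Hypothesis testing is a statistical procedure for deciding whether sample data provide enough evidence against a claim about a population.
The core logic of hypothesis testing: assume a claim about a population parameter is true, collect a sample, and calculate the probability that an outcome at least as extreme as the one observed could have happened given the claim is true.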
Examples include
- t-test (for sample means)
- z-test (for sample proportions)
- chi-square test (for sample categories)
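As a rough illustration of one of these, here is a minimal sketch of a two-sided z-test for a sample proportion, using only the Python standard library. The function name `one_proportion_z_test` and the numbers (60 successes out of 100, null proportion 0.5) are made up for the example, and the sketch assumes the sample is large enough for the normal approximation to be reasonable.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test for a sample proportion against a null value p0.

    Hypothetical helper for illustration; assumes the normal
    approximation to the sampling distribution is acceptable.
    """
    p_hat = successes / n
    # The standard error uses p0 because the null hypothesis is assumed true.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Two-sided p-value: probability of a z-score at least this far from 0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up data: 60 successes in 100 trials, testing against p0 = 0.5.
z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The t-test and chi-square test follow the same pattern: compute a test statistic, then ask how likely a value at least that extreme would be if the null hypothesis were true.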
Common to all hypothesis tests are the following terms.
null hypothesis
A null hypothesis is a “no difference” hypothesis created as a part of hypothesis testing. It is usually stated as an equality.
alternative hypothesis
The alternative hypothesis is the “new news” hypothesis created as a part of hypothesis testing: confirming it would introduce new information. It is usually stated as an inequality.
p-value
The p-value of a hypothesis test is the probability of obtaining results at least as extreme as the ones observed, given that the null hypothesis is true. That is:
\begin{equation} P(\text{result at least as extreme as } \hat{p} \mid H_0 \text{ true}) \end{equation}
To figure out the above probability, you could either simulate the sampling process under the null hypothesis and look at a histogram of the simulated results (more common in AP Statistics anyway) or calculate a few other statistics. We will talk about them later.
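The simulation route can be sketched roughly as follows: generate many samples as if the null hypothesis were true, then count how often the simulated result is at least as far from the null value as the observed one. The function name `simulated_p_value`, the trial count, the seed, and the data (60 successes out of 100, null proportion 0.5) are all assumptions made for this example.

```python
import random


def simulated_p_value(observed_successes, n, p0, trials=10_000, seed=0):
    """Estimate a two-sided p-value for a proportion by simulation.

    Hypothetical helper: draws `trials` samples of size n assuming the
    null proportion p0 is true, and returns the fraction of samples that
    land at least as far from the expected count as the observed one.
    """
    rng = random.Random(seed)
    expected = n * p0
    observed_distance = abs(observed_successes - expected)
    at_least_as_extreme = 0
    for _ in range(trials):
        # One simulated sample under H0: each trial succeeds with probability p0.
        successes = sum(rng.random() < p0 for _ in range(n))
        if abs(successes - expected) >= observed_distance:
            at_least_as_extreme += 1
    return at_least_as_extreme / trials


# Made-up data: 60 successes in 100 trials, testing against p0 = 0.5.
p_value = simulated_p_value(observed_successes=60, n=100, p0=0.5)
print(f"simulated p-value: {p_value:.3f}")
```

The list of simulated success counts is exactly what the histogram would be drawn from; the p-value is the share of that histogram sitting in the tails beyond the observed result.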
To use a p-value in a hypothesis test, the sample has to meet the conditions for inference.
See also p-value from bootstrap
Type I Error
A Type I Error takes place when you reject the null hypothesis during hypothesis testing even though it is true: i.e., a false positive.
The probability of having a Type I Error is the significance level of the test.
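One way to see this is to simulate many samples in a world where the null hypothesis really is true and count how often the test rejects it anyway. The sketch below swaps in a z-test on a mean with a known standard deviation, chosen only because its null distribution is exactly normal; the function names, sample size, trial count, and seed are assumptions for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejects_null(sample, mu0, sigma, alpha):
    """Two-sided z-test on a mean with known sigma (hypothetical helper)."""
    z = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def type_i_error_rate(alpha, n=30, trials=5_000, seed=1):
    """Fraction of samples that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Data are generated with mean 0, so H0: mean = 0 is true here.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if rejects_null(sample, mu0=0.0, sigma=1.0, alpha=alpha):
            rejections += 1
    return rejections / trials


rate = type_i_error_rate(alpha=0.05)
print(f"estimated Type I error rate: {rate:.3f}")
```

Under these assumptions the estimated rate should land close to the chosen significance level, up to simulation noise.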
Type II Error
A Type II Error takes place when you fail to reject the null hypothesis during hypothesis testing even though it is false: i.e., a false negative.
The probability of having a Type II Error is the complement of the power of the test, that is, one minus the power.
significance level
The significance level is the threshold below which a p-value is taken as sufficient evidence to reject the null hypothesis. We usually use the letter \(\alpha\) to denote this.
power (statistics)
Power is a probability calculable during hypothesis testing. It is the probability of rejecting the null hypothesis given that the null hypothesis is false. It is the complement of the probability of a Type II Error.
Power increases as the significance level increases, but then the probability of a Type I Error increases as well.
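The trade-off can be sketched with a small simulation: generate samples in a world where the null hypothesis is false, and count how often the test rejects it at different significance levels. As an assumption for this example, the test is a two-sided z-test on a mean with known standard deviation 1, the null mean is 0, and the true mean is 0.4; the function name, sample size, trial count, and seed are likewise made up.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def estimated_power(alpha, true_mean, n=30, trials=5_000, seed=2):
    """Fraction of samples that reject H0: mean = 0 when the mean is true_mean.

    Hypothetical helper; assumes normal data with known sigma = 1.
    """
    rng = random.Random(seed)
    # Reject when |z| exceeds the two-sided critical value for this alpha.
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = fmean(sample) / (1.0 / sqrt(n))
        if abs(z) > critical_z:
            rejections += 1
    return rejections / trials


# Same simulated samples (same seed), three different significance levels.
power_01 = estimated_power(alpha=0.01, true_mean=0.4)
power_05 = estimated_power(alpha=0.05, true_mean=0.4)
power_10 = estimated_power(alpha=0.10, true_mean=0.4)

for alpha, power in [(0.01, power_01), (0.05, power_05), (0.10, power_10)]:
    # The Type II error probability is the complement of the power.
    print(f"alpha = {alpha:.2f}: power = {power:.3f}, Type II = {1 - power:.3f}")
```

A larger significance level lowers the bar for rejection, so more of the simulated samples reject the false null hypothesis; the cost is that the same lower bar also lets through more rejections when the null hypothesis is true.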