From Confidence Intervals to Hypothesis Testing
Basic Concept of Statistical Inference
1 Notations
- \(\theta\): parameter of random variable \(X\)
- \(\hat{\theta} = \hat{\theta}\left(X_1, \ldots, X_n\right)\): estimate of parameter \(\theta\)
- \(F_{\theta}\left(x\right)=\mathbb{P}\left(X\leq x\right)\): CDF of r.v. \(X\), i.e., \(X \sim F_{\theta}\)
- \(q_{p} = F_{\theta}^{-1}\left(p\right)\): p-quantile of \(F_{\theta}\)
- \(\iff p = \mathbb{P}\left(X \leq q_{p}\right)\)
- \(\mathrm{SE} := \mathrm{SD}\left(\hat{\theta}\right)\): standard error of estimate \(\hat{\theta}\)
When the argument of SD is an estimate, the SD is called the standard error (SE).
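As a minimal sketch of these notations, using SciPy's standard normal as an assumed \(F_{\theta}\) and a small hypothetical sample for the estimate:

```python
import numpy as np
from scipy.stats import norm

# Quantile as the inverse CDF: q_p = F^{-1}(p), so that P(X <= q_p) = p.
p = 0.975
q_p = norm.ppf(p)            # roughly 1.96 for the standard normal
p_back = norm.cdf(q_p)       # recovers p = P(X <= q_p)

# Standard error: the SD of an estimate. Here the estimate is the sample
# mean of a hypothetical sample, so SE = s / sqrt(n).
x = np.array([4.1, 5.3, 3.8, 6.0, 5.2, 4.7])
n = len(x)
theta_hat = x.mean()                     # estimate of theta
se = x.std(ddof=1) / np.sqrt(n)          # estimated SD of the sample mean

print(q_p, p_back, theta_hat, se)
```

The sample values here are made up purely for illustration; any dataset and any estimator with a computable SD would do.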
2 Confidence Interval (CI)
2.1 Pivot (Pivotal quantity)
A pivot \(T\) is a function of the sample and the parameter that satisfies the following conditions:
- \(T\) involves the unknown parameter \(\theta\) of the random variable \(X\)
- Its distribution \(F_T\) does not depend on \(\theta\)
- \(F_T\) is a known distribution.
2.2 Derive CI
Set the pivot \(T = \frac{\hat{\theta}-\theta}{\mathrm{SE}\left(\hat{\theta}\right)} \sim F_T\).
The estimate \(\hat{\theta}\) need not be unbiased; in general \(\mathbb{E}\left(\hat{\theta}\right) \neq \theta\), so the pivot is not necessarily a standardized quantity.
Then the following holds:
\[ \begin{align*} P\left( q_{\alpha/2} \le T \le q_{1-\alpha/2} \right) &= F_T\left(q_{1-\alpha/2}\right) - F_T\left(q_{\alpha/2}\right) \\ &= 1 - \alpha \end{align*} \]
Apply \(T = \frac{\hat{\theta}-\theta}{\mathrm{SE}}\),
\[ \mathbb{P}\left( q_{\alpha/2} \leq \frac{\hat{\theta} - \theta}{\mathrm{SE}\left(\hat{\theta}\right) } \leq q_{1-\alpha/2}\right) = 1 - \alpha \]
\[ \mathbb{P}\left( \hat{\theta} - q_{1-\alpha/2}\cdot \mathrm{SE}\left(\hat{\theta}\right) \leq \theta \leq \hat{\theta} - q_{\alpha/2}\cdot \mathrm{SE}\left(\hat{\theta}\right) \right) = 1 - \alpha \]
2.3 Conclusion (CI)
The confidence interval of parameter \(\theta\) is:
\[ \theta \in \left(\hat{\theta} - q_{1-\alpha/2}\cdot \mathrm{SE}\left(\hat{\theta}\right) , \quad \hat{\theta} - q_{\alpha/2}\cdot \mathrm{SE}\left(\hat{\theta}\right) \right) \]
2.4 Examples
2.4.1 Symmetric pivot:
If the pivot's distribution is symmetric, i.e., \(q_{\alpha/2} = -q_{1-\alpha/2}\) for any \(\alpha\), then the CI simplifies to:
\[ \theta \in \left(\hat{\theta} - q_{1-\alpha/2}\cdot \mathrm{SE}\left(\hat{\theta}\right) , \quad \hat{\theta} + q_{1-\alpha/2}\cdot \mathrm{SE}\left(\hat{\theta}\right) \right) \]
Simply, the CI is:
\[ \hat{\theta} \pm q_{1-\alpha/2}\cdot \mathrm{SE}\left(\hat{\theta}\right) \]
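For instance, the Student-t pivot for a normal mean has a symmetric distribution, so this simplified form applies. A sketch with a hypothetical sample, assuming SciPy is available:

```python
import numpy as np
from scipy.stats import t

# Hypothetical sample; the pivot (xbar - mu)/SE follows a t-distribution
# with n-1 degrees of freedom, which is symmetric about zero.
x = np.array([4.1, 5.3, 3.8, 6.0, 5.2, 4.7])
n = len(x)
alpha = 0.05

theta_hat = x.mean()
se = x.std(ddof=1) / np.sqrt(n)
q = t.ppf(1 - alpha / 2, df=n - 1)   # q_{1-alpha/2}; symmetry gives q_{alpha/2} = -q

# CI = theta_hat +/- q * SE
ci = (theta_hat - q * se, theta_hat + q * se)
print(ci)
```

Because only the single quantile \(q_{1-\alpha/2}\) is needed, the interval is automatically centered at \(\hat{\theta}\).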
2.4.2 Normal distribution:
If the estimate \(\hat{\theta}\) is (approximately) normal and unbiased, the pivot is the standardized estimate:
\[ \begin{align*} T &= \frac{\hat{\theta}-\theta}{\mathrm{SE}\left(\hat{\theta}\right)} \\ &= \frac{\hat{\theta}-\mathbb{E}\left(\hat{\theta}\right)}{\mathrm{SD}\left(\hat{\theta}\right)} \sim N\left(0,1\right) \end{align*} \]
Thus the quantiles of the pivot are those of the standard normal distribution: \[z_{p} = \Phi^{-1}\left(p\right)\] where \(\Phi\) is the CDF of the standard normal distribution.
The CI is then: \[ \hat{\theta} \pm z_{1-\alpha/2}\cdot \mathrm{SE}\left(\hat{\theta}\right) \]
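A minimal sketch of this z-based CI, with a hypothetical estimate and standard error (SciPy assumed):

```python
from scipy.stats import norm

# Hypothetical values: an estimate and its standard error.
theta_hat = 4.85
se = 0.33
alpha = 0.05

z = norm.ppf(1 - alpha / 2)   # z_{1-alpha/2} = Phi^{-1}(0.975), roughly 1.96
ci = (theta_hat - z * se, theta_hat + z * se)
print(ci)
```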
3 Hypothesis Testing
3.1 Definition: p-value (\(p\))
The p-value is the probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis (\(H_0\)) is correct.
In other words, it measures how surprising the observed result is under \(H_0\).
3.1.1 Small p-value
- This result would be very rare by chance.
- The observed statistic lies in the tail (a rare/surprising result).
- It provides strong evidence against “no difference/effect”, i.e., we Reject \(H_0\).
3.1.2 Large p-value
- This result is easy to get by chance.
- It is not evidence that there is no effect (we just lack sufficient evidence to reject it).
- Therefore, we Maintain \(H_0\) (Fail to reject).
3.2 Significance Level (\(\alpha\)) and Decision Rule
Based on a pre-defined significance level \(\alpha\), we determine the result as follows:
If \(p < \alpha\):
    the p-value is small; reject \(H_0\)
Else:
    the p-value is large; maintain \(H_0\)
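The decision rule above can be sketched as a small helper function (the name `decide` is an illustrative choice):

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a given p-value and significance level."""
    if p_value < alpha:
        return "reject H0"        # result is too surprising under H0
    return "fail to reject H0"    # not enough evidence against H0

print(decide(0.01), decide(0.20))
```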
3.3 Hypothesis
- \(H_0\): Null hypothesis.
- e.g. \(H_0: \theta = \theta_0\) (no difference)
- \(H_1\): Alternative hypothesis
- e.g. \(H_1: \theta \neq \theta_0\) (significant difference)
3.5 Examples
3.5.1 Normal Distribution:
Let’s examine the relationship between the range of the CI and the area of the p-value using an estimator that follows a normal distribution.
Assume that under the null hypothesis of the two-sided test, \(H_0: \theta = \theta_0\), the test statistic \(T\) follows a standard normal distribution:
\[T = \frac{\hat{\theta} - \theta_0}{\mathrm{SE}\left(\hat{\theta}\right)} \sim N\left(0, 1\right)\]
3.5.1.1 CI Range (\(1 - \alpha\) Area):
The confidence interval covers the central \(1-\alpha\) area of the sampling distribution, centered at the observed \(\hat{\theta}\).
\[ \theta \in \left(\hat{\theta} - z_{1-\alpha/2}\cdot \mathrm{SE}\left(\hat{\theta}\right), \quad \hat{\theta} + z_{1-\alpha/2}\cdot \mathrm{SE}\left(\hat{\theta}\right)\right) \]
3.5.1.2 p-value Range (Tail Area):
The p-value is the sum of the extreme probabilities in both tails, based on the observed test statistic \(t_{obs}\).
\[ p\text{-value} = \mathbb{P}\left(|T| \geq |t_{obs}|\right) = 2 \times \left(1 - \Phi\left(|t_{obs}|\right)\right) \]
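A sketch of this two-sided p-value computation, with hypothetical values for the estimate, the hypothesized \(\theta_0\), and the standard error (SciPy assumed):

```python
from scipy.stats import norm

# Hypothetical observed quantities under H0: theta = theta_0.
theta_hat, theta_0, se = 5.5, 4.85, 0.33

t_obs = (theta_hat - theta_0) / se           # observed test statistic
p_value = 2 * (1 - norm.cdf(abs(t_obs)))     # sum of both tail areas
print(t_obs, p_value)
```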
3.5.1.3 Complementary Relationship:
If the calculated p-value is less than the significance level \(\alpha\) (\(p < \alpha\)), it means the observed \(\hat{\theta}\) is so far from the assumed \(\theta_0\) that the p-value tail area falls inside the \(\alpha\) rejection region.
Spatially, this means \(\theta_0\) is located outside the Confidence Interval (CI) (which represents the \(1-\alpha\) area).
Conversely, if the p-value is greater than or equal to the significance level (\(p \ge \alpha\)), \(\theta_0\) is safely included within the Confidence Interval.
Thus, the size of the tail area represented by the p-value and whether the parameter is included in the CI range yield mathematically identical conclusions, just like two sides of the same coin.
3.6 Summary: Two Equivalent Decision Rules
In hypothesis testing, there are two primary decision rules to determine whether to reject the null hypothesis (\(H_0: \theta = \theta_0\)).
These two approaches are mathematically equivalent and will always lead to the exact same conclusion.
| Decision Rule | Reject \(H_0\) | Maintain \(H_0\) | Core Intuition |
|---|---|---|---|
| Confidence Interval | \(\theta_0 \notin \text{CI}\) | \(\theta_0 \in \text{CI}\) | Does the assumed parameter \(\theta_0\) lie outside our estimated plausible range? |
| p-value | \(p < \alpha\) | \(p \ge \alpha\) | Is the observed data too surprising/extreme to have occurred by chance under \(H_0\)? |
Note: The Confidence Interval approach checks the location of the hypothesized parameter, while the p-value approach evaluates the probability of the observed data.
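The equivalence of the two rules can be checked numerically. In this sketch (hypothetical estimate and SE, SciPy assumed), one candidate \(\theta_0\) lies outside the CI and one lies inside, and in both cases the CI rule and the p-value rule agree:

```python
from scipy.stats import norm

# Hypothetical data: an estimate, its SE, and a significance level.
theta_hat, se, alpha = 5.5, 0.33, 0.05
z = norm.ppf(1 - alpha / 2)
ci = (theta_hat - z * se, theta_hat + z * se)

for theta_0 in (4.5, 5.4):                    # one outside the CI, one inside
    t_obs = (theta_hat - theta_0) / se
    p_value = 2 * (1 - norm.cdf(abs(t_obs)))
    reject_by_ci = not (ci[0] <= theta_0 <= ci[1])
    reject_by_p = p_value < alpha
    assert reject_by_ci == reject_by_p        # the two rules always coincide
    print(theta_0, p_value, reject_by_ci)
```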