

One- and Two-Sample Tests of Hypotheses

Testing a Statistical Hypothesis

  • Type I error: rejection of the null hypothesis when it is true
  • Type II error: non-rejection of the null hypothesis when it is false
  • Significance level: the probability of committing a type I error, denoted by the Greek letter $\alpha$.
  • The probability of committing a type II error, denoted by $\beta$, cannot be computed unless a specific alternative hypothesis is assumed.
  • The probabilities of committing both types of error can be reduced by increasing the sample size.
  • $P$-value is the lowest level of significance at which the observed value of the test statistic is significant.
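These definitions can be made concrete with a minimal sketch (all numbers hypothetical) that computes $\alpha$, $\beta$, and the power $1 - \beta$ for a one-sided $z$-test with a fixed rejection cutoff:

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

# Hypothetical setup: H0: mu = 50 vs H1: mu > 50, known sigma = 10, n = 25.
mu0, sigma, n = 50.0, 10.0, 25
se = sigma / sqrt(n)                        # standard error of the sample mean

# Reject H0 when xbar > 53 (an arbitrary cutoff chosen for illustration).
cutoff = 53.0
alpha = 1 - norm_cdf((cutoff - mu0) / se)   # P(reject H0 | H0 true)

# beta is computable only for a *specific* alternative, e.g. mu = 55.
mu1 = 55.0
beta = norm_cdf((cutoff - mu1) / se)        # P(fail to reject | mu = mu1)
power = 1 - beta
```

Increasing $n$ shrinks the standard error, which drives both $\alpha$ and $\beta$ down for a suitably adjusted cutoff.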

Single Sample: Tests Concerning a Single Mean

Let two hypotheses be:

  • $H_0: \mu = \mu_0$
  • $H_1: \mu \ne \mu_0$

Tests on a Single Mean (Variance Known)

For $\displaystyle z = {\bar x - \mu_0 \over \sigma / \sqrt n}$,

  • If $-z_{\alpha/2} < z < z_{\alpha/2}$, do not reject $H_0$.
  • Otherwise, reject $H_0$.

Note that $-z_{\alpha/2} \le z \le z_{\alpha/2}$ is equivalent to $\displaystyle \bar x - z_{\alpha/2} {\sigma \over \sqrt n} \le \mu_0 \le \bar x + z_{\alpha/2} {\sigma \over \sqrt n}$; that is, $H_0$ is not rejected exactly when $\mu_0$ lies inside the $100(1-\alpha)\%$ confidence interval for $\mu$.
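A minimal sketch of this decision rule (the rule is from the notes above; the data are hypothetical):

```python
from math import sqrt

# Hypothetical numbers: test H0: mu = 8 vs H1: mu != 8 with known sigma.
xbar, mu0, sigma, n = 7.8, 8.0, 0.5, 50
z_crit = 1.96                          # z_{0.025} for a two-tailed test at alpha = 0.05

z = (xbar - mu0) / (sigma / sqrt(n))   # test statistic
reject = not (-z_crit < z < z_crit)    # reject H0 if z falls outside the interval
```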

Tests on a Single Mean (Variance Unknown)

For $\displaystyle t = {\bar x - \mu_0 \over s/ \sqrt n}$,

  • If $-t_{\alpha/2, n-1} < t < t_{\alpha/2, n-1}$, do not reject $H_0$.
  • Otherwise, reject $H_0$.
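The same rule with an estimated standard deviation, sketched on a hypothetical sample ($t_{0.025, 9} = 2.262$ is from a standard $t$-table):

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample; test H0: mu = 10 at alpha = 0.05 (two-tailed).
data = [9.8, 10.2, 10.4, 9.9, 10.0, 10.1, 9.7, 10.3, 10.5, 9.6]
mu0 = 10.0
n = len(data)

t = (mean(data) - mu0) / (stdev(data) / sqrt(n))   # stdev is the sample s

t_crit = 2.262                                      # t_{0.025, n-1} with n-1 = 9
reject = not (-t_crit < t < t_crit)
```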

Two Samples: Tests on Two Means

Let two hypotheses be:

  • $H_0 : \mu_1 - \mu_2 = d_0$
  • $H_1 : \mu_1 - \mu_2 \neq d_0$

Variance Known

For $\displaystyle z = {(\bar x_1 - \bar x_2) - d_0 \over \sqrt{\sigma_1^2/n_1 + \sigma_2^2 /n_2}}$,

  • If $-z_{\alpha/2} < z < z_{\alpha/2}$, do not reject $H_0$.
  • Otherwise, reject $H_0$.
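A sketch of the two-sample $z$-test from hypothetical summary statistics:

```python
from math import sqrt

# Hypothetical summary data; test H0: mu1 - mu2 = 0 with known variances.
x1bar, x2bar, d0 = 85.0, 81.0, 0.0
sig1_sq, sig2_sq = 16.0, 25.0
n1, n2 = 40, 50

z = (x1bar - x2bar - d0) / sqrt(sig1_sq / n1 + sig2_sq / n2)
reject = not (-1.96 < z < 1.96)        # two-tailed test at alpha = 0.05
```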

Unknown But Equal Variances

Let $\displaystyle s_p^2 = {s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1) \over n_1 + n_2 - 2}$ (the pooled variance).

For $\displaystyle t = {(\bar x_1 - \bar x_2)-d_0 \over s_p \sqrt{1/n_1 + 1/n_2}}$,

  • If $-t_{\alpha/2, n_1 + n_2 - 2} < t < t_{\alpha/2, n_1 + n_2 - 2}$, do not reject $H_0$.
  • Otherwise, reject $H_0$.
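A sketch of the pooled test on hypothetical samples; note that the pooled quantity is a variance, so its square root enters the denominator ($t_{0.025, 8} = 2.306$ from a $t$-table):

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical samples; test H0: mu1 - mu2 = 0 assuming equal variances.
a = [12.1, 11.8, 12.4, 12.0, 12.2]
b = [11.5, 11.9, 11.6, 11.4, 11.8]
n1, n2, d0 = len(a), len(b), 0.0

# Pooled sample variance, then its square root in the test statistic.
sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
t = (mean(a) - mean(b) - d0) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))

t_crit = 2.306                         # t_{0.025, n1+n2-2} with 8 d.f.
reject = not (-t_crit < t < t_crit)
```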

Unknown and Unequal Variances

For $\displaystyle t = {(\bar x_1 - \bar x_2)-d_0 \over \sqrt{s_1^2/n_1 + s_2^2 /n_2}}$ and d.f. $\displaystyle v = {(s_1^2 /n_1 + s_2^2 / n_2)^2 \over (s_1^2/n_1)^2/(n_1 - 1) + (s_2^2 /n_2)^2 / (n_2 - 1)}$,

  • If $-t_{\alpha/2, v} < t < t_{\alpha/2, v}$, do not reject $H_0$.
  • Otherwise, reject $H_0$.
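The Welch statistic and the degrees-of-freedom approximation, sketched on hypothetical samples (in practice $v$ is usually rounded down to an integer before consulting a $t$-table):

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical samples with visibly unequal spreads.
a = [5.2, 4.8, 5.5, 5.0, 4.9, 5.3]
b = [4.1, 4.6, 3.9, 4.4]
n1, n2 = len(a), len(b)

v1, v2 = variance(a) / n1, variance(b) / n2    # s_i^2 / n_i terms

t = (mean(a) - mean(b)) / sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
```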

Paired Observations

Let $\bar D$ be the sample mean and $S_d$ the sample standard deviation of the differences of the paired observations.

For $\displaystyle t = {\bar D - \mu_d \over S_d / \sqrt n}$,

  • If $-t_{\alpha/2, n-1} < t < t_{\alpha/2, n-1}$, do not reject $H_0$.
  • Otherwise, reject $H_0$.
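A sketch of the paired test on hypothetical before/after measurements for the same subjects ($t_{0.025, 4} = 2.776$ from a $t$-table):

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired measurements; H0: mu_d = 0.
before = [220, 235, 210, 240, 225]
after  = [215, 228, 208, 230, 222]
d = [x - y for x, y in zip(before, after)]     # per-subject differences

n = len(d)
t = (mean(d) - 0) / (stdev(d) / sqrt(n))       # mu_d = 0 under H0

t_crit = 2.776                                  # t_{0.025, n-1} with 4 d.f.
reject = not (-t_crit < t < t_crit)
```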

Choice of Sample Size for Testing Means

Using $z$-value

  • $H_0 : \mu = \mu_0$
  • Alternative: $\mu = \mu_0 + \delta$

Note that $1 - \beta$ is the power of the test.

  • One-tailed test: $\displaystyle n = {(z_{\alpha} + z_{\beta})^2 \sigma^2 \over \delta^2}$
  • Two-tailed test (approximately): $\displaystyle n = {(z_{\alpha/2}+z_{\beta})^2 \sigma^2 \over \delta^2}$
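A sketch of the one-tailed sample-size formula with hypothetical design targets ($z_{0.05} = 1.645$ and $z_{0.10} = 1.282$ from a normal table):

```python
from math import ceil

# Hypothetical design: detect a shift delta = 2 when sigma = 5,
# at alpha = 0.05 (one-tailed) with power 1 - beta = 0.90.
z_alpha = 1.645    # z_{0.05}
z_beta  = 1.282    # z_{0.10}
sigma, delta = 5.0, 2.0

n = (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
n_required = ceil(n)    # round up to the next whole observation
```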

Two-Sample Case

Suppose $\sigma_1$ and $\sigma_2$ are known.

  • One-tailed test: $\displaystyle n = {(z_\alpha + z_\beta)^2 (\sigma_1^2 + \sigma_2^2) \over \delta^2}$
  • Two-tailed test (approximately): $\displaystyle n = {(z_{\alpha/2} + z_{\beta})^2 (\sigma_1^2 + \sigma_2^2) \over \delta^2}$

Suppose $\sigma_1$ and $\sigma_2$ are unknown.

  • The non-central $t$-distribution should be used
  • Note that $\displaystyle \Delta = {|\delta| \over \sigma}$

One Sample: Test on a Single Proportion

Small Samples

  • $H_0 : p = p_0$, $H_1 : p < p_0, p > p_0$, or $p \neq p_0$
  • Choose $\alpha$, a level of significance
  • Test statistic: binomial variable $X$ with $p = p_0$
  • Computations: find $x$, the number of successes, and compute the appropriate $P$-value
  • Make conclusions based on the $P$-value

If we use the normal approximation, the $z$-value for $p = p_0$ is $\displaystyle z = {\hat p - p_0 \over \sqrt{p_0q_0/n}}$, where $\hat p = x/n$ and $q_0 = 1 - p_0$.
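A sketch of the normal approximation with a hypothetical count of successes, including the two-tailed $P$-value:

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

# Hypothetical: 60 successes in 150 trials; H0: p = 0.5 vs H1: p != 0.5.
x, n, p0 = 60, 150, 0.5
q0 = 1 - p0
p_hat = x / n

z = (p_hat - p0) / sqrt(p0 * q0 / n)
p_value = 2 * (1 - norm_cdf(abs(z)))   # two-tailed P-value
```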

Two Samples: Tests on Two Proportions

  • $H_0: p_1 = p_2$
  • pooled estimate of the proportion $\displaystyle \hat p = {x_1 + x_2 \over n_1 + n_2}$
  • $\displaystyle z = {\hat p_1 - \hat p_2 \over \sqrt{\hat p \hat q (1 / n_1 + 1/n_2)}}$
  • Two-tailed test: do not reject $H_0$ if $|z| < z_{\alpha/2}$
  • One-tailed tests: for $H_1: p_1 > p_2$, do not reject if $z < z_{\alpha}$; for $H_1: p_1 < p_2$, do not reject if $z > -z_{\alpha}$
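A sketch of the pooled two-proportion test with hypothetical counts:

```python
from math import sqrt

# Hypothetical counts: x1 successes out of n1 trials, x2 out of n2; H0: p1 = p2.
x1, n1 = 120, 200
x2, n2 = 240, 500

p1_hat, p2_hat = x1 / n1, x2 / n2
p_hat = (x1 + x2) / (n1 + n2)          # pooled estimate of the common proportion
q_hat = 1 - p_hat

z = (p1_hat - p2_hat) / sqrt(p_hat * q_hat * (1 / n1 + 1 / n2))
reject = abs(z) >= 1.96                # two-tailed test at alpha = 0.05
```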

Concerning Variances

Single Sample

Suppose we test $\sigma^2 = \sigma_0^2$.

  • $\displaystyle \chi^2 = {(n-1)s^2 \over \sigma_0^2}$ with $n - 1$ d.f.
  • Do not reject $H_0$ if $\chi_{1-\alpha/2}^2 < \chi^2 < \chi_{\alpha/2}^2$
  • For $H_1: \sigma^2 < \sigma_0^2$, do not reject if $\chi^2 > \chi_{1-\alpha}^2$
  • For $H_1: \sigma^2 > \sigma_0^2$, do not reject if $\chi^2 < \chi_\alpha^2$
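A sketch of the two-tailed variance test from hypothetical summary statistics ($\chi^2_{0.975,9} = 2.700$ and $\chi^2_{0.025,9} = 19.023$ from a chi-square table):

```python
# Hypothetical: n = 10, s^2 = 1.44; test H0: sigma^2 = 1 at alpha = 0.05.
n, s2, sigma0_sq = 10, 1.44, 1.0

chi2 = (n - 1) * s2 / sigma0_sq        # test statistic with n - 1 = 9 d.f.

lo, hi = 2.700, 19.023                 # chi^2_{0.975, 9} and chi^2_{0.025, 9}
reject = not (lo < chi2 < hi)
```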

Two Samples

Suppose we test $H_0: \sigma_1^2 = \sigma_2^2$.

  • $\displaystyle f = {s_1^2 \over s_2^2}$
  • Then $f$ follows an $F$-distribution with $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$ d.f.
  • Do not reject $H_0$ if $f_{1-\alpha/2}(v_1, v_2) < f < f_{\alpha/2}(v_1, v_2)$
  • For $H_1: \sigma_1^2 < \sigma_2^2$, do not reject if $f > f_{1-\alpha}(v_1, v_2)$
  • For $H_1: \sigma_1^2 > \sigma_2^2$, do not reject if $f < f_\alpha(v_1, v_2)$
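A sketch of the two-tailed $F$-ratio test from hypothetical summary statistics, using the table identity $f_{1-\alpha/2}(v_1, v_2) = 1/f_{\alpha/2}(v_2, v_1)$ ($f_{0.025}(9,9) = 4.03$ from an $F$-table):

```python
# Hypothetical summary statistics: s1^2 = 0.064 (n1 = 10), s2^2 = 0.040 (n2 = 10).
s1_sq, s2_sq = 0.064, 0.040

f = s1_sq / s2_sq          # F statistic with (9, 9) d.f.

f_hi = 4.03                # f_{0.025}(9, 9) from an F-table
f_lo = 1 / f_hi            # f_{0.975}(9, 9) via the reciprocal identity
reject = not (f_lo < f < f_hi)
```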

Goodness of Fit Test

We want to test whether a sample follows a specified distribution.

Example: is a die balanced?

  • $\displaystyle \chi^2 = \sum_{i=1}^k {(o_i - e_i)^2 \over e_i}$ where
    • $o_i$: observed frequencies
    • $e_i$: expected frequencies
  • Test $\chi^2 > \chi_\alpha^2$ with $k - 1$ d.f. (one additional d.f. is lost for each parameter estimated from the data)
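The balanced-die example in code, with 120 hypothetical rolls ($\chi^2_{0.05, 5} = 11.070$ from a chi-square table):

```python
# 120 hypothetical rolls; each face is expected 20 times under H0 (balanced die).
observed = [18, 25, 17, 22, 19, 19]
expected = [20] * 6

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

reject = chi2 > 11.070     # chi^2_{0.05, 5}, since k - 1 = 5 d.f.
```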

We can also test a normality assumption this way, by grouping the data into classes and comparing observed and expected class frequencies.

Note: Geary's test

  • Suppose $X_1, \cdots, X_n$ is taken from $N(\mu, \sigma^2)$
  • Consider $\displaystyle U = {\sqrt{\pi/2}\, \sum_{i=1}^n |X_i - \bar X| / n \over \sqrt{\sum_{i=1}^n (X_i - \bar X)^2/n}}$
  • Under normality, $\displaystyle z = {U - 1 \over 0.2661 / \sqrt n}$ is approximately standard normal for large $n$
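A sketch of Geary's statistic on a hypothetical sample; $U$ is the ratio of the mean absolute deviation to the RMS deviation, scaled so that $U \approx 1$ under normality:

```python
from math import pi, sqrt
from statistics import mean

# Hypothetical sample to screen for normality.
data = [2.1, 1.9, 2.3, 2.0, 1.8, 2.2, 2.4, 1.7, 2.0, 2.1]
n = len(data)
xbar = mean(data)

mad = sum(abs(x - xbar) for x in data) / n            # mean absolute deviation
rms = sqrt(sum((x - xbar) ** 2 for x in data) / n)    # RMS deviation

u = sqrt(pi / 2) * mad / rms
z = (u - 1) / (0.2661 / sqrt(n))    # compare |z| against z_{alpha/2}
```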

Test for Independence (Categorical Data)

To test for independence, the expected frequency of a cell is the product of its row total and column total divided by the grand total.

  • First, we calculate $\displaystyle \chi^2 = \sum_{i} {(o_i - e_i)^2 \over e_i}$
  • Now, test $\chi^2 > \chi_{\alpha}^2$ with $v = (r-1)(c-1)$ d.f.
  • Note that if $v = 1$, we should apply Yates' continuity correction and use $\displaystyle \chi^2 = \sum_i {(|o_i - e_i| - 0.5)^2 \over e_i}$ instead.
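A sketch of the independence test on a hypothetical $2 \times 3$ contingency table ($\chi^2_{0.05, 2} = 5.991$ from a chi-square table):

```python
# Hypothetical 2x3 contingency table (rows: groups, columns: responses).
table = [[30, 20, 10],
         [20, 30, 40]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(table):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand   # expected cell frequency
        chi2 += (o - e) ** 2 / e

df = (len(table) - 1) * (len(table[0]) - 1)          # (r-1)(c-1) = 2
reject = chi2 > 5.991                                # chi^2_{0.05, 2}
```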

Test for Homogeneity

We want to test whether the row populations are homogeneous with respect to the column categories.

  • Compute $\chi^2$ as in the test for independence and reject $H_0$ if $\chi^2 > \chi_\alpha^2$

Testing for Several Proportions

Treat each of the $k$ samples as binomial and test $H_0: p_1 = p_2 = \cdots = p_k$ with the same $\chi^2$ statistic as in the test for homogeneity.
