AP Statistics Audio Lectures

Hypothesis Tests

by Arnold Kling

Focus is still on terminology

Continue to assume that standard deviation is known, and we take a sample and find the mean. We have a hypothesis about the mean that we want to test.

Calculation goes from natural units to a percentile

Example: Mean longevity for people born in September is 77.8 years. Take a sample of people born in May and see whether they have the same longevity or a shorter one. Suppose that a sample of 1600 people born in May has a mean longevity of 77.5 years and the population standard deviation is known to be 4 years. Calculation:

Z = (x̄ - μ)/[σ/sqrt(n)] = (77.5 - 77.8)/[4/sqrt(1600)] = -3.0

What percentile? Normcdf(-100, -3.0) = .0013

Because .0013 is very small, there is a very low probability that you would get a sample mean as low as 77.5 with 1600 people if the true mean were 77.8, so we reject the hypothesis that the true mean is 77.8 in favor of the alternative that the true mean is lower than that.
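The same arithmetic can be checked outside a calculator. The following is a minimal sketch in Python, assuming the standard library's `statistics.NormalDist` as a stand-in for the calculator's Normcdf; the variable names are illustrative and not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

null_mean = 77.8     # mean longevity claimed by the hypothesis (years)
sample_mean = 77.5   # mean longevity found in the May sample (years)
sigma = 4.0          # known population standard deviation (years)
n = 1600             # sample size

# Standard deviation of the sample mean: 4 / sqrt(1600) = 0.1
standard_error = sigma / sqrt(n)

# Convert the sample mean from natural units (years) to a Z value
z = (sample_mean - null_mean) / standard_error

# Area under the standard normal curve to the left of z (the percentile)
percentile = NormalDist().cdf(z)

print(round(z, 2), round(percentile, 4))
```

The left-tail area is taken from negative infinity rather than from -100, which makes no practical difference at this precision.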

How low is low? Below .05

General Procedure

- State the null hypothesis (champion): mean longevity for those born in May is 77.8 years
- State the alternative hypothesis (challenger): mean longevity for those born in May is less than 77.8 years
- Take a sample and compute its mean
- Calculate the percentile for the sample mean if the null hypothesis were true, using Z = (x̄ - μ)/[σ/sqrt(n)]
- Draw a conclusion by comparing the percentile to an arbitrary limit, such as .05. If the percentile is below the limit, reject the null hypothesis (dethrone the champion)
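The steps above can be sketched as one small function. This is an illustration only: the function name `lower_tail_z_test` and its parameters are hypothetical, and it covers just the case in this lecture (known population standard deviation, alternative hypothesis of a lower mean).

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(sample_mean, null_mean, sigma, n, limit=0.05):
    """Test a null-hypothesis mean against the alternative of a lower mean.

    Assumes the population standard deviation (sigma) is known.
    Returns the Z value, the percentile, and the conclusion.
    """
    # Percentile of the sample mean if the null hypothesis were true
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    percentile = NormalDist().cdf(z)

    # Compare the percentile to the arbitrary limit
    if percentile < limit:
        conclusion = "reject the null hypothesis"
    else:
        conclusion = "fail to reject the null hypothesis"
    return z, percentile, conclusion

# The May longevity example
z, percentile, conclusion = lower_tail_z_test(77.5, 77.8, sigma=4, n=1600)
print(conclusion)
```

With a sample mean of 77.7 instead of 77.5, Z would be about -1.0 and the percentile about .16, so the same function would fail to reject.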

| | Truly Guilty | Truly Innocent |
|---|---|---|
| Convict | correct decision | Type I error |
| Acquit | Type II error | correct decision |

| | Null Hypothesis False | Null Hypothesis True |
|---|---|---|
| Reject Null Hypothesis | correct decision | Type I error |
| Fail to Reject Null Hypothesis | Type II error | correct decision |
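The second table can be read as a lookup from (truth, decision) to outcome. A minimal sketch, with a hypothetical function name, that reproduces its four cells:

```python
def classify_outcome(null_is_true, rejected_null):
    """Return the cell of the decision table for one combination."""
    if rejected_null:
        # Rejecting a true null hypothesis is a Type I error
        return "Type I error" if null_is_true else "correct decision"
    # Failing to reject a false null hypothesis is a Type II error
    return "correct decision" if null_is_true else "Type II error"

for null_is_true in (False, True):
    for rejected_null in (True, False):
        print(null_is_true, rejected_null,
              classify_outcome(null_is_true, rejected_null))
```

In the courtroom analogy, the null hypothesis corresponds to innocence and rejecting it corresponds to convicting.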

Null hypothesis, H_{0}. Always an equality, e.g. H_{0}: μ = 77.8

Alternative hypothesis, H_{a}. Always an inequality, e.g. H_{a}: μ < 77.8

p-value. The percentile that we get when we do the calculation, e.g. .0013 (multiplying by 100 gives the 0.13th percentile)

Significance level, α. The arbitrary hurdle set by the investigator, e.g. .05 or 5 percent. Technically, α is the probability that we give ourselves of making a Type I error.

If the p-value is *below* α, we reject H_{0} and accept H_{a}. Otherwise, we fail to reject H_{0}. We never say that we accept H_{0}, because we have not proven that it's true. We've only failed to find strong evidence that it's false.

Once again, a statement such as "we reject the null hypothesis at a 5 percent significance level" is a statement about our methods. We are saying that, if the null hypothesis were true, calculations based on our sample mean and sample size would incorrectly reject it no more than 5 percent of the time.

In classical statistics, the null hypothesis is either true or it isn't. You cannot say that there is a 0.13 percent probability that the null hypothesis is true. You can only make probability statements about your statistical methods.
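One way to see that the 5 percent describes the method rather than the hypothesis is to simulate many samples in a world where the null hypothesis really is true and count how often the procedure rejects it. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and as a shortcut it draws each sample mean directly from a normal distribution with standard deviation 4/sqrt(1600) instead of simulating 1600 individual lifetimes.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulated_type_i_rate(null_mean=77.8, sigma=4.0, n=1600,
                          limit=0.05, trials=20_000, seed=1):
    """Fraction of simulated samples that wrongly reject a true null."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    standard_error = sigma / sqrt(n)
    standard_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Shortcut assumption: sample means are normal around the true mean
        sample_mean = rng.gauss(null_mean, standard_error)
        z = (sample_mean - null_mean) / standard_error
        if standard_normal.cdf(z) < limit:
            rejections += 1            # a Type I error: the null is true here
    return rejections / trials

rate = simulated_type_i_rate()
print(rate)
```

The printed fraction should land close to .05, which is the sense in which the significance level is the probability of a Type I error.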

State the null hypothesis in words and in symbols.

*The null hypothesis is that the mean longevity of people born in May is 77.8 years. The alternative hypothesis is that the mean longevity is lower than that. Let μ = mean longevity of people born in May.
H_{0}: μ = 77.8
H_{a}: μ < 77.8*

State the significance level that you chose: *α = .05*

State sample results: *n = 1600, x̄ = 77.5*

State p-value: *p-value = .0013*

Compare p-value to α and state conclusion: *Because .0013 is less than .05, we reject the null hypothesis that the mean longevity of people born in May is 77.8 years and accept the alternative that the mean longevity is lower.*

Important to remember: when the p-value is less than α, we reject; when the p-value is greater than α, we fail to reject.