Stats Audio Lectures--Hypothesis Tests

AP Statistics Audio Lectures
Hypothesis Tests
by Arnold Kling

Focus is still on terminology

Continue to assume that standard deviation is known, and we take a sample and find the mean. We have a hypothesis about the mean that we want to test.

Calculation goes from natural units to a percentile

Example: Mean longevity for people born in September is 77.8 years. Take a sample of people born in May and see if they have the same longevity or if it is shorter. Suppose that a sample of 1600 people born in May has a mean longevity of 77.5 years and the population standard deviation is known to be 4 years. Calculation:

Z = (X - m)/[s/sqrt(n)] = (77.5 - 77.8)/[4/sqrt(1600)] = -3.0

What percentile? Normcdf (-100, -3.0) = .0013

Because .0013 is very small, there is a very low probability that you would get a sample mean as low as 77.5 with 1600 people if the true mean were 77.8. So we reject the hypothesis that the true mean is 77.8 in favor of the alternative that the true mean is lower than that.
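The calculation above can be checked with a few lines of Python. This is a minimal sketch that uses the standard library's NormalDist in place of the calculator's Normcdf. The variable names are illustrative and do not come from the lecture.

```python
from math import sqrt
from statistics import NormalDist

# Numbers from the longevity example
null_mean = 77.8     # m, the mean under the hypothesis being tested
sample_mean = 77.5   # X, the mean of the May sample
sigma = 4.0          # s, the known population standard deviation
n = 1600             # sample size

# Z = (X - m) / [s / sqrt(n)]
z = (sample_mean - null_mean) / (sigma / sqrt(n))

# Area under the standard normal curve to the left of z
p_value = NormalDist().cdf(z)

print(round(z, 2), round(p_value, 4))
```

The left-tail area is taken from negative infinity rather than from -100, which makes no practical difference to the result.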

How low is low? Below .05

General Procedure

State the null hypothesis (champion): mean longevity for those born in May is 77.8 years
State alternative hypothesis (challenger): mean longevity for those born in May is less than 77.8 years
Take sample, compute mean
Calculate percentile for mean if the null hypothesis were true, using Z = (X - m)/[s/sqrt(n)]
Draw conclusion by comparing percentile to an arbitrary limit, such as .05. If the percentile is below the limit, reject the null hypothesis (dethrone the champion)
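The steps above can be collected into one small function. This is a sketch under the lecture's assumptions (known standard deviation, alternative hypothesis that the mean is lower). The function name lower_tail_z_test and its parameter names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(sample_mean, null_mean, sigma, n, alpha=0.05):
    """One-sided z test where the alternative is 'mean is lower'.

    Returns the z score, the percentile (left-tail area), and whether
    the null hypothesis is rejected at the chosen limit alpha.
    """
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    percentile = NormalDist().cdf(z)
    reject = percentile < alpha
    return z, percentile, reject

# The longevity example from the lecture
z, percentile, reject = lower_tail_z_test(77.5, 77.8, 4.0, 1600)
print("reject the null" if reject else "fail to reject the null")
```

Keeping the limit as a parameter with a default of .05 reflects the point that the limit is an arbitrary choice made by the investigator.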

Decision Theory

	Truly Guilty	Truly Innocent
Convict	correct decision	Type I error
Acquit	Type II error	correct decision

	Null Hypothesis False	Null Hypothesis True
Reject Null Hypothesis	correct decision	Type I error
Fail to Reject Null Hypothesis	Type II error	correct decision
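The table can also be written as a lookup. This is only an illustrative sketch. The function classify_decision is hypothetical, and in practice we never know for certain whether the null hypothesis is true.

```python
def classify_decision(null_is_true: bool, rejected_null: bool) -> str:
    """Return the cell of the decision table for one outcome."""
    if rejected_null and null_is_true:
        return "Type I error"      # rejected a true null (convicted the innocent)
    if not rejected_null and not null_is_true:
        return "Type II error"     # kept a false null (acquitted the guilty)
    return "correct decision"

print(classify_decision(null_is_true=True, rejected_null=True))
```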

Terminology

Null hypothesis, H₀. Always an equality, e.g. H₀: m = 77.8

Alternative hypothesis, H_a. Always an inequality, e.g. H_a: m < 77.8

p-value. The percentile that we get when we do the calculation, e.g. .0013 (multiplying by 100 gives the 0.13th percentile)

significance level, a. The arbitrary hurdle set by the investigator, e.g. .05 or 5 percent. Technically, a is the probability of making a Type I error that we are willing to accept, that is, the probability of rejecting H₀ when it is in fact true.

If the p-value is below a, we reject H₀ and accept H_a. Otherwise, we fail to reject H₀. We never say that we accept H₀, because we have not proven that it's true. We've only failed to find strong evidence that it's false.

What does it mean?

Once again, a statement such as "we reject the null hypothesis at a 5 percent significance level" is a statement about our methods. We are saying that our procedure, applied to samples of this size, would reject a null hypothesis that is actually true no more than 5 percent of the time.

In classical statistics, the null hypothesis is either true or it isn't. You cannot say that there is a 0.13 percent probability that the null hypothesis is true. You can only make probability statements about your statistical methods.
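A simulation can make the "statement about our methods" idea concrete. The sketch below assumes the null hypothesis really is true and that sample means are normally distributed with standard deviation s/sqrt(n). It then counts how often the procedure rejects anyway. The seed, the number of trials, and the variable names are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

null_mean, sigma, n, alpha = 77.8, 4.0, 1600, 0.05
standard_error = sigma / sqrt(n)
trials = 10_000

rejections = 0
for _ in range(trials):
    # Draw one sample mean from a world in which the null hypothesis is true
    sample_mean = random.gauss(null_mean, standard_error)
    z = (sample_mean - null_mean) / standard_error
    p_value = NormalDist().cdf(z)
    if p_value < alpha:
        rejections += 1  # a Type I error: we rejected a true null

type_one_rate = rejections / trials
print(f"Share of true nulls rejected: {type_one_rate:.3f}")
```

The share of rejections should come out close to the significance level. It describes how often the method errs, not the probability that any particular null hypothesis is true.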

Good Habits

State the null hypothesis in words and in symbols.

The null hypothesis is that the mean longevity of people born in May is 77.8 years. The alternative hypothesis is that the mean longevity is lower than that. Let m = mean longevity of people born in May.
H₀: m = 77.8
H_a: m < 77.8

State significance level that you chose: a = .05

State sample results: n = 1600, X = 77.5

State p-value: p-value = .0013

Compare p-value to a and state conclusion: Because .0013 is less than .05, we reject the null hypothesis that the mean longevity of people born in May is 77.8 years and accept the alternative that the mean longevity is lower.

Important to remember: when p-value is less than a, we reject; when p-value is greater than a, we fail to reject.