AP Statistics Audio Lectures

Confidence Intervals

by Arnold Kling

To listen to the lecture, click here.

Focus is on terminology

Now we treat the true population mean as unknown, but with the population standard deviation known (unrealistic).

Basic calculation goes from a percentile to natural units

For example, suppose that we take a sample of 25 students and ask how long it takes them to write an essay for the SAT. We want to say something about the population mean, based on the sample mean. Pretend that we know that the population standard deviation is 1.5 minutes. Suppose that our sample mean is 20 minutes.

We want to calculate a 90 percent confidence interval for the population mean. That is, we want to find the natural units that fall within the middle 90 percentiles:

- We need to go from percentile to natural units, so we use invnorm. However, we do *not* take invnorm (.9). The middle 90 percentiles are between .05 and .95. So, take invnorm (.05) = -1.64
- Using Z = (X - sample mean)/[s/sqrt(n)], we have

  -1.64 = (X - 20)/[1.5/5]

- X = 20 - .492 = 19.508
- The confidence interval is symmetric around the mean, so it is (19.508, 20.492)
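The steps above can be sketched in Python, as one possible stand-in for the calculator's invnorm. This sketch assumes the standard library's `statistics.NormalDist().inv_cdf` as the inverse normal function; the variable names are illustrative and not part of the lecture. The z-score is rounded to two decimals only to reproduce the -1.64 used above.

```python
from math import sqrt
from statistics import NormalDist

n = 25              # sample size
sample_mean = 20.0  # minutes
pop_sd = 1.5        # population standard deviation, assumed known

# invnorm(.05): the z-score with 5 percent of the standard normal below it.
z_exact = NormalDist().inv_cdf(0.05)
# Round to two decimals to match the -1.64 used in the lecture.
z = round(z_exact, 2)

std_error = pop_sd / sqrt(n)         # 1.5/5 = 0.3
lower = sample_mean + z * std_error  # 20 - .492
upper = sample_mean - z * std_error  # symmetric around the sample mean

print(f"z = {z}, interval = ({lower:.3f}, {upper:.3f})")
# prints: z = -1.64, interval = (19.508, 20.492)
```

Without the rounding step, the z-score is about -1.645 and the margin comes out near .493 rather than .492, so the boundaries shift slightly in the third decimal.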

Terminology

- Confidence *level*, C: 90 percent. Arbitrary, chosen by investigator.
- Confidence *interval*: (19.508, 20.492). Natural units that are calculated to fall within the percentiles given by the confidence level.
- Margin of error, m: 0.492. In natural units, the distance from the sample mean to the boundaries of the confidence interval.

m = z*s/sqrt(n), where z* is the distance, measured in standard deviations, that corresponds to the percentile boundaries of the confidence interval (from invnorm), and s is the known population standard deviation. In our example, z* = 1.64.

Calculate a 95 percent confidence interval.

- z* = invnorm (.975) = 1.96
- m = z*s/sqrt(n) = 1.96(1.5)/sqrt(25) = .588
- c.i. = sample mean + or - m = 20 + or - .588 = (19.412, 20.588)

Note that the confidence interval is wider for 95 percent than for 90 percent.
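As a rough check on the comparison, here is a minimal Python sketch of the margin-of-error formula m = z*s/sqrt(n). The function name `margin_of_error` and its parameters are hypothetical, and `statistics.NormalDist().inv_cdf` is assumed as the equivalent of invnorm.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(conf_level, pop_sd, n):
    """Margin of error m = z* s/sqrt(n) for a known population sd."""
    # The upper boundary of the middle C percentiles sits at (1 + C)/2,
    # e.g. .975 for C = .95.
    z_star = NormalDist().inv_cdf((1 + conf_level) / 2)
    return z_star * pop_sd / sqrt(n)

sample_mean = 20.0
for conf_level in (0.90, 0.95):
    m = margin_of_error(conf_level, pop_sd=1.5, n=25)
    print(f"{conf_level:.0%}: m = {m:.3f}, "
          f"interval = ({sample_mean - m:.3f}, {sample_mean + m:.3f})")
```

Because this sketch does not round z* to two decimals, the 90 percent margin prints as about .493 rather than .492. The 95 percent margin agrees with .588 to three decimals.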

You want to say that there is a 90 percent chance that the true population mean is between 19.508 minutes and 20.492 minutes, but you cannot.

In classical statistics, the true mean is a population parameter, and it is what it is. You cannot change the probability distribution for it based on what the investigator does.

You can only make probability statements about your statistical methods, not about the true parameter.

You can say that a 90 percent confidence interval is an interval that is calculated by a method that would produce an interval containing the true parameter 90 percent of the time.

90 percent of the time, when we take a sample of size 25 and calculate a confidence interval based on the sample mean and--in this case--known population standard deviation, our interval will contain the true population parameter.

That way, two people taking two different samples, getting different sample means, and correctly calculating two different confidence intervals can both be correct.
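The "90 percent of the time" statement is about the method, and a small simulation can illustrate it. The sketch below is hypothetical: it assumes a normally distributed population and picks a true mean of 20.3 minutes purely for illustration (in practice the true mean is unknown). It draws many samples of size 25, builds a 90 percent interval from each, and counts how often the interval contains the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

TRUE_MEAN = 20.3   # hypothetical; in a real study this is unknown
POP_SD = 1.5       # population standard deviation, assumed known
N = 25             # sample size
CONF_LEVEL = 0.90
TRIALS = 10_000

rng = random.Random(0)  # fixed seed so the run is repeatable
z_star = NormalDist().inv_cdf((1 + CONF_LEVEL) / 2)
m = z_star * POP_SD / sqrt(N)  # same margin of error for every sample

hits = 0
for _ in range(TRIALS):
    # Each trial is a different investigator taking a different sample.
    sample = [rng.gauss(TRUE_MEAN, POP_SD) for _ in range(N)]
    xbar = fmean(sample)
    if xbar - m <= TRUE_MEAN <= xbar + m:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of the intervals contain the true mean")
```

Each trial produces a different interval, yet the share that contains the true mean should land close to 90 percent. The true mean never moves; only the intervals do.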

Narrower is better

Other things equal, what is the effect on the width of the confidence interval (i.e., the margin of error) of:

- raising the confidence level (e.g., from 90 percent to 95 percent)?
- a higher population standard deviation?
- a higher sample mean?
- a larger sample size?
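One way to check your answers is to change one input at a time and recompute the width. The sketch below is only an illustration: the function names and the alternative values (a standard deviation of 3.0, a sample mean of 25, a sample size of 100) are made up for this purpose, and `statistics.NormalDist().inv_cdf` is assumed as the equivalent of invnorm.

```python
from math import sqrt
from statistics import NormalDist

def interval(conf_level, pop_sd, sample_mean, n):
    """Confidence interval for the mean with a known population sd."""
    z_star = NormalDist().inv_cdf((1 + conf_level) / 2)
    m = z_star * pop_sd / sqrt(n)
    return sample_mean - m, sample_mean + m

def width(conf_level, pop_sd, sample_mean, n):
    """Width of the interval, which is twice the margin of error."""
    lower, upper = interval(conf_level, pop_sd, sample_mean, n)
    return upper - lower

# Baseline values from the lecture example.
baseline = dict(conf_level=0.90, pop_sd=1.5, sample_mean=20.0, n=25)

# Hypothetical alternatives, changing one input at a time.
changes = {
    "confidence level of 95 percent": dict(conf_level=0.95),
    "population sd of 3.0": dict(pop_sd=3.0),
    "sample mean of 25": dict(sample_mean=25.0),
    "sample size of 100": dict(n=100),
}

print(f"baseline: width = {width(**baseline):.3f}")
for label, change in changes.items():
    print(f"{label}: width = {width(**{**baseline, **change}):.3f}")
```

Try to predict each line of output from the formula for m before running it.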