AP Statistics Lectures
by Arnold Kling

Mean and Variance

The mean, or first moment, of a distribution is a measure of the average: each outcome is weighted by its probability. Suppose that a random variable X has three outcomes, with the probabilities shown below.

X     P(X)
3     0.2
5     0.6
12    0.2

To calculate the mean of X, we compute E(X). That is,

$E(X) = \bar{X} = 0.2(3) + 0.6(5) + 0.2(12) = 6.0$

The variance of X is calculated as $E[(X - \bar{X})^2]$. We can augment our table as follows:

X     P(X)    $X - \bar{X}$    $(X - \bar{X})^2$
3     0.2     -3               9
5     0.6     -1               1
12    0.2     6                36

Now, we take $E[(X - \bar{X})^2]$.

$E[(X - \bar{X})^2] = 0.2(9) + 0.6(1) + 0.2(36) = 9.6$
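
The table calculation above can be sketched in a few lines of Python. This is only an illustration, not part of the original lecture; the function names `mean` and `variance` and the dictionary layout (outcome mapped to probability) are assumptions made for this example.

```python
def mean(dist):
    """Return E(X) for a discrete distribution given as {value: probability}."""
    return sum(p * x for x, p in dist.items())


def variance(dist):
    """Return E[(X - mean)^2], the probability-weighted squared deviation."""
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())


# The three-outcome example from the table above.
X = {3: 0.2, 5: 0.6, 12: 0.2}

print(round(mean(X), 4))      # 6.0
print(round(variance(X), 4))  # 9.6
```

Changing the outcomes or the probabilities in `X` and rerunning is one way to check a guess about how the mean and variance respond.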

Suppose that the values of X were raised to 4, 6, and 13. What do you think would happen to the mean of X? What do you think would happen to the variance of X? Verify your guesses by setting up the table and doing the calculation.

See if you can come up with values of X that would raise the mean and raise the variance. See if you can come up with values of X that would raise the mean but lower the variance. Finally, suppose we leave the values of X the same. Can you come up with different values of P(X) that keep the same mean but lower the variance? Can you come up with values of P(X) that keep the same mean but raise the variance?

A Useful Identity

One way to understand the relationship between $E(X^2)$ and the variance of X is to write out the following identity.

$E[(X - \bar{X})^2] = E(X^2 - 2X\bar{X} + \bar{X}^2)$
$= E(X^2) - 2[E(X)]^2 + [E(X)]^2$
$= E(X^2) - [E(X)]^2$
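
As a quick numerical check, the identity can be tried on the three-outcome example from the previous section. The sketch below computes the variance both ways; the variable names are chosen for this illustration only.

```python
# Outcomes and probabilities from the three-outcome example.
values = [3, 5, 12]
probs = [0.2, 0.6, 0.2]

e_x = sum(p * x for x, p in zip(values, probs))        # E(X)
e_x2 = sum(p * x ** 2 for x, p in zip(values, probs))  # E(X^2)

# Variance from the definition: E[(X - E(X))^2]
var_definition = sum(p * (x - e_x) ** 2 for x, p in zip(values, probs))

# Variance from the identity: E(X^2) - [E(X)]^2
var_identity = e_x2 - e_x ** 2

print(round(e_x2, 4))            # 45.6
print(round(var_definition, 4))  # 9.6
print(round(var_identity, 4))    # 9.6
```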

Standard Deviation and Greek notation

The standard deviation of a random variable is the square root of the variance. In the example above, the standard deviation would be the square root of 9.6, or about 3.1.

The mean of X is written as $\mu_X$. The Greek letter is pronounced "mew," although it is often transliterated as "mu." The standard deviation of X is written as $\sigma_X$. The Greek letter is called "sigma." Using Greek notation, the variance is written as $\sigma_X^2$.

Two Random Variables

Often, we will take two random variables, X and Y, and add them to create a new random variable. We could give the new random variable its own name, Z, but often we just call it X+Y.

The properties of the expectation operator imply that:

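$\mu_{X+Y} = E(X+Y) = E(X) + E(Y) = \mu_X + \mu_Y$
$\sigma_{X+Y}^2 = E[(X + Y - \overline{X+Y})^2]$
$= E[(X - \bar{X} + Y - \bar{Y})^2]$
$= E[(X - \bar{X})^2] + E[(Y - \bar{Y})^2] + 2E[(X - \bar{X})(Y - \bar{Y})]$
$= \sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}$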

The term $\sigma_{XY}$ is called the covariance of X and Y. We will return to it later in the course. For now, we note that in the case where X and Y are independent, the covariance is 0, and the equation reduces to:

$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$ (when X and Y are independent)
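
A small numerical check can make the covariance term concrete. The sketch below uses an invented joint distribution in which X and Y are not independent, so the covariance is not 0. The probabilities and the helper name `expect` are hypothetical, chosen only for this illustration.

```python
# Hypothetical joint distribution of (X, Y): {(x, y): probability}.
# These numbers are invented for illustration; they are not from the lecture.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def expect(f):
    """Return E[f(X, Y)] under the joint distribution."""
    return sum(p * f(x, y) for (x, y), p in joint.items())


mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Variance of the sum, computed directly from its definition.
var_sum = expect(lambda x, y: (x + y - mu_x - mu_y) ** 2)

print(round(cov_xy, 4))                      # 0.15
print(round(var_sum, 4))                     # 0.8
print(round(var_x + var_y + 2 * cov_xy, 4))  # 0.8
```

Here the two variances alone add up to only 0.5; the remaining 0.3 is twice the covariance.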

It follows that if we have n independent random variables X that have the same mean $\mu_X$ and variance $\sigma_X^2$, and we call the sum of these random variables V, then

iid equations
$\mu_V = n\mu_X$
$\sigma_V^2 = n\sigma_X^2$

These are called iid equations because they refer to the sum of independent, identically distributed random variables. Verify that the iid equations are correct.

  1. Start with the random variable that can take on values of 3, 5, or 12 with probabilities .2, .6, and .2, respectively. We calculated its mean as 6 and its variance as 9.6.
  2. Next, consider the random variable V that is the sum of two X's. (Think of each X as being like one die, and V is the sum of dice.) According to the iid equations, what should be the mean and variance of V?
  3. In a table, show all possible values of V. Show their probabilities. Then calculate the mean and variance of V using the values and their probabilities. Verify that you get the same answer as when you use the iid equations.
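
After working the table by hand, the steps above can be checked with a short Python sketch. It assumes the two X's are independent, so the probability of each pair of outcomes is the product of the individual probabilities. The variable names are chosen for this illustration only.

```python
from itertools import product

# Distribution of a single X, from step 1.
X = {3: 0.2, 5: 0.6, 12: 0.2}

# Build the distribution of V = X1 + X2. Independence is assumed, so each
# pair's probability is the product of the two individual probabilities.
V = {}
for (x1, p1), (x2, p2) in product(X.items(), repeat=2):
    V[x1 + x2] = V.get(x1 + x2, 0.0) + p1 * p2

mean_v = sum(p * v for v, p in V.items())
var_v = sum(p * (v - mean_v) ** 2 for v, p in V.items())

# Print the table of V: each possible value and its probability.
for v in sorted(V):
    print(v, round(V[v], 4))

print(round(mean_v, 4), round(var_v, 4))  # 12.0 19.2
```

With n = 2, the iid equations give a mean of 2(6) = 12 and a variance of 2(9.6) = 19.2, which matches the values computed from the table of V.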