For a random variable, we focus on the mean, the variance, and the standard deviation.
The mean is the same as the average. It has many other aliases: expected value, μ, E(X), and X̄.
Calculating the mean
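Point-count example from picking a card (ace = 4, king = 3, queen = 2, jack = 1, all other cards = 0):

Card     Points X   P(X)    X · P(X)
Ace      4          1/13    4/13
King     3          1/13    3/13
Queen    2          1/13    2/13
Jack     1          1/13    1/13
Other    0          9/13    0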
Adding up everything in the last column, we get 10/13. This is the average value of a card under this point-count system. Note that if you add up the values of all the cards in the deck you get 16 + 12 + 8 + 4 = 40. Divide by the number of cards in the deck, and you get 40/52 = 10/13.
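The two ways of getting 10/13 can be checked with a short sketch. It assumes the usual point count implied by the totals above (queen = 2, jack = 1, everything else 0); the names `points` and `ranks` are just illustrative.

```python
from fractions import Fraction

# Assumed point values: ace = 4, king = 3, queen = 2, jack = 1, all other ranks = 0.
points = {"A": 4, "K": 3, "Q": 2, "J": 1}
ranks = ["A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3", "2"]

# Method 1: probability-weighted sum over the 13 ranks (each rank has probability 1/13).
mean_by_table = sum(Fraction(1, 13) * points.get(rank, 0) for rank in ranks)

# Method 2: add up the points of all 52 cards (4 cards per rank) and divide by 52.
total_points = sum(4 * points.get(rank, 0) for rank in ranks)
mean_by_deck = Fraction(total_points, 52)

print(mean_by_table, mean_by_deck, total_points)  # 10/13 10/13 40
```

Exact fractions are used here so that 40/52 reduces to 10/13 without rounding.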
Suppose that you have a spinner with a 20 percent chance of landing on a 3, a 60 percent chance of landing on a 5, and a 20 percent chance of landing on a 12. We can represent this as a random variable. Set up the table to calculate the mean.

X     P(X)   X · P(X)
3     0.2    0.6
5     0.6    3.0
12    0.2    2.4
Adding up the numbers in the right-hand column, we see that the mean of the random variable is 6.
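As a rough sketch, the same table calculation can be written as a small function. The function name `mean` and the dictionary layout `{value: probability}` are choices made for this example, not part of the lecture.

```python
from fractions import Fraction

def mean(distribution):
    """Return the probability-weighted sum of the values in {value: probability}."""
    assert sum(distribution.values()) == 1, "probabilities must sum to 1"
    return sum(value * prob for value, prob in distribution.items())

# The spinner: 20 percent on 3, 60 percent on 5, 20 percent on 12.
spinner = {3: Fraction(20, 100), 5: Fraction(60, 100), 12: Fraction(20, 100)}
print(mean(spinner))  # 6
```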
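The variance is the probability-weighted average of the squared deviations from the mean, μ_X. In the last example, where μ_X = 6, we have

X     P(X)   X - μ_X   (X - μ_X)²   P(X) · (X - μ_X)²
3     0.2    -3        9            1.8
5     0.6    -1        1            0.6
12    0.2    6         36           7.2

To find the variance, we add up the numbers in the right-hand column of the table, which gives us 1.8 + 0.6 + 7.2 = 9.6.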
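The variance table for the spinner can be rebuilt row by row in a short sketch. It uses exact fractions (1/5, 3/5, 1/5 for the stated 20, 60, and 20 percent) so the column sum is not affected by rounding; the variable names are illustrative only.

```python
from fractions import Fraction

spinner = {3: Fraction(1, 5), 5: Fraction(3, 5), 12: Fraction(1, 5)}

# Mean: probability-weighted sum of the outcomes.
mu = sum(x * p for x, p in spinner.items())

# One row per outcome: value, probability, deviation, squared deviation, weighted term.
rows = [(x, p, x - mu, (x - mu) ** 2, p * (x - mu) ** 2) for x, p in spinner.items()]

# Variance: sum of the right-hand (weighted) column.
variance = sum(row[-1] for row in rows)

for x, p, deviation, squared, weighted in rows:
    print(x, p, deviation, squared, weighted)
print("variance =", variance)  # variance = 48/5
```

The result 48/5 is 9.6 in decimal form.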
We want to measure the "spread" of the distribution of the random variable. The mean gives us the "center." We want to know how far, on average, we are likely to be from the center. For example, the mean temperature in DC might be about the same as in Oakland, but our winters are colder and our summers are hotter. In that sense, we have higher variance.
The mean alone is especially inadequate for judging predictions. If I predict a foot of snow one week and we get nothing, and then I predict nothing next week and we get a foot of snow, my average prediction of 6 inches equals the average snowfall, even though each forecast was off by a foot. So what?
Traditional statistics joke: First statistician shoots at a deer, and misses by 10 feet to the left. Second statistician shoots at the deer, and misses by 10 feet to the right. Third statistician jumps up and down, saying "We hit the deer! We hit the deer!"
How do we measure spread? The average difference from the mean does not work, because pluses and minuses cancel out (hot days cancel out cold days), so the result is always zero. Taking the absolute value of the differences would work. In practice, it is easier to work with squared differences than with absolute values.
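A quick sketch with the spinner (3, 5, and 12 with probabilities 20, 60, and 20 percent) shows the three candidate measures side by side. The variable names are made up for this illustration.

```python
from fractions import Fraction

spinner = {3: Fraction(1, 5), 5: Fraction(3, 5), 12: Fraction(1, 5)}
mu = sum(x * p for x, p in spinner.items())

# Plain deviations: the pluses and minuses cancel, so this is always zero.
mean_deviation = sum(p * (x - mu) for x, p in spinner.items())

# Absolute deviations: a workable measure of spread.
mean_abs_deviation = sum(p * abs(x - mu) for x, p in spinner.items())

# Squared deviations: the measure used for the variance.
mean_squared_deviation = sum(p * (x - mu) ** 2 for x, p in spinner.items())

print(mean_deviation)          # 0
print(mean_abs_deviation)      # 12/5
print(mean_squared_deviation)  # 48/5
```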
The standard deviation is the square root of the variance. When we're done grinding through the table, we can take the square root of the answer to get the standard deviation. In the example with the spinner and three numbers, the variance was 0.2(9) + 0.6(1) + 0.2(36) = 9.6, so the standard deviation is √9.6, or about 3.1.
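The last step can be sketched as a helper that goes from the probability table straight to the standard deviation. The name `std_dev` and the `{value: probability}` layout are assumptions for this example; ordinary floats are used, so the result is approximate.

```python
import math

def std_dev(distribution):
    """Square root of the probability-weighted average squared deviation."""
    mu = sum(x * p for x, p in distribution.items())
    variance = sum(p * (x - mu) ** 2 for x, p in distribution.items())
    return math.sqrt(variance)

# The spinner: 20 percent on 3, 60 percent on 5, 20 percent on 12.
spinner = {3: 0.2, 5: 0.6, 12: 0.2}
print(round(std_dev(spinner), 3))  # 3.098
```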
Notation: the standard deviation is called σ and the variance is called σ².
Mean measures center, variance or standard deviation measures spread
Have to grind through the table to calculate mean, variance, and standard deviation
Short cuts are available when one random variable is a transformation of another random variable (next lecture)