Remarks on Random Variables
You can think of a random variable as being analogous to a histogram. In a histogram, you might show the percentage of your data that falls into each of several categories.
For example, suppose you had data on family income. You might find that 20 percent of families have an income below $30k, 27 percent have an income between $30k and $40k, 21 percent have an income between $40k and $50k, and 32 percent have an income over $50k. A histogram would be a chart of that data with the income ranges on the X-axis and the percentages on the Y-axis.
Similarly, a graph of a random variable shows the range of values of the random variable on the X-axis and the probabilities on the Y-axis. Just as the percentages in a histogram have to add to 100 percent, the probabilities in a graph of a random variable have to add to 1. (For a continuous random variable, we say that the area under the curve of the probability density function has to equal one.) Just as the percentages in a histogram have to be non-negative, the probabilities of the values of a random variable have to be non-negative.
A probability distribution function (pdf) for a random variable X is an equation or set of equations that allows you to calculate the probability associated with each value of X. Think of a pdf as a formula for producing a histogram. For example, if X can take on the values 1, 2, 3, 4, 5 and each value is equally likely, then we can write the pdf of X as:
f(X) = .2 for X = 1, 2, 3, 4, or 5
The point to remember about a pdf is that the probabilities have to be nonnegative and sum to one. For a discrete distribution, it is straightforward to add all of the probabilities. For a continuous distribution, you have to take the "area under the curve." In practice, unless you know calculus, the only areas you can readily find are those under a pdf made up of straight-line segments, where the area reduces to rectangles and triangles. See the Uniform Distribution.
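The two conditions on a discrete pdf can be checked mechanically. The following is a minimal sketch for the five-value example above; the dictionary representation and the helper name is_valid_pdf are illustrative assumptions, not part of the original notes. Exact fractions are used so that the sum comes out to exactly one.

```python
from fractions import Fraction

# Assumed representation: a dict mapping each value of X to its probability.
pdf = {x: Fraction(1, 5) for x in (1, 2, 3, 4, 5)}


def is_valid_pdf(dist):
    """Return True if every probability is nonnegative and they sum to one."""
    return all(p >= 0 for p in dist.values()) and sum(dist.values()) == 1


print(is_valid_pdf(pdf))  # True
```

A table of probabilities that summed to less than one, or that contained a negative entry, would fail the same check.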
The mean of a data set is a measure of the center of a histogram. Similarly, the mean of a random variable is a measure of the center of its probability distribution. When you take a random variable and add a constant or multiply by a constant, you move the center. In general, if Y = a + bX, then the mean of Y is equal to a + b(mean of X).
The variance of a data set is a measure of the dispersion of a histogram around its center. Similarly, the variance of a random variable is a measure of the dispersion of that variable around its mean.
Because variance is measured in squared units, multiplying a random variable by a constant multiplies its variance by the square of that constant. If Y = a+bX, then the variance of Y is equal to b^{2} times the variance of X. The additive constant a shifts the center but has no effect on the dispersion. Since the standard deviation is the square root of the variance, the standard deviation of Y is the absolute value of b times the standard deviation of X.
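The rules for the mean and variance of Y = a + bX can be verified on a small example. This sketch reuses the equally likely values 1 through 5; the constants a = 10 and b = -3 are arbitrary illustrative choices, and the helper names mean and variance are assumptions. A negative b is used on purpose, to show that the standard deviation scales by the absolute value of b.

```python
import math
from fractions import Fraction


def mean(dist):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Variance of a discrete distribution: expected squared deviation from the mean."""
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())


X = {x: Fraction(1, 5) for x in (1, 2, 3, 4, 5)}
a, b = 10, -3  # illustrative constants

# Distribution of Y = a + bX: each value is transformed, probabilities are unchanged.
Y = {a + b * x: p for x, p in X.items()}

print(mean(X), variance(X))  # 3 2
print(mean(Y), variance(Y))  # 1 18

# The standard deviation scales by |b|, not by b.
sd_x = math.sqrt(variance(X))
sd_y = math.sqrt(variance(Y))
print(math.isclose(sd_y, abs(b) * sd_x))  # True
```

Here the mean of Y is 10 + (-3)(3) = 1 and the variance of Y is (-3)^{2} times 2 = 18, as the rules predict.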
Next, consider what happens when you add two random variables. Let W = X+Y. Let V = X-Y.
The mean of W is equal to the mean of X plus the mean of Y. The mean of V is equal to the mean of X minus the mean of Y. The variance of W is equal to the variance of X plus the variance of Y plus two times the covariance of X and Y. The variance of V is equal to the variance of X plus the variance of Y minus two times the covariance of X and Y.
Suppose that X and Y are independent. In that case, the covariance term is zero. Therefore, the variance of W equals the variance of X plus the variance of Y, and the same is true for the variance of V. Note that it is the variances that add, not the standard deviations: the standard deviation of W (and of V) is the square root of the sum of the two variances, which is smaller than the sum of the two standard deviations whenever both are positive.
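The sum and difference rules can be checked exactly on a small joint distribution. The sketch below is illustrative only: the joint probabilities are made up, and the helper names expect and var are assumptions. The last lines show that, for independent variables, variances add while standard deviations do not.

```python
import math
from fractions import Fraction

# Hypothetical joint distribution of (X, Y): maps (x, y) pairs to probabilities.
joint = {
    (0, 0): Fraction(4, 10),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(4, 10),
}


def expect(g):
    """Expected value of g(x, y) under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


def var(g):
    """Variance of g(x, y): expected squared deviation from its mean."""
    m = expect(g)
    return expect(lambda x, y: (g(x, y) - m) ** 2)


mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
cov_xy = expect(lambda x, y: (x - mean_x) * (y - mean_y))
var_x = var(lambda x, y: x)
var_y = var(lambda x, y: y)

var_w = var(lambda x, y: x + y)  # W = X + Y
var_v = var(lambda x, y: x - y)  # V = X - Y

print(var_w == var_x + var_y + 2 * cov_xy)  # True
print(var_v == var_x + var_y - 2 * cov_xy)  # True

# Independent case with assumed variances 9 and 16 (standard deviations 3 and 4):
# the standard deviation of the sum is sqrt(9 + 16) = 5, not 3 + 4 = 7.
sd_w_independent = math.sqrt(9 + 16)
print(sd_w_independent)  # 5.0
```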
Finally, consider a weighted average of random variables. In terms of a histogram, suppose that you had two zip codes with different average incomes. If you wanted to take the overall mean income of the population in both zip codes, you would have to weight the means by the different populations in the zip codes.
For example, suppose that there are 6000 families in one zip code, with a mean income of $50k. Suppose that there are 4000 families in another zip code, with a mean income of $40k. The overall mean income is equal to (1/10,000)(6000 * $50k + 4000 * $40k) = $46k. Equivalently, the weights are 0.6 and 0.4, and 0.6 * $50k + 0.4 * $40k = $46k.
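The zip code calculation can be written out as a short sketch. The variable names are illustrative, and incomes are kept in thousands of dollars.

```python
# Number of families and mean income (in thousands of dollars) for each zip code.
families = [6000, 4000]
mean_income_k = [50, 40]

total_families = sum(families)

# Weight each zip code's mean by its share of the total population.
overall_mean_k = sum(n * m for n, m in zip(families, mean_income_k)) / total_families

print(overall_mean_k)  # 46.0
```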
Similarly, suppose that you take a weighted average of two random variables. Let W = aX + bY. Then the mean of W is equal to a times the mean of X plus b times the mean of Y.
The variance of W will equal a^{2} times the variance of X plus b^{2} times the variance of Y plus 2ab times the covariance of X and Y. Again, if X and Y are independent, the covariance term vanishes, leaving a^{2} times the variance of X plus b^{2} times the variance of Y.
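As a check on the weighted-average rules, the mean and variance of W = aX + bY can be computed directly from a joint distribution and compared with the formulas. This is a minimal sketch: the joint probabilities, the weights a = 3/5 and b = 2/5, and the helper name expect are all illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y), chosen only for illustration.
joint = {
    (1, 2): Fraction(1, 4),
    (1, 6): Fraction(1, 4),
    (3, 2): Fraction(1, 8),
    (3, 6): Fraction(3, 8),
}


def expect(g):
    """Expected value of g(x, y) under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


a, b = Fraction(3, 5), Fraction(2, 5)  # illustrative weights that sum to one

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mean_x) ** 2)
var_y = expect(lambda x, y: (y - mean_y) ** 2)
cov_xy = expect(lambda x, y: (x - mean_x) * (y - mean_y))

# Direct computation from the distribution of W = aX + bY.
mean_w = expect(lambda x, y: a * x + b * y)
var_w = expect(lambda x, y: (a * x + b * y - mean_w) ** 2)

print(mean_w == a * mean_x + b * mean_y)  # True
print(var_w == a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy)  # True
```

The formulas do not require the weights to sum to one; that choice simply makes W a weighted average in the sense of the zip code example.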