AP Statistics Lectures

by Arnold Kling

Geometric Distribution

Suppose that I am at a party and I start asking girls to dance. Let X be the number of girls that I need to ask in order to find a partner. If the first girl accepts, then X=1. If the first girl declines but the next girl accepts, then X=2. And so on.

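When X=n, it means that I failed on the first n-1 tries and succeeded on the nth try. Let p be the probability that any one girl accepts, and assume that each girl decides independently of the others. My probability of failing on the first try is (1-p). My probability of failing on the first two tries is (1-p)(1-p).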

My probability of failing on the first n-1 tries is (1-p)^{n-1}. Then, my probability of succeeding on the nth try is p. Thus, we have

P(X = n) = (1-p)^{n-1}p
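
To get a feel for the formula, here is a minimal Python sketch that evaluates it for the first few values of n. The function name geometric_pmf and the value p = 0.2 are my own choices for illustration, not part of the formula itself.

```python
def geometric_pmf(n, p):
    """Probability that the first success comes on try n (n = 1, 2, 3, ...)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Fail on the first n-1 tries, then succeed on the nth try.
    return (1 - p) ** (n - 1) * p


p = 0.2  # assumed chance that any one girl accepts (illustrative value)
for n in range(1, 6):
    print(n, round(geometric_pmf(n, p), 4))
```

Printing the probabilities side by side makes it easy to see how each one relates to the one before it.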

This is known as the geometric distribution. When you have a sequence of numbers in which each number is the same fixed multiple of the number before it, it is called a geometric sequence. In this case, P(X = n+1) is a fixed multiple of P(X = n). (What is that multiple?)

What is the probability that it will take more than n tries to succeed? As long as p is greater than zero, if I keep asking, eventually one of the girls will accept. So, it takes more than n tries exactly when I fail on each of the first n tries. That is,

P(X > n) = (1-p)^{n}
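
One way to check this shortcut is to compare it with the long way: adding up P(X = k) for k = n+1, n+2, and so on. The sketch below does both. The function names, the cutoff of 2000 terms, and the values p = 0.2 and n = 3 are assumptions made only for this illustration.

```python
def prob_more_than(n, p):
    """P(X > n): the first n tries all fail."""
    return (1 - p) ** n


def prob_more_than_by_sum(n, p, terms=2000):
    """Approximate P(X > n) by adding P(X = k) for k = n+1, n+2, ...

    The true sum has infinitely many terms; 'terms' is an arbitrary cutoff.
    """
    return sum((1 - p) ** (k - 1) * p for k in range(n + 1, n + 1 + terms))


p = 0.2  # illustrative value
n = 3    # illustrative value
print(prob_more_than(n, p))
print(prob_more_than_by_sum(n, p))
```

The two printed numbers should agree to many decimal places, since the terms left out of the truncated sum are extremely small.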

If X is geometric with parameter p, what is E(X)?

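We are faced with an infinite sum. Multiplying each value n by P(X = n) and adding over n = 1, 2, 3, ... gives

[1]
S = p + 2p(1-p) + 3p(1-p)^{2} + ... + np(1-p)^{n-1} + ...

Multiply both sides by (1-p) and you have

[2]
(1-p)S = p(1-p) + 2p(1-p)^{2} + 3p(1-p)^{3} + ... + np(1-p)^{n} + ...

Subtracting [2] from [1] term by term, the coefficient on each power of (1-p) drops to p, so

S - (1-p)S = pS = p[1 + (1-p) + (1-p)^{2} + ... + (1-p)^{n} + ...]

The sum in brackets is an infinite geometric series with first term 1 and ratio (1-p), so it adds up to 1/[1 - (1-p)] = 1/p. Therefore

pS = p(1/p) = 1

S = 1/p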

Therefore, the mean of the geometric distribution is equal to 1/p. If we are trying to estimate how many girls I will have to ask to dance until I find a partner, and p, the probability of one girl accepting, is .2, then on average I will have to ask five girls.
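
The result can also be checked numerically. The sketch below estimates the mean two ways for p = 0.2: by simulating many parties, and by adding up the first 1000 terms of the sum n times P(X = n). The function names, the number of simulated parties, the random seed, and the cutoff are all assumptions made for this illustration.

```python
import random


def tries_until_success(p, rng):
    """Keep asking until someone accepts; return how many asks it took."""
    tries = 1
    while rng.random() >= p:  # a draw below p counts as an acceptance
        tries += 1
    return tries


def simulated_mean(p, parties=100_000, seed=1):
    """Average number of asks over many simulated parties."""
    rng = random.Random(seed)
    total = sum(tries_until_success(p, rng) for _ in range(parties))
    return total / parties


def partial_sum_mean(p, terms=1000):
    """Add up n * P(X = n) for n = 1, ..., terms (a truncated version of the sum)."""
    return sum(n * (1 - p) ** (n - 1) * p for n in range(1, terms + 1))


p = 0.2  # illustrative value
print(simulated_mean(p))
print(partial_sum_mean(p))
```

Both numbers should come out close to 1/p. The simulated figure will wobble a little from run to run if you change the seed, while the partial sum is limited only by where the sum is cut off.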

You will not have to know it, but for the record, the variance of the geometric distribution is (1-p)/p^{2}.