Statistical reasoning is based on the theory of probability. Particularly important are conditional probability (included in chapter 6) and random variables (chapter 7).
Random Process--a process that can be repeated many times with different outcomes
Sample Space--the set of all possible outcomes of a random process
{H,T} {0,1,2} {R,G,B}
Discrete vs. Continuous
Example: High Temperature of the Day: {T<0, 0<=T<20, 20<=T<50, 50<=T<100, 100<=T}
Rules of a sample space: no overlaps and no omissions. Therefore, the probabilities of the outcomes sum to 1
What is wrong with {T<0, 0<=T<50, 20<=T<80, 100<=T}?
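One rough way to check the two rules is to test a few temperatures against each list of outcomes and count how many outcomes each temperature falls into. The sketch below assumes each outcome is encoded as a (low, high) pair; that encoding, the helper name `count_matches`, and the test temperatures are illustrative choices, not part of the lecture.

```python
import math

# Each outcome is the interval low <= T < high (illustrative encoding).
good = [(-math.inf, 0), (0, 20), (20, 50), (50, 100), (100, math.inf)]
bad = [(-math.inf, 0), (0, 50), (20, 80), (100, math.inf)]

def count_matches(space, t):
    """Return how many outcomes in the sample space contain temperature t."""
    return sum(1 for low, high in space if low <= t < high)

# In a valid sample space every temperature lands in exactly one outcome.
for t in (-5, 10, 30, 90, 105):
    print(t, count_matches(good, t), count_matches(bad, t))
```

A count of 2 signals an overlap, and a count of 0 signals an omission.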
We will use two definitions of independence: logical independence and statistical independence. For now, we use logical independence--there is no reason to believe that two events affect one another
Flip Three Coins
{HHH,HHT,HTH,HTT,THH,THT,TTH,TTT}
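The eight outcomes can be generated mechanically rather than listed by hand. This is a minimal Python sketch; the variable name `sample_space` is just an illustrative choice.

```python
from itertools import product

# Every ordered result of three flips, where each flip is H or T.
sample_space = ["".join(flips) for flips in product("HT", repeat=3)]

print(sample_space)
print(len(sample_space))  # 2 * 2 * 2 = 8 outcomes
```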
In a deck of cards, what is the probability that the first card you turn over is an ace? There are 4 aces among 52 cards, so the probability is 4/52 = 1/13. If the first card is an ace, is the probability of getting an ace on the second card still 1/13? No. The events are not independent, because once you turn over one ace, there are only 51 cards remaining, and only three of them are aces. To make the events independent, you would have to put the first ace back in the deck and reshuffle before you draw the second card.
In general, if you have a relatively small, fixed number of possibilities to choose from, and you choose more than one, the results are not independent. Only if you put the first choice back and "reshuffle" will you have independence. Putting back and reshuffling is called "sampling with replacement." Drawing again without putting the first choice back is called "sampling without replacement."
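The difference between the two kinds of sampling can be checked with exact fractions. This is a small Python sketch of the ace example; the variable names are illustrative.

```python
from fractions import Fraction

aces, cards = 4, 52

# First card: 4 aces among 52 cards.
p_first_ace = Fraction(aces, cards)

# Second card, given the first was an ace.
# Without replacement: 3 aces remain among 51 cards.
p_second_without = Fraction(aces - 1, cards - 1)
# With replacement: the deck is restored, so nothing changes.
p_second_with = Fraction(aces, cards)

print(p_first_ace)       # 1/13
print(p_second_without)  # 1/17
print(p_second_with)     # 1/13
```

Only with replacement does the second probability match the first, which is what independence requires here.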
Component Failure vs. System Failure
Boat in the middle of a lake with 2 engines--only if both engines fail are you stuck
One failure ruins the entire system--serial Christmas lights; single-elimination tournament
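The two failure models can be compared numerically. The sketch below is only an illustration: it assumes every component fails independently with the same probability, so that probabilities multiply, and both the failure probability of 1/10 and the count of 10 bulbs are made-up values rather than figures from the lecture.

```python
from fractions import Fraction

p_fail = Fraction(1, 10)  # hypothetical chance that any one component fails

# Boat with 2 engines: you are stuck only if both engines fail.
p_boat_stuck = p_fail * p_fail

# Serial lights: the string works only if every bulb works.
n_bulbs = 10  # hypothetical number of bulbs
p_string_works = (1 - p_fail) ** n_bulbs
p_string_fails = 1 - p_string_works

print(p_boat_stuck)           # 1/100
print(float(p_string_fails))  # roughly 0.65
```

Under these assumptions the backup engine makes the boat far more reliable than a single engine, while the serial string is far less reliable than a single bulb.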
When one team is ahead 3-0 in a best-of-seven play-off, what has to happen for that team to lose the play-off?
Suppose you are behind 3-0 in a best-of-seven playoff, meaning that you need to win 4 games in a row in order to avoid failure. If the probability of winning a game is .5, what is the probability of winning four in a row? (Assume that each game is independent)
Multiplication rule: when events are independent, we multiply probabilities. (.5)(.5)(.5)(.5) = .0625
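The multiplication rule is easy to express as a small helper. This is a sketch in Python, assuming the events are independent; the function name `prob_all` is an illustrative choice.

```python
from math import prod

def prob_all(probabilities):
    """Probability that all of the given independent events occur."""
    return prod(probabilities)

# Winning four games in a row when each game is a 50-50 proposition.
p_four_wins = prob_all([0.5] * 4)
print(p_four_wins)  # 0.0625
```

The same helper works when the individual probabilities differ, as long as the events are independent.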
If a batter has a .3 chance of getting a hit, what is the probability of getting a hit twice in a row? Three times in a row? (Assume independence.)
If a lawyer has a probability of .6 of winning one case and a probability of .7 of winning a second case, and the two cases are independent, what is the probability of winning both cases? What is the probability of winning no cases?
Using the multiplication rule, to win both cases you multiply (.6)(.7) = .42. To lose both cases, you multiply the probabilities of losing each case, that is (.4)(.3) = .12. Since we win both cases 42 percent of the time and lose both cases 12 percent of the time, we must win exactly one case 46 percent of the time.
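The arithmetic for the two cases can be verified with exact fractions, including a direct calculation of "exactly one win" as a cross-check on the subtraction. This is a sketch; the variable names are illustrative.

```python
from fractions import Fraction

p_win_first = Fraction(6, 10)
p_win_second = Fraction(7, 10)

# Multiplication rule for independent events.
p_both = p_win_first * p_win_second              # 0.42
p_none = (1 - p_win_first) * (1 - p_win_second)  # 0.12

# Whatever is left over must be "exactly one win".
p_exactly_one = 1 - p_both - p_none              # 0.46

# Cross-check: win the first and lose the second, or the reverse.
p_exactly_one_direct = (p_win_first * (1 - p_win_second)
                        + (1 - p_win_first) * p_win_second)

print(float(p_both), float(p_none), float(p_exactly_one))
print(p_exactly_one == p_exactly_one_direct)  # True
```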
Is the probability of rolling a sum of 7 the same as the probability of rolling a sum of 3?
Out of 36 possibilities, two produce a sum of 3 and six produce a 7. Another way to see this is to think of rolling the dice one at a time. If the first die produces a 3,4,5, or 6, then there is no way to get a total of 3 on two dice. If the first die produces a 1 or 2, then you have a 1/6 chance of getting a 3 on two dice. To summarize:
Value on First Die | Number Required on Second Die | Overall Probability |
---|---|---|
1 | 2 | 1/6 x 1/6 = 1/36 |
2 | 1 | 1/6 x 1/6 = 1/36 |
3,4,5,6 | impossible | 0 |
Overall, the probability of rolling a sum of 3 is 1/36 + 1/36 = 2/36. What is the probability of rolling a sum of 4? Of 5?
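The counts for two dice can also be found by listing all 36 equally likely rolls and tallying the sums. This is a minimal Python sketch; the helper name `prob_sum` is an illustrative choice, and it can be called with any total from 2 to 12.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Tally the sum for each of the 36 ordered rolls of two dice.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

def prob_sum(total):
    """Probability that two fair dice add up to the given total."""
    return Fraction(counts[total], 36)

print(counts[3], counts[7])  # 2 6
print(prob_sum(3))           # 1/18, that is, 2/36
print(prob_sum(7))           # 1/6, that is, 6/36
```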
Key terminology: random process; sample space; independent events; sampling with replacement; multiplication rule; failure models--component vs. system