5.3. Probability

Consider a six-sided die (see the image of two six-sided dice in the previous section). A roll of the die selects a number between 1 and 6 inclusive, that is, including the end points 1 and 6 as well as 2, 3, 4, and 5. The selection is approximately random; slight imperfections (physical asymmetries) in a real die keep it from being perfectly random.

Suppose you roll a perfect die \(n\) times and keep a record of each outcome.

We list the following examples: What is the chance or probability that…

1. the die lands on a given number (say, 3) on the first roll?
2. the die lands on a given number (say, 4) on the second roll?
3. the die lands on a 1 or 6 on the first roll?
4. the die lands on the same number on the first two rolls?
5. the die lands on 1, 2, 3, 4, 5, and then 6 when rolled six times (\(n=6\))?

The probability of an event, given a set of all of the equally likely outcomes, is the ratio

\[\frac{\text{the number of outcomes that produce the event}}{\text{the total number of possible outcomes}}.\]

Does this definition help you answer Example 1? Give it a try. Then read on!

We can answer all of Examples 1–5 approximately by using computer simulation, in which the computer imitates the rolling of a physical die. We generate (pseudo-)random numbers many times and see what fraction of the cases result in the target outcome. We can also answer Examples 1–5 exactly by using mathematical ideas.

5.3.1. Definitions and Facts

Definition. The sample space is the set of possible outcomes. The sample space for the roll of the die is {1, 2, 3, 4, 5, 6}.

Definition. An event \(E\) corresponds to a subset of the sample space. For example, one event \(E\) is that the die lands on a 1 or a 6. The event \(E\) corresponds to the subset {1, 6}.

Example 1. The probability that the die lands on a given number (say, 3) on the first roll is 1/6.

Fact. The probability of an event \(A\) is a real number \(P(A)\) between 0 and 1 inclusive, for example 1/6.

Fact. The probability of the sample space is 1. For example, the probability that the die lands on a 1 or 2 or 3 or 4 or 5 or 6 is 6/6 or 1.

Fact. The probability that event \(A\) does not occur is \(P(\text{not } A) = 1 - P(A)\).

Example 3. The probability that the die lands on a 1 or a 6 is 2/6 or 1/3.
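The counting definition and the facts above can be tried out directly in Python. The following is only a minimal sketch: the names sample_space and probability are hypothetical helpers, not part of the original examples, and the code assumes a fair die so that all six outcomes are equally likely.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (assumed fair).
sample_space = {1, 2, 3, 4, 5, 6}

def probability(event, space=sample_space):
    """Return P(event) as an exact fraction, assuming equally likely outcomes."""
    # Only outcomes that belong to the sample space can occur.
    favorable = event & space
    return Fraction(len(favorable), len(space))

p_three = probability({3})           # Example 1: the die lands on 3
p_one_or_six = probability({1, 6})   # Example 3: the die lands on 1 or 6
p_not_three = 1 - p_three            # complement rule: P(not A) = 1 - P(A)

print(p_three, p_one_or_six, p_not_three)
```

This prints 1/6 1/3 5/6. Using Fraction rather than floating-point division keeps the answers exact, which makes them easy to compare with the hand calculations.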

Example 4. The sample space for two rolls of the die is {11, 12,…, 16, 21, 22,…, 26, 31, 32,…, 36,…61, 62,…, 66}, where each two-digit number represents two rolls of the die. An event \(A\), in which the die lands on the same number on the first two rolls, corresponds to the set {11, 22, 33, 44, 55, 66}. The probability that the die lands on the same number on the first two rolls is 6/36 or 1/6.
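The two-roll sample space can also be listed by the computer instead of by hand. The sketch below is one possible way to do this, assuming a fair die; it represents each outcome as an ordered pair such as (1, 1) rather than as a two-digit number, and the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# All ordered pairs (first roll, second roll): (1, 1), (1, 2), ..., (6, 6).
two_roll_space = list(product(faces, repeat=2))

# Event A: both rolls show the same number.
doubles = [pair for pair in two_roll_space if pair[0] == pair[1]]

p_doubles = Fraction(len(doubles), len(two_roll_space))
print(len(two_roll_space), len(doubles), p_doubles)
```

This prints 36 6 1/6, matching the count of 6 favorable outcomes out of 36.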

Definition. Mutually exclusive events are events that cannot happen at the same time. For example, one roll of the die can yield one of six mutually exclusive events.

5.3.2. Addition rule

The addition rule for mutually exclusive events \(A\) and \(B\) is that \(P(A \ \text{or} \ B) = P(A) + P(B)\).

Example 3. The probability that the die lands on 1 (event \(A\)) or on 6 (event \(B\)) is \(P(A \ \text{or} \ B) = P(A) + P(B) = 1/6 + 1/6 = 1/3\). This agrees with the prior solution, which counted outcomes directly.
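The addition rule can be checked by counting outcomes. In the sketch below, which assumes a fair die, events are Python sets, "A or B" is the set union, and two events are mutually exclusive when the sets share no outcomes. The helper prob is a hypothetical name introduced for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
event_a = {1}   # the die lands on 1
event_b = {6}   # the die lands on 6

def prob(event):
    """Exact probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

# Mutually exclusive events have no outcomes in common.
mutually_exclusive = event_a.isdisjoint(event_b)

p_a_or_b = prob(event_a | event_b)       # count the outcomes in the union
p_sum = prob(event_a) + prob(event_b)    # addition rule

print(mutually_exclusive, p_a_or_b, p_sum)
```

This prints True 1/3 1/3, so counting the union and adding the two probabilities give the same answer for these two events.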

5.3.3. Multiplication rule

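The multiplication rule applies to independent events, that is, events for which the occurrence of one does not change the probability of the other. (Mutually exclusive events cannot both occur, so for them \(P(A \ \text{and} \ B) = 0\).) The multiplication rule for independent events \(A\) and \(B\) is that \(P(A \ \text{and} \ B) = P(A) P(B).\)

Example 4. The die lands on the same number on the first two rolls when the first roll lands on any number (event \(A\), which has probability 1) and the second roll lands on that same number (event \(B\), which has probability 1/6). One roll does not affect the other, so \(P(A \ \text{and} \ B) = P(A) P(B) = 1 \cdot 1/6 = 1/6\). This agrees with the prior solution.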
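The product rule can be checked by enumeration for two events that concern different rolls, so that neither roll affects the other. The sketch below assumes a fair die; the two events (first roll is even, second roll is greater than 4) and the helper prob are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered pairs (first roll, second roll).
space = list(product(range(1, 7), repeat=2))

def prob(condition):
    """Exact probability that a two-roll outcome satisfies condition."""
    favorable = [pair for pair in space if condition(pair)]
    return Fraction(len(favorable), len(space))

p_a = prob(lambda pair: pair[0] % 2 == 0)    # event A: first roll is even
p_b = prob(lambda pair: pair[1] > 4)         # event B: second roll is 5 or 6
p_a_and_b = prob(lambda pair: pair[0] % 2 == 0 and pair[1] > 4)

print(p_a, p_b, p_a_and_b, p_a * p_b)
```

This prints 1/2 1/3 1/6 1/6: counting the outcomes in which both events occur gives the same value as multiplying the two separate probabilities.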

5.3.4. Simulation

This section opened with Examples 1–5 above. We said we could use a computer program to simulate the rolling of a physical die to answer the questions approximately.

Let’s write a roll function to generate (pseudo-)random numbers many times. We’ll approximate the probability of an event by calculating the fraction of the total cases that result in the event. We see a possible output below.

import random

# Return a list of num_rolls die rolls.
def roll(num_rolls):
    # Define a list for the results of the die rolls.
    die_rolls = []
    # Roll the die num_rolls times.
    for i in range(num_rolls):
        # Store a number 1-6 inclusive in die_roll.
        die_roll = random.randint(1, 6)
        # Append the die roll into the list of die rolls.
        die_rolls.append(die_roll)

    return die_rolls

# Store ten rolls of the die in a list.
rolls = roll(10)
# Print the list.
print(rolls)
# Print the number of threes that were rolled.
print(f"Number of threes: {rolls.count(3)}")

[3, 3, 4, 4, 1, 3, 3, 2, 4, 5]
Number of threes: 4

Simulating Example 1. Answer Example 1 from the beginning of this section approximately by simulating a large number of rolls and counting the number of threes. Compare the result to the mathematical answer in a complete sentence. We see a possible output below.

For large number of trials, the simulation should yield a fraction of threes approximately equal to 1/6 or 0.16666666666666666.
Result obtained: 0.16629

For 100,000 simulated rolls of the die, the fraction of values of 3 is typically rather close to the mathematical probability of rolling a 3.
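One possible way to carry out this simulation is sketched below. It is not necessarily the code that produced the output above: the function name fraction_of_threes and the choice of 100,000 rolls are assumptions made for illustration.

```python
import random

def fraction_of_threes(num_rolls):
    """Roll a simulated die num_rolls times and return the fraction of 3s."""
    threes = 0
    for _ in range(num_rolls):
        # random.randint(1, 6) returns an integer from 1 to 6 inclusive.
        if random.randint(1, 6) == 3:
            threes += 1
    return threes / num_rolls

num_rolls = 100_000
result = fraction_of_threes(num_rolls)
print(f"Mathematical probability: {1/6}")
print(f"Result obtained: {result}")
```

The result changes from run to run because the rolls are (pseudo-)random, but for this many rolls it usually differs from 1/6 only in the third decimal place.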

Simulating Example 3. Answer Example 3 from the beginning of this section approximately by simulating a large number of rolls and counting the number of ones and sixes. Compare the result to the mathematical answer in a complete sentence. We see a possible output below.

For large number of trials, the simulation should yield a fraction approximately equal to 2/6 or 0.3333333333333333.
Result obtained: 0.336
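A sketch of this simulation, under the same assumption of a fair die, might build the whole list of rolls first and then count the ones and sixes with the list method count. The number of rolls used here (100,000) is an arbitrary choice and is not claimed to be the one behind the output above.

```python
import random

num_rolls = 100_000

# Simulate all of the rolls first, then count the target outcomes.
rolls = [random.randint(1, 6) for _ in range(num_rolls)]
hits = rolls.count(1) + rolls.count(6)

fraction = hits / num_rolls
print(f"Mathematical probability: {2/6}")
print(f"Result obtained: {fraction}")
```

Adding the two counts is valid because a single roll cannot be both a 1 and a 6.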

Simulating Example 4. Answer Example 4 from the beginning of this section approximately by doing a large number of trials of two rolls of a die and counting the fraction of times the two rolls are the same. Compare the result to the mathematical probability in a complete sentence.

For large number of trials, the simulation should yield a fraction of 'double numbers' approximately equal to 1/6 or 0.16666666666666666.
Result obtained: 0.16767

For 100,000 simulated trials of two rolls of the die, the fraction of trials in which the two rolls came up the same is typically rather close to the mathematical probability of the two rolls coming up the same.
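Here each trial consists of two rolls, so the simulation counts trials rather than individual rolls. The sketch below shows one way this could be written; the function name fraction_of_doubles is hypothetical, and a fair die is assumed.

```python
import random

def fraction_of_doubles(num_trials):
    """Run num_trials trials of two rolls each and return the fraction
    of trials in which both rolls show the same number."""
    doubles = 0
    for _ in range(num_trials):
        first = random.randint(1, 6)
        second = random.randint(1, 6)
        if first == second:
            doubles += 1
    return doubles / num_trials

num_trials = 100_000
result = fraction_of_doubles(num_trials)
print(f"Mathematical probability: {1/6}")
print(f"Result obtained: {result}")
```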

5.3.5. Law of large numbers

In probability theory, the law of large numbers (LLN) says that the average of the results obtained from a large number of trials should be close to the expected value, and that the average tends to become closer to the expected value as more trials are performed. (This statement is based on the Wikipedia article on the LLN, citing A Modern Introduction to Probability and Statistics: Understanding Why and How, by F.M. Dekking, C. Kraaikamp, H.P. Lopuhaa, and L.E. Meester. Springer (Delft, The Netherlands, 2005).)

Example. A single roll of a fair, six-sided die produces 1, 2, 3, 4, 5, or 6, each with equal probability. Therefore, the expected value of a single roll, and hence of the average of the rolls, is (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5. According to the LLN, if a six-sided die is rolled a large number of times, the average value will approach 3.5 as the number of die rolls approaches infinity. We will illustrate this idea using simulations with a larger and larger number of rolls. We see a possible output below.

for i in range(6):
    n = 10**(i+1)
    print(f"{n}, {sum(roll(n))/n}")

10, 3.3
100, 3.42
1000, 3.547
10000, 3.5135
100000, 3.49626
1000000, 3.498728

For a number of rolls \(n=10, 100, 1000, 10^4, 10^5, 10^6\), the average value is, for example, 3.3, 3.42, 3.547, 3.5135, 3.49626, 3.498728, respectively.

5.3.6. Gambler’s fallacy

“…the law of large numbers does not imply, as too many seem to think, that if deviations from expected behavior occur, these deviations are likely to be ‘evened out’ by opposite deviations in the future. This misapplication of the law of large numbers is known as the gambler’s fallacy” (Introduction to Computation and Programming Using Python by Guttag).

Example. At the Casino de Monte-Carlo on August 18, 1913, the roulette ball landed on black 26 times in a row, and gamblers lost millions betting on red, incorrectly believing that landing on red became increasingly likely as the length of the run of blacks increased (https://www.bbc.com/future/article/20150127-why-we-gamble-like-monkeys , 1/27/2015). If you assume that there is a probability of 0.5 of landing on black and a probability of 0.5 of landing on red, what is the probability that 26 spins will result in black all 26 times? What is the probability that the 27th spin will land on black?

# If you assume that there is a probability of 0.5 of landing on black and a probability of 0.5 of landing on red,
# the probability that 26 spins will result in black all 26 times is
print((0.5)**26)
print(1.5/100_000_000)
# If you do 100 million trials of spinning 26 times each, getting black 26 times will typically happen only once or twice.

1.4901161193847656e-08
1.5e-08

If you spin a 27th time, the probability of landing on black is 0.5.
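A simulation can make this point concrete. The sketch below assumes, as in the example, that each spin lands on black with probability 0.5 independently of earlier spins. It looks only at spins that immediately follow a run of blacks and reports how often those spins land on black. A run of 5 is used instead of 26 because runs of 26 are far too rare to observe in a short simulation; the function name black_after_streak is hypothetical.

```python
import random

def black_after_streak(num_spins, streak_length):
    """Return the fraction of spins landing on black among the spins that
    immediately follow a run of at least streak_length blacks."""
    current_run = 0       # length of the current run of blacks
    followups = 0         # spins that came right after a long enough run
    black_followups = 0   # how many of those spins landed on black
    for _ in range(num_spins):
        is_black = random.random() < 0.5
        if current_run >= streak_length:
            followups += 1
            if is_black:
                black_followups += 1
        # Extend the run on black; reset it on red.
        current_run = current_run + 1 if is_black else 0
    if followups == 0:
        raise ValueError("no run of the requested length occurred")
    return black_followups / followups

fraction = black_after_streak(1_000_000, 5)
print(f"Fraction of blacks right after a run of 5 blacks: {fraction}")
```

The printed fraction is typically close to 0.5: under these assumptions, a run of blacks makes the next spin neither more nor less likely to land on red.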

5.3.7. Exercises

Exercise 1: Do Example 5 from the beginning of the Probability section mathematically: Suppose you roll a perfect die \(n\) times and keep a record of each outcome. What is the chance or probability that the die lands on 1, 2, 3, 4, 5, and then 6 when rolled six times (\(n=6\))? Use Python to evaluate your answer. Explain your conclusion in a complete sentence.

Exercise 2: Do Example 5 from the beginning of the Probability section computationally: Suppose you roll a perfect die \(n\) times and keep a record of each outcome. What is the chance or probability that the die lands on 1, 2, 3, 4, 5, and then 6 when rolled six times (\(n=6\))? Do a large number of trials of six rolls of a die and count the number of times the rolls come out as 1 followed by 2, followed by 3, followed by 4, followed by 5, followed by 6. A piece of syntax you may consider using is if (rolls[0] == 1 and rolls[1] == 2 and rolls[2] == 3 and rolls[3] == 4 and rolls[4] == 5 and rolls[5] == 6):. Compare the fraction of times you get this pattern to your mathematical answer from the previous exercise in a complete sentence.

Exercise 3: If event \(A\) is that the die comes up 2, and event \(B\) is that the die comes up with an even number, are the two events \(A\) and \(B\) mutually exclusive? Explain. Is the addition rule \(P(A \ \text{or} \ B) = P(A) + P(B)\) applicable? If so, show that the rule holds. If not, show that the rule does not hold. Use Python to illustrate your answer.

Exercise 4: Calculate the probability (a) mathematically and (b) computationally of obtaining the given result in \(n\) rolls of a fair six-sided die. Make a statement in a complete sentence about how many times you expect the outcome, in terms of a number of trials (consisting of \(n\) rolls per trial). Compare the fraction of times your simulation produces the target pattern with your mathematical answer.

\(n=6\), all numbers are the same
\(n=6\), at least one number is different
\(n=1\), the number rolled is greater than 2
\(n=2\), the sum of the two numbers is greater than 2
\(n=2\), the two numbers are different
\(n=3\), two numbers are the same and one is different
\(n=6\), all the numbers are different
\(n=100\), all the numbers are the same
\(n=6\), the number 1 appears at least once
\(n = 6\), the number 1 appears at most once

Exercise 5: Consider the function flip below. Summarize its purpose in a complete sentence. Add comments (with #) to clarify the code. Demonstrate its performance on a sample run. Summarize the results in a complete sentence.

Exercise 6: Consider the function flip_trials_preliminary below. Add comments to explain the code. Summarize the purpose of the function in a complete sentence. Demonstrate its performance on a sample run. Summarize the results in a complete sentence.

Exercise 7: When a fair coin is flipped once, the theoretical probability that the outcome will be heads is equal to 1/2. Therefore, according to the LLN, the proportion of heads in a “large” number \(n\) of coin flips “should be” roughly 1/2, and the proportion of heads approaches 1/2 as \(n\) approaches infinity. Illustrate this idea using simulation with a larger and larger number of flips.

import random

def flip(num_flips):
    head_count = 0
    for i in range(num_flips):
        if random.choice(('H', 'T')) == 'H':
            head_count += 1
    return head_count/num_flips

def flip_trials_preliminary(num_flips_per_trial, num_trials):
    fraction_heads = []
    for i in range(num_trials):
        fraction_heads.append(flip(num_flips_per_trial))
    mean = sum(fraction_heads) / len(fraction_heads)
    return mean