MTH 245 Lesson 11 Notes: Conditional Events and the Multiplication Rule of Probability
For two events $A$ and $B$, the probability of $B$, assuming that we have already observed an outcome in $A$ during a previous replication of the experiment, is denoted by $P(B \mid A)$. (This is the conditional probability of $B$ given $A$.)
To calculate the probability that an event $A$ occurs in one replication of an experiment and a second event $B$ occurs in a separate, subsequent replication, we multiply $P(A)$ by the probability of $B$, but when we calculate the probability of $B$, we need to account for any effect of the first replication on the outcome of the second. That is, we use the conditional probability $P(B \mid A)$. This is the Multiplication Rule of Probability, and is expressed mathematically as follows:
$$P(A \cap B) = P(A) \times P(B \mid A)$$
Example 1: Suppose two cards are drawn in sequence and without replacement from a standard 52-card poker deck. (That is, when we draw the first card, we do not replace it in the deck for the second draw.) What is the probability that the first card will be the king of hearts and the second card will be a spade?
Define $A = \{\text{King of Hearts, Draw \#1}\}$ and $B = \{\text{Spade, Draw \#2}\}$. Note that after the first draw, there is one fewer card in the deck, but since the missing card (the King of Hearts) is not a spade, there are still 13 spades in the deck. Therefore,
$$P(A \cap B) = P(A) \times P(B \mid A) = P(\text{KoH, Draw 1}) \times P(\text{Spade, Draw 2} \mid \text{KoH was removed after Draw 1}) = \frac{1}{52} \times \frac{13}{51} \approx 0.005.$$
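
As a cross-check on Example 1, the following Python sketch enumerates every ordered pair of distinct cards and counts the favorable ones. The card encoding (rank and suit strings) and all variable names are illustrative assumptions, not part of the lesson.

```python
from fractions import Fraction
from itertools import permutations, product

# Build a standard 52-card deck as (rank, suit) pairs; this encoding is an assumption.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

# Every ordered pair of distinct cards is an equally likely outcome
# when two cards are drawn in sequence without replacement.
pairs = list(permutations(deck, 2))

# Favorable outcomes: King of Hearts on the first draw, any spade on the second.
favorable = [
    (first, second)
    for first, second in pairs
    if first == ("K", "hearts") and second[1] == "spades"
]

p_exact = Fraction(len(favorable), len(pairs))
print(p_exact, round(float(p_exact), 3))
```

The enumeration finds 13 favorable pairs out of 52 × 51 = 2,652, which reduces to 1/204, or about 0.005, in agreement with the Multiplication Rule.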
Independent Events
Two events A and B are said to be independent if the outcome of one doesn’t affect the probability of the outcome of the other. Formally, this is expressed as follows:
$$P(B \mid A) = P(B) \quad \text{and} \quad P(A \mid B) = P(A)$$
If two events are not independent, they are said to be dependent. Note that dependence does not imply there is a cause-and-effect relationship between the events, but merely that probability calculations for one replication of the experiment depend on the results of the other replication in some way. If A and B are independent, the Multiplication Rule simplifies as follows:
$$P(A \cap B) = P(A) \times P(B)$$
From the standpoint of sampling, selection with replacement produces independent events, while selection without replacement results in dependent events.
Example 2: Suppose 50 individuals in a drug monitoring program are tested, and of these, 6 tested positive and 44 tested negative. If two users are selected at random from the 50 people in the program, what is the probability that the first selection tested positive and the second selection tested negative? Calculate the probability both with and without replacement.
Define $A = \{\text{Positive, Selection \#1}\}$ and $B = \{\text{Negative, Selection \#2}\}$.
Without replacement: After the first selection, there is one fewer individual to choose from (the positive tester from Selection #1), but there are still 44 negative testers available for Selection #2. Therefore,
$$P(A \cap B) = P(A) \times P(B \mid A) = \frac{6}{50} \times \frac{44}{49} \approx 0.108.$$
With replacement: After the first selection, there are still 50 individuals to choose from, 44 of whom tested negative. Therefore,
$$P(A \cap B) = P(A) \times P(B) = \frac{6}{50} \times \frac{44}{50} \approx 0.106.$$
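
The two calculations in Example 2 differ only in the denominator used for the second selection. A minimal Python sketch of that difference is shown here; the function name `positive_then_negative` and its parameters are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def positive_then_negative(n_pos, n_neg, replace):
    """P(first selection is positive AND second selection is negative)."""
    total = n_pos + n_neg
    p_a = Fraction(n_pos, total)
    # With replacement the pool is unchanged; without replacement one
    # positive tester has left the pool, so the denominator shrinks by one.
    remaining = total if replace else total - 1
    p_b_given_a = Fraction(n_neg, remaining)
    return p_a * p_b_given_a

without = positive_then_negative(6, 44, replace=False)
with_repl = positive_then_negative(6, 44, replace=True)
print(without, round(float(without), 3))      # exact fraction, then rounded decimal
print(with_repl, round(float(with_repl), 3))
```

Exact fractions are used so that rounding happens only once, at the end. The results are 132/1225 (about 0.108) without replacement and 66/625 (about 0.106) with replacement.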
Example 3: Suppose an urn contains 5 black marbles and 3 white marbles. Three marbles are randomly drawn from the urn in succession. What is the probability of selecting black marbles on all three draws? Calculate the probability both with and without replacement.
Define $A = \{\text{Black marble, Selection \#1}\}$, $B = \{\text{Black marble, Selection \#2}\}$, and $C = \{\text{Black marble, Selection \#3}\}$.
Without replacement: After each round, there is one fewer black marble to choose from, and the total count of marbles also decreases by one. Therefore,
$$P(A \cap B \cap C) = P(A) \times P(B \mid A) \times P(C \mid A \cap B) = \frac{5}{8} \times \frac{4}{7} \times \frac{3}{6} \approx 0.179.$$
With replacement: After each round, there are still 8 marbles to choose from, 5 of which are black. Therefore,
$$P(A \cap B \cap C) = P(A) \times P(B) \times P(C) = \frac{5}{8} \times \frac{5}{8} \times \frac{5}{8} = \left(\frac{5}{8}\right)^{3} \approx 0.244.$$
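
One informal way to check the answers in Example 3 is to simulate the urn many times and see how often all three draws come up black. The sketch below is an illustration under assumptions, not part of the lesson: the function name, the trial count, and the fixed seed are arbitrary choices, and the output is an estimate, not an exact probability.

```python
import random

def estimate_all_black(n_black, n_white, draws, replace, trials=100_000, seed=0):
    """Estimate P(all draws are black) by repeated simulated draws."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    urn = ["black"] * n_black + ["white"] * n_white
    hits = 0
    for _ in range(trials):
        if replace:
            drawn = rng.choices(urn, k=draws)  # sampling with replacement
        else:
            drawn = rng.sample(urn, draws)     # sampling without replacement
        if all(marble == "black" for marble in drawn):
            hits += 1
    return hits / trials

est_without = estimate_all_black(5, 3, draws=3, replace=False)
est_with = estimate_all_black(5, 3, draws=3, replace=True)
print(est_without, est_with)
```

With this many trials, the two estimates should land close to the calculated values of 0.179 (without replacement) and 0.244 (with replacement), though they will not match them exactly.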
Treating Dependent Events as Independent (the Five Percent Condition)
As we've seen, probability calculations can quickly become complicated when replications of an experiment are dependent on each other, as happens when sampling without replacement. Fortunately, under a certain condition—the Five Percent Condition—we can treat the replications as if they were independent. This makes calculating probabilities for large numbers of replications easier.
The Five Percent Condition gets its name from the comparison of the sample size $n$ (which equals the number of replications) to the population size $N$. If $n$ is no more than five percent of $N$ (in other words, if $n/N \le 0.05$), the Condition applies, and we can calculate probabilities as if we were sampling with replacement, even if we really aren't.
Example 5: Suppose that out of 810 scales at two airports, 102 are known to be defective. If two of the 810 scales are selected at random for inspection, what is the probability that they will both be defective? Calculate both with and without replacement.
We want to find $P(A \cap B)$, where $A = \{\text{Defective, Round 1}\}$ and $B = \{\text{Defective, Round 2}\}$. Note that we are sampling $n = 2$ scales from a population of $N = 810$. Since $n/N \approx 0.002 < 0.05$, the Five Percent Condition applies and we can calculate $P(A \cap B)$ as if we were sampling the scales with replacement (realistically, we wouldn't do that because we'd want to inspect different scales):
$$P(A \cap B) = P(A) \times P(B) = \frac{102}{810} \times \frac{102}{810} = \left(\frac{102}{810}\right)^{2} \approx 0.0159.$$
For the sake of comparison, the calculations without replacement are
$$P(A \cap B) = P(A) \times P(B \mid A) = \frac{102}{810} \times \frac{101}{809} \approx 0.0157,$$
which is close to the previous probability (they round to the same number to three decimal places: 0.016).
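
To make the comparison in Example 5 concrete, the sketch below checks the Five Percent Condition and then measures how far apart the two answers are. The helper names `five_percent_condition` and `both_defective` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

def five_percent_condition(n, N):
    """True when the sample size n is at most 5% of the population size N."""
    return Fraction(n, N) <= Fraction(5, 100)

def both_defective(defective, N):
    """Return (with-replacement, without-replacement) probabilities for two draws."""
    p_with = Fraction(defective, N) ** 2
    p_without = Fraction(defective, N) * Fraction(defective - 1, N - 1)
    return p_with, p_without

ok = five_percent_condition(2, 810)
p_with, p_without = both_defective(102, 810)
# Relative gap between the approximate and the exact answer.
rel_diff = float((p_with - p_without) / p_without)
print(ok, round(float(p_with), 4), round(float(p_without), 4), round(rel_diff, 4))
```

For these numbers the condition holds, and treating the draws as independent changes the answer by less than 1% in relative terms.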
In the previous example, we didn't have to make use of the Five Percent Condition because the exact calculations were relatively straightforward. However, as this next example shows, as the problem's dimensions increase, exact calculations quickly become overwhelming.
Example 6: Suppose that out of 10,000 scales at multiple airports, 1,259 are known to be defective. What is the probability that any 5 randomly selected scales will all be defective? Calculate both with and without replacement.
We are now sampling $n = 5$ scales from a population of $N = 10{,}000$. Since $n/N = 0.0005 < 0.05$, the Five Percent Condition applies and we can treat the scales as if they've been sampled with replacement:
$$P(\text{all 5 defective}) = \frac{1259}{10000} \times \frac{1259}{10000} \times \frac{1259}{10000} \times \frac{1259}{10000} \times \frac{1259}{10000} = \left(\frac{1259}{10000}\right)^{5} \approx 0.0000316.$$
The calculations without replacement are
$$P(\text{all 5 defective}) = \frac{1259}{10000} \times \frac{1258}{9999} \times \frac{1257}{9998} \times \frac{1256}{9997} \times \frac{1255}{9996} \approx 0.0000314.$$
Again, the two probabilities are close to equal.
To go beyond Example 6, if I wanted to expand the number of scales selected to 1,000, I could simply change the exponent in the first calculation to 1,000 instead of 5. However, if I wanted to expand the second formula, I would have to multiply 995 more terms, each with numerator and denominator decreased by 1. This is what makes the Five Percent Condition so powerful. (Strictly speaking, 1,000 scales would be 10% of the 10,000 in the population, so the Condition itself would not be met for a sample that large; the comparison here is only about the amount of computation.)
Note: If you have taken a statistics course before, especially AP Statistics, you were probably introduced to the similar Ten Percent Condition. Ten percent is less conservative, but it leads to inaccurate probabilities in certain situations. Statisticians disagree about which percentage should be used and when. We will not address this issue in MTH 245, but you should be aware it exists.
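
Returning to the idea of extending Example 6 to larger samples, the sketch below computes both versions of the probability for several sample sizes. Two things here are additions beyond the lesson and should be read as assumptions: the function name `log10_all_defective`, and the choice to work with base-10 logarithms, which is made only because the raw probabilities become too small for ordinary floating-point numbers when the sample is large.

```python
import math

def log10_all_defective(defective, N, n, replace):
    """Base-10 logarithm of P(all n sampled scales are defective)."""
    if replace:
        # Independent draws: the same factor repeated n times.
        return n * math.log10(defective / N)
    # Dependent draws: numerator and denominator each drop by 1 per draw.
    return sum(math.log10((defective - k) / (N - k)) for k in range(n))

for n in (5, 100, 1000):
    with_repl = log10_all_defective(1259, 10_000, n, replace=True)
    without = log10_all_defective(1259, 10_000, n, replace=False)
    meets_condition = n / 10_000 <= 0.05  # Five Percent Condition check
    print(n, meets_condition, round(with_repl, 2), round(without, 2))
```

For n = 5 the two logarithms are nearly identical, which corresponds to the probabilities 0.0000316 and 0.0000314 found in Example 6. The printed flag shows which sample sizes satisfy the Five Percent Condition, and the gap between the two columns grows as the sample becomes a larger share of the population.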