MTH 245 Lesson 12 Notes Random Variables
In Lesson 1, we defined the term variable as a measurement or observation associated with an individual, and we saw that a set of variable values for the individuals in a sample make up the data set. A variable may be categorical or quantitative.
A random variable (typically represented by 𝑥 or 𝑋) is a variable associated with a possible outcome of an experiment. For the rest of the course, we will mostly restrict ourselves to the case where 𝑥 takes on a single quantitative value as a result of a single replication of the experiment.
Quantitative random variables can be discrete or continuous depending on the type of data involved. We will consider the discrete case in this lesson and in Lesson 13.
Probability Distributions
A probability distribution is a function, table, or graph that lists every possible value a random variable can take on, along with its associated probability. The best way to understand a probability distribution is to think of it as a specialized form of relative frequency distribution.
To be considered a probability distribution, a function/table/graph must satisfy three criteria:
1. Every possible value of 𝑥 must be accounted for.
2. Each value of 𝑥 must be associated with a properly defined probability between 0 and 1.
3. These probabilities must all sum to 1.
Example 1: Consider an experiment where a woman gives birth to a single child and the assigned gender of the child (male or female) is recorded. Suppose that for any live birth, the probability that the baby will be a girl is 0.500. Construct and verify a probability distribution for the random variable 𝑥, the total number of girls in two live births.
Define the random variable 𝑥 as the number of girls in two live births. Since there can only be zero, one, or two girls in two live births, 𝑥 = {0, 1, 2}.
To determine the probabilities, we use the methods of Lesson 10. Define the sample space of gender sequences of two live births as 𝑆 = {boy/boy, boy/girl, girl/boy, girl/girl} and event spaces as 𝐴 = {boy/boy} = {𝑥 = 0}, 𝐵 = {boy/girl, girl/boy} = {𝑥 = 1}, and 𝐶 = {girl/girl} = {𝑥 = 2}. Because each birth is a girl with probability 0.500, the four sequences are equally likely. Using the definition of theoretical probability, 𝑃(𝐴) = 𝑃(𝑥 = 0) = 1/4, 𝑃(𝐵) = 𝑃(𝑥 = 1) = 2/4, and 𝑃(𝐶) = 𝑃(𝑥 = 2) = 1/4.
This gives us the following probability distribution:
𝑥       𝑃(𝑥)
0       1/4
1       2/4
2       1/4
Total   4/4 (= 1)
The above table meets all three criteria of a probability distribution.
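As a cross-check on Example 1, the enumeration can be sketched in Python. This is only an illustration under the example's assumption that the four gender sequences are equally likely; the names `sample_space`, `distribution`, and `is_valid_distribution` are made up for this sketch and do not come from the notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the four gender sequences for two live births.
# Assumption from Example 1: P(girl) = 0.500, so all sequences are equally likely.
sample_space = list(product(["boy", "girl"], repeat=2))

# Count how many sequences produce each value of x (the number of girls).
counts = Counter(sequence.count("girl") for sequence in sample_space)

# Theoretical probability: favorable outcomes divided by total outcomes.
distribution = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}


def is_valid_distribution(dist):
    """Check criteria 2 and 3: each probability is in [0, 1] and they sum to 1."""
    return all(0 <= p <= 1 for p in dist.values()) and sum(dist.values()) == 1


for x, p in distribution.items():
    print(x, p)
print("valid:", is_valid_distribution(distribution))
```

Note that `Fraction` reduces 2/4 to 1/2 automatically. Criterion 1 (every possible value of 𝑥 is accounted for) is satisfied here by construction, since the values come from the full sample space.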
Often, the individual values of a random variable 𝑥 are related to their probabilities through some sort of functional relationship. Many commonly used probability distributions fall into this category. The next example illustrates this concept.
Example 2: Suppose that for a random variable 𝑥, 𝑃(𝑥) = 𝑥/3 and 𝑥 = {0, 1, 2}. Construct and verify the associated probability distribution.
By definition, 𝑥 = {0, 1, 2}. To find the probability for each value of 𝑥, plug that value into the function 𝑃(𝑥). This gives us the following distribution:
𝑥       𝑃(𝑥)
0       0/3
1       1/3
2       2/3
Total   3/3 (= 1)
As with Example 1, the above table meets all three criteria of a probability distribution.
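The verification in Example 2 can also be sketched in code. The helper names below are hypothetical, and the second rule, 𝑃(𝑥) = 𝑥/2, is not from the notes; it is included only to show a function that fails the third criterion.

```python
from fractions import Fraction


def build_distribution(rule, values):
    """Tabulate P(x) for each listed value of x using the supplied rule."""
    return {x: rule(x) for x in values}


def meets_criteria(dist):
    """Criteria 2 and 3: every probability is in [0, 1] and the total is 1."""
    return all(0 <= p <= 1 for p in dist.values()) and sum(dist.values()) == 1


# Example 2 from the notes: P(x) = x/3 for x in {0, 1, 2}.
example_2 = build_distribution(lambda x: Fraction(x, 3), [0, 1, 2])

# Hypothetical rule (not from the notes): P(x) = x/2 sums to 3/2, so it fails.
not_valid = build_distribution(lambda x: Fraction(x, 2), [0, 1, 2])

print(example_2, meets_criteria(example_2))
print(not_valid, meets_criteria(not_valid))
```

The first criterion, that every possible value of 𝑥 is listed, depends on how the experiment is defined and cannot be checked from the table alone.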
Probability Calculations Using a Discrete Probability Distribution
For a random variable 𝑥, events are defined in terms of 𝑥 taking on a specific value or range of values. For example, our event space might consist of a set of outcomes for which the associated values of 𝑥 lie between 0 and 4 inclusive; that is, 𝐴 = {0 ≤ 𝑥 ≤ 4}. To find 𝑃(𝐴) = 𝑃(0 ≤ 𝑥 ≤ 4), we simply refer to the probability distribution of 𝑥 and add up all the probabilities for that range of values: 𝑃(𝐴) = 𝑃(𝑥 = 0) + 𝑃(𝑥 = 1) + 𝑃(𝑥 = 2) + 𝑃(𝑥 = 3) + 𝑃(𝑥 = 4).
Example 3: Using the distribution from Example 1, find the probability of observing one or fewer girls in a sequence of two live births.
𝑃(one or fewer) = 𝑃(𝑥 ≤ 1) = 𝑃(𝑥 = 0) + 𝑃(𝑥 = 1) = 1/4 + 2/4 = 3/4 = 0.750.
Example 4: Using the distribution from Example 2, find the probability that 𝑥 takes on the value 1 or higher.
𝑃(one or higher) = 𝑃(𝑥 ≥ 1) = 𝑃(𝑥 = 1) + 𝑃(𝑥 = 2) = 1/3 + 2/3 = 3/3 = 1.000.
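Examples 3 and 4 both follow the same pattern: pick out the values of 𝑥 that belong to the event and add their probabilities. A minimal sketch of that pattern, with a hypothetical helper `prob` and the two distributions typed in by hand from Examples 1 and 2:

```python
from fractions import Fraction

# Distributions copied from Example 1 and Example 2.
girls = {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)}
thirds = {0: Fraction(0, 3), 1: Fraction(1, 3), 2: Fraction(2, 3)}


def prob(dist, event):
    """Sum P(x) over every value of x for which event(x) is true."""
    return sum(p for x, p in dist.items() if event(x))


p_one_or_fewer = prob(girls, lambda x: x <= 1)    # Example 3: P(x <= 1)
p_one_or_higher = prob(thirds, lambda x: x >= 1)  # Example 4: P(x >= 1)

print(p_one_or_fewer, float(p_one_or_fewer))
print(p_one_or_higher, float(p_one_or_higher))
```

Describing the event as a function of 𝑥 keeps the helper general: the same call handles a single value, a one-sided range, or an interval such as 0 ≤ 𝑥 ≤ 4.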
Parameters of a Discrete Random Variable
The most common use of a probability distribution is to model the behavior of a population, particularly its parameters. The parameters of the distribution can be used as surrogates of the parameters of the population.
The mean of a discrete random variable is defined as
𝜇 = ∑[𝑥 ⋅ 𝑃(𝑥)] for all values of 𝑥
To evaluate this formula, use the following procedure:
1. Multiply each value of 𝑥 by its associated probability to get 𝑥 ⋅ 𝑃(𝑥).
2. Add all the 𝑥 ⋅ 𝑃(𝑥) terms to get 𝜇.
The parameter 𝜇 is also referred to as the expected value (or expectation or mathematical expectation) of the random variable, because it is the long-run average value of 𝑥 that we “expect” to see over an infinite number of replications of an experiment. It need not be the most frequent value of 𝑥, or even a possible value of 𝑥.
The variance of a discrete random variable is defined as
𝜎² = {∑[𝑥² ⋅ 𝑃(𝑥)]} − 𝜇² for all values of 𝑥
To evaluate this formula, use the following procedure:
1. Multiply the square of each value of 𝑥 by its associated probability to get 𝑥² ⋅ 𝑃(𝑥).
2. Add all the 𝑥² ⋅ 𝑃(𝑥) terms.
3. Take the sum from Step 2 and subtract 𝜇², the square of the mean, to get 𝜎².
The standard deviation of a discrete random variable, 𝜎, is simply the square root of its variance: 𝜎 = √(𝜎²).
Example 5: Find the expected value, variance, and standard deviation using the probability distribution from Example 1.
𝑥       𝑃(𝑥)        𝑥 ⋅ 𝑃(𝑥)          𝑥² ⋅ 𝑃(𝑥)
0       1/4         0 ⋅ 1/4 = 0       0² ⋅ 1/4 = 0
1       2/4         1 ⋅ 2/4 = 2/4     1² ⋅ 2/4 = 2/4
2       1/4         2 ⋅ 1/4 = 2/4     2² ⋅ 1/4 = 4/4
Total   4/4 (= 1)   4/4 (= 1)         6/4 (= 3/2)
𝜇 = 1, the total of the third column. The variance is the total of the fourth column minus the square of the total of the third column:
𝜎² = 3/2 − (1)² = 3/2 − 1 = 3/2 − 2/2 = 1/2 = 0.5.
The standard deviation is the square root of the variance: 𝜎 = √(1/2) ≈ 0.7.
Example 6: Find the expected value, variance, and standard deviation using the probability distribution from Example 2.
𝑥       𝑃(𝑥)        𝑥 ⋅ 𝑃(𝑥)          𝑥² ⋅ 𝑃(𝑥)
0       0           0 ⋅ 0 = 0         0² ⋅ 0 = 0
1       1/3         1 ⋅ 1/3 = 1/3     1² ⋅ 1/3 = 1/3
2       2/3         2 ⋅ 2/3 = 4/3     2² ⋅ 2/3 = 8/3
Total   3/3 (= 1)   5/3               9/3 (= 3)
𝜇 = 5/3, the total of the third column. The variance is the total of the fourth column minus the square of the total of the third column:
𝜎² = 9/3 − (5/3)² = 9/3 − 25/9 = 27/9 − 25/9 = 2/9 ≈ 0.2.
The standard deviation is the square root of the variance: 𝜎 = √(2/9) ≈ 0.5.
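The column-total procedure used in Examples 5 and 6 translates directly into a few lines of Python. The sketch below is illustrative only; the function name `summarize` is an assumption of this example, and exact fractions are used so that the results can be compared with the hand calculations.

```python
from fractions import Fraction
from math import sqrt

# Distributions copied from Example 1 and Example 2.
girls = {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)}
thirds = {0: Fraction(0, 3), 1: Fraction(1, 3), 2: Fraction(2, 3)}


def summarize(dist):
    """Return (mean, variance, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in dist.items())                   # total of the x * P(x) column
    variance = sum(x**2 * p for x, p in dist.items()) - mean**2  # total of x^2 * P(x), minus mean^2
    return mean, variance, sqrt(variance)


for name, dist in [("Example 5", girls), ("Example 6", thirds)]:
    mean, variance, sd = summarize(dist)
    print(name, "mean =", mean, "variance =", variance, "sd =", round(sd, 3))
```

Carrying more decimal places than the notes show, the standard deviations come out near 0.707 and 0.471, which round to the 0.7 and 0.5 reported above.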
Probability and Statistical Significance
In Lesson 7, we learned how to exploit the Empirical Rule to determine which values in a data set are statistically significant. We could apply the same reasoning to determine significant values of a random variable 𝑥. The problem with this, however, is that the Empirical Rule assumes the random variable's distribution is symmetric and bell-shaped; it will produce inaccurate results if the distribution is skewed. Instead, we will use probability to determine significance. The fundamental idea is that the farther away from the mean 𝜇 a value of 𝑥 is, the smaller the probability of observing that value, or any value of 𝑥 farther from 𝜇, simply by random chance.
To use the probability method to determine if some value of 𝑥, call it 𝑥₀, is significant, we need to apply one of the following tests:
− If 𝑥₀ is less than the mean 𝜇, then calculate 𝑃(𝑥 ≤ 𝑥₀) (the sum of all probabilities of 𝑥 values less than or equal to 𝑥₀). If that probability is less than or equal to 0.025, 𝑥₀ is significantly low with respect to 𝜇.
− If 𝑥₀ is greater than the mean 𝜇, then calculate 𝑃(𝑥 ≥ 𝑥₀) (the sum of all probabilities of 𝑥 values greater than or equal to 𝑥₀). If that probability is less than or equal to 0.025, 𝑥₀ is significantly high with respect to 𝜇.
Example 7: The table below is the probability distribution for the random variable 𝑥, which represents the number of girls among a sample of 10 randomly selected live births from 10 different mothers. Use the probability method to determine which values of 𝑥 are significant. Note: the distribution mean 𝜇 = 5.0.
𝑥      0      1      2      3      4      5      6      7      8      9      10
𝑃(𝑥)   0.001  0.010  0.044  0.117  0.205  0.246  0.205  0.117  0.044  0.010  0.001
Significantly low values of 𝑥:
𝑥 = 0: 𝑃(𝑥 ≤ 0) = 𝑃(𝑥 = 0) = 0.001 ≤ 0.025, so 𝑥 = 0 is significantly low.
𝑥 = 1: 𝑃(𝑥 ≤ 1) = 𝑃(𝑥 = 0) + 𝑃(𝑥 = 1) = 0.001 + 0.010 = 0.011 ≤ 0.025, so 𝑥 = 1 is significantly low.
𝑥 = 2: 𝑃(𝑥 ≤ 2) = 𝑃(𝑥 = 0) + 𝑃(𝑥 = 1) + 𝑃(𝑥 = 2) = 0.001 + 0.010 + 0.044 = 0.055 > 0.025, so 𝑥 = 2 is not significant.
𝑥 = {3, 4}: Since 𝑥 = 2 is not significant, 𝑥 = 3 and 𝑥 = 4, which are closer to 𝜇, are also not significant.
Significantly high values of 𝑥:
𝑥 = 10: 𝑃(𝑥 ≥ 10) = 𝑃(𝑥 = 10) = 0.001 ≤ 0.025, so 𝑥 = 10 is significantly high.
𝑥 = 9: 𝑃(𝑥 ≥ 9) = 𝑃(𝑥 = 10) + 𝑃(𝑥 = 9) = 0.001 + 0.010 = 0.011 ≤ 0.025, so 𝑥 = 9 is significantly high.
𝑥 = 8: 𝑃(𝑥 ≥ 8) = 𝑃(𝑥 = 10) + 𝑃(𝑥 = 9) + 𝑃(𝑥 = 8) = 0.001 + 0.010 + 0.044 = 0.055 > 0.025, so 𝑥 = 8 is not significant.
𝑥 = {6, 7}: Since 𝑥 = 8 is not significant, 𝑥 = 6 and 𝑥 = 7, which are closer to 𝜇, are also not significant.
Example 8: The table below is the probability distribution for the random variable 𝑥. Use the probability method to determine which values of 𝑥 are significant. Note: the distribution mean 𝜇 = 1.3.
𝑥      0      1      2      3      4      5
𝑃(𝑥)   0.237  0.395  0.264  0.088  0.015  0.001
Significantly low values of 𝑥:
𝑥 = 0: 𝑃(𝑥 ≤ 0) = 𝑃(𝑥 = 0) = 0.237 > 0.025, so 𝑥 = 0 is not significant.
𝑥 = 1: Since 𝑥 = 0 is not significant, 𝑥 = 1, which is closer to 𝜇, is also not significant.
Significantly high values of 𝑥:
𝑥 = 5: 𝑃(𝑥 ≥ 5) = 𝑃(𝑥 = 5) = 0.001 ≤ 0.025, so 𝑥 = 5 is significantly high.
𝑥 = 4: 𝑃(𝑥 ≥ 4) = 𝑃(𝑥 = 5) + 𝑃(𝑥 = 4) = 0.001 + 0.015 = 0.016 ≤ 0.025, so 𝑥 = 4 is also significantly high.
𝑥 = 3: 𝑃(𝑥 ≥ 3) = 𝑃(𝑥 = 5) + 𝑃(𝑥 = 4) + 𝑃(𝑥 = 3) = 0.001 + 0.015 + 0.088 = 0.104 > 0.025, so 𝑥 = 3 is not significant.
𝑥 = 2: Since 𝑥 = 3 is not significant, 𝑥 = 2, which is closer to 𝜇, is also not significant.
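The tail-probability checks in Examples 7 and 8 can be automated along the following lines. This is a rough sketch, not a prescribed method: the function name `significant_values` is hypothetical, the cutoff of 0.025 is taken from the tests above, and the comparison uses ≤ to match the worked examples. The means are the values stated in the examples rather than recomputed.

```python
def significant_values(dist, mean, cutoff=0.025):
    """Return (significantly low values, significantly high values) of x."""
    low, high = [], []
    for x0 in sorted(dist):
        if x0 < mean:
            # Lower tail: total probability of values at or below x0.
            tail = sum(p for x, p in dist.items() if x <= x0)
            if tail <= cutoff:
                low.append(x0)
        elif x0 > mean:
            # Upper tail: total probability of values at or above x0.
            tail = sum(p for x, p in dist.items() if x >= x0)
            if tail <= cutoff:
                high.append(x0)
    return low, high


# Example 7: number of girls in 10 births, stated mean 5.0.
example_7 = dict(zip(range(11), [0.001, 0.010, 0.044, 0.117, 0.205, 0.246,
                                 0.205, 0.117, 0.044, 0.010, 0.001]))
# Example 8: stated mean 1.3.
example_8 = dict(zip(range(6), [0.237, 0.395, 0.264, 0.088, 0.015, 0.001]))

print(significant_values(example_7, mean=5.0))
print(significant_values(example_8, mean=1.3))
```

Unlike the hand calculation, the sketch tests every value of 𝑥 rather than stopping at the first non-significant one; the conclusions agree because tail probabilities only grow as 𝑥₀ moves toward 𝜇.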