Statistics questions Notes 21
MTH 245 Lesson 21 Notes
Estimating a Population Parameter
A confidence interval is an estimate of the value of a population parameter. It is an interval on the real number line that is centered on a point estimate of the parameter. The interval endpoints (the lower and upper confidence limits) are determined by the margin of error. The illustration below shows the various parts of a confidence interval.
A point estimate is a single value that approximates the actual value of a population parameter. It is calculated using the sample statistic that is the best estimator of the parameter. The best estimator of μ is x̄; for p, it is p̂. Taken by itself, a point estimate isn't an adequate estimate of a parameter value. Point estimators are random variables, and they can therefore take on different values for different samples. A point estimate for one particular random sample might be close to the actual parameter value, while it may be substantially different for a different random sample from the same population. We have to take this into account by defining a range of values that could presumably contain the true parameter value. A confidence interval can be formally expressed in one of three ways:
1. Algebraic notation: lower limit < population parameter < upper limit
2. Interval notation: (lower limit, upper limit)
3. Plus-or-minus notation: point estimate ± margin of error
Confidence Level
The confidence level (written in general terms as 1 − α) is the probability that a confidence interval constructed from a random sample will contain the population parameter. Equivalently, it is the long-run proportion of such intervals that capture the parameter. The concepts of confidence and significance are directly related. Whereas the significance level α is the probability of observing a value that is significantly far from the mean, the confidence level 1 − α is the probability of observing a non-significant value. The following graph, which we first saw in Section 6.3, illustrates the relationship between the two:
The confidence level is usually reported as a percentage of the form 100 ⋅ (1 − α)%. The most commonly used confidence levels are 90%, 95%, and 99%, although others are possible.
Margin of Error
When a sample statistic calculated from a random sample is used to construct a best estimate of a population parameter, the margin of error, denoted E, is the maximum likely difference (with probability 1 − α) between the point estimate and the true value of the parameter. To find the lower confidence limit, subtract E from the point estimate; similarly, to find the upper confidence limit, add E to the point estimate.
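The rule for the two limits can be sketched in a few lines of Python. This is only an illustration: the function name `confidence_limits` and the numbers 0.250 and 0.085 are assumptions chosen for the demonstration, not values from a data set in these notes. The sketch also prints the interval in the three notations listed earlier.

```python
def confidence_limits(point_estimate, margin_of_error):
    """Return (lower, upper) limits of a symmetric confidence interval."""
    lower = point_estimate - margin_of_error  # lower confidence limit
    upper = point_estimate + margin_of_error  # upper confidence limit
    return lower, upper


# Illustrative values only (assumed for this sketch)
estimate, E = 0.250, 0.085
lower, upper = confidence_limits(estimate, E)

print(f"Algebraic notation:     {lower:.3f} < p < {upper:.3f}")
print(f"Interval notation:      ({lower:.3f}, {upper:.3f})")
print(f"Plus-or-minus notation: {estimate:.3f} ± {E:.3f}")
```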
Note that as the confidence level 1 − α increases, E also increases, which means that the higher the confidence level, the wider the interval.
Effect of Confidence Level on Width
Similarly, the larger the sample size n, the smaller E will be, leading to a narrower interval.
Effect of Sample Size on Width
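Both effects can be checked numerically. The sketch below assumes one specific margin-of-error formula that these notes do not state: E = z ⋅ σ / √n, the margin for a mean when the population standard deviation σ is known, with z the standard normal critical value. The function name `margin_of_error` and the values σ = 10 and n = 50 are likewise illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(confidence_level, sigma, n):
    """E = z * sigma / sqrt(n); assumed formula for a mean with known sigma."""
    alpha = 1 - confidence_level
    # Two-tailed critical value: area alpha/2 in each tail
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)


# Higher confidence level -> larger E (wider interval), with n held fixed
for level in (0.90, 0.95, 0.99):
    print(f"level={level:.2f}  E={margin_of_error(level, sigma=10, n=50):.3f}")

# Larger sample size -> smaller E (narrower interval), with the level held fixed
for n in (25, 100, 400):
    print(f"n={n:<4d}  E={margin_of_error(0.95, sigma=10, n=n):.3f}")
```

Under this assumed formula, quadrupling the sample size halves E, because n enters through its square root.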
Interpreting a Confidence Interval
When reporting a confidence interval estimate, we say we are 100 ⋅ (1 − α)% confident that the true value of the population parameter lies within the interval. Alternatively, we can say we are 100 ⋅ (1 − α)% confident that the population parameter is not equal to any value outside the confidence interval.
Note: The phrase "100 ⋅ (1 − α)% confidence interval" does not mean that 100 ⋅ (1 − α)% of the possible values of the population parameter lie within the interval. The population parameter is a constant that never varies, whereas the confidence interval is nearly always different for each unique sample.
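A small simulation can make this interpretation concrete: the parameter stays fixed while the interval moves from sample to sample, and about 100 ⋅ (1 − α)% of the intervals capture it. The sketch below is an illustration under stated assumptions: a normal population with a known standard deviation, the margin formula E = z ⋅ σ / √n (not given in these notes), and made-up values for the true mean, σ, n, and the number of trials.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def coverage(true_mean=50.0, sigma=10.0, n=30, level=0.95, trials=2000, seed=1):
    """Fraction of simulated confidence intervals that contain true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / sqrt(n)  # assumed formula: known sigma
    hits = 0
    for _ in range(trials):
        # Draw a fresh random sample; the parameter true_mean never changes
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = mean(sample)  # point estimate for this sample
        if x_bar - margin < true_mean < x_bar + margin:
            hits += 1
    return hits / trials


print(f"Proportion of intervals containing the true mean: {coverage():.3f}")
```

The printed proportion should land close to the chosen confidence level, with some variation from the finite number of trials.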
Calculating the Point Estimate and the Margin of Error from a Confidence Interval
If we're given a confidence interval, we can back into the values of the point estimate and the margin of error using the fact that the interval is symmetric around the point estimate. For any confidence interval, the values are found as follows:
Point estimate = (upper confidence limit + lower confidence limit) / 2
Margin of error = (upper confidence limit − lower confidence limit) / 2
Example 1: Suppose you are told that a confidence interval estimate for p is (0.165, 0.335). Find the point estimate p̂ and the margin of error E using the above formulas.
p̂ = (0.165 + 0.335) / 2 = 0.250
E = (0.335 − 0.165) / 2 = 0.085
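The same calculation can be written as a short Python sketch. The function name `point_estimate_and_margin` is hypothetical; the inputs are the limits from Example 1.

```python
def point_estimate_and_margin(lower, upper):
    """Recover the point estimate and margin of error from the two limits."""
    point_estimate = (upper + lower) / 2  # midpoint of the interval
    margin = (upper - lower) / 2          # half the interval width
    return point_estimate, margin


p_hat, E = point_estimate_and_margin(0.165, 0.335)
print(f"point estimate = {p_hat:.3f}, margin of error = {E:.3f}")
```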
Confidence Intervals and Hypothesis Testing
A confidence interval around a population parameter at confidence level 1 − α is closely related to a two-tailed hypothesis test (the operator in HA is ≠) of that parameter at significance level α. (Note: this will not work if the operator in HA is < or >. These cases require an adjustment to the calculations that is outside the scope of this course.)
The procedure for conducting a hypothesis test using a confidence interval is as follows:
1. Define H0 and HA.
2. Calculate the 100(1 − α)% confidence interval from the data.
3. Compare the null hypothesis value (the number to the right of the = sign in H0) to the interval:
   a. If the value lies outside the confidence interval, then reject H0.
   b. If the value lies in the confidence interval, then fail to reject H0.
4. Interpret the results in light of the original claim.
Example 2: In Example 2 of Lesson 20, a research team made the claim that the mean red blood cell count for adult males (in millions of cells per microliter) is 4.950. Suppose the 95% confidence interval estimate of mean red blood cell count that was calculated from the data is 4.946 < μ < 5.199. Is there sufficient evidence to reject the researchers' claim? Is this result consistent with the one in Lesson 20?
The hypotheses are the same as for Example 2 in Lesson 20:
H0: μ = 4.950 × 10^6 cells/μL (original claim)
HA: μ ≠ 4.950 × 10^6 cells/μL
Since 4.946 < 4.950 < 5.199, we fail to reject H0. There is insufficient evidence to reject the research team's claim that the mean red blood cell count for adult males is 4.950 × 10^6 cells/μL. This result is identical to the one we obtained in Lesson 20.
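The decision rule in step 3 of the procedure can be sketched as a small Python function, applied here to the numbers from Example 2. The function name `ci_test` is hypothetical. Treating a null value that falls exactly on a limit as outside the interval is an assumption of this sketch, chosen to match the strict inequalities of the algebraic notation.

```python
def ci_test(null_value, lower, upper):
    """Two-tailed test decision from a confidence interval (lower, upper)."""
    # Assumption: a value exactly equal to a limit counts as outside the interval
    if lower < null_value < upper:
        return "fail to reject H0"
    return "reject H0"


# Example 2: H0 value 4.950, 95% interval 4.946 < mu < 5.199
print(ci_test(4.950, 4.946, 5.199))
```

This applies only to the two-tailed case, where the confidence level of the interval is 1 − α for a test at significance level α.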