Statistics questions

angelface
MTH245Lesson22Notes-1.pdf

MTH 245 Lesson 22 Notes: One-Sample Interval Estimate of $p$

The best estimator of a population proportion $p$ is the sample proportion $\hat{p}$. The true sampling distribution of $\hat{p}$ is binomial, but if $n \cdot \hat{p} \ge 5$ and $n \cdot (1 - \hat{p}) \ge 5$ (in other words, there are at least 5 "successes" and 5 "failures"), the sampling distribution of $\hat{p}$ can be closely approximated with a normal distribution, and the margin of error is

$$ E = z_{1-\alpha/2} \cdot \sqrt{\frac{\hat{p} \cdot (1 - \hat{p})}{n}} $$

where $z_{1-\alpha/2}$ is the $100 \cdot (1 - \alpha/2)$th percentile of a standard normal distribution.
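The margin-of-error formula can be sketched in a few lines of Python. This is only an illustration of the normal-approximation interval described above, not StatCrunch's own code; the function name `proportion_interval` and the sample counts (60 successes out of 150) are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a proportion.

    Returns (lower limit, upper limit, margin of error).
    """
    # Rule of thumb from the notes: at least 5 successes and 5 failures.
    if successes < 5 or n - successes < 5:
        raise ValueError("fewer than 5 successes or 5 failures")
    p_hat = successes / n
    alpha = 1 - confidence
    # z is the 100*(1 - alpha/2)th percentile of the standard normal.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin


# Hypothetical data: 60 successes in 150 trials, 95% confidence.
low, high, margin = proportion_interval(60, 150, 0.95)
print(f"{low:.4f} < p < {high:.4f}  (E = {margin:.4f})")
```

Here $\hat{p} = 0.4$ and the standard error is exactly 0.04, so the margin of error is about $1.96 \cdot 0.04 = 0.0784$.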

To use StatCrunch to calculate a confidence interval estimate of $p$, follow the same procedure as for a hypothesis test of $p$ (Stat > Proportion Stats > One Sample > With Summary), but set the "Perform:" radio button to "Confidence interval for p" and enter the desired confidence level (do not change the method).

Example 1: During a recent poll, randomly selected internet users were asked if they felt vulnerable to identity theft. Of the 1,002 responses, 531 indicated that yes, they did feel vulnerable. Calculate a 95% confidence interval estimate of $p$, the proportion of all internet users who feel vulnerable to identity theft. Are the pollsters justified in saying that more than half of internet users feel vulnerable to identity theft (i.e., that $p > 0.5$)?

We will use the confidence interval method to evaluate the pollsters' claim. The hypotheses are:

$H_0: p = 0.5$    $H_A: p > 0.5$ (original claim)

StatCrunch gives a 95% confidence interval of $0.499 < p < 0.561$. Since 0.5 lies in that interval, we fail to reject $H_0$ and conclude there is insufficient evidence to support the pollsters' claim.

Example 2: A travel magazine surveyed randomly selected frequent air travelers to determine their preferred seat: aisle, middle, or window. Of 806 total responses, 492 said they preferred aisle seats. Calculate a 99% confidence interval estimate of the proportion of frequent fliers who prefer aisle seats. Can the magazine report that more than half of frequent fliers prefer aisle seats?

As with Example 1, the hypotheses are:

$H_0: p = 0.5$    $H_A: p > 0.5$ (original claim)

StatCrunch gives a 99% confidence interval of $0.566 < p < 0.655$. Since that interval does not contain 0.5, we reject $H_0$ and conclude there is sufficient evidence to support the magazine's claim.
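As a cross-check on Examples 1 and 2, the confidence interval method can be sketched in Python. This assumes the normal-approximation interval from the start of the lesson; the helper name `wald_interval` and the `decisions` dictionary are our own, and StatCrunch remains the tool the course expects.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence):
    """Normal-approximation confidence interval (lower, upper) for p."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# (label, successes, sample size, confidence level) from the two examples.
examples = [
    ("Example 1", 531, 1002, 0.95),
    ("Example 2", 492, 806, 0.99),
]

claimed_value = 0.5  # value of p under H0 in both examples
decisions = {}
for label, successes, n, confidence in examples:
    low, high = wald_interval(successes, n, confidence)
    # Confidence interval method: reject H0 only if the interval
    # does not contain the hypothesized value.
    inside = low <= claimed_value <= high
    decisions[label] = "fail to reject H0" if inside else "reject H0"
    print(f"{label}: {low:.3f} < p < {high:.3f} -> {decisions[label]}")
```

The limits agree with the StatCrunch output quoted above to three decimal places, and the decisions match: fail to reject $H_0$ in Example 1, reject $H_0$ in Example 2.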

Example 3: A study of 420,018 cell phone users found that 135 of them developed cancer of the brain or nervous system. Prior to this study of cell phone use, the rate of such cancer was found to be 0.0335% for those not using cell phones.

a. Construct a 90% confidence interval estimate of the percentage of cell phone users who develop cancer of the brain or nervous system.

StatCrunch yields a confidence interval of $0.0276\% < p < 0.0367\%$. (Be careful when reading the StatCrunch output; it will display the limits as decimals rather than percentages. You will need to multiply by 100 to get the proper limits for the next part.)

b. Do cell phone users appear to have a rate of cancer of the brain or nervous system that is different from the rate of such cancer among those not using cell phones?

Define $p$ as the cancer rate for cell phone users. The hypotheses are:

$H_0: p = 0.0335\%$    $H_A: p \ne 0.0335\%$ (original claim)

Since the interval in Part a contains 0.0335%, we fail to reject $H_0$ and conclude there is insufficient evidence to support the claim that the cancer rates are different for cell phone users and non-users.
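The unit conversion in Example 3 is easy to get wrong, so here is a small Python sketch that keeps everything in percent before comparing. It assumes the same normal-approximation interval as before; the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 135, 420_018   # cancer cases, cell phone users studied
confidence = 0.90
prior_rate_pct = 0.0335       # rate among non-users, already in percent

p_hat = successes / n
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sqrt(p_hat * (1 - p_hat) / n)

# The limits come out as decimals; multiply by 100 so they are in the
# same units (percent) as the prior rate before comparing.
low_pct = 100 * (p_hat - margin)
high_pct = 100 * (p_hat + margin)

contains_prior = low_pct <= prior_rate_pct <= high_pct
print(f"{low_pct:.4f}% < p < {high_pct:.4f}%")
print("fail to reject H0" if contains_prior else "reject H0")
```

Comparing the decimal limits (roughly 0.000276 and 0.000367) directly against 0.0335 would wrongly suggest that the prior rate lies far outside the interval.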

Choosing Sample Size to Match a Desired Margin of Error

In order to estimate $p$ with a desired margin of error $E$, the minimum sample size needed is

$$ n = \frac{(z_{1-\alpha/2})^2 \cdot \hat{p} \cdot (1 - \hat{p})}{E^2} $$

Most of the time, we won't know $\hat{p}$; if that's the case, we use the following approximation:

$$ n = \frac{(z_{1-\alpha/2})^2 \cdot 0.25}{E^2} $$
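The notes do not say where the 0.25 comes from. One way to see it: $\hat{p} \cdot (1 - \hat{p})$ is largest when $\hat{p} = 0.5$, where it equals 0.25, so using 0.25 gives a sample size that is large enough whatever the true proportion turns out to be. A quick numerical check in Python (the grid of candidate values is just an illustration):

```python
# Candidate values of p-hat from 0.00 to 1.00 in steps of 0.01.
candidates = [i / 100 for i in range(101)]

# Find the candidate that maximizes p * (1 - p).
worst = max(candidates, key=lambda p: p * (1 - p))

print(worst, worst * (1 - worst))  # 0.5 0.25
```

Algebraically, $p(1-p) = 0.25 - (p - 0.5)^2$, which can never exceed 0.25.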

Values of $z_{1-\alpha/2}$ for frequently used confidence levels are as follows:

$1 - \alpha$      $z_{1-\alpha/2}$
0.90              1.645
0.95              1.960
0.99              2.575
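For confidence levels not in the table, the critical value can be computed directly. The sketch below uses `NormalDist.inv_cdf` from Python's standard library; the `z_values` dictionary is our own name.

```python
from statistics import NormalDist

z_values = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    # 100*(1 - alpha/2)th percentile of the standard normal distribution
    z_values[confidence] = NormalDist().inv_cdf(1 - alpha / 2)
    print(f"{confidence:.2f}  {z_values[confidence]:.3f}")
```

The computed value for 0.99 is about 2.5758, which rounds to 2.576 rather than the 2.575 in the table. The difference is small; the worked examples in these notes use the table values.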

NOTE: When computing required sample size, always round $n$ up to the next higher integer, regardless of the decimal remainder. This is because rounding down results in a sample size that will not guarantee the desired margin of error. This is the only lesson in this course where we will use anything other than common (aka round-half-up) rounding. Do not automatically round up in any context other than this one!
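The two sample-size formulas and the round-up rule can be sketched together in Python. The function name `required_sample_size` and the `Z_TABLE` dictionary are illustrative; the z values are the ones from the table in these notes, and the inputs in the demonstration are made up.

```python
from math import ceil

# Critical values from the table in these notes.
Z_TABLE = {0.90: 1.645, 0.95: 1.960, 0.99: 2.575}


def required_sample_size(margin, confidence, p_hat=None):
    """Minimum n needed to estimate p within the given margin of error.

    If no prior estimate p_hat is supplied, use 0.25 in place of
    p_hat * (1 - p_hat).
    """
    z = Z_TABLE[confidence]
    spread = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    # Always round up: rounding down would not guarantee the margin.
    return ceil(z ** 2 * spread / margin ** 2)


# Hypothetical inputs: E = 0.04 at 95% confidence, p-hat unknown.
# The raw result is about 600.25; common rounding would give 600,
# but the rule here requires 601.
print(required_sample_size(0.04, 0.95))
```

Using `ceil` rather than `round` is the whole point: with 600 respondents the margin of error would be slightly larger than 0.04.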

Example 4: Find the sample size needed to estimate the proportion of Republicans among registered voters in California. Use $E = 0.03$, $1 - \alpha = 0.90$, and assume $\hat{p}$ is unknown.

Since we are making no assumptions about the value of $\hat{p}$, we use the second equation:

$$
\begin{aligned}
n &= \frac{(z_{1-\alpha/2})^2 \cdot 0.25}{E^2} \\
  &= \frac{1.645^2 \cdot 0.25}{0.03^2} \\
  &= \frac{0.67650625}{0.0009} \\
  &= 751.6736\overline{1} \;\nearrow\; 752,
\end{aligned}
$$

so we would need to survey 752 registered voters to achieve the required margin of error of three percent.

Example 5: Find the sample size needed to estimate the proportion of community college faculty members who have earned doctoral degrees. Use $E = 0.05$, $1 - \alpha = 0.99$, and assume $\hat{p} = 0.15$ based on previous research that suggests that 15% of faculty have doctorates.

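Since we are making an assumption about the value of $\hat{p}$, we use the first equation:

$$
\begin{aligned}
n &= \frac{(z_{1-\alpha/2})^2 \cdot \hat{p} \cdot (1 - \hat{p})}{E^2} \\
  &= \frac{2.575^2 \cdot 0.15 \cdot (1 - 0.15)}{0.05^2} \\
  &= \frac{2.575^2 \cdot 0.15 \cdot 0.85}{0.05^2} \\
  &= \frac{0.8454046875}{0.0025} \\
  &= 338.161875 \;\nearrow\; 339,
\end{aligned}
$$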

so we would need to survey 339 faculty members to achieve the required margin of error of five percent. (Note that we rounded the result up to the next higher integer, as required for this type of problem.)