Stats- Lesson 25

angelface

MTH 245 Lesson 25 Notes

Independent Samples for Two Means – P-Value Method

Two random samples are independent if the values from one population are not related or naturally paired with values from the other population. Unlike the matched pairs case, when samples are independent, the individuals in one sample are completely different from the individuals in the other sample. This is usually the case with simple randomized experimental designs where the sample is only broken into treatment and control groups (as opposed to a randomized block design, which breaks the sample down into more than two groups).

Consider two independent random samples with respective sample sizes $n_1$ and $n_2$ (not necessarily equal). Suppose that both samples come from populations where the variances $\sigma_1^2$ and $\sigma_2^2$ are unknown and presumed to be unequal. Further suppose that at least one of the following holds:

1. Both populations are normally distributed.
2. $n_1 \geq 30$ AND $n_2 \geq 30$.

If these conditions are met, we can perform a two-sample t-test. As with the matched pairs case, the null hypothesis is that there is no difference between the two population means:

$H_0: \mu_1 - \mu_2 = 0$

and the alternative hypothesis will be one of the following:

$H_A: \mu_1 - \mu_2 < 0$ (Population 1 has a smaller mean) OR

$H_A: \mu_1 - \mu_2 > 0$ (Population 1 has a larger mean) OR

$H_A: \mu_1 - \mu_2 \neq 0$ (the two means are different)

As with all hypothesis tests, if p-value $\leq \alpha$, reject $H_0$.
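These notes leave the computation of the p-value to software. For reference, a sketch of the standard unequal-variance (Welch) form of the test statistic and its approximate degrees of freedom is given below. It assumes this is the method the software applies when the variances are not pooled. Here $\bar{x}_i$, $s_i$, and $n_i$ denote the sample mean, sample standard deviation, and sample size of sample $i$, and the 0 in the numerator is the hypothesized difference from $H_0$.

```latex
t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}
         {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
df \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
                {\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1}
                 + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}
```

The p-value is then the area under the $t$ distribution with $df$ degrees of freedom in the tail or tails indicated by $H_A$.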

To use StatCrunch to conduct a two-sample $t$-test when sample statistics for both samples are already available:

1. Open a blank data table.
2. Click Stat > T Stats > Two Sample > With Summary. Warning: do not use Stat > Z Stats > Two Sample – it will not produce correct results!
3. Fill in the sample statistics.
4. Make sure the radio button is set to "Hypothesis test for $\mu_D = \mu_1 - \mu_2$" and the operator in $H_A$ is correct (do not change $H_0$).
5. Click "Compute!".

To use StatCrunch to conduct a two-sample $t$-test using raw data:

1. Import/enter the data.
2. Click Stat > T Stats > Two Sample > With Data. Warning: do not use Stat > Z Stats > Two Sample or Stat > T Stats > Paired – neither will produce correct results!
3. Select the appropriate data column for each sample.
4. Leave the radio button at "Hypothesis test for $\mu$" (the default).
5. Fill in the null hypothesis value and the alternative hypothesis operator.
6. Click "Compute!".
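For readers who want to see what happens to raw data behind the menu steps, here is a minimal Python sketch using only the standard library. It assumes the unequal-variance (Welch) form of the two-sample $t$ statistic. The function name `welch_from_data` and the two data lists are hypothetical, invented purely for illustration; they are not from the course data sets.

```python
import math
import statistics

def welch_from_data(sample1, sample2):
    """Return (t, df) for an unequal-variance two-sample t-test on raw data."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance uses the n - 1 denominator (sample variance).
    v1 = statistics.variance(sample1) / n1
    v2 = statistics.variance(sample2) / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical measurements (made up for this sketch).
group1 = [12.1, 11.4, 13.0, 12.6, 11.9, 12.3]
group2 = [10.8, 11.2, 10.1, 11.5, 10.9]

t, df = welch_from_data(group1, group2)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The two lists play the role of the two data columns selected in step 3. Note that the samples do not need to have the same length, which is one way to tell this setup apart from the paired test.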

Example 1: In a study investigating the effects of color on creativity, each subject was given a computer-based creativity test; for the first group, their computer screens had a red background, while the second group's tests were displayed with a blue background. Responses were scored by a panel of judges, with summary statistics for each group given in the table below. Based on these results, the researchers claim that different colors produce different mean scores. Test this claim at $\alpha = 0.01$.

| Background | $\bar{x}$ | $s$ | $n$ |
| Red | 3.39 | 0.97 | 35 |
| Blue | 3.97 | 0.63 | 36 |

Define Population 1 as Red scores and Population 2 as Blue scores. Further, suppose $\mu_1$ and $\mu_2$ are the mean creativity scores for each population. If the researchers' claim is true, then the two means will not be equal and the hypotheses for the test will be as follows:

$H_0: \mu_1 - \mu_2 = 0$
$H_A: \mu_1 - \mu_2 \neq 0$ (original claim)

Since the p-value $= 0.004 < \alpha = 0.01$, we reject $H_0$. There is sufficient evidence to support the researchers' claim.
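The p-value above can be cross-checked outside StatCrunch. The following is a rough standard-library Python sketch, assuming the unequal-variance (Welch) $t$-test and approximating the tail area of the $t$ distribution by numerical integration (Simpson's rule). The function names `t_pdf`, `t_upper_tail`, and `welch_test` are hypothetical helpers written for this illustration, not part of any library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """P(T > |x|), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def welch_test(mean1, s1, n1, mean2, s2, n2, alternative="two-sided"):
    """Welch t statistic, approximate df, and p-value from summary statistics."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    tail = t_upper_tail(t, df)        # area beyond |t| in one tail
    if alternative == "two-sided":    # H_A: mu1 - mu2 != 0
        p = 2 * tail
    elif alternative == "less":       # H_A: mu1 - mu2 < 0
        p = tail if t < 0 else 1 - tail
    elif alternative == "greater":    # H_A: mu1 - mu2 > 0
        p = tail if t > 0 else 1 - tail
    else:
        raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")
    return t, df, p

# Example 1: Red (Population 1) versus Blue (Population 2), alpha = 0.01.
t, df, p = welch_test(3.39, 0.97, 35, 3.97, 0.63, 36)
print(f"t = {t:.3f}, df = {df:.1f}, p-value = {p:.4f}")
print("reject H0" if p <= 0.01 else "fail to reject H0")
```

The `alternative` argument mirrors the choice of operator in $H_A$: passing `"less"` or `"greater"` gives the one-tailed p-value instead of doubling the tail area.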

Example 2: Company A, an auto manufacturer, claims that the mean operating cost per mile of one of its models is less than that of a similar model made by its leading competitor, Company B. A consumer group conducted a study comparing the operating costs of 30 randomly selected vehicles from the manufacturer and 32 from its competitor. The results are in the table below. Test Company A's claim at $\alpha = 0.01$.

| Source | $\bar{x}$ | $s$ | $n$ |
| Company A | $0.52/mi | $0.05/mi | 30 |
| Company B | $0.55/mi | $0.07/mi | 32 |

Define Population 1 as costs per mile for all production units of Company A's model and Population 2 as the costs per mile for Company B's model.

Further, suppose $\mu_1$ and $\mu_2$ are the mean costs per mile for each population. If Company A's claim is true, then $\mu_1$ will be less than $\mu_2$ and the hypotheses for the test will be as follows:

$H_0: \mu_1 - \mu_2 = 0$
$H_A: \mu_1 - \mu_2 < 0$ (original claim)

Since the p-value $= 0.028 > \alpha = 0.01$, we fail to reject $H_0$. There is insufficient evidence to support Company A's claim. Note that, as we saw in Lesson 24, if we instead labeled the Company A costs per mile as Population 2, then as long as we change $H_A$ to $\mu_1 - \mu_2 > 0$, we will get the same p-value and the test result will be the same.
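The relabeling remark can be checked numerically. The short Python sketch below assumes the unequal-variance (Welch) statistic and uses a hypothetical helper `welch_t_and_df`. It computes the test statistic and degrees of freedom with Company A as Population 1, then again with the labels swapped.

```python
import math

def welch_t_and_df(mean1, s1, n1, mean2, s2, n2):
    """Welch t statistic and approximate df from summary statistics."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics from Example 2: (mean, standard deviation, sample size).
company_a = (0.52, 0.05, 30)
company_b = (0.55, 0.07, 32)

t_ab, df_ab = welch_t_and_df(*company_a, *company_b)  # A is Population 1
t_ba, df_ba = welch_t_and_df(*company_b, *company_a)  # B is Population 1

print(f"A as Population 1: t = {t_ab:.3f}, df = {df_ab:.2f}")
print(f"B as Population 1: t = {t_ba:.3f}, df = {df_ba:.2f}")
```

Swapping the labels only flips the sign of $t$ and leaves the degrees of freedom unchanged. Because the $t$ distribution is symmetric about 0, the left-tail area below the negative statistic equals the right-tail area above the positive one, which is why the p-value does not change.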

Independent Samples for Two Means – Confidence Interval Method

To construct a two-sided confidence interval estimate of the difference between the population means, use the same procedure as above with one change: set the radio button to "Confidence interval for $\mu_1 - \mu_2$" and enter the appropriate confidence level.

Example 3: Conduct the hypothesis test in Example 1 using the confidence interval method. Are the results of the two tests consistent?

The populations and hypotheses were already defined in Example 1. Since $\alpha = 0.01$, to perform the same test we need to use a confidence level of $1 - \alpha = 0.99$.

StatCrunch reports a 99% confidence interval $-1.10 < \mu_1 - \mu_2 < -0.06$. Since this interval does not contain 0, we reject $H_0$ and conclude there is sufficient evidence to support the researchers' claim. This is the same result as the one we observed in Example 1.
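As a cross-check on the interval, the sketch below recomputes a 99% confidence interval for $\mu_1 - \mu_2$ from Example 1's summary statistics using only the Python standard library. It assumes the unequal-variance (Welch) interval, approximates the $t$ distribution by numerical integration, and finds the critical value by bisection. The names `t_cdf`, `t_quantile`, and `welch_interval` are hypothetical helpers for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, via Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(prob, df):
    """Bisection for q with P(T <= q) = prob, valid for 0.5 <= prob < 1."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def welch_interval(mean1, s1, n1, mean2, s2, n2, level):
    """Two-sided confidence interval for mu1 - mu2 (unequal variances)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    t_star = t_quantile(1 - (1 - level) / 2, df)  # e.g. 0.995 for 99%
    diff = mean1 - mean2
    return diff - t_star * se, diff + t_star * se

# Example 1 summary statistics: Red is Population 1, Blue is Population 2.
low, high = welch_interval(3.39, 0.97, 35, 3.97, 0.63, 36, level=0.99)
print(f"99% CI: {low:.2f} < mu1 - mu2 < {high:.2f}")
print("contains 0" if low <= 0 <= high else "does not contain 0")
```

Both endpoints come out negative, so the interval excludes 0. That agrees with rejecting $H_0$ at $\alpha = 0.01$ in the two-tailed test, since a two-sided interval at level $1 - \alpha$ excludes 0 exactly when the two-tailed p-value is below $\alpha$.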

Example 4: In a study at a U.S. medical school, researchers measured the LDL cholesterol level of 28 randomly selected heart attack patients two days after they had their heart attacks. As a control, the researchers also measured the LDL cholesterol level of 30 random healthy adults with no signs of heart disease. The LDL Cardiac (Full) data set for this example contains the cholesterol measurements. Is there sufficient evidence to suggest that people who have had heart attacks have LDL cholesterol levels different from those of people who have not had heart attacks? Test at $\alpha = 0.01$ using the confidence interval method.

Define Population 1 as the LDL levels of cardiac patients and Population 2 as the LDL levels of healthy adults, with $\mu_1$ and $\mu_2$ the corresponding mean LDL levels. If the original claim is true, then $\mu_1$ will not equal $\mu_2$ and the hypotheses for the test will be as follows:

$H_0: \mu_1 - \mu_2 = 0$
$H_A: \mu_1 - \mu_2 \neq 0$ (original claim)

Since $\alpha = 0.01$, to perform the same test we need to use a confidence level of $1 - \alpha = 0.99$.

StatCrunch reports a 99% confidence interval $34.0 < \mu_1 - \mu_2 < 87.6$. Since this interval does not contain 0, we reject $H_0$. There is sufficient evidence to support the claim that people who have had heart attacks have LDL cholesterol levels different from those of people who have not had heart attacks.