Statistics Technology Lab

Math 2150L – Statistics Technology Lab

Diagnostic Testing – Comparing Two Means – Independent Samples

Steps for Comparing Two Means – Independent Samples

1. Use a QQ plot of each data set to determine whether either or both of the data sets are approximately normally distributed.

2. If it is safe to assume both data sets are approximately normally distributed, then proceed with the Independent Samples t-Test.

3. If it is not safe to assume both data sets are approximately normally distributed (that is, either one or both appear non-normal), then the Wilcoxon Non-Parametric Test for Two Independent Samples must be used.

t-Test for Two Independent Samples where both samples are approximately normally distributed

Step 1: Hypotheses to be tested

Ho: μ1 = μ2

HA: μ1 ≠ μ2, or μ1 < μ2, or μ1 > μ2

Step 2: Use the “t.test” R command to generate a p-value, replacing "alternative" with "two.sided", "less", or "greater" to match HA.

> t.test(variable1, variable2, alternative="alternative", paired=FALSE, conf.level=0.95)

Step 3: If p-value < α = 0.05, the data have provided evidence that HA is true.
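As a rough illustration of Steps 2 and 3, the sketch below runs t.test once for each of the three allowed values of the alternative argument. The vectors x and y are made-up numbers invented only for this illustration; they are not data from this lab. Note that t.test does not assume equal variances unless asked to, so by default it runs the Welch version of the test.

```r
# Hypothetical data, invented for illustration only (not from the lab)
x <- c(5.1, 6.3, 4.8, 7.2, 5.9, 6.6, 5.4)
y <- c(6.8, 7.5, 6.1, 8.0, 7.3, 6.9, 7.7, 8.4)

# One call per form of HA; paired = FALSE treats the samples as independent
p_two     <- t.test(x, y, alternative = "two.sided", paired = FALSE, conf.level = 0.95)$p.value
p_less    <- t.test(x, y, alternative = "less",      paired = FALSE, conf.level = 0.95)$p.value
p_greater <- t.test(x, y, alternative = "greater",   paired = FALSE, conf.level = 0.95)$p.value

alpha <- 0.05
print(c(two.sided = p_two, less = p_less, greater = p_greater))

# Step 3 decision rule for the two-sided HA: TRUE means evidence for HA
print(p_two < alpha)
```

For the t-test the two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them, so choosing the alternative that matches HA before looking at the data matters.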

Wilcoxon Non-Parametric Test for Two Independent Samples (sample data not normal)

Step 1: Hypotheses to be tested

Ho: Median1 = Median2

HA: Median1 ≠ Median2, or Median1 < Median2, or Median1 > Median2

Step 2: Use the “wilcox.test” R command to generate a p-value, replacing "alternative" with "two.sided", "less", or "greater" to match HA.

> wilcox.test(variable1, variable2, alternative="alternative", paired=FALSE)

Step 3: If p-value < α = 0.05, the data have provided evidence that HA is true.
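To give a feel for what wilcox.test reports, the sketch below runs it on two small made-up samples and rebuilds the W statistic by hand. The vectors x and y are hypothetical and were chosen to have no tied values; they are not data from this lab.

```r
# Hypothetical data with no tied values, invented for illustration only
x <- c(12, 15, 9, 20, 17, 11)
y <- c(22, 14, 25, 19, 27, 30, 24)

res <- wilcox.test(x, y, alternative = "two.sided", paired = FALSE)

# With no ties, the reported W is the number of (x, y) pairs in which
# the x value is larger than the y value
w_by_hand <- sum(outer(x, y, ">"))

print(res$statistic)   # W as reported by wilcox.test
print(w_by_hand)       # the same count, computed directly
print(res$p.value)     # compare with alpha = 0.05 as in Step 3
```

Because the test only uses which values are larger than which, it does not need the samples to be normally distributed.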

Example – Using the two data sets below, run the QQ Plot diagnostics to determine if it is safe to assume either (or both) data sets are normally distributed. Based on the diagnostics results, perform the appropriate test to either compare means or medians between the two groups.

Sample A – 1, 3, 4, 6, 8, 12, 15, 16, 17, 22, 26

Sample B – 2, 5, 7, 9, 11, 13, 14, 18, 19, 20, 25, 27
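Before running any diagnostics, the two samples need to be entered into R. One simple way, shown here as a sketch, is to type them in as numeric vectors with c(). The names sampleA and sampleB match the ones used in the R commands later in this lab.

```r
# Enter the two samples from the example as numeric vectors
sampleA <- c(1, 3, 4, 6, 8, 12, 15, 16, 17, 22, 26)
sampleB <- c(2, 5, 7, 9, 11, 13, 14, 18, 19, 20, 25, 27)

# Quick checks that the data were typed in correctly
print(length(sampleA))   # sample size nA
print(length(sampleB))   # sample size nB
print(summary(sampleA))
print(summary(sampleB))
```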

Since both samples are small (nA = 11 < 30 and nB = 12 < 30), we cannot assume the data are approximately normally distributed and must inspect the QQ plot of each sample.

In the QQ plots, the left panel is for sampleA and the right panel is for sampleB. If the data are approximately normal, the points should generally follow the reference line with no severe breaks at the ends. It is my opinion that for both sampleA and sampleB the points generally follow the reference line with no severe breaks at the ends. That being the case, we can proceed to test the equality of means hypothesis using the t-test.
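The handout does not show the commands used to draw the QQ plots. One possible way to produce them, assuming the base R functions qqnorm and qqline, is sketched below with sampleA on the left and sampleB on the right.

```r
sampleA <- c(1, 3, 4, 6, 8, 12, 15, 16, 17, 22, 26)
sampleB <- c(2, 5, 7, 9, 11, 13, 14, 18, 19, 20, 25, 27)

# Put the two plots side by side: sampleA on the left, sampleB on the right
old_par <- par(mfrow = c(1, 2))

qqA <- qqnorm(sampleA, main = "Normal QQ Plot: sampleA")
qqline(sampleA)   # reference line through the first and third quartiles

qqB <- qqnorm(sampleB, main = "Normal QQ Plot: sampleB")
qqline(sampleB)

par(old_par)      # restore the previous plot layout
```

Judging whether the points follow the reference line closely enough is a visual call, so two readers can reasonably disagree on borderline plots.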

Step 1: Hypotheses to be tested

Ho: μA = μB

HA: μA ≠ μB

Step 2: The “t.test” R command below generates a p-value = 0.4897

> t.test(sampleA, sampleB, alternative="two.sided", paired=FALSE, conf.level=0.95)

Step 3: Since the p-value 0.4897 > α = 0.05, the data have provided no significant evidence that HA is true, so we do not reject Ho and can continue to assume μA = μB.
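The three steps above can be reproduced in one short script. This is only a sketch: it stores the t.test result in a variable (here called res, a name chosen for illustration) so the p-value can be compared with α directly.

```r
sampleA <- c(1, 3, 4, 6, 8, 12, 15, 16, 17, 22, 26)
sampleB <- c(2, 5, 7, 9, 11, 13, 14, 18, 19, 20, 25, 27)

# Step 2: two-sided t-test for independent samples
res <- t.test(sampleA, sampleB, alternative = "two.sided",
              paired = FALSE, conf.level = 0.95)

# Step 3: compare the p-value with alpha
alpha <- 0.05
print(round(res$p.value, 4))   # the handout reports 0.4897
print(res$p.value < alpha)     # FALSE means no significant evidence for HA

print(res$estimate)            # the two sample means
print(res$conf.int)            # 95% confidence interval for muA - muB
```

The confidence interval gives the same conclusion from another angle: when it contains 0, the two-sided test at the 5% level does not reject Ho.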

As an example, suppose we thought the QQ plot on the left made a severe break at the lower-left end and took that as evidence that the sample was not approximately normal. We would then proceed with the Wilcoxon test of medians rather than the t-test of means, as follows:

Step 1: Hypotheses to be tested

Ho: MedianA = MedianB

HA: MedianA ≠ MedianB

Step 2: The “wilcox.test” R command below generates a p-value = 0.4865

> wilcox.test(sampleA, sampleB, alternative="two.sided", paired=FALSE)

Step 3: Since the p-value 0.4865 > α = 0.05, the data have provided no significant evidence that HA is true, so we do not reject Ho and can continue to assume MedianA = MedianB.
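The Wilcoxon version of the analysis can be scripted the same way. As before, this is a sketch, and the variable name res is chosen only for illustration. The two samples share no tied values, so wilcox.test can compute an exact p-value here.

```r
sampleA <- c(1, 3, 4, 6, 8, 12, 15, 16, 17, 22, 26)
sampleB <- c(2, 5, 7, 9, 11, 13, 14, 18, 19, 20, 25, 27)

# Step 2: two-sided Wilcoxon test for independent samples
res <- wilcox.test(sampleA, sampleB, alternative = "two.sided", paired = FALSE)

# Step 3: compare the p-value with alpha
alpha <- 0.05
print(res$statistic)            # W: number of pairs with sampleA value > sampleB value
print(round(res$p.value, 4))    # the handout reports 0.4865
print(res$p.value < alpha)      # FALSE means no significant evidence for HA

# The sample medians being compared
print(c(medianA = median(sampleA), medianB = median(sampleB)))
```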