Week62-sampleHypothesisTestingandCIpart3.pdf

In this document we will discuss 2-sample Z hypothesis testing and confidence intervals that use sample means and known population S’s.

We use the Z-Critical Value here because we are working with sample means and known population S’s.

There are still 3 different hypothesis scenarios with a 2 – Sample Z Hypothesis

Test.

Lower Tailed Test (1 tail):

Ho: $\mu_1 - \mu_2 = 0$

Ha: $\mu_1 - \mu_2 < 0$

Upper Tailed Test (1 tail):

Ho: $\mu_1 - \mu_2 = 0$

Ha: $\mu_1 - \mu_2 > 0$

Two Tailed Test:

Ho: $\mu_1 - \mu_2 = 0$

Ha: $\mu_1 - \mu_2 \neq 0$

The hypothesized value is 0, and the same key words from a 1-sample hypothesis test determine which scenario to use. $\mu_1 - \mu_2$ is the difference between the population averages of the two groups, and we estimate it with the difference between the average in the first sample and the average in the second sample.

The Z-Test Statistic:

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$

Where $S_1$ and $S_2$ are the population standard deviations, $\bar{x}_1$ and $\bar{x}_2$ are the sample averages, and $n_1$ and $n_2$ are the sample sizes.

We can use =NORM.S.DIST to find the p-values. These should look familiar from

the discussion forum.

Example:

A dietitian has developed a diet that is low in fats, carbs, and cholesterol. The dietitian wishes to examine the effects this diet has on the weights of obese people. Two random samples of 30 obese people each are selected, and one group of 30 people is placed on the low-fat diet. The other 30 people are placed on a diet that contains approximately the same quantity of food but is not low in fats, carbs, and cholesterol. For each person the amount of weight lost (or gained) in a three-week period is recorded. Is there a difference in the population mean weight losses for the two diets? The population S1 = 4.67 and the population S2 = 4.04. Use alpha = .05. Here we are given the Raw Data set.

WL Low Diet    WL Regular Diet
8              6
21             14
13             4
8              6
11             13
4              11
3              11
6              8
16             14
5              8
10             6
8              4
8              12
12             2
7              1
3              2
12             6
14             1
16             0
11             9
10             5
9              10
10             6
8              6
14             9
3              8
7              3
14             1
11             7
14             8
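As a quick check on the Raw Data, here is a minimal Python sketch that computes the two sample averages. It assumes the values in the data set pair up as (WL Low Diet, WL Regular Diet); the variable names are my own and are not part of the original handout.

```python
from statistics import mean

# Weight loss for the 30 people on each diet (assumed pairing: low diet, regular diet).
wl_low_diet = [8, 21, 13, 8, 11, 4, 3, 6, 16, 5, 10, 8, 8, 12, 7,
               3, 12, 14, 16, 11, 10, 9, 10, 8, 14, 3, 7, 14, 11, 14]
wl_regular_diet = [6, 14, 4, 6, 13, 11, 11, 8, 14, 8, 6, 4, 12, 2, 1,
                   2, 6, 1, 0, 9, 5, 10, 6, 6, 9, 8, 3, 1, 7, 8]

n1, n2 = len(wl_low_diet), len(wl_regular_diet)
xbar1 = mean(wl_low_diet)      # sample average, low-fat diet
xbar2 = mean(wl_regular_diet)  # sample average, regular diet

print(n1, n2)
print(round(xbar1, 5), round(xbar2, 5))
```

If the pairing assumption is right, the two averages should agree with the Mean row that Excel reports for this example.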

The first step is to state the hypothesis scenario. Because the key word in the question is “difference,” this is a two tailed test.

Ho: $\mu_1 - \mu_2 = 0$

Ha: $\mu_1 - \mu_2 \neq 0$

Before we start calculating anything by hand and because we are given the raw

data set, we can actually run this hypothesis test in Excel. And since you installed

the Data Analysis Toolpak it is easy to do. First you need to input this Raw Data

into Excel.

Then go to Data -> Data Analysis, scroll to where it says z-Test: Two Sample for Means, and click OK.

Under Input:

Variable 1 Range: you will highlight the WL Low Diet column and make sure you

include the top row where the Label is located.

Variable 2 Range: you will highlight the WL Regular Diet column and make sure

you include the top row where the Label is located.

Hypothesized Mean Difference: can be left as 0.

Variable 1 Variance (known): Here is where you will put the Known Variance for the First Sample. In the problem we are given the Known Standard Deviation, so to find the Variance we square it: 4.67² = 21.8089. Input that value in the box.

Variable 2 Variance (known): Here is where you will put the Known Variance for the Second Sample. Again we are given the Known Standard Deviation, so we square it: 4.04² = 16.3216. Input that value in the box.

Check the “Labels” box because we did include the first row of labels. For Alpha put 0.05, but this can be changed depending on what significance level you use.

Then make sure the bubble for New Worksheet Ply: is selected and click OK. It should look similar to the screenshot below.
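The variance inputs for the dialog box can be double-checked with a short Python sketch. The variable names are hypothetical; the only fact used is that a variance is the standard deviation squared.

```python
# Known population standard deviations from the problem statement.
s1 = 4.67
s2 = 4.04

# The z-test dialog asks for variances: variance = standard deviation squared.
variance1 = round(s1 ** 2, 4)
variance2 = round(s2 ** 2, 4)

print(variance1, variance2)
```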

Once you click OK, the following output should populate in a new Worksheet.

z-Test: Two Sample for Means

                              WL Low Diet    WL Regular Diet
Mean                          9.866666667    6.7
Known Variance                21.8089        16.3216
Observations                  30             30
Hypothesized Mean Difference  0
z                             2.808838232
P(Z<=z) one-tail              0.002486031
z Critical one-tail           1.644853627
P(Z<=z) two-tail              0.004972062
z Critical two-tail           1.959963985

Here we have all the values we need to state a conclusion.

We see the Z-Test Statistic = 2.8088, and because we ran a two tailed test the p-value = .00497.

p-value = .00497 < .05. This p-value is less than .05, which means we Reject Ho.

Yes, there is statistical evidence that there is a difference in the population mean weight losses for the two diets.

If we were running an upper tailed test, the output also gives us the one-tail p-value, which is .002486. The Z-Test Statistic is the same, and because .002486 < .05, so is the conclusion.

Using Excel to run a hypothesis test when we are given the Raw Data is very convenient. But if we aren’t given the Raw Data and are given only the averages and known S’s, we will need to compute the Z-Test Stat by hand and then use the Excel function to find the p-value.

To find the Z-Test Stat we will use this equation and plug in what we know. You should know by now how to calculate the average and SD using Excel, which is what I did here.

$$
\begin{aligned}
Z &= \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}} \\
  &= \frac{9.86667 - 6.7 - 0}{\sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}}} \\
  &= \frac{3.16667}{\sqrt{0.72696333 + 0.54405333}} \\
  &= \frac{3.16667}{\sqrt{1.27101666}} \\
  &= \frac{3.16667}{1.1273937} \\
  &= 2.80884
\end{aligned}
$$

When we calculate the Test Stat by hand using algebra we get the same value.
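The hand calculation can also be sketched in Python. This is only an illustration of the same formula; the function name `two_sample_z` and its parameter names are my own choices, not something from Excel or the handout.

```python
from math import sqrt

def two_sample_z(xbar1, xbar2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Z test statistic for two sample means with known population SDs."""
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (xbar1 - xbar2 - hypothesized_diff) / standard_error

# Averages, known S's, and sample sizes from the diet example.
z = two_sample_z(9.86667, 6.7, 4.67, 4.04, 30, 30)
print(round(z, 4))
```

Keeping the hypothesized difference as a parameter with a default of 0 mirrors the "− 0" term in the equation.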

Next, we need to find the p-value. We will use the =NORM.S.DIST function to find

the p-value.

In Excel input =NORM.S.DIST(2.80884,TRUE) and hit Enter. We type TRUE because we want the cumulative probability, the area to the left of Z.

The function returns .997514, BUT remember that this Excel function is in the less than form. This means if we were running a Lower Tailed Test, this would be our p-value. If we were running an Upper Tailed Test, we need to take 1 - .997514 to get the p-value for our test.

P-value = 1 - .997514 = .002486. This is the p-value for an upper tailed test.

But since we are running a Two Tailed Test, we take whichever p-value is smaller and multiply it by 2.

p-value = .002486*2 = .004972. This is the p-value for a two tailed test. If we compare these values to the Excel output, they are the same, so we draw the same conclusion.
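For readers working outside Excel, the three p-values can be sketched with Python's standard library. Here `NormalDist().cdf(z)` plays the role of =NORM.S.DIST(z,TRUE); the variable names are assumptions made for this illustration.

```python
from statistics import NormalDist

z = 2.80884  # test statistic from the hand calculation

# cdf(z) is the area to the left of z, the "less than" form.
p_lower = NormalDist().cdf(z)      # p-value for a lower tailed test
p_upper = 1 - p_lower              # p-value for an upper tailed test
p_two = 2 * min(p_lower, p_upper)  # p-value for a two tailed test

alpha = 0.05
print(round(p_lower, 6), round(p_upper, 6), round(p_two, 6))
print("Reject Ho" if p_two < alpha else "Fail to reject Ho")
```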

This is how you would run a 2-sample Z hypothesis test using averages and population S’s when we don’t have the raw data and can’t use the Data Analysis Toolpak.

Now that we ran a hypothesis test, let’s calculate a confidence interval and draw the same conclusion.

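The equation for a 2-sample Z confidence interval:

$$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \cdot \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$

Where Standard Error (SE) = $\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$

Margin of Error = $Z_{\alpha/2} \cdot \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$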

We have all the values we need, so let’s plug them into our equation.

$$(9.86667 - 6.7) \pm Z_{\alpha/2} \cdot \sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}}$$

The last thing we need to find is a Z-Critical Value. We will use the =NORM.S.INV

in Excel to find the Z-Critical Value.

If we want to find a 95% confidence interval, then alpha = 1 - .95 = .05. But because this is a confidence interval and we need to take into account the plus AND minus on both sides of the bell-shaped curve, we will divide alpha by 2: .05/2 = .025. Then we take 1 - .025 = .975. We will use this value in our Excel function.

=NORM.S.INV(.975)

We see the Z-Critical Value is 1.96. We will plug this into the equation and solve. If you compare this Critical Value to the z Critical two-tail value in the Excel output from the hypothesis test, it is the same because we used Alpha = .05 there. This value will change depending on what you input for Alpha.

z-Test: Two Sample for Means

                              WL Low Diet    WL Regular Diet
Mean                          9.866666667    6.7
Known Variance                21.8089        16.3216
Observations                  30             30
Hypothesized Mean Difference  0
z                             2.808838232
P(Z<=z) one-tail              0.002486031
z Critical one-tail           1.644853627
P(Z<=z) two-tail              0.004972062
z Critical two-tail           1.959963985
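The Z-Critical Value lookup can be sketched in Python as well, with `NormalDist().inv_cdf` standing in for =NORM.S.INV. The helper name `z_critical_two_sided` is hypothetical.

```python
from statistics import NormalDist

def z_critical_two_sided(confidence_level):
    """Z critical value that leaves alpha/2 in each tail of the curve."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

z_crit = z_critical_two_sided(0.95)
print(round(z_crit, 4))
```

Passing a different confidence level, such as 0.90 or 0.99, changes the critical value in the same way that changing Alpha does in Excel.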

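$$
\begin{aligned}
(9.86667 - 6.7) &\pm Z_{\alpha/2} \cdot \sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}} \\
(9.86667 - 6.7) &\pm 1.96 \cdot \sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}} \\
3.16667 &\pm 1.96 \cdot 1.1273937 \\
3.16667 &\pm 2.2097
\end{aligned}
$$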

The confidence interval goes from .9570 to 5.3764. Both endpoints are positive, which means that 0 is NOT in this interval. Because 0 is NOT in the interval, the difference is significant, and we Reject Ho. This is the same conclusion that we got with the hypothesis test.
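The whole confidence interval calculation can be put together in one short Python sketch. The function name and parameters are my own, and it uses the unrounded critical value rather than 1.96, so the endpoints can differ from the hand calculation in the later decimal places.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_interval(xbar1, xbar2, s1, s2, n1, n2, confidence_level=0.95):
    """Confidence interval for mu1 - mu2 with known population SDs."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin_of_error = z_crit * sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    point_estimate = xbar1 - xbar2
    return point_estimate - margin_of_error, point_estimate + margin_of_error

lower, upper = two_sample_z_interval(9.86667, 6.7, 4.67, 4.04, 30, 30)
zero_in_interval = lower <= 0 <= upper

print(round(lower, 3), round(upper, 3))
print("Fail to reject Ho" if zero_in_interval else "Reject Ho")
```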