Statistics correction: SPSS one-way ANOVA


T1: Conduct one-way ANOVA analysis on the height using the "Weight and Height.xlsx" sample data under the Data Sets (See Attachment) folder (Note that you will need to change the format of the data set to conduct the analysis). Based on your results, what conclusion would you draw regarding the average height of students in the three states?

Solution:

Null Hypothesis: The mean height of students is the same in all three states.

Alternate Hypothesis: The mean height of students differs in at least one of the three states.

The One-Way Analysis of Variance was performed in SPSS. The SPSS output is given as follows:

Descriptives

Height

| State | N | Mean | Std. Deviation | Std. Error | 95% CI for Mean, Lower Bound | 95% CI for Mean, Upper Bound | Minimum | Maximum |
|---|---|---|---|---|---|---|---|---|
| KS | 28 | 67.5714 | 3.30184 | .62399 | 66.2911 | 68.8517 | 62.00 | 75.00 |
| MO | 41 | 69.9756 | 3.24220 | .50635 | 68.9522 | 70.9990 | 61.00 | 75.00 |
| NE | 23 | 67.8696 | 4.18593 | .87283 | 66.0594 | 69.6797 | 61.75 | 74.00 |
| Total | 92 | 68.7174 | 3.65929 | .38151 | 67.9596 | 69.4752 | 61.00 | 75.00 |

Test of Homogeneity of Variances

Height

| Levene Statistic | df1 | df2 | Sig. |
|---|---|---|---|
| 1.697 | 2 | 89 | .189 |

Levene's test: F(2, 89) = 1.697, p-value = 0.189 Comment by Bo Yan: I don’t have SPSS installed on my computer. But your results are slightly different from the numbers I got from using Excel and another statistical package.

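The p-value (0.189) is greater than α = 0.05, so Levene's test is not statistically significant. We fail to reject the null hypothesis of equal variances; the data are consistent with the homogeneity-of-variance assumption that ANOVA requires. Comment by Bo Yan: Does the evidence support your conclusion here when p > 0.05?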

ANOVA

Height

| Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 118.211 | 2 | 59.105 | 4.781 | .011 |
| Within Groups | 1100.316 | 89 | 12.363 | | |
| Total | 1218.527 | 91 | | | |

F (2, 89) = 4.781, p-value = 0.011
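As a cross-check that does not depend on SPSS, the F statistic can be recomputed from the Descriptives table alone. The following is a minimal Python sketch using only the standard library. The variable names are illustrative, and the inputs are the rounded values printed in the SPSS output, so the result should agree with SPSS only to about three decimal places.

```python
# Group summaries copied from the SPSS Descriptives table: (n, mean, std. deviation).
groups = {
    "KS": (28, 67.5714, 3.30184),
    "MO": (41, 69.9756, 3.24220),
    "NE": (23, 67.8696, 4.18593),
}

n_total = sum(n for n, _, _ in groups.values())
k = len(groups)

# Grand mean is the size-weighted average of the group means.
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-groups sum of squares: spread of group means around the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())

# Within-groups sum of squares: (n - 1) * variance, summed over the groups.
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = k - 1
df_within = n_total - k
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.3f}, SS within = {ss_within:.3f}")
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

The sketch stops at the F statistic. Turning it into a p-value requires the F distribution, which the Python standard library does not provide.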

The p-value (0.011) is less than α = 0.05, so we reject the null hypothesis. There is sufficient evidence, at the 0.05 significance level, to conclude that not all three states have the same mean student height. The ANOVA itself does not show which states differ; the post hoc tests address that question. The following table shows the results of Tukey's HSD post hoc tests: Comment by Bo Yan: Good.

Multiple Comparisons

Dependent Variable: Height

Tukey HSD

| (I) State | (J) State | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| KS | MO | -2.40418* | .86202 | .018 | -4.4588 | -.3495 |
| KS | NE | -.29814 | .98948 | .951 | -2.6566 | 2.0603 |
| MO | KS | 2.40418* | .86202 | .018 | .3495 | 4.4588 |
| MO | NE | 2.10604 | .91601 | .061 | -.0773 | 4.2894 |
| NE | KS | .29814 | .98948 | .951 | -2.0603 | 2.6566 |
| NE | MO | -2.10604 | .91601 | .061 | -4.2894 | .0773 |

*. The mean difference is significant at the 0.05 level.
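The Std. Error column of the Tukey table can be reproduced from the within-groups mean square and the group sizes. The sketch below assumes the usual form for unequal group sizes, sqrt(MSW × (1/n_i + 1/n_j)). The helper name `pair_se` is illustrative, and the confidence limits themselves are not recomputed because they also need a studentized range critical value.

```python
import math

ms_within = 12.363                      # Mean Square, Within Groups (ANOVA table)
n = {"KS": 28, "MO": 41, "NE": 23}      # group sizes (Descriptives table)

def pair_se(a, b):
    """Standard error of the difference between the means of groups a and b."""
    return math.sqrt(ms_within * (1 / n[a] + 1 / n[b]))

for a, b in [("KS", "MO"), ("KS", "NE"), ("MO", "NE")]:
    print(f"{a} vs {b}: SE = {pair_se(a, b):.5f}")
```

The three values should match the .86202, .98948, and .91601 reported by SPSS up to rounding of the mean square.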

KS & MO:

The 95% confidence interval for the difference of means is (-4.4588, -0.3495).

As the confidence interval does not contain 0, the difference between the mean heights of students in KS and MO is statistically significant at the 0.05 level. The MO sample mean exceeds the KS sample mean by 2.40. Comment by Bo Yan: Now you should understand the difference between statistically significant and practically significant difference. So I expect you to be more careful when using the word “significant”, which could be easily misunderstood by people without statistical training if not used carefully.

KS & NE:

The 95% confidence interval for the difference of means is given as (-2.6566, 2.0603).

As the confidence interval contains 0, we fail to reject the null hypothesis of equal means for KS and NE. The data do not provide sufficient evidence, at the 0.05 level, of a difference between the mean heights of students in these two states. Comment by Bo Yan: As I explained in last week’s summary, we don’t accept the null hypothesis.

MO & NE:

The 95% confidence interval for the difference of means is given as (-.0773, 4.2894).

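As the confidence interval contains 0, we fail to reject the null hypothesis of equal means for MO and NE. The data do not provide sufficient evidence, at the 0.05 level, of a difference between the mean heights of students in these two states, although the p-value (0.061) is close to the threshold. Comment by Bo Yan: Same as above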

The Mean Plot is given as follows:

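The means plot suggests that the sample mean height for MO is higher than the sample means for KS and NE. Because the plot shows sample means only, it can suggest a difference but cannot establish one, and the visual size of the gap depends on where the y-axis starts. Comment by Bo Yan: Your description of the chart is problematic in two ways. First, the means were obtained from a sample, which can only suggest difference. Second, if you change the starting point of the y axis to 0, the chart might suggest otherwise.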

Results:

A one-way ANOVA showed a statistically significant difference in mean student height among the three states, F(2, 89) = 4.781, p-value = 0.011. Tukey's HSD post hoc analysis showed that the mean heights for KS (M = 67.5714, SD = 3.30184) and MO (M = 69.9756, SD = 3.24220) differ at the 0.05 level (p-value = 0.018). The comparisons of KS with NE (p-value = 0.951) and MO with NE (p-value = 0.061) were not statistically significant.

T2: Conduct some research on the Internet on the t-test and ANOVA. Based on your research, explain why we should not conduct three t-tests to answer the question in T1.

We conduct a single ANOVA rather than three t-tests for the following reasons:

· Comparing three groups pairwise requires three t-tests. Each test carries its own 5% chance of a Type I error, so the chance of at least one false positive across the three tests rises well above 0.05.

· Each t-test uses only the two groups being compared and ignores the data from the third group. ANOVA pools all three groups to estimate the within-group variance, which gives more degrees of freedom and a more stable estimate. Comment by Bo Yan: What do you think is the impact of not using all available information?

· It is simpler to perform and report a single ANOVA than three separate t-tests.
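The first reason can be made concrete with a short calculation. The sketch below assumes, for simplicity, that the three tests are independent. Pairwise tests that share a group are not strictly independent, so the first figure is an approximation, while the Bonferroni bound holds without that assumption.

```python
alpha = 0.05     # per-test significance level
num_tests = 3    # KS vs MO, KS vs NE, MO vs NE

# Chance of at least one false positive if the three tests were independent.
familywise_error = 1 - (1 - alpha) ** num_tests

# Bonferroni: an upper bound on the familywise error rate, and the per-test
# level that would keep the familywise rate at or below alpha.
bonferroni_bound = num_tests * alpha
bonferroni_alpha = alpha / num_tests

print(f"Familywise error rate (independent tests): {familywise_error:.4f}")
print(f"Bonferroni upper bound: {bonferroni_bound:.2f}")
print(f"Per-test level under Bonferroni: {bonferroni_alpha:.4f}")
```

Under these assumptions the chance of at least one false positive is roughly 14%, nearly three times the nominal 5%.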

T3: A grocery store conducted a study to find out which brand is preferred among its customers by asking 300 customers whether they like or dislike Brand A and another 300 customers whether they like or dislike Brand B. The results are presented in the table below. According to this sample (without hypothesis testing), which brand is preferred and explain your finding. 

 

| Response | Brand A | Brand B | Total |
|---|---|---|---|
| Like | 215 | 241 | 456 |
| Dislike | 85 | 59 | 144 |
| Total | 300 | 300 | 600 |

Brand B is preferred in the pooled sample. Without any hypothesis testing, 241 of 300 customers (80.3%) said they like Brand B, compared with 215 of 300 (71.7%) for Brand A. Because both brands were rated by the same number of customers, the higher count for Brand B also means a higher like rate.

· The results are further disaggregated by gender and presented in the following two tables for males and females respectively. According to the male and female samples (without hypothesis testing), which brand is preferred by male customers and female customers respectively and explain your findings. 

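| Male | Brand A | Brand B | Total |
|---|---|---|---|
| Like | 120 | 20 | 140 |
| Dislike | 80 | 40 | 100 |
| Total | 200 | 60 | 240 |

| Female | Brand A | Brand B | Total |
|---|---|---|---|
| Like | 95 | 221 | 316 |
| Dislike | 5 | 39 | 44 |
| Total | 100 | 260 | 360 |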

· Based on the two sets of findings, what is your overall conclusion? What is the implication of this study? Comment by Bo Yan: You did not answer this question.

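Among male customers, 60% (120 of 200) like Brand A, compared with 33% (20 of 60) for Brand B. Among female customers, 95% (95 of 100) like Brand A, compared with 85% (221 of 260) for Brand B. Within each gender, therefore, Brand A has the higher like rate, even though the pooled sample favors Brand B (80.3% versus 71.7%). In raw counts, more men than women like Brand A (120 versus 95) and more women than men like Brand B (221 versus 20), but these counts mainly reflect who was asked: most Brand A respondents were male (200 of 300) and most Brand B respondents were female (260 of 300). Because females gave higher like rates to both brands, pooling the data makes Brand B look better. This reversal is an instance of Simpson's paradox. The implication is that pooled percentages can mislead when the groups being compared differ in composition, so gender should be taken into account before concluding which brand is preferred.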
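The comparison of like rates can be checked with a few lines of Python. This is a minimal sketch with illustrative names (`like_rate`, `preferred`). It takes the cell counts from the tables at face value and recomputes the totals, since the margins of the male table as printed do not add up (80 + 40 is not 100).

```python
# (likes, dislikes) per brand, copied from the cell counts in the tables above.
overall = {"A": (215, 85), "B": (241, 59)}
male = {"A": (120, 80), "B": (20, 40)}
female = {"A": (95, 5), "B": (221, 39)}

def like_rate(likes, dislikes):
    """Share of respondents who like the brand."""
    return likes / (likes + dislikes)

def preferred(table):
    """Brand with the higher like rate in the given table."""
    return max(table, key=lambda brand: like_rate(*table[brand]))

for label, table in [("Overall", overall), ("Male", male), ("Female", female)]:
    rates = ", ".join(f"{b}: {like_rate(*table[b]):.1%}" for b in table)
    print(f"{label:8s}{rates}  -> prefers Brand {preferred(table)}")
```

The pooled table favors Brand B while both gender tables favor Brand A, which is the reversal described above.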