Stat correction to be done: SPSS ANOVA


T1: Conduct a one-way ANOVA on height using the "Weight and Height.xlsx" sample data in the Data Sets (See Attachment) folder (note that you will need to change the format of the data set to conduct the analysis). Based on your results, what conclusion would you draw regarding the average height of students in the three states?

Solution:

Null Hypothesis: The mean height of students is the same in all three states.

Alternative Hypothesis: At least one state has a different mean height of students.

The One-Way Analysis of Variance was performed in SPSS. The SPSS output is given as follows:

Descriptives
Height
State	N	Mean	Std. Deviation	Std. Error	95% CI for Mean, Lower Bound	95% CI for Mean, Upper Bound	Minimum	Maximum
KS	28	67.5714	3.30184	.62399	66.2911	68.8517	62.00	75.00
MO	41	69.9756	3.24220	.50635	68.9522	70.9990	61.00	75.00
NE	23	67.8696	4.18593	.87283	66.0594	69.6797	61.75	74.00
Total	92	68.7174	3.65929	.38151	67.9596	69.4752	61.00	75.00

Test of Homogeneity of Variances
Height
Levene Statistic	df1	df2	Sig.
1.697	2	89	.189

Levene Statistic (2, 89) = 1.697, p-value = 0.189 Comment by Bo Yan: I don’t have SPSS installed on my computer, but your results are slightly different from the numbers I got using Excel and another statistical package.

The p-value is significant at α=0.05. From this test we conclude that we do not have the homogeneity of variance. Comment by Bo Yan: Does the evidence support your conclusion here when p > 0.05?
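The decision rule behind the comment above can be sketched in a few lines of Python. This is only an illustration and not SPSS output; the helper name `decide` is hypothetical. The null hypothesis of Levene's test is that the group variances are equal, and it is rejected only when the p-value falls below α.

```python
ALPHA = 0.05  # significance level used throughout the write-up


def decide(p_value, alpha=ALPHA):
    """Return the conventional decision for a null-hypothesis test."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Sig. value copied from the Test of Homogeneity of Variances table
levene_p = 0.189
levene_decision = decide(levene_p)

# H0 for Levene's test: the three states have equal variances
print(f"Levene p = {levene_p}: {levene_decision} (H0: equal variances)")
```

With p = 0.189, which is larger than 0.05, the rule returns "fail to reject H0", so the sample gives no evidence against homogeneity of variance.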

ANOVA
Height
	Sum of Squares	df	Mean Square	F	Sig.
Between Groups	118.211	2	59.105	4.781	.011
Within Groups	1100.316	89	12.363
Total	1218.527	91

F (2, 89) = 4.781, p-value = 0.011

The p-value is significant at α=0.05, so we reject the null hypothesis. At a significance level of 0.05, there is sufficient evidence to conclude that not all the states have the same mean height of students. The post-hoc tests examine which pairs of states differ. The following table shows the results of Tukey’s HSD post-hoc tests: Comment by Bo Yan: Good.
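As a cross-check on the SPSS output, the ANOVA table can be rebuilt from the Descriptives table alone. The sketch below is plain Python, not SPSS syntax, and all variable names are illustrative. The p-value line relies on a closed form of the F distribution that holds only when the numerator degrees of freedom equal 2, which happens to be the case for three groups.

```python
# Group summaries copied from the Descriptives table: (n, mean, std. deviation)
groups = {
    "KS": (28, 67.5714, 3.30184),
    "MO": (41, 69.9756, 3.24220),
    "NE": (23, 67.8696, 4.18593),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-groups and within-groups sums of squares
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = len(groups) - 1       # 2
df_within = n_total - len(groups)  # 89

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

# Upper-tail probability of F(2, df_within); valid ONLY for df_between == 2
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

print(f"SS between = {ss_between:.3f}, SS within = {ss_within:.3f}")
print(f"F({df_between}, {df_within}) = {f_stat:.3f}, p = {p_value:.3f}")
```

The values agree with the SPSS table up to rounding of the reported means and standard deviations.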

Multiple Comparisons
Dependent Variable: Height
Tukey HSD
(I) State	(J) State	Mean Difference (I-J)	Std. Error	Sig.	95% CI Lower Bound	95% CI Upper Bound
KS	MO	-2.40418*	.86202	.018	-4.4588	-.3495
	NE	-.29814	.98948	.951	-2.6566	2.0603
MO	KS	2.40418*	.86202	.018	.3495	4.4588
	NE	2.10604	.91601	.061	-.0773	4.2894
NE	KS	.29814	.98948	.951	-2.0603	2.6566
	MO	-2.10604	.91601	.061	-4.2894	.0773
*. The mean difference is significant at the 0.05 level.
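The Std. Error column in this table can be reproduced from the within-groups mean square and the group sizes. The sketch below assumes the usual unequal-sample-size (Tukey-Kramer) form, SE = sqrt(MS_within × (1/n_i + 1/n_j)). It also backs out the multiplier implied by each confidence interval (half-width divided by SE), which should be the same for every pair because it depends only on the number of groups and the error degrees of freedom. Variable names are illustrative.

```python
from math import sqrt

ms_within = 12.363                   # Mean Square, Within Groups (ANOVA table)
n = {"KS": 28, "MO": 41, "NE": 23}   # group sizes (Descriptives table)

# Reported values per pair: (Std. Error, CI lower bound, CI upper bound)
reported = {
    ("KS", "MO"): (0.86202, -4.4588, -0.3495),
    ("KS", "NE"): (0.98948, -2.6566, 2.0603),
    ("MO", "NE"): (0.91601, -0.0773, 4.2894),
}

results = {}
for (a, b), (se_spss, lower, upper) in reported.items():
    # Standard error of the difference between two group means
    se = sqrt(ms_within * (1 / n[a] + 1 / n[b]))
    # Multiplier implied by the reported interval: half-width / SE
    multiplier = (upper - lower) / 2 / se_spss
    results[(a, b)] = (se, multiplier)
    print(f"{a} vs {b}: SE = {se:.5f} (SPSS {se_spss}), "
          f"implied multiplier = {multiplier:.4f}")
```

The implied multiplier is about 2.38 for all three pairs. Under the assumption above it corresponds to the studentized range critical value divided by √2.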

KS & MO:

The 95% confidence interval for the difference of means is given as (-4.4588, -.3495).

As the confidence interval does not contain 0, we can conclude that there is a statistically significant difference between the mean heights of students in the states KS and MO. Comment by Bo Yan: Now you should understand the difference between statistically significant and practically significant difference. So I expect you to be more careful when using the word “significant”, which could be easily misunderstood by people without statistical training if not used carefully.

KS & NE:

The 95% confidence interval for the difference of means is given as (-2.6566, 2.0603).

As the confidence interval contains 0, we can conclude that there is no significant difference between the mean height of students belonging to the states KS and NE. Comment by Bo Yan: As I explained in last week’s summary, we don’t accept the null hypothesis.

MO & NE:

The 95% confidence interval for the difference of means is given as (-.0773, 4.2894).

As the confidence interval contains 0, we can conclude that there is no significant difference between the mean height of students belonging to the states MO and NE. Comment by Bo Yan: Same as above
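The interval check applied to the three pairs can be sketched as follows. The wording returned for an interval that contains 0 is deliberately "no statistically significant difference detected": such an interval does not show that the two means are equal, only that this sample cannot distinguish them at the 0.05 level. The function name `read_interval` is hypothetical.

```python
# 95% confidence intervals for the mean differences (Tukey HSD table)
intervals = {
    "KS vs MO": (-4.4588, -0.3495),
    "KS vs NE": (-2.6566, 2.0603),
    "MO vs NE": (-0.0773, 4.2894),
}


def read_interval(lower, upper):
    """Describe a confidence interval for a difference of two means."""
    if lower <= 0 <= upper:
        return "no statistically significant difference detected"
    return "statistically significant difference"


readings = {pair: read_interval(*ci) for pair, ci in intervals.items()}
for pair, reading in readings.items():
    print(f"{pair}: {reading}")
```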

The Mean Plot is given as follows:

The means plot graph shows that there is clearly significant difference in the groups. Comment by Bo Yan: Your description of the chart is problematic in two ways. First, the means were obtained from a sample, which can only suggest difference. Second, if you change the starting point of the y axis to 0, the chart might suggest otherwise.

Results:

A one-way ANOVA indicated statistically significant differences in mean student height among the three states, F (2, 89) = 4.781, p-value = 0.011. Tukey’s HSD post-hoc tests indicated that the mean heights for KS (M = 67.57, SD = 3.30) and MO (M = 69.98, SD = 3.24) differ significantly (p = 0.018). The KS vs. NE and MO vs. NE differences were not statistically significant (p = 0.951 and p = 0.061, respectively).

T2: Conduct some research on the Internet on the t-test and ANOVA. Based on your research, explain why three t-tests should not be conducted to answer the question in T1.

We conduct a single ANOVA rather than three t-tests because:

· Comparing three groups pairwise would require three t-tests, which increases the chance of making at least one Type I error.

· Each t-test uses only two of the three samples, so it does not make use of all the available information. Comment by Bo Yan: What do you think is the impact of not using all available information?

· It is easier to perform a single ANOVA than three separate t-tests.
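The Type I error point can be made concrete with a short calculation. The sketch below assumes, for simplicity, that the three pairwise tests are independent. They are not strictly independent here because they share groups, so the figure is an approximation. The Bonferroni value is an upper bound that holds without the independence assumption.

```python
from math import comb

alpha = 0.05                  # per-test significance level
k_groups = 3                  # KS, MO, NE
n_tests = comb(k_groups, 2)   # number of pairwise t-tests needed: 3

# Chance of at least one false positive if the tests were independent
fwer_independent = 1 - (1 - alpha) ** n_tests

# Bonferroni upper bound, valid regardless of dependence between tests
bonferroni_bound = min(1.0, n_tests * alpha)

print(f"{n_tests} tests at alpha = {alpha}")
print(f"Familywise error rate (independent tests): {fwer_independent:.4f}")
print(f"Bonferroni upper bound: {bonferroni_bound:.2f}")
```

Under these assumptions, the chance of at least one false positive rises from 5% for a single test to roughly 14% for three tests.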

T3: A grocery store conducted a study to find out which brand is preferred among its customers by asking 300 customers whether they like or dislike Brand A and another 300 customers whether they like or dislike Brand B. The results are presented in the table below. According to this sample (without hypothesis testing), which brand is preferred? Explain your finding.

	Brand A	Brand B	Total
Like	215	241	456
Dislike	85	59	144
Total	300	300	600

Brand B is preferred in the pooled sample. Without any hypothesis testing, 241 of the 300 customers asked about Brand B liked it (80.3%), compared with 215 of the 300 customers asked about Brand A (71.7%).

· The results are further disaggregated by gender and presented in the following two tables for males and females respectively. According to the male and female samples (without hypothesis testing), which brand is preferred by male customers and female customers respectively and explain your findings.

Male	Brand A	Brand B	Total
Like	120	20	140
Dislike	80	40	100
Total	200	60	240

Female	Brand A	Brand B	Total
Like	95	221	316
Dislike	5	39	44
Total	100	260	360

· Based on the two sets of findings, what is your overall conclusion? What is the implication of this study? Comment by Bo Yan: You did not answer this question.

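Comparing like rates within each gender, male customers prefer Brand A (120 of 200, 60%) to Brand B (20 of 60, 33.3%), and female customers also prefer Brand A (95 of 100, 95%) to Brand B (221 of 260, 85%). Far more females were asked about Brand B than about Brand A (260 versus 100), so the raw count of females who like Brand B is larger even though their like rate for Brand B is lower.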
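The like rates for the pooled table and for each gender can be computed directly from the counts in the three tables. The sketch below uses only those counts, and the variable names are illustrative. It shows that the pooled sample and the gender-specific samples point in opposite directions, a pattern commonly known as Simpson's paradox.

```python
# (number who like, number asked), copied from the three tables
counts = {
    "Pooled": {"Brand A": (215, 300), "Brand B": (241, 300)},
    "Male":   {"Brand A": (120, 200), "Brand B": (20, 60)},
    "Female": {"Brand A": (95, 100),  "Brand B": (221, 260)},
}

# Like rate = likes / customers asked, per group and brand
rates = {
    group: {brand: like / asked for brand, (like, asked) in brands.items()}
    for group, brands in counts.items()
}

# Brand with the higher like rate in each group
preferred = {group: max(r, key=r.get) for group, r in rates.items()}

for group, r in rates.items():
    summary = ", ".join(f"{brand} {rate:.1%}" for brand, rate in r.items())
    print(f"{group}: {summary} -> higher like rate: {preferred[group]}")
```

Brand B has the higher like rate in the pooled table, yet Brand A has the higher like rate among males and among females. One contributing factor visible in the tables is that 260 of the 300 customers asked about Brand B were female, and females report higher like rates for both brands.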