HW11.xlsx

HW11

MGMT 650 Summer 2025 Week 11 Homework Questions For credit, do all work in this file with no links to external files. Credit for your work also requires charts, graphs, and calculations be linked to the data within this file. Calculations begin by typing =. For all questions, try to put your answers, calculations, or charts in the yellow area by the question.

Chi Square #1-7

An analyst wonders if the age distribution of customers coming for service In Town is the same as at the Mall.
He selects transactions at random from each branch and researches the ages of the customers. These are the data :
Age
< 30 30 - 55 56 + Totals
Distribution In Town 15 57 45 117
Mall 29 49 30 108
Totals 44 106 75 225
For all problems, try to enter your answers or calculations in the yellow cells.
1 Write the null hypothesis regarding the frequency distribution of ages by In Town and Mall.
Which type of test is required (goodness of fit, independence, homogeneity)?
2 Compute
Age Groups
< 30 30 - 55 56 +
Expected (E) In Town
Mall
Age Groups
< 30 30 - 55 56 +
(O - E)2/E In Town
Mall
3 What is the chi square test statistic? Note: =CHISQ.TEST does not compute the chi-squared test statistic. It computes p-value.
4 What is the degrees of freedom of the chi-squared test statistic?
Use an Excel function to compute the exploratory and sensitive critical value at alpha = 10%.
5 What is the percent p-value of the chi square test statistic?
6 State your decision about H0 with the significance level. Explain why.
Are the bank distributions the same (Yes/No)?
7 If instead of testing homogeneity, independence were tested with the same data, would age groups or location matter?

ANOVA #8-13

Moto owns a silk shop in Taipei City and wants to expands color selection.
Before expanding colors, Moto will decide if the customers have a brand preference.
As an experiment, the number of sales by brand were recorded on randomly selected days.
Can it be concluded that there is a difference in preference among the brands at the significance level 5%
Cyan Pearl Onyx
824 776 734
809 640 689
898 822 778
945 812 855
820 673 700
900 893 780
922 798
912
8 What is the null hypothesis?
What is the alternative hypothesis?
9 The level of significance depends on the application. What is the level of significance (α) in this problem?
Perform single factor ANOVA using Data Analysis. Include the labels in row 7 in the Input Range and check the Labels checkbox.
10 From the ANOVA output: What is the F test statistic?
What is the critical value of the F test statistic?
11 What is your decision in terms of H0, the significance level, p-value, the F test statistic, and the critical value of the F test statistic.
12 Is brand preference the same?
13 Which brand do you think is prefered?

Regression #14-20

Studies have shown that the frequency with which shoppers browse Internet retailers is positively corelated to the frequency with which they make purchases.
The following data show respondents Age and Time in minutes browsing online retailers per year.
Age (X) Time (Y)
16 307.8
17 149.1
19 211.7
22 232.6
22 141.6
22 345.6
22 305.6
28 131.4
28 283.4
28 311.4
28 472.4
30 236
33 463.9
34 468.2
35 531.5
35 429.5
35 480.5
36 611.8
39 558.7
39 376.7
40 575
42 476.6
43 382.9
44 450.2
48 481.4
50 439
50 540
51 637.3
52 309.6
54 505.2
58 583.4
59 665.7
60 649
14 Compute the correlation between Age and Time using Data Analysis. Include the labels in the Input Range and check the Labels checkbox.
15 Compute the correlation using the Excel function =CORREL. If answers for #14 and 15 do not agree, there is an error.
16 The strength of the correlation motivates further examination.
a) Make a scatter plot linked to and near the data above, and with Age on the horizontal (X) axis.
b) Add to your chart
A meaningful title
Vertical axis label Time
Horizontal axis label Age
c) Complete the chart by adding Trendline and checking boxes
17 Read directly from the chart:
a) Intercept =
b) Slope =
c) R2 =
18a Perform regression using Data Analysis. Select the Time data first, include the labels in row 4 in the Input Range , and check the Labels checkbox.
18b In the Regression output, highlight the Y-intercept blue, the slope green, and R2 red.
19 Is it valid to use this data to predict the amount of time that a 85-year-old will spend browsing online retailers ?
Why or why not (Week 11 Presentation, slide 8)?
Is it valid to use this data to predict the amount of time that a 60-year-old will spend browsing online retailers ?
20 If valid, use the Data Analysis output to predict the number of minutes spent by a 35-year old shopper. Enter = followed by the regression formula,
entering the intercept and slope into the formula by clicking on the corresponding cells in the regression output.
(Week 11 Presentation, slide 11)

Cleaning Outlier #21-25

21 Make a scatter plot linked to and near the following data:
X Y
1.01 2.5982
1.07 2.2752
1.16 2.592
1.19 2.2252
1.19 3.2843
1.22 2.19
1.32 3.6176
1.38 2.9516
1.43 3.9971
1.45 4.0855
1.46 4.0862
1.48 4.0272
1.56 4.3208
1.58 3.6052
1.6 3.59
1.72 4.8756
1.77 4.1042
1.8 4.538
1.81 5.1257
1.88 4.6192
1.94 5.4924
2 5.21
3 -4
22 Make the title Scatterplot of the Data
Add trendline and regression equation and R2 label to the plot.
23 The scatterplot reveals a point outside the point pattern. A "residual" is defined as the veritical distance from the line to a point. Q1, Q3, and IQR were computed for the residuals. Residuals that are more that 1.5 IQR below Q1 or more than 1.5 IQR above Q3 are called outliers and the corresponding data must be investigated. It was determined that the outlying data point was due to data entry error. The data with outlier removed is copied below. Make a new scatterplot linked to and near the cleaned data, and add title Scatterplot of Cleaned Data, trendline, and regression equation.
X Y
1.01 2.5982
1.07 2.2752
1.16 2.592
1.19 2.2252
1.19 3.2843
1.22 2.19
1.32 3.6176
1.38 2.9516
1.43 3.9971
1.45 4.0855
1.46 4.0862
1.48 4.0272
1.56 4.3208
1.58 3.6052
1.6 3.59
1.72 4.8756
1.77 4.1042
1.8 4.538
1.81 5.1257
1.88 4.6192
1.94 5.4924
2 5.21
24 Compare the regression results of the two plots. What does the outlier do to the regression line?
What does the regression line do when the outlier is removed?
25 How does removal of the outlier affect R2?
Why does it affect R2 this way?

image1.png