HW11
HW11
MGMT 650 Summer 2025 Week 11 Homework Questions For credit, do all work in this file with no links to external files. Credit for your work also requires charts, graphs, and calculations be linked to the data within this file. Calculations begin by typing =. For all questions, try to put your answers, calculations, or charts in the yellow area by the question.
Chi Square #1-7
| An analyst wonders if the age distribution of customers coming for service In Town is the same as at the Mall. | ||||||
| He selects transactions at random from each branch and researches the ages of the customers. These are the data : | ||||||
| Age | ||||||
| < 30 | 30 - 55 | 56 + | Totals | |||
| Distribution | In Town | 15 | 57 | 45 | 117 | |
| Mall | 29 | 49 | 30 | 108 | ||
| Totals | 44 | 106 | 75 | 225 | ||
| For all problems, try to enter your answers or calculations in the yellow cells. | ||||||
| 1 | Write the null hypothesis regarding the frequency distribution of ages by In Town and Mall. | |||||
| Which type of test is required (goodness of fit, independence, homogeneity)? | ||||||
| 2 | Compute | |||||
| Age Groups | ||||||
| < 30 | 30 - 55 | 56 + | ||||
| Expected (E) | In Town | |||||
| Mall | ||||||
| Age Groups | ||||||
| < 30 | 30 - 55 | 56 + | ||||
| (O - E)2/E | In Town | |||||
| Mall | ||||||
| 3 | What is the chi square test statistic? Note: =CHISQ.TEST does not compute the chi-squared test statistic. It computes p-value. | |||||
| 4 | What is the degrees of freedom of the chi-squared test statistic? | |||||
| Use an Excel function to compute the exploratory and sensitive critical value at alpha = 10%. | ||||||
| 5 | What is the percent p-value of the chi square test statistic? | |||||
| 6 | State your decision about H0 with the significance level. Explain why. | |||||
| Are the bank distributions the same (Yes/No)? | ||||||
| 7 | If instead of testing homogeneity, independence were tested with the same data, would age groups or location matter? | |||||
ANOVA #8-13
| Moto owns a silk shop in Taipei City and wants to expands color selection. | ||||
| Before expanding colors, Moto will decide if the customers have a brand preference. | ||||
| As an experiment, the number of sales by brand were recorded on randomly selected days. | ||||
| Can it be concluded that there is a difference in preference among the brands at the significance level 5% | ||||
| Cyan | Pearl | Onyx | ||
| 824 | 776 | 734 | ||
| 809 | 640 | 689 | ||
| 898 | 822 | 778 | ||
| 945 | 812 | 855 | ||
| 820 | 673 | 700 | ||
| 900 | 893 | 780 | ||
| 922 | 798 | |||
| 912 | ||||
| 8 | What is the null hypothesis? | |||
| What is the alternative hypothesis? | ||||
| 9 | The level of significance depends on the application. What is the level of significance (α) in this problem? | |||
| Perform single factor ANOVA using Data Analysis. Include the labels in row 7 in the Input Range and check the Labels checkbox. | ||||
| 10 | From the ANOVA output: What is the F test statistic? | |||
| What is the critical value of the F test statistic? | ||||
| 11 | What is your decision in terms of H0, the significance level, p-value, the F test statistic, and the critical value of the F test statistic. | |||
| 12 | Is brand preference the same? | |||
| 13 | Which brand do you think is prefered? | |||
Regression #14-20
| Studies have shown that the frequency with which shoppers browse Internet retailers is positively corelated to the frequency with which they make purchases. | ||
| The following data show respondents Age and Time in minutes browsing online retailers per year. | ||
| Age (X) | Time (Y) | |
| 16 | 307.8 | |
| 17 | 149.1 | |
| 19 | 211.7 | |
| 22 | 232.6 | |
| 22 | 141.6 | |
| 22 | 345.6 | |
| 22 | 305.6 | |
| 28 | 131.4 | |
| 28 | 283.4 | |
| 28 | 311.4 | |
| 28 | 472.4 | |
| 30 | 236 | |
| 33 | 463.9 | |
| 34 | 468.2 | |
| 35 | 531.5 | |
| 35 | 429.5 | |
| 35 | 480.5 | |
| 36 | 611.8 | |
| 39 | 558.7 | |
| 39 | 376.7 | |
| 40 | 575 | |
| 42 | 476.6 | |
| 43 | 382.9 | |
| 44 | 450.2 | |
| 48 | 481.4 | |
| 50 | 439 | |
| 50 | 540 | |
| 51 | 637.3 | |
| 52 | 309.6 | |
| 54 | 505.2 | |
| 58 | 583.4 | |
| 59 | 665.7 | |
| 60 | 649 | |
| 14 | Compute the correlation between Age and Time using Data Analysis. Include the labels in the Input Range and check the Labels checkbox. | |
| 15 | Compute the correlation using the Excel function =CORREL. If answers for #14 and 15 do not agree, there is an error. | |
| 16 | The strength of the correlation motivates further examination. | |
| a) Make a scatter plot linked to and near the data above, and with Age on the horizontal (X) axis. | ||
| b) Add to your chart | ||
| A meaningful title | ||
| Vertical axis label Time | ||
| Horizontal axis label Age | ||
| c) Complete the chart by adding Trendline and checking boxes | ||
| 17 | Read directly from the chart: | |
| a) Intercept = | ||
| b) Slope = | ||
| c) R2 = | ||
| 18a | Perform regression using Data Analysis. Select the Time data first, include the labels in row 4 in the Input Range , and check the Labels checkbox. | |
| 18b | In the Regression output, highlight the Y-intercept blue, the slope green, and R2 red. | |
| 19 | Is it valid to use this data to predict the amount of time that a 85-year-old will spend browsing online retailers ? | |
| Why or why not (Week 11 Presentation, slide 8)? | ||
| Is it valid to use this data to predict the amount of time that a 60-year-old will spend browsing online retailers ? | ||
| 20 | If valid, use the Data Analysis output to predict the number of minutes spent by a 35-year old shopper. Enter = followed by the regression formula, | |
| entering the intercept and slope into the formula by clicking on the corresponding cells in the regression output. | ||
| (Week 11 Presentation, slide 11) | ||
Cleaning Outlier #21-25
| 21 | Make a scatter plot linked to and near the following data: | |
| X | Y | |
| 1.01 | 2.5982 | |
| 1.07 | 2.2752 | |
| 1.16 | 2.592 | |
| 1.19 | 2.2252 | |
| 1.19 | 3.2843 | |
| 1.22 | 2.19 | |
| 1.32 | 3.6176 | |
| 1.38 | 2.9516 | |
| 1.43 | 3.9971 | |
| 1.45 | 4.0855 | |
| 1.46 | 4.0862 | |
| 1.48 | 4.0272 | |
| 1.56 | 4.3208 | |
| 1.58 | 3.6052 | |
| 1.6 | 3.59 | |
| 1.72 | 4.8756 | |
| 1.77 | 4.1042 | |
| 1.8 | 4.538 | |
| 1.81 | 5.1257 | |
| 1.88 | 4.6192 | |
| 1.94 | 5.4924 | |
| 2 | 5.21 | |
| 3 | -4 | |
| 22 | Make the title Scatterplot of the Data | |
| Add trendline and regression equation and R2 label to the plot. | ||
| 23 | The scatterplot reveals a point outside the point pattern. A "residual" is defined as the veritical distance from the line to a point. Q1, Q3, and IQR were computed for the residuals. Residuals that are more that 1.5 IQR below Q1 or more than 1.5 IQR above Q3 are called outliers and the corresponding data must be investigated. It was determined that the outlying data point was due to data entry error. The data with outlier removed is copied below. Make a new scatterplot linked to and near the cleaned data, and add title Scatterplot of Cleaned Data, trendline, and regression equation. | |
| X | Y | |
| 1.01 | 2.5982 | |
| 1.07 | 2.2752 | |
| 1.16 | 2.592 | |
| 1.19 | 2.2252 | |
| 1.19 | 3.2843 | |
| 1.22 | 2.19 | |
| 1.32 | 3.6176 | |
| 1.38 | 2.9516 | |
| 1.43 | 3.9971 | |
| 1.45 | 4.0855 | |
| 1.46 | 4.0862 | |
| 1.48 | 4.0272 | |
| 1.56 | 4.3208 | |
| 1.58 | 3.6052 | |
| 1.6 | 3.59 | |
| 1.72 | 4.8756 | |
| 1.77 | 4.1042 | |
| 1.8 | 4.538 | |
| 1.81 | 5.1257 | |
| 1.88 | 4.6192 | |
| 1.94 | 5.4924 | |
| 2 | 5.21 | |
| 24 | Compare the regression results of the two plots. What does the outlier do to the regression line? | |
| What does the regression line do when the outlier is removed? | ||
| 25 | How does removal of the outlier affect R2? | |
| Why does it affect R2 this way? | ||