Statistics

Discussion 1

You have learned about inferences on population variances, comparing multiple proportions, and tests of independence and goodness of fit. Please answer the following questions in detail by applying the knowledge that you have gained from this week's readings and lectures. It is important to include hypothetical examples whenever applicable.

1. Describe how chi-square and F random variables are generated. What are the properties of the distributions of these random variables?

2. Discuss the objective in testing hypotheses on the variance of one population and on the variances of two populations, and the underlying assumptions. Explain the formulation of the hypotheses, the test statistic, the rationale for rejecting the null, the criterion for choosing the rejection region, possible test outcomes, and the criterion for evaluating the p-value. Provide hypothetical examples of formulating hypotheses on the variance of a population and on the uniformity of variance across two populations.

3. Explain the chi-square tests on uniformity of a proportion across several populations, goodness of fit, and independence. Explain the formulation of the hypotheses, the test statistic, the rationale for rejecting the null, the criterion for choosing the rejection region, possible test outcomes, and the criterion for evaluating the p-value. Provide hypothetical examples of formulating hypotheses in each case, including one on uniformity of a proportion across several populations.
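One way to make question 1 concrete is a small simulation. The sketch below assumes the usual textbook constructions: a chi-square variable is a sum of squared independent standard normal variables, and an F variable is the ratio of two independent chi-square variables, each divided by its degrees of freedom. The function names, the seed, and the degrees of freedom (5 and 20) are hypothetical choices made only for illustration.

```python
import random
import statistics


def chi_square_draw(df, rng):
    """One chi-square draw: the sum of df squared independent standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))


def f_draw(df1, df2, rng):
    """One F draw: ratio of two independent chi-square draws, each divided by its df."""
    return (chi_square_draw(df1, rng) / df1) / (chi_square_draw(df2, rng) / df2)


rng = random.Random(2024)   # fixed seed so the run is repeatable
df1, df2 = 5, 20            # hypothetical degrees of freedom
chi_draws = [chi_square_draw(df1, rng) for _ in range(20000)]
f_draws = [f_draw(df1, df2, rng) for _ in range(20000)]

chi_mean = statistics.fmean(chi_draws)     # theory: df1
chi_var = statistics.pvariance(chi_draws)  # theory: 2 * df1
f_mean = statistics.fmean(f_draws)         # theory: df2 / (df2 - 2), for df2 > 2

print(f"chi-square mean {chi_mean:.3f}, variance {chi_var:.3f}")
print(f"F mean {f_mean:.3f}")
```

Both sets of draws are strictly positive and right-skewed, which is why the rejection regions for these tests are stated in terms of tail areas rather than symmetric intervals.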

Activity 1 – CLO 1, CLO 2, CLO 3, CLO 4

1. Ball bearing manufacturing is a highly precise business in which minimal part variability is critical. Large variances in the size of the ball bearings cause bearing failure and rapid wear-out. Production standards call for a maximum variance of .0001 when bearing sizes are measured in inches. Gerry Liddy has gathered a sample of 15 bearings that shows a sample standard deviation of .014 inches. Use ɑ = .10.

a. Please determine whether the sample indicates that the maximum acceptable variance is being exceeded.

b. What is the p-value?

2. The grade point averages of 352 students who completed a college course in financial accounting have a standard deviation of .940. The grade point averages of 73 students who dropped out of the same course have a standard deviation of .797.

a. Do the data indicate a difference between the variances of grade point averages for students who completed a financial accounting course and students who dropped out? Use the ɑ = .05 level of significance.

b. What is the p-value?

Note: The critical value of F at ɑ/2 with 351 and 72 degrees of freedom, which leaves an area of 0.025 in the upper tail, is 1.466.
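The two test statistics in Activity 1 can be sketched as follows. This is a minimal illustration under stated assumptions rather than a model answer: the helper name `chi2_sf_even_df` is made up for this example, and its closed form for the upper-tail area works here only because the 14 degrees of freedom in problem 1 is an even number. For problem 2 the sketch compares the F statistic with the critical value 1.466 given in the note; the p-value itself would need an F distribution function, which the Python standard library does not provide.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail area of a chi-square variable; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Problem 1: H0: sigma^2 <= .0001 versus Ha: sigma^2 > .0001 (upper-tail test).
n, s, sigma0_sq, alpha = 15, 0.014, 0.0001, 0.10
chi2_stat = (n - 1) * s ** 2 / sigma0_sq
p_value = chi2_sf_even_df(chi2_stat, n - 1)
reject_max_variance = p_value <= alpha

# Problem 2: H0: sigma1^2 = sigma2^2 versus Ha: sigma1^2 != sigma2^2 (two-tailed test).
s_completed, s_dropped = 0.940, 0.797
f_stat = s_completed ** 2 / s_dropped ** 2  # larger sample variance in the numerator
f_critical = 1.466                          # upper 0.025 point for 351 and 72 df, from the note
reject_equal_variances = f_stat >= f_critical

print(f"chi-square = {chi2_stat:.2f}, p-value = {p_value:.4f}, reject H0: {reject_max_variance}")
print(f"F = {f_stat:.3f}, critical value = {f_critical}, reject H0: {reject_equal_variances}")
```

Placing the larger sample variance in the numerator means only the upper critical value is needed for the two-tailed F test.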

Activity 2 – CLO 1, CLO 2, CLO 3, CLO 4

This activity requires a detailed analysis and answers to the three questions below:

1. As listed by The Art Newspaper’s Visitor Figures Survey (https://www.theartnewspaper.com/visitor-figures-2017), the five most-visited art museums in the world are the Louvre Museum, the National Museum in China, the Metropolitan Museum of Art, the Vatican Museums, and the British Museum. Table 1 shows how a sample of visitors rated each of these museums.

Table 1

Frequencies of Ratings of the Museums by the Visitors Included in the Sample

	Louvre	National Museum in China	Metropolitan Museum of Art	Vatican Museums	British Museum
Spectacular	113	88	94	98	96
Not spectacular	37	44	46	72	64

Use the sample data provided in Table 1 to answer the following questions:

a. Calculate the point estimate of the population proportion of visitors who rated each of these museums as spectacular.

b. Conduct a hypothesis test to determine if the population proportion of visitors who rated the museum as spectacular is equal for these five museums using ɑ = .05 level of significance. What is the p-value?

c. If the null is rejected, perform post-hoc test(s) using the same ɑ and state your conclusions.

2. A Financial Times/Harris Poll surveyed people in six countries to assess attitudes toward a variety of alternate forms of energy. The data in Table 2 are a portion of the poll’s findings concerning whether people favor or oppose the building of new nuclear power plants.

Table 2

Frequencies of Respondents’ Opinions about a New Nuclear Power Plant by Country of Origin

	Great Britain	France	Italy	Spain	Germany	United States
Strongly favor	141	161	298	133	128	204
Favor more than oppose	348	366	309	222	272	326
Oppose more than favor	381	334	219	311	322	316
Strongly oppose	217	215	219	443	398	174

Use the sample data provided in Table 2 to answer the following questions:

a. How large was the sample in this poll?

b. Conduct a hypothesis test to determine whether people’s attitude toward building new nuclear power plants is independent of country using ɑ = .05 level of significance. What is the p-value and what is your conclusion?

c. Using the percentage of respondents who “strongly favor” and “favor more than oppose,” which country has the most favorable attitude toward building new nuclear power plants? Which country has the least favorable attitude?

3. Based on 2017 sales, the six top-selling compact cars are the Honda Civic, Toyota Corolla, Nissan Sentra, Hyundai Elantra, Chevrolet Cruze, and Ford Focus (New York Daily News, http://www.nydailynews.com/autos/street-smarts/best-selling-small-cars-2016-list-article-1.2945432). The 2017 market shares are: Honda Civic 20%, Toyota Corolla 17%, Nissan Sentra 12%, Hyundai Elantra 10%, Chevrolet Cruze 10%, and Ford Focus 8%, with other small car models making up the remaining 23%.

A sample of 400 compact car sales in Chicago showed the number of vehicles sold in Table 3.

Table 3

The Frequencies of Sales of Car Brands

Honda Civic	98
Toyota Corolla	72
Nissan Sentra	54
Hyundai Elantra	44
Chevrolet Cruze	42
Ford Focus	25
Others	65

Use a goodness of fit test to determine whether the sample data indicate that the market shares for compact cars in Chicago differ from the market shares suggested by nationwide 2017 sales, using the ɑ = .05 level of significance.

a. What is the p-value and what is your conclusion?

b. If the Chicago market appears to differ significantly from the nationwide sales, which categories contribute most to this difference?
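The three chi-square tests in Activity 2 share one statistic, the sum of (observed - expected)^2 / expected over all cells. The sketch below applies it to Tables 1, 2, and 3. It is an illustration under stated assumptions: the function names are hypothetical, and `chi2_sf` uses the standard closed-form series for the upper-tail area with integer degrees of freedom, since the Python standard library has no chi-square distribution. The post-hoc comparisons asked for in question 1(c) are not covered.

```python
import math


def chi2_sf(x, df):
    """Upper-tail area of a chi-square variable with integer df (closed-form series)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2.0) / i
            total += term
        return math.exp(-x / 2.0) * total
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    return math.erfc(root / math.sqrt(2.0)) + 2.0 * density * total


def contingency_test(table):
    """Chi-square statistic, df, and p-value for a table of counts (rows x columns)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, chi2_sf(stat, df)


def goodness_of_fit(observed, shares):
    """Chi-square statistic, df, and p-value against hypothesized category shares."""
    total = sum(observed)
    stat = sum((o - total * p) ** 2 / (total * p) for o, p in zip(observed, shares))
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)


# Table 1: equality of the proportion rated "Spectacular" across five museums.
museums = ["Louvre", "National Museum in China", "Metropolitan Museum of Art",
           "Vatican Museums", "British Museum"]
spectacular = [113, 88, 94, 98, 96]
not_spectacular = [37, 44, 46, 72, 64]
proportions = {m: s / (s + n) for m, s, n in zip(museums, spectacular, not_spectacular)}
stat1, df1, p1 = contingency_test([spectacular, not_spectacular])

# Table 2: independence of attitude and country.
countries = ["Great Britain", "France", "Italy", "Spain", "Germany", "United States"]
poll = [
    [141, 161, 298, 133, 128, 204],  # Strongly favor
    [348, 366, 309, 222, 272, 326],  # Favor more than oppose
    [381, 334, 219, 311, 322, 316],  # Oppose more than favor
    [217, 215, 219, 443, 398, 174],  # Strongly oppose
]
sample_size = sum(sum(row) for row in poll)
stat2, df2, p2 = contingency_test(poll)
favorable = {c: (poll[0][j] + poll[1][j]) / sum(row[j] for row in poll)
             for j, c in enumerate(countries)}
most_favorable = max(favorable, key=favorable.get)
least_favorable = min(favorable, key=favorable.get)

# Table 3: goodness of fit of Chicago sales to the nationwide market shares.
models = ["Honda Civic", "Toyota Corolla", "Nissan Sentra", "Hyundai Elantra",
          "Chevrolet Cruze", "Ford Focus", "Others"]
sold = [98, 72, 54, 44, 42, 25, 65]
shares = [0.20, 0.17, 0.12, 0.10, 0.10, 0.08, 0.23]
stat3, df3, p3 = goodness_of_fit(sold, shares)
contributions = {m: (o - sum(sold) * p) ** 2 / (sum(sold) * p)
                 for m, o, p in zip(models, sold, shares)}

for label, stat, df, p in [("Table 1", stat1, df1, p1), ("Table 2", stat2, df2, p2),
                           ("Table 3", stat3, df3, p3)]:
    print(f"{label}: chi-square = {stat:.2f}, df = {df}, p-value = {p:.4g}")
```

The `contributions` dictionary holds each category's term of the goodness of fit statistic, which is one way to see which car models drive a significant result.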

Discussion Question 2 – CLO 5, CLO 12

You have learned about multiple linear regression. Please answer the following questions in detail by applying the knowledge that you have gained from readings and lectures. It is important to include hypothetical examples whenever applicable.

1. Explain the multiple linear regression model, the independent variables and the dependent variable, the assumptions and objectives of the model, the approach taken to construct the model, and how the model is used for prediction.

2. Explain the analysis of variance (ANOVA) test on the significance of the regression and how its result is interpreted. Discuss the hypotheses on the regression coefficients, covering both unidirectional (one-tailed) and bidirectional (two-tailed) situations, and how the test results are interpreted with respect to the significance of these coefficients. Also discuss the coefficient of determination, the adjusted coefficient of determination, and their significance.

3. Describe how multicollinearity can have adverse effects in constructing the regression model, how it is identified, and how normality of residuals is verified.
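For reference, the quantities named in these three questions are commonly written as follows. The notation is an assumption, not taken from the course materials: n observations, p independent variables, SSR, SSE, and SST for the regression, error, and total sums of squares, b_j for the estimate of the j-th coefficient with standard error s_{b_j}, and R_j^2 for the coefficient of determination obtained by regressing x_j on the other independent variables.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \varepsilon,
  \qquad \varepsilon \sim N(0, \sigma^2) \\
H_0 &: \beta_1 = \cdots = \beta_p = 0, \qquad
  F = \frac{\mathrm{SSR}/p}{\mathrm{SSE}/(n-p-1)} \\
H_0 &: \beta_j = 0, \qquad
  t = \frac{b_j}{s_{b_j}} \quad \text{with } n-p-1 \text{ degrees of freedom} \\
R^2 &= \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}, \qquad
  R_a^2 = 1 - \left(1 - R^2\right)\frac{n-1}{n-p-1} \\
\mathrm{VIF}_j &= \frac{1}{1 - R_j^2}
\end{align*}
```

A large variance inflation factor signals that x_j is close to a linear combination of the other independent variables, which inflates the standard error of b_j.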

Activity 3 – CLO 5, CLO 11, CLO 12

The Tire Rack, America’s leading online distributor of tires and wheels, conducts extensive testing to provide customers with products that are right for their vehicle, driving style, and driving conditions. In addition, the Tire Rack maintains an independent consumer survey to help drivers help each other by sharing their long-term tire experiences. The following data show survey ratings (1 to 10 scale with 10 being the highest rating) for 18 maximum performance summer tires. The variable Steering rates the tire’s steering responsiveness, Tread Wear rates quickness of wear based on the driver’s expectations, and Buy Again rates the driver’s overall tire satisfaction and desire to purchase the same tire again. Values shown below are averages obtained from the surveys.

Table 4

Sample of Ratings of Eighteen Maximum Performance Summer Tires

Tire	Steering	Tread Wear	Buy Again
Goodyear Assurance TripleTred	8.9	8.5	8.1
Michelin HydroEdge	8.9	9.0	8.3
Michelin Harmony	8.3	8.8	8.2
Dunlop SP 60	8.2	8.5	7.9
Goodyear Assurance ComforTred	7.9	7.7	7.1
Yokohama Y372	8.4	8.2	8.9
Yokohama Aegis LS4	7.9	7.0	7.1
Kumho Power Star 758	7.9	7.9	8.3
Goodyear Assurance	7.6	5.8	4.5
Hankook H406	7.8	6.8	6.2
Michelin Energy LX4	7.4	5.7	4.8
Michelin MX4	7.0	6.5	5.3
Michelin Symmetry	6.9	5.7	4.2
Kumho 722	7.2	6.6	5.0
Dunlop SP 40 A/S	6.2	4.2	3.4
Bridgestone Insignia SE20	5.7	5.5	3.6
Goodyear Integrity	5.7	5.4	2.9
Dunlop SP20 FE	5.7	5.0	3.3

Use the data provided in Table 4 to answer the following questions:

1. Provide descriptive statistics of the data.

2. Develop two simple regression models that can be used to predict the Buy Again rating, one given the Steering rating and the other given the Tread Wear rating. State the hypotheses on the coefficients, justify the formulation of these hypotheses, and interpret the results. Use ɑ = .05. Include all phases of assessment of the model.

3. Develop a multiple regression model that can be used to predict the Buy Again rating given the Steering rating and the Tread Wear rating. State the hypotheses on the coefficients, justify the formulation of these hypotheses, and interpret the results. Use ɑ = .05. Include all phases of assessment of the model and do not forget to check for multicollinearity.

4. Does combining the two independent variables improve the coefficient of determination? Please explain.

5. Choose a combination of Steering and Tread Wear ratings not given in the above table and find the expected Buy Again rating for this combination.
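The regression fits in questions 2 through 5 can be sketched with the standard library alone. The sketch below is an illustration, not the expected write-up: it solves the least-squares normal equations on mean-centered data, and the variable names, the `predict` helper, and the example ratings of 8.0 and 8.0 are hypothetical. It omits descriptive statistics, t tests on the individual coefficients, and residual checks.

```python
import statistics

# (Steering, Tread Wear, Buy Again) for the 18 tires in Table 4.
tires = [
    (8.9, 8.5, 8.1), (8.9, 9.0, 8.3), (8.3, 8.8, 8.2), (8.2, 8.5, 7.9),
    (7.9, 7.7, 7.1), (8.4, 8.2, 8.9), (7.9, 7.0, 7.1), (7.9, 7.9, 8.3),
    (7.6, 5.8, 4.5), (7.8, 6.8, 6.2), (7.4, 5.7, 4.8), (7.0, 6.5, 5.3),
    (6.9, 5.7, 4.2), (7.2, 6.6, 5.0), (6.2, 4.2, 3.4), (5.7, 5.5, 3.6),
    (5.7, 5.4, 2.9), (5.7, 5.0, 3.3),
]
steering, tread, buy = (list(column) for column in zip(*tires))


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mean_u, mean_v = statistics.fmean(u), statistics.fmean(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


n = len(tires)
s11, s22, s12 = cross(steering, steering), cross(tread, tread), cross(steering, tread)
s1y, s2y, syy = cross(steering, buy), cross(tread, buy), cross(buy, buy)

# Simple regressions: slope and R^2 (the squared correlation with Buy Again).
slope_steering = s1y / s11
slope_tread = s2y / s22
r2_steering = s1y ** 2 / (s11 * syy)
r2_tread = s2y ** 2 / (s22 * syy)

# Multiple regression: solve the 2 x 2 normal equations on centered data.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
b0 = statistics.fmean(buy) - b1 * statistics.fmean(steering) - b2 * statistics.fmean(tread)

ssr = b1 * s1y + b2 * s2y                       # regression sum of squares
r2_multiple = ssr / syy
adj_r2 = 1 - (1 - r2_multiple) * (n - 1) / (n - 2 - 1)
f_stat = (ssr / 2) / ((syy - ssr) / (n - 3))    # ANOVA F with 2 and n - 3 df
vif = 1 / (1 - s12 ** 2 / (s11 * s22))          # same for both predictors when p = 2


def predict(steering_rating, tread_rating):
    """Expected Buy Again rating from the fitted multiple regression."""
    return b0 + b1 * steering_rating + b2 * tread_rating


print(f"Buy Again = {b0:.3f} + {b1:.4f} Steering + {b2:.4f} Tread Wear")
print(f"R^2: steering {r2_steering:.3f}, tread {r2_tread:.3f}, both {r2_multiple:.3f}")
print(f"adjusted R^2 = {adj_r2:.3f}, F = {f_stat:.1f}, VIF = {vif:.2f}")
print(f"predicted Buy Again at (8.0, 8.0): {predict(8.0, 8.0):.2f}")
```

Adding a predictor can never lower the coefficient of determination, which is why the adjusted value is the fairer comparison between the simple and multiple models.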

Professional Assignment 1 – CLO 3, CLO 5, CLO 6, CLO 11, CLO 12

Consumer Research, Inc., is an independent agency that conducts research on consumer attitudes and behaviors for a variety of firms. In one study, a client asked for an investigation of consumer characteristics that can be used to predict the amount charged by credit card users. Data were collected on annual income, household size, and annual credit card charges for a sample of 50 consumers and are given in Table 5.

a. Provide descriptive statistics of the data and develop a multiple regression model that can be used to predict the Amount Charged given the Annual Income and Household Size. State the hypotheses on the coefficients, justify the formulation of these hypotheses, and interpret the results. Use ɑ = .05. Include all phases of assessment of the model and do not forget to check for multicollinearity.

b. To assess the robustness of the software, repeat part (a), but this time use the full representation of Annual Income (dollars rather than $1000’s). What conclusions can you draw?

c. Choose a combination of Annual Income and Household Size not given in the table and find the expected Amount Charged for this combination.

Use ɑ = .05 in testing all hypotheses.

Table 5

Data on Annual Income, Household Size, and Amount of Credit Charged

Income ($1000’s)	Household size	Amount charged ($)	Income ($1000’s)	Household size	Amount charged ($)
54	3	4016	54	6	5573
30	2	3159	30	1	2583
32	4	5100	48	2	3866
50	5	4742	34	5	3586
31	2	1864	67	4	5037
55	2	4070	50	2	3605
37	1	2731	67	5	5345
40	2	3348	55	6	5370
66	4	4764	52	2	3890
51	3	4110	62	3	4705
25	3	4208	64	2	4157
48	4	4219	22	3	3579
27	1	2477	29	4	3890
33	2	2514	39	2	2972
65	3	4214	35	1	3121
63	4	4965	39	4	4183
42	6	4412	54	3	3730
21	2	2448	23	6	4127
44	1	2995	27	2	2921
37	5	4171	26	7	4603
62	6	5678	61	2	4273
21	3	3623	30	2	3067
55	7	5301	22	4	3074
42	2	3020	46	5	4820
41	7	4828	66	4	5149
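Part (b) asks what happens when Annual Income is entered in dollars instead of thousands of dollars. The sketch below fits the same two-predictor model both ways using the Table 5 data. It is a minimal illustration under stated assumptions: the function `fit_two_predictors` is a hypothetical helper that solves the normal equations on centered data, and the combination of an income of 45 ($1000’s) with a household of 3 is an arbitrary example that does not appear in the table. Hypothesis tests and multicollinearity checks are left out.

```python
import statistics

# (Income in $1000's, Household size, Amount charged in $) for the 50 consumers in Table 5.
consumers = [
    (54, 3, 4016), (30, 2, 3159), (32, 4, 5100), (50, 5, 4742), (31, 2, 1864),
    (55, 2, 4070), (37, 1, 2731), (40, 2, 3348), (66, 4, 4764), (51, 3, 4110),
    (25, 3, 4208), (48, 4, 4219), (27, 1, 2477), (33, 2, 2514), (65, 3, 4214),
    (63, 4, 4965), (42, 6, 4412), (21, 2, 2448), (44, 1, 2995), (37, 5, 4171),
    (62, 6, 5678), (21, 3, 3623), (55, 7, 5301), (42, 2, 3020), (41, 7, 4828),
    (54, 6, 5573), (30, 1, 2583), (48, 2, 3866), (34, 5, 3586), (67, 4, 5037),
    (50, 2, 3605), (67, 5, 5345), (55, 6, 5370), (52, 2, 3890), (62, 3, 4705),
    (64, 2, 4157), (22, 3, 3579), (29, 4, 3890), (39, 2, 2972), (35, 1, 3121),
    (39, 4, 4183), (54, 3, 3730), (23, 6, 4127), (27, 2, 2921), (26, 7, 4603),
    (61, 2, 4273), (30, 2, 3067), (22, 4, 3074), (46, 5, 4820), (66, 4, 5149),
]
income_k, size, charged = (list(column) for column in zip(*consumers))
income_dollars = [1000 * value for value in income_k]


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns (b0, b1, b2, r_squared)."""
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    d1 = [value - m1 for value in x1]
    d2 = [value - m2 for value in x2]
    dy = [value - my for value in y]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    s11, s22, s12 = dot(d1, d1), dot(d2, d2), dot(d1, d2)
    s1y, s2y, syy = dot(d1, dy), dot(d2, dy), dot(dy, dy)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2, (b1 * s1y + b2 * s2y) / syy


fit_k = fit_two_predictors(income_k, size, charged)        # income in $1000's
fit_d = fit_two_predictors(income_dollars, size, charged)  # income in dollars

# Same hypothetical consumer expressed in both units.
prediction_k = fit_k[0] + fit_k[1] * 45 + fit_k[2] * 3
prediction_d = fit_d[0] + fit_d[1] * 45000 + fit_d[2] * 3

print("income in $1000's: b0=%.1f b1=%.4f b2=%.1f R^2=%.4f" % fit_k)
print("income in dollars: b0=%.1f b1=%.7f b2=%.1f R^2=%.4f" % fit_d)
print(f"predicted charge: {prediction_k:.0f} versus {prediction_d:.0f}")
```

In exact arithmetic, rescaling a predictor only rescales its own coefficient, so the fit, the coefficient of determination, and the predictions should agree between the two runs; a statistical package that reports otherwise is showing numerical trouble rather than a different model.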