Statistics

Discussion 1

You have learned about inferences about population variances, comparisons of multiple proportions, the test of independence, and goodness of fit. Please answer the following questions in detail, applying the knowledge you have gained from this week's readings and lectures. Include hypothetical examples whenever applicable.

1. Describe how chi squared and F random variables are generated. What are the properties of the distribution of these random variables?

2. Discuss the objective in testing hypotheses on variance of one population, and variances of two populations, and the underlying assumptions. Explain formulation of the hypothesis, the test statistic, the rationale for rejecting the null, the criterion for choosing the rejection region, possible test outcomes, and the criterion for evaluating the p value. Provide hypothetical examples of formulating hypotheses on variance of a population, and uniformity of variance across two populations.

3. Explain the chi-square tests of uniformity of a proportion across several populations, goodness of fit, and independence. Explain the formulation of the hypotheses, the test statistic, the rationale for rejecting the null, the criterion for choosing the rejection region, possible test outcomes, and the criterion for evaluating the p value. Provide hypothetical examples of formulating hypotheses in each case, including uniformity of a proportion across several populations.
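The construction asked about in question 1 can be illustrated by simulation. The sketch below is a hypothetical example, not part of the assignment: it builds chi-square draws as sums of squared independent standard normal values and F draws as ratios of two independent chi-squares, each divided by its degrees of freedom. The degrees of freedom (5 and 20), the seed, and the sample size are arbitrary choices made for illustration.

```python
import random
import statistics

def chi_square_draw(df, rng):
    """One chi-square draw: the sum of df squared independent N(0, 1) values."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_draw(df1, df2, rng):
    """One F draw: ratio of two independent chi-squares, each divided by its df."""
    return (chi_square_draw(df1, rng) / df1) / (chi_square_draw(df2, rng) / df2)

rng = random.Random(2024)  # fixed seed so the run is repeatable (arbitrary value)
chi_sample = [chi_square_draw(5, rng) for _ in range(20000)]
f_sample = [f_draw(5, 20, rng) for _ in range(20000)]

# Theory: a chi-square with k df has mean k and variance 2k;
# an F with (d1, d2) df has mean d2 / (d2 - 2) when d2 > 2.
print("chi-square mean:", statistics.fmean(chi_sample), "theory:", 5)
print("chi-square variance:", statistics.variance(chi_sample), "theory:", 10)
print("F mean:", statistics.fmean(f_sample), "theory:", 20 / 18)
print("smallest draws:", min(chi_sample), min(f_sample))  # neither can be negative
```

Both distributions take only nonnegative values and are skewed to the right, which is why the rejection regions in the variance tests are defined by tail areas rather than by symmetric bounds.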

Activity 1 – CLO 1, CLO 2, CLO 3, CLO 4

1. Ball bearing manufacturing is a highly precise business in which minimal part variability is critical. Large variances in the size of the ball bearings cause bearing failure and rapid wear-out. Production standards call for a maximum variance of .0001 inches. Gerry Liddy has gathered a sample of 15 bearings that shows a sample standard deviation of .014 inches. Use ɑ = .10 

a. Please determine whether the sample indicates that the maximum acceptable variance is being exceeded. 

b. What is the p value?

2. The grade point averages of 352 students who completed a college course in financial accounting have a standard deviation of .940. The grade point averages of 73 students who dropped out of the same course have a standard deviation of .797. 

a. Do the data indicate a difference between the variances of grade point averages for students who completed a financial accounting course and students who dropped out? 

b. Use ɑ = .05 level of significance. 

c. What is the p value? 

 

Note: The critical value of F with 351 and 72 degrees of freedom that leaves an area of 0.025 (ɑ/2) in the upper tail is 1.466.
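The test statistics for the two variance questions above can be sketched as follows. This is a minimal illustration under stated assumptions: question 1 is treated as an upper-tail test of H0: σ² ≤ .0001, question 2 as a two-tailed test with the larger sample variance in the numerator, and `chi2_sf` is a helper written for this sketch (not a library function) that evaluates the chi-square upper-tail area for integer degrees of freedom. The F p-value is not computed because it requires the F distribution function from statistical software; the decision uses the critical value given in the note.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area of a chi-square distribution with integer df.

    Uses the recurrence Q(x | v + 2) = Q(x | v) + (x/2)^(v/2) e^(-x/2) / Gamma(v/2 + 1),
    starting from Q(x | 2) = e^(-x/2) or Q(x | 1) = erfc(sqrt(x/2)).
    """
    half = x / 2.0
    if df % 2 == 0:
        q, v = math.exp(-half), 2
    else:
        q, v = math.erfc(math.sqrt(half)), 1
    while v < df:
        q += half ** (v / 2) * math.exp(-half) / math.gamma(v / 2 + 1)
        v += 2
    return q

# Question 1: H0: sigma^2 <= 0.0001 versus Ha: sigma^2 > 0.0001 (upper-tail test)
n, s, sigma0_sq, alpha = 15, 0.014, 0.0001, 0.10
chi_sq = (n - 1) * s ** 2 / sigma0_sq
p_value = chi2_sf(chi_sq, n - 1)
print(f"chi-square = {chi_sq:.2f}, df = {n - 1}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")

# Question 2: H0: sigma1^2 = sigma2^2 versus Ha: sigma1^2 != sigma2^2 (two-tailed)
s1, n1 = 0.940, 352   # students who completed the course
s2, n2 = 0.797, 73    # students who dropped out
f_stat = s1 ** 2 / s2 ** 2   # larger sample variance in the numerator
f_crit = 1.466               # upper 0.025 critical value for (351, 72) df, from the note
print(f"F = {f_stat:.3f}, df = ({n1 - 1}, {n2 - 1}), critical value = {f_crit}")
print("reject H0" if f_stat >= f_crit else "do not reject H0")
```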

Activity 2 – CLO 1, CLO 2, CLO 3, CLO 4

This activity requires a detailed analysis and answers to the three questions below:

1. As listed by The Art Newspaper's Visitor Figures Survey (https://www.theartnewspaper.com/visitor-figures-2017), the five most-visited art museums in the world, shown in Table 1, are the Louvre Museum, the National Museum in China, the Metropolitan Museum of Art, the Vatican Museums, and the British Museum.

Table 1

Frequencies of Ratings of the Museums by the Visitors Included in the Sample

| Rating | Louvre | National Museum in China | Metropolitan Museum of Art | Vatican Museums | British Museum |
| --- | --- | --- | --- | --- | --- |
| Spectacular | 113 | 88 | 94 | 98 | 96 |
| Not spectacular | 37 | 44 | 46 | 72 | 64 |

Use the sample data provided in Table 1 to answer the following questions:

a. Calculate the point estimate of the population proportion of visitors who rated each of these museums as spectacular. 

b. Conduct a hypothesis test to determine if the population proportion of visitors who rated the museum as spectacular is equal for these five museums using ɑ = .05 level of significance. What is the p-value? 

c. If the null hypothesis is rejected, perform post-hoc test(s) using the same ɑ and state your conclusions.
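A sketch of the calculations behind parts (a) through (c) for the Table 1 counts follows. It assumes the usual chi-square test of equal proportions with pooled expected frequencies. For part (c) it uses the Marascuilo pairwise procedure as one common post-hoc choice; the course may prescribe a different one. `chi2_sf` is a helper written for this sketch, and the critical value 9.488 is the standard chi-square table entry for ɑ = .05 with 4 degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square area for integer df (recurrence on df in steps of 2)."""
    half = x / 2.0
    if df % 2 == 0:
        q, v = math.exp(-half), 2
    else:
        q, v = math.erfc(math.sqrt(half)), 1
    while v < df:
        q += half ** (v / 2) * math.exp(-half) / math.gamma(v / 2 + 1)
        v += 2
    return q

museums = ["Louvre", "National Museum in China", "Metropolitan Museum of Art",
           "Vatican Museums", "British Museum"]
spectacular = [113, 88, 94, 98, 96]
not_spectacular = [37, 44, 46, 72, 64]

sizes = [a + b for a, b in zip(spectacular, not_spectacular)]
p_hat = [a / n for a, n in zip(spectacular, sizes)]   # part (a): point estimates
pooled = sum(spectacular) / sum(sizes)                # common proportion under H0

# Part (b): H0: p1 = p2 = p3 = p4 = p5 versus Ha: not all proportions are equal
chi_sq = 0.0
for obs_yes, obs_no, n in zip(spectacular, not_spectacular, sizes):
    exp_yes, exp_no = n * pooled, n * (1 - pooled)
    chi_sq += (obs_yes - exp_yes) ** 2 / exp_yes + (obs_no - exp_no) ** 2 / exp_no
df = len(museums) - 1
p_value = chi2_sf(chi_sq, df)
for name, p in zip(museums, p_hat):
    print(f"{name}: {p:.4f}")
print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.4f}")

# Part (c): Marascuilo pairwise comparisons (assumed post-hoc procedure)
crit = 9.488  # chi-square table value for alpha = .05, df = 4
significant_pairs = []
for i in range(len(museums)):
    for j in range(i + 1, len(museums)):
        diff = abs(p_hat[i] - p_hat[j])
        cv = math.sqrt(crit) * math.sqrt(p_hat[i] * (1 - p_hat[i]) / sizes[i]
                                         + p_hat[j] * (1 - p_hat[j]) / sizes[j])
        if diff > cv:
            significant_pairs.append((museums[i], museums[j]))
print("pairs that differ:", significant_pairs)
```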

2. A Financial Times/Harris Poll surveyed people in six countries to assess attitudes toward a variety of alternate forms of energy. The data in Table 2 are a portion of the poll’s findings concerning whether people favor or oppose the building of new nuclear power plants.

Table 2

Frequencies of the Respondents’ Opinions about Building New Nuclear Power Plants, by Country

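| Response | Great Britain | France | Italy | Spain | Germany | United States |
| --- | --- | --- | --- | --- | --- | --- |
| Strongly favor | 141 | 161 | 298 | 133 | 128 | 204 |
| Favor more than oppose | 348 | 366 | 309 | 222 | 272 | 326 |
| Oppose more than favor | 381 | 334 | 219 | 311 | 322 | 316 |
| Strongly oppose | 217 | 215 | 219 | 443 | 398 | 174 |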

 

Use the sample data provided in Table 2 to answer the following questions:

a. How large was the sample in this poll? 

b. Conduct a hypothesis test to determine whether people’s attitude toward building new nuclear power plants is independent of country using ɑ = .05 level of significance. What is the p-value and what is your conclusion? 

c. Using the percentage of respondents who “strongly favor” and “favor more than oppose,” which country has the most favorable attitude toward building new nuclear power plants? Which country has the least favorable attitude?
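The independence test for Table 2 can be sketched in the same way. The code below assumes the standard chi-square test of independence, with expected frequencies equal to (row total × column total) / sample size; `chi2_sf` is again a helper written for this sketch rather than a library call.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square area for integer df (recurrence on df in steps of 2)."""
    half = x / 2.0
    if df % 2 == 0:
        q, v = math.exp(-half), 2
    else:
        q, v = math.erfc(math.sqrt(half)), 1
    while v < df:
        q += half ** (v / 2) * math.exp(-half) / math.gamma(v / 2 + 1)
        v += 2
    return q

countries = ["Great Britain", "France", "Italy", "Spain", "Germany", "United States"]
observed = [
    [141, 161, 298, 133, 128, 204],   # Strongly favor
    [348, 366, 309, 222, 272, 326],   # Favor more than oppose
    [381, 334, 219, 311, 322, 316],   # Oppose more than favor
    [217, 215, 219, 443, 398, 174],   # Strongly oppose
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)                    # part (a): total sample size

# Part (b): H0: attitude is independent of country; Ha: the two are not independent
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_sq += (obs - expected) ** 2 / expected
df = (len(observed) - 1) * (len(countries) - 1)
p_value = chi2_sf(chi_sq, df)
print(f"n = {n}, chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.3g}")

# Part (c): share answering "strongly favor" or "favor more than oppose"
favorable = {c: (observed[0][j] + observed[1][j]) / col_totals[j]
             for j, c in enumerate(countries)}
most = max(favorable, key=favorable.get)
least = min(favorable, key=favorable.get)
print("most favorable:", most, "| least favorable:", least)
```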

3. Based on 2017 sales, the six top-selling compact cars are the Honda Civic, Toyota Corolla, Nissan Sentra, Hyundai Elantra, Chevrolet Cruze, and Ford Focus (New York Daily News, http://www.nydailynews.com/autos/street-smarts/best-selling-small-cars-2016-list-article-1.2945432). The 2017 market shares are: Honda Civic 20%, Toyota Corolla 17%, Nissan Sentra 12%, Hyundai Elantra 10%, Chevrolet Cruze 10%, and Ford Focus 8%, with other small car models making up the remaining 23%.

A sample of 400 compact car sales in Chicago showed the numbers of vehicles sold that are listed in Table 3.

Table 3

The Frequencies of Sales of Car Brands 

| Car model | Number sold |
| --- | --- |
| Honda Civic | 98 |
| Toyota Corolla | 72 |
| Nissan Sentra | 54 |
| Hyundai Elantra | 44 |
| Chevrolet Cruze | 42 |
| Ford Focus | 25 |
| Others | 65 |

Use a goodness of fit test to determine whether the sample data indicate that the market shares for compact cars in Chicago differ from the market shares suggested by nationwide 2017 sales, using ɑ = .05 level of significance.

a. What is the p-value and what is your conclusion? 

b. If the Chicago market appears to differ significantly from the nationwide sales, which categories contribute most to this difference? 
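The goodness of fit calculation for Table 3 can be sketched as below. It assumes H0: the Chicago shares equal the nationwide shares (20%, 17%, 12%, 10%, 10%, 8%, 23%), with expected counts equal to 400 times each share. `chi2_sf` is a helper written for this sketch. The per-category contributions printed at the end are one way to approach part (b).

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square area for integer df (recurrence on df in steps of 2)."""
    half = x / 2.0
    if df % 2 == 0:
        q, v = math.exp(-half), 2
    else:
        q, v = math.erfc(math.sqrt(half)), 1
    while v < df:
        q += half ** (v / 2) * math.exp(-half) / math.gamma(v / 2 + 1)
        v += 2
    return q

categories = ["Honda Civic", "Toyota Corolla", "Nissan Sentra", "Hyundai Elantra",
              "Chevrolet Cruze", "Ford Focus", "Others"]
shares = [0.20, 0.17, 0.12, 0.10, 0.10, 0.08, 0.23]   # nationwide 2017 shares
observed = [98, 72, 54, 44, 42, 25, 65]               # Chicago sample (Table 3)

n = sum(observed)
expected = [n * share for share in shares]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(contributions)
df = len(categories) - 1
p_value = chi2_sf(chi_sq, df)
print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.4f}")

# Part (b): list categories from largest to smallest contribution to chi-square
ranked = sorted(zip(categories, observed, expected, contributions),
                key=lambda row: row[3], reverse=True)
for name, o, e, c in ranked:
    print(f"{name}: observed {o}, expected {e:.0f}, contribution {c:.2f}")
```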

Discussion Question 2 – CLO 5, CLO 12

You have learned about multiple linear regression. Please answer the following questions in detail, applying the knowledge you have gained from readings and lectures. Include hypothetical examples whenever applicable.

1. Explain the multiple linear regression model, the independent variables and the dependent variable, assumptions of the model, objectives, and the approach taken to construct the model, and using the model for prediction.

2. Explain the analysis of variance (ANOVA) test on significance of the regression and how the result of this test is interpreted. Discuss the hypotheses on the coefficients of the regression and how the results of testing these hypotheses are interpreted with respect to the significance of these coefficients; include both unidirectional (one-tailed) and bidirectional (two-tailed) situations. Also discuss the coefficient of determination, the adjusted coefficient of determination, and their significance.

3. Describe how multicollinearity can have adverse effects in constructing the regression model, how it is identified, and how normality of residuals is verified. 
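For reference, the quantities named in these three questions are commonly written as shown below. The notation is an assumption made for this summary and may differ from the course text: n observations, p independent variables, b_j the estimate of β_j with standard error s_{b_j}, and R_j² the coefficient of determination from regressing x_j on the other independent variables.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon,
  \qquad \varepsilon \sim N(0, \sigma^2) \\[4pt]
F &= \frac{\mathrm{SSR}/p}{\mathrm{SSE}/(n - p - 1)}
  \quad \text{for } H_0\colon \beta_1 = \cdots = \beta_p = 0 \\[4pt]
t_j &= \frac{b_j}{s_{b_j}}
  \quad \text{for } H_0\colon \beta_j = 0, \text{ with } n - p - 1 \text{ degrees of freedom} \\[4pt]
R^2 &= \frac{\mathrm{SSR}}{\mathrm{SST}},
  \qquad R_a^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} \\[4pt]
\mathrm{VIF}_j &= \frac{1}{1 - R_j^2}
\end{align*}
```

A large VIF_j means that x_j is nearly a linear combination of the other independent variables, which inflates the standard error s_{b_j} and can make an individually important coefficient appear insignificant.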

Activity 3 – CLO 5, CLO 11, CLO 12

The Tire Rack, America’s leading online distributor of tires and wheels, conducts extensive testing to provide customers with products that are right for their vehicle, driving style, and driving conditions. In addition, the Tire Rack maintains an independent consumer survey to help drivers help each other by sharing their long-term tire experiences. The following data show survey ratings (1 to 10 scale with 10 being the highest rating) for 18 maximum performance summer tires. The variable Steering rates the tire’s steering responsiveness, Tread Wear rates quickness of wear based on the driver’s expectations, and Buy Again rates the driver’s overall tire satisfaction and desire to purchase the same tire again. Values shown below are averages as obtained from surveys.

Table 4 

Sample of Ratings of Eighteen Maximum Performance Summer Tires.

| Tire | Steering | Treadwear | Buy Again |
| --- | --- | --- | --- |
| Goodyear Assurance TripleTred | 8.9 | 8.5 | 8.1 |
| Michelin HydroEdge | 8.9 | 9.0 | 8.3 |
| Michelin Harmony | 8.3 | 8.8 | 8.2 |
| Dunlop SP 60 | 8.2 | 8.5 | 7.9 |
| Goodyear Assurance ComforTred | 7.9 | 7.7 | 7.1 |
| Yokohama Y372 | 8.4 | 8.2 | 8.9 |
| Yokohama Aegis LS4 | 7.9 | 7.0 | 7.1 |
| Kumho Power Star 758 | 7.9 | 7.9 | 8.3 |
| Goodyear Assurance | 7.6 | 5.8 | 4.5 |
| Hankook H406 | 7.8 | 6.8 | 6.2 |
| Michelin Energy LX4 | 7.4 | 5.7 | 4.8 |
| Michelin MX4 | 7.0 | 6.5 | 5.3 |
| Michelin Symmetry | 6.9 | 5.7 | 4.2 |
| Kumho 722 | 7.2 | 6.6 | 5.0 |
| Dunlop SP 40 A/S | 6.2 | 4.2 | 3.4 |
| Bridgestone Insignia SE20 | 5.7 | 5.5 | 3.6 |
| Goodyear Integrity | 5.7 | 5.4 | 2.9 |
| Dunlop SP20 FE | 5.7 | 5.0 | 3.3 |

Use the data provided in Table 4 to answer the following questions: 

1. Provide descriptive statistics of the data.

2. Develop two simple regression models that can be used to predict the Buy Again rating given the Steering Rating in one and the Tread Wear rating in the other. State the hypotheses on the coefficients, justify formulation of these hypotheses, and interpret the results. Use ɑ = .05. Include all phases of assessment of the model. 

3. Develop a multiple regression model that can be used to predict the Buy Again rating given the Steering rating and the Tread Wear rating. State the hypotheses on the coefficients, justify formulation of these hypotheses, and interpret the results. Use ɑ = .05. Include all phases of assessment of the model and do not forget to check multicollinearity.

4. Does combining the two independent variables improve the coefficient of determination? Please explain.

5. Choose a combination of Steering and Tread Wear ratings not given in the above table and find the expected Buy Again rating for this combination.
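The least-squares fits asked for in questions 2 through 5 can be sketched with the closed-form solution for two predictors, as below. This is a minimal illustration using only the Python standard library; it computes coefficients, R², adjusted R², and the variance inflation factor, but not the t and F p-values, which need distribution functions from statistical software. The prediction point (Steering 8.0, Tread Wear 7.5) is a hypothetical combination chosen for this sketch.

```python
from statistics import fmean

steering  = [8.9, 8.9, 8.3, 8.2, 7.9, 8.4, 7.9, 7.9, 7.6,
             7.8, 7.4, 7.0, 6.9, 7.2, 6.2, 5.7, 5.7, 5.7]
treadwear = [8.5, 9.0, 8.8, 8.5, 7.7, 8.2, 7.0, 7.9, 5.8,
             6.8, 5.7, 6.5, 5.7, 6.6, 4.2, 5.5, 5.4, 5.0]
buy_again = [8.1, 8.3, 8.2, 7.9, 7.1, 8.9, 7.1, 8.3, 4.5,
             6.2, 4.8, 5.3, 4.2, 5.0, 3.4, 3.6, 2.9, 3.3]

def cross(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mu, mv = fmean(u), fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

n = len(buy_again)
s11, s22 = cross(steering, steering), cross(treadwear, treadwear)
s12 = cross(steering, treadwear)
s1y, s2y = cross(steering, buy_again), cross(treadwear, buy_again)
syy = cross(buy_again, buy_again)

# Simple regressions: with one predictor, R^2 is the squared correlation
r2_steering = s1y ** 2 / (s11 * syy)
r2_treadwear = s2y ** 2 / (s22 * syy)

# Multiple regression: solve the centered normal equations for b1 and b2
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = fmean(buy_again) - b1 * fmean(steering) - b2 * fmean(treadwear)
r2 = (b1 * s1y + b2 * s2y) / syy
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2 - 1)
vif = 1 / (1 - s12 ** 2 / (s11 * s22))   # identical for both predictors when p = 2
residuals = [y - (b0 + b1 * x1 + b2 * x2)
             for x1, x2, y in zip(steering, treadwear, buy_again)]

print(f"R^2 (Steering only) = {r2_steering:.3f}, R^2 (Tread Wear only) = {r2_treadwear:.3f}")
print(f"Buy Again = {b0:.3f} + {b1:.3f} Steering + {b2:.3f} Tread Wear")
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}, VIF = {vif:.2f}")
print(f"predicted Buy Again at (8.0, 7.5) = {b0 + b1 * 8.0 + b2 * 7.5:.2f}")
```

Because adding a predictor can never lower R², the adjusted R² is the more informative figure when comparing the two-variable model with either simple regression.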

Professional Assignment 1 – CLO 3, CLO 5, CLO 6, CLO 11, CLO 12

Consumer Research, Inc., is an independent agency that conducts research on consumer attitudes and behaviors for a variety of firms. In one study, a client asked for an investigation of consumer characteristics that can be used to predict the amount charged by credit card users. Data were collected on annual income, household size, and annual credit card charges for a sample of 50 consumers and are given in Table 5.

a. Provide descriptive statistics of the data, develop a multiple regression model that can be used to predict the Amount Charged given the Annual Income and Household Size, state the hypotheses on the coefficients, justify formulation of these hypotheses, and interpret the results. Use ɑ = .05. Include all phases of assessment of the model and do not forget to check multicollinearity.

b. To assess robustness of the software, repeat part (a), but this time use the full representation of Annual Income (in dollars rather than in thousands of dollars). What conclusions can you make?

c. Choose a combination of Annual Income and Household Size not given in the table and find the expected Amount Charged for this combination.

Use ɑ = .05 in testing all hypotheses
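What part (b) asks you to examine can be previewed on a small made-up data set. The sketch below is hypothetical: the six observations are invented for illustration and are not taken from Table 5, and `fit_two_predictors` is a helper written for this sketch. It fits the same two-predictor model once with income in thousands of dollars and once with income in dollars, so the two sets of coefficients and R² values can be compared.

```python
import math
from statistics import fmean

def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns (b0, b1, b2, r_squared)."""
    m1, m2, my = fmean(x1), fmean(x2), fmean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    syy = sum((c - my) ** 2 for c in y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2, (b1 * s1y + b2 * s2y) / syy

# Hypothetical observations (NOT the Table 5 data)
income_thousands = [30, 45, 52, 61, 38, 70]
household_size = [2, 3, 4, 2, 5, 3]
amount_charged = [2500, 3600, 4400, 4100, 4300, 5000]

income_dollars = [1000 * x for x in income_thousands]   # "full representation"

fit_k = fit_two_predictors(income_thousands, household_size, amount_charged)
fit_d = fit_two_predictors(income_dollars, household_size, amount_charged)

print("income in $1000s: b0, b1, b2, R^2 =", [round(v, 4) for v in fit_k])
print("income in $:      b0, b1, b2, R^2 =", [round(v, 6) for v in fit_d])
```

In exact arithmetic, rescaling a predictor by 1000 divides its coefficient by 1000 and leaves the intercept, the other coefficient, R², and every test statistic unchanged; any visible discrepancy in software output would point to numerical rounding rather than to a different model.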

Table 5

Data on Annual Income, Household Size, and Amount Charged

Income ($1000’s)

Household size

Amount charged ($)

Income

Size

Charged

54 

4016 

54

6

5573

30 

3159 

30 

2583

32 

5100 

48 

3866 

50 

4742

34 

3586 

31 

1864

67 

5037

55 

4070 

50 

3605 

37 

2731 

67 

5345 

40 

3348 

55 

5370 

66 

4764 

52 

3890 

51 

4110 

62 

4705 

25 

4208 

64 

4157 

48 

4219 

22 

3579 

27 

2477 

29 

3890 

33

2

2514

39

2

2972

65 

4214

35

3121

63 

4965 

39 

4183 

42 

4412 

54 

3730 

21 

2448 

23 

4127 

44 

2995 

27 

2921 

37 

4171 

26 

4603 

62 

5678 

61 

4273 

21 

3623 

30 

3067 

55 

5301 

22 

3074 

42 

3020 

46 

4820 

41

7

4828

66

4

5149