Statistics assignment for second year university
[Q 1~12] A clothing store is considering two methods to reduce its losses: 1) hiring a security guard or 2) installing cameras. After collecting data over a 5-month period for each method, the monthly losses (in 100) were recorded in the table. The manager would install the cameras unless there was enough evidence that the guard was better.

                Monthly losses            x̄      s²     n
Guard (x1)      27   20   32   23   38
Cameras (x2)    48   31   29   38   44
a) Compute the average and variance for each row of the above table. (Show the $\sum (x - \bar{x})^2$ calculation!)

b) Test whether to use the equal-variance procedure or not.
   (b-1) Set up the hypotheses.
   (b-2) F-stat =
   (b-3) Fcrit region:
   (b-4) Conclusion:

Note: A conclusion must include
   - whether you can reject H0 or not
   - an explanation in the problem context.

Assume the equal-variance t-statistic for the above two populations.

c) t-test about μ1 – μ2.
   (c-1) Set up the hypotheses.
   (c-2) Compute SD & d.f.
         d.f. =
         SD = $\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$ =
   (c-3) t-stat = $\frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{SD}$ =
   (c-4) Rejection region:
   (c-5) Conclusion:

Because hiring the guard is more expensive, the manager requires that the reduction in losses (in 100) when hiring the guard be at least 2.

d) 2nd t-test
   (d-1) Set up the hypotheses. H1:
   (d-2) t-stat = $\frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{SD}$ =
   (d-3) Conclusion: Since t-stat

(BA 2606 MID-2)

[Q 13~19] Hoping to improve sales, a company decided to introduce more attractive packaging. To test the effect on sales, the manager distributed the new design to Supermarket 1 (MKT1) while sending the old design to Supermarket 2 (MKT2). The barcode data were received after a certain period. The code for this product was 9077 in both supermarkets. Since the new package is more expensive, the manager wants to know how effective the new design is. The collected data for the total transactions (n) and the number of 9077 sales (x) are as follows:

        MKT1    MKT2    Total
n       904     1038
x       180     155
p

a) Set up the alternative hypothesis.
b) Fill in the table. (Use 3 decimal places.) What is the pooled proportion for (p1 – p2)?
c) Compute the standard error for (p1 – p2): SD = $\sqrt{\hat{p}(1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$ =
d) What distribution does (p1 – p2) follow? Why not the t-distribution?
e) Compute the z-statistic: z = $\frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{SD}$ =
f) What is zcrit at α = 0.05? Explain where you get the number. P(Z
g) Conclusion: Since

Because the new design is more expensive, management requires that the new design outsell the old one by at least 2%. Under this requirement, answer the following questions.
h) Compute the standard error for (p1 – p2): SD = $\sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$ =
i) Compute the z-statistic: z = $\frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{SD}$ =

[Q 22~27] To assess the effectiveness of ABS, a car buyer organized an experiment. He hit the brakes at several speeds and recorded the time to stop for an ABS-equipped car and for an identical car without ABS. The speeds and the stopping times (in 0.1 seconds) on dry pavement are listed here. Can we infer that ABS is better (that is, the stopping time is shorter) with 95% confidence?

Speeds (km/h)    20   30   40   50   60
ABS (x1)         36   48   60   67   70
non-ABS (x2)     34   51   64   69   73
d = x1 – x2

a) What kind of comparison is this question about?
b) Set up the alternative hypothesis, using d = x1 – x2.
c) Compute d in the table and the average and standard deviation of d. [Show your calculations!]
   $\bar{d} = \frac{\sum d}{n}$ =
d) What distribution does $\bar{d}$ follow?
e) Compute the t-statistic and tcrit at α = 0.05: t = $\frac{-2 - 0}{1.049}$ =
f) Conclusion:

[Q28~34] A professor is interested in whether students in different degree programs earn different amounts in their summer jobs. A sample of 4 students in each of the BA, BSc, and BBA programs was asked to report what they earned the previous summer. The results (in $100s) are listed here. Can we infer that students in different degree programs differ in their summer earnings?

          BA     BSc    BBA
          37     37     39
          32     49     54
          42     39     59
          49     47     56
mean                           Grand mean:

a) Compute the group means and the grand mean in the table.
b) Compute SST, SSE and SSTotal using the blank space in the table.
   SSE = $\sum \sum (x_{ij} - \bar{x}_j)^2$ =
   SSTotal = $\sum \sum (x_{ij} - \bar{\bar{x}})^2$ =
   SST =
c) Fill in the following table. (Fcrit is at α = 0.05.) Show the formulas for SSTotal & SST, as for SSE.

Source of variation    SS    d.f    MS    F    Fcrit
Treatment
Error
Total

d) Conclusion: Since
   We

[Q35~38] Employee absenteeism costs North American companies more than $100 billion per year. The personnel manager recorded the weekdays during which individuals in a sample of 200 absentees were away over the past several months. Do these data suggest that absenteeism is higher on some days of the week than on others?

Weekday                 Mon   Tue   Wed   Thu   Fri   SUM
Number of Absent (f)    49    37    35    39    40    200
Expected Absent (e)
(f - e)2/e

a) What is the null hypothesis? H0:
b) What is Expected Absent (e) for Wednesday? Show your calculation.
c) Fill in the table, including (f - e)2/e.
d) What is the χ2 statistic?
e) Rejection region at α = 0.05: If χ2 > χ2crit = (Do not use the p-value. Use χ2crit.)
f) Conclusion: Since
   We
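The goodness-of-fit arithmetic behind the absenteeism question can be sketched in a few lines of Python. This is only an illustration: it assumes the null hypothesis is an equal share of absences on each weekday, the function name `chi_square_gof` is made up for this sketch, and the critical value is copied by hand from the χ2 table at α = 0.05 rather than computed.

```python
# Observed absences per weekday, taken from the table in the question.
observed = {"Mon": 49, "Tue": 37, "Wed": 35, "Thu": 39, "Fri": 40}


def chi_square_gof(counts):
    """Return (expected count per cell, chi-square statistic, d.f.).

    Assumes H0 assigns the same probability to every category.
    """
    total = sum(counts.values())
    k = len(counts)
    expected = total / k
    chi2 = sum((f - expected) ** 2 / expected for f in counts.values())
    return expected, chi2, k - 1


expected, chi2, df = chi_square_gof(observed)

# Critical value for d.f. = 4 at alpha = 0.05, read from the chi-square table.
CHI2_CRIT = 9.49
reject_h0 = chi2 > CHI2_CRIT

print(f"e = {expected:.1f}, chi2 = {chi2:.3f}, d.f. = {df}, reject H0: {reject_h0}")
```

The same function works for any number of categories, but the critical value has to be looked up again whenever d.f. changes.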
[Q39~42] Every year, there are more than 300,000 robberies in the United States. A researcher took a random sample of robberies in 2000, 2005, and 2010, and recorded the weapon used: 1 = Firearm, 2 = Knife or other cutting instrument, 3 = Other, and 4 = No weapon. Can we infer that the use of weapons in robberies differed over the three years?

                 Year 2000   Year 2005   Year 2010   Total
Firearm (1)      161         175         131         467
Knife (2)        36          37          24          97
Other (3)        53          39          27          119
No weapon (4)    159         166         126         451
Total            409         417         308         1134

a) What is statistical independence?
b) What is the expected frequency for type “1” weapon in Year 2005? = 417 × 467 / 1134 = 171.73
c) What is closest to the “(f - e)2/e” for type “1” weapon in Year 2005? = (175 − 171.73)² / 171.73
d) What is d.f.? = (r − 1)(c − 1) = (4 − 1)(3 − 1) = 6
e) What is the χ2crit at α = 0.05? χ2 0.05,6 = 12.6
f) If χ2 statistic = 4.76, what is your conclusion?

[Q 43-50] To help determine how many beers to stock, the concession manager at Yankee Stadium wanted to know how the temperature affected beer sales. Accordingly, she took a sample of 10 games and recorded the number of beers sold (in 1,000 bottles) and the temperature (in °F) in the middle of the game. We use the linear regression model y = b0 + b1 x.

        Temp (x)   Beers (y)   (x − x̄)   (y − ȳ)   (x − x̄)²   (y − ȳ)²   (x − x̄)(y − ȳ)   ŷ         (ŷ − ȳ)²
        78         20          1          0          1           0           0                 20.967    0.935
        68         11          -9         -9         81          81          81                11.297    75.742
        72         14          -5         -6         25          36          30                15.165    23.377
        88         30          11         10         121         100         110               30.637    113.146
        72         17          -5         -3         25          9           15                15.165    23.377
        84         28          7          8          49          64          56                26.769    45.819
Mean    77         20
SUM                                                  302         290         292                         282.396

a) Fill in the table. (From (x − x̄) to Σ(x − x̄)(y − ȳ).)
b) What are Σ(x − x̄)², Σ(y − ȳ)², and Σ(x − x̄)(y − ȳ)? 302, 290 and 292
c) Compute the regression line, say ŷ = b0 + b1 x. (Show your calculations with 3 decimal places.)
   $b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$ =
d) What is the estimated value ŷ for x = 79?
e) The SSR was partly computed in the table. What is R2? = $\frac{SSR}{SST} = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2}$ =
f) Fill out the following ANOVA table about regression.
Source of Variation    SS    df    MS    F    Fcrit
Regression (R)
Residuals (E)
Total

R2 =

g) What is sb1 (the standard error for parameter b1)? $s_{b_1} = \frac{s_e}{\sqrt{\sum (x - \bar{x})^2}}$ =

(Formulas for Midterm Test) (BA 2606 MID-Formula)

Ch 12. t-test of μ: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ with d.f = n – 1; z-test of p: $z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}$ (But, CI = $\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}\hat{q}/n}$)

Ch 13. Equal-variance t-test of μ1 – μ2: SD = $\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$ where $s_p^2 = \frac{\nu_1 s_1^2 + \nu_2 s_2^2}{\nu_1 + \nu_2}$, with d.f = ν1 + ν2.

Unequal-variance t-test of μ1 – μ2: SD = $\sqrt{S_1 + S_2}$ where $S_i = s_i^2 / n_i$, with d.f = $\frac{(S_1 + S_2)^2}{S_1^2/\nu_1 + S_2^2/\nu_2}$.

F-test of $\sigma_1^2 / \sigma_2^2$: F = $s_1^2 / s_2^2$ (note: $F_{1-\alpha, \nu_1, \nu_2} = 1 / F_{\alpha, \nu_2, \nu_1}$; use α = .05/2 for equal variance test.)

Matched pairs: Use the difference (D) t-test of $\mu_D$.

Ch 14.
Source of variation    SS         d.f      MS     F
Treatment              SST        k - 1    MST    MST/MSE
Error                  SSE        n - k    MSE
Total                  SSTotal    n - 1

where Fcrit = $F_{\alpha, k-1, n-k}$, SSTotal = (SS from $\bar{\bar{x}}$), SSE = (SS from $\bar{x}_j$) and SST = $\sum n_j (\bar{x}_j - \bar{\bar{x}})^2$.

Fisher’s LSD: $t_{\alpha/2, n-k} \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$ compared with $|\bar{x}_i - \bar{x}_j|$ (Bonferroni adj. uses α' = α ÷ kC2).

Tukey’s ω: $\omega = q_\alpha(\nu, k) \sqrt{MSE / n_g}$ where $n_g = k / \left[ 1/n_1 + 1/n_2 + ... + 1/n_k \right]$.

Ch 15. Goodness-of-Fit: $\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$ with d.f. = k – 1.

Statistical Independence Test: $\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$ with d.f. = (r – 1)(c – 1).

Ch 15. Regression: $b_1 = \frac{s_{xy}}{s_x^2} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$ and $b_0 = \bar{y} - b_1 \bar{x}$ from $\hat{y} = b_0 + b_1 x$.

ANOVA Table: SSR = $\sum (\hat{y} - \bar{y})^2$, SSE = $\sum (y - \hat{y})^2$ (with d.f = n-2), SST = SSTotal = $\sum (y - \bar{y})^2$

$R^2 = \frac{SSR}{SST}$

Standard Error for b1: $s_{b_1} = \frac{s_e}{\sqrt{(n-1) s_x^2}} = \sqrt{\frac{MSE}{\sum (x - \bar{x})^2}}$
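As a rough illustration of the Ch 13 formulas, the sketch below computes the F ratio and the equal-variance (pooled) t-statistic for the guard and camera losses from Q 1~12. The helper names (`sample_variance`, `pooled_t`) are made up for this example, the hypothesized difference defaults to 0, and the critical values still have to be read from the F and t tables.

```python
from math import sqrt


def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Sample variance s^2 = sum((x - xbar)^2) / (n - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pooled_t(x1, x2, hypothesized_diff=0.0):
    """Equal-variance t-statistic for mu1 - mu2.

    Returns (t, SD, d.f.) using s_p^2 = (v1*s1^2 + v2*s2^2) / (v1 + v2)
    and SD = sqrt(s_p^2 * (1/n1 + 1/n2)).
    """
    n1, n2 = len(x1), len(x2)
    v1, v2 = n1 - 1, n2 - 1
    sp2 = (v1 * sample_variance(x1) + v2 * sample_variance(x2)) / (v1 + v2)
    sd = sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (mean(x1) - mean(x2) - hypothesized_diff) / sd
    return t, sd, v1 + v2


# Monthly losses (in 100) from the Q 1~12 table.
guard = [27, 20, 32, 23, 38]
cameras = [48, 31, 29, 38, 44]

f_stat = sample_variance(guard) / sample_variance(cameras)
t_stat, sd, df = pooled_t(guard, cameras)

print(f"F = {f_stat:.3f}")
print(f"SD = {sd:.3f}, d.f. = {df}, t = {t_stat:.3f}")
```

Passing a nonzero `hypothesized_diff` (for example `-2`) gives the statistic for a test in which the guard must beat the cameras by a required margin.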
TABLES:

Standard Normal Table for Pr(Z < z-value)

z      .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
1.5    .9332   .9345   .9357   .9370   .9382   .9394   .9406   .9418   .9429   .9441
1.6    .9452   .9463   .9474   .9484   .9495   .9505   .9515   .9525   .9535   .9545
1.7    .9554   .9564   .9573   .9582   .9591   .9599   .9608   .9616   .9625   .9633
1.8    .9641   .9649   .9656   .9664   .9671   .9678   .9686   .9693   .9699   .9706
1.9    .9713   .9719   .9726   .9732   .9738   .9744   .9750   .9756   .9761   .9767
2.0    .9772   .9778   .9783   .9788   .9793   .9798   .9803   .9808   .9812   .9817
2.1    .9821   .9826   .9830   .9834   .9838   .9842   .9846   .9850   .9854   .9857
2.2    .9861   .9864   .9868   .9871   .9875   .9878   .9881   .9884   .9887   .9890
2.3    .9893   .9896   .9898   .9901   .9904   .9906   .9909   .9911   .9913   .9916
2.4    .9918   .9920   .9922   .9925   .9927   .9929   .9931   .9932   .9934   .9936
2.5    .9938   .9940   .9941   .9943   .9945   .9946   .9948   .9949   .9951   .9952
2.6    .9953   .9955   .9956   .9957   .9959   .9960   .9961   .9962   .9963   .9964
2.7    .9965   .9966   .9967   .9968   .9969   .9970   .9971   .9972   .9973   .9974
2.8    .9974   .9975   .9976   .9977   .9977   .9978   .9979   .9979   .9980   .9981

Critical Value of F-dist; α =.05

ν2 \ ν1   1       2       3       4       5
1         161.4   199.5   215.7   224.6   230.2
2         18.51   19.00   19.16   19.25   19.30
3         10.13   9.55    9.28    9.12    9.01
4         7.71    6.94    6.59    6.39    6.26
5         6.61    5.79    5.41    5.19    5.05
6         5.99    5.14    4.76    4.53    4.39
7         5.59    4.74    4.35    4.12    3.97
8         5.32    4.46    4.07    3.84    3.69
9         5.12    4.26    3.86    3.63    3.48
10        4.96    4.10    3.71    3.48    3.33
11        4.84    3.98    3.59    3.36    3.20
12        4.75    3.89    3.49    3.26    3.11
13        4.67    3.81    3.41    3.18    3.03
14        4.60    3.74    3.34    3.11    2.96
15        4.54    3.68    3.29    3.06    2.90

Critical Value of F-dist; α =.025

ν2 \ ν1   1       2       3       4       5
1         647.8   799.5   864.2   899.6   921.8
2         38.5    39.0    39.2    39.2    39.3
3         17.4    16.0    15.4    15.1    14.9
4         12.2    10.6    9.98    9.60    9.36
5         10.0    8.43    7.76    7.39    7.15
6         8.81    7.26    6.60    6.23    5.99
7         8.07    6.54    5.89    5.52    5.29
8         7.57    6.06    5.42    5.05    4.82
9         7.21    5.71    5.08    4.72    4.48
10        6.94    5.46    4.83    4.47    4.24
11        6.72    5.26    4.63    4.28    4.04
12        6.55    5.10    4.47    4.12    3.89
13        6.41    4.97    4.35    4.00    3.77
14        6.30    4.86    4.24    3.89    3.66
15        6.20    4.77    4.15    3.80    3.58

t-Table (Critical value)

d.f    t0.05    t0.025
2      2.920    4.303
3      2.353    3.182
4      2.132    2.776
5      2.015    2.571
6      1.943    2.447
7      1.895    2.365
8      1.860    2.306
9      1.833    2.262
10     1.812    2.228
20     1.725    2.086
30     1.697    2.042
40     1.684    2.021
50     1.676    2.009
60     1.671    2.000
70     1.667    1.994
∞      1.645    1.960

χ2-Table (Critical value)

d.f    χ20.05    χ20.025
1      3.84      5.02
2      5.99      7.38
3      7.81      9.35
4      9.49      11.1
5      11.1      12.8
6      12.6      14.4
7      14.1      16.0
8      15.5      17.5
9      16.9      19.0
10     18.3      20.5
11     19.7      21.9
12     21.0      23.3
13     22.4      24.7
14     23.7      26.1
15     25.0      27.5
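Entries of the standard normal table can be cross-checked with the error function from Python's standard library. The sketch below is only a convenience check, not part of the exam: the name `normal_cdf` is chosen for this example, and it relies on the identity Pr(Z < z) = 0.5 · (1 + erf(z / √2)).

```python
from math import erf, sqrt


def normal_cdf(z):
    """Pr(Z < z) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Compare a few values against the printed table (row = 1.6 or 1.9, column = .04, .05, .06).
for z in (1.64, 1.65, 1.96):
    print(f"Pr(Z < {z:.2f}) = {normal_cdf(z):.4f}")
```

Reading the table in reverse (finding the z whose probability is closest to 0.95 or 0.975) is how zcrit values such as 1.645 and 1.96 are obtained.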