Case Study- Report

Example

Statistics Investigation Report

Module: FC301- Statistics

Student Name: xxxxx

Student ID: xxxxx

Tutor(s) name(s): xxxxx and xxxxx

Date: dd/mm/yyyy

Pedestrians killed in collision with car, pick-up truck or van Deaths (US) (CDC)

correlates with

Drivers killed in collision with car, pick-up truck or van Deaths (US) (CDC)

x = Pedestrians killed in collision with car, pick-up truck or van, Deaths (US) (CDC)
y = Drivers killed in collision with car, pick-up truck or van, Deaths (US) (CDC)

Year     x        y
1999     2877     2988
2000     2909     2996
2001     3005     3288
2002     3078     3594
2003     2980     3412
2004     2790     3210
2005     2728     2839
2006     2764     2726
2007     2476     2429
2008     2112     1937
2009     1999     1672
2010     1838     1314
Total    31556    32405

Part 1: For your independent variable x:

(a)

Minimum Value 1838

Maximum Value 3078

Range 1240

Q1 2112

Median 2777

Q3 2980

Interquartile Range 868

Mode None (no value occurs more than once)

Mean 2629.67

Standard Deviation 405.51 (divisor n) or 423.55 (divisor n - 1)
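
The summary values for x can be checked with a short Python sketch. It assumes that the quartiles are taken as the values at the rounded positions (n + 1)/4 and 3(n + 1)/4 in the sorted data, which reproduces Q1 = 2112 and Q3 = 2980; other quartile conventions give slightly different values. The helper name `quartile_by_rank` is hypothetical and is introduced only for this illustration. Both common standard deviation conventions are computed.

```python
import statistics

# Pedestrian deaths per year, 1999-2010 (x), copied from the data table.
x = [2877, 2909, 3005, 3078, 2980, 2790, 2728, 2764, 2476, 2112, 1999, 1838]


def quartile_by_rank(values, p):
    """Return the value at the rounded position (n + 1) * p (assumed rule)."""
    ordered = sorted(values)
    rank = round((len(ordered) + 1) * p)  # 1-based position in the sorted list
    return ordered[rank - 1]


q1 = quartile_by_rank(x, 0.25)
q3 = quartile_by_rank(x, 0.75)

summary = {
    "minimum": min(x),
    "maximum": max(x),
    "range": max(x) - min(x),
    "q1": q1,
    "median": statistics.median(x),
    "q3": q3,
    "iqr": q3 - q1,
    "mean": round(statistics.mean(x), 2),
    "sd_divisor_n": round(statistics.pstdev(x), 2),
    "sd_divisor_n_minus_1": round(statistics.stdev(x), 2),
    "has_mode": len(set(x)) < len(x),  # True only if some value repeats
}

for name, value in summary.items():
    print(f"{name}: {value}")
```
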

(b)

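Class          F     CF    RF      PRF
1500≤x<2000    2     2     0.17    17%
2000≤x<2500    2     4     0.17    17%
2500≤x<3000    6     10    0.5     50%
3000≤x<3500    2     12    0.17    17%
Total          12    12    1       100%

(F = frequency, CF = cumulative frequency, RF = relative frequency, PRF = percentage relative frequency)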
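
The grouped frequency table can be rebuilt from the raw data. The following is a minimal sketch that assumes the class boundaries used in the table (lower bound included, upper bound excluded); the variable names are illustrative only.

```python
# Pedestrian deaths per year, 1999-2010 (x), copied from the data table.
x = [2877, 2909, 3005, 3078, 2980, 2790, 2728, 2764, 2476, 2112, 1999, 1838]

# Class boundaries taken from the frequency table.
class_edges = [1500, 2000, 2500, 3000, 3500]

rows = []
cumulative = 0
for lower, upper in zip(class_edges, class_edges[1:]):
    # Count values in the half-open class lower <= x < upper.
    frequency = sum(1 for value in x if lower <= value < upper)
    cumulative += frequency
    relative = frequency / len(x)
    rows.append(
        (f"{lower}≤x<{upper}", frequency, cumulative,
         round(relative, 2), f"{relative:.0%}")
    )

print("Class          F   CF   RF    PRF")
for label, f, cf, rf, prf in rows:
    print(f"{label:<14} {f:<3} {cf:<4} {rf:<5} {prf}")
```

Because each percentage is rounded separately, the four printed percentages add up to 101% rather than exactly 100%.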

(c)

[Pie chart: percentage of years in each class: 1500≤x<2000 17%, 2000≤x<2500 17%, 2500≤x<3000 50%, 3000≤x<3500 17%]

[Bar chart: one bar per class (1500≤x<2000, 2000≤x<2500, 2500≤x<3000, 3000≤x<3500), value axis from 0 to 63]

(d) The bar chart shows that the third bar (2500≤x<3000) is the longest, while the other three bars are shorter and equal in length. The pie chart shows that in half of all years the number of pedestrians killed in collisions with a car, pick-up truck or van was between 2500 and 3000. Each of the other three classes accounts for 17% of the years. Taken together, two-thirds of the years (8 of 12) had at least 2,500 pedestrian deaths.

Part 2: For your dependent variable y:

(a)

Minimum Value 1314

Maximum Value 3594

Range 2280

Q1 1937

Median 2913.5

Q3 3288

Interquartile Range 1351

Mode None (no value occurs more than once)

Mean 2700.42

Standard Deviation 691.37 (divisor n) or 722.12 (divisor n - 1)

(b) The calculations show that the minimum number of driver deaths is 1,314 and the maximum is 3,594, so the range is 2,280. The interquartile range, between the lower quartile (Q1 = 1937) and the upper quartile (Q3 = 3288), is 1351. The median is 2913.5, meaning that in half of the years the number of deaths was above 2913.5 and in the other half it was below 2913.5. From the quartiles, in about 25% of the years fewer than 1937 drivers were killed, and in about 25% of the years more than 3288 were killed. The number of deaths is different in every year, so there is no mode. The mean is 2700.42. The standard deviation (about 691 with divisor n, or 722 with divisor n - 1) is fairly large, which means that the number of deaths varies considerably from year to year. The mean (2700.42) is lower than the median (2913.5), which indicates that the data are not symmetrically distributed. The distribution appears to be negatively skewed, because the distance between Q3 and Q2 (3288 - 2913.5 = 374.5) is less than the distance between Q2 and Q1 (2913.5 - 1937 = 976.5).
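
The skewness argument based on the quartiles can be checked numerically. This is a minimal sketch that assumes the same quartile rule as the summary table (the value at the rounded position (n + 1)/4 or 3(n + 1)/4); comparing the two quartile gaps is a rough indicator of skew, not a formal test.

```python
import statistics

# Driver deaths per year, 1999-2010 (y), copied from the data table.
y = [2988, 2996, 3288, 3594, 3412, 3210, 2839, 2726, 2429, 1937, 1672, 1314]

ordered = sorted(y)
n = len(ordered)

# Quartiles by rounded rank (assumed rule), median by the usual definition.
q1 = ordered[round((n + 1) * 0.25) - 1]
q2 = statistics.median(ordered)
q3 = ordered[round((n + 1) * 0.75) - 1]

upper_gap = q3 - q2  # distance from the median to the upper quartile
lower_gap = q2 - q1  # distance from the lower quartile to the median
mean_y = statistics.mean(y)

if upper_gap < lower_gap:
    shape = "negative skew"
elif upper_gap > lower_gap:
    shape = "positive skew"
else:
    shape = "roughly symmetric"

print(f"Q3 - Q2 = {upper_gap}, Q2 - Q1 = {lower_gap}: {shape}")
print(f"mean = {mean_y:.2f}, median = {q2}")
```
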

Part 3: Comments

x        y        xy          x²          y²
2877     2988     8596476     8277129     8928144
2909     2996     8715364     8462281     8976016
3005     3288     9880440     9030025     10810944
3078     3594     11062332    9474084     12916836
2980     3412     10167760    8880400     11641744
2790     3210     8955900     7784100     10304100
2728     2839     7744792     7441984     8059921
2764     2726     7534664     7639696     7431076
2476     2429     6014204     6130576     5900041
2112     1937     4090944     4460544     3751969
1999     1672     3342328     3996001     2795584
1838     1314     2415132     3378244     1726596
Total    31556    32405    88520336    84955064    93242971

(a) Sxy = Σxy - ΣxΣy/n

= 88520336 - 31556×32405/12

= 3305987.667

Sxx = Σx² - (Σx)²/n

= 84955064 - 31556²/12

= 1973302.667

Syy = Σy² - (Σy)²/n

= 93242971 - 32405²/12

= 5735968.917

r = Sxy/√(Sxx×Syy)

= 3305987.667/√(1973302.667×5735968.917)

= 0.983

So r = 0.98 (to two decimal places)
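
The column totals and the correlation coefficient can be recomputed directly from the raw data. The sketch below follows the same summation formulas for Sxy, Sxx and Syy; the variable names are illustrative only.

```python
import math

# Data copied from the table: pedestrian deaths (x) and driver deaths (y).
x = [2877, 2909, 3005, 3078, 2980, 2790, 2728, 2764, 2476, 2112, 1999, 1838]
y = [2988, 2996, 3288, 3594, 3412, 3210, 2839, 2726, 2429, 1937, 1672, 1314]
n = len(x)

# Column totals of the working table.
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi * xi for xi in x)
sum_yy = sum(yi * yi for yi in y)

# Corrected sums of products and squares.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n

# Pearson's product-moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"Sxy = {s_xy:.3f}, Sxx = {s_xx:.3f}, Syy = {s_yy:.3f}")
print(f"r = {r:.3f}")
```
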

(b)

b = Sxy/Sxx

= 3305987.667/1973302.667

= 1.67536

a = ȳ - bx̄, where x̄ = 31556/12 = 2629.67 and ȳ = 32405/12 = 2700.42

= 2700.42 - 1.67536×2629.67

= -1705.22

y = bx + a

y = 1.675x - 1705.22

Because b > 0, the regression line slopes upward, which agrees with the positive correlation between x and y.

(c)

[Scatter plot; vertical axis: Drivers killed in collision with car, pick-up truck or van, Deaths (US) (CDC)]

In the scatter plot, the horizontal axis represents pedestrians killed in collision with a car, pick-up truck or van, Deaths (US) (CDC), and the vertical axis represents drivers killed in collision with a car, pick-up truck or van, Deaths (US) (CDC). Since r = 0.98, there is a strong positive linear correlation.

(d) Pearson's product-moment correlation coefficient (r = 0.98) shows a strong positive correlation between the number of pedestrians killed in collisions with a car, pick-up truck or van and the number of drivers killed in such collisions: in years when more pedestrians died, more drivers died as well. The coefficient of determination (r² ≈ 0.966) suggests that about 97% of the variation in the number of driver deaths can be explained by the variation in the number of pedestrian deaths. The y-intercept (a = -1705.22) suggests that if the number of pedestrians killed were 0, the predicted number of drivers killed would be -1705.22. This value has no practical meaning, because a number of deaths cannot be negative and x = 0 lies far outside the observed range of x (1838 to 3078). The gradient (b = 1.675) indicates that when the number of pedestrians killed increases by one, the number of drivers killed increases by about 1.675 on average.
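
The least-squares line and the coefficient of determination can be reproduced with a short sketch. It uses the same summation formulas as the hand calculation; the function name `predict` is hypothetical and is introduced only for this illustration.

```python
# Data copied from the table: pedestrian deaths (x) and driver deaths (y).
x = [2877, 2909, 3005, 3078, 2980, 2790, 2728, 2764, 2476, 2112, 1999, 1838]
y = [2988, 2996, 3288, 3594, 3412, 3210, 2839, 2726, 2429, 1937, 1672, 1314]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Corrected sums of products and squares.
s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
s_xx = sum(xi * xi for xi in x) - sum(x) ** 2 / n
s_yy = sum(yi * yi for yi in y) - sum(y) ** 2 / n

slope = s_xy / s_xx                    # b in y = bx + a
intercept = mean_y - slope * mean_x    # a in y = bx + a
r_squared = s_xy ** 2 / (s_xx * s_yy)  # coefficient of determination


def predict(pedestrian_deaths):
    """Predicted driver deaths for a given number of pedestrian deaths."""
    return slope * pedestrian_deaths + intercept


print(f"y = {slope:.3f}x + ({intercept:.2f})")
print(f"r squared = {r_squared:.3f}")
```

Predictions from `predict` are only reasonable for x values inside the observed range (1838 to 3078); outside it, such as at x = 0, the line gives values that cannot be interpreted as death counts.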