Probability and statistics


Mass Shootings from 1982 – 2019

Cougan Collins

Class: SU20-INTRO TO PROBABILITY AND STATS_01

Instructor: Nicholas Jacob

I will examine data on mass shootings that took place from 1982 to 2019. The data can be found at https://www2.stetson.edu/~jrasp/data.htm. The original data set was not arranged by year, so I sorted the mass shootings into chronological order to make it easier to read. I also removed some additional data that was repetitive and not useful for my analysis. The categorical variables are the case, location (state/city), date, summary of the shooting, location (where it happened), prior signs of mental health issues, mental health details, whether the weapons were obtained legally, weapon type, race, gender, and type of shooting. The quantitative variables are the number of fatalities, the age of the shooter, the number of injured, and the total victims. Below is an image of four cases of my data set.

What I am most interested in learning from the mass shootings in this timeframe is how many of the shooters had prior signs of mental illness. If the data show that most of these shooters had some kind of mental illness beforehand, it would suggest that we need to pay closer attention to those who have a mental illness, especially if they show any signs of hostility towards classmates or coworkers. If mental health problems are one of the main warning signs for a mass shooter, then we could develop better programs that give these individuals the tools they need so that they do not become the next mass shooter.

I chose to make two tables that show the frequency and relative frequency of mass shooters who had prior signs of mental illness and of those who obtained their guns legally. The number of shooters who had prior signs of mental illness (59) is much higher than the number who did not (17). Since my table includes forty-one shooters whose prior mental condition is unclear, the relative frequency of those who had prior signs of mental illness appears lower than it would otherwise. If we removed the unclear cases, the relative frequency would be about 78% (59 of 76), which suggests that mental illness is a key factor among those who commit a mass shooting.

As you look at the second table, about 71% of the weapons they used were obtained legally. If you removed the unclear accounts, about 84% of the guns were obtained legally.

Our third table, which is a two-way table, shows the relationship between having prior signs of mental illness and obtaining guns legally. This relationship would suggest that if we want to help reduce the number of mass shootings, then we need stricter gun rules for those with mental illnesses. However, we need to keep in mind that having a mental illness doesn't mean a person will become a mass shooter or harm anyone. So, any restrictions would need to be applied fairly. Perhaps further study might show that certain kinds of mental illnesses are correlated with mass shootings.

Four cases from the data set

case: Welding shop shooting
location: Miami, Florida
date: 8/20/1982
summary: Junior high school teacher Carl Robert Brown, 51, opened fire inside a welding shop and was later shot dead by a witness as he fled the scene.
fatalities: 8, injured: 3, total_victims: 11
location (where it happened): Other
age_of_shooter: 51
prior_signs_mental_health_issues: Yes
mental_health_details: His second wife left him because he refused to seek psychological help. He had become increasingly isolated. One former student said he was "off his rocker."
weapons_obtained_legally: Yes
weapon_type: One shotgun
race: white, gender: Male, type: Mass

case: Dallas nightclub shooting
location: Dallas, Texas
date: 6/29/1984
summary: Abdelkrim Belachheb, 39, opened fire at an upscale nightclub after a woman rejected his advances. He was later arrested.
fatalities: 6, injured: 1, total_victims: 7
location (where it happened): Other
age_of_shooter: 39
prior_signs_mental_health_issues: Yes
mental_health_details: During his last meal with his wife, he confessed he was depressed and had visited psychiatric hospitals in Belgium.
weapons_obtained_legally: No
weapon_type: One semiautomatic handgun
race: white, gender: Male, type: Mass

case: San Ysidro McDonald's massacre
location: San Ysidro, California
date: 7/18/1984
summary: James Oliver Huberty, 41, opened fire in a McDonald's restaurant before he was shot dead by a police officer.
fatalities: 22, injured: 19, total_victims: 41
location (where it happened): Other
age_of_shooter: 41
prior_signs_mental_health_issues: Yes
mental_health_details: The day before the shooting, he tried to make an appointment at a mental health clinic.
weapons_obtained_legally: Yes
weapon_type: One semiautomatic handgun, one rifle (assault), one shotgun
race: white, gender: Male, type: Mass

case: United States Postal Service shooting
location: Edmond, Oklahoma
date: 8/20/1986
summary: Postal worker Patrick Sherrill, 44, opened fire at a post office before committing suicide.
fatalities: 15, injured: 6, total_victims: 21
location (where it happened): Workplace
age_of_shooter: 44
prior_signs_mental_health_issues: Unclear
mental_health_details: He was worried he had inherited mental problems and rebuffed a pastor's suggestion he seek psychiatric counseling. His family members denied he had a history of mental illness.
weapons_obtained_legally: Yes
weapon_type: Three semiautomatic handguns
race: white, gender: Male, type: Mass

We will never be able to stop all mass shootings, but the data I have compiled suggest that we could decrease the number of shootings by placing more substantial restrictions on gun ownership for people who have mental illnesses. If parents with no mental illness legally own guns but their child is mentally ill, then they should secure their weapons so the child has no access to them.

Prior signs of mental illness

Response Frequency Relative Frequency

Yes 59 0.504273504

No 17 0.145299145

Unclear 41 0.35042735

Total 117 1

Weapons obtained legally

Response Frequency Relative Frequency

Yes 83 0.709401709

No 16 0.136752137

Unclear 18 0.153846154

Total 117 1
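The relative frequencies in the two tables above, including the figures obtained when the Unclear cases are removed, can be reproduced with a short calculation. This is only a minimal Python sketch; the project's own numbers came from a spreadsheet, so the function and variable names here are illustrative assumptions.

```python
# Counts copied from the two frequency tables above.
mental_illness = {"Yes": 59, "No": 17, "Unclear": 41}
legal_weapons = {"Yes": 83, "No": 16, "Unclear": 18}


def relative_frequencies(counts, drop=()):
    """Return each response's share of the total, optionally dropping responses."""
    kept = {k: v for k, v in counts.items() if k not in drop}
    total = sum(kept.values())
    return {k: v / total for k, v in kept.items()}


all_cases = relative_frequencies(mental_illness)
known_only = relative_frequencies(mental_illness, drop=("Unclear",))
legal_all = relative_frequencies(legal_weapons)
legal_known = relative_frequencies(legal_weapons, drop=("Unclear",))

print(f"Prior signs, all 117 cases:     {all_cases['Yes']:.3f}")
print(f"Prior signs, Unclear removed:   {known_only['Yes']:.3f}")
print(f"Legal weapons, all 117 cases:   {legal_all['Yes']:.3f}")
print(f"Legal weapons, Unclear removed: {legal_known['Yes']:.3f}")
```

Dropping the Unclear row changes only the denominator (117 becomes 76 or 99), which is why the Yes share rises.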

Two-way table

                                      Weapons obtained legally   Weapons obtained illegally   Weapons obtained unclear
Prior signs of mental illness         46                         8                            5
No prior signs of mental illness      12                         4                            0
Unclear prior signs of mental illness 25                         3                            14
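One way to read a two-way table like this is to convert each row to shares of its own row total, so the rows can be compared even though they have different sizes. The sketch below is a minimal Python illustration using the counts from the table; the dictionary layout and names are assumptions made for the example.

```python
# Counts copied from the two-way table above
# (rows: mental illness status, columns: how the weapons were obtained).
columns = ("legally", "illegally", "unclear")
two_way = {
    "Prior signs": (46, 8, 5),
    "No prior signs": (12, 4, 0),
    "Unclear prior signs": (25, 3, 14),
}

row_shares = {}
for row, counts in two_way.items():
    total = sum(counts)
    # Share of each weapon category within this row only.
    row_shares[row] = {c: n / total for c, n in zip(columns, counts)}
    print(row, {c: round(s, 3) for c, s in row_shares[row].items()})
```

Each printed row sums to 1, so the "legally" share can be compared directly across the three groups.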

Next, I will examine the statistical summary of the total number of victims in these mass shootings, and I will provide a histogram and a box plot. I also included a close-up of the main part of the box plot to make it easier to see. The charts show that the distribution is skewed to the right, as the mean (20.44) is greater than the median (11). Also, you will notice in the box plot that there are several outliers ranging from 36 to 604. If I removed the extreme outlier (604), the mean would change to 15.40517241, which shows how outliers can inflate the average. As you can see from the histogram, the majority of mass shootings have between three and thirteen victims injured or killed. We can conclude that a typical mass shooting in this data set has between three and thirteen victims who are injured or killed.

Total victims - Summary

Mean            20.43589744
Median          11
Standard Dev.   56.27184141
Variance        3190.248011
Q1              7
Q2              11
Q3              18
IQR             11
Min             3
Max             604
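The effect of the extreme outlier on the mean, and the usual 1.5 × IQR rule for flagging outliers in a box plot, can be checked from the summary values alone. The following is a minimal Python sketch; it assumes the box plot used the 1.5 × IQR rule (the report does not say), and the variable names are illustrative.

```python
# Summary values copied from the table above.
n = 117
mean_all = 20.43589744
q1, q3 = 7, 18
extreme = 604

# Recover the total number of victims from the mean, then drop the extreme case.
total_victims = mean_all * n
mean_without_extreme = (total_victims - extreme) / (n - 1)

# Box plot fences under the 1.5 * IQR rule (an assumption about the plot).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"Mean without the extreme outlier: {mean_without_extreme:.4f}")
print(f"Values above {upper_fence} are flagged as outliers")
```

With an upper fence of 34.5, the smallest outlier of 36 mentioned in the text falls just above the cutoff.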

Hypothesis test for a quantitative variable

I suspect that if we examine all the mass shootings in my chart, we will discover that there are more injuries than deaths.

Ho: The mean number of injuries is equal to the mean number of deaths in the mass shootings in my report.

Ha: The mean number of injuries is greater than the mean number of deaths in the mass shootings in my report.

As a first check, I can compare the mean of the deaths in the chart to the mean of the injuries. The mean of deaths is 8.10, and the mean of injuries is 12.33, so the sample means point in the direction of my hypothesis. A formal test is still needed to show that the difference is larger than chance alone would produce. I based my hypothesis on the fact that when a mass shooting happens, people tend to run as fast as they can, which puts them at a higher risk of injuring themselves. Also, as the shooter shoots into a crowd, he or she is not going to have a kill shot every time, which means that more people will be injured by stray bullets as well.

Simple formula: (i = injuries, d = deaths)

Ho: μi = μd

Ha: μi > μd

Hypothesis test for a categorical variable

I suspect that the majority of mass shooters have some form of mental illness and showed signs of mental illness before they committed the mass shooting.

Ho: The proportion of mass shooters who had prior signs of mental illness is equal to the proportion who had no prior signs of mental illness.

Ha: The proportion of mass shooters who had prior signs of mental illness is greater than the proportion who had no prior signs of mental illness.

The main reason I suspect the majority of mass shooters would have some form of mental illness is that it is hard to imagine a person in his or her right mind deciding to kill as many people as he or she can. We can get a first indication of whether the hypothesis is supported by looking at the frequency table below. It is consistent with my hypothesis because 59 cases had prior signs of mental illness, and 17 did not.

Prior signs of mental illness

Response Frequency Relative Frequency

Yes 59 0.504273504

No 17 0.145299145

Unclear 41 0.35042735

Total 117 1

Simple formula:

Ho: p Mental Illness=1/2

Ha: p Mental Illness >1/2

Bootstrap test for my categorical variable

To test my hypothesis further, I used bootstrapping with 52 generated samples to determine whether my null hypothesis should be rejected. My bootstrap mean was .778846154, and the standard error was .042. Computing the 95% confidence interval with a z score of 1.96, I found the range to be .6930511954 to .864641114. Because the entire interval lies above .5, I can be 95% confident that the majority of mass shooters had some previous signs of mental illness. So, I reject the null hypothesis, which supports my suspicion.
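The bootstrap procedure for a proportion can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool used in the project: it resamples the 76 cases with a known answer (59 Yes, 17 No), uses 2000 resamples rather than the 52 reported above, and fixes a random seed so the run is repeatable.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable (an assumption)

# 1 = prior signs of mental illness, 0 = no prior signs (Unclear cases left out).
sample = [1] * 59 + [0] * 17


def bootstrap_proportions(data, reps):
    """Resample the data with replacement and return the proportion from each resample."""
    n = len(data)
    return [sum(random.choices(data, k=n)) / n for _ in range(reps)]


props = bootstrap_proportions(sample, reps=2000)
boot_mean = statistics.mean(props)
boot_se = statistics.stdev(props)  # standard error = spread of the bootstrap proportions

# 95% interval as bootstrap mean plus or minus 1.96 standard errors.
lower = boot_mean - 1.96 * boot_se
upper = boot_mean + 1.96 * boot_se
print(f"bootstrap mean {boot_mean:.3f}, SE {boot_se:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

If the lower end of the interval stays above 0.5, the resampled data are consistent with a majority of shooters having had prior signs.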

Bootstrap test for my quantitative variable

To test my hypothesis further, I used bootstrapping with 52 generated samples for the injured and the fatalities to determine whether my null hypothesis should be rejected. My bootstrap mean for the fatalities is 8.010804709, and my bootstrap mean for injuries is 11.43202709. The standard error for fatalities is .728691332, and the standard error for injuries is 4.73563476.

Judging by the bootstrap means alone, injuries exceed deaths, which agrees with my suspicion. However, the 95% confidence intervals overlap. As you can see in the charts below, the fatalities range between 6.553422046 and 9.468187372, while the injuries range from 1.960757572 to 20.90329661. Because the range for the injuries is so much wider, it is possible for the mean number of deaths to be greater than the mean number of injuries, so this bootstrap comparison does not let me reject the null hypothesis that the two means are equal.
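The two ranges quoted above can be reproduced from the reported bootstrap means and standard errors. In this minimal Python sketch the multiplier of 2 is an assumption inferred from the numbers (mean plus or minus two standard errors matches the reported ranges), and the helper name is illustrative.

```python
# Bootstrap means and standard errors copied from the text above.
fatal_mean, fatal_se = 8.010804709, 0.728691332
injury_mean, injury_se = 11.43202709, 4.73563476


def interval(mean, se, multiplier=2.0):
    """Return (lower, upper) as mean plus or minus multiplier * se."""
    return mean - multiplier * se, mean + multiplier * se


fatal_ci = interval(fatal_mean, fatal_se)
injury_ci = interval(injury_mean, injury_se)

# Two intervals overlap when each one starts before the other one ends.
overlap = fatal_ci[0] <= injury_ci[1] and injury_ci[0] <= fatal_ci[1]

print("fatalities:", fatal_ci)
print("injuries:  ", injury_ci)
print("intervals overlap:", overlap)
```

The fatalities interval sits entirely inside the much wider injuries interval, which is why the comparison is inconclusive.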

From Module 6, I will use the formulas we learned to test my hypothesis again. Let's begin with our categorical data, which records whether the shooter had prior signs of mental illness.

Simple formula:

Ho: p Mental Illness = 1/2

Ha: p Mental Illness > 1/2

There are 76 shooters, with 59 of them having prior signs of mental illness. I suspect that more than half of the shooters had prior signs of mental illness. I will use an alpha of .05. My sample size is 76, the null proportion is 1/2, and my statistic is .776315789. The SE is .057353933, z = 4.817730411, and z* = 1.644853627. Using StatKey, my 95% CI ranges from .684 to .868. This is almost identical to my initial bootstrap test above, which once again offers support that my hypothesis is true. Because z is much larger than z*, I reject the null hypothesis.
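The test of one proportion described above can be worked through step by step. The following is a minimal Python sketch using only the standard library; the variable names are illustrative assumptions, and the numbers are the ones stated in the text.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in the text above.
n, successes = 76, 59
p0 = 0.5      # proportion under the null hypothesis
alpha = 0.05  # significance level for the one-sided test

p_hat = successes / n
se = sqrt(p0 * (1 - p0) / n)              # standard error under the null
z = (p_hat - p0) / se                     # test statistic
z_star = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
p_value = 1 - NormalDist().cdf(z)         # upper-tail p-value
reject_null = z > z_star

print(f"p hat = {p_hat:.9f}, SE = {se:.9f}")
print(f"z = {z:.6f}, z* = {z_star:.6f}, reject Ho: {reject_null}")
```

The decision rule for Ha: p > 1/2 is to reject Ho when z exceeds z*.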

n (sample size)   p (proportion)   p hat (statistic)   alpha
76                0.5              0.776315789         0.05

z*            z             SE
1.644853627   4.817730411   0.057353933

z = (p hat - p)/SE
SE = sqrt(p*(1-p)/n)

Next, let's examine the quantitative data using the new formula from Module 6. I am very interested in how this will compare to the previous module.

Simple formula: (i = injuries, d = deaths)

Ho: μi = μd

Ha: μi > μd

We have a sample size of 2391. Out of the 2391, there were 948 fatalities and 1443 injuries. The pooled SE = 0.014460896, p1 - p2 = -0.20702635, z = -14.31628741, and the one-tailed p-value = 8.6564E-47. I have included a snapshot of the Excel numbers. However, when I put this problem into StatKey, you can see from the chart below that there is only a .013 chance that the deaths and injuries will be different from one another. From the Excel nonpooled numbers, the 95% CI for the difference ranges from -0.234755148 to -0.17929755. Though this formula gave me different numbers from the previous example above, I would have to reject the null hypothesis. Also, my instructor told me to include the following CI numbers, which range from -0.027728799 to 0.027728799.

In this section, I am only retesting the quantitative part of my hypothesis with a t-test. Using the formula we learned this week, I came up with the following numbers.

              Fatalities         Injuries           Total
n             2391               2391               4782
Count         948                1443               2391
Proportion    0.396486826 (p1)   0.603513174 (p2)   0.5 (pooled proportion)

Pooled SE       0.014460896
p1-p2           -0.20702635
z               -14.31628741
p               8.6564E-47
tail            1
two times       1.73E-46
Not pooled SE   0.014147606
z*              1.959963985
95% CI          -0.234755148   -0.17929755
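The spreadsheet values above follow the standard calculation for comparing two proportions: a pooled standard error for the test statistic and a non-pooled standard error for the confidence interval. The Python sketch below is a minimal illustration with assumed variable names. It uses math.erfc for the tail probability because the p-value is far too small to compute accurately as a difference from 1.

```python
from math import erfc, sqrt
from statistics import NormalDist

# Counts copied from the table above.
n1 = n2 = 2391
x1, x2 = 948, 1443  # fatalities, injuries

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Pooled standard error, used for the z statistic.
pooled = (x1 + x2) / (n1 + n2)
pooled_se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = diff / pooled_se

# Lower-tail probability P(Z <= z), computed stably for a very negative z.
p_one_tail = 0.5 * erfc(-z / sqrt(2))
p_two_tail = 2 * p_one_tail

# Non-pooled standard error, used for the 95% confidence interval.
unpooled_se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z_star = NormalDist().inv_cdf(0.975)
margin = z_star * unpooled_se
ci = (diff - margin, diff + margin)

print(f"z = {z:.6f}, one-tail p = {p_one_tail:.4e}")
print(f"95% CI for p1 - p2: {ci[0]:.9f} to {ci[1]:.9f} (margin {margin:.9f})")
```

The margin printed at the end is the half-width of the interval, which matches the plus or minus 0.027728799 figure quoted in the text.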

Module 7

Average         4.230769231
Standard Dev.   45.9914506
SE              0.940561764
Tstat           4.498130153
Positive        0.034017483
CL (Lower)      4.198773687
CL (Upper)      4.262764774

If you compare these numbers to my previous test, they are quite different. With a t statistic this large (4.498), I reject my null hypothesis that the two means are equal. I am amazed at how different each statistical test is. My question would be, which is the most reliable method?

For the last part of our project, I will provide two conditional probabilities from a modified version of the two-way table that appeared earlier in this project.

What proportion of shooters had a prior sign of mental illness? This is found by dividing 39 by 54, which gives .72. In other words, 72% had prior signs of mental illness.

What proportion of shooters had no prior signs of mental illness and obtained weapons illegally? This is found by dividing 4 by 54, which gives .074. In other words, 7.4% had no prior signs of mental illness and had obtained weapons illegally.

Please refer to the following formula that is to be included in this final project.

I have enjoyed examining the statistics of my project. I am thankful for the opportunity to have had a glimpse into what goes into looking at projects like mine.

You can view my edited Excel file from this link:

https://www.dropbox.com/s/3xezqdqoth3v17q/Mother%20Jones%20-%20Mass%20Shootings%20Database%5EJ%201982%20-%202019%20revised.xlsx?dl=0

Two-way table

                                   Weapons obtained legally   Weapons obtained illegally   Total
Prior signs of mental illness      39                         0                            39
No prior signs of mental illness   11                         4                            15
Total                              50                         4                            54
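The two figures computed from this table (39/54 and 4/54) divide by the grand total of 54, so they describe all shooters in the modified table. A probability conditioned on a row or column divides by that row or column total instead. The Python sketch below shows both kinds of calculation side by side; the two conditional examples at the end are illustrative choices, not figures from the project.

```python
# Counts copied from the modified two-way table above.
table = {
    "Prior signs": {"legally": 39, "illegally": 0},
    "No prior signs": {"legally": 11, "illegally": 4},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Figures from the text: both divide by the grand total of 54.
p_prior = sum(table["Prior signs"].values()) / grand_total
p_no_prior_and_illegal = table["No prior signs"]["illegally"] / grand_total

# Illustrative conditional probabilities: divide by a row or column total.
no_prior_total = sum(table["No prior signs"].values())
p_illegal_given_no_prior = table["No prior signs"]["illegally"] / no_prior_total

legal_total = sum(row["legally"] for row in table.values())
p_prior_given_legal = table["Prior signs"]["legally"] / legal_total

print(f"P(prior signs) = {p_prior:.3f}")
print(f"P(no prior signs and illegal) = {p_no_prior_and_illegal:.3f}")
print(f"P(illegal | no prior signs) = {p_illegal_given_no_prior:.3f}")
print(f"P(prior signs | legal) = {p_prior_given_legal:.3f}")
```

The only difference between the two kinds of probability is the denominator: 54 for the whole table, or 15 and 50 for the row and column used as the condition.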