Chapter 10
Statistical Problems in Testing Models of Crime
Introduction
Much of applied economics uses data to test hypotheses derived from theory. Economists are primarily concerned with causal inference rather than descriptive statistics.
In other words, economists want to understand the direct effect of one variable on another, rather than simply describing trends (correlation is not causation!).
This, unfortunately, is incredibly challenging, and the methods for doing it are still being debated and improved today. This chapter introduces the most common statistical tools for estimating causal effects.
Testing the Determinants of Crime Rates
One of the most basic statistical inferences in the economics of crime is the relationship between the probability of sanction and the level of offending.
According to economic theory, an increase in the probability of sanction should lower the level of offending.
However, there are many other variables that could affect crime rates such as the median income, average education level, and age of the population.
Differences in crime rate over time and across various places could be explained by expected sanction, market wage, education levels, and age of the population.
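If this is true, then the supply of crime could be explained in this form:
$C = \beta_0 + \beta_1 S + \beta_2 W + \beta_3 E + \beta_4 A$
where $C$ is the crime rate, $S$ is the expected sanction, $W$ is the market wage, $E$ is the education level, and $A$ is the age of the population.
In such a model, each parameter $\beta_k$ describes the relationship between its explanatory variable and the outcome variable, holding the other variables constant.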
Even if we had data that observed all these variables, it is practically impossible that any values of the parameters would fit the data exactly. This is because all these variables are subject to random variation, or variation by chance.
This could happen for several reasons, including:
Measurement error bias
Omitted variable bias
Since the data cannot be fit exactly by the model, another variable is included: the error term, $\varepsilon$.
The error term represents any deviations in crime not explained by the explanatory variables.
It is assumed that these deviations are random according to some probability distribution and that the expected error is zero: $E[\varepsilon] = 0$.
The standard statistical approach is to use ordinary least squares (OLS) regression to compute estimated values for the parameters $\beta$ (written $\hat{\beta}$) and their standard errors, and then test their agreement with economic theory.
OLS estimates $\beta$ by calculating the $\hat{\beta}$ that minimizes the sum of the squared residuals, which are the estimated error terms.
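The OLS idea can be sketched with simulated data. This is a minimal illustration, not the chapter's own estimation: the data are made up, the "true" intercept and slope are assumed values, and only one explanatory variable (expected sanction) is used so that the closed-form solution stays short. The helper name `ols_simple` is hypothetical.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # For one regressor, the least-squares slope is cov(x, y) / var(x).
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


random.seed(0)
N = 20000
TRUE_INTERCEPT, TRUE_SLOPE = 10.0, -2.0  # assumed values for the simulation

# Simulated data: crime = intercept + slope * sanction + random error.
sanction = [random.gauss(0, 1) for _ in range(N)]
crime = [TRUE_INTERCEPT + TRUE_SLOPE * s + random.gauss(0, 1) for s in sanction]

b0_hat, b1_hat = ols_simple(sanction, crime)
print(f"estimated intercept: {b0_hat:.3f}, estimated slope: {b1_hat:.3f}")
```

Because the simulated error is independent of the sanction variable, the estimates should land close to the assumed true values.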
Omitted Variable Bias
There are several assumptions that must be met in order for the estimated parameter values to be unbiased, meaning that the estimated parameters are correct on average.
The first assumption is that the error term cannot be correlated with any explanatory variable. For example, $\mathrm{Cov}(W, \varepsilon) = 0$, where $W$ is the wage and $\varepsilon$ is the error term.
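For example, let’s assume that the researcher does not observe the average age of a population and estimates the following equation:
$C = \beta_0 + \beta_1 S + \beta_2 W + \beta_3 E + u$
where $C$ is crime, $S$ is the expected sanction, $W$ is the wage, $E$ is education, and $u$ is the error term.
Age still has an effect on crime, but it is no longer in the researcher's model, so it becomes part of the error term.
Age is also correlated with income (W), so the assumption that the error term is uncorrelated with W is no longer true.
In other words, the estimated wage parameter is going to confuse the effect of wages with the effect of age, producing bias.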
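Assume that age ($A$) and wage ($W$) are related by this equation:
$A = \delta_0 + \delta_1 W + v$
Then, writing $\beta_2$ and $\beta_4$ for the true effects of wage and age on crime, the estimated equation becomes:
$C = (\beta_0 + \beta_4 \delta_0) + \beta_1 S + (\beta_2 + \beta_4 \delta_1) W + \beta_3 E + (\varepsilon + \beta_4 v)$
So, the estimated wage parameter will actually be $\beta_2 + \beta_4 \delta_1$: the true wage effect plus the age effect scaled by how strongly age moves with wage.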
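The size of the omitted variable bias can be checked with a small simulation. This is a sketch under simplifying assumptions: the data are simulated, only wage and age enter the crime equation (sanction and education are left out to keep it short), and the constants `B_WAGE`, `B_AGE`, and `DELTA` are assumed values, not estimates from any real data.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor OLS fit: cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


random.seed(1)
N = 50000
B_WAGE = -1.0  # assumed true effect of wage on crime
B_AGE = -0.5   # assumed true effect of age on crime
DELTA = 0.8    # assumed strength of the link: age = DELTA * wage + noise

wage = [random.gauss(0, 1) for _ in range(N)]
age = [DELTA * w + random.gauss(0, 1) for w in wage]
crime = [B_WAGE * w + B_AGE * a + random.gauss(0, 1) for w, a in zip(wage, age)]

# The researcher omits age and regresses crime on wage alone.
short_slope = ols_slope(wage, crime)
predicted_biased_slope = B_WAGE + B_AGE * DELTA

print(f"true wage effect:         {B_WAGE:.3f}")
print(f"estimate with age omitted: {short_slope:.3f}")
print(f"wage effect + age effect * link: {predicted_biased_slope:.3f}")
```

The estimate with age omitted should sit near the wage effect plus the age effect times the wage-age link, not near the true wage effect.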
Measurement Error Bias
Often, there is some amount of error between the value that is measured and the true value. For example, we may have incident reports for Tallahassee, but they will always miss some number of unreported crimes.
Depending on the application, measurement error may not be problematic.
However, one instance in which it becomes problematic is when the measurement error is correlated with the explanatory variables.
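Suppose that both crime ($C$) and the expected sanction ($S$) are measured with error and that measurement errors in $C$ are related to errors in $S$. Then the observed values are $C^* = C + u_C$ and $S^* = S + u_S$, and
$C = \beta_0 + \beta_1 S + \varepsilon$
becomes
$C^* = \beta_0 + \beta_1 S^* + (\varepsilon + u_C - \beta_1 u_S)$
and the new error term is correlated with $S^*$, so the estimate of $\beta_1$ is biased.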
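A simulation can show how correlated measurement errors distort the estimate. This is a sketch with made-up data: the true slope, the error variances, and the constant `RHO` linking the two measurement errors are all assumed values chosen for illustration.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor OLS fit: cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


random.seed(2)
N = 50000
TRUE_SLOPE = -2.0  # assumed true effect of sanction on crime
RHO = 1.0          # assumed link between the two measurement errors

true_sanction, true_crime = [], []
obs_sanction, obs_crime = [], []
for _ in range(N):
    s = random.gauss(0, 1)
    c = TRUE_SLOPE * s + random.gauss(0, 1)
    u_s = random.gauss(0, 1)                    # error in measured sanction
    u_c = RHO * u_s + random.gauss(0, 0.5)      # error in measured crime, related to u_s
    true_sanction.append(s)
    true_crime.append(c)
    obs_sanction.append(s + u_s)
    obs_crime.append(c + u_c)

slope_true_data = ols_slope(true_sanction, true_crime)
slope_observed_data = ols_slope(obs_sanction, obs_crime)

print(f"slope using true values:     {slope_true_data:.3f}")
print(f"slope using measured values: {slope_observed_data:.3f}")
```

With these assumed variances the slope on the measured data converges to (TRUE_SLOPE * 1 + RHO * 1) / (1 + 1) = -0.5, well away from the true value of -2.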
Reverse Causality
While measurement error and omitted variable bias present challenges to empirical research, by far the largest problem is simultaneous equation bias, or reverse causality bias.
The previous models assume that any correlation between the explanatory variables and crime is due to the explanatory variables having an effect on crime, but often crime can also have an effect on an explanatory variable.
In this example, increases in the expected sanction will cause a decrease in crime. However, high crime gives local authorities an incentive to increase the resources allocated to reducing crime, causing an increase in sanctions. This selection bias in treatment means that crime is also having an effect on sanctions.
This reverse causality bias will cause the estimated sanction parameter to reflect a correlational relationship, but not a causal relationship.
A naïve regression of crime on expected sanctions will often result in a positive relationship, potentially leading someone to believe that higher sanctions cause higher crime rates.
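The sign flip from reverse causality can be reproduced in a simulation. This is a sketch under assumed values: the deterrent effect `B`, the policy response `D`, and the noise levels are invented for illustration, and the two equations are solved jointly so that each simulated city satisfies both at once.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor OLS fit: cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


random.seed(3)
N = 50000
B = 1.0  # assumed deterrent effect: crime = -B * sanction + e
D = 1.0  # assumed policy response: sanction = D * crime + v

sanction, crime = [], []
for _ in range(N):
    e = random.gauss(0, 2.0)  # shocks to crime
    v = random.gauss(0, 0.5)  # shocks to sanctions
    # Solving the two equations jointly gives the values below.
    s = (D * e + v) / (1 + B * D)
    c = -B * s + e
    sanction.append(s)
    crime.append(c)

naive_slope = ols_slope(sanction, crime)
print(f"true causal effect of sanction on crime: {-B:.3f}")
print(f"naive OLS slope:                          {naive_slope:.3f}")
```

Even though sanctions reduce crime in every simulated city, the naive slope comes out positive because high-crime cities respond with more sanctions.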
Instrumental Variables
One common approach to overcoming reverse causality is using instrumental variables or two-stage least squares (2SLS).
This approach involves finding a variable, or “instrument,” that has an effect on the treatment variable (sanctions) but is related to the outcome variable (crime rate) only through that effect.
In other words, the instrument causes changes in sanctions that are independent of crime rates.
Suppose that there was a grant allocating resources to increasing expected sanctions and that the grants were distributed to cities randomly. Then we could estimate the effect of the grant ($G$) on expected sanctions ($S$) by estimating the following equation:
$S = \pi_0 + \pi_1 G + v$
Then we could take this equation with the estimated parameters, compute the predicted sanctions $\hat{S} = \hat{\pi}_0 + \hat{\pi}_1 G$, and plug them into the original equation we were estimating in place of $S$.
Now the variation in expected sanctions used in the estimation is unrelated to crime rates, and the estimated effect of sanctions should be free of reverse causality bias.
There are two requirements an instrument must meet:
It must be correlated with the treatment.
It must only affect the dependent variable through its effect on the treatment variable. In other words, it cannot have a direct effect on the outcome variable.
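The two stages can be sketched on simulated data in which a randomly assigned grant shifts sanctions. Everything here is an assumption made for illustration: the deterrent effect `B`, the policy response `D`, the grant effect `G_EFFECT`, and the noise levels. The second-stage standard errors from this manual procedure would not be valid; only the point estimate is shown.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) from a one-regressor OLS fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


random.seed(5)
N = 50000
B = 1.0         # assumed deterrent effect: crime = -B * sanction + e
D = 1.0         # assumed policy response: sanction rises with crime
G_EFFECT = 2.0  # assumed effect of receiving the grant on sanctions

grant, sanction, crime = [], [], []
for _ in range(N):
    g = 1.0 if random.random() < 0.5 else 0.0  # grant assigned at random
    e = random.gauss(0, 2.0)
    v = random.gauss(0, 0.5)
    # Joint solution of: crime = -B*s + e and s = D*crime + G_EFFECT*g + v.
    s = (D * e + G_EFFECT * g + v) / (1 + B * D)
    grant.append(g)
    sanction.append(s)
    crime.append(-B * s + e)

# Naive OLS of crime on sanction, for comparison.
_, naive_slope = ols_simple(sanction, crime)

# Stage 1: regress sanction on the instrument and keep the predictions.
pi0_hat, pi1_hat = ols_simple(grant, sanction)
sanction_hat = [pi0_hat + pi1_hat * g for g in grant]

# Stage 2: regress crime on predicted sanction.
_, iv_slope = ols_simple(sanction_hat, crime)

print(f"true effect: {-B:.3f}, naive OLS: {naive_slope:.3f}, 2SLS: {iv_slope:.3f}")
```

The naive slope has the wrong sign, while the two-stage estimate should land near the assumed true effect because the grant moves sanctions for reasons unrelated to crime.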
Natural Experiments
Natural experiments arise when unique events cause shifts in treatment or explanatory variables. These can include new laws, court rulings, changes in nature or weather, etc.
Economists use these sudden events to study the effect they have on some outcome.
Example: Evans and Owens (2007) use the Violent Crime Control and Law Enforcement Act of 1994 to study the effect of increased law enforcement officers on crime.
This law established a grant program called COPS, which gives grants to local police agencies to hire new police officers.
Since these grants are not allocated based on crime rates, Evans and Owens were able to use these grants as an instrument for the number of police officers.
They found that an increase in police officers leads to statistically significant reductions in automobile theft, burglary, robbery, and aggravated assault.
Difference-In-Differences
Suppose a state were considering raising the minimum age to buy a firearm from 18 to 21 with the intent of reducing teen gun violence.
Going forward, there are two possible states of the world. One in which this law goes into effect and one in which the law doesn’t. In a theoretical experimental world, we could observe both states of the world and compare outcomes. However, in the real world the researcher is only able to observe one outcome.
Suppose the law passes. To answer the question of the effect that the law has on teen gun violence, the researcher needs to estimate the hypothetical outcome in which the law doesn’t pass.
A simple comparison of outcomes before and after the law is passed is likely to contain a significant amount of omitted variable bias. So, the researcher also includes variables that are likely to have an effect on crime: income, poverty, unemployment, weather, and demographic changes.
While this is likely to reduce bias in the estimate, the estimate will likely still contain a large amount of omitted variable bias.
So, what does the researcher do?
Difference-In-Differences! (DID)
This research technique involves including data from a state that is not treated to serve as a “counterfactual” or hypothetical state of the world in which the law didn’t pass.
Florida actually passed this law in March of 2018 while the minimum age to buy a firearm in Georgia and Alabama remained at 18.
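So, an approach to estimate the law’s effect on crime rates could be to observe data from 2015 to 2021 from all counties in Florida, Georgia, and Alabama, and estimate the following equation:
$Y_{ct} = \alpha + \delta (FL_c \times Post_t) + X_{ct} \beta + \gamma_c + \lambda_t + \varepsilon_{ct}$
$Y_{ct}$ is the teen gun violence rate in county $c$ and year $t$, $FL_c$ equals one for Florida counties, and $Post_t$ equals one for years after the law took effect.
$X_{ct}$ is a matrix of explanatory variables with $\beta$ being the vector of parameters.
$\gamma_c$ and $\lambda_t$ are county and year fixed effects, respectively.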
Fixed effects are dummy variables that are intended to account for unobserved variables that are constant for a given county or year.
For example, a dummy variable for 2020 is a variable that is equal to one in year 2020 and equal to zero otherwise. The parameter on this dummy would estimate the effect of unobservable factors that affect all counties in year 2020.
The coefficient on the treatment term (Florida counties in the years after the law) would give an estimate of the change in teen gun violence rates in Florida in the three years following the passage of the law, relative to Georgia and Alabama, after controlling for the other explanatory variables in our model.
This coefficient may give an unbiased estimate of the effect of this law on teen gun violence, but a deeper analysis is needed that is beyond the scope of this course.
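The logic of difference-in-differences can be sketched with simulated county-year data. This is an illustration only: the outcomes are invented, the county counts and `TRUE_EFFECT` are assumed, and 2018 is treated as the first post-law year. It uses the simple comparison of four group means rather than the full regression with controls and fixed effects.

```python
import random

random.seed(4)
TRUE_EFFECT = -1.5          # assumed effect of the law on the outcome
YEARS = list(range(2015, 2022))
FIRST_POST_YEAR = 2018      # assumption: 2018 onward counts as "after the law"


def simulate_group(n_counties, treated):
    """Return (post, outcome) rows for a balanced panel of simulated counties."""
    rows = []
    for _ in range(n_counties):
        # Treated counties start at a different level; DiD should remove this.
        county_effect = random.gauss(5.0, 1.0) + (2.0 if treated else 0.0)
        for year in YEARS:
            post = year >= FIRST_POST_YEAR
            year_effect = 0.3 * (year - 2015)  # common trend shared by all counties
            outcome = county_effect + year_effect + random.gauss(0, 0.5)
            if treated and post:
                outcome += TRUE_EFFECT
            rows.append((post, outcome))
    return rows


def mean_change(rows):
    """Mean outcome after the law minus mean outcome before it."""
    pre = [y for post, y in rows if not post]
    after = [y for post, y in rows if post]
    return sum(after) / len(after) - sum(pre) / len(pre)


treated_rows = simulate_group(50, treated=True)
control_rows = simulate_group(100, treated=False)

before_after_only = mean_change(treated_rows)
did_estimate = mean_change(treated_rows) - mean_change(control_rows)

print(f"true effect:                {TRUE_EFFECT:.3f}")
print(f"before/after in treated:    {before_after_only:.3f}")
print(f"difference-in-differences:  {did_estimate:.3f}")
```

The before/after comparison in the treated group mixes the law's effect with the common time trend, while subtracting the control group's change removes that trend, provided both groups would have followed the same trend without the law.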
Chapter Summary
This chapter covers common reasons for bias in estimating causal relationships.
Measurement Error Bias
Omitted Variable Bias
Reverse Causality Bias
And we discussed various methods used to overcome some of these issues:
Instrumental Variables (2SLS is most common)
Difference-In-Differences (DID)