elem econ homework
Random assignment
Causality
Generally in economic research, we are interested in determining a causal relationship between two variables.
Examples of causal questions:
Does health insurance cause improved health?
Does education cause an increase in earnings? (we certainly hope so…)
Does an increase in the minimum wage cause a decline in employment?
Does the addition of the lovely little accents at the bottom of my slides cause you to pay more attention? (I certainly hope so…)
Treatments and outcomes
We call the variable that we suspect causes a change in the other the treatment.
Health insurance
Minimum wage increase
Education
Accent on slides
The variable that we suspect changes in response to the treatment is often called the outcome.
Health
Employment
Income
Student attention
In a perfect world…
… we could observe each individual in a relevant sample both with and without treatment at the same point in time.
Consider the case of a binary treatment variable (i.e. one that takes on two values such as “yes” or “no”). Denote the outcome of individual i with treatment as $Y_{1i}$ and the outcome for individual i without treatment as $Y_{0i}$. Then the causal effect of treatment for individual i is $Y_{1i} - Y_{0i}$.
These are potential outcomes. They are outcomes that may occur depending on treatment status. However, we can only ever observe one potential outcome for each person.
Example from book: Khuzdar’s health index with and without health insurance.
Unfortunately, we live in an imperfect world where things cannot simultaneously be and not be (as far as I know, I’m an economist not a philosopher). As our book quotes: “Acts demolish their alternatives, that is the paradox.”
What do we do?
Simple comparisons?
Can we simply compare individuals with and without treatment to get the causal effect? Let’s consider the example from Mastering Metrics with Khuzdar and Maria and health insurance.
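If we do a simple comparison of health outcomes between Khuzdar, who opted for health insurance, and Maria, who opted out of it, we would end up with (using the book's health index values):
$Y_{\text{Khuzdar}} - Y_{\text{Maria}} = Y_{1,\text{Khuzdar}} - Y_{0,\text{Maria}} = 4 - 5 = -1$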
If we take this as the causal effect of health insurance on health we would conclude that health insurance is detrimental to health. Is health insurance really just some government conspiracy to keep us unhealthy?! Not necessarily.
Note that the true causal effect for Khuzdar would be given by
$Y_{1,\text{Khuzdar}} - Y_{0,\text{Khuzdar}} = 4 - 3 = 1$
Thus, if we use a simple comparison such as that above we get a biased estimate of the true causal effect of health insurance on health outcomes.
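We can determine the size of the bias by adding and subtracting $Y_{0,\text{Khuzdar}}$ in the simple comparison to get:
$Y_{1,\text{Khuzdar}} - Y_{0,\text{Maria}} = Y_{1,\text{Khuzdar}} - Y_{0,\text{Khuzdar}} + Y_{0,\text{Khuzdar}} - Y_{0,\text{Maria}}$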
The first two terms comprise the true causal effect, so the last two terms measure the bias. In our imaginary example, where we know every potential outcome, the bias is -2.
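The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the health index values from the book's table (3 without and 4 with insurance for Khuzdar, 5 either way for Maria), and the dictionary layout and the `observed` helper are made up for illustration.

```python
# Potential outcomes (health index), assumed from the book's example.
# y0 = outcome without insurance, y1 = outcome with insurance.
potential = {
    "Khuzdar": {"y0": 3, "y1": 4, "insured": True},
    "Maria": {"y0": 5, "y1": 5, "insured": False},
}


def observed(person):
    """Return the single potential outcome we actually get to see."""
    p = potential[person]
    return p["y1"] if p["insured"] else p["y0"]


# What a naive researcher computes from observed data.
simple_comparison = observed("Khuzdar") - observed("Maria")

# What we could compute only in the "perfect world" with both outcomes.
causal_effect = potential["Khuzdar"]["y1"] - potential["Khuzdar"]["y0"]
selection_bias = potential["Khuzdar"]["y0"] - potential["Maria"]["y0"]

print(simple_comparison, causal_effect, selection_bias)  # -1 1 -2
```

The simple comparison equals the causal effect plus the bias, which is exactly the add-and-subtract decomposition.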
Selection bias
We call this type of bias selection bias. It is the bias that is caused by one subject being more likely to select into treatment. In this case Khuzdar is more likely to select into insurance due to his relatively poorer health without it.
Another way to think of how we get to a causal effect is through the old, familiar economics term ceteris paribus. We need all else to be equal between Khuzdar and Maria for a simple comparison to get us the causal effect we want. It is clear, however, that ceteris isn’t paribus with respect to the health of our two subjects.
Average causal effects
If we can’t get individual causal effects, can we maybe get causal effects from groups of individuals?
When we move from looking at individuals to groups we must look at averages. The average causal effect is exactly as it sounds: it is the average of the individual causal effects:
$\text{Avg}_n[Y_{1i} - Y_{0i}] = \frac{1}{n}\sum_{i=1}^{n}(Y_{1i} - Y_{0i})$
We have the same problem here as we did with our simple two person example. In order to get the true average causal effect we need both potential outcomes for each individual, which is impossible.
Group comparisons
If we simply compare groups with and without insurance to estimate the causal effect, we shouldn’t be surprised to find that our estimate is biased.
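Such a comparison would be done by
$\text{Avg}_n[Y_i \mid D_i = 1] - \text{Avg}_n[Y_i \mid D_i = 0]$
where $D_i$ is a dummy variable that equals 1 when individual i chooses insurance and 0 otherwise.
Formulating the selection bias in general is difficult, but for illustrative purposes consider a constant-effects model where
$Y_{1i} = Y_{0i} + \kappa$
for all i.
So $\kappa$ is the individual causal effect for everybody (and therefore also the average causal effect).
Then the simple group difference is
$\text{Avg}_n[Y_i \mid D_i = 1] - \text{Avg}_n[Y_i \mid D_i = 0] = \kappa + \text{Avg}_n[Y_{0i} \mid D_i = 1] - \text{Avg}_n[Y_{0i} \mid D_i = 0]$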
Again, the last two terms measure average selection bias.
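To see the group version of this decomposition in action, here is a small simulation sketch. Everything in it is an assumption made for illustration: the population is invented, the constant effect is set to 1, and the selection rule (people in poorer health are the ones who buy insurance) is just one plausible story.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible
KAPPA = 1.0     # assumed constant causal effect (hypothetical value)

# Hypothetical population: y0 is the health index without insurance.
y0 = [random.gauss(5.0, 1.0) for _ in range(10_000)]

# Assumed selection rule: people in poorer health (low y0), plus some
# noise, are the ones who choose insurance (d = 1).
d = [1 if y0_i + random.gauss(0.0, 1.0) < 5.0 else 0 for y0_i in y0]

# Observed outcome: y0 + KAPPA if insured, y0 otherwise.
y = [y0_i + KAPPA * d_i for y0_i, d_i in zip(y0, d)]

# Simple group comparison using only observed outcomes.
treated = [y_i for y_i, d_i in zip(y, d) if d_i == 1]
control = [y_i for y_i, d_i in zip(y, d) if d_i == 0]
group_difference = mean(treated) - mean(control)

# Average selection bias: gap in average y0 between the two groups.
# (Only computable here because the simulation knows every y0.)
y0_treated = [y0_i for y0_i, d_i in zip(y0, d) if d_i == 1]
y0_control = [y0_i for y0_i, d_i in zip(y0, d) if d_i == 0]
selection_bias = mean(y0_treated) - mean(y0_control)

print(f"group difference: {group_difference:.3f}")
print(f"kappa + bias:     {KAPPA + selection_bias:.3f}")
```

The two printed numbers agree, and because the insured start out less healthy, the bias is negative and the group difference understates the true effect.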
Random assignment
All hope is not lost! We can rid ourselves of average selection bias by randomly assigning individuals to the treatment group (the group for whom $D_i = 1$) and to the control group (no treatment, $D_i = 0$).
Sample size matters
One takeaway from the cartoon on the previous slide is that sample size matters. In other words, if we are interested in estimating the average causal effect of, say, health insurance on health in the entire US population, we ought to have a large sample size with adequate numbers in both treatment and control groups in order to ensure that individuals are on average the same except for treatment status.
The technical reason that large sample sizes with random assignment rid us of selection bias is a concept in statistics called the law of large numbers. You don’t need to know it for this class, but the gist in this context is that if we jack up sample sizes we can make sample averages arbitrarily close to the corresponding population expectation.
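A quick simulation sketch shows both ideas at once: random assignment and the role of sample size. The population, the constant effect of 1, and the function name `randomized_difference` are all assumptions invented for this illustration.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible
KAPPA = 1.0     # assumed constant causal effect (hypothetical value)


def randomized_difference(n):
    """Randomly assign half of n people to treatment.

    Returns (difference in observed group means, gap in average y0).
    """
    # Hypothetical health index without insurance.
    y0 = [random.gauss(5.0, 1.0) for _ in range(n)]

    # Random assignment: shuffle the ids, first half gets treated.
    ids = list(range(n))
    random.shuffle(ids)
    treated_ids = set(ids[: n // 2])

    y0_treated = [y0[i] for i in range(n) if i in treated_ids]
    y0_control = [y0[i] for i in range(n) if i not in treated_ids]

    # Gap in untreated health between groups (the selection bias term).
    imbalance = mean(y0_treated) - mean(y0_control)
    # Observed comparison: treated people show y0 + KAPPA.
    difference = mean([v + KAPPA for v in y0_treated]) - mean(y0_control)
    return difference, imbalance


results = {n: randomized_difference(n) for n in (10, 1_000, 100_000)}
for n, (difference, imbalance) in results.items():
    print(f"n={n:>7}: difference={difference:.3f}  y0 gap={imbalance:+.3f}")
```

With only 10 people the two groups can differ quite a bit by chance. With 100,000 the gap in untreated health is close to zero and the group difference is close to the assumed effect.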
Perfectly balanced, as things should be
Even when using random assignment a wise researcher generally checks for balance. That is, the researcher checks to see that treatment and control groups actually look to be similar on average.
This is done by comparing group averages between treatment and control groups for demographic features and key variables associated with the outcome of interest.
For a precise measure of how close sample averages are we can use the tools of hypothesis testing. We will review hypothesis testing later in the semester with regression, but for now it is instructive to compare sample averages using an “eyeball test”.
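An “eyeball test” amounts to a table of group averages. The sketch below builds one for a made-up sample. The covariates (age and a female indicator), the sample itself, and the `balance_table` helper are hypothetical and not taken from the book.

```python
import random
from statistics import mean

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical sample: age in years and a female indicator (0 or 1).
people = [
    {"age": random.gauss(40.0, 10.0), "female": int(random.random() < 0.5)}
    for _ in range(20_000)
]

# Random assignment: a fair coin flip for each person.
for person in people:
    person["treated"] = int(random.random() < 0.5)


def balance_table(rows, covariates):
    """Return {covariate: (treatment mean, control mean, difference)}."""
    table = {}
    for name in covariates:
        t = mean([r[name] for r in rows if r["treated"] == 1])
        c = mean([r[name] for r in rows if r["treated"] == 0])
        table[name] = (t, c, t - c)
    return table


table = balance_table(people, ["age", "female"])
for name, (t, c, diff) in table.items():
    print(f"{name:>8}: treatment {t:7.3f}  control {c:7.3f}  diff {diff:+.3f}")
```

If random assignment worked, the differences in the last column should look small relative to the averages themselves.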
Checking for balance: example
The elusive ideal
Random assignment is powerful. It allows us to answer causal questions without worrying that selection bias is contaminating our results. However, if all economic questions could be answered with random assignment then I wouldn’t be here teaching this class.
For multiple reasons, random assignment may not be an option:
Unethical: When thinking about how education impacts income, it may be considered unethical to randomly assign the opportunity of education.
Infeasible: Sometimes random assignment is just infeasible. Imagine trying to randomly assign minimum wage raises to workers throughout the country.
So what do we do when we can’t use random assignment? We use some of the wonderful tools that I will teach you in this class to get as close to the standard of random assignment as we can.