Applied Micro
Causal inference, selection bias, and research design
Causal Inference
The central question in applied microeconomics: Does X cause Y? This is much harder than it sounds.
Correlation vs. Causation
Consider this regression:
* Stata
reg health_outcomes hospital_visits

# R
lm(health_outcomes ~ hospital_visits, data = data)
You might find that more hospital visits are associated with worse health outcomes. Does this mean hospitals make people sicker? Of course not. People who visit hospitals more are already sicker.
The Fundamental Problem of Causal Inference
We can never observe the same person in both the treated and untreated state at the same time. The "treatment effect" is the difference between what did happen and what would have happened without treatment. But we only observe one of these.
The Potential Outcomes Framework
For each individual i, define:
- Y_i(1) = outcome if treated
- Y_i(0) = outcome if not treated
- Treatment effect = Y_i(1) - Y_i(0)
The problem: we observe Y_i(1) or Y_i(0), never both. The unobserved outcome is called the counterfactual.
Average Treatment Effect (ATE)
Since we can't measure individual treatment effects, we focus on averages:
- ATE = E[Y(1) - Y(0)] = Average effect across everyone
- ATT = E[Y(1) - Y(0) | D=1] = Average effect on the treated
- LATE = Effect on "compliers" (those whose treatment status changes due to an instrument)
Selection Bias
The naive comparison of treated vs. untreated groups gives:
E[Y | D=1] - E[Y | D=0] = E[Y(1) - Y(0) | D=1] + (E[Y(0) | D=1] - E[Y(0) | D=0])
= ATT + Selection Bias
Selection bias arises when treated and untreated groups would differ in outcomes even without treatment, i.e., when E[Y(0) | D=1] ≠ E[Y(0) | D=0].
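To make the decomposition concrete, here is a minimal R simulation (all variable names hypothetical) in which sicker individuals select into treatment; the naive comparison makes treatment look harmful even though the true effect is positive:
# Simulate selection: sicker individuals are more likely to be treated
set.seed(42)
n <- 10000
sickness <- rnorm(n)                 # unobserved confounder
y0 <- 50 - 5 * sickness + rnorm(n)   # potential outcome without treatment
y1 <- y0 + 2                         # treatment improves everyone's outcome by 2
d  <- as.integer(sickness > 0)       # sicker individuals select into treatment
y  <- ifelse(d == 1, y1, y0)         # we observe only one potential outcome

mean(y[d == 1]) - mean(y[d == 0])    # naive comparison: roughly -6
mean(y1 - y0)                        # ATE = 2
mean(y1[d == 1] - y0[d == 1])        # ATT = 2
mean(y0[d == 1]) - mean(y0[d == 0])  # selection bias: roughly -8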
Sources of Selection Bias
Omitted Variable Bias
A variable affects both treatment and outcome. Example: ability affects both college attendance and earnings, so the naive estimate of the college earnings premium is biased upward.
Reverse Causality
The outcome affects the treatment. Example: countries with strong economic growth attract more foreign aid, so a regression of growth on aid partly captures causality running from growth to aid rather than the reverse.
Self-Selection
People choose treatment based on expected benefits. Example: Workers who expect to gain more from training are more likely to enroll.
Checking for Selection
Compare observable characteristics between groups:
* Compare means by treatment status
ttest age, by(treatment)
ttest income, by(treatment)
ttest education, by(treatment)
* If treated group is older, richer, more educated...
* ...simple comparison of outcomes is biased
# Compare means by treatment status
t.test(age ~ treatment, data = data)
t.test(income ~ treatment, data = data)
t.test(education ~ treatment, data = data)
# If treated group is older, richer, more educated...
# ...simple comparison of outcomes is biased
Selection on Observables vs. Unobservables
Even if treatment and control groups look similar on observable characteristics, they may differ on unobservable ones (motivation, ability, risk preferences). This is why randomized experiments are the gold standard.
Research Designs
Different strategies for eliminating selection bias. This follows the "Furious Five" framework from Angrist & Pischke's Mastering 'Metrics.
The Furious Five Toolkit
- RCTs: Gold standard. Random assignment eliminates selection bias by design.
- Regression with Controls: Assumes selection on observables—conditional on controls, treatment is as good as random.
- IV: Find an instrument that shifts treatment but affects the outcome only through treatment. Identifies LATE for compliers.
- RD: Exploit sharp cutoffs. Units just above/below threshold are locally randomized.
- DiD: Compare changes over time. Assumes parallel trends absent treatment.
1. Randomized Controlled Trials (RCTs)
Random assignment ensures treatment is independent of potential outcomes. The treated and control groups are identical on average in all characteristics, observed and unobserved.
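Reusing n, y0, and y1 from the selection-bias simulation above, a short sketch of how random assignment removes the bias:
# Randomize treatment: the naive comparison now recovers the ATE
d_rand <- sample(c(0, 1), n, replace = TRUE)
y_rand <- ifelse(d_rand == 1, y1, y0)
mean(y_rand[d_rand == 1]) - mean(y_rand[d_rand == 0])  # approximately 2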
2. Difference-in-Differences (DiD)
Compare changes over time between a treated group and a control group. Requires a "parallel trends" assumption: absent treatment, both groups would have followed the same trajectory.
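A minimal two-group, two-period DiD regression in R (hypothetical data frame panel with a treated-group dummy and a post-period dummy); the coefficient on the interaction is the DiD estimate:
# DiD: the coefficient on treated:post is the difference-in-differences estimate
did <- lm(outcome ~ treated * post, data = panel)
summary(did)
# Credibility rests on parallel trends: plot pre-period outcomes for both groups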
3. Instrumental Variables (IV)
Find a variable (instrument) that affects treatment but only affects the outcome through treatment. Classic example: draft lottery as an instrument for military service.
A valid instrument must satisfy:
- Relevance: The instrument affects treatment (testable)
- Exclusion: The instrument only affects the outcome through treatment (not testable)
- Independence: The instrument is as good as randomly assigned
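A two-stage least squares sketch using ivreg() from the AER package, following the draft-lottery example with hypothetical variable names; the formula lists the structural equation before the bar and the instrument after it:
library(AER)  # provides ivreg() for two-stage least squares

# Earnings on military service, instrumented by draft eligibility
iv <- ivreg(earnings ~ service | draft_eligible, data = dat)
summary(iv, diagnostics = TRUE)  # includes a weak-instrument (relevance) test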
4. Regression Discontinuity (RD)
When treatment is assigned based on a cutoff (test score, age, income threshold), compare units just above and below the cutoff. These units are essentially randomly assigned to treatment.
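A local linear sharp-RD sketch in base R (hypothetical running variable score, cutoff, and bandwidth); dedicated packages such as rdrobust handle bandwidth selection and inference more carefully:
cutoff <- 50
bw     <- 5                                      # hand-picked bandwidth
rd_dat <- subset(dat, abs(score - cutoff) < bw)  # keep observations near the cutoff
rd_dat$above <- as.integer(rd_dat$score >= cutoff)

# Separate slopes on each side; the coefficient on 'above' is the jump at the cutoff
rd <- lm(outcome ~ above * I(score - cutoff), data = rd_dat)
coef(rd)["above"]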
5. Fixed Effects / Panel Data
With repeated observations of the same units over time, you can control for all time-invariant characteristics of each unit (observed and unobserved).
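A minimal two-way fixed-effects sketch in base R (hypothetical panel with unit and year identifiers); unit dummies absorb all time-invariant unit characteristics, and packages like fixest or plm handle large panels more efficiently:
# Two-way fixed effects: dummies for each unit and each year
fe <- lm(outcome ~ treatment + factor(unit) + factor(year), data = panel)
summary(fe)$coefficients["treatment", ]  # estimate, SE, t, p for the treatment effect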
Choosing a Research Design
The Design Should Match the Question
Don't pick a method and then look for a question. Start with a question and ask: "What would I need to observe to answer this credibly?" The research design follows from the question and available data.
Questions to Ask
- Is there random or quasi-random variation? Look for lotteries, arbitrary cutoffs, natural experiments.
- What's the source of selection? If you can name it, you might be able to control for it.
- What's the counterfactual? Who or what would the treated group look like without treatment?
- What assumptions are needed? Can they be tested or at least made plausible?
- What's the relevant margin? LATE at a cutoff might not generalize to the whole population.
Red Flags in Applied Work
- "We control for X, Y, and Z" - but no discussion of why these controls are sufficient
- Instrument that obviously affects outcome directly
- DiD with clearly non-parallel pre-trends
- Treatment variable that's likely measured with error
- Results that flip sign with minor specification changes
Finding a Project
A good empirical project requires three things that fit together:
Question + Data + Identification
- Question: Something that interests you and matters for policy or understanding
- Data: An accessible dataset that measures the variables you need
- Identification: A source of plausibly exogenous variation in treatment
The hard part is finding a project where all three align. You need variation in treatment that isn't driven by the same factors that affect the outcome.
Public Data Sources
For coursework, you'll likely use publicly available data. Here are some good places to start:
Survey Data
- IPUMS: Census, ACS, CPS microdata (ipums.org)
- NLSY: National Longitudinal Survey of Youth
- PSID: Panel Study of Income Dynamics
- GSS: General Social Survey
- ANES: American National Election Studies
Administrative & Economic Data
- FRED: Federal Reserve Economic Data
- BLS: Bureau of Labor Statistics
- CDC WONDER: Health and mortality data
- FBI UCR: Crime statistics
- NCES: Education statistics
Replication Data
- AEA Data Editor: Replication packages from AER, AEJ
- Harvard Dataverse: Shared research data
- ICPSR: Social science data archive
Sources of Identification
Where does exogenous variation come from? Look for:
- Policy changes: New laws, regulations, or program implementations that vary across states or over time
- Arbitrary cutoffs: Age thresholds, income limits, test score requirements
- Natural experiments: Weather events, draft lotteries, judge assignments
- Timing variation: Staggered rollouts of policies across jurisdictions
- Geographic boundaries: State borders, school district lines
Matching the Three
Start from any corner of the triangle:
- Start with a question: "Does X affect Y?" Then ask: where would I find exogenous variation in X, and what data measures both X and Y?
- Start with data: Browse a dataset you have access to. What treatments vary? What outcomes are measured? What comparisons are credible?
- Start with identification: Find a natural experiment or policy change. What outcomes might it affect? Is there data that captures both?
Common Pitfalls
- Interesting question, no identification: "Does education cause higher earnings?" is important but hard to answer causally without exogenous variation
- Clean identification, boring question: A perfect RD at an arbitrary threshold might not teach us much if the margin isn't policy-relevant
- Good idea, no data: The perfect dataset for your question may not exist or may not be accessible