SECTION 01

Causal inference

Methods for estimating causal effects from observational data using quasi-experimental variation (discontinuities, differential timing, excluded instruments, or covariate balance) without a fully specified structural model. Each method still rests on its own identifying assumption.

THE IDENTIFICATION PROBLEM

Observational data conflates treatment selection with treatment effects. Units that receive treatment differ systematically from those that don't — any naive comparison is confounded.

The goal is to find variation in treatment that is as-good-as-random: a policy cutoff, a natural experiment, an instrument. Each method isolates one such source of clean variation.

Choosing a method means choosing an identifying assumption. That assumption must be plausible, testable where possible, and clearly stated.

A GENERIC CAUSAL DAG

T → Y: The causal effect of interest

U → T, Y: Unobserved confounder (dashed); the source of the problem

Z → T: Instrument; shifts T but not Y directly

X → T, Y: Observed covariate; condition on it to close the path
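The graph can be simulated to see what each arrow does to an estimate. The following is a minimal base R sketch, not a recommended workflow: the linear form, the coefficients, the sample size, and the true effect of 2 are all assumptions chosen for illustration, and `trt` stands in for T.

```r
set.seed(1)
n <- 1e5
U <- rnorm(n)                                  # unobserved confounder
X <- rnorm(n)                                  # observed covariate
Z <- rnorm(n)                                  # instrument: enters trt only
trt <- 0.8 * Z + 0.6 * X + 0.7 * U + rnorm(n)  # treatment (T in the graph)
Y   <- 2 * trt + 0.5 * X + 1.0 * U + rnorm(n)  # assumed true effect of T on Y: 2

naive    <- coef(lm(Y ~ trt))["trt"]           # paths through X and U both open
adjusted <- coef(lm(Y ~ trt + X))["trt"]       # X path closed, U path still open
iv       <- cov(Z, Y) / cov(Z, trt)            # uses only Z-driven variation in trt

round(c(naive = unname(naive), adjusted = unname(adjusted), iv = iv), 3)
```

Under these assumed coefficients, the naive slope converges to about 2.40 and the X-adjusted slope to about 2.33, both above the true value of 2, while the instrument-based ratio converges to 2. Conditioning on X helps but cannot close the path through U.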

CORE ASSUMPTIONS

Overlap / positivity: Every unit has a probability of treatment strictly between 0 and 1, given covariates.

SUTVA: No interference between units; one version of treatment.

Ignorability / CIA: No unobserved confounders, given covariates.

Parallel trends (DiD): Treated and untreated trends would have matched absent treatment.

Exclusion restriction (IV): The instrument affects the outcome only through treatment.
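Of the assumptions listed, overlap is the one that can be inspected most directly in data. A minimal base R sketch, assuming a single covariate, a logistic propensity model, and 0.05 / 0.95 as rule-of-thumb cutoffs (all illustrative choices, not part of the original text):

```r
set.seed(2)
n <- 5000
x <- rnorm(n)
d <- rbinom(n, 1, plogis(0.5 * x))            # treatment probability depends on x

# Estimated propensity score from a logistic regression
ps <- fitted(glm(d ~ x, family = binomial))

# Range of the score within each treatment group: the ranges should overlap
range_by_group <- tapply(ps, d, range)

# Share of units with scores near 0 or 1 (cutoffs are an assumption)
share_extreme <- mean(ps < 0.05 | ps > 0.95)

range_by_group
share_extreme
```

A large share of extreme scores, or group ranges that barely intersect, suggests that some units have no credible comparison. The other assumptions are not verifiable this way and have to be argued from the design.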

EXAMPLE — TWFE ESTIMATOR

did_estimate.R

library(fixest)

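# Event-study form of two-way fixed effects DiD:
# year-by-treated interactions, with 2013 as the omitted reference year
feols(
  insured ~ i(year, treated, ref = 2013) | state + year,  # state and year fixed effects
  data = acs_panel,
  cluster = ~state  # cluster standard errors by state
) |> iplot(
  main = "Effect of Medicaid expansion",
  xlab = "Year (reference: 2013)"  # x-axis is calendar year, not event time
)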

Full pipeline with data prep, diagnostics, and output → Chapter 01 — DiD

METHODS IN THIS SECTION

01 DiD

Difference-in-differences

Compares changes over time between treated and untreated groups. The workhorse of policy evaluation: valid even when groups select into treatment on fixed level differences, so long as their outcome trends would have matched absent treatment.

∴ Parallel trends | Panel · repeated cross-section | fixest, did
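The two-group, two-period case makes the logic concrete. This is a minimal base R sketch on simulated repeated cross-sections; the level gap of 1.0, the common trend of 0.5, and the true effect of 0.3 are assumptions for illustration.

```r
set.seed(3)
n <- 4000
treated <- rbinom(n, 1, 0.5)
post    <- rbinom(n, 1, 0.5)

# Treated group starts 1.0 higher (selection on levels), both groups share
# a +0.5 time trend, and the assumed true treatment effect is 0.3
y <- 1.0 * treated + 0.5 * post + 0.3 * treated * post + rnorm(n, sd = 0.5)

# 2x2 table of cell means: rows are treated (0/1), columns are post (0/1)
m <- tapply(y, list(treated, post), mean)

# Change for the treated minus change for the untreated
did_means <- (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])

# The same number as the interaction coefficient in a saturated regression
did_reg <- coef(lm(y ~ treated * post))["treated:post"]

c(did_means = did_means, did_reg = unname(did_reg))
```

The level difference between groups drops out of the double difference, which is why selection on fixed characteristics does not bias the estimate here. A group-specific trend would not drop out.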


02 ES

Event study

Plots treatment effects at each period relative to the event. Tests pre-trends visually and estimates dynamic effects — essential for any DiD specification.

∴ No anticipation · parallel trends | Panel data | fixest, did
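A minimal base R sketch of the event-study regression on a simulated balanced panel with a single treatment date. The panel dimensions, the linear common trend, and the assumed effect path (0 before the event, then 0.5, 1.0, 1.5, 2.0) are illustrative; the `ev` matrix and its column names are hypothetical helpers, not part of any package.

```r
set.seed(4)
units   <- 200
periods <- -4:3                                  # event time; treatment starts at 0
panel   <- expand.grid(id = 1:units, rel = periods)
panel$treated <- as.integer(panel$id <= units / 2)
unit_fe <- rnorm(units)

# Assumed dynamic effect: zero before the event, growing afterwards
effect  <- ifelse(panel$rel >= 0, 0.5 * (panel$rel + 1), 0)
panel$y <- unit_fe[panel$id] + 0.2 * panel$rel +
  panel$treated * effect + rnorm(nrow(panel), sd = 0.5)

# One treated-by-period dummy per event time, omitting rel = -1 as the reference
ks <- setdiff(periods, -1)
ev <- sapply(ks, function(k) panel$treated * (panel$rel == k))
colnames(ev) <- paste0("k", ks)

# Unit and period fixed effects entered as factors
fit <- lm(y ~ ev + factor(id) + factor(rel), data = panel)
est <- coef(fit)[paste0("ev", colnames(ev))]
round(est, 2)
```

Coefficients for k-4 to k-2 are the pre-trend check and should sit near zero in this simulation; coefficients for k0 to k3 trace the dynamic effect. Flat pre-period estimates are supportive evidence for parallel trends, not proof of it.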


03 IV

Instrumental variables

Exploits a variable that shifts treatment but has no direct path to the outcome. Under monotonicity (no defiers), recovers a LATE for compliers. Requires careful instrument selection and weak-instrument diagnostics.

∴ Relevance · exclusion · independence · monotonicity | Cross-section or panel | ivreg, fixest
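The complier interpretation can be checked in a simulation with a binary instrument. This is a minimal base R sketch of the Wald estimator; the population shares and the type-specific effects (compliers 1, always-takers 3, never-takers 0) are assumptions for illustration.

```r
set.seed(5)
n <- 1e5
type <- sample(c("complier", "always", "never"), n,
               replace = TRUE, prob = c(0.5, 0.2, 0.3))
z <- rbinom(n, 1, 0.5)                           # randomized binary instrument

# Only compliers respond to z; there are no defiers, so monotonicity holds
d <- ifelse(type == "always", 1, ifelse(type == "never", 0, z))

# Assumed treatment effects by type
tau <- c(complier = 1, always = 3, never = 0)[type]
y   <- tau * d + rnorm(n)

first_stage  <- mean(d[z == 1]) - mean(d[z == 0])  # estimates the complier share
reduced_form <- mean(y[z == 1]) - mean(y[z == 0])  # effect of z on y
wald         <- reduced_form / first_stage         # LATE estimate

c(first_stage = first_stage, wald = wald)
```

The ratio lands near 1, the assumed complier effect, rather than the population average of 1.1 (0.5 × 1 + 0.2 × 3 + 0.3 × 0). The instrument says nothing about always-takers or never-takers because it never changes their treatment.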


04 RDD

Regression discontinuity

Identifies causal effects near an arbitrary cutoff in an assignment rule. Sharp designs give clean identification; fuzzy designs require an IV argument. Bandwidth selection is critical.

∴ Continuity at threshold | Running variable + outcome | rdrobust, rddensity
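A minimal base R sketch of a sharp design estimated by local linear regression with a triangular kernel. The cutoff at 0, the true jump of 0.4, and especially the hand-picked bandwidth `h = 0.25` are assumptions; in practice the bandwidth would be chosen by a data-driven procedure such as the one in rdrobust.

```r
set.seed(6)
n <- 5000
x <- runif(n, -1, 1)                  # running variable, cutoff at 0
d <- as.integer(x >= 0)               # sharp design: treatment is a step in x
y <- 1 + 0.8 * x + 0.4 * d + rnorm(n, sd = 0.3)   # assumed true jump: 0.4

h <- 0.25                             # bandwidth fixed by hand (assumption)
w <- pmax(0, 1 - abs(x) / h)          # triangular kernel: weight falls to 0 at |x| = h

# Separate linear fits on each side of the cutoff; the coefficient on d is
# the gap between the two fitted lines at x = 0
fit    <- lm(y ~ d * x, weights = w, subset = w > 0)
rd_est <- coef(fit)["d"]
rd_est
```

Only observations within `h` of the cutoff contribute, and those closest to it count most. Re-running with a few different values of `h` is a quick way to see how sensitive the estimate is to that choice.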


05 Match

Matching & IPW

Constructs a valid comparison group by balancing observed covariates through matching or reweighting. Assumes no unobserved confounding — every common cause of T and Y must be observed.

∴ Conditional ignorability (CIA) | Cross-section or panel | MatchIt, WeightIt
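A minimal base R sketch of the reweighting half of this card, using a normalized (Hajek-style) inverse probability weighting estimator of the ATE. The single confounder `x`, the logistic selection model, and the true effect of 1 are assumptions; the example works only because every common cause of treatment and outcome is observed by construction.

```r
set.seed(7)
n <- 20000
x <- rnorm(n)
d <- rbinom(n, 1, plogis(x))                  # selection on the observed x
y <- 1 * d + 2 * x + rnorm(n)                 # assumed true ATE: 1

# Raw difference in means, confounded by x
naive <- mean(y[d == 1]) - mean(y[d == 0])

# Estimated propensity score and inverse probability weights
ps <- fitted(glm(d ~ x, family = binomial))
w  <- ifelse(d == 1, 1 / ps, 1 / (1 - ps))

# Weighted difference in means, with weights normalized within each group
ipw <- weighted.mean(y[d == 1], w[d == 1]) -
  weighted.mean(y[d == 0], w[d == 0])

c(naive = naive, ipw = ipw)
```

The naive difference is well above 1 because treated units have higher `x`; the weighted difference lands near 1. If `x` were dropped from the propensity model, the weights would no longer balance it and the bias would return.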


06 SC

Synthetic control

Constructs a weighted combination of control units that matches the pre-treatment trajectory of the treated unit. Best suited for comparative case studies with a single treated aggregate.

∴ Convex hull · pre-period fit | Aggregate panel · few treated units | Synth, tidysynth
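A minimal base R sketch of the weight-fitting step: nonnegative donor weights that sum to one, chosen to minimize pre-period squared error. It matches on pre-treatment outcomes only, which is a simplification of what packages such as Synth do. The donor pool, the true weights, and the helper names `full_w`, `loss`, and `grad` are assumptions for illustration.

```r
set.seed(8)
T0 <- 20                                       # pre-treatment periods
J  <- 5                                        # donor units
donors  <- matrix(rnorm(T0 * J), T0, J) + rep(1:J, each = T0)
true_w  <- c(0.6, 0.4, 0, 0, 0)                # treated unit lies inside the convex hull
treated <- drop(donors %*% true_w) + rnorm(T0, sd = 0.05)

# Optimize over the first J - 1 weights; the last is 1 minus their sum
full_w <- function(th) c(th, 1 - sum(th))
loss   <- function(th) sum((treated - donors %*% full_w(th))^2)
grad   <- function(th) {
  g <- drop(-2 * t(donors) %*% (treated - donors %*% full_w(th)))
  g[-J] - g[J]
}

# Linear constraints ui %*% th - ci >= 0: each weight >= 0 and sum(th) <= 1
ui  <- rbind(diag(J - 1), rep(-1, J - 1))
ci  <- c(rep(0, J - 1), -1)
fit <- constrOptim(rep(1 / J, J - 1), loss, grad, ui = ui, ci = ci)

w_hat    <- full_w(fit$par)
pre_rmse <- sqrt(mean((treated - donors %*% w_hat)^2))
round(w_hat, 3)
pre_rmse
```

A small pre-period RMSE is what licenses using the weighted donors as the counterfactual after treatment. If the treated unit sat outside the convex hull of the donors, no weights on the simplex could reproduce its trajectory and the fit would stay poor.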
