Getting Started¶
zelig2 follows the same four-step workflow introduced by the original Zelig package (Imai, King, and Lau 2007, 2008): estimate, set scenarios, simulate, and visualize. If you have used Zelig before, the interface will be familiar.
Installation¶
Install zelig2 from GitHub:
Dependencies¶
zelig2 requires the following packages (installed automatically):
| Package | Purpose |
|---|---|
ggplot2 |
Visualization |
MASS |
Parameter simulation (mvrnorm), negative binomial models |
sandwich |
Robust and clustered standard errors |
survey |
Survey-weighted estimation |
patchwork |
Composing multi-panel plots |
AER |
Tobit models |
quantreg |
Quantile regression |
fixest |
Fixed effects models |
Core Workflow¶
Step 1: Estimate a Model¶
zelig2() accepts a formula, a model name, and a data frame --- mirroring the original zelig() interface from King, Tomz, and Wittenberg (2000). It returns a zelig2 object containing the fitted model and all metadata needed for simulation.
Step 2: Set a Counterfactual Scenario¶
setx() specifies the covariate values at which to compute quantities of interest. Variables not explicitly set default to their sample median (numeric) or mode (factor).
Step 3: Simulate Quantities of Interest¶
sim() implements the simulation-based approach to inference from King, Tomz, and Wittenberg (2000). It draws parameter vectors from \(\mathcal{N}(\hat{\beta}, \hat{V})\) and computes:
- Expected values (EV): \(E[Y \mid X]\), the systematic component.
- Predicted values (PV): Draws from the stochastic component, incorporating observation-level noise.
Step 4: Examine Results¶
First Differences¶
To estimate the causal effect of changing one covariate, set a baseline scenario with setx() and a contrast scenario with setx1():
z <- zelig2(mpg ~ hp + wt, model = "ls", data = mtcars)
z <- setx(z, hp = 100, wt = 3)
z <- setx1(z, hp = 200, wt = 3)
z <- sim(z)
summary(z)
The first difference is \(\text{EV}(\texttt{setx1}) - \text{EV}(\texttt{setx})\).
Range Scenarios¶
Pass a vector of values to one covariate to see how predictions change across its range:
z <- zelig2(mpg ~ hp + wt, model = "ls", data = mtcars)
z <- setx(z, hp = seq(50, 300, by = 50), wt = 3)
z <- sim(z)
plot(z)
This produces a ribbon plot showing the expected value and its 95% confidence band across the range of hp.
Pipe-Friendly¶
All functions return zelig2 objects, so the workflow chains naturally with R's pipe operator:
z <- zelig2(mpg ~ hp + wt, model = "ls", data = mtcars) |>
setx(hp = 150, wt = 3) |>
sim()
summary(z)
plot(z)
Robust Standard Errors¶
Specify vcov_type to use heteroskedasticity-consistent standard errors:
Available options: "HC0", "HC1" (alias "robust"), "HC2", "HC3", "HC4", "cluster", "bootstrap".
Survey Weights¶
Pass a numeric weight vector to weights:
For full survey design specifications, use survey_design, or the convenience arguments ids, strata, and fpc.
Fixed Effects¶
Use the pipe syntax in the formula or the fixef argument:
# Pipe syntax
z <- zelig2(y ~ x1 + x2 | state + year, model = "ls", data = panel)
# fixef argument
z <- zelig2(y ~ x1 + x2, model = "ls", data = panel,
fixef = ~ state + year)
Fixed effects are estimated via fixest and support cluster-robust standard errors.