26.7 Two-way Fixed-effects | A Guide on Data Analysis (2024)

26.7 Two-way Fixed-effects

A generalization of the difference-in-differences (DiD) model is the two-way fixed-effects (TWFE) model, which accommodates multiple groups and multiple time periods. However, TWFE is not a design-based, non-parametric causal estimator (Imai and Kim 2021).

When TWFE is applied to multiple groups and multiple periods, the supposedly causal coefficient is a weighted average of all two-group/two-period DiD estimators in the data, and some of the weights can be negative. More specifically, the weights are proportional to group sizes and to the variation of the treatment indicator within each pair, so groups treated near the middle of the panel receive the highest weight.
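To make the negative-weight problem concrete, here is a minimal sketch in base R. The toy panel (two units, three periods, an effect that grows with time since treatment) is an assumption made up for illustration, not data from any of the cited papers. It uses the Frisch-Waugh-Lovell theorem to recover the weight that TWFE places on each treated cell.

```r
# Toy panel: two units, three periods, staggered adoption.
# The untreated potential outcome is 0 everywhere, so Y equals the realized effect.
panel <- data.frame(
  unit = rep(c("A", "B"), each = 3),
  time = rep(1:3, times = 2),
  W    = c(0, 1, 1,   # unit A is treated from period 2
           0, 0, 1),  # unit B is treated from period 3
  tau  = c(0, 1, 2,   # realized effect grows with time since treatment
           0, 0, 1)
)
panel$Y <- panel$W * panel$tau

# TWFE estimate of the treatment coefficient
fit_twfe <- lm(Y ~ W + factor(unit) + factor(time), data = panel)
tau_twfe <- unname(coef(fit_twfe)["W"])

# Simple average of the effects over treated cells
true_att <- mean(panel$tau[panel$W == 1])

# Frisch-Waugh-Lovell: residualize W on the fixed effects;
# the implied weight on each treated cell is w_res / sum(w_res^2)
w_res <- resid(lm(W ~ factor(unit) + factor(time), data = panel))
cell_weights <- unname((w_res / sum(w_res^2))[panel$W == 1])

tau_twfe      # 0.5
true_att      # 1.333333
cell_weights  # 1.0 -0.5 0.5 (unit A in period 3 gets a negative weight)
```

In this toy case the weights sum to one, but the largest effect (unit A in period 3) enters negatively, which pulls the TWFE coefficient well below the average effect on the treated cells.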

The canonical/standard TWFE only works when

  • Effects are homogeneous across units and across time periods (i.e., no dynamic changes in the effects of treatment). See (Goodman-Bacon 2021; Clément De Chaisemartin and d’Haultfoeuille 2020; L. Sun and Abraham 2021; Borusyak, Jaravel, and Spiess 2021) for details. Similarly, it relies on the assumption of linear additive effects (Imai and Kim 2021).

    • You have to argue why treatment heterogeneity is not a problem (e.g., plot treatment timing and decompose the treatment coefficient using the Goodman-Bacon decomposition) and know the percentage of observations that are never treated, because the bias of TWFE decreases as the never-treated group grows (with 80% of the sample never treated, the bias is negligible). The problem worsens when you have long-run effects.

    • You need to manually drop two relative time periods if everyone is eventually treated (to avoid multicollinearity). Programs might pick the dropped periods randomly, and dropping a post-treatment period creates bias. The usual choice is periods -1 and -2.

    • Treatment heterogeneity can come in because (1) it might take some time for a treatment to have measurable changes in outcomes or (2) for each period after treatment, the effect can be different (phase in or increasing effects).

  • 2 time periods.

Within this setting, TWFE works because, using the baseline (e.g., control units whose treatment status is unchanged across time periods), the comparison can be

  • Good for

    • Newly treated units vs. control

    • Newly treated units vs. not-yet-treated units

  • Bad for

    • Newly treated vs. already treated (because already-treated units cannot serve as the potential outcome for the newly treated).
    • Settings that violate strict exogeneity (i.e., time-varying confounders, feedback from past outcome to treatment) (Imai and Kim 2019)
    • Settings that violate the assumed functional form (i.e., treatment effect homogeneity and no carryover effects or anticipation effects) (Imai and Kim 2019)
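The good and bad comparisons listed above correspond to the comparison types reported by the Goodman-Bacon decomposition. The sketch below assumes the bacondecomp package is installed and uses its bundled castle data; it shows how much weight each comparison type receives and checks that the weighted 2x2 estimates add up to the TWFE coefficient.

```r
library(bacondecomp)

# Decompose the TWFE estimate of `post` on log homicides into 2x2 DiD comparisons
df_bacon <- bacon(l_homicide ~ post,
                  data = bacondecomp::castle,
                  id_var = "state",
                  time_var = "year")

# Total weight carried by each comparison type
# (e.g., later-treated units compared against already-treated units)
weight_by_type <- aggregate(weight ~ type, data = df_bacon, FUN = sum)
weight_by_type

# The weighted sum of all 2x2 estimates reproduces the TWFE coefficient
twfe_from_bacon <- sum(df_bacon$estimate * df_bacon$weight)
twfe_from_bacon
```

If most of the weight sits on treated vs. untreated comparisons, the comparisons against already-treated units matter less for the overall estimate.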

Note: Notation for this section is consistent with (2020)

\[Y_{it} = \alpha_i + \lambda_t + \tau W_{it} + \beta X_{it} + \epsilon_{it}\]

where

  • \(Y_{it}\) is the outcome

  • \(\alpha_i\) is the unit FE

  • \(\lambda_t\) is the time FE

  • \(\tau\) is the causal effect of treatment

  • \(W_{it}\) is the treatment indicator

  • \(X_{it}\) are covariates

When \(T = 2\), TWFE is the traditional DiD model.
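As a quick check of the two-period case, the sketch below builds a small made-up balanced panel (four units, two periods, values chosen only for illustration) and compares the TWFE coefficient with the DiD computed by hand from group means.

```r
# Hypothetical balanced panel: units 1-2 are treated in period 2, units 3-4 never
did_panel <- data.frame(
  unit    = rep(1:4, each = 2),
  time    = rep(1:2, times = 4),
  treated = rep(c(1, 1, 0, 0), each = 2),
  Y       = c(2, 5,  3, 7,  1, 2,  4, 6)
)
did_panel$W <- did_panel$treated * (did_panel$time == 2)

# DiD by hand: (treated post - treated pre) - (control post - control pre)
m <- with(did_panel, tapply(Y, list(treated, time), mean))
did_by_hand <- (m["1", "2"] - m["1", "1"]) - (m["0", "2"] - m["0", "1"])

# TWFE regression with unit and time fixed effects
fit <- lm(Y ~ W + factor(unit) + factor(time), data = did_panel)
tau_twfe_2x2 <- unname(coef(fit)["W"])

did_by_hand   # 2
tau_twfe_2x2  # 2
```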

Under the following assumptions, \(\hat{\tau}_{OLS}\) is unbiased:

  1. homogeneous treatment effect
  2. parallel trends assumptions
  3. linear additive effects (Imai and Kim 2021)

Remedies for TWFE’s shortcomings

  • (Goodman-Bacon 2021): diagnostic robustness tests of the TWFE DiD and identify influential observations to the DiD estimate (Goodman-Bacon Decomposition)

  • (Callaway and Sant’Anna 2021): 2-step estimation with a bootstrap procedure that can account for autocorrelation and clustering.

    • the parameters of interest are the group-time average treatment effects, where each group is defined by when it was first treated (Multiple periods and variation in treatment timing)

    • Comparing post-treatment outcomes for groups treated in a period against a similar group that is never treated (using matching).

    • Treatment status cannot switch (once treated, stay treated for the rest of the panel)

    • Package: did

  • (L. Sun and Abraham 2021): a specialization of (Callaway and Sant’Anna 2021) in the event-study context.

    • They include lags and leads in their design

    • They have cohort-specific estimates (similar to group-time estimates in (Callaway and Sant’Anna 2021))

    • They propose the “interaction-weighted” estimator.

    • Package: fixest

  • (Imai and Kim 2021)

    • Different from (Callaway and Sant’Anna 2021) because they allow units to switch in and out of treatment.

    • Based on matching methods, to have weighted TWFE

    • Package: wfe and PanelMatch

  • (Gardner 2022): two-stage DiD

    • Package: did2s

  • In cases with an unaffected unit (i.e., never-treated), using the exposure-adjusted difference-in-differences estimators can recover the average treatment effect (Clément De Chaisemartin and d’Haultfoeuille 2020). However, if you want to see the treatment effect heterogeneity (in cases where the true heterogeneous treatment effects vary by the exposure rate), exposure-adjusted DiD still fails (L. Sun and Shapiro 2022).

  • (2020): see below
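As one illustration of the remedies listed above, the interaction-weighted estimator of (L. Sun and Abraham 2021) can be sketched with fixest. This assumes the fixest package is installed and uses its bundled simulated data set base_stagg, where year_treated is the cohort (first treatment year) and never-treated units carry a year outside the sample.

```r
library(fixest)
data(base_stagg)

# Interaction-weighted event-study estimator:
# sunab(cohort, period) interacts cohort dummies with relative-time dummies
res_sa <- feols(y ~ x1 + sunab(year_treated, year) | id + year,
                data = base_stagg)

# Event-study coefficients by relative period
summary(res_sa)

# Aggregate the post-treatment coefficients into a single ATT
summary(res_sa, agg = "att")
```

The other packages in the list (did, did2s, wfe, PanelMatch) follow their own interfaces, so check each package's documentation before adapting this sketch.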

To be robust against time- and unit-varying effects, we can use the reshaped inverse probability weighting (RIPW)-TWFE estimator.

With the following assumptions:

  • SUTVA

  • Binary treatment: \(\mathbf{W}_i = (W_{i1}, \dots, W_{iT})\) where \(\mathbf{W}_i \sim \mathbf{\pi}_i\), the generalized propensity score (i.e., each unit's treatment likelihood follows \(\mathbf{\pi}_i\) regardless of the period)

Then, the unit-time specific effect is \(\tau_{it} = Y_{it}(1) - Y_{it}(0)\)

Then the Doubly Average Treatment Effect (DATE) is

\[\tau(\xi) = \sum_{t=1}^T \xi_t \left(\frac{1}{n} \sum_{i = 1}^n \tau_{it} \right)\]

where

  • \(\frac{1}{n} \sum_{i = 1}^n \tau_{it}\) is the unweighted effect of treatment across units (i.e., time-specific ATE).

  • \(\xi = (\xi_1, \dots, \xi_T)\) are user-specified weights for each time period.

  • This estimand is called DATE because it’s weighted (averaged) across both time and units.

A special case of DATE is when both time and unit-weights are equal

\[\tau_{eq} = \frac{1}{nT} \sum_{t=1}^T \sum_{i = 1}^n \tau_{it}\]
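The DATE formulas above can be checked numerically. The sketch below uses a made-up matrix of unit-time effects (2 units, 3 periods) and assumes the time weights sum to one, which is needed for the equal-weight case to match \(\tau_{eq}\); the function name date_estimand is hypothetical.

```r
# Hypothetical unit-time effects tau_it: rows are units, columns are periods
tau <- matrix(c(1, 2, 3,
                3, 4, 5),
              nrow = 2, byrow = TRUE)

# DATE: weight the time-specific ATEs (column means) by xi
date_estimand <- function(tau, xi) {
  stopifnot(length(xi) == ncol(tau), abs(sum(xi) - 1) < 1e-8)
  sum(xi * colMeans(tau))
}

# Equal time weights recover the overall average of tau_it
tau_eq <- date_estimand(tau, rep(1 / ncol(tau), ncol(tau)))
tau_eq     # 3, same as mean(tau)

# Weights that ignore the first period
tau_late <- date_estimand(tau, c(0, 0.5, 0.5))
tau_late   # 3.5
```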

Borrowing the idea of the inverse propensity-weighted least squares estimator from the cross-sectional case, we reweight the objective function via the treatment assignment mechanism:

\[\hat{\tau} \triangleq \arg \min_{\tau} \sum_{i = 1}^n (Y_i -\mu - W_i \tau)^2 \frac{1}{\pi_i (W_i)}\]

where

  • the first term is the least squares objective

  • the second term is the inverse of the propensity score, \(1 / \pi_i(W_i)\)
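The cross-sectional version can be sketched with weighted least squares in base R. The data below are a made-up, noise-free population with one binary confounder x and a known propensity score (both are assumptions for illustration); the true effect is 2.

```r
# Hypothetical population: cell counts match propensities e(x=0)=0.25, e(x=1)=0.75
cells <- data.frame(x = c(0, 0, 1, 1),
                    W = c(1, 0, 1, 0),
                    n = c(25, 75, 75, 25))
dat <- cells[rep(seq_len(nrow(cells)), cells$n), c("x", "W")]

# Outcome with a true treatment effect of 2 and confounding through x
dat$Y <- 1 + 2 * dat$W + 3 * dat$x

# pi_i(W_i): probability of the treatment status each unit actually received
e_x <- ifelse(dat$x == 1, 0.75, 0.25)
dat$pi_w <- ifelse(dat$W == 1, e_x, 1 - e_x)

# Unweighted least squares is confounded by x
tau_naive <- unname(coef(lm(Y ~ W, data = dat))["W"])

# Reweighting the least squares objective by 1 / pi_i(W_i) removes the confounding
tau_ipw <- unname(coef(lm(Y ~ W, data = dat, weights = 1 / pi_w))["W"])

tau_naive  # 3.5
tau_ipw    # 2
```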

In the panel data case, the IPW estimator will be

\[\hat{\tau}_{IPW} \triangleq \arg \min_{\tau} \sum_{i = 1}^n \sum_{t =1}^T (Y_{i t}-\alpha_i - \lambda_t - W_{it} \tau)^2 \frac{1}{\pi_i (W_i)}\]

Then, so that users can specify the structure of the time weights in the DATE, we use the reshaped IPW estimator (2020)

\[\hat{\tau}_{RIPW} (\Pi) \triangleq \arg \min_{\tau} \sum_{i = 1}^n \sum_{t =1}^T (Y_{i t}-\alpha_i - \lambda_t - W_{it} \tau)^2 \frac{\Pi(W_i)}{\pi_i (W_i)}\]

where the estimator is a function of a data-independent distribution \(\Pi\) that depends on the support of the treatment path \(\mathbb{S} = \cup_i Supp(W_i)\)

This generalization reduces to

  • IPW-TWFE estimator when \(\Pi \sim Unif(\mathbb{S})\)

  • randomized experiment when \(\Pi = \pi_i\)
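Read as a weighted regression, the RIPW objective is TWFE with unit-level weights \(\Pi(W_i) / \pi_i(W_i)\). The sketch below is only an illustration under stated assumptions: the panel, the assignment probabilities pi_obs, and the helper ripw_twfe are all hypothetical, and the fixed effects are minimized jointly with \(\tau\) through lm. It is not the authors' implementation.

```r
# Hypothetical panel: 4 units, 3 periods, three distinct treatment paths
panel <- data.frame(
  unit = rep(c("A", "B", "C", "D"), each = 3),
  time = rep(1:3, times = 4),
  W    = c(0, 1, 1,  0, 0, 1,  0, 0, 0,  0, 1, 1),
  Y    = c(1, 4, 6,  2, 3, 5,  0, 1, 2,  3, 5, 8)
)

# Hypothetical pi_i(W_i): probability of each unit's observed treatment path
pi_obs <- c(A = 0.5, B = 0.3, C = 0.2, D = 0.4)

# Weighted TWFE with weights Pi(W_i) / pi_i(W_i), constant within unit
ripw_twfe <- function(panel, Pi_obs, pi_obs) {
  w <- (Pi_obs / pi_obs)[as.character(panel$unit)]
  fit <- lm(Y ~ W + factor(unit) + factor(time), data = panel, weights = w)
  unname(coef(fit)["W"])
}

# Pi uniform over the 3 paths in the support: weights proportional to 1 / pi_i
Pi_unif <- c(A = 1 / 3, B = 1 / 3, C = 1 / 3, D = 1 / 3)
tau_ipw_twfe <- ripw_twfe(panel, Pi_unif, pi_obs)

# Pi equal to pi_i: every weight is 1, so this is plain unweighted TWFE
tau_same <- ripw_twfe(panel, pi_obs, pi_obs)
tau_unweighted <- unname(coef(
  lm(Y ~ W + factor(unit) + factor(time), data = panel))["W"])
```

In practice \(\pi_i\) is unknown outside of experiments and has to be estimated, which this sketch does not cover.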

To choose \(\Pi\), we don’t need data; we just need the possible assignments in your setting.

  • For most practical problems (DiD, staggered, transient), we have closed-form solutions

  • For a generic solver, we can use nonlinear programming (e.g., the BFGS algorithm)

As argued in (Imai and Kim 2021), TWFE is not a non-parametric approach, so it can suffer from incorrect model assumptions (i.e., model dependence).

  • Hence, they advocate for matching methods for time-series cross-sectional data (Imai and Kim 2021)

  • Use wfe and PanelMatch to apply their paper.

The xtreg2way package is based on (Somaini and Wolak 2016)

# dataset
library(bacondecomp)
df <- bacondecomp::castle

# devtools::install_github("paulosomaini/xtreg2way")
library(xtreg2way)

# output <- xtreg2way(y,
#                     data.frame(x1, x2),
#                     iid,
#                     tid,
#                     w,
#                     noise = "1",
#                     se = "1")

# equivalently
output <- xtreg2way(l_homicide ~ post,
                    df,
                    iid = df$state, # group id
                    tid = df$year,  # time id
                    # w,            # vector of weight
                    se = "1")
output$betaHat
#>                  [,1]
#> l_homicide 0.08181162
output$aVarHat
#>             [,1]
#> [1,] 0.003396724

# to save time, you can use your structure in the
# last output for a new set of variables
# output2 <- xtreg2way(y, x1, struc=output$struc)

Standard errors estimation options

Set | Estimation
se = "0" | Assume homoskedasticity and no within-group correlation or serial correlation
se = "1" (default) | Robust to heteroskedasticity and serial correlation (Arellano 1987)
se = "2" | Robust to heteroskedasticity, but assumes no correlation within group or serial correlation
se = "11" | Arellano SE with df correction performed by Stata xtreg (Somaini and Wolak 2021)

Alternatively, you can also do it manually or with the plm package, but you have to be careful with how the SEs are estimated

library(multiwayvcov) # get vcov matrix
library(lmtest)       # robust SEs estimation

# manual
output3 <- lm(l_homicide ~ post + factor(state) + factor(year),
              data = df)

# get variance-covariance matrix
vcov_tw <- multiwayvcov::cluster.vcov(output3,
                                      cbind(df$state, df$year),
                                      use_white = F,
                                      df_correction = F)

# get coefficients
coeftest(output3, vcov_tw)[2,]
#>   Estimate Std. Error    t value   Pr(>|t|)
#> 0.08181162 0.05671410 1.44252696 0.14979397

# using the plm package
library(plm)

output4 <- plm(l_homicide ~ post,
               data = df,
               index = c("state", "year"),
               model = "within",
               effect = "twoways")

# get coefficients
coeftest(output4, vcov = vcovHC, type = "HC1")
#>
#> t test of coefficients:
#>
#>      Estimate Std. Error t value Pr(>|t|)
#> post 0.081812   0.057748  1.4167   0.1572

As you can see, differences stem from SE estimation, not the coefficient estimate.

References

Arellano, Manuel. 1987. “Computing Robust Standard Errors for Within-Groups Estimators.” Oxford Bulletin of Economics and Statistics 49 (4): 431–34.

Borusyak, Kirill, Xavier Jaravel, and Jann Spiess. 2021. “Revisiting Event Study Designs: Robust and Efficient Estimation.” arXiv Preprint arXiv:2108.12419.

Callaway, Brantly, and Pedro HC Sant’Anna. 2021. “Difference-in-Differences with Multiple Time Periods.” Journal of Econometrics 225 (2): 200–230.

De Chaisemartin, Clément, and Xavier d’Haultfoeuille. 2020. “Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects.” American Economic Review 110 (9): 2964–96.

Gardner, John. 2022. “Two-Stage Differences in Differences.” arXiv Preprint arXiv:2207.05943.

Goodman-Bacon, Andrew. 2021. “Difference-in-Differences with Variation in Treatment Timing.” Journal of Econometrics 225 (2): 254–77.

Imai, Kosuke, and In Song Kim. 2019. “When Should We Use Unit Fixed Effects Regression Models for Causal Inference with Longitudinal Data?” American Journal of Political Science 63 (2): 467–90.

———. 2021. “On the Use of Two-Way Fixed Effects Regression Models for Causal Inference with Panel Data.” Political Analysis 29 (3): 405–15.

Somaini, Paulo, and Frank A Wolak. 2016. “An Algorithm to Estimate the Two-Way Fixed Effects Model.” Journal of Econometric Methods 5 (1): 143–52.

———. 2021. “TWFEM: Stata Module to Efficiently Estimate a Two-Way Fixed Effects Model Based on Somaini and Wolak (2015).”

Sun, Liyang, and Sarah Abraham. 2021. “Estimating Dynamic Treatment Effects in Event Studies with Heterogeneous Treatment Effects.” Journal of Econometrics 225 (2): 175–99.

Sun, Liyang, and Jesse M Shapiro. 2022. “A Linear Panel Model with Heterogeneous Coefficients and Variation in Exposure.” Journal of Economic Perspectives 36 (4): 193–204.
