1

I'm working with a two-way fixed effects (TWFE) logit model in R using the fixest package. My model includes interaction terms between some fixed effects and a categorical variable called group. To report the coefficients of each fixed effect term, I've included one of the fixed effects as a regular control variable. However, I've noticed that this approach leads to differences in standard errors and p-values compared to the standard TWFE model.

When I include one of the fixed effects (e.g., wave) as a regular control variable instead of as a fixed effect, the standard errors and p-values differ, although the coefficients remain the same. Below is a reproducible example:


  library(fixest)
library(dplyr)

# Simulate data
set.seed(12345)
n <- 1000  
data <- data.frame(
  outcome = rbinom(n, 1, 0.5),
  group = factor(sample(c("Group1", "Group2", "Group3"), n, replace = TRUE)),
  event1 = rbinom(n, 1, 0.3),
  event2 = rbinom(n, 1, 0.2),
  event3 = rbinom(n, 1, 0.4),
  event4 = rbinom(n, 1, 0.25),
  age_group = factor(sample(2:4, n, replace = TRUE)),
  gender = factor(sample(c("Male", "Female"), n, replace = TRUE)),
  education = factor(sample(c("Low", "Medium", "High"), n, replace = TRUE)),
  ethnicity = factor(sample(c("Ethnic1", "Ethnic2"), n, replace = TRUE)),
  income_level = factor(sample(1:4, n, replace = TRUE)),
  perception = factor(sample(1:3, n, replace = TRUE)),
  previous_vote = rbinom(n, 1, 0.4),
  struggle = factor(sample(1:3, n, replace = TRUE)),
  news_type = factor(sample(c("News1", "News2", "News3", "NoNews"), n, replace = TRUE)),
  location_type = factor(sample(c("Urban", "Rural"), n, replace = TRUE)),
  city = factor(sample(1:10, n, replace = TRUE)),
  wave = factor(sample(paste0("Wave", 1:5), n, replace = TRUE)),
  weight = runif(n, 0.5, 2)
)

# Model 1: Including Wave as a fixed effect
model1 <- feglm(
  outcome ~ i(group, "Group1") + 
    i(group, event1, ref = "Group1", ref2 = "0") + 
    i(group, event2, ref = "Group1", ref2 = "0") + 
    i(group, event3, ref = "Group1", ref2 = "0") + 
    i(group, event4, ref = "Group1", ref2 = "0") + 
    i(age_group, ref = "2") + gender + education + 
    i(ethnicity, ref = "Ethnic1") + income_level + 
    i(perception, ref = "2") + as.factor(previous_vote) + 
    i(struggle, ref = "2") + as.factor(news_type) + location_type | city + wave,
  data = data, 
  family = binomial("logit"), 
  weights = data$weight,
  cluster = ~city + wave
)

summary(model1, vcov = "twoway", ssc = ssc(adj = FALSE, cluster.adj = FALSE))

# Model 2: Including Wave as a regular control variable
model2 <- feglm(
  outcome ~ i(group, "Group1") + i(wave, ref = "Wave5") + 
    i(group, event1, ref = "Group1", ref2 = "0") + 
    i(group, event2, ref = "Group1", ref2 = "0") + 
    i(group, event3, ref = "Group1", ref2 = "0") + 
    i(group, event4, ref = "Group1", ref2 = "0") + 
    i(age_group, ref = "2") + gender + education + 
    i(ethnicity, ref = "Ethnic1") + income_level + 
    i(perception, ref = "2") + as.factor(previous_vote) + 
    i(struggle, ref = "2") + as.factor(news_type) + location_type | city,
  data = data, 
  family = binomial("logit"), 
  weights = data$weight,
  cluster = ~city + wave
)

summary(model2, vcov = "twoway", ssc = ssc(adj = FALSE, cluster.adj = FALSE))

Questions:

  1. Is the difference in standard errors and p-values between the two models due to the adjustment of degrees of freedom?

  2. Is this a statistical problem (e.g. incidental parameter problem) or purely a software issue? If just a software issue, how can I fix it?

1
  • Please add any user-contributed packages in your R code. Eg. library(fixest). Commented Aug 21, 2024 at 12:06

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.