boottest results change when the type of the variable used in clustid change from integer to character #14

Closed
timotheedotc opened this issue Sep 7, 2021 · 4 comments
Labels: bug (Something isn't working), invalid (This doesn't seem right)

timotheedotc commented Sep 7, 2021

Hello,

I am reproducing here a problem I have with real data.

TL;DR:
- I have a fixed-effects model with one fixed effect for variable_a.
- I want wild bootstrap confidence intervals with two-way clustered standard errors (variable_a and variable_b).
- When I change the type of variable_a (from character to integer), the results change significantly.

With the example below I can reproduce the problem when switching from integer to numeric, but not when switching from integer to character.

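library(fixest)
library(fwildclusterboot)
data(voters)  # example data set shipped with fwildclusterboot

# fit the model via fixest::feols(), lfe::felm() or stats::lm()
feols_fit <- feols(proposition_vote ~ treatment + log_income | Q1_immigration, data = voters)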

# bootstrap inference via boottest()
feols_boot <- boottest(feols_fit,
                       clustid = c("Q1_immigration","Q2_defense"),
                       B = 9999,
                       param = "treatment",
                       bootcluster='min')
feols_boot

# Convert Q1 and Q2 to numeric (the integer-to-numeric switch)

voters_2<-voters
voters_2$Q1_immigration <- as.numeric(voters_2$Q1_immigration)
voters_2$Q2_defense <- as.numeric(voters_2$Q2_defense)
feols_fit_2 <- feols(proposition_vote ~ treatment  + log_income | Q1_immigration, data = voters_2)
feols_boot_2 <- boottest(feols_fit_2,
                       clustid = c("Q1_immigration","Q2_defense"),
                       B = 9999,
                       param = "treatment",
                       bootcluster='min')
feols_boot_2

My understanding is that the variable only serves as a group index, so its type should not affect the result.

I don't know which of the two results to use.

My apologies if this stems from a misunderstanding of what is happening under the hood.

Best regards,

Timothée
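
On the point that a cluster variable only acts as an index: the sketch below, in base R with made-up ids, checks that storing the same ids as integer, numeric, or character leaves the grouping of observations unchanged. The vector values and the helper `group_label()` are illustrative assumptions and are not taken from the voters data or from the package.

```r
# Hypothetical cluster ids, stored as integer, numeric, and character.
ids_int <- c(10L, 2L, 2L, 10L, 3L, 3L, 2L)
ids_num <- as.numeric(ids_int)
ids_chr <- as.character(ids_int)

# Label each observation with the position of the first observation in its
# group. The label depends only on which observations share an id, not on
# the storage type or on how factor levels would be ordered.
group_label <- function(x) match(x, x)

same_partition <- identical(group_label(ids_int), group_label(ids_num)) &&
  identical(group_label(ids_int), group_label(ids_chr))
same_partition
```

If the partition is the same, any difference in bootstrap results has to come from how the variable is processed downstream, not from the grouping itself.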

@timotheedotc timotheedotc changed the title boottest results change when the type of the variable used in clustid change from character to integer boottest results change when the type of the variable used in clustid change from integer to character Sep 7, 2021
s3alfisc (Owner) commented Sep 10, 2021

Hi Timothée,

Thanks for reporting this issue, and sorry for taking a couple of days to respond. I am currently on vacation and have not yet explored the issue in much detail, but I am convinced that you have found a bug. It is not related to the type of the clustering variable, though, but to the type of the fixed effect.

I believe the issue lies in the pre-processing of the fixed effect in your regression. If you estimate your models without fixed effects, both p-values are equal.


# fit model 1 via fixest::feols(), lfe::felm() or stats::lm()
feols_fit <- feols(proposition_vote ~ treatment  + log_income , data = voters)
feols_fit_2 <- feols(proposition_vote ~ treatment  + log_income , data = voters_2)
# does fixest create the same regression results? yes
etable(feols_fit, feols_fit_2)

# bootstrap inference via boottest()
feols_boot <- boottest(feols_fit,
                       clustid = c("Q1_immigration","Q2_defense"),
                       B = 9999,
                       param = "treatment",
                       bootcluster='min', 
                       seed = 1)

feols_boot_2 <- boottest(feols_fit_2,
                         clustid = c("Q1_immigration","Q2_defense"),
                         B = 9999,
                         param = "treatment",
                         bootcluster='min', 
                         seed = 1)

library(modelsummary)
msummary(model = list(feols_boot, feols_boot_2), 
         estimate = "{estimate} ({p.value})", 
         statistic = "[{conf.low}, {conf.high}]")

My guess is that I have not properly protected the fixed effects variable to be of type character. If you feed a numeric or integer variable as a fixed effect into feols(), it automatically converts it into a character variable, so the type of the variable does not affect the regression estimates. boottest() instead works on the "original data", as it is somewhat difficult to get a `model.frame` type object out of feols(), and I assume that I have not properly protected the fixed effect variables used in either fixest::feols() or lfe::felm() if they are not factor or character variables. Hence my current prior is that if the fixed effects in your model are of type character or factor in the original data set, before any transformation in feols() or felm(), inference via boottest() should be correct.
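
Following that prior, a possible workaround is to coerce the grouping variables to factor in the data itself before fitting, so that the model function and boottest() both see the same type. This is a minimal sketch on toy data: the data frame `voters_toy` and its values are made up, and only the column names come from the example above.

```r
# Toy stand-in for the voters data; the values are invented.
voters_toy <- data.frame(
  proposition_vote = c(0, 1, 1, 0, 1, 0),
  treatment        = c(0, 1, 0, 1, 1, 0),
  Q1_immigration   = c(1L, 2L, 2L, 3L, 3L, 1L),  # stored as integer
  Q2_defense       = c(1, 1, 2, 2, 3, 3)         # stored as numeric
)

# Variables used as fixed effects or cluster ids.
group_vars <- c("Q1_immigration", "Q2_defense")

# Coerce them to factor in the data before any model is fitted.
voters_toy[group_vars] <- lapply(voters_toy[group_vars], factor)

# Inspect the resulting column classes.
sapply(voters_toy[group_vars], class)
```

The same two lines applied to the real data would come before the feols() and boottest() calls shown earlier in the thread.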

I will have a closer look at this in the next days and then get back to you.

Best, Alex

s3alfisc added a commit that referenced this issue Sep 10, 2021
…ffects in either `felm()` or `feols()` are not factor variables in the original data.

associated tests in inst/tinytest/tests_numeric_fe_clusters.R
s3alfisc (Owner) commented Sep 10, 2021

Hi, with commit 97b9b70, boottest() should now throw an error whenever variables used as fixed effects in either felm() or feols() are not factor variables in the original data (the voters data set in your example).
Associated tests can be found in https://github.com/s3alfisc/fwildclusterboot/blob/master/inst/tinytest/test_numeric_fe_clusters.R.
I will try to allow for "integer" and "numeric" fixed effects as soon as possible.

s3alfisc (Owner) commented

With commit 56efefb, boottest() now forces all fixed effects variables employed in fixest::feols() and lfe::felm() to be factors even if they are not in the original data set. This mimics the behavior of both fixest::feols() and lfe::felm() and guarantees that the output of boottest() is consistent and does not vary with the class of fixed effects variables.

Updated tests can be found in https://github.com/s3alfisc/fwildclusterboot/blob/master/inst/tinytest/test_numeric_fe_clusters.R.
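
For readers who want a picture of what "forcing fixed effects to be factors" can look like, here is a rough sketch in base R. It is not the package's implementation: the helper `coerce_fe_to_factor()` is hypothetical, and it assumes a simple `y ~ x | fe` formula with a single `|` part.

```r
# Hypothetical helper, not taken from fwildclusterboot: read the fixed-effect
# names from the part of the formula after `|` and coerce those columns.
coerce_fe_to_factor <- function(fml, data) {
  rhs <- fml[[3]]
  # A right-hand side without `|` has no fixed effects, so return data as is.
  if (!is.call(rhs) || !identical(rhs[[1]], as.name("|"))) {
    return(data)
  }
  fe_names <- all.vars(rhs[[3]])
  data[fe_names] <- lapply(data[fe_names], factor)
  data
}

# Made-up data with an integer fixed-effect column.
toy <- data.frame(y = c(1, 2, 3, 4), x = c(0, 1, 0, 1), fe = c(1L, 1L, 2L, 2L))
out <- coerce_fe_to_factor(y ~ x | fe, toy)
class(out$fe)
```

Coercing inside the function, instead of raising an error, means the caller's data can keep its original column types while the bootstrap always works on factors.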

s3alfisc (Owner) commented

The updates are now on their way to CRAN with package version 0.3.7. I will keep this issue open for a while so that it remains visible.

@s3alfisc s3alfisc added bug Something isn't working invalid This doesn't seem right labels Sep 14, 2021