boottest fails when a fixed effect variable is of type "date" #115

Closed
95goo opened this issue May 15, 2023 · 13 comments
Labels
bug Something isn't working can't reproduce a reported bug that I currently cannot reproduce wontfix This will not be worked on

@95goo

95goo commented May 15, 2023

Hi, thank you very much for this amazing package. It's been great to dig into.
Below is some sample code. I am clustering by c and also using it as a fixed effect in my model. Note that b is also a categorical variable.
model <- feols(Y ~ X |b + c, cluster = "c")
I run boottest as follows:
boottest(model, clustid = "c", param = "X", B = 9999)
The error I receive is: "error in solve.default(crossprod(weights_sq * X)) : system is computationally singular: reciprocal condition number"

Any advice at all will be much appreciated. Thank you very much.

@s3alfisc
Owner

Hi, thanks for the feedback and the nice words!

Unfortunately (or fortunately 😄 ) I cannot reproduce this error. Which version of the package are you running? What happens if you install the most up-to-date version from CRAN? If you still run into these issues, can you maybe provide data so that I can reproduce it?

Here is the code that I run:

library(fwildclusterboot)
library(fixest)
data(voters)

sapply(
  voters[,
         c(
           "proposition_vote", 
           "treatment", 
           "Q1_immigration", 
           "Q2_defense"
           )], 
  class
)

# proposition_vote 
# "integer" 
# treatment 
# "integer" 
# Q1_immigration 
# "factor" 
# Q2_defense 
# "factor" 


fit <- feols(
  proposition_vote ~ treatment |
    Q1_immigration + Q2_defense, 
  data = voters
)

boot <- boottest(
  fit, 
  param = "treatment", 
  clustid = "Q1_immigration", 
  B = 999
)
summary(boot)
# boottest.fixest(object = fit, param = "treatment", B = 999, clustid = "Q1_immigration")
# 
# Hypothesis: 1*treatment = 0
# Observations: 300
# Bootstr. Type: rademacher
# Clustering: 1-way
# Confidence Sets: 95%
# Number of Clusters: 10
# 
# term estimate statistic p.value conf.low conf.high
# 1 1*treatment = 0    0.077     1.462   0.108   -0.003     0.213

boot <- boottest(
  fit, 
  param = "treatment", 
  clustid = "Q2_defense", 
  B = 999
)
summary(boot)
# boottest.fixest(object = fit, param = "treatment", B = 999, clustid = "Q2_defense")
# 
# Hypothesis: 1*treatment = 0
# Observations: 300
# Bootstr. Type: rademacher
# Clustering: 1-way
# Confidence Sets: 95%
# Number of Clusters: 10
# 
# term estimate statistic p.value conf.low conf.high
# 1 1*treatment = 0    0.077     2.758   0.033     0.01     0.159

@s3alfisc s3alfisc added the can't reproduce a reported bug that I currently cannot reproduce label May 15, 2023
@95goo
Author

95goo commented May 15, 2023

Thanks very much. I updated the package as you recommended.
I then tried running with only one fixed effect (a categorical variable) as a test:
model <- feols(data = reg_data, Y ~ X | c, cluster = "c")
boottest(model, clustid = "c", param = "X", B = 200)

This is the error I receive: "Error in 1 | c : operations are possible for numeric, logical, or complex types"
If you have any further recommendations, please let me know; otherwise I will provide sample data. The issue in boottest begins when I add " | c" to the feols model, where c is the categorical variable I am clustering on. It runs fine without it.

I notice that the bootstrap function runs fine when I add year fixed effects to the feols model, as done below.
model <- feols(data = reg_data, Y ~ X | year, cluster = "c")
However, there is an issue when I use week fixed effects instead of year.
model <- feols(data = reg_data, Y ~ X | week, cluster = "c")
In that scenario, I get the following error: "Error in Ops.Date(1, week) : | not defined for "Date" objects"

I appreciate your advice :)

@s3alfisc
Owner

What is the exact type of your date and year variables? I.e. what do you get when you run sapply(data, class)?

@s3alfisc
Owner

I am asking as it looks like you are providing a date object?

@95goo
Author

95goo commented May 16, 2023

The year is formatted as 'numeric' while the week is formatted as 'date' object. Is there an issue with this?

@s3alfisc
Owner

Great, maybe the date type causes the problem here - I will check this later today. What happens if you convert the date to a plain factor?

@s3alfisc
Owner

In your first example, were either b or c date variables?

@95goo
Author

95goo commented May 17, 2023

Thank you very much, sir. Converting the week variable to a plain factor using "factor" makes the code below work for me!!
model <- feols(data = reg_data, Y ~ X | week, cluster = "c")
hooray!!!

In my first example, however, b and c were not date variables. =( I get the error below for the following code, where c is a categorical variable (10 distinct values):
model <- feols(data = reg_data, Y ~ X | c, cluster = "c")
boottest(model, clustid = "c", param = "X", B = 200)

This is the error I receive: "Error in 1 | c : operations are possible for numeric, logical, or complex types"

@95goo
Author

95goo commented May 17, 2023

HOWEVER, after changing variable "c" from character to factor using the "factor" function, the boottest function works. I do not totally understand this and would love to learn more about why this is the case. See below for the code used and the added line.

reg_data <- reg_data %>% mutate(c = factor(c))
model <- feols(data = reg_data, Y ~ X | c, cluster = "c")
boottest(model, clustid = "c", param = "X", B = 200)
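
For readers landing here with the same error, the workaround above can be sketched in base R without dplyr. This is only an illustration under assumptions: the helper name `as_factor_cols` and the toy `reg_data` below are made up, and only the use of `factor()` comes from this thread.

```r
# Hypothetical helper (not part of fixest or fwildclusterboot):
# convert the named columns of a data frame to plain factors.
as_factor_cols <- function(data, cols) {
  for (col in cols) {
    data[[col]] <- factor(data[[col]])
  }
  data
}

# Made-up example data; the column names mirror the ones used in this thread.
reg_data <- data.frame(
  Y = c(1.2, 0.7, 1.9, 1.1),
  X = c(0, 1, 0, 1),
  week = as.Date(c("2023-05-01", "2023-05-01", "2023-05-08", "2023-05-08")),
  c = c("a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Convert the Date and character columns before fitting the model.
reg_data <- as_factor_cols(reg_data, c("week", "c"))
sapply(reg_data, class)
```

After the conversion, `week` and `c` are both of class "factor", which is the state in which the calls in this comment were reported to work.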

@s3alfisc
Owner

I cannot reproduce the error that you observe with character variables. Can you provide me an example with simulated data where the error occurs? All fixed effects are transformed into factors internally, hence it should not matter if you provide c as a character or factor.

Note that in general, it is dangerous to call variables "c", because c is also a base function. Lots of things that could go wrong there. ChatGPT says the following: "It is generally not recommended to use the name "c" for a variable in R because "c" is a commonly used base function in R for combining or concatenating objects. If you assign a value to the variable "c", you will overwrite the default behavior of the base function, leading to potential confusion and errors in your code."

With the date variable, you have indeed discovered a bug - something fails in the fixed effects preprocessing pipeline. I'll try to fix that asap =)

@s3alfisc s3alfisc changed the title FixedEffects Error FixedEffects Error when a fixed effect varibale is of type "date" May 19, 2023
@s3alfisc s3alfisc changed the title FixedEffects Error when a fixed effect varibale is of type "date" FixedEffects Error when a fixed effect varible is of type "date" May 19, 2023
@s3alfisc s3alfisc changed the title FixedEffects Error when a fixed effect varible is of type "date" FixedEffects Error when a fixed effect variable is of type "date" May 19, 2023
@s3alfisc s3alfisc added the bug Something isn't working label May 19, 2023
@s3alfisc
Owner

It looks like Formula and expand.model.frame do not handle date variables in the second part of the formula.

        suppressWarnings(
          expand.model.frame(
            model =
              manipulate_object(object),
            extras = clustid_fml,
            na.expand = FALSE,
            envir = call_env
          )
        )

The error I receive is

```
Error in Ops.Date(1, date) : | not defined for "Date" objects
Called from: Ops.Date(1, date)
```

`sandwich` fails in the same context as well:

```r
library(sandwich)
library(fixest) # provides feols()
library(fwildclusterboot)

data(voters)
date <- sample(1:7, nrow(voters), TRUE)
voters$date <- as.Date(date, origin = "1970-01-01")

sapply(voters, class)

fit <- feols(proposition_vote ~ treatment | date, data = voters)
sandwich::vcovCL(fit, ~date)
# Error in Ops.Date(treatment, date) : | not defined for "Date" objects

fwildclusterboot::boottest(fit, param = "treatment", clustid = "group_id", B = 999)
# Error in Ops.Date(1, date) : | not defined for "Date" objects
```

Tagging @zeileis here for awareness (in case you are not aware already).

For now, I will label this as "won't fix".

@s3alfisc s3alfisc added the wontfix This will not be worked on label Jun 17, 2023
@s3alfisc s3alfisc changed the title FixedEffects Error when a fixed effect variable is of type "date" boottest fails when a fixed effect variable is of type "date" Jun 17, 2023
@zeileis

zeileis commented Jun 17, 2023

I think Formula is not involved here, is it? The error message would indicate that standard formula processing (as opposed to Formula) is used. To expand.model.frame the model specification looks like a standard formula and hence treats | in the basic way and not for separating a model part. I would recommend to be explicit and use ... | factor(date) instead. Then expand.model.frame() seems to work again.
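
A minimal base-R sketch of the point above, assuming nothing beyond base R and stats: in a standard formula, `|` is simply the logical OR operator, so what happens depends on the class of its right-hand side. The helper `try_or` and the toy data frame `df` are hypothetical names used only for illustration; this is not code from fwildclusterboot or sandwich.

```r
# Hypothetical helper: evaluate `1 | x` and record whether it succeeds,
# warns, or errors.
try_or <- function(x) {
  tryCatch(
    list(status = "ok", value = 1 | x),
    warning = function(w) list(status = "warning", message = conditionMessage(w)),
    error = function(e) list(status = "error", message = conditionMessage(e))
  )
}

res_numeric <- try_or(2023)                  # numeric: evaluates to TRUE
res_char    <- try_or("a")                   # character: error
res_date    <- try_or(as.Date("2023-05-15")) # Date: error raised by Ops.Date
res_factor  <- try_or(factor("a"))           # factor: warning only, result is NA

# The same thing happens when a standard formula is turned into a model frame:
# `x | week` is evaluated as an ordinary expression.
df <- data.frame(
  y = c(1, 2, 3),
  x = c(0, 1, 0),
  week = as.Date(c("2023-05-01", "2023-05-08", "2023-05-15"))
)
res_mf <- tryCatch(
  {
    model.frame(y ~ x | week, data = df)
    "ok"
  },
  error = function(e) conditionMessage(e)
)

str(list(res_numeric, res_char, res_date, res_factor, res_mf))
```

This matches the pattern reported in the thread: numeric year works, character and Date columns error, and factors get through. Since the factor case only emits a warning, that may be why wrapping the variable in `factor()` lets the `suppressWarnings(expand.model.frame(...))` call shown earlier proceed.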

@s3alfisc
Copy link
Owner

Thanks for your feedback, Achim. Indeed you are right that Formula is not involved =) I'll close this issue, as I don't think I will fix this in the nearer future. Hope this is ok with you @95goo ?
