Wrong class of input dataset gives unclear error message during harmo_process() #72

Open
zchenmr opened this issue Aug 30, 2024 · 3 comments
Labels: bug, good first issue

Comments

zchenmr commented Aug 30, 2024

I got the error below when running harmo_process(). Converting the input dataset into a tibble resolved it, but neither the error message nor the traceback made it clear where the problem came from. The class of the original input dataset was "data.frame" (instead of "tbl_df" "tbl" "data.frame"). Because is_dataset() returned TRUE for it, I initially assumed the problem was in other processing elements rather than in the dataset.

[Image: screenshot of the error message and traceback]
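
For readers hitting the same error, the workaround described above (converting the input to a tibble first) can be sketched as follows. This is only an illustration under assumptions: the `dataset` object is a hypothetical stand-in, since the real dataset was not shared, and the call to harmo_process() is left out.

```r
library(tibble)

# Hypothetical stand-in for the real input dataset, which was not shared.
dataset <- data.frame(id = 1:3, height = c(170, 165, 180))
class(dataset)

# Convert to a tibble before handing the dataset to harmo_process().
dataset_tbl <- as_tibble(dataset)
class(dataset_tbl)
```

Comparing the two class() results shows whether the input was a plain data frame or already a tibble.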
GuiFabre added the bug, good first issue, and To validate labels on Oct 6, 2024
@GuiFabre
Contributor

@a-trottier: this is the same bug you ran into. I'll keep you posted if anything comes up.

@zchenmr: can you give me an example of a dataset that reproduces the problem?
Thanks

@zchenmr
Author

zchenmr commented Nov 14, 2024

The dataset is on an external server, so I can't share it, but I can send a summary report if that helps. I'm not sure the class is actually what causes the problem: I've run harmo_process() on other datasets of class "data.frame" and those worked without any issues.

@GuiFabre
Contributor

Hello @zchenmr. I think I found the problem, which might not be due to the package.
In a nutshell, the across() function from dplyr, which is widely used in the package, does not operate on grouping variables.

tidyverse/dplyr#6127

library(dplyr)

# does not work: throws an error, because across() cannot select the grouping variable
iris %>% group_by(Species) %>% mutate(across(Species, as.character))

# does not work as intended: silently skips Species (every column is coerced to character except Species)
iris %>% group_by(Species) %>% mutate(across(everything(), as.character))

# works
iris %>% group_by(Species) %>% mutate(Species = as.character(Species))

I'll go through the package to find where the dataset may still be grouped and make sure ungroup() is called on it beforehand.
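
One possible shape for that fix is sketched below. It is not the change that was made in the package: `mutate_across_ungrouped()` is a hypothetical helper written for illustration, which drops the grouping, applies across(), and then restores the original groups.

```r
library(dplyr)

# Hypothetical helper (not part of madshapR or dplyr): apply `fn` to `cols`
# with the grouping removed, then restore the original grouping.
mutate_across_ungrouped <- function(data, cols, fn) {
  groups <- group_vars(data)            # remember the grouping columns
  out <- data %>%
    ungroup() %>%                       # across() can now see every column
    mutate(across({{ cols }}, fn))
  if (length(groups) > 0) {
    out <- out %>% group_by(across(all_of(groups)))
  }
  out
}

result <- iris %>%
  group_by(Species) %>%
  mutate_across_ungrouped(everything(), as.character)

class(result$Species)
group_vars(result)
```

Whether the grouping should be restored afterwards depends on what the calling code expects. The sketch restores it so the output has the same grouping as the input.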

GuiFabre added a commit to maelstrom-research/madshapR that referenced this issue Nov 24, 2024