Error in quantile.default(x[[ncol(x)]], probs = (1 + c(-level, level))/2) : missing values and NaN's not allowed if 'na.rm' is FALSE #520

Closed
Yunuuuu opened this issue Dec 22, 2023 · 3 comments · Fixed by #521
Labels
bug an unexpected problem or unintended behavior

Yunuuuu commented Dec 22, 2023

When the sample is small, some bootstrap replicates yield an NA statistic, which prevents the confidence interval from being calculated.

library(infer)
data <- tibble::tibble(
    prop = runif(10L),
    gender = rep(c("female", "male"), each = 5L)
)
lapply(seq_len(100L), function(i) {
    boot_dist <- data %>%
        specify(prop ~ gender) %>% # alt: response = prop, explanatory = gender
        hypothesize(null = "independence") %>%
        generate(reps = 1000, type = "bootstrap") %>%
        calculate(stat = "diff in medians", order = c("female", "male"))
    get_ci(boot_dist)
})
#> Error in quantile.default(x[[ncol(x)]], probs = (1 + c(-level, level))/2): missing values and NaN's not allowed if 'na.rm' is FALSE

Created on 2023-12-22 with reprex v2.0.2
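
A possible explanation for the NA values, sketched in base R: if a bootstrap resample of the rows happens to contain only one gender, the median of the missing group is NA, and so is the difference. This is an assumption about the cause, not a copy of infer's internals; `diff_in_medians()` is a hypothetical helper written only for this illustration.

```r
# Hypothetical base-R sketch; this is not infer's internal code.
set.seed(1)
data <- data.frame(
    prop = runif(10L),
    gender = rep(c("female", "male"), each = 5L)
)

# Difference in medians (female minus male) for one data frame.
diff_in_medians <- function(d) {
    median(d$prop[d$gender == "female"]) - median(d$prop[d$gender == "male"])
}

# A resample that contains only one gender: the median of an empty
# vector is NA, so the statistic is NA as well.
only_female <- data[1:5, ]
diff_in_medians(only_female)

# Draw 1000 bootstrap resamples of the rows (with replacement) and
# count how many replicates end up with an NA statistic.
stats <- replicate(1000L, {
    idx <- sample(nrow(data), replace = TRUE)
    diff_in_medians(data[idx, ])
})
sum(is.na(stats))
```

With five rows per group, a single-gender resample is rare in any one replicate, but across many replicates at least one becomes likely, and a single NA is enough to make `quantile()` fail.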

Yunuuuu commented Dec 22, 2023

I'm not sure whether it is okay to modify the following line directly by adding `na.rm = TRUE`:

ci_vec <- stats::quantile(x[[ncol(x)]], probs = (1 + c(-level, level)) / 2, na.rm = TRUE)

The current line is:

ci_vec <- stats::quantile(x[[ncol(x)]], probs = (1 + c(-level, level)) / 2)

Or should we convert the NA values into something else, such as 0?
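
To make the two options concrete, here is a minimal base-R sketch on a made-up vector of statistics. The values, the number of NA entries, and the variable names are assumptions for illustration only; this does not show what the eventual fix does.

```r
# Toy bootstrap distribution with a few NA replicates (made-up values).
set.seed(1)
stat <- c(rnorm(997), NA, NA, NA)
level <- 0.95
probs <- (1 + c(-level, level)) / 2

# Current behaviour: quantile() stops when NA values are present.
res <- try(stats::quantile(stat, probs = probs), silent = TRUE)
inherits(res, "try-error")

# Option 1: drop the NA replicates before computing the quantiles.
ci_drop <- stats::quantile(stat, probs = probs, na.rm = TRUE)

# Option 2: replace NA with 0, which treats the failed replicates as if
# a statistic of exactly 0 had been observed.
stat_zero <- ifelse(is.na(stat), 0, stat)
ci_zero <- stats::quantile(stat_zero, probs = probs)

ci_drop
ci_zero
```

Dropping the NA values computes the interval from fewer replicates, while replacing them with 0 adds artificial observations at 0 to the distribution, so the two intervals can differ.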

@simonpcouch simonpcouch added the bug an unexpected problem or unintended behavior label Dec 22, 2023
simonpcouch (Collaborator) commented

Thanks for the issue! A slightly more minimal reprex:

library(infer)

data <- data.frame(
   prop = seq(0, 1, length.out = 10),
   group = rep(c("a", "b"), each = 5L)
)

set.seed(1)
boot_dist <-
   data %>%
   specify(prop ~ group) %>%
   hypothesize(null = "independence") %>%
   generate(reps = 1000, type = "bootstrap") %>%
   calculate(stat = "diff in medians", order = c("b", "a"))

get_confidence_interval(boot_dist, .95)
#> Error in quantile.default(x[[ncol(x)]], probs = (1 + c(-level, level))/2): missing values and NaN's not allowed if 'na.rm' is FALSE

Created on 2023-12-22 with reprex v2.0.2

On it. :)

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Feb 15, 2024
2 participants