fix: fixed an issue of a solitary NA as var_labels passed to analyze #965

kpagacz · 2024-11-22T11:58:46Z

Here's an MRE of the issue:

library(rtables)
# devtools::load_all()
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze("AGE", var_labels = c(AGE = NA))

which results in:

Error in validObject(.Object) :
invalid class “AnalyzeVarSplit” object: invalid object for slot "split_l
abel" in class "AnalyzeVarSplit": got class "logical", should be or extend
class "character"

R treats a single NA as a logical vector, which fails the validation on the Split object, which requires the labels slot to be a character vector. This issue does not happen when the var_labels array contains other character values. The untyped NAs are cast to characters automatically in this case.

This also does not happen in this scenario:

library(rtables)
# devtools::load_all()
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze("AGE", var_labels = c(AGE = NA_character_))

What prompted this is the investigation of what happens when launching the below teal application:

require(teal.modules.general)
require(teal.modules.clinical)
data <- teal_data()
data <- within(data, {
  ADSL <- teal.modules.general::rADSL
  # Remove the label attribute to simulate a dataset without labels
  attributes(ADSL$SEX)$label <- NULL
  ADSL$EOSDY[1] <- NA_integer_
})
join_keys(data) <- default_cdisc_join_keys["ADSL"]

ADSL <- data[["ADSL"]]

app <- init(
  data = data,
  modules = modules(
    teal.modules.clinical::tm_t_summary(
      label = "Demographic Table",
      dataname = "ADSL",
      arm_var = choices_selected(c("ARM", "ARMCD"), "ARM"),
      add_total = TRUE,
      summarize_vars = choices_selected(
        c("SEX", "RACE", "BMRKR2", "EOSDY", "DCSREAS", "AGE"),
        c("SEX")
      ),
      useNA = "ifany"
    )
  )
)
if (interactive()) {
  shinyApp(app$ui, app$server)
}

This application errors out with the above rtables error, because the application tries to evaluate the below:

library(nvimcom)
library(ggplot2)
library(ggmosaic)
library(shiny)
library(teal.code)
library(teal.data)
library(teal.slice)
library(teal)
library(teal.transform)
library(teal.modules.general)
library(formatters)
library(magrittr)
library(rtables)
library(tern)
library(teal.modules.clinical)

ADSL <- teal.modules.general::rADSL
attributes(ADSL$SEX)$label <- NULL
ADSL$EOSDY[1] <- NA

stopifnot(rlang::hash(ADSL) == "fdef4b3486c80ad054476ea2b4131889")

ANL_1 <- ADSL %>% dplyr::select(STUDYID, USUBJID, ARM, SEX)
ANL <- ANL_1
ANL <- ANL %>% teal.data::col_relabel(ARM = "Description of Planned Arm")
ANL_ADSL_1 <- ADSL %>% dplyr::select(STUDYID, USUBJID, ARM)
ANL_ADSL <- ANL_ADSL_1
ANL_ADSL <- ANL_ADSL %>% teal.data::col_relabel(ARM = "Description of Planned Arm")
anl <- ANL %>%
  df_explicit_na(
    omit_columns = setdiff(names(ANL), c(structure(c(SEX = "SEX"), dataname = "ADSL", always_selected = character(0)))),
    na_level = "<Missing>"
  )
anl <- anl %>% dplyr::mutate(ARM = droplevels(ARM))
arm_levels <- levels(anl[["ARM"]])
ANL_ADSL <- ANL_ADSL %>% dplyr::filter(ARM %in% arm_levels)
ANL_ADSL <- ANL_ADSL %>% dplyr::mutate(ARM = droplevels(ARM))
ANL_ADSL <- df_explicit_na(ANL_ADSL, na_level = "<Missing>")
lyt <- rtables::basic_table(
  main_footer = "n represents the number of unique subject IDs such that the variable has non-NA values."
) %>%
  rtables::split_cols_by("ARM", split_fun = drop_split_levels) %>%
  rtables::add_overall_col("All Patients") %>%
  rtables::add_colcounts() %>%
  analyze_vars(
    vars = structure(c(SEX = "SEX"), dataname = "ADSL", always_selected = character(0)),
    var_labels = c(SEX = NA),
    show_labels = "visible",
    na.rm = FALSE,
    na_str = "<Missing>",
    denom = "N_col",
    .stats = c("n", "mean_sd", "mean_ci", "geom_mean", "median", "median_ci", "quantiles", "range", "count_fraction")
  )
result <- rtables::build_table(lyt = lyt, df = anl, alt_counts_df = ANL_ADSL)
result

This fails due to solitary NA in the analyze_vars call, passed directly to rtables::analyze.

Unfortunately, trying to fix the issue on the teal.modules.clinical side is unsuccessful due to how substitute treats such solitary NAs (basically casting them from the explicit NA_character_ to normal NA). The fix could be made on the level of tern, but it seems silly not to go to the bottom, which is rtables::analyze, and deeper - AnalyzeMultiVars, etc...

Let me know if this makes sense for you or if you would like me to take a different route.

github-actions · 2024-11-22T11:59:01Z

✅ All contributors have signed the CLA
_{Posted by the CLA Assistant Lite bot.}

kpagacz · 2024-11-22T11:59:10Z

I have read the CLA Document and I hereby sign the CLA

Melkiades

Thanks @kpagacz for this! I think it makes sense to do it with one line at source instead of spreading it around, also considering that NA_character_ is fine. Could you just add a comment and a regression test for this? Also add NEWS, thanks!! ;)

kpagacz · 2024-11-22T15:23:13Z

Added the comment, test and the NEWS entry. Ready for re-review.

Melkiades

Thank you @kpagacz! Great work

shajoezhu · 2024-11-23T00:18:30Z

Hi @kpagacz , thanks a lot for the PR! great change! thanks a lot. no concerns from my side.

I have a question to @pawelru , I was wondering if r-deps allow us to trigger downstream package checks before we merge?
previously, when we were using staged.dependencies, we could trigger downstream package change by the following step, e.g. scda.test

create a branch, and modify the staged.dpendcies.yml package source, e.g. rtables, change to @kpagacz 's fork
raise a pr, and it will test.

i was wondering should we do the same by modifying the github action check.yml to trigger the downstream cicd tests?

fix: fixed an issue of a solitary NA as var_labels passed to analyze

ed93699

kpagacz requested review from edelarua, Melkiades and ayogasekaram as code owners November 22, 2024 11:58

Melkiades reviewed Nov 22, 2024

View reviewed changes

added comment, test and NEWS

7204055

Melkiades approved these changes Nov 22, 2024

View reviewed changes

shajoezhu assigned Melkiades Nov 23, 2024

shajoezhu added the sme label Nov 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: fixed an issue of a solitary NA as var_labels passed to analyze #965

fix: fixed an issue of a solitary NA as var_labels passed to analyze #965

kpagacz commented Nov 22, 2024 •

edited

Loading

github-actions bot commented Nov 22, 2024 •

edited

Loading

kpagacz commented Nov 22, 2024

Melkiades left a comment

kpagacz commented Nov 22, 2024

Melkiades left a comment

shajoezhu commented Nov 23, 2024

fix: fixed an issue of a solitary NA as var_labels passed to analyze #965

Are you sure you want to change the base?

fix: fixed an issue of a solitary NA as var_labels passed to analyze #965

Conversation

kpagacz commented Nov 22, 2024 • edited Loading

github-actions bot commented Nov 22, 2024 • edited Loading

kpagacz commented Nov 22, 2024

Melkiades left a comment

Choose a reason for hiding this comment

kpagacz commented Nov 22, 2024

Melkiades left a comment

Choose a reason for hiding this comment

shajoezhu commented Nov 23, 2024

kpagacz commented Nov 22, 2024 •

edited

Loading

github-actions bot commented Nov 22, 2024 •

edited

Loading