When using tar_make(callr_function = NULL) wrapped with callr, targets are not invalidated when code in source files changes #896

Closed
11 tasks done
MilesMcBain opened this issue Aug 3, 2022 · 3 comments

@MilesMcBain

Prework

  • Read and agree to the code of conduct and contributing guidelines.
  • Confirm that your issue is a genuine bug in the targets package itself and not a user error, known limitation, or issue from another package that targets depends on. For example, if you get errors running tar_make_clustermq(), try isolating the problem in a reproducible example that runs clustermq and not targets. And for miscellaneous troubleshooting, please post to discussions instead of issues.
  • If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
  • Using targets::tar_reprex(), reprex::reprex(), or similar, post a minimal reproducible example like this one so the maintainer can troubleshoot the problems you identify. A reproducible example is:
    • Runnable: post enough R code and data so any onlooker can create the error on their own computer.
    • Minimal: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
    • Readable: format your code according to the tidyverse style guide.

Description

When I wrap tar_make(callr_function = NULL) with callr::r, code changes in source files that affect targets are not detected.

If I remove the callr::r wrapping, the changes are detected.

If I make a change in the plan script file, the correct (modified) source code is used.

Background: I was doing this as per @noamross's suggestion in the rOpenSci #pipelinetoolkits Slack channel. The idea is to have an outer iteration that injects different parameters into the same plan, which is run several times against different stores.

Reproducible example


Apologies if this is not minimal; I wanted you to know what I know about this problem:

library(targets)

working_dir <- file.path(tempdir(), "targets_reprex")
dir.create(working_dir)
func_file <- file.path(working_dir, "func.R")
tar_script <- file.path(working_dir, "_targets.R")
script_store <- file.path(working_dir, "_targets/script")

file.create(func_file)
#> [1] TRUE
writeLines("result_fn <- function(x) x + 1", func_file)

eval(bquote(
  tar_script(
    {
      source(.(func_file))
      list(tar_target(
        result,
        result_fn(1)
      ))
    },
    script = tar_script,
    ask = FALSE
  )
))

callr::r(
  eval(bquote(
    function() {
      targets::tar_make(
        script = .(tar_script),
        store = .(script_store),
        callr_function = NULL
      )
    }
  )),
  show = TRUE
)
#> • start target result
#> • built target result
#> • end pipeline: 0.059 seconds
#> NULL

tar_read(result, store = script_store)
#> [1] 2

writeLines("result_fn <- function(x) x + 2", func_file)

callr::r(
  eval(bquote(function() {
    targets::tar_make(
      script = .(tar_script),
      store = .(script_store),
      callr_function = NULL
    )
  })),
  show = TRUE
)
#> ✔ skip target result
#> ✔ skip pipeline: 0.047 seconds
#> NULL

tar_read(result, store = script_store)
#> [1] 2

# But does work if change detected in plan:
eval(bquote(
  tar_script(
    {
      source(.(func_file))
      list(tar_target(
        result,
        result_fn(1) * 1
      ))
    },
    script = tar_script,
    ask = FALSE
  )
))

callr::r(
  eval(bquote(function() {
    targets::tar_make(
      script = .(tar_script),
      store = .(script_store),
      callr_function = NULL
    )
  })),
  show = TRUE
)
#> • start target result
#> • built target result
#> • end pipeline: 0.06 seconds
#> NULL

tar_read(result, store = script_store)
#> [1] 3

unlink(working_dir, recursive = TRUE)

Created on 2022-08-03 by the reprex package (v2.0.1)

Expected result

The second time tar_read(result, store = script_store) is called, it should return 3, reflecting the change to result_fn written to the function source file.

Instead, we don't see this result until the third call, after the plan script is modified.

Diagnostic information

  • A reproducible example.
  • Session info, available through sessionInfo() or reprex(si = TRUE).
  • A stack trace from traceback() or rlang::trace_back().
  • The SHA-1 hash of the GitHub commit of targets currently installed. packageDescription("targets")$GithubSHA1 shows you this.
devtools::session_info()
─ Session info ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.1 (2022-06-23)
 os       Ubuntu 20.04.4 LTS
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  C.UTF-8
 ctype    C.UTF-8
 tz       Etc/UTC
 date     2022-08-03
 pandoc   2.5 @ /usr/bin/pandocPackages ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version    date (UTC) lib source
 assertthat    0.2.1      2019-03-21 [1] RSPM (R 4.2.0)
 backports     1.4.1      2021-12-13 [1] RSPM (R 4.2.0)
 base64url     1.4        2018-05-14 [1] CRAN (R 4.2.0)
 brio          1.1.3      2021-11-30 [1] RSPM (R 4.2.0)
 cachem        1.0.6      2021-08-19 [1] RSPM (R 4.2.0)
 callr         3.7.0      2021-04-20 [1] RSPM (R 4.2.0)
 cli           3.3.0      2022-04-25 [1] CRAN (R 4.2.0)
 codetools     0.2-18     2020-11-04 [4] CRAN (R 4.0.3)
 crayon        1.5.1      2022-03-26 [1] RSPM (R 4.2.0)
 data.table    1.14.2     2021-09-27 [1] RSPM
 desc          1.4.1      2022-03-06 [1] RSPM (R 4.2.0)
 devtools      2.4.3      2021-11-30 [1] RSPM (R 4.2.0)
 digest        0.6.29     2021-12-01 [1] RSPM (R 4.2.0)
 ellipsis      0.3.2      2021-04-29 [1] RSPM (R 4.2.0)
 fansi         1.0.3      2022-03-24 [1] RSPM (R 4.2.0)
 fastmap       1.1.0      2021-01-25 [1] RSPM (R 4.2.0)
 filelock      1.0.2      2018-10-05 [1] RSPM (R 4.2.0)
 fs            1.5.2      2021-12-08 [1] RSPM (R 4.2.0)
 glue          1.6.2      2022-02-24 [1] RSPM (R 4.2.0)
 igraph        1.3.1      2022-04-20 [1] CRAN (R 4.2.0)
 jsonlite      1.8.0      2022-02-22 [1] RSPM (R 4.2.0)
 keyring       1.3.0      2021-11-29 [1] RSPM (R 4.2.0)
 knitr         1.39       2022-04-26 [1] CRAN (R 4.2.1)
 lifecycle     1.0.1      2021-09-24 [1] RSPM (R 4.2.0)
 magrittr      2.0.3      2022-03-30 [1] RSPM (R 4.2.0)
 memoise       2.0.1      2021-11-26 [1] RSPM (R 4.2.0)
 paint         0.1.5      2022-05-09 [1] Github (milesmcbain/paint@0b27c48)
 pillar        1.7.0      2022-02-01 [1] RSPM (R 4.2.0)
 pkgbuild      1.3.1      2021-12-20 [1] RSPM (R 4.2.0)
 pkgconfig     2.0.3      2019-09-22 [1] RSPM (R 4.2.0)
 pkgload       1.2.4      2021-11-30 [1] RSPM (R 4.2.0)
 prettycode    1.1.0.9000 2022-06-10 [1] Github (milesmcbain/prettycode@0386eca)
 prettyunits   1.1.1      2020-01-24 [1] RSPM (R 4.2.0)
 processx      3.5.2      2021-04-30 [1] RSPM (R 4.2.0)
 prompt        1.0.1      2021-03-12 [1] RSPM (R 4.2.0)
 ps            1.6.0      2021-02-28 [1] RSPM (R 4.2.0)
 purrr         0.3.4      2020-04-17 [1] RSPM (R 4.2.0)
 R6            2.5.1      2021-08-19 [1] RSPM (R 4.2.0)
 rappdirs      0.3.3      2021-01-31 [1] RSPM (R 4.2.0)
 remotes       2.4.2      2021-11-30 [1] RSPM (R 4.2.0)
 rlang         1.0.3      2022-06-27 [1] RSPM (R 4.2.0)
 rprojroot     2.0.3      2022-04-02 [1] CRAN (R 4.2.0)
 sessioninfo   1.2.2      2021-12-06 [1] RSPM (R 4.2.0)
 sodium        1.2.0      2021-10-21 [1] RSPM (R 4.2.0)
 targets     * 0.12.1     2022-06-03 [1] RSPM (R 4.2.0)
 testthat      3.1.4      2022-04-26 [1] RSPM (R 4.2.0)
 tibble        3.1.7      2022-05-03 [1] CRAN (R 4.2.0)
 tidyselect    1.1.2      2022-02-21 [1] RSPM (R 4.2.0)
 usethis       2.1.5      2021-12-09 [1] CRAN (R 4.2.0)
 utf8          1.2.2      2021-07-24 [1] RSPM (R 4.2.0)
 vctrs         0.4.1      2022-04-13 [1] CRAN (R 4.2.0)
 withr         2.5.0      2022-03-03 [1] RSPM (R 4.2.0)
 xfun          0.30       2022-03-02 [1] RSPM (R 4.2.0)
 yaml          2.3.5      2022-02-21 [1] RSPM (R 4.2.0)

 [1] /home/ubuntu/R/x86_64-pc-linux-gnu-library/4.2
 [2] /usr/local/lib/R/site-library
 [3] /usr/lib/R/site-library
 [4] /usr/lib/R/library
@MilesMcBain
Author

Contrast with this one, where the callr::r wrapping is removed in the second run:

library(targets)

working_dir <- file.path(tempdir(), "targets_reprex")
dir.create(working_dir)
func_file <- file.path(working_dir, "func.R")
tar_script <- file.path(working_dir, "_targets.R")
script_store <- file.path(working_dir, "_targets/script")

file.create(func_file)
#> [1] TRUE
writeLines("result_fn <- function(x) x + 1", func_file)

eval(bquote(
  tar_script(
    {
      source(.(func_file))
      list(tar_target(
        result,
        result_fn(1)
      ))
    },
    script = tar_script,
    ask = FALSE
  )
))

callr::r(
  eval(bquote(
    function() {
      targets::tar_make(
        script = .(tar_script),
        store = .(script_store),
        callr_function = NULL
      )
    }
  )),
  show = TRUE
)
#> • start target result
#> • built target result
#> • end pipeline: 0.068 seconds
#> NULL

tar_read(result, store = script_store)
#> [1] 2

writeLines("result_fn <- function(x) x + 2", func_file)

targets::tar_make(
  script = tar_script,
  store = script_store,
  callr_function = NULL
)
#> • start target result
#> • built target result
#> • end pipeline: 0.058 seconds

tar_read(result, store = script_store)
#> [1] 3

unlink(working_dir, recursive = TRUE)

Created on 2022-08-03 by the reprex package (v2.0.1)

@wlandau
Member

wlandau commented Aug 3, 2022

It's because source() is somehow assigning the function to the wrong environment. It will work if you run eval(parse(text = readLines(script_file)), envir = tar_option_get("envir")) (or use tar_source(), which I added today). I don't consider this a bug because callr_function = NULL is only for debugging purposes (and, in some cases, tests) and is likely to break in similar environment-related ways in practice. I know you are looking for a way to insert options and environment variables from the session, and I would recommend a workaround that respects the R session encapsulation that targets is trying to enforce: maybe programmatically writing to _targets.R, or implementing something that depends on the project/_targets.yaml.
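
One possible reading of the "programmatically writing to _targets.R" suggestion is sketched below. It is an assumption about what such a workaround could look like, not a pattern taken from the targets documentation: the helper write_pipeline_script(), the parameter name increment, and the one-directory-per-run layout are all hypothetical. Each run gets its own script with the parameter written in as a literal, and tar_make() is called with its default callr_function, so every pipeline runs in a fresh R session.

```r
# Hypothetical helper: write a target script with one parameter value baked in.
write_pipeline_script <- function(path, func_file, increment) {
  lines <- c(
    "library(targets)",
    sprintf("source(%s)", deparse(func_file)),       # quoted, escaped path
    sprintf("increment <- %s", deparse(increment)),  # parameter as a literal
    "list(tar_target(result, result_fn(1) + increment))"
  )
  writeLines(lines, path)
  invisible(path)
}

working_dir <- file.path(tempdir(), "targets_param_runs")
dir.create(working_dir, showWarnings = FALSE)
func_file <- file.path(working_dir, "func.R")
writeLines("result_fn <- function(x) x + 1", func_file)

# Hypothetical parameter sets, one pipeline run (and one store) per entry.
increments <- c(low = 10, high = 20)
scripts <- character(0)

for (run in names(increments)) {
  run_dir <- file.path(working_dir, run)
  dir.create(run_dir, showWarnings = FALSE)
  script <- write_pipeline_script(
    file.path(run_dir, "_targets.R"), func_file, increments[[run]]
  )
  scripts[[run]] <- script
  # Run the pipeline only when targets is installed; the default
  # callr_function is kept, so each run starts in a clean session.
  if (requireNamespace("targets", quietly = TRUE)) {
    targets::tar_make(script = script, store = file.path(run_dir, "_targets"))
  }
}
```

Results for a given run could then be read with tar_read(result, store = file.path(working_dir, "low", "_targets")), using the same store argument that appears in the reprex above.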

@MilesMcBain
Author

Ahhh, thanks Will. I know there's that gotcha where source() always assigns the function in the top-level environment. Maybe that's part of it, although I would have thought that would work in this case.
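
The source() behaviour mentioned here can be checked in base R alone. The sketch below is a minimal illustration under stated assumptions: pipeline_env is a hypothetical stand-in for the environment returned by tar_option_get("envir"), and the file is a throwaway. It shows where definitions land, but it does not by itself establish why the callr::r-wrapped run skips the target.

```r
# Write a one-function file to a temporary location.
func_file <- tempfile(fileext = ".R")
writeLines("result_fn <- function(x) x + 2", func_file)

# Hypothetical stand-in for the environment from tar_option_get("envir").
pipeline_env <- new.env(parent = globalenv())

# Calling source() inside a function: with the default local = FALSE the
# definition lands in the global environment, not in the calling frame.
load_default <- function(path) {
  source(path)
  exists("result_fn", envir = environment(), inherits = FALSE)
}
in_caller <- load_default(func_file)
in_global <- exists("result_fn", envir = globalenv(), inherits = FALSE)

# Clean up the global copy before trying the explicit alternatives.
rm("result_fn", envir = globalenv())

# Evaluating the parsed file in an explicit environment puts the function there.
eval(parse(text = readLines(func_file)), envir = pipeline_env)
in_pipeline <- exists("result_fn", envir = pipeline_env, inherits = FALSE)

# source(local = <environment>) is the base R equivalent.
other_env <- new.env(parent = globalenv())
source(func_file, local = other_env)
other_env$result_fn(1)
```

Here in_caller is FALSE while in_global and in_pipeline are TRUE, which matches the documented default of source(local = FALSE).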
