future_lapply fails after Rfast is used #638

Closed
ChristophH opened this issue Jul 29, 2022 · 7 comments
@ChristophH

ChristophH commented Jul 29, 2022

Describe the bug

After using Rfast, future_lapply fails with Error: C stack usage ... is too close to the limit. I'm not sure whether this is a future, future.apply, or Rfast problem. The problem was previously mentioned in Rfast issue #5.

Reproduce example

future.apply::future_lapply(1:4, sum)  # works
Rfast::colsums(x = matrix(1:12, 3, 4))  # now use Rfast (any function will do)
future.apply::future_lapply(1:4, sum)  # Error: C stack usage

Expected behavior

The second call to future.apply::future_lapply should return the same values as the first one.

Session information

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] Rfast_2.0.4        compiler_4.1.2     parallelly_1.30.0  RcppZiggurat_0.1.6
 [5] tools_4.1.2        parallel_4.1.2     listenv_0.8.0      future.apply_1.8.1
 [9] Rcpp_1.0.8         codetools_0.2-18   digest_0.6.29      globals_0.14.0
[13] future_1.23.0
> future::futureSessionInfo()
*** Package versions
future 1.23.0, parallelly 1.30.0, parallel 4.1.2, globals 0.14.0, listenv 0.8.0

*** Allocations
availableCores():
system  nproc
     2      2
availableWorkers():
$system
[1] "localhost" "localhost"

*** Settings
- future.plan=<no set>
- future.fork.multithreading.enable=<no set>
- future.globals.maxSize=<no set>
- future.globals.onReference=<no set>
- future.resolve.recursive=<no set>
- future.rng.onMisuse=<no set>
- future.wait.timeout=<no set>
- future.wait.interval=<no set>
- future.wait.alpha=<no set>
- future.startup.script=<no set>

*** Backends
Number of workers: 1List of future strategies:
1. sequential:
   - args: function (..., envir = parent.frame())
   - tweaked: FALSE
   - call: NULL
*** Basic tests
  worker pid     r sysname           release
1      1  54 4.1.2   Linux 5.15.0-41-generic
                                              version     nodename machine
1 #44~20.04.1-Ubuntu SMP Fri Jun 24 13:27:29 UTC 2022 f9be587bfc01  x86_64
    login        user effective_user
1 unknown xxx    xxx
Number of unique PIDs: 1 (as expected)
ChristophH added the bug label Jul 29, 2022
@HenrikBengtsson
Collaborator

Thanks for reporting. Before anything else, can you please verify that it still happens with an up-to-date version of future? Version 1.23.0 is quite old.

Also, what does traceback() report if you call it immediately after getting the error?

@ChristophH
Author

Same problem after updating. traceback() shows hundreds of lines with .print_helper_for_environment (duplicate lines removed below):

> traceback()
859: match.fun(FUN)
858: lapply(list(...), as.character)
857: gettext(fmt, domain = domain)
856: sprintf(gettext(fmt, domain = domain), ...)
855: gettextf("%s converted to character string", sQuote(name))
854: warning(gettextf("%s converted to character string", sQuote(name)),
         domain = NA)
853: ls(x, all.names = all.names)
852: .print_helper_for_environment(val, all.names, count_depth + 1)
...
13: .print_helper_for_environment(val, all.names, count_depth + 1)
12: .print_helper_for_environment(x, all.names, 0)
11: print.environment(env)
10: print(env)
9: withVisible(...elt(i))
8: capture.output(print(env))
7: FUN(X[[i]], ...)
6: vapply(where, FUN = envname, FUN.VALUE = NA_character_, USE.NAMES = FALSE)
5: globalsOf(expr, envir = envir, substitute = FALSE, tweak = tweak,
       locals = locals, dotdotdot = "return", method = globals.method,
       unlist = TRUE, mustExist = mustExist, recursive = TRUE)
4: getGlobalsAndPackages(expr, envir = envir, globals = globals)
3: getGlobalsAndPackagesXApply(FUN = FUN, args = args, MoreArgs = MoreArgs,
       envir = envir, future.globals = future.globals, future.packages = future.packages,
       debug = debug)
2: future_xapply(FUN = FUN, nX = nX, chunk_args = X, args = list(...),
       get_chunk = `[`, expr = expr, envir = envir, future.envir = future.envir,
       future.globals = future.globals, future.packages = future.packages,
       future.scheduling = future.scheduling, future.chunk.size = future.chunk.size,
       future.stdout = future.stdout, future.conditions = future.conditions,
       future.seed = future.seed, future.label = future.label, fcn_name = fcn_name,
       args_name = args_name, debug = debug)
1: future.apply::future_lapply(1:4, sum)

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] Rfast_2.0.4        compiler_4.1.2     parallelly_1.32.1  RcppZiggurat_0.1.6
 [5] tools_4.1.2        parallel_4.1.2     listenv_0.8.0      future.apply_1.9.0
 [9] Rcpp_1.0.8         codetools_0.2-18   digest_0.6.29      globals_0.15.1
[13] future_1.27.0

> future::futureSessionInfo()
*** Package versions
future 1.27.0, parallelly 1.32.1, parallel 4.1.2, globals 0.15.1, listenv 0.8.0

*** Allocations
availableCores():
        system cgroups.cpuset          nproc
             2              2              2
availableWorkers():
$system
[1] "localhost" "localhost"


*** Settings
- future.plan=<not set>
- future.fork.multithreading.enable=<not set>
- future.globals.maxSize=<not set>
- future.globals.onReference=<not set>
- future.resolve.recursive=<not set>
- future.rng.onMisuse=<not set>
- future.wait.timeout=<not set>
- future.wait.interval=<not set>
- future.wait.alpha=<not set>
- future.startup.script=<not set>

*** Backends
Number of workers: 1
List of future strategies:
1. sequential:
   - args: function (..., envir = parent.frame())
   - tweaked: FALSE
   - call: NULL

*** Basic tests
Main R session details:
  pid     r sysname           release
1 509 4.1.2   Linux 5.15.0-41-generic
                                              version nodename machine   login
1 #44~20.04.1-Ubuntu SMP Fri Jun 24 13:27:29 UTC 2022  host001  x86_64 user002
     user effective_user
1 user001        user001
Worker R session details:
  worker pid     r sysname           release
1      1 509 4.1.2   Linux 5.15.0-41-generic
                                              version nodename machine   login
1 #44~20.04.1-Ubuntu SMP Fri Jun 24 13:27:29 UTC 2022  host001  x86_64 user002
     user effective_user
1 user001        user001
Number of unique worker PIDs: 1 (as expected)

@HenrikBengtsson
Collaborator

Thanks. However, I cannot reproduce this:

$ R --vanilla

R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
...

> y <- future.apply::future_lapply(1:4, FUN = sum)
> str(y)
List of 4
 $ : int 1
 $ : int 2
 $ : int 3
 $ : int 4
 
> dummy <- Rfast::colsums(x = matrix(1:12, nrow = 3, ncol = 4))
 
> y <- future.apply::future_lapply(1:4, FUN = sum)
> str(y)
List of 4
 $ : int 1
 $ : int 2
 $ : int 3
 $ : int 4
> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /home/hb/shared/software/CBI/R-4.2.1-gcc9/lib/R/lib/libRblas.so
LAPACK: /home/hb/shared/software/CBI/R-4.2.1-gcc9/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] future.apply_1.9.0 future_1.27.0     

loaded via a namespace (and not attached):
 [1] Rfast_2.0.6        compiler_4.2.1     parallelly_1.32.1  RcppZiggurat_0.1.6
 [5] tools_4.2.1        parallel_4.2.1     listenv_0.8.0      Rcpp_1.0.9        
 [9] codetools_0.2-18   digest_0.6.29      globals_0.15.1    
> 

Maybe try to update Rfast too?

@ChristophH
Author

Thank you for looking into this! I can confirm that after also updating Rfast to 2.0.6 the example does not cause any errors. 🎉
I had tried using that version of Rfast before, but this was before updating the future-related packages. Maybe I had a bad combination of package versions 🤷 I will go back to the code that originally triggered this error. If everything works as expected with this updated combination of packages, I will close the issue. Thanks again!

@HenrikBengtsson
Collaborator

It looks like Rfast used to define print.environment() and .print_helper_for_environment() functions, so that was probably the problem.
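To illustrate the suspected mechanism, here is a minimal sketch, assuming a package registers a recursive `print.environment()` method. The method below is a hypothetical stand-in, not the code Rfast shipped, and its `depth`/`max_depth` arguments are invented so that the example stops instead of exhausting the C stack. Because `print()` is an S3 generic, `capture.output(print(env))` dispatches to whatever `print.environment()` method is visible, and a recursive method keeps descending when an environment is reachable from itself.

```r
# Hypothetical recursive printer for environments; NOT Rfast's actual code.
# The depth guard is an assumption added so that this sketch terminates.
print.environment <- function(x, ..., depth = 0L, max_depth = 5L) {
  if (depth >= max_depth) {
    cat(strrep("  ", depth), "... (stopped at depth ", depth, ")\n", sep = "")
    return(invisible(x))
  }
  for (name in ls(x)) {
    val <- get(name, envir = x)
    cat(strrep("  ", depth), name, "\n", sep = "")
    # Recurse into nested environments, one level deeper each time
    if (is.environment(val)) {
      print.environment(val, depth = depth + 1L, max_depth = max_depth)
    }
  }
  invisible(x)
}

env <- new.env()
env$self <- env  # an environment that contains a reference to itself

# print() is an S3 generic, so this call ends up in print.environment() above
out <- capture.output(print(env))
writeLines(out)

# Clean up so the method does not affect the rest of the session
rm(print.environment)
```

Without a depth guard, a method like this would recurse until R reports that C stack usage is too close to the limit, which is consistent with the long run of `.print_helper_for_environment()` frames in the traceback.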

HenrikBengtsson added a commit that referenced this issue Jul 29, 2022
…ase there's a print.environment() function defined by some package [#638]
HenrikBengtsson added a commit to futureverse/globals that referenced this issue Jul 29, 2022
…nt(env), just in case there's a print.environment() function defined by some package (futureverse/future#638)
@ChristophH
Author

Closing this issue since everything is working fine now.

Just a note on why I failed to realize that the newest version of Rfast does not trigger the error: I had installed Rfast 2.0.6 in a separate directory, while the default package search path still included 2.0.4. In my script I used .libPaths() to make sure 2.0.6 would come before 2.0.4. The problem was that I used RStudio with code completion turned on: when the file was open in the editor (before .libPaths() was executed), RStudio automatically loaded the namespace for 2.0.4, so 2.0.6 was never used. I should have checked the session info. Note to self: be aware of RStudio's auto-loading of namespaces!
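One way to catch this kind of mix-up is to ask the running session which copy of a package is actually loaded. The sketch below uses only base R functions; `loaded_from()` is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper (an assumption for illustration): report the version
# and install path of a namespace that is already loaded in this session.
loaded_from <- function(pkg) {
  if (!isNamespaceLoaded(pkg)) {
    # Nothing loaded yet, so a later library()/loadNamespace() call will
    # still honour the current .libPaths() order
    return(list(package = pkg, loaded = FALSE,
                version = NA_character_, path = NA_character_))
  }
  list(
    package = pkg,
    loaded  = TRUE,
    version = as.character(getNamespaceVersion(pkg)),
    path    = getNamespaceInfo(pkg, "path")
  )
}

# Example with a package that ships with R
str(loaded_from("stats"))
```

Calling `loaded_from("Rfast")` right before changing `.libPaths()` would show whether an older copy is already loaded. Once a namespace is loaded, changing `.libPaths()` does not swap it out.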

@HenrikBengtsson
Collaborator

FWIW, I've made a small change to the upcoming version of globals that would have avoided this problem in Rfast. Hopefully, that'll lower the risk for something similar happening again.
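The idea behind that change, as described in the referenced commit message, is to call `print.default(env)` instead of `print(env)`. A minimal sketch follows, assuming a helper that turns an environment into a text label; `env_label()` is a hypothetical name and this is not the actual globals implementation.

```r
# Hypothetical helper, loosely modeled on the idea in the commit message;
# not the real code in globals.
env_label <- function(env) {
  stopifnot(is.environment(env))
  # Calling print.default() directly bypasses S3 dispatch, so a
  # print.environment() method defined elsewhere is never invoked
  capture.output(print.default(env))
}

# A misbehaving method, defined only to show that it is not reached
print.environment <- function(x, ...) {
  stop("custom print.environment() should not be reached")
}

label <- env_label(globalenv())
print(label)

# Clean up the demonstration method
rm(print.environment)
```

The helper stays independent of whatever `print()` methods other packages happen to register for environments.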

clrpackages pushed a commit to clearlinux-pkgs/R-globals that referenced this issue Aug 9, 2022
…n 0.16.0

Henrik Bengtsson (15):
      Bump develop version [ci skip]
      Add internal assign_Globals() and new, exported [[<- (fix #81)
      globalsByName() won't set Globals class until the very end in order to avoid calling [[<- for Globals [#81]
      REPRODUCIBILITY: 'where' and 'class' are always the last two attributes and in that order
      Add [<- for Globals (fix #82)
      REVDEP: 371 reverse dependencies (350 from CRAN + 21 from Bioconductor) [ci skip]
      TESTS: More corner cases for new [<- function, e.g. assign with NULLs
      BUG FIX: c() for Globals would lose the 'where' environment for any functions appended (fix #83)
      REVDEP: 374 reverse dependencies (352 from CRAN + 22 from Bioconductor) [ci skip]
      REVDEP: 375 first- & second-order reverse dependencies (353 from CRAN + 22 from Bioconductor) [ci skip]
      NEWS: tweak
      REVDEP: Check 384 reverse dependencies (363 from CRAN + 21 from Bioconductor) first- and second-order dependencies [ci skip]
      ROBUSTNESS: Internal envname() uses print.default(env) instead of print(env), just in case there's a print.environment() function defined by some package (futureverse/future#638)
      REVDEP: 384 reverse dependencies (363 from CRAN + 21 from Bioconductor) [ci skip]
      globals 0.16.0