Fail to make final infercnv heatmap #503
Hi @Cristinex , How did you go about raising the C stack limit? It is surprising that the error only occurs on the final plot and not during the preliminary plot. When this error occurs during plotting, it is due to the "as.dendrogram()" call that converts the hclust of cells into a dendrogram to plot on the left side of the figure. That method is part of base R and uses recursion, so it can run out of stack if there are too many branches in the tree. I will see if I can write a substitute hclust-to-dendrogram method that doesn't use recursion, but there are other things I am currently working on that have priority. Regards,
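To make the mechanism concrete, here is a toy illustration with synthetic data (not infercnv output) of the same conversion; on a tree with tens of thousands of leaves the final call is what fails:

```r
# hclust -> dendrogram conversion described above; as.dendrogram() walks the
# tree recursively, so very large trees can exhaust the C stack
# ("Error: node stack overflow").
set.seed(1)
m    <- matrix(rnorm(200), ncol = 2)       # 100 toy "cells"
hc   <- hclust(dist(m), method = "ward.D2")
dend <- as.dendrogram(hc)                  # the recursive call in question
```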
Hi, @GeorgescuC First of all, thank you for your useful tool! I also ran into this problem in my run.
I tried your methods from #196 but they did not work.
I found your answer in #491, but I did not know where the tree() code is. My cells have already been clustered. Can I run in batches with different clusters while using the same reference? Best wishes to you!
Hi @GeorgescuC, I appreciate your help and reply a lot! I set ulimit with ulimit -s unlimited or ulimit -s 32768 at the very beginning of the run. Then I opened R to set the options. No plot was made during the whole process, not even the preliminary plot you expected. InferCNV is a great and important tool. Thank you for developing and maintaining it!!
Hi @fpengstudy and @Cristinex , Just to make sure for @fpengstudy , the "ulimit -s unlimited" command needs to be run in a terminal before starting the R session for it to be taken into account. For both of you, if you simply run "ulimit -s" in the terminal, that should output the current limit. Is the limit reported the one you set? The ulimit command may silently fail to change the setting or cap the maximum value depending on the permissions on the machine you are using, or if you are running in a container such as Docker. A potential workaround to get usable plots would be to use the "plot_per_group()" method: it allows you to plot each of your samples/annotations on a separate plot, which reduces the maximum size of the trees that infercnv needs to work with. If that alone is not enough, the method includes options to sample results based on a threshold and a sampling rate to further reduce groups that would be too big (a sketch of such a call follows below). Regards,
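A minimal sketch of that workaround, assuming the run's output folder is `output_dir` and the saved object filename `run.final.infercnv_obj` used elsewhere in this thread; check `?plot_per_group` in your installed version for the exact names of the sampling-related arguments:

```r
library(infercnv)

output_dir <- "output_dir"  # placeholder: folder of the run that failed only at plotting

# Load the final object written at the end of the run
final_obj <- readRDS(file.path(output_dir, "run.final.infercnv_obj"))

# One heatmap per annotation group instead of a single huge figure; if a group
# is still too large, the sampling options documented in ?plot_per_group
# (a size threshold and a sampling rate) can reduce it further.
plot_per_group(final_obj, out_dir = output_dir)
```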
Hi! @GeorgescuC
Hi @fpengstudy & @GeorgescuC, the plot_per_group() function worked for my reference group. So, @fpengstudy, maybe it was a cell-number issue in your case. However, the plot I got was exactly the same as the one in issue #496, and I think that is also the cause of the large number of CNVs and the large stack requirement. Maybe I will try running again with my leiden clustering scoring function set back to modularity, or change the current "CPM" threshold to 0.05. Regards,
Hi @fpengstudy and @Cristinex , @fpengstudy what does running Cstack_info() return? @Cristinex can you check how many subclusters you have (the length of the subclusters list in the infercnv object)? To inspect the subclusters, the following code generates a modified infercnv object that assigns subclusters to annotations so they are easy to visualize:
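The code block originally posted here was not preserved; the following is only a rough sketch of that idea, assuming the standard infercnv object slots (`@tumor_subclusters$subclusters` holding per-group lists of cell indices, `@observation_grouped_cell_indices` holding the annotation groups):

```r
library(infercnv)

output_dir <- "output_dir"  # placeholder: folder of the earlier run
final_obj  <- readRDS(file.path(output_dir, "run.final.infercnv_obj"))

# Copy the object and turn every subcluster into its own annotation group,
# so that plotting shows subcluster membership directly.
subclustered_obj <- final_obj
subclustered_obj@observation_grouped_cell_indices <-
    unlist(final_obj@tumor_subclusters$subclusters, recursive = FALSE)

# Plot the modified object: one row annotation per subcluster.
plot_cnv(subclustered_obj,
         out_dir = output_dir,
         output_filename = "infercnv.subclusters",
         cluster_by_groups = TRUE)
```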
You can then plot that modified object the same way as usual. Regards,
Hi! @GeorgescuC, here is what I ran:

```r
Cstack_info()
options(error = function() traceback(2))
```
Hi @fpengstudy , I have not seen "NA" as a return from "Cstack_info()" before. What platform are you running R on? Can you try installing the attempted fix I made on GitHub using the following?
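The install command was not preserved above; assuming the fix lives on the `test_fix_plot_recursion` branch named in the zip link further down this thread, installing it would look something like:

```r
# Install the patched infercnv from the fix branch (requires the 'remotes' package)
install.packages("remotes")
remotes::install_github("broadinstitute/infercnv", ref = "test_fix_plot_recursion")
```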
I made a small edit to disable writing the hclust to file as Newick strings, which requires conversion to a phylo object; that conversion should be the call that results in the recursion. Regards,
Hi @GeorgescuC , Besides, when I tried to run as in #496 , a new issue popped up (TAT) at step 15:
Best Regards,
Hi @Cristinex , For the length check, I forgot to say that you should first load the backup object, because when run() errors it does not return the modified object, so you still only have the newly created object in your R session. With output_dir being the folder of the first run that errored only during plotting, you can run the following:
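The exact snippet was not preserved here; a sketch of what it likely looks like, with `output_dir` standing for that run's output folder and the slot path taken from the infercnv object structure:

```r
library(infercnv)

output_dir <- "output_dir"  # placeholder: folder of the run that errored only at plotting

# Load the object saved at the end of that run
final_obj <- readRDS(file.path(output_dir, "run.final.infercnv_obj"))

# Number of annotation groups that were subclustered ...
length(final_obj@tumor_subclusters$subclusters)

# ... and the total number of subclusters across all groups
length(unlist(final_obj@tumor_subclusters$subclusters, recursive = FALSE))
```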
You can also use this final_obj to try the plotting again. For the new error you posted, it seems there is an issue with running the subclustering on the hspike (the calibration data for the HMM). I am not sure why it is happening with your combination of settings, but I made a change that should prevent any issues from arising there. Can you update your infercnv install with the same branch install command as above?
Then run infercnv again? Regards,
Hi @GeorgescuC, Thank you so much for your reply. Is there downloadable code I could use to reinstall? I am currently in a place where installing directly from GitHub is blocked... Regards,
Hi @Cristinex , You can download the code directly from the GitHub website on the same branch if that is an option: either git clone the repo and then checkout the branch, or download the repo as a zip archive. Zip link: https://github.com/broadinstitute/infercnv/archive/refs/heads/test_fix_plot_recursion.zip After downloading the code (and extracting it if needed), move to the root of the repo, open an R session, and run the install command (a sketch follows below). Regards,
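The exact command was not preserved above; from the root of the extracted source, either of the following should install the local copy:

```r
# Install the package from the current directory (the extracted repo root)
install.packages(".", repos = NULL, type = "source")

# or, if devtools is available:
# devtools::install(".")
```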
Hi @GeorgescuC, After I reinstalled the package, it proceeded to further steps. However, the memory problem still exists, which suggests something is wrong with the output. The length of subclusters is 9, equal to the total number of cell types across the references and the tumors.
Best Regards,
Hi @Cristinex , This is one of the weird bugs that occasionally occur... There seems to be a bug somewhere in the code that the R
Regards,
Hi @GeorgescuC,
I also tried your method of downloading the zip, but the error still occurred. Kind regards,
Hi @GeorgescuC, I cannot thank you enough for your quick help and useful suggestions. Things ran well after reinstalling and using Best Wishes,
Hi @fpengstudy , I am not sure what is happening there. The issue seems to be with loading RcppAnnoy, which we do not use directly but which is used by our dependencies. Regards,
Hi @GeorgescuC, Thank you for maintaining infercnv!
Kind Regards,
Hi @Cristinex , The "...dendrogram.txt" is the output I disabled for the attempt to work around the node stack overflow error. You can try running the snippet of code taken from I looked more into where exactly the issue would be arising from in the If you want to use the phylogeny within R, you can replace the Regards, |
Hi @GeorgescuC, Thank you for your reply. I used the lines I selected from your scripts and somehow smoothly generated an "infercnv.observation.dendrogram.txt" from "run.final_infercnv_obj". I am not sure whether it was the right one, though the dendrogram plotted from it was the same as the one on the heatmap drawn from run.final_infercnv_obj. In addition, it was still not possible to run the whole process through step 22 unless setting
Best Regards,
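To compare that file against the heatmap's dendrogram, it can be read back and plotted with ape; adjust the filename to whatever was written in the previous step:

```r
library(ape)

output_dir <- "output_dir"  # placeholder: folder containing the dendrogram file

# Read the Newick string and plot the tree; tip labels (cell barcodes) are
# hidden because there are far too many to display legibly.
tree <- read.tree(file.path(output_dir, "infercnv.observations_dendrogram.txt"))
plot(tree, show.tip.label = FALSE)
```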
Hi @Cristinex , The "run.final_infercnv_obj" file is written when the run finishes, so as the run whose results you want finished, it should be the latest version every time. Did you get that error while using the Regards, |
Hi @GeorgescuC, I got the error while using Kind Regards,
Hi @Cristinex , Looking into the code, it turns out steps 18 and 19 call plot_cnv() through a different method, and that path was not covered by the edit I made, which is why the error can still occur there. Regards,
Hi Christophe, I tried both infercnv 1.16.0 from Bioconductor and infercnv 1.15.1 (the version shown in the sessionInfo below). I tried the default settings as well as raising options(expressions); here is the session and the error:

```
> options(expressions=10000)
> options(error = function() traceback(2))
> Cstack_info()
size current direction eval_depth
NA NA 1 2
> plot_cnv(run.final.infercnv_obj, out_dir='.', out)
INFO [2023-08-28 14:42:18] ::plot_cnv:Start
INFO [2023-08-28 14:42:18] ::plot_cnv:Current data dimensions (r,c)=8873,45044 Total=400940222.339581 Min=0.755879146783192 Max=2.05344326007797.
INFO [2023-08-28 14:42:20] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2023-08-28 14:42:31] plot_cnv(): auto thresholding at: (0.887767 , 1.118562)
INFO [2023-08-28 14:42:40] plot_cnv_observation:Start
INFO [2023-08-28 14:42:40] Observation data size: Cells= 41544 Genes= 8873
Error: node stack overflow
Error during wrapup: node stack overflow
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Berlin
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] infercnv_1.15.1 workflowr_1.7.0
loaded via a namespace (and not attached):
[1] RcppAnnoy_0.0.21 splines_4.3.1
[3] later_1.3.1 bitops_1.0-7
[5] tibble_3.2.1 polyclip_1.10-4
[7] lifecycle_1.0.3 fastcluster_1.2.3
[9] edgeR_3.42.4 doParallel_1.0.17
[11] rprojroot_2.0.3 globals_0.16.2
[13] processx_3.8.1 lattice_0.21-8
[15] MASS_7.3-60 magrittr_2.0.3
[17] limma_3.56.2 plotly_4.10.2
[19] rmarkdown_2.22 yaml_2.3.7
[21] httpuv_1.6.11 Seurat_4.3.0.1
[23] sctransform_0.3.5 sp_1.6-1
[25] spatstat.sparse_3.0-1 reticulate_1.30
[27] cowplot_1.1.1 pbapply_1.7-2
[29] RColorBrewer_1.1-3 multcomp_1.4-23
[31] abind_1.4-5 zlibbioc_1.46.0
[33] Rtsne_0.16 GenomicRanges_1.52.0
[35] purrr_1.0.1 BiocGenerics_0.46.0
[37] RCurl_1.98-1.12 TH.data_1.1-2
[39] sandwich_3.0-2 git2r_0.32.0
[41] GenomeInfoDbData_1.2.10 IRanges_2.34.1
[43] S4Vectors_0.38.1 ggrepel_0.9.3
[45] irlba_2.3.5.1 listenv_0.9.0
[47] spatstat.utils_3.0-3 goftest_1.2-3
[49] spatstat.random_3.1-5 fitdistrplus_1.1-11
[51] parallelly_1.36.0 leiden_0.4.3
[53] codetools_0.2-19 coin_1.4-2
[55] DelayedArray_0.26.7 tidyselect_1.2.0
[57] futile.logger_1.4.3 rjags_4-14
[59] matrixStats_1.0.0 stats4_4.3.1
[61] spatstat.explore_3.2-1 jsonlite_1.8.5
[63] ellipsis_0.3.2 progressr_0.14.0
[65] ggridges_0.5.4 survival_3.5-5
[67] iterators_1.0.14 foreach_1.5.2
[69] tools_4.3.1 ica_1.0-3
[71] Rcpp_1.0.10 glue_1.6.2
[73] gridExtra_2.3 xfun_0.39
[75] MatrixGenerics_1.12.3 GenomeInfoDb_1.36.1
[77] dplyr_1.1.2 formatR_1.14
[79] fastmap_1.1.1 fansi_1.0.4
[81] callr_3.7.3 caTools_1.18.2
[83] digest_0.6.31 parallelDist_0.2.6
[85] R6_2.5.1 mime_0.12
[87] colorspace_2.1-0 scattermore_1.2
[89] gtools_3.9.4 tensor_1.5
[91] spatstat.data_3.0-1 utf8_1.2.3
[93] tidyr_1.3.0 generics_0.1.3
[95] data.table_1.14.8 httr_1.4.6
[97] htmlwidgets_1.6.2 S4Arrays_1.0.5
[99] whisker_0.4.1 uwot_0.1.16
[101] pkgconfig_2.0.3 gtable_0.3.3
[103] modeltools_0.2-23 lmtest_0.9-40
[105] SingleCellExperiment_1.22.0 XVector_0.40.0
[107] htmltools_0.5.5 SeuratObject_4.1.3
[109] scales_1.2.1 Biobase_2.60.0
[111] png_0.1-8 phyclust_0.1-33
[113] knitr_1.43 lambda.r_1.2.4
[115] rstudioapi_0.14 reshape2_1.4.4
[117] coda_0.19-4 nlme_3.1-162
[119] zoo_1.8-12 stringr_1.5.0
[121] KernSmooth_2.23-21 parallel_4.3.1
[123] miniUI_0.1.1.1 libcoin_1.0-9
[125] pillar_1.9.0 grid_4.3.1
[127] vctrs_0.6.2 gplots_3.1.3
[129] RANN_2.6.1 promises_1.2.0.1
[131] xtable_1.8-4 cluster_2.1.4
[133] evaluate_0.21 locfit_1.5-9.8
[135] mvtnorm_1.2-2 cli_3.6.1
[137] compiler_4.3.1 futile.options_1.0.1
[139] rlang_1.1.1 crayon_1.5.2
[141] future.apply_1.11.0 ps_1.7.5
[143] getPass_0.2-2 argparse_2.2.2
[145] plyr_1.8.8 fs_1.6.2
[147] stringi_1.7.12 viridisLite_0.4.2
[149] deldir_1.0-9 munsell_0.5.0
[151] lazyeval_0.2.2 spatstat.geom_3.2-1
[153] Matrix_1.6-1 patchwork_1.1.3
[155] future_1.32.0 ggplot2_3.4.2
[157] shiny_1.7.4 SummarizedExperiment_1.30.2
[159] ROCR_1.0-11 igraph_1.4.3
[161] RcppParallel_5.1.7          ape_5.7-1
```

I was wondering if there is any workaround for this. Thank you for your help. Best regards,
Good day to you!
Here is the running log. I tried to raise the C stack limit but it still fails. I saw there are other posts about this problem (#491). I don't think my dataset is a big one by today's standards. Maybe the developers should really look into this problem. ^ ^
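One quick check before re-running: confirm inside R that the raised limit actually took effect (the `ulimit -s unlimited` call has to happen in the same terminal, before R is started):

```r
# If "size" is still the default (~8 MB) or NA, the shell limit did not reach
# this R session; raise it in the terminal and start R again.
Cstack_info()
```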