Error Encountered during Step 15 "Error in paste(sig, collapse = "#") : (converted from warning) NAs introduced by coercion to integer range" #426

Open
bhabakd opened this issue Jun 13, 2022 · 16 comments

Comments

@bhabakd

bhabakd commented Jun 13, 2022

Hi
I am new to infercnv.
While attempting to run the analysis with analysis_mode = "subclusters", I encounter the following error during Step 15:

STEP 15: computing tumor subclusters via leiden

INFO [2022-06-12 20:26:56] define_signif_tumor_subclusters(p_val=0.05
INFO [2022-06-12 20:26:56] define_signif_tumor_subclusters(), tumor: M
Error in paste(sig, collapse = "#") :
(converted from warning) NAs introduced by coercion to integer range

15: (function ()
traceback(2))()
14: doWithOneRestart(return(expr), restart)
13: withOneRestart(expr, restarts[[1L]])
12: withRestarts({
.Internal(.signalCondition(simpleWarning(msg, call), msg,
call))
.Internal(.dfltWarn(msg, call))
}, muffleWarning = function() NULL)
11: .signalSimpleWarning("NAs introduced by coercion to integer range",
base::quote(paste(sig, collapse = "#")))
10: paste(sig, collapse = "#")
9: .sigLabel(signature)
8: .findMethodInTable(c(from, to), methods)
7: .quickCoerceSelect(thisClass, Class, coerceFun, coerceMethods,
where)
6: as(object, "dgCMatrix")
5: leiden.Matrix(sparse_adjacency_matrix, resolution_parameter = leiden_resolution)
4: leiden(sparse_adjacency_matrix, resolution_parameter = leiden_resolution)
3: .single_tumor_leiden_subclustering(tumor_group = tumor_group,
tumor_group_idx = tumor_group_idx, tumor_expr_data = tumor_expr_data,
k_nn = k_nn, leiden_resolution = leiden_resolution, hclust_method = hclust_method)
2: define_signif_tumor_subclusters(infercnv_obj, p_val = tumor_subcluster_pval,
k_nn = k_nn, leiden_resolution = leiden_resolution, hclust_method = hclust_method,
cluster_by_groups = cluster_by_groups, partition_method = tumor_subcluster_partition_method)
1: infercnv::run(T99_infercnv_obj, analysis_mode = "subclusters",
cutoff = 0.1, out_dir = tempfile(), cluster_by_groups = TRUE,
plot_steps = FALSE, denoise = TRUE, HMM = TRUE, num_threads = 10,
tumor_subcluster_pval = 0.05, no_plot = TRUE)

This is the command I use after creating the infercnv object
S99_infercnv_obj <- infercnv::run(S99_infercnv_obj, analysis_mode = 'subclusters',
cutoff=0.1, # cutoff=1 works well for Smart-seq2, and cutoff=0.1 works well for 10x Genomics
out_dir=tempfile(),
cluster_by_groups=TRUE,
plot_steps = FALSE,
denoise=TRUE,
HMM=TRUE,
num_threads = 10,
tumor_subcluster_pval = 0.05,
no_plot = TRUE)

The analysis is being performed on a count matrix generated from a Seurat object using GetAssayData(obj, slot = "counts").
Cells have been annotated as either malignant (M) or non-malignant (NM), with 492 cells of the former and 4190 of the latter.

Interestingly, the command works fine without analysis_mode = 'subclusters'.

@iS4i4S

iS4i4S commented Jun 16, 2022

Hi,

Change the method for the subclusters to:

tumor_subcluster_partition_method='random_trees'

That should fix it

Good luck

@GeorgescuC
Collaborator

Hi @bhabakd @iS4i4S ,

Unfortunately from looking into it, this seems to be an issue coming from the Leiden package and/or some of its dependencies (such as igraph). I get the same warnings when running the following minimal Leiden example:

library(leiden)

adjacency_matrix <- rbind(cbind(matrix(round(rbinom(400, 1, 0.8)), 20, 20),
                                matrix(round(rbinom(400, 1, 0.3)), 20, 20), 
                                matrix(round(rbinom(400, 1, 0.1)), 20, 20)),
                          cbind(matrix(round(rbinom(400, 1, 0.3)), 20, 20), 
                                matrix(round(rbinom(400, 1, 0.8)), 20, 20), 
                                matrix(round(rbinom(400, 1, 0.2)), 20, 20)),
                          cbind(matrix(round(rbinom(400, 1, 0.3)), 20, 20), 
                                matrix(round(rbinom(400, 1, 0.1)), 20, 20), 
                                matrix(round(rbinom(400, 1, 0.9)), 20, 20)))

leiden(adjacency_matrix)

These warnings have existed for some time, but in R 4.2 they appear to be converted to errors. The workaround for now is to wrap the infercnv::run() call in suppressWarnings().
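As a rough illustration of why the workaround helps, here is a minimal base-R sketch. It assumes the failure comes from warnings being promoted to errors (options(warn = 2)), which is what the "(converted from warning)" prefix in the error message suggests. The coercion call is only a stand-in for the Leiden step, not infercnv code.

```r
# Promote warnings to errors, mimicking the "(converted from warning)" failure.
old_opts <- options(warn = 2)

# Unprotected: the coercion warning is raised as an error, which we capture here.
unprotected <- tryCatch(
  as.integer("not a number"),
  error = function(e) conditionMessage(e)
)

# Protected: suppressWarnings() muffles the warning before it can be promoted.
protected <- suppressWarnings(as.integer("not a number"))

# Restore the previous warning level.
options(old_opts)

print(unprotected)
print(protected)
```

The same wrapping applies to the real call, e.g. suppressWarnings(infercnv::run(...)) with the arguments shown in the original post.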

Regards,
Christophe.

@GeorgescuC GeorgescuC pinned this issue Jun 16, 2022
@bhabakd
Author

bhabakd commented Jun 18, 2022

@iS4i4S @GeorgescuC
Thank you for your suggestions.
I tried @iS4i4S's recommendation first, and this is what I got after nearly 18 hours of computing.

Error in file(con, "w") : all connections are in use

traceback()
15: file(con, "w")
14: writeLines(input, f)
13: system(cmd, wait = FALSE, input = "")
12: newPSOCKnode(names[[i]], options = options, rank = i)
11: makePSOCKcluster(names = spec, ...)
10: makeCluster(cores)
9: registerDoParallel(cores = infercnv.env$GLOBAL_NUM_THREADS)
8: .parameterize_random_cluster_heights_smoothed_trees(tumor_expr_data,
hclust_method, window_size)
7: .single_tumor_subclustering_recursive_random_smoothed_trees(tumor_expr_data = df,
hclust_method = hclust_method, p_val = p_val, grps.adj = grps.adj,
window_size = window_size, max_recursion_depth = max_recursion_depth,
min_cluster_size_recurse = min_cluster_size_recurse, recursion_depth = recursion_depth +
1)
6: .single_tumor_subclustering_recursive_random_smoothed_trees(tumor_expr_data = df,
hclust_method = hclust_method, p_val = p_val, grps.adj = grps.adj,
window_size = window_size, max_recursion_depth = max_recursion_depth,
min_cluster_size_recurse = min_cluster_size_recurse, recursion_depth = recursion_depth +
1)
5: .single_tumor_subclustering_recursive_random_smoothed_trees(tumor_expr_data,
hclust_method, p_val, grps, window_size, max_recursion_depth,
min_cluster_size_recurse)
4: .partition_by_random_smoothed_trees(tumor_name, tumor_expr_data,
hclust_method, p_val, window_size, max_recursion_depth, min_cluster_size_recurse)
3: .single_tumor_subclustering_smoothed_tree(tumor_group, tumor_group_idx,
tumor_expr_data, p_val, hclust_method, window_size, max_recursion_depth,
min_cluster_size_recurse)
2: define_signif_tumor_subclusters_via_random_smooothed_trees(infercnv_obj,
p_val = tumor_subcluster_pval, hclust_method = hclust_method,
cluster_by_groups = cluster_by_groups)
1: infercnv::run(T99_infercnv_obj, analysis_mode = "subclusters",
cutoff = 0.1, out_dir = tempfile(), cluster_by_groups = TRUE,
plot_steps = FALSE, denoise = TRUE, HMM = TRUE, num_threads = 10,
tumor_subcluster_pval = 0.05, tumor_subcluster_partition_method = "random_trees",
no_plot = TRUE)
Warning messages:
1: In .Internal(gzfile(description, open, encoding, compression)) :
closing unused connection 127 (<-Bhaba:11550)
2: In .Internal(gzfile(description, open, encoding, compression)) :
closing unused connection 126 (<-Bhaba:11550)
3: In .Internal(gzfile(description, open, encoding, compression)) :
closing unused connection 125 (<-Bhaba:11550)
4: In .Internal(gzfile(description, open, encoding, compression)) :
closing unused connection 124 (<-Bhaba:11550)
5: In .Internal(gzfile(description, open, encoding, compression)) :
closing unused connection 123 (<-Bhaba:11550)

I will be trying @GeorgescuC's suggestion next and see if that works out.

Will keep you guys posted.

Thank you.
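For anyone hitting the same "all connections are in use" error: an R session can only hold a limited number of open connections (128 by default in R 4.2), and each PSOCK worker uses one. Whether infercnv's random_trees path actually leaks clusters is not confirmed in this thread. The sketch below only shows, with toy text connections standing in for worker sockets, how to count open connections and confirm they are released.

```r
# Count the user-level connections currently open in this session.
count_open <- function() nrow(showConnections())

baseline <- count_open()

# Open five toy connections (stand-ins for parallel worker sockets).
cons <- lapply(1:5, function(i) textConnection(paste("worker", i)))
opened <- count_open() - baseline

# Close them explicitly so they do not accumulate toward the session limit.
invisible(lapply(cons, close))
leaked <- count_open() - baseline

print(c(opened = opened, leaked = leaked))
```

For real clusters created with parallel::makeCluster(), the matching release call is parallel::stopCluster(). Lowering num_threads also reduces the number of connections each cluster consumes.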

@bhabakd
Author

bhabakd commented Jun 18, 2022

@GeorgescuC
When I run infercnv with suppressWarnings() around it, I get this error message.

Error in py_module_import(module, convert = convert) :
ImportError: DLL load failed while importing _c_leiden: The specified procedure could not be found.

This is what traceback() returns:
11: stop(structure(list(message = "ImportError: DLL load failed while importing _c_leiden: The specified procedure could not be found.\n",
call = py_module_import(module, convert = convert), cppstack = NULL), class = c("Rcpp::exception",
"C++Error", "error", "condition")))
10: py_module_import(module, convert = convert)
9: import("leidenalg", delay_load = TRUE)
8: leiden.igraph(object, partition_type = partition_type, weights = weights,
node_sizes = node_sizes, resolution_parameter = resolution_parameter,
seed = seed, n_iterations = n_iterations, degree_as_node_size = degree_as_node_size,
laplacian = laplacian, legacy = legacy)
7: leiden.Matrix(sparse_adjacency_matrix, resolution_parameter = leiden_resolution)
6: leiden(sparse_adjacency_matrix, resolution_parameter = leiden_resolution)
5: .single_tumor_leiden_subclustering(tumor_group = tumor_group,
tumor_group_idx = tumor_group_idx, tumor_expr_data = tumor_expr_data,
k_nn = k_nn, leiden_resolution = leiden_resolution, hclust_method = hclust_method)
4: define_signif_tumor_subclusters(infercnv_obj, p_val = tumor_subcluster_pval,
k_nn = k_nn, leiden_resolution = leiden_resolution, hclust_method = hclust_method,
cluster_by_groups = cluster_by_groups, partition_method = tumor_subcluster_partition_method)
3: infercnv::run(T99_infercnv_obj, analysis_mode = "subclusters",
cutoff = 0.1, out_dir = tempfile(), cluster_by_groups = TRUE,
plot_steps = FALSE, denoise = TRUE, HMM = TRUE, num_threads = 10,
tumor_subcluster_pval = 0.05, no_plot = TRUE)
2: withCallingHandlers(expr, warning = function(w) if (inherits(w,
classes)) tryInvokeRestart("muffleWarning"))
1: suppressWarnings(T99_infercnv_obj <- infercnv::run(T99_infercnv_obj,
analysis_mode = "subclusters", cutoff = 0.1, out_dir = tempfile(),
cluster_by_groups = TRUE, plot_steps = FALSE, denoise = TRUE,
HMM = TRUE, num_threads = 10, tumor_subcluster_pval = 0.05,
no_plot = TRUE))

Here is the sessionInfo() output for the run.

sessionInfo()
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C LC_TIME=English_United States.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Matrix.utils_0.9.8 igraph_1.3.1 infercnvApp_0.0.0.9000 infercnv_1.12.0 leidenAlg_1.0.3
[6] leiden_0.4.2 Matrix_1.4-1 ggplot2_3.3.6 dbplyr_2.1.1 patchwork_1.1.1
[11] sp_1.4-7 SeuratObject_4.1.0 Seurat_4.1.1 dplyr_1.0.9 RColorBrewer_1.1-3
[16] rjags_4-13 coda_0.19-4

loaded via a namespace (and not attached):
[1] parallelDist_0.2.6 plyr_1.8.7 lazyeval_0.2.2 splines_4.2.0
[5] listenv_0.8.0 scattermore_0.8 GenomeInfoDb_1.32.2 TH.data_1.1-1
[9] digest_0.6.29 foreach_1.5.2 htmltools_0.5.2 fansi_1.0.3
[13] magrittr_2.0.3 tensor_1.5 cluster_2.1.3 doParallel_1.0.17
[17] ROCR_1.0-11 limma_3.52.1 globals_0.15.0 fastcluster_1.2.3
[21] RcppParallel_5.1.5 matrixStats_0.62.0 sandwich_3.0-1 spatstat.sparse_2.1-1
[25] sccore_1.0.1 colorspace_2.0-3 rappdirs_0.3.3 ggrepel_0.9.1
[29] crayon_1.5.1 RCurl_1.98-1.6 jsonlite_1.8.0 libcoin_1.0-9
[33] spatstat.data_2.2-0 progressr_0.10.0 survival_3.3-1 zoo_1.8-10
[37] iterators_1.0.14 ape_5.6-2 glue_1.6.2 polyclip_1.10-0
[41] gtable_0.3.0 zlibbioc_1.42.0 XVector_0.36.0 DelayedArray_0.22.0
[45] future.apply_1.9.0 SingleCellExperiment_1.18.0 BiocGenerics_0.42.0 abind_1.4-5
[49] scales_1.2.0 futile.options_1.0.1 mvtnorm_1.1-3 DBI_1.1.2
[53] edgeR_3.38.1 spatstat.random_2.2-0 miniUI_0.1.1.1 Rcpp_1.0.8.3
[57] viridisLite_0.4.0 xtable_1.8-4 spatstat.core_2.4-4 reticulate_1.25
[61] stats4_4.2.0 htmlwidgets_1.5.4 httr_1.4.3 gplots_3.1.3
[65] modeltools_0.2-23 ellipsis_0.3.2 ica_1.0-2 pkgconfig_2.0.3
[69] reshape_0.8.9 uwot_0.1.11 deldir_1.0-6 here_1.0.1
[73] locfit_1.5-9.5 utf8_1.2.2 tidyselect_1.1.2 rlang_1.0.2
[77] reshape2_1.4.4 later_1.3.0 munsell_0.5.0 phyclust_0.1-30
[81] tools_4.2.0 cli_3.3.0 generics_0.1.2 ggridges_0.5.3
[85] stringr_1.4.0 fastmap_1.1.0 argparse_2.1.5 goftest_1.2-3
[89] fitdistrplus_1.1-8 caTools_1.18.2 purrr_0.3.4 RANN_2.6.1
[93] coin_1.4-2 pbapply_1.5-0 future_1.25.0 nlme_3.1-157
[97] mime_0.12 formatR_1.12 grr_0.9.5 rstudioapi_0.13
[101] compiler_4.2.0 plotly_4.10.0 png_0.1-7 spatstat.utils_2.3-1
[105] tibble_3.1.7 stringi_1.7.6 futile.logger_1.4.3 rgeos_0.5-9
[109] lattice_0.20-45 vctrs_0.4.1 pillar_1.7.0 lifecycle_1.0.1
[113] BiocManager_1.30.18 spatstat.geom_2.4-0 lmtest_0.9-40 RcppAnnoy_0.0.19
[117] data.table_1.14.2 cowplot_1.1.1 bitops_1.0-7 irlba_2.3.5
[121] httpuv_1.6.5 GenomicRanges_1.48.0 R6_2.5.1 promises_1.2.0.1
[125] KernSmooth_2.23-20 gridExtra_2.3 IRanges_2.30.0 parallelly_1.31.1
[129] codetools_0.2-18 lambda.r_1.2.4 MASS_7.3-56 gtools_3.9.2.1
[133] assertthat_0.2.1 SummarizedExperiment_1.26.1 rprojroot_2.0.3 withr_2.5.0
[137] sctransform_0.3.3 multcomp_1.4-19 S4Vectors_0.34.0 GenomeInfoDbData_1.2.8
[141] mgcv_1.8-40 parallel_4.2.0 rpart_4.1.16 grid_4.2.0
[145] tidyr_1.2.0 MatrixGenerics_1.8.0 Rtsne_0.16 Biobase_2.56.0
[149] shiny_1.7.1

@GeorgescuC
Collaborator

Hi @bhabakd ,

It seems there is an error loading the Leiden dependencies. Can you try running this minimal example to see if Leiden is set up properly?

library(leiden)

adjacency_matrix <- rbind(cbind(matrix(round(rbinom(400, 1, 0.8)), 20, 20),
                                matrix(round(rbinom(400, 1, 0.3)), 20, 20), 
                                matrix(round(rbinom(400, 1, 0.1)), 20, 20)),
                          cbind(matrix(round(rbinom(400, 1, 0.3)), 20, 20), 
                                matrix(round(rbinom(400, 1, 0.8)), 20, 20), 
                                matrix(round(rbinom(400, 1, 0.2)), 20, 20)),
                          cbind(matrix(round(rbinom(400, 1, 0.3)), 20, 20), 
                                matrix(round(rbinom(400, 1, 0.1)), 20, 20), 
                                matrix(round(rbinom(400, 1, 0.9)), 20, 20)))

leiden(adjacency_matrix)

Regards,
Christophe.

@bhabakd
Author

bhabakd commented Jul 2, 2022

Hi @GeorgescuC
I did as you recommended, and this is what I get

leiden(adjacency_matrix)
Error in py_module_import(module, convert = convert) :
ModuleNotFoundError: No module named 'leidenalg'
traceback()
5: stop(structure(list(message = "ModuleNotFoundError: No module named 'leidenalg'\n",
call = py_module_import(module, convert = convert), cppstack = NULL), class = c("Rcpp::exception",
"C++Error", "error", "condition")))
4: py_module_import(module, convert = convert)
3: import("leidenalg", delay_load = TRUE)
2: leiden.matrix(adjacency_matrix)
1: leiden(adjacency_matrix)

regards

Bhaba

@GeorgescuC
Collaborator

Hi @bhabakd,

Sorry for the delay in getting back to you. I have been working on replacing the old implementation of the Leiden clustering, which required Python, with the one from igraph, which doesn't. The current master branch should therefore allow you to run infercnv without this issue. You might, however, need to install a couple of extra R dependencies that have been added.
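For reference, a Python-free Leiden run in igraph looks roughly like the sketch below. The exact call infercnv makes is not shown in this thread, so using cluster_leiden() with these arguments is an assumption, and the toy graph (two dense blocks joined by one edge) is made up for illustration.

```r
library(igraph)

set.seed(1)

# Toy graph: two fully connected 10-node blocks joined by a single edge.
block <- make_full_graph(10)
g <- disjoint_union(block, block)
g <- add_edges(g, c(1, 11))

# Leiden community detection implemented in igraph itself (no reticulate/Python).
clusters <- cluster_leiden(g, objective_function = "modularity",
                           resolution_parameter = 1)
memb <- membership(clusters)

print(table(memb))
```

If this runs in a fresh session, the igraph side of the new implementation is available, independently of any Python or conda setup.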

Regards,
Christophe.

@bhabakd
Author

bhabakd commented Aug 19, 2022 via email

@tbrunetti

@GeorgescuC I just had the same issue using R v4.2.1, despite building the conda env with the commands recommended by the leiden package for r-reticulate.

I uninstalled infercnv and installed the current version directly from the GitHub master branch, and step 15 now seems to run fine. I am not sure whether I will run into other complications, since the analysis has not finished yet, but this seems to resolve the error initially thrown.

@GeorgescuC
Collaborator

Hi @tbrunetti ,

The version currently on master no longer uses Python or r-reticulate at all, so the initial issue in step 15 should not happen anymore. Please let me know if you encounter any issues with this new version. I am planning to release it on the devel branch of Bioconductor once a few people have had time to try it out and confirm that things run well in different setups.

Regards,
Christophe.

@tbrunetti

tbrunetti commented Aug 24, 2022

Yes, this fix resolved my issues. Thanks!

@bhabakd
Author

bhabakd commented Aug 29, 2022

Hi @GeorgescuC,
Good news, your fix worked. There were no errors during the sub-clustering module.
Thanks once again for attending to this issue.

I have yet another query though. I want to arrange the samples in a specific order. Is there a way to do so?

Regards

BhabaKD

@shanshenbing

shanshenbing commented Aug 30, 2022

I get a different error when running the latest infercnv.

Warning in irlba(A = t(x = object), nv = npcs, ...) :
You're computing too large a percentage of total singular values, use a standard svd instead.
PC_ 1
Positive: COL18A1, COL6A1, FAM207A, COL6A2, DIP2A
Negative: SUMO3, PRMT2, PTTG1IP, S100B, ITGB2
PC_ 2
Positive: PRMT2, S100B, DIP2A, COL6A2, COL6A1
Negative: SUMO3, PTTG1IP, ITGB2, FAM207A, COL18A1
PC_ 3
Positive: COL6A1, COL18A1, COL6A2, FAM207A, ITGB2
Negative: PRMT2, SUMO3, S100B, PTTG1IP, DIP2A
PC_ 4
Positive: SUMO3, COL6A2, COL6A1, DIP2A, S100B
Negative: ITGB2, FAM207A, PRMT2, PTTG1IP, COL18A1
PC_ 5
Positive: COL18A1, PRMT2, SUMO3, COL6A1, FAM207A
Negative: DIP2A, ITGB2, PTTG1IP, COL6A2, S100B
Error in FindNeighbors.Seurat(seurat_obs, k.param = k_nn) :
More dimensions specified in dims than have been computed
Calls: ... .leiden_seurat_preprocess_routine -> FindNeighbors -> FindNeighbors.Seurat
In addition: Warning messages:
1: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric, :
span too small. fewer data values than degrees of freedom.

This error occurred when running Step 15. Strangely, most samples produce the correct result and only some hit this error. I have tried both the Docker image and installing infercnv directly from GitHub; both give this error.

@GeorgescuC
Collaborator

Hi @bhabakd ,

In the current version, samples/annotations are sorted with sort() in R, so if you append a prefix to them you should be able to order them as you wish. One thing to remember though is that the plot is done bottom to top, so the order will be reversed there.

To avoid rerunning the whole analysis, you can create a new infercnv object with the updated annotation names and transfer the @observation_grouped_cell_indices field from the new object to the object with results, as long as you do not change anything else (the matrix most importantly).
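A small base-R sketch of the prefix trick follows. The group names and the desired order are hypothetical; only the use of sort() and the bottom-to-top plotting come from the description above.

```r
# Hypothetical annotation names and the order we would like sort() to produce.
groups <- c("tumor_B", "normal", "tumor_A")
desired_order <- c("tumor_B", "tumor_A", "normal")

# Prepend a zero-padded rank so that alphabetical sorting follows desired_order.
prefixed <- sprintf("%02d_%s", match(groups, desired_order), groups)
sorted <- sort(prefixed)

# The heatmap is drawn bottom to top, so it reads in reverse from the top.
plotted_top_to_bottom <- rev(sorted)

print(sorted)
print(plotted_top_to_bottom)

# Transferring the grouping to an existing result, as described above
# (new_obj and results_obj are hypothetical object names):
# results_obj@observation_grouped_cell_indices <-
#   new_obj@observation_grouped_cell_indices
```

To get a specific top-to-bottom order in the plot, assign the ranks in reverse so that the group wanted at the top sorts last.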

Regards,
Christophe.

@GeorgescuC
Collaborator

Hi @shanshenbing ,

I suspect chromosome 21 has too few genes for the PCA to be meaningful in some of those samples. I have added a fallback to the simpler Leiden clustering for those cases in the latest commit, so if you can update your installation of infercnv from the master branch, it will hopefully solve the issue.
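To illustrate the suspected cause in general terms: a PCA cannot return more components than there are features, so a neighbor search asking for more dimensions than were computed fails. The sketch below uses base R's prcomp() on random data rather than Seurat, and the gene count and requested dimensions are hypothetical values chosen for illustration.

```r
set.seed(1)

# Toy expression matrix: 50 cells (rows) by 8 genes (columns).
n_cells <- 50
n_genes <- 8
expr <- matrix(rnorm(n_cells * n_genes), nrow = n_cells, ncol = n_genes)

# With more cells than genes, PCA yields at most one component per gene.
pca <- prcomp(expr)
n_pcs <- ncol(pca$x)

# Hypothetical request for ten dimensions, more than were computed.
requested_dims <- 1:10
dims_available <- max(requested_dims) <= n_pcs

# A defensive alternative: only use the dimensions that actually exist.
usable_dims <- requested_dims[requested_dims <= n_pcs]

print(n_pcs)
print(dims_available)
print(usable_dims)
```

A chromosome contributing only a handful of genes would hit this limit, which is consistent with only some samples failing.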

Regards,
Christophe.

@GeorgescuC GeorgescuC unpinned this issue Sep 21, 2022
@shanshenbing

Thank you for your reply. I finally updated R to 4.2.1 and installed the latest version of inferCNV on my local Windows 10 computer. The same samples now run with no errors.
