Error in do.call(rbind, mcmc[[j]]) : second argument must be a list #505

Open
Sa753 opened this issue Feb 17, 2023 · 7 comments

Comments

@Sa753

Sa753 commented Feb 17, 2023

Dear Christophe,

Since installing version 1.15, I can't get past this error. You previously replied to a similar issue suggesting a change to the number of threads, but I tried 2, 4, 6, 10, and 15, and all gave the same error. It didn't happen with version 1.14.

I am running the analysis in an interactive R session on a cluster, with 24 cores and up to 200 GB of memory available. The error occurs both with 20K cells and with 600 cells.

STEP 18: Run Bayesian Network Model on HMM predicted CNVs

INFO [2023-02-17 09:51:15] Initializing new MCM InferCNV Object.
INFO [2023-02-17 09:51:15] validating infercnv_obj
INFO [2023-02-17 09:51:15] Total CNV's: 144
INFO [2023-02-17 09:51:15] Loading BUGS Model.
INFO [2023-02-17 09:51:15] Running Sampling Using Parallel with 6 Cores
INFO [2023-02-17 09:51:15] Obtaining probabilities post-sampling
Error in do.call(rbind, mcmc[[j]]) : second argument must be a list
In addition: Warning message:
In parallel::mclapply(seq_along(obj@cell_gene), FUN = par_func, :
all scheduled cores encountered errors in user code

Thanks

@GeorgescuC
Collaborator

Hi Sa753 ,

How soon after the last log message does the error occur? Does it seem to be running for a while or does it just crash right away?

This error occurs when calling JAGS, so it is difficult to track what the issue actually is. What version of JAGS do you have installed on your system?

You could also try running with a single thread so that no parallelization is done to see if that helps.

Regards,
Christophe.

@Sa753
Author

Sa753 commented Feb 17, 2023

Hi Christophe,

It happens after a very long time (see the example below). A similar error occurred when I ran it on 2K cells, which took about 1 h on 10 cores before failing. The same 2K cells ran successfully with version 1.14.

STEP 18: Run Bayesian Network Model on HMM predicted CNVs

INFO [2023-02-11 06:24:23] Initializing new MCM InferCNV Object.
INFO [2023-02-11 06:24:23] validating infercnv_obj
INFO [2023-02-11 06:24:29] Total CNV's: 44008
INFO [2023-02-11 06:24:29] Loading BUGS Model.
INFO [2023-02-11 06:33:57] Running Sampling Using Parallel with 4 Cores

INFO [2023-02-11 15:15:27] Obtaining probabilities post-sampling
Error in do.call(rbind, mcmc[[j]]) : second argument must be a list
In addition: Warning message:
In parallel::mclapply(seq_along(obj@cell_gene), FUN = par_func, :
scheduled cores 1, 2, 3, 4 did not deliver results, all values of the jobs will be affected

rjags version is 4.13.

It is very odd: sometimes the run completes, and sometimes it stops because of an error in all or only some of the cores (cores 6 and 7, for example).

Running on one core takes a very long time for 20K cells; I have never been able to complete such a run, as it always takes more than 24 h.
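
For anyone who can wrap their own parallel step, one way to avoid paying the full single-core cost is to rerun only the jobs that failed. The following is a minimal base R sketch of that idea, not infercnv code: `is_failed` and `run_with_retry` are hypothetical helper names, and forking with `mclapply` assumes a Unix-like system.

```r
library(parallel)

# Hypothetical helper: TRUE when an mclapply() result is unusable,
# either NULL (worker died) or a "try-error" (worker raised an error).
is_failed <- function(x) is.null(x) || inherits(x, "try-error")

# Hypothetical helper: run FUN over X in parallel, then rerun only the
# failed jobs serially in the parent process.
run_with_retry <- function(X, FUN, cores = 2L) {
  res <- suppressWarnings(mclapply(X, FUN, mc.cores = cores))
  failed <- which(vapply(res, is_failed, logical(1)))
  if (length(failed) > 0) {
    message("Retrying jobs serially: ", paste(failed, collapse = ", "))
    # A job that fails again here raises its real error in the parent session.
    res[failed] <- lapply(X[failed], FUN)
  }
  res
}
```

With the default prescheduling, one failing job marks every job assigned to the same core as failed, so the serial retry may cover more jobs than the one that actually broke.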

@GeorgescuC
Collaborator

Hi @Sa753 ,

When installing version 1.15, are you using the Bioconductor devel branch, or the master branch from Github?
There are no code changes to the Bayesian network step between versions 1.14 and 1.15, so it is surprising that the issue frequently happens on 1.15 but not 1.14. Can you run sessionInfo() when using both versions to see if there are differences in any of the dependencies?
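
As a rough sketch of how that comparison could be done, the base R helpers below record package versions from `sessionInfo()` and list the ones that differ. The function names and the file names in the comments are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: named character vector of versions for all
# attached and loaded packages in the current session.
snapshot_versions <- function() {
  si <- sessionInfo()
  pkgs <- c(si$otherPkgs, si$loadedOnly)
  vapply(pkgs, function(p) p$Version, character(1))
}

# Hypothetical helper: rows for packages whose version differs between
# two snapshots, or that appear in only one of them.
diff_versions <- function(a, b) {
  pkgs <- union(names(a), names(b))
  va <- unname(a[pkgs])
  vb <- unname(b[pkgs])
  changed <- is.na(va) | is.na(vb) | va != vb
  data.frame(package = pkgs, first = va, second = vb,
             stringsAsFactors = FALSE)[changed, ]
}

# In each environment, after library(infercnv) (file names are placeholders):
#   saveRDS(snapshot_versions(), "versions_1.14.rds")
# Then, in either session:
#   diff_versions(readRDS("versions_1.14.rds"), readRDS("versions_1.15.rds"))
```

This only covers R packages; the version of the JAGS library installed on the system has to be checked separately.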

One difference I see in the second log you posted, however, is a much higher count of CNV regions, especially for only 2K cells. I think the subclustering settings need to be tweaked; see my answer to your other post in issue #508. With better subclustering, the number of regions to evaluate should be drastically reduced, which should help with both runtime and the risk of errors.

Regards,
Christophe.

@Yijia-Jiang

Hi, I ran into a similar issue using infercnv_1.14.2. Did you find a solution?

@GeorgescuC
Collaborator

Hi @aj088 ,

Does the log indicate a high number of CNV regions as well?

If you update to version 1.16.0, which was just released, there are new features for investigating the subclustering that should help identify whether highly fragmented subclusters are the issue.

subcluster_obj = readRDS("15_tumor_subclustersHMMi6.leiden.infercnv_obj")
plot_cnv(infercnv_obj = subcluster_obj, out_dir = output_path)

Running this code snippet (after updating the output path) should create a figure "subclusters_as_annotation.png" in which you can see the subclusters. If there are too many of them compared to the actual diversity of your cells, and they are too small, you will need to lower the "leiden_resolution" parameter. Rerunning infercnv with that setting changed should restart the run from step 15 and generate a new plot of the subclusters automatically.
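
A hedged sketch of what such a rerun could look like is given below. `rerun_with_resolution` is a hypothetical wrapper, not part of infercnv; it forwards every other argument of your original `infercnv::run()` call unchanged through `...` and only overrides `leiden_resolution`. The right resolution value depends on your data and is not suggested here.

```r
# Hypothetical wrapper: repeat an earlier infercnv::run() call with a
# different leiden_resolution. All other settings (out_dir, cutoff, HMM,
# num_threads, ...) must be passed through `...` exactly as in the
# original run so that the cached earlier steps can be reused.
rerun_with_resolution <- function(infercnv_obj, resolution, ...) {
  infercnv::run(
    infercnv_obj,
    leiden_resolution = resolution,  # lower it if subclusters are too many and too small
    ...
  )
}
```

Defining the wrapper does not require infercnv to be loaded; calling it does.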

Regards,
Christophe.

@Yijia-Jiang

@GeorgescuC Thank you for the feedback! I tried setting leiden_method="simple" with the current version of infercnv, and the same problem occurs. In fact, I get a warning in the previous step, shown below:

Running Sampling Using Parallel with 4 Cores
Warning in parallel::mclapply(seq_along(obj@cell_gene), FUN = par_func, :
scheduled cores 1, 2, 3, 4 did not deliver results, all values of the jobs will be affected
Obtaining probabilities post-sampling
Error in do.call(rbind, mcmc[[j]]) : second argument must be a list

Do you think it's possible that the error occurs due to the previous Running Sampling Using Parallel step?
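
The pairing of the warning and the error in the log above can be reproduced in base R, which suggests (as an assumption based only on the log, not on the infercnv source) that the `do.call` failure is a consequence of the workers failing rather than a separate problem. The sketch uses a simulated failure and needs a Unix-like system for forking.

```r
library(parallel)

# Every worker fails, mimicking a sampler that errors inside mclapply().
# suppressWarnings() hides the "all scheduled cores encountered errors" warning.
res <- suppressWarnings(
  mclapply(1:2, function(i) stop("simulated sampler failure"), mc.cores = 2)
)

# Failed jobs come back as "try-error" objects, not as lists.
failed <- vapply(res, inherits, logical(1), what = "try-error")

# Binding a failed result raises the error seen in the log.
bind_msg <- tryCatch(do.call(rbind, res[[1]]),
                     error = function(e) conditionMessage(e))

# A NULL result (a worker that delivered nothing) fails the same way.
null_msg <- tryCatch(do.call(rbind, NULL),
                     error = function(e) conditionMessage(e))

# The original error is still attached to the try-error object.
root_msg <- conditionMessage(attr(res[[1]], "condition"))
```

Here `bind_msg` and `null_msg` both hold "second argument must be a list", while `root_msg` holds the underlying failure, which is the message worth looking for when a worker raised an error rather than being killed.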

@GeorgescuC
Collaborator

Hi @aj088 ,

Each step is independent, so if one ran successfully, it should have no negative impact on the following steps in terms of parallelization.

Looking at the log, how many total CNVs have been predicted?

Regards,
Christophe.
