long running time with space separated list of files as input #329
Comments
Hey @rbenel, are you interested in using Alevin's bootstrap estimates? If not, removing the 'numCellBootstrap' flag will dramatically improve the running time.
Hey @k3yavi, bootstrapping really improved my population studies, so I figured I would try it with single-cell data, but I haven't even seen the run get that far when I use multiple files... after
Oh yes, you are right: it never started mapping, let alone quantification. I'll take a look in a bit. Thanks for raising the issue.
Correct, thanks @k3yavi :)
Hi @rbenel, I just tested it on a couple of datasets we have, and it seems to work fine.
Hi @k3yavi,
The bash script looks good to me, and I am not aware of any hard limit on the number of input files. I just tested with 24 files as input and it seems to work. It is hard to tell what's wrong without being able to replicate the issue.
Happy to hear the bash script looks good.
Does each individual file fail? Is there a particular file pair that fails to run (an ill-formed file)?
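One way to answer that question before launching a long run is to check each pair for basic well-formedness. The sketch below is only an illustration, not part of alevin: it assumes gzipped FASTQ files with exactly four lines per record, and the function names are hypothetical.

```python
import gzip


def count_fastq_records(path):
    """Count records in a gzipped FASTQ file (assumes 4 lines per record)."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def find_problem_pairs(pairs):
    """Return (r1, r2, reason) for each pair that is unreadable or unequal in length."""
    problems = []
    for r1, r2 in pairs:
        try:
            n1 = count_fastq_records(r1)
            n2 = count_fastq_records(r2)
        except (OSError, EOFError, ValueError) as exc:
            # Corrupt or truncated gzip streams and malformed FASTQ land here.
            problems.append((r1, r2, str(exc)))
            continue
        if n1 != n2:
            problems.append((r1, r2, f"{n1} vs {n2} records"))
    return problems
```

An empty result only says that every pair is readable and that mates have matching record counts; it does not rule out other causes.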
I have already run them all (successfully) separately as pairs, but for downstream analysis I need them to be a single library, so I thought it would be simpler to run them as multiple input files..? |
Hey @rbenel,
I have narrowed down the issue: the first 8 files run fine, and adding a 9th reproduces it. I will send you a link with a subset of the files.
Hi @rbenel, since you are compiling Alevin from source, our latest commit (c3eeec9) on the develop branch should solve your issue. We will merge the fix into master either in the next release or as a later hot-fix.
Hi @k3yavi, I also tried this on
Thanks!!
Hi @rbenel,
It finished running after a few hours! Thanks for all of the help! |
Hi @rbenel,
alevin (single-cell mode)
I am trying to run alevin using a space-separated list of 20 files as input. The FASTQ files we received from sequencing were split arbitrarily to keep each at roughly 200 MB, but they all come from the same sample and I wish to treat them as one library. No error is produced, but the run has been going for ~15 hours and the log files are blank. As a side note, running each "pair" on its own works just fine.
v0.12.1
compiled from source
OS - Ubuntu Linux, x86_64 x86_64 x86_64 GNU/Linux
Alevin is supposed to be able to run with multiple read files, as specified here: https://github.com/COMBINE-lab/salmon/blob/master/doc/source/salmon.rst#providing-multiple-read-files-to-salmon
any help or advice would be appreciated :)
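For reference, a minimal sketch of how such a multi-file invocation could be assembled so that the `-1` and `-2` lists stay in the same order. Several things here are assumptions rather than details from this report: the `*_R1*` / `*_R2*` file naming, the `--chromium` protocol flag, and the helper name `build_alevin_command`. The sketch only builds the argument list and does not run salmon.

```python
from pathlib import Path


def build_alevin_command(read_dir, index_dir, tgmap, out_dir, threads=8):
    """Build an argv list for one alevin run over all split FASTQ pairs."""
    read_dir = Path(read_dir)
    # Assumed naming: *_R1* holds barcodes + UMIs, *_R2* holds the cDNA reads.
    # Lexicographic sorting keeps pairs aligned when chunk numbers are zero-padded.
    r1 = sorted(read_dir.glob("*_R1*.fastq.gz"))
    r2 = sorted(read_dir.glob("*_R2*.fastq.gz"))
    if not r1 or len(r1) != len(r2):
        raise ValueError(f"found {len(r1)} R1 files and {len(r2)} R2 files")
    for first, second in zip(r1, r2):
        # Every R1 file must line up with the R2 file of the same chunk.
        if first.name.replace("_R1", "_R2", 1) != second.name:
            raise ValueError(f"unpaired files: {first.name} / {second.name}")
    return [
        "salmon", "alevin", "-l", "ISR",
        "-1", *(str(p) for p in r1),   # space-separated list of first mates
        "-2", *(str(p) for p in r2),   # matching list of second mates
        "--chromium",                  # assumption: 10x Chromium protocol
        "-i", str(index_dir),
        "-p", str(threads),
        "-o", str(out_dir),
        "--tgMap", str(tgmap),
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, or joined with spaces to compare against the command in an existing bash script.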