
memory issue? #39

Closed
lucacozzuto opened this issue Sep 8, 2022 · 4 comments

Comments

@lucacozzuto

Dear developers,
Thanks for your valuable tool! I'm trying to use it on some Nanopore data and I get the following error:

[Thu Sep  8 14:45:34 2022] creating directory for output: KO_fastqc
[limitst]	using file /falco-1.1.0/Configuration/limits.txt
[adapters]	using file /falco-1.1.0/Configuration/adapter_list.txt
[contaminants]	using file /falco-1.1.0/Configuration/contaminant_list.txt
[Thu Sep  8 14:45:34 2022] Started reading file KO.fq.gz
[Thu Sep  8 14:45:34 2022] reading file as gzipped FASTQ format
[running falco|=                                                  |  2%]/ 2: 19870 Killed                  falco -o KO_fastqc -t 1 KO.fq.gz

I allocated 80 GB of RAM, so I don't think memory is the limiting factor.

Luca

@andrewdavidsmith
Collaborator

@lucacozzuto can you provide some additional information, for example a small piece of the input file? Also the command line -- the verbose output and progress have been snipped, so we can't see the arguments or the filenames. If the filenames contain confidential information, it would help if you could copy the files to generic names and post the exact command line you used. Thanks!

@lucacozzuto
Author

Many thanks for your quick answer!
This is the command line:

falco -o KO_fastqc -t 1  KO.fq.gz

The file is huge (59 GB), and some reads are up to 1 Mb long.
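For context, the longest read in a file like this can be checked with a one-liner along these lines (a sketch, not from the thread; it assumes a plain gzipped FASTQ with one record per four lines, and reuses the `KO.fq.gz` filename from above):

```shell
# Sketch: report the longest read length in a gzipped FASTQ
# (assumes the standard 4-line-per-record layout; KO.fq.gz is the file from this thread)
gzip -dc KO.fq.gz | awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) } END { print max + 0 }'
```

The `NR % 4 == 2` condition selects only the sequence lines, so header and quality lines never inflate the result.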

@guilhermesena1
Collaborator

Hello,

Thank you for reaching out about the issue. I was able to replicate the problem with synthetic, very long reads.

This is not so much a memory issue as a bug in falco: we weren't accounting for read lengths as large as the ones currently produced by Oxford Nanopore.

If you are working with a clone of the repo, I pushed a fix at 2f82110 that may resolve the issue. On my machine with 16 GB of RAM, I was able to run falco to completion on a simulated read of 30 million bases.
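For anyone who wants to reproduce that test, a 30-million-base synthetic read can be generated with something like the following sketch (the filename, base, and quality choices are assumptions, not taken from the commit):

```shell
# Sketch: build a gzipped FASTQ containing a single 30-million-base read
# (filename and base/quality characters are assumptions for illustration)
N=30000000
{ printf '@synthetic_read_1\n'
  head -c "$N" /dev/zero | tr '\0' 'A'   # sequence line: N copies of 'A'
  printf '\n+\n'
  head -c "$N" /dev/zero | tr '\0' 'I'   # quality line: N copies of 'I'
  printf '\n'
} | gzip > long_read.fq.gz
# Then, with the fixed build:  falco -o long_fastqc long_read.fq.gz
```

A run that is killed on this file before the fix but completes after it would confirm the long-read bug rather than a genuine out-of-memory condition.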

If at all possible, could you let us know if you can run falco to completion on your data with this commit?

Thank you very much in advance!

@lucacozzuto
Author

Dear @guilhermesena1, it worked!
Thanks for this fix! I managed to add it to my Nextflow pipeline as a replacement for FastQC. I also made a Dockerfile with your tool, so if you want, I can add it to your repo.

Best,

Luca
