--max_memory and --max_cpus in latest version of sarek #1765
Comments
I have a similar issue: it says that I require 24 threads, but I only have 12. I have gone through multiple config files, but I couldn't resolve the issue.
Please have a look at this post. Cheers
I am trying to allocate more CPUs and memory to the nf-core/sarek pipeline, because I noticed that it is currently using a maximum of 500 GB of memory and leaving resources unused. My goal is to increase resource allocation so that the pipeline runs faster. I added this to the nextflow.config file that I found in this directory (/home/Ons14/.nextflow/assets/nf-core/sarek): `process {` and then I ran my pipeline, but I didn't see any change in htop; memory usage doesn't go above 500.GB. I'm new to the nf-core/sarek pipeline and would greatly appreciate your help. Thank you!
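A minimal sketch of the kind of custom config being attempted here, passed on the command line rather than edited into the pipeline assets; the label, CPU count, and memory value are assumptions for illustration, not the poster's actual snippet or Sarek defaults:

```groovy
// custom.config — hypothetical sketch only; values and selector are assumptions.
process {
    // nf-core pipelines tag heavy tasks with labels such as 'process_high'
    withLabel: 'process_high' {
        cpus   = 32
        memory = 600.GB
    }
}
```

This would be supplied with `nextflow run nf-core/sarek -c custom.config ...`, leaving the shipped nextflow.config under `~/.nextflow/assets/nf-core/sarek` untouched.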
I think you should join the Sarek Slack channel to get answers for these kinds of questions. As far as I understood,
To add: Nextflow will cap resource requests at these max values. This is important to avoid a job requesting more resources than are available and never being submitted. It will not increase resource requests for an individual job. Most jobs do not need 500 GB of memory, and therefore we typically set less in the base.config. If you wish to increase resources for individual tasks, you will need to create a custom config where you define this for the corresponding processes.

On that note: we have set the Sarek default requests using 80X WGS samples, so they should work for the majority of use cases. If you are seeing issues related to job packing and would like to optimize this, you could try setting resources to half a node or a full node for the medium/high tasks to ensure several jobs can be submitted. Nextflow will do its best to keep submitting jobs for as long as it finds suitable space and you are not exceeding any usage limits set on your cluster.

To illustrate: let's say all your nodes have 16 CPUs and 32 GB of memory. If you set the memory request for all your jobs to 24 GB, only one job can ever be submitted per node: 32 - 24 = 8 GB is left, and no job fits in that. In this case it could be more efficient to set 15 GB per process and always have two jobs running in parallel.
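A sketch of the half-node sizing described above, assuming 16-CPU / 32 GB nodes; the process selector is a placeholder and would need to match the actual Sarek process names:

```groovy
// Hypothetical half-node requests so two tasks fit per 16-CPU / 32 GB node.
// 'SOMETOOL_PROCESS' is a placeholder, not a real Sarek process name.
process {
    withName: 'SOMETOOL_PROCESS' {
        cpus   = 8
        memory = 15.GB
    }
}
```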
So, the Sarek default base.config would reasonably work with 30X WGS samples, even if I have more CPU and memory to allocate? I am trying to maximise the usage of the server and parallelize more jobs, since I have a large cohort.
To parallelize more jobs, you want to request just the right amount of resources for each task and then work on putting as many jobs as possible onto the cluster. Increasing the resources for a task itself will result in that task having more CPU/memory, but fewer jobs being able to run in parallel. Using configuration options like
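The comment above is cut off, so the exact options it refers to are unknown; as one assumption, Nextflow's executor scope can bound how many tasks are submitted at once, for example:

```groovy
// Assumed illustration of capping concurrent submissions; the value is arbitrary.
executor {
    queueSize = 50   // maximum number of tasks the executor will queue or run at once
}
```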
Description of feature
Hello,
I am working with human WGS 30X and have a large cohort. I tried to adjust the memory and CPU usage in the latest version of Sarek, but I couldn't. When I used --max_memory and --max_cpus, it showed a warning. Could you please tell me if there's another way to increase the resources used?