Fix Vamb-jgi-filter memory issues #100
requires GTDBtk and CheckM2 databases
segmentation fault seems to be fixed
for manual testing
Still getting errors. If I fix that manually, I then get more segmentation faults in the finalize_stats rule. Both rules are written as "run" commands, so they run directly in the main Python env.
Hey Sam, I think it might be best to set resources on both of these rules to pull from the maximum memory supplied by the user as is done in a few of the other rules
That would prevent snakemake from trying to auto-generate memory requirements that end up being insufficient, and it would also make sure the user is in control of how much RAM can be used.
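A minimal sketch of the suggestion above, assuming the user's memory cap is available as a `max_memory` config value in gigabytes. The key name, the default, and the helper `mem_mb_from_config` are hypothetical and not taken from Aviary. In a Snakefile the helper could be referenced as `resources: mem_mb=lambda wildcards, attempt: mem_mb_from_config(config, attempt)`.

```python
def mem_mb_from_config(config, attempt=1, default_gb=16):
    """Return a mem_mb value taken from the user's memory cap.

    `config` is a dict-like object (Snakemake's `config` in a Snakefile).
    `max_memory` is an assumed key holding the cap in gigabytes.
    `attempt` is accepted so the function fits Snakemake's callable
    signature; it is deliberately ignored so retries never exceed the cap.
    """
    max_gb = int(config.get("max_memory", default_gb))
    return max_gb * 1024


if __name__ == "__main__":
    print(mem_mb_from_config({"max_memory": 64}))
```

As noted in the thread, this only changes what Snakemake reserves for scheduling; it does not by itself limit or raise the memory the process can actually use.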
But doesn't that just affect how much memory snakemake assumes it will use (e.g. https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#resources)? So it would only matter if there is memory pressure from another rule? For finalize_stats, only singlem_pipe_reads tends to run at the same time in my testing.
It does, but it probably wouldn't hurt in this scenario. The default-resources are also derived from the input file, and I don't know what happens if the default resources and requested resources mismatch by a large margin. The only things that should kill a snakemake rule are the scheduler or the system itself. It doesn't look like the scheduler is killing your job, but how large exactly are these coverm files that are being generated? Are they just too large for memory?
The smallest is 1.2MB and still errored. The rule resources are
And is this only happening in the co-assembly pipeline, or all the time? Could you send me a couple of example coverm.cov files that failed?
I've sent one in an email. This is happening with both single and co-assembled assemblies, though I'm running recover independently.
I'll try adding resources and putting them in a snakemake script
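A rough sketch of what moving a "run" block into a `script:` file could look like. Snakemake injects a global object named `snakemake` (with `input`, `output`, `params`, and so on) into such scripts, and the script runs in its own Python process rather than inside the main Snakemake interpreter. The body below is only a placeholder that copies a tab-separated table; the real filtering logic of vamb_jgi_filter is not shown in this thread and is not reproduced here.

```python
import csv


def main(smk):
    """Copy a tab-separated table from smk.input[0] to smk.output[0].

    `smk` stands in for the `snakemake` object that Snakemake provides
    to script files. The row loop is a hypothetical placeholder for the
    rule's actual processing.
    """
    with open(smk.input[0], newline="") as src, \
            open(smk.output[0], "w", newline="") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        for row in reader:
            writer.writerow(row)  # placeholder: real filtering would go here


# Snakemake defines `snakemake` only when it runs this file via `script:`.
if "snakemake" in globals():
    main(snakemake)
```

Keeping the work inside `main` means the same function can be exercised outside Snakemake with a stand-in object, which helps when trying to reproduce an error in isolation.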
Okay, so the coverm file should be totally fine; it's just a fairly small assembly. But I looked back at the original error and noticed this warning line: I think there might be something sus happening with the python/protobuf/snakemake installations in the root snakemake/aviary environment that you are using. Maybe try uninstalling and re-installing protobuf. Also check out this thread: protocolbuffers/protobuf#9180 Note this is the original error, so it is occurring with
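To check what is actually installed in the environment before reinstalling anything, something like the following could be run with that environment's Python. It is a sketch using only the standard library; the default package list is an assumption about what is worth inspecting here, and the helper name `report_versions` is made up for illustration.

```python
import sys
from importlib import metadata


def report_versions(packages=("protobuf", "snakemake", "pandas")):
    """Return the Python version and the installed version of each package.

    Packages that are not installed map to None.
    """
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing this output between the failing main environment and a clean one where the rule succeeds should show whether the python or protobuf versions differ.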
I am finding it difficult to reproduce the error in a simplified run (running just the vamb_jgi_filter rule). No error with pandas or polars, with or without the resources tag. But it still errors in the main Aviary v0.6.0 env. So looks like this whole thing is an env error. |
Okay cool, yeah, refer to my previous comment for the potential source of the issue in the environment. You should be able to update either python or protobuf without nuking the entire environment. But if you can't, then yeah, you'll have to remake the environment.
Fixes #99