MTase linker adding spaces to output results for most jobs in workflow. #65

Open
osvatic opened this issue Aug 5, 2024 · 6 comments
@osvatic

osvatic commented Aug 5, 2024

I am trying to run MTase-linker in Nanomotif v0.4.11. The Snakemake jobs fail because spaces are inserted into the output file paths that the workflow generates (around the assembly basename and after the file extension).

Command run:

nanomotif MTase-linker run -t 4 --assembly ref/10N.222.51.B4_N03.fa --contig_bin ref/10N.222.51.B4_N03.contig_bin.tsv --bin_motifs nanomotif_results/10N.222.51.B4_N03_motifs/bin-motifs.tsv -d /lisc/app/conda/miniconda3/envs/nanomotif-0.4.11/share/ML_dependencies -o nanomotif_results/10N.222.51.B4_N03_mtase_linker

Issue:

Snippet 1 (Snakemake warnings):

Building DAG of jobs...
File path 'nanomotif_results/10N.222.51.B4_N03_mtase_linker/pfam_hmm_hits/ 10N.222.51.B4_N03 _gene_id_mod_table.tsv ' ends with whitespace. This is likely unintended. It can also lead to inconsistent results of the file-matching approach used by Snakemake.
File path 'nanomotif_results/10N.222.51.B4_N03_mtase_linker/defensefinder/ 10N.222.51.B4_N03 _processed_defense_finder_mtase.tsv ' ends with whitespace. This is likely unintended. It can also lead to inconsistent results of the file-matching approach used by Snakemake.
File path 'nanomotif_results/10N.222.51.B4_N03_mtase_linker/blastp/ 10N.222.51.B4_N03 _rebase_mtase_sign_alignment.tsv ' ends with whitespace. This is likely unintended. It can also lead to inconsistent results of the file-matching approach used by Snakemake.
File path 'nanomotif_results/10N.222.51.B4_N03_mtase_linker/pfam_hmm_hits/ 10N.222.51.B4_N03 _hmm_hits_mtase_aaseqs.tsv ' ends with whitespace. This is likely unintended. It can also lead to inconsistent results of the file-matching approach used by Snakemake.
File path 'nanomotif_results/10N.222.51.B4_N03_mtase_linker/pfam_hmm_hits/ 10N.222.51.B4_N03 _gene_id_mod_table.tsv ' ends with whitespace. This is likely unintended. It can also lead to inconsistent results of the file-matching approach used by Snakemake.
File path 'nanomotif_results/10N.222.51.B4_N03_mtase_linker/defensefinder/ 10N.222.51.B4_N03 _processed_defense_finder_mtase.faa ' ends with whitespace. This is likely unintended. It can also lead to inconsistent results of the file-matching approach used by Snakemake.
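A quick way to confirm which generated paths are affected is to scan them for whitespace. The following is only a minimal sketch: `paths_with_whitespace` is a hypothetical helper (not part of Nanomotif or Snakemake), and the two example paths are taken from the prodigal rule output in this report, once as expected and once as observed.

```python
import re

# Matches any whitespace character anywhere in a path.
_WHITESPACE = re.compile(r"\s")


def paths_with_whitespace(paths):
    """Return the paths that contain at least one whitespace character."""
    return [p for p in paths if _WHITESPACE.search(p)]


if __name__ == "__main__":
    # Expected path (no spaces) versus the path Snakemake reported.
    expected = "nanomotif_results/10N.222.51.B4_N03_mtase_linker/prodigal/10N.222.51.B4_N03.gff"
    observed = "nanomotif_results/10N.222.51.B4_N03_mtase_linker/prodigal/ 10N.222.51.B4_N03 .gff "
    for bad in paths_with_whitespace([expected, observed]):
        # repr() makes leading and trailing spaces visible.
        print(repr(bad))
```

Printing with `repr()` keeps the trailing space visible, which is easy to miss in a plain log line.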

Snippet 2 (actual error in snakemake job):

Error in rule prodigal:
    jobid: 7
    input: /lisc/scratch/jmf/internal/Analyses_Jay/methylationtesting/nanomotif_testing/ref/10N.222.51.B4_N03.fa
    output: nanomotif_results/10N.222.51.B4_N03_mtase_linker/prodigal/ 10N.222.51.B4_N03 .gff , nanomotif_results/10N.222.51.B4_N03_mtase_linker/prodigal/ 10N.222.51.B4_N03 .faa 
    conda-env: /lisc/app/conda/miniconda3/envs/nanomotif-0.4.11/share/ML_dependencies/ML_envs/a47dc28c4049018f211c4f7943154086_
    shell:
        
          prodigal -i /lisc/scratch/jmf/internal/Analyses_Jay/methylationtesting/nanomotif_testing/ref/10N.222.51.B4_N03.fa -o nanomotif_results/10N.222.51.B4_N03_mtase_linker/prodigal/ 10N.222.51.B4_N03 .gff  -a nanomotif_results/10N.222.51.B4_N03_mtase_linker/prodigal/ 10N.222.51.B4_N03 .faa  -f gff
          
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-08-05T175037.700987.snakemake.log

log:
2024-08-05T174304.626493.snakemake.log

@JSBoejer
Collaborator

JSBoejer commented Sep 3, 2024

Hey Osvatic

It seems to be a problem with the naming of the assembly file. In the snakemake script, the name of the assembly file is used as a basename for the generated files.

BASENAME = os.path.splitext(os.path.basename(ASSEMBLY_PATH))[0]
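For reference, the expression above can be checked in isolation. This is a small sketch, not the workflow code itself: `derive_basename` is a hypothetical wrapper name, and the path is the `--assembly` value from the report.

```python
import os


def derive_basename(assembly_path):
    """Strip the directory and the last extension, as in the BASENAME line above."""
    return os.path.splitext(os.path.basename(assembly_path))[0]


if __name__ == "__main__":
    name = derive_basename("ref/10N.222.51.B4_N03.fa")
    # repr() would expose any leading or trailing whitespace in the result.
    print(repr(name))
    print("contains whitespace:", any(ch.isspace() for ch in name))
```

Only the final extension is removed, so the dots inside the sample name are kept. If the result contains no whitespace, the spaces seen in the log would have to be introduced at a later step rather than by this expression.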

I tested your folder structure and naming locally, and I do not get any errors when running MTase-linker in v0.4.11.

This was the command I ran:

nanomotif MTase-linker run -t 4 --assembly ref/10N.222.51.B4_N03.fa --contig_bin ref/10N.222.51.B4_N03.contig_bin.tsv --bin_motifs nanomotif_results/10N.222.51.B4_N03_motifs/bin-motifs.tsv -d ../../../restriction-modification-annotation/data/ML_dependencies/ -o nanomotif_results/10N.222.51.B4_N03_mtase_linker

Are you using python=3.9 in your environment?

Also, maybe try renaming your assembly file to check if this resolves the problem.

Please let me know if it works :-)

Sorry for the late response -> vacation and conference.

Best Jeppe

@osvatic
Author

osvatic commented Sep 4, 2024

Hey,

The error persists if I change the file name (tried several random names with .fa at the end).

The run environment is using python 3.12.4.

I also tested a downgraded environment with Python 3.10. This fixes the error described above, but Snakemake then fails on "motif_assignment" (see attached log).

2024-09-04T133333.517982.snakemake.log

What python version should I officially be using for now, and for future tests?

@JSBoejer
Collaborator

JSBoejer commented Sep 6, 2024

Hey again,

We have now updated the dependency version requirements, and Nanomotif v0.4.12 should soon be available via bioconda.

If you install Nanomotif v0.4.12 following the installation guidelines, everything should run smoothly.

conda create -n nanomotif  python=3.9
conda activate nanomotif
conda install -c bioconda nanomotif

Nanomotif was developed using python 3.9, and therefore we recommend creating a conda environment with python=3.9.
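To confirm which interpreter an activated environment actually uses, a check along these lines can help. It is only a sketch: `python_matches` is a hypothetical helper, and the `(3, 9)` target simply mirrors the recommendation above.

```python
import sys


def python_matches(expected, actual=None):
    """Return True if the interpreter's (major, minor) version equals `expected`.

    `actual` defaults to the running interpreter; it is a parameter only so
    the comparison can be exercised with other version tuples.
    """
    if actual is None:
        actual = sys.version_info[:2]
    return tuple(actual[:2]) == tuple(expected)


if __name__ == "__main__":
    found = "{}.{}".format(*sys.version_info[:2])
    if python_matches((3, 9)):
        print("Python", found, "matches the recommended 3.9")
    else:
        print("Python", found, "differs from the recommended 3.9")
```

Running it inside the activated conda environment shows whether `python=3.9` was actually honored when the environment was created.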

I see that the MTase-linker module is not yet compatible with the newest Snakemake v8 and Python v3.12. We will look into this.

I hope this was helpful.

Best Jeppe

@osvatic
Author

osvatic commented Sep 9, 2024

Hey,

It looks like conda create installs 0.4.11 and that fails with the python restrictions.

We switched to PyPI to install 0.4.12, which works with Python 3.9.

Unfortunately, that produces the same error as previously mentioned.

@JSBoejer
Collaborator

UPDATE:
Conda Install: You are correct about the conda installation. I am currently working on making the MTase-linker module compatible with the newest Snakemake v8 and Python v3.12, which should resolve the dependency issues. This update will be available soon.

Current Solution: In the meantime, the PyPI installation of Nanomotif 0.4.12 with Python 3.9 should work. I have conducted multiple tests locally, and it works fine for me.
Based on your feedback, we will include a small test dataset and a small test run during installation of the MTase-linker module.

We appreciate you taking the time to post these issues.

I will let you know when nanomotif can be installed via conda again.

@JSBoejer
Collaborator

JSBoejer commented Oct 18, 2024

Hey Jay
In Nanomotif version 0.4.15, the MTase-linker module is now compatible with the newest Snakemake v8 and Python v3.12, which should resolve the conda installation issue.

We have updated the installation recommendations.

conda create -n nanomotif  python=3.12
conda activate nanomotif
conda install -c bioconda nanomotif

Also, we have included an installation check in the nanomotif MTase-linker install command to verify that everything works.

Moreover, we have included a small test dataset in the "nanomotif/nanomotif/datasets" path: https://github.com/MicrobialDarkMatter/nanomotif/tree/main/nanomotif/datasets

I believe the second error you experienced might be due to an incorrectly formatted contig_bin file. You can use this dataset to test this.
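A rough way to look for formatting problems in such a file is sketched below. It rests on an assumption that is not confirmed in this thread: that the file has exactly two tab-separated columns (contig, then bin) with no padding around the values. `malformed_rows` is a hypothetical helper, not a Nanomotif function.

```python
def malformed_rows(lines, n_columns=2):
    """Return (line_number, row) for rows that break the ASSUMED layout.

    Assumed layout (not confirmed by the thread): exactly `n_columns`
    tab-separated fields, none empty, none padded with whitespace.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        row = line.rstrip("\n")
        if not row:
            continue  # ignore blank lines
        fields = row.split("\t")
        padded_or_empty = any(f != f.strip() or not f for f in fields)
        if len(fields) != n_columns or padded_or_empty:
            problems.append((number, row))
    return problems


if __name__ == "__main__":
    # Hypothetical rows: the second uses a space instead of a tab,
    # the third has a padded bin name.
    example = ["contig_1\tbin1\n", "contig_2 bin1\n", "contig_3\t bin2\n"]
    for number, row in malformed_rows(example):
        print(number, repr(row))
```

For a real file, the lines can be passed in directly, for example `malformed_rows(open(path))`, where `path` is the value given to `--contig_bin`.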

2 participants