diff --git a/CHANGELOG.md b/CHANGELOG.md index 1cc3c622..42b92beb 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,6 +3,31 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html). +## [4.0] - 2024-04-22 Ascendio + +### `Added` + +- [#319](https://github.com/nf-core/airrflow/pull/319) Added AIRR compliance badge + +### `Fixed` + +- [#319](https://github.com/nf-core/airrflow/pull/319) Fix test full profile and nebnext_umi_tcr profile. +- [#321](https://github.com/nf-core/airrflow/pull/321) Label Dowser tips by isotype instead of c_call by default. +- [#322](https://github.com/nf-core/airrflow/pull/322) Use RAxML as the default builder for dowser. Skip lineage trees by default. + +### `Dependencies` + +| Dependency | Old version | New version | +| ---------- | ----------- | ----------- | +| enchantr | 0.1.11 | 0.1.14 | + +### `Deprecated parameters` + +- `--skip_lineage_trees` is now deprecated in favor of `--lineage_trees`. Lineage trees are skipped by default. +- `--igphyml` parameter is deprecated in favor of `--lineage_tree_exec`. All lineage tree building tools that are part of Dowser are now supported. +- `--igblast_base` is deprecated in favor of `--reference_igblast`. +- `--imgtdb_base` is deprecated in favor of `--reference_fasta`. + ## [3.3.0] - 2024-03-31 Confringo ### `Added` diff --git a/CITATIONS.md b/CITATIONS.md index 5dd7833c..4300d353 100644 --- a/CITATIONS.md +++ b/CITATIONS.md @@ -32,42 +32,10 @@ > Gupta, N. T., Vander Heiden, J. A., Uduman, M., Gadala-Maria, D., Yaari, G., & Kleinstein, S. H. (2015). Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data: Table 1. Bioinformatics, 31(20), 3356–3358. -- [Alakazam](https://doi.org/10.1126/scitranslmed.3008879) - - > Stern, J. N. H., Yaari, G., Vander Heiden, J. A., Church, G., Donahue, W. F., Hintzen, R. Q., … O’Connor, K. C. (2014).
B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes. Science Translational Medicine, 6(248). - -- [SCOPer](https://doi.org/10.1093/bioinformatics/bty235) - - > Nouri N, Kleinstein S (2018). “A spectral clustering-based method for identifying clones from high-throughput B cell repertoire sequencing data.” Bioinformatics, i341-i349. - - > Nouri N, Kleinstein S (2020). “Somatic hypermutation analysis for improved identification of B cell clonal families from next-generation sequencing data.” PLOS Computational Biology, 16(6), e1007977. - - > Gupta N, Adams K, Briggs A, Timberlake S, Vigneault F, Kleinstein S (2017). “Hierarchical clustering can identify B cell clones with high confidence in Ig repertoire sequencing data.” The Journal of Immunology, 2489-2499. - -- [Dowser](https://doi.org/10.1371/journal.pcbi.1009885) - - > Hoehn K, Pybus O, Kleinstein S (2022). “Phylogenetic analysis of migration, differentiation, and class switching in B cells.” PLoS Computational Biology. - -- [IgPhyML](https://www.pnas.org/doi/10.1073/pnas.1906020116) - - > Hoehn K, Van der Heiden J, Zhou J, Lunter G, Pybus O, Kleinstein S (2019). “Repertoire-wide phylogenetic models of B cell molecular evolution reveal evolutionary signatures of aging and vaccination.” PNAS. - - [IgBLAST](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692102/) > Ye J, Ma N, Madden TL, Ostell JM. (2013). IgBLAST: an immunoglobulin variable domain sequence analysis tool. Nucleic Acids Res. -- [Fastp](https://doi.org/10.1093/bioinformatics/bty560) - - > Shifu Chen, Yanqing Zhou, Yaru Chen, Jia Gu, fastp: an ultra-fast all-in-one FASTQ preprocessor, Bioinformatics. 2018 Sept 1; 34(17), i884–i890. doi: 10.1093/bioinformatics/bty560. - -- [pRESTO](https://doi.org/10.1093/bioinformatics/btu138) - - > Vander Heiden, J. A., Yaari, G., Uduman, M., Stern, J. N. H., O’Connor, K. C., Hafler, D. A., … Kleinstein, S. H. (2014). 
pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics, 30(13), 1930–1932. - -- [SHazaM, Change-O](https://doi.org/10.1093/bioinformatics/btv359) - - > Gupta, N. T., Vander Heiden, J. A., Uduman, M., Gadala-Maria, D., Yaari, G., & Kleinstein, S. H. (2015). Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. Bioinformatics, 31(20), 3356–3358. - - [Alakazam](https://doi.org/10.1126/scitranslmed.3008879) > Stern, J. N. H., Yaari, G., Vander Heiden, J. A., Church, G., Donahue, W. F., Hintzen, R. Q., … O’Connor, K. C. (2014). B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes. Science Translational Medicine, 6(248). @@ -88,6 +56,10 @@ > Hoehn K, Van der Heiden J, Zhou J, Lunter G, Pybus O, Kleinstein S (2019). “Repertoire-wide phylogenetic models of B cell molecular evolution reveal evolutionary signatures of aging and vaccination. PNAS, 116(45) 22664-22672." +- [RAxML](https://doi.org/10.1093/bioinformatics/btu033) + + > Stamatakis A. (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9): 1312-1313. + - [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/) > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
diff --git a/README.md b/README.md index 0332b20f..ef1fd759 100644 --- a/README.md +++ b/README.md @@ -19,6 +19,7 @@ [![Follow on Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core) [![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core) [![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core) +[![AIRR compliant](https://img.shields.io/static/v1?label=AIRR-C%20sw-tools%20v1&message=compliant&color=008AFF&labelColor=000000&style=plastic)](https://docs.airr-community.org/en/stable/swtools/airr_swtools_standard.html) ## Introduction @@ -32,7 +33,7 @@ On release, automated continuous integration tests run the pipeline on a full-si ## Pipeline summary -nf-core/airrflow allows the end-to-end processing of BCR and TCR bulk and single cell targeted sequencing data. Several protocols are supported, please see the [usage documentation](https://nf-co.re/airrflow/usage) for more details on the supported protocols. +nf-core/airrflow allows the end-to-end processing of BCR and TCR bulk and single cell targeted sequencing data. Several protocols are supported, please see the [usage documentation](https://nf-co.re/airrflow/usage) for more details on the supported protocols. The pipeline has been certified as [AIRR compliant](https://docs.airr-community.org/en/stable/swtools/airr_swtools_compliant.html) by the AIRR community, which means that it is compatible with downstream analysis tools also supporting this format. ![nf-core/airrflow overview](docs/images/metro-map-airrflow.png) @@ -58,7 +59,7 @@ nf-core/airrflow allows the end-to-end processing of BCR and TCR bulk and single 2. V(D)J annotation and filtering (bulk and single-cell) -- Assign gene segments with `IgBlast` using the IMGT database (`Change-O AssignGenes`). 
+- Assign gene segments with `IgBlast` using a germline reference (`Change-O AssignGenes`). - Annotate alignments in AIRR format (`Change-O MakeDB`) - Filter by alignment quality (locus matching v_call chain, min 200 informative positions, max 10% N nucleotides) - Filter productive sequences (`Change-O ParseDB split`) @@ -80,8 +81,8 @@ nf-core/airrflow allows the end-to-end processing of BCR and TCR bulk and single 4. Clonal analysis (bulk and single-cell) - Find threshold for clone definition (`SHazaM`, `EnchantR`). -- Create germlines and define clones, repertoire analysis (`Change-O`, `EnchantR`). -- Build lineage trees (`SCOPer`, `IgphyML`, `EnchantR`). +- Create germlines and define clones, repertoire analysis (`SCOPer`, `EnchantR`). +- Build lineage trees (`Dowser`, `IgPhyML`, `RAxML`, `EnchantR`). 5. Repertoire analysis and reporting @@ -124,6 +125,16 @@ nextflow run nf-core/airrflow \ --outdir ./results ``` +For common commercially available **bulk sequencing protocols** we provide pre-set profiles that specify primers, UMI length, etc. Please check [Supported protocol profiles](#supported-protocol-profiles) for a full list of available profiles. An example command running the NEBNext UMI protocol profile with Docker containers is: + +```bash +nextflow run nf-core/airrflow \ +-profile nebnext_umi,docker \ +--mode fastq \ +--input input_samplesheet.tsv \ +--outdir results +``` + A typical command to run the pipeline from **single cell raw fastq files** (10X genomics) is: ```bash diff --git a/assets/multiqc_config.yml b/assets/multiqc_config.yml index 8bc5e1a7..23dcd315 100644 --- a/assets/multiqc_config.yml +++ b/assets/multiqc_config.yml @@ -1,5 +1,5 @@ report_comment: > - This report has been generated by the nf-core/airrflow + This report has been generated by the nf-core/airrflow analysis pipeline. For information about how to interpret these results, please see the documentation.
diff --git a/assets/repertoire_comparison.Rmd b/assets/repertoire_comparison.Rmd index 27ebbbad..d3f83e75 100644 --- a/assets/repertoire_comparison.Rmd +++ b/assets/repertoire_comparison.Rmd @@ -423,6 +423,10 @@ In addition, citations for the tools and data used in this pipeline are as follo > Hoehn K, Van der Heiden J, Zhou J, Lunter G, Pybus O, Kleinstein S (2019). “Repertoire-wide phylogenetic models of B cell molecular evolution reveal evolutionary signatures of aging and vaccination.” PNAS. +- [RAxML](https://doi.org/10.1093/bioinformatics/btu033) + + > Stamatakis A. (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9): 1312-1313. + - [TIgGER](https://doi.org/10.1073/pnas.1417683112) > Gadala-maria, D., Yaari, G., Uduman, M., & Kleinstein, S. H. (2015). Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles. Proceedings of the National Academy of Sciences, 112(8), 1–9. diff --git a/conf/clontech_umi_tcr.config b/conf/clontech_umi_tcr.config index d620dcee..34aad8ab 100644 --- a/conf/clontech_umi_tcr.config +++ b/conf/clontech_umi_tcr.config @@ -40,5 +40,4 @@ params { // TCR options clonal_threshold = 0 - skip_lineage = true } diff --git a/conf/modules.config b/conf/modules.config index a20f0bad..3bfdc09f 100644 --- a/conf/modules.config +++ b/conf/modules.config @@ -562,10 +562,9 @@ process { mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ?
null : filename } ] - ext.args = ['build':'igphyml', - 'minseq':5, - 'traits':'c_call', - 'tips':'c_call'] + ext.args = ['minseq':5, + 'traits':'isotype', + 'tips':'isotype'] } // ------------------------------- diff --git a/conf/nebnext_umi_tcr.config b/conf/nebnext_umi_tcr.config index e030d952..3318e270 100644 --- a/conf/nebnext_umi_tcr.config +++ b/conf/nebnext_umi_tcr.config @@ -37,5 +37,4 @@ params { //TCR options clonal_threshold = 0 - skip_lineage } diff --git a/conf/test.config b/conf/test.config index 1bf667f7..4d6b1c40 100644 --- a/conf/test.config +++ b/conf/test.config @@ -23,8 +23,8 @@ params { input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/Metadata_test_airr.tsv' cprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/C_primers.fasta' vprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/V_primers.fasta' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' mode = 'fastq' @@ -35,6 +35,7 @@ params { umi_position = 'R1' index_file = true isotype_column = 'c_primer' + lineage_trees = true } process{ diff --git a/conf/test_assembled_hs.config b/conf/test_assembled_hs.config index 602f5462..bb6caa19 100644 --- a/conf/test_assembled_hs.config +++ b/conf/test_assembled_hs.config @@ -19,13 +19,14 @@ params { // Input data mode = 'assembled' input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-reveal/test_assembled_metadata_hs.tsv' - imgtdb_base = 
'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' reassign = true productive_only = true collapseby = 'filename' cloneby = 'subject_id' remove_chimeric = true + lineage_trees = true } diff --git a/conf/test_assembled_immcantation_devel_hs.config b/conf/test_assembled_immcantation_devel_hs.config index dad18d47..da5c8d56 100644 --- a/conf/test_assembled_immcantation_devel_hs.config +++ b/conf/test_assembled_immcantation_devel_hs.config @@ -19,9 +19,8 @@ params { // Input data mode = 'assembled' input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-reveal/test_assembled_metadata_hs.tsv' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' - igphyml = '/usr/local/share/igphyml/src/igphyml' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' reassign = true productive_only = true diff --git a/conf/test_assembled_immcantation_devel_mm.config b/conf/test_assembled_immcantation_devel_mm.config index 2aea10a3..33fd5bcb 100644 --- a/conf/test_assembled_immcantation_devel_mm.config +++ b/conf/test_assembled_immcantation_devel_mm.config @@ -19,9 +19,8 @@ params { // Input data mode = 'assembled' input = 
'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-reveal/test_assembled_metadata_mm.tsv' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' - igphyml = '/usr/local/share/igphyml/src/igphyml' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' reassign = true productive_only = true diff --git a/conf/test_assembled_mm.config b/conf/test_assembled_mm.config index a80d2099..69ad5052 100644 --- a/conf/test_assembled_mm.config +++ b/conf/test_assembled_mm.config @@ -19,13 +19,15 @@ params { // Input data mode = 'assembled' input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-reveal/test_assembled_metadata_mm.tsv' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' reassign = true productive_only = true collapseby = 'filename' cloneby = 'subject_id' remove_chimeric = true + + lineage_trees = true } diff --git a/conf/test_clontech_umi.config b/conf/test_clontech_umi.config index 2263d057..1d64ad1c 100644 --- a/conf/test_clontech_umi.config +++ b/conf/test_clontech_umi.config @@ -23,10 +23,9 @@ params { // Input data input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-clontech/samplesheet.tsv' - imgtdb_base = 
'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' clonal_threshold = 0.1 - skip_lineage = true } diff --git a/conf/test_full.config b/conf/test_full.config index 8196a9d3..0ac79d53 100644 --- a/conf/test_full.config +++ b/conf/test_full.config @@ -18,8 +18,10 @@ params { input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/metadata_pcr_umi_airr_300.tsv' cprimers = 's3://ngi-igenomes/test-data/airrflow/pcr_umi/cprimers.fasta' vprimers = 's3://ngi-igenomes/test-data/airrflow/pcr_umi/vprimers.fasta' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + + lineage_trees = true // Other params library_generation_method = 'specific_pcr_umi' @@ -27,13 +29,26 @@ params { umi_length = 15 umi_start = 0 umi_position = 'R1' + isotype_column = 'c_primer' } process { withName:DOWSER_LINEAGES{ - ext.args = ['build':'igphyml', - 'minseq':5, - 'traits':'c_primer', - 'tips':'c_primer'] + ext.args = ['minseq':5, + 'traits':'isotype', + 'tips':'isotype'] + } + + withName:DEFINE_CLONES_COMPUTE{ + ext.args = ['outname':'', 'model':'hierarchical', + 'method':'nt', 'linkage':'single', + 'min_n':30] + + } + withName:DEFINE_CLONES_REPORT{ + ext.args = 
['outname':'', 'model':'hierarchical', + 'method':'nt', 'linkage':'single', + 'min_n':30] + } } diff --git a/conf/test_nebnext_umi.config b/conf/test_nebnext_umi.config index d1712c8d..76c9bbea 100644 --- a/conf/test_nebnext_umi.config +++ b/conf/test_nebnext_umi.config @@ -24,10 +24,9 @@ params { // Input data input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-neb/samplesheet.tsv' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' clonal_threshold = 0.1 - skip_lineage = true } diff --git a/conf/test_no_umi.config b/conf/test_no_umi.config index e17a6526..8800b20c 100644 --- a/conf/test_no_umi.config +++ b/conf/test_no_umi.config @@ -30,8 +30,8 @@ params { input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-no-umi/Metadata_test-no-umi_airr.tsv' cprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-no-umi/Greiff2014_CPrimers.fasta' vprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-no-umi/Greiff2014_VPrimers.fasta' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' } diff --git a/conf/test_nocluster.config 
b/conf/test_nocluster.config index 469de7b8..aabccb9b 100644 --- a/conf/test_nocluster.config +++ b/conf/test_nocluster.config @@ -23,8 +23,8 @@ params { input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/Metadata_test_airr.tsv' cprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/C_primers.fasta' vprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/V_primers.fasta' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' mode = 'fastq' diff --git a/conf/test_raw_immcantation_devel.config b/conf/test_raw_immcantation_devel.config index b309cb60..11b8ff69 100644 --- a/conf/test_raw_immcantation_devel.config +++ b/conf/test_raw_immcantation_devel.config @@ -24,9 +24,8 @@ params { cprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/C_primers.fasta' vprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/V_primers.fasta' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' - igphyml = '/usr/local/share/igphyml/src/igphyml' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' mode = 'fastq' diff --git a/conf/test_tcr.config 
b/conf/test_tcr.config index fb878caa..5af84ee7 100644 --- a/conf/test_tcr.config +++ b/conf/test_tcr.config @@ -25,15 +25,14 @@ params { library_generation_method = 'dt_5p_race_umi' cprimer_position = 'R1' clonal_threshold = 0 - skip_lineage = true // Input data input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-tcr/TCR_metadata_airr.tsv' cprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-tcr/cprimers.fasta' race_linker = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-tcr/linker.fasta' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' } diff --git a/docs/output.md b/docs/output.md index 1dedb05c..532bdf80 100644 --- a/docs/output.md +++ b/docs/output.md @@ -245,7 +245,7 @@ generate a `.fasta` file from the rearrangement table. -Assign genes with Igblast, using the IMGT database is performed by the [AssignGenes](https://changeo.readthedocs.io/en/stable/examples/igblast.html#running-igblast) command of the Change-O tool from the Immcantation Framework. +Assigning genes with Igblast, using a germline reference, is performed by the [AssignGenes](https://changeo.readthedocs.io/en/stable/examples/igblast.html#running-igblast) command of the Change-O tool from the Immcantation Framework. ### Make database from assigned genes @@ -482,7 +482,7 @@ Parsing the logs from the previous processes. Summary of the number of sequences Copy of the downloaded IMGT database by the process `fetch_databases`, used for the gene assignment step.
-If databases are provided with `--imgtdb_base` and `--igblast_base` this folder will not be present. +If databases are provided with `--reference_fasta` and `--reference_igblast` this folder will not be present. ## MultiQC diff --git a/modules/local/airrflow_report/airrflow_report.nf b/modules/local/airrflow_report/airrflow_report.nf index fafbd052..b4422153 100644 --- a/modules/local/airrflow_report/airrflow_report.nf +++ b/modules/local/airrflow_report/airrflow_report.nf @@ -6,8 +6,8 @@ process AIRRFLOW_REPORT { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: tuple val(meta), path(tab) // sequence tsv table in AIRR format diff --git a/modules/local/changeo/changeo_creategermlines.nf b/modules/local/changeo/changeo_creategermlines.nf index d424377a..35cf76d4 100644 --- a/modules/local/changeo/changeo_creategermlines.nf +++ b/modules/local/changeo/changeo_creategermlines.nf @@ -11,7 +11,7 @@ process CHANGEO_CREATEGERMLINES { input: tuple val(meta), path(tab) // sequence tsv table in AIRR format - path(imgt_base) // imgt db + path(reference_fasta) // reference fasta output: tuple val(meta), path("*germ-pass.tsv"), emit: tab @@ -22,7 +22,7 @@ process CHANGEO_CREATEGERMLINES { def args = task.ext.args ?: '' """ CreateGermlines.py -d ${tab} \\ - -r ${imgt_base}/${meta.species}/vdj/ \\ + -r ${reference_fasta}/${meta.species}/vdj/ \\ -g dmask --format airr \\ --log ${meta.id}.log --outname ${meta.id} $args > ${meta.id}_create-germlines_command_log.txt ParseLog.py -l ${meta.id}.log -f ID V_CALL D_CALL J_CALL diff --git a/modules/local/changeo/changeo_makedb.nf b/modules/local/changeo/changeo_makedb.nf index a71d6282..4862ba86 100644 --- 
a/modules/local/changeo/changeo_makedb.nf +++ b/modules/local/changeo/changeo_makedb.nf @@ -6,14 +6,13 @@ process CHANGEO_MAKEDB { conda "bioconda::changeo=1.3.0 bioconda::igblast=1.22.0 conda-forge::wget=1.20.1" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - //TODO: update mulled containers when available 'https://depot.galaxyproject.org/singularity/mulled-v2-7d8e418eb73acc6a80daea8e111c94cf19a4ecfd:a9ee25632c9b10bbb012da76e6eb539acca8f9cd-1' : 'biocontainers/mulled-v2-7d8e418eb73acc6a80daea8e111c94cf19a4ecfd:a9ee25632c9b10bbb012da76e6eb539acca8f9cd-1' }" input: tuple val(meta), path(reads) // reads in fasta format path(igblast) // igblast fasta from ch_igblast_db_for_process_igblast.mix(ch_igblast_db_for_process_igblast_mix).collect() - path(imgt_base) + path(reference_fasta) output: tuple val(meta), path("*db-pass.tsv"), emit: tab //sequence table in AIRR format @@ -24,7 +23,7 @@ process CHANGEO_MAKEDB { def args = task.ext.args ?: '' """ MakeDb.py igblast -i $igblast -s $reads -r \\ - ${imgt_base}/${meta.species.toLowerCase()}/vdj/ \\ + ${reference_fasta}/${meta.species.toLowerCase()}/vdj/ \\ $args \\ --outname ${meta.id} > ${meta.id}_makedb_command_log.txt diff --git a/modules/local/enchantr/collapse_duplicates.nf b/modules/local/enchantr/collapse_duplicates.nf index ebec7209..903824fe 100644 --- a/modules/local/enchantr/collapse_duplicates.nf +++ b/modules/local/enchantr/collapse_duplicates.nf @@ -8,8 +8,8 @@ process COLLAPSE_DUPLICATES { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: tuple val(meta), path(tabs) // tuple [val(meta), sequence tsv in AIRR format ] diff --git a/modules/local/enchantr/define_clones.nf b/modules/local/enchantr/define_clones.nf index b6cf9ec8..64b8e7df 100644 --- a/modules/local/enchantr/define_clones.nf +++ b/modules/local/enchantr/define_clones.nf @@ -25,13 +25,13 @@ process DEFINE_CLONES { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: tuple val(meta), path(tabs) // meta, sequence tsv in AIRR format val threshold - path imgt_base + path reference_fasta path repertoires_samplesheet output: @@ -53,7 +53,7 @@ process DEFINE_CLONES { """ Rscript -e "enchantr::enchantr_report('define_clones', \\ report_params=list('input'='${input}', \\ - 'imgt_db'='${imgt_base}', \\ + 'imgt_db'='${reference_fasta}', \\ 'species'='auto', \\ 'cloneby'='${params.cloneby}', \\ 'outputby'='${params.cloneby}', \\ diff --git a/modules/local/enchantr/detect_contamination.nf b/modules/local/enchantr/detect_contamination.nf index 1ed10e8e..aae3ef92 100644 --- a/modules/local/enchantr/detect_contamination.nf +++ b/modules/local/enchantr/detect_contamination.nf @@ -9,8 +9,8 @@ process DETECT_CONTAMINATION { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: path(tabs) diff --git a/modules/local/enchantr/dowser_lineages.nf b/modules/local/enchantr/dowser_lineages.nf index e50a9e07..03444f19 100644 --- a/modules/local/enchantr/dowser_lineages.nf +++ b/modules/local/enchantr/dowser_lineages.nf @@ -25,8 +25,8 @@ process DOWSER_LINEAGES { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: tuple val(meta), path(tabs) @@ -43,7 +43,8 @@ process DOWSER_LINEAGES { """ Rscript -e "enchantr::enchantr_report('dowser_lineage', \\ report_params=list('input'='${tabs}', \\ - 'exec'='${params.igphyml}', \\ + 'build'='${params.lineage_tree_builder}', \\ + 'exec'='${params.lineage_tree_exec}', \\ 'outdir'=getwd(), \\ 'nproc'=${task.cpus},\\ 'log'='${id_name}_dowser_command_log' ${args}))" diff --git a/modules/local/enchantr/find_threshold.nf b/modules/local/enchantr/find_threshold.nf index 89b1c3b8..8632e081 100644 --- a/modules/local/enchantr/find_threshold.nf +++ b/modules/local/enchantr/find_threshold.nf @@ -25,8 +25,8 @@ process FIND_THRESHOLD { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: diff --git a/modules/local/enchantr/remove_chimeric.nf b/modules/local/enchantr/remove_chimeric.nf index 76f4e0b5..94805169 100644 --- a/modules/local/enchantr/remove_chimeric.nf +++ b/modules/local/enchantr/remove_chimeric.nf @@ -9,13 +9,13 @@ process REMOVE_CHIMERIC { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: tuple val(meta), path(tab) // sequence tsv in AIRR format - path(imgt_base) + path(reference_fasta) output: tuple val(meta), path("*chimera-pass.tsv"), emit: tab // sequence tsv in AIRR format diff --git a/modules/local/enchantr/report_file_size.nf b/modules/local/enchantr/report_file_size.nf index ece9d93f..4fc4c3fa 100644 --- a/modules/local/enchantr/report_file_size.nf +++ b/modules/local/enchantr/report_file_size.nf @@ -10,8 +10,8 @@ process REPORT_FILE_SIZE { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: path logs diff --git a/modules/local/enchantr/single_cell_qc.nf b/modules/local/enchantr/single_cell_qc.nf index 6b232155..49e97796 100644 --- a/modules/local/enchantr/single_cell_qc.nf +++ b/modules/local/enchantr/single_cell_qc.nf @@ -24,8 +24,8 @@ process SINGLE_CELL_QC { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: path(tabs) diff --git a/modules/local/enchantr/validate_input.nf b/modules/local/enchantr/validate_input.nf index 0dcd884e..db8ab075 100644 --- a/modules/local/enchantr/validate_input.nf +++ b/modules/local/enchantr/validate_input.nf @@ -10,8 +10,8 @@ process VALIDATE_INPUT { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: file samplesheet diff --git a/modules/local/fetch_databases.nf b/modules/local/fetch_databases.nf index 2deb3cb4..07853277 100644 --- a/modules/local/fetch_databases.nf +++ b/modules/local/fetch_databases.nf @@ -10,7 +10,7 @@ process FETCH_DATABASES { output: path("igblast_base"), emit: igblast - path("imgtdb_base"), emit: imgt + path("imgtdb_base"), emit: reference_fasta path "versions.yml" , emit: versions path("igblast_base/database/imgt_human_ig_v.ndb"), emit: igblast_human_ig_v path("igblast_base/database/imgt_human_ig_d.ndb"), emit: igblast_human_ig_d diff --git a/modules/local/reveal/add_meta_to_tab.nf b/modules/local/reveal/add_meta_to_tab.nf index 67c930d6..8413cebc 100644 --- a/modules/local/reveal/add_meta_to_tab.nf +++ b/modules/local/reveal/add_meta_to_tab.nf @@ -7,8 +7,8 @@ process ADD_META_TO_TAB { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" cache 'deep' // Without 'deep' this process would run when using -resume diff --git a/modules/local/reveal/filter_junction_mod3.nf b/modules/local/reveal/filter_junction_mod3.nf index c373ddbf..f792aca2 100644 --- a/modules/local/reveal/filter_junction_mod3.nf +++ b/modules/local/reveal/filter_junction_mod3.nf @@ -7,8 +7,8 @@ process FILTER_JUNCTION_MOD3 { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: tuple val(meta), path(tab) // sequence tsv in AIRR format diff --git a/modules/local/reveal/filter_quality.nf b/modules/local/reveal/filter_quality.nf index 46062cb9..aa803279 100644 --- a/modules/local/reveal/filter_quality.nf +++ b/modules/local/reveal/filter_quality.nf @@ -7,8 +7,8 @@ process FILTER_QUALITY { error "nf-core/airrflow currently does not support Conda. Please use a container profile instead." } container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'docker.io/immcantation/airrflow:3.3.0': - 'docker.io/immcantation/airrflow:3.3.0' }" + 'docker.io/immcantation/airrflow:4.0.0': + 'docker.io/immcantation/airrflow:4.0.0' }" input: tuple val(meta), path(tab) // sequence tsv in AIRR format diff --git a/nextflow.config b/nextflow.config index dd398999..00c53278 100644 --- a/nextflow.config +++ b/nextflow.config @@ -81,8 +81,8 @@ params { // ----------------------- productive_only = true reassign = true - igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' - imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' + reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip' + reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip' fetch_imgt = false save_databases = true isotype_column = 'c_call' @@ -105,8 +105,9 @@ params { skip_report_threshold = false // tree lineage options - igphyml="/usr/local/share/igphyml/src/igphyml" - skip_lineage = false + lineage_tree_builder = 'raxml' + lineage_tree_exec = '/usr/local/bin/raxml-ng' + lineage_trees = false // ----------------------- // reporting 
options @@ -367,7 +368,7 @@ manifest { description = """B and T cell repertoire analysis pipeline with the Immcantation framework.""" mainScript = 'main.nf' nextflowVersion = '!>=23.04.0' - version = '3.3.0' + version = '4.0' doi = '10.5281/zenodo.2642009' } diff --git a/nextflow_schema.json b/nextflow_schema.json index 3902f9ed..cc33f8ff 100644 --- a/nextflow_schema.json +++ b/nextflow_schema.json @@ -338,16 +338,16 @@ "description": "Save databases so you can use the cache in future runs.", "fa_icon": "fas fa-file-download" }, - "imgtdb_base": { + "reference_fasta": { "type": "string", - "description": "Path to the cached IMGT database.", - "help_text": "By default, we provide a pre-downloaded version of the IMGT database. It is also possible to provide a custom IMGT reference database. To fetch a fresh version of IMGT, set the `--fetch_imgt` parameter instead.", + "description": "Path to the germline reference fasta.", + "help_text": "By default, we provide a pre-downloaded version of the IMGT database. It is also possible to provide a custom reference fasta database. To fetch a fresh version of IMGT, set the `--fetch_imgt` parameter instead.", "fa_icon": "fas fa-database" }, - "igblast_base": { + "reference_igblast": { "type": "string", "description": "Path to the cached igblast database.", - "help_text": "By default, we provide a pre-downloaded version of the IMGT database. It is also possible to provide a custom IMGT reference database. To fetch a fresh version of IMGT, set the `--fetch_imgt` parameter instead.", + "help_text": "By default, we provide a pre-downloaded version of the IMGT database. It is also possible to provide a custom reference fasta database. To fetch a fresh version of IMGT, set the `--fetch_imgt` parameter instead.", "fa_icon": "fas fa-database" }, "fetch_imgt": { @@ -412,9 +412,9 @@ "fa_icon": "fab fa-pagelines", "description": "Set the clustering threshold Hamming distance value. 
Default: 'auto'" }, - "skip_lineage": { + "lineage_trees": { "type": "boolean", - "description": "Skip clonal lineage analysis and lineage tree plotting.", + "description": "Perform clonal lineage tree analysis.", "fa_icon": "fas fa-angle-double-right" }, "cloneby": { @@ -429,11 +429,18 @@ "description": "Name of the field used to identify external groups used to identify a clonal threshold.", "fa_icon": "fab fa-pagelines" }, - "igphyml": { + "lineage_tree_builder": { "type": "string", - "default": "/usr/local/share/igphyml/src/igphyml", - "description": "Path to IgPhyml executable.", - "fa_icon": "fas fa-file" + "default": "raxml", + "description": "Lineage tree software to use to build trees within Dowser. If you change the default, also set the `lineage_tree_exec` parameter.", + "enum": ["raxml", "igphyml"], + "fa_icon": "fas fa-pagelines" + }, + "lineage_tree_exec": { + "type": "string", + "default": "/usr/local/bin/raxml-ng", + "description": "Path to lineage tree building executable.", + "fa_icon": "fas fa-pagelines" }, "singlecell": { "type": "string", diff --git a/subworkflows/local/bulk_qc_and_filter.nf b/subworkflows/local/bulk_qc_and_filter.nf index 34b082d1..326fbe34 100644 --- a/subworkflows/local/bulk_qc_and_filter.nf +++ b/subworkflows/local/bulk_qc_and_filter.nf @@ -7,7 +7,7 @@ workflow BULK_QC_AND_FILTER { take: ch_repertoire // tuple [meta, repertoire_tab] - ch_imgt + ch_reference_fasta main: @@ -20,7 +20,7 @@ workflow BULK_QC_AND_FILTER { // Create germlines (not --cloned) CHANGEO_CREATEGERMLINES( ch_repertoire, - ch_imgt.collect() + ch_reference_fasta.collect() ) ch_logs = ch_logs.mix(CHANGEO_CREATEGERMLINES.out.logs) ch_versions = ch_versions.mix(CHANGEO_CREATEGERMLINES.out.versions) @@ -28,7 +28,7 @@ workflow BULK_QC_AND_FILTER { // Remove chimera REMOVE_CHIMERIC( CHANGEO_CREATEGERMLINES.out.tab, - ch_imgt.collect() + ch_reference_fasta.collect() ) ch_logs = ch_logs.mix(REMOVE_CHIMERIC.out.logs) ch_versions = 
ch_versions.mix(REMOVE_CHIMERIC.out.versions) diff --git a/subworkflows/local/clonal_analysis.nf b/subworkflows/local/clonal_analysis.nf index 887aed92..68237551 100644 --- a/subworkflows/local/clonal_analysis.nf +++ b/subworkflows/local/clonal_analysis.nf @@ -7,7 +7,7 @@ include { DOWSER_LINEAGES } from '../../modules/local/enchantr/dowser_lineages' workflow CLONAL_ANALYSIS { take: ch_repertoire - ch_imgt + ch_reference_fasta ch_logo main: @@ -76,7 +76,7 @@ workflow CLONAL_ANALYSIS { DEFINE_CLONES_COMPUTE( ch_define_clones, clone_threshold.collect(), - ch_imgt.collect(), + ch_reference_fasta.collect(), [] ) @@ -102,7 +102,7 @@ workflow CLONAL_ANALYSIS { DEFINE_CLONES_REPORT( ch_all_repertoires_cloned, clone_threshold.collect(), - ch_imgt.collect(), + ch_reference_fasta.collect(), ch_all_repertoires_cloned_samplesheet ) ch_versions = DEFINE_CLONES_REPORT.out.versions @@ -114,7 +114,7 @@ workflow CLONAL_ANALYSIS { .map { it -> [ [id: "${it.baseName}".replaceFirst("__clone-pass", "")], it ] } .set{ch_repertoires_cloned} - if (!params.skip_lineage){ + if (params.lineage_trees){ DOWSER_LINEAGES( ch_repertoires_cloned ) diff --git a/subworkflows/local/databases.nf b/subworkflows/local/databases.nf index 594b340e..08e59108 100644 --- a/subworkflows/local/databases.nf +++ b/subworkflows/local/databases.nf @@ -1,6 +1,6 @@ include { FETCH_DATABASES } from '../../modules/local/fetch_databases' include { UNZIP_DB as UNZIP_IGBLAST } from '../../modules/local/unzip_db' -include { UNZIP_DB as UNZIP_IMGT } from '../../modules/local/unzip_db' +include { UNZIP_DB as UNZIP_REFERENCE_FASTA } from '../../modules/local/unzip_db' workflow DATABASES { @@ -11,44 +11,44 @@ workflow DATABASES { // FETCH DATABASES if( !params.fetch_imgt ){ - if (params.igblast_base.endsWith(".zip")) { - Channel.fromPath("${params.igblast_base}") - .ifEmpty{ error "IGBLAST DB not found: ${params.igblast_base}" } + if (params.reference_igblast.endsWith(".zip")) { + 
Channel.fromPath("${params.reference_igblast}") + .ifEmpty{ error "IGBLAST DB not found: ${params.reference_igblast}" } .set { ch_igblast_zipped } UNZIP_IGBLAST( ch_igblast_zipped.collect() ) ch_igblast = UNZIP_IGBLAST.out.unzipped ch_versions = ch_versions.mix(UNZIP_IGBLAST.out.versions) } else { - Channel.fromPath("${params.igblast_base}") - .ifEmpty { error "IGBLAST DB not found: ${params.igblast_base}" } + Channel.fromPath("${params.reference_igblast}") + .ifEmpty { error "IGBLAST DB not found: ${params.reference_igblast}" } .set { ch_igblast } } } if( !params.fetch_imgt ){ - if (params.imgtdb_base.endsWith(".zip")) { - Channel.fromPath("${params.imgtdb_base}") - .ifEmpty{ error "IMGTDB not found: ${params.imgtdb_base}" } - .set { ch_imgt_zipped } - UNZIP_IMGT( ch_imgt_zipped.collect() ) - ch_imgt = UNZIP_IMGT.out.unzipped - ch_versions = ch_versions.mix(UNZIP_IMGT.out.versions) + if (params.reference_fasta.endsWith(".zip")) { + Channel.fromPath("${params.reference_fasta}") + .ifEmpty{ error "IMGTDB not found: ${params.reference_fasta}" } + .set { ch_reference_fasta_zipped } + UNZIP_REFERENCE_FASTA( ch_reference_fasta_zipped.collect() ) + ch_reference_fasta = UNZIP_REFERENCE_FASTA.out.unzipped + ch_versions = ch_versions.mix(UNZIP_REFERENCE_FASTA.out.versions) } else { - Channel.fromPath("${params.imgtdb_base}") - .ifEmpty { error "IMGT DB not found: ${params.imgtdb_base}" } - .set { ch_imgt } + Channel.fromPath("${params.reference_fasta}") + .ifEmpty { error "IMGT DB not found: ${params.reference_fasta}" } + .set { ch_reference_fasta } } } if (params.fetch_imgt) { FETCH_DATABASES() ch_igblast = FETCH_DATABASES.out.igblast - ch_imgt = FETCH_DATABASES.out.imgt + ch_reference_fasta = FETCH_DATABASES.out.reference_fasta ch_versions = ch_versions.mix(FETCH_DATABASES.out.versions) } emit: versions = ch_versions - imgt = ch_imgt + reference_fasta = ch_reference_fasta igblast = ch_igblast } diff --git a/subworkflows/local/repertoire_analysis_reporting.nf 
b/subworkflows/local/repertoire_analysis_reporting.nf index cbcf7456..2a796751 100644 --- a/subworkflows/local/repertoire_analysis_reporting.nf +++ b/subworkflows/local/repertoire_analysis_reporting.nf @@ -30,7 +30,7 @@ workflow REPERTOIRE_ANALYSIS_REPORTING { main: ch_versions = Channel.empty() - if (params.mode == "fastq" && !params.library_generation_method in ["sc_10x_genomics"]) { + if (params.mode == "fastq" && params.library_generation_method != "sc_10x_genomics") { PARSE_LOGS( ch_presto_filterseq_logs, ch_presto_maskprimers_logs, diff --git a/subworkflows/local/vdj_annotation.nf b/subworkflows/local/vdj_annotation.nf index 4ac2b9df..692320ec 100644 --- a/subworkflows/local/vdj_annotation.nf +++ b/subworkflows/local/vdj_annotation.nf @@ -13,7 +13,7 @@ workflow VDJ_ANNOTATION { ch_fasta // [meta, fasta] ch_validated_samplesheet ch_igblast - ch_imgt + ch_reference_fasta main: ch_versions = Channel.empty() @@ -30,7 +30,7 @@ workflow VDJ_ANNOTATION { CHANGEO_MAKEDB ( CHANGEO_ASSIGNGENES.out.fasta, CHANGEO_ASSIGNGENES.out.blast, - ch_imgt.collect() + ch_reference_fasta.collect() ) ch_logs = ch_logs.mix(CHANGEO_MAKEDB.out.logs) ch_versions = ch_versions.mix(CHANGEO_MAKEDB.out.versions) @@ -78,8 +78,8 @@ workflow VDJ_ANNOTATION { emit: versions = ch_versions repertoire = ADD_META_TO_TAB.out.tab - imgt = ch_imgt - igblast = ch_igblast + reference_fasta = ch_reference_fasta + reference_igblast = ch_igblast changeo_makedb_logs = ch_assignment_logs logs = ch_logs diff --git a/workflows/airrflow.nf b/workflows/airrflow.nf index 41a96d90..bc6b7924 100644 --- a/workflows/airrflow.nf +++ b/workflows/airrflow.nf @@ -119,15 +119,15 @@ workflow AIRRFLOW { ch_fastp_json = SEQUENCE_ASSEMBLY.out.fastp_reads_json ch_fastqc_postassembly_mqc = SEQUENCE_ASSEMBLY.out.fastqc_postassembly ch_validated_samplesheet = SEQUENCE_ASSEMBLY.out.samplesheet.collect() - ch_presto_filterseq_logs = SEQUENCE_ASSEMBLY.out.presto_filterseq_logs - ch_presto_maskprimers_logs = 
SEQUENCE_ASSEMBLY.out.presto_maskprimers_logs - ch_presto_pairseq_logs = SEQUENCE_ASSEMBLY.out.presto_pairseq_logs - ch_presto_clustersets_logs = SEQUENCE_ASSEMBLY.out.presto_clustersets_logs - ch_presto_buildconsensus_logs = SEQUENCE_ASSEMBLY.out.presto_buildconsensus_logs - ch_presto_postconsensus_pairseq_logs = SEQUENCE_ASSEMBLY.out.presto_postconsensus_pairseq_logs - ch_presto_assemblepairs_logs = SEQUENCE_ASSEMBLY.out.presto_assemblepairs_logs - ch_presto_collapseseq_logs = SEQUENCE_ASSEMBLY.out.presto_collapseseq_logs - ch_presto_splitseq_logs = SEQUENCE_ASSEMBLY.out.presto_splitseq_logs + ch_presto_filterseq_logs = SEQUENCE_ASSEMBLY.out.presto_filterseq_logs.ifEmpty([]) + ch_presto_maskprimers_logs = SEQUENCE_ASSEMBLY.out.presto_maskprimers_logs.ifEmpty([]) + ch_presto_pairseq_logs = SEQUENCE_ASSEMBLY.out.presto_pairseq_logs.ifEmpty([]) + ch_presto_clustersets_logs = SEQUENCE_ASSEMBLY.out.presto_clustersets_logs.ifEmpty([]) + ch_presto_buildconsensus_logs = SEQUENCE_ASSEMBLY.out.presto_buildconsensus_logs.ifEmpty([]) + ch_presto_postconsensus_pairseq_logs = SEQUENCE_ASSEMBLY.out.presto_postconsensus_pairseq_logs.ifEmpty([]) + ch_presto_assemblepairs_logs = SEQUENCE_ASSEMBLY.out.presto_assemblepairs_logs.ifEmpty([]) + ch_presto_collapseseq_logs = SEQUENCE_ASSEMBLY.out.presto_collapseseq_logs.ifEmpty([]) + ch_presto_splitseq_logs = SEQUENCE_ASSEMBLY.out.presto_splitseq_logs.ifEmpty([]) } } else if ( params.mode == "assembled" ) { @@ -175,7 +175,7 @@ workflow AIRRFLOW { ch_fasta, ch_validated_samplesheet.collect(), DATABASES.out.igblast.collect(), - DATABASES.out.imgt.collect() + DATABASES.out.reference_fasta.collect() ) ch_versions = ch_versions.mix( VDJ_ANNOTATION.out.versions ) @@ -192,7 +192,7 @@ workflow AIRRFLOW { BULK_QC_AND_FILTER( ch_repertoire_by_processing.bulk, - VDJ_ANNOTATION.out.imgt.collect() + VDJ_ANNOTATION.out.reference_fasta.collect() ) ch_versions = ch_versions.mix( BULK_QC_AND_FILTER.out.versions ) @@ -215,7 +215,7 @@ workflow AIRRFLOW { 
// Clonal analysis CLONAL_ANALYSIS( ch_repertoires_for_clones, - VDJ_ANNOTATION.out.imgt.collect(), + VDJ_ANNOTATION.out.reference_fasta.collect(), ch_report_logo_img.collect().ifEmpty([]) ) ch_versions = ch_versions.mix( CLONAL_ANALYSIS.out.versions)