Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete lib directory and replace with utils_* subworkflows #1197

Merged
merged 9 commits into from
Jan 22, 2024

Conversation

drpatelh
Copy link
Member

@drpatelh drpatelh commented Jan 20, 2024

  • Delete lib/ directory and replace with utils_* subworkflows installed from nf-core/modules
  • Split out PREPARE_GENOME subworkflow to run outside of main rnaseq workflow
  • Remove CUSTOM_DUMPSOFTWAREVERSIONS to use native MultiQC software version reporting
  • Replace local MultiQC module with the one from nf-core/modules
  • Remove test_cache profile as this will be superseded by using test data from S3 whilst porting to nf-test

Copy link

github-actions bot commented Jan 20, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 3734375

+| ✅ 142 tests passed       |+
#| ❔  10 tests were ignored |#
!| ❗   5 tests had warnings |!

❗ Test warnings:

  • files_exist - File not found: assets/multiqc_config.yml
  • files_exist - File not found: .github/workflows/awstest.yml
  • files_exist - File not found: .github/workflows/awsfulltest.yml
  • files_exist - File not found: lib/WorkflowRnaseq.groovy
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline

❔ Tests ignored:

  • files_exist - File is ignored: lib/nfcore_external_java_deps.jar
  • files_exist - File is ignored: lib/NfcoreTemplate.groovy
  • files_exist - File is ignored: lib/Utils.groovy
  • files_exist - File is ignored: lib/WorkflowMain.groovy
  • files_unchanged - File ignored due to lint config: assets/email_template.html
  • files_unchanged - File ignored due to lint config: assets/email_template.txt
  • files_unchanged - File does not exist: lib/nfcore_external_java_deps.jar
  • files_unchanged - File does not exist: lib/NfcoreTemplate.groovy
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/rnaseq/rnaseq/.github/workflows/awstest.yml
  • multiqc_config - 'assets/multiqc_config.yml' not found

✅ Tests passed:

Run details

  • nf-core/tools version 2.11.1
  • Run at 2024-01-21 11:13:50

@drpatelh drpatelh marked this pull request as draft January 20, 2024 19:27
@drpatelh drpatelh changed the base branch from nf-test to config_refactor January 21, 2024 09:39
@drpatelh drpatelh marked this pull request as ready for review January 21, 2024 10:57
Copy link
Member

@maxulysse maxulysse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I LOOOOOVE IT

@maxulysse
Copy link
Member

Just please not that there is still an issue with the monochrome_logs params:

> nextflow run drpatelh/nf-core-rnaseq -r remove_lib -profile test,docker --outdir results
N E X T F L O W  ~  version 23.04.3
Pulling drpatelh/nf-core-rnaseq ...
 downloaded from https://github.com/drpatelh/nf-core-rnaseq.git
Launching `https://github.com/drpatelh/nf-core-rnaseq` [pedantic_bassi] DSL2 - revision: 37343750f1 [remove_lib]
WARN: Access to undefined parameter `monochromeLogs` -- Initialise it to a default value eg. `params.monochromeLogs = some_value`

@drpatelh
Copy link
Member Author

drpatelh commented Jan 22, 2024

Just please not that there is still an issue with the monochrome_logs params:

Yep, I think this is actually coming from the nf-validation plugin because I see the same issue with nf-core/fetchngs. Can you create an issue there please? I started trying to look at the nf-validation code base to find the issue but got distracted.

@drpatelh drpatelh merged commit 9308fd2 into nf-core:config_refactor Jan 22, 2024
29 checks passed
Copy link
Contributor

@adamrtalbot adamrtalbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future thoughts for as we update this.

Comment on lines 36 to +59
workflow PREPARE_GENOME {
take:
fasta // file: /path/to/genome.fasta
gtf // file: /path/to/genome.gtf
gff // file: /path/to/genome.gff
additional_fasta // file: /path/to/additional.fasta
transcript_fasta // file: /path/to/transcript.fasta
gene_bed // file: /path/to/gene.bed
splicesites // file: /path/to/splicesites.txt
bbsplit_fasta_list // file: /path/to/bbsplit_fasta_list.txt
star_index // directory: /path/to/star/index/
rsem_index // directory: /path/to/rsem/index/
salmon_index // directory: /path/to/salmon/index/
kallisto_index // directory: /path/to/kallisto/index/
hisat2_index // directory: /path/to/hisat2/index/
bbsplit_index // directory: /path/to/rsem/index/
gencode // boolean: whether the genome is from GENCODE
is_aws_igenome // boolean: whether the genome files are from AWS iGenomes
biotype // string: if additional fasta file is provided biotype value to use when appending entries to GTF file
prepare_tool_indices // list: tools to prepare indices for
filter_gtf // boolean: whether to filter GTF file
fasta // file: /path/to/genome.fasta
gtf // file: /path/to/genome.gtf
gff // file: /path/to/genome.gff
additional_fasta // file: /path/to/additional.fasta
transcript_fasta // file: /path/to/transcript.fasta
gene_bed // file: /path/to/gene.bed
splicesites // file: /path/to/splicesites.txt
bbsplit_fasta_list // file: /path/to/bbsplit_fasta_list.txt
star_index // directory: /path/to/star/index/
rsem_index // directory: /path/to/rsem/index/
salmon_index // directory: /path/to/salmon/index/
kallisto_index // directory: /path/to/kallisto/index/
hisat2_index // directory: /path/to/hisat2/index/
bbsplit_index // directory: /path/to/rsem/index/
gencode // boolean: whether the genome is from GENCODE
featurecounts_group_type // string: The attribute type used to group feature types in the GTF file when generating the biotype plot with featureCounts
aligner // string: Specifies the alignment algorithm to use - available options are 'star_salmon', 'star_rsem' and 'hisat2'
pseudo_aligner // string: Specifies the pseudo aligner to use - available options are 'salmon'. Runs in addition to '--aligner'
skip_gtf_filter // boolean: Skip filtering of GTF for valid scaffolds and/ or transcript IDs
skip_bbsplit // boolean: Skip BBSplit for removal of non-reference genome reads
skip_alignment // boolean: Skip all of the alignment-based processes within the pipeline
skip_pseudo_alignment // boolean: Skip all of the pseudoalignment-based processes within the pipeline
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this, I think this subworkflow is doing waaayyyyy too much.

Comment on lines +96 to +113
// Determine whether to filter the GTF or not
def filter_gtf =
((
// Condition 1: Alignment is required and aligner is set
!skip_alignment && aligner
) ||
(
// Condition 2: Pseudoalignment is required and pseudoaligner is set
!skip_pseudo_alignment && pseudo_aligner
) ||
(
// Condition 3: Transcript FASTA file is not provided
!transcript_fasta
)) &&
(
// Condition 4: --skip_gtf_filter is not provided
!skip_gtf_filter
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Way too much stuff going on here. We should be able to simplify this by clarifying what filter_gtf means. What are we filtering from the GTF? Why?

//
// Print version and exit if required and dump pipeline parameters to JSON file
//
UTILS_NEXTFLOW_PIPELINE (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not happy with the name UTILS_NEXTFLOW_PIPELINE, NEXTFLOW_PIPELINE_UTILITIES makes more sense but it still doesn't say what the workflow actually does, making it hard to read.

If we didn't know what the subworkflow did, what would we think this was doing?

def pre_help_text = nfCoreLogo(params.monochrome_logs)
def post_help_text = '\n' + workflowCitation() + '\n' + dashedLine(params.monochrome_logs)
def String workflow_command = "nextflow run ${workflow.manifest.name} -profile <docker/singularity/.../institute> --input samplesheet.csv --genome GRCh37 --outdir <OUTDIR>"
UTILS_NFVALIDATION_PLUGIN (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same problem here.

========================================================================================
*/

workflow PIPELINE_INITIALISATION {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is slightly better, but a function name is a verb so saying what it is doing is more clear for the developer.

//
// Custom validation for pipeline parameters
//
validateInputParameters()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at this and without looking at the code knew exactly what it was doing. A+.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants