Add skip tools parameter for tool selection #68

Aratz · 2024-11-22T13:19:09Z

Hi!

I've been working on tool selection lately and tried to implement the profile-based proposal mentioned in #32 but I've found it hard to:

define a smooth UI. I find it a bit redundant to have to specify both a file describing all the available profiles and which profile is used. "profile" is probably a little confusing too, because it is too close to Nextflow config profiles. Even though a simple "skip_tools" argument is quite basic, it has the advantage of being immediately understandable and usable by anyone, and in case it gets too lengthy it can be specified in a params-file or even a Nextflow config file.
motivate why we need an alternative way to customize tools while this is already possible with nextflow configs. If we were to have this option in tool profiles too, we would need to make clear what the precedence is between the Nextflow config and the tool profile, i.e. in case both are defined, do we ignore one of them, or do we concatenate all ext.args together? Either way this will make debugging harder in many cases.

Hence I would really advocate for a simpler solution that makes better use of functionalities already provided by Nextflow. Tool selection and customization is something many other pipelines do and I think it would be much better to try to stick to the nf-core standard. The precedence between config files is well documented and this will make the user experience better for users who are already familiar with Nextflow and other nf-core pipelines.

This PR provides a very simple solution where the user provides a list of tools to be excluded. This can be provided either through command line arguments or through a config file. As specified in the Nextflow docs, CLI arguments override values defined in config files. This is similar to what's implemented in nf-core/demultiplex.
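To make the intended behaviour concrete, here is a rough Python sketch of the selection logic (an illustration only, not the pipeline's Nextflow code). It assumes `skip_tools` is a comma-separated string, as in the sarek/demultiplex style; the tool keys and both function names are hypothetical, and rejecting unknown names is just one possible design choice.

```python
# Illustration only: all names here are hypothetical, not taken from the pipeline.
AVAILABLE_TOOLS = ["fastqc", "seqfu_stats", "seqtk_sample"]  # assumed tool keys


def parse_skip_tools(value):
    """Turn a comma-separated string such as 'fastqc,seqfu_stats' into a set."""
    if not value:
        return set()
    return {name.strip().lower() for name in value.split(",") if name.strip()}


def tools_to_run(skip_tools, available=AVAILABLE_TOOLS):
    """Return the tools left after removing the skipped ones, keeping their order."""
    skipped = parse_skip_tools(skip_tools)
    unknown = skipped - set(available)
    if unknown:
        # Assumed behaviour: fail early on typos instead of silently ignoring them.
        raise ValueError(f"Unknown tool(s) in skip_tools: {sorted(unknown)}")
    return [tool for tool in available if tool not in skipped]


print(tools_to_run("seqfu_stats"))  # ['fastqc', 'seqtk_sample']
```

An empty or missing value skips nothing, so the default remains "run everything".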

Essentially this still makes it possible to define custom profiles. For example, one can write a nanopore.config, specify which tools should be skipped, and even add extra arguments to some tools with withName:TOOL statements (as specified in the nf-core documentation: https://nf-co.re/docs/usage/configuration#customising-tool-arguments).

This doesn't mean we should completely abandon the original idea; we can always improve the current solution if we find it too limiting in the future.

Closes #32

PR checklist

This comment contains a description of changes (with reason).
If you've fixed a bug or added code that should be tested, add tests!
Make sure your code lints (nf-core pipelines lint).
Ensure the test suite passes (nf-test test).
Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
Usage Documentation in docs/usage.md is updated.
CHANGELOG.md is updated.

github-actions · 2024-11-22T13:20:36Z

`nf-core pipelines lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit 438c715

+| ✅ 191 tests passed       |+
#| ❔   1 tests were ignored |#
!| ❗  21 tests had warnings |!

❗ Test warnings:

readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
pipeline_todos - TODO string in main.nf: Remove this line if you don't need a FASTA file
pipeline_todos - TODO string in nextflow.config: Specify your pipeline's command line flags
pipeline_todos - TODO string in nextflow.config: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs
pipeline_todos - TODO string in README.md: TODO nf-core:
pipeline_todos - TODO string in README.md: Include a figure that guides the user through the major workflow steps. Many nf-core
pipeline_todos - TODO string in README.md: Fill in short bullet-pointed list of the default steps in the pipeline
pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
pipeline_todos - TODO string in README.md: Add bibliography of tools and data used in your pipeline
pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.
pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
pipeline_todos - TODO string in test.config: Specify the paths to your test data on nf-core/test-datasets
pipeline_todos - TODO string in test.config: Give any required params for the test so that command line flags are not needed
pipeline_todos - TODO string in base.config: Check the defaults for all processes
pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required

❔ Tests ignored:

files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-seqinspector_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-seqinspector_logo_light.png
files_exist - File found: docs/images/nf-core-seqinspector_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: conf/igenomes_ignored.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: modules.json
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: docs/images/nf-core-seqinspector_logo.png
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/NfcoreTemplate.groovy
files_exist - File not found check: lib/Utils.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: lib/WorkflowMain.groovy
files_exist - File not found check: lib/WorkflowSeqinspector.groovy
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: Singularity
files_exist - File not found check: lib/nfcore_external_java_deps.jar
files_exist - File not found check: .travis.yml
nextflow_config - Found nf-schema plugin
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: validation.help.enabled
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable found: validation.help.beforeText
nextflow_config - Config variable found: validation.help.afterText
nextflow_config - Config variable found: validation.help.command
nextflow_config - Config variable found: validation.summary.beforeText
nextflow_config - Config variable found: validation.summary.afterText
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config variable (correctly) not found: params.max_cpus
nextflow_config - Config variable (correctly) not found: params.max_memory
nextflow_config - Config variable (correctly) not found: params.max_time
nextflow_config - Config variable (correctly) not found: params.validationFailUnrecognisedParams
nextflow_config - Config variable (correctly) not found: params.validationLenientMode
nextflow_config - Config variable (correctly) not found: params.validationSchemaIgnoreParams
nextflow_config - Config variable (correctly) not found: params.validationShowHiddenParams
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: 1.0dev
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
nextflow_config - nextflow.config contains configuration profile test
nextflow_config - Config default value correct: params.sample_size= 0
nextflow_config - Config default value correct: params.igenomes_base= s3://ngi-igenomes/igenomes/
nextflow_config - Config default value correct: params.custom_config_version= master
nextflow_config - Config default value correct: params.custom_config_base= https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Config default value correct: params.publish_dir_mode= copy
nextflow_config - Config default value correct: params.max_multiqc_email_size= 25.MB
nextflow_config - Config default value correct: params.validate_params= true
nextflow_config - Config default value correct: params.pipelines_testdata_base_path= https://raw.githubusercontent.com/nf-core/test-datasets/
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-seqinspector_logo_light.png matches the template
files_unchanged - docs/images/nf-core-seqinspector_logo_light.png matches the template
files_unchanged - docs/images/nf-core-seqinspector_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 24.04.2, Config: 24.04.2
plugin_includes - No wrong validation plugin imports have been found
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (0 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: nf-test.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: release-announcements.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: template_version_comment.yml
actions_schema_validation - Workflow validation passed: download_pipeline.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: awstest.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - assets/multiqc_config.yml found and not ignored.
multiqc_config - assets/multiqc_config.yml contains report_section_order
multiqc_config - assets/multiqc_config.yml contains export_plots
multiqc_config - assets/multiqc_config.yml contains report_comment
multiqc_config - assets/multiqc_config.yml follows the ordering scheme of the minimally required plugins.
multiqc_config - assets/multiqc_config.yml contains a matching 'report_comment'.
multiqc_config - assets/multiqc_config.yml contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
base_config - conf/base.config found and not ignored.
modules_config - conf/modules.config found and not ignored.
modules_config - SEQTK_SAMPLE found in conf/modules.config and Nextflow scripts.
modules_config - FASTQC found in conf/modules.config and Nextflow scripts.
modules_config - SEQFU_STATS found in conf/modules.config and Nextflow scripts.
modules_config - MULTIQC_GLOBAL found in conf/modules.config and Nextflow scripts.
modules_config - MULTIQC_PER_TAG found in conf/modules.config and Nextflow scripts.
nfcore_yml - Repository type in .nf-core.yml is valid: pipeline
nfcore_yml - nf-core version in .nf-core.yml is set to the latest version: 3.0.2

Run details

nf-core/tools version 3.0.2
Run at 2024-11-22 13:22:57

MatthiasZepper · 2024-11-22T17:23:38Z

motivate why we need an alternative way to customize tools while this is already possible with nextflow configs.

So you propose to include extensive configs defining multiple pipeline specific profiles with this pipeline? Could you please include an example of this in the PR or here as comment, then? I still struggle to wrap my head around that.

Tool selection and customization is something many other pipelines do and I think it would be much better to try to stick to the nf-core standard.

I am not aware that there is a standard. In contrast, I was baffled by the individual differences.

The precedence between config files is well documented and this will make the user experience better for users who are already familiar with Nextflow and other nf-core pipelines.

It is? I am aware about the docs for process selectors and for various config files, but find these priorities very hard to comprehend.

Ultimately, such top-down hierarchies do not help us anyway. We would need to patch additional labels to the modules, and then somehow handle withLabel for processes with multiple labels. What is the precedence in case of -profile singularity,uppmax,nanopore,extensive vs. -profile uppmax,singularity,extensive,nanopore?

Aratz · 2024-11-25T16:14:47Z

motivate why we need an alternative way to customize tools while this is already possible with nextflow configs.

So you propose to include extensive configs defining multiple pipeline specific profiles with this pipeline? Could you please include an example of this in the PR or here as comment, then? I still struggle to wrap my head around that.

If by profile you mean tool profile (instead of nextflow profile) then yes more or less. For each use case, one would need to write either a params-file or a nextflow config file (in case some tools need some specific parameters). This would be left to the user though. I think for v1 it is acceptable to just run everything that doesn't need extra db arguments for instance. These files would look like this:

params.json:

{
    "skip_tools": "seqfu_stats"
}

or, eg. nanopore.config:

params {
    skip_tools = 'seqfu_stats'
}

process {
    withName: FASTQC {
        ext.args = "--nano"
    }
}

Tool selection and customization is something many other pipelines do and I think it would be much better to try to stick to the nf-core standard.

I am not aware that there is a standard. In contrast, I was baffled by the individual differences.

Well there is a standard in how configuration settings are defined in a nextflow config file and in how tool parameters are defined there too.

The precedence between config files is well documented and this will make the user experience better for users who are already familiar with Nextflow and other nf-core pipelines.

It is? I am aware about the docs for process selectors and for various config files, but find these priorities very hard to comprehend.

Ultimately, such top-down hierarchies do not help us anyway. We would need to patch additional labels to the modules, and then somehow handle withLabel for processes with multiple labels. What is the precedence in case of -profile singularity,uppmax,nanopore,extensive vs. -profile uppmax,singularity,extensive,nanopore?

Well, actually the configs would be provided with, for instance, -c extensive.config -c nanopore.config (again, I think we should choose another term than profile to avoid confusion with Nextflow profiles), and in this case, as the docs state, each config file would be applied in turn, that is: extensive.config would be applied first, and then nanopore.config, i.e. the value of skip_tools in nanopore.config would override the one in extensive.config.
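The override order described here can be modelled with a few lines of Python. This is a deliberately flat toy model (real Nextflow configs merge nested scopes), and the function name and the contents of extensive.config are assumptions made for illustration.

```python
def resolve_params(config_files, cli_params=None):
    """Merge params from config files in the order given, then apply CLI params."""
    resolved = {}
    for params in config_files:
        resolved.update(params)  # a later file overrides an earlier one
    resolved.update(cli_params or {})  # command-line values have the final word
    return resolved


# Hypothetical stand-ins for the params scope of each config file.
extensive = {"skip_tools": "fastqc"}
nanopore = {"skip_tools": "seqfu_stats"}

# Equivalent of: -c extensive.config -c nanopore.config
print(resolve_params([extensive, nanopore]))  # {'skip_tools': 'seqfu_stats'}
```

Note that the later value replaces the earlier one rather than being concatenated with it, so in this model fastqc would run again once nanopore.config is applied.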

But at this point I don't think there is much need for being able to combine different configuration files anyways. Maybe this will change in the future, but until then and as long as we only have a dozen tools or so I find it perfectly acceptable to stick to nextflow configs.

Again, this PR is quite basic, but it does fulfill all our requirements given the number of tools we currently plan to have in v1.0.

MatthiasZepper · 2024-11-27T18:22:53Z

If by profile you mean tool profile (instead of nextflow profile) then yes more or less. For each use case, one would need to write either a params-file or a nextflow config file (in case some tools need some specific parameters).

Well, that is exactly the complexity I wanted to avoid by unifying all settings in a YAML, which is also a lot easier to customize.

Particularly when using Nextflow profiles as tool profiles, we should then also stick with the standard API -profile and specify them accordingly in the main nextflow.config and not via -c. But if you wish, I have no objections against calling them differently - with an inspector theme in mind, dossier or case could work?

My main concern with the Nextflow config profiles is the Danger box here:

When using the profiles feature in your config file, do NOT set attributes in the same scope both inside and outside a profiles context.
In the above example, the process.cpus attribute is not correctly applied because the process scope is also used in the foo and bar profiles.

I fear this is easy to get wrong in combination with a poorly written institutional config. On the upside, it is indeed straightforward to check the resolved config with nextflow config -flat, so we could define nf-test tests that automatically check compatibility with all institutional configs in nf-core/configs.

This would be left to the user though.

On the contrary, that is one of the main selling points of this pipeline - the in-depth knowledge of a sequencing facility. My experience from rnaseq is that the peculiarities of tool arguments must be handled by us developers, because most people expect that a pipeline has reasonable defaults to just run.

Take for example, the extra_trimgalore_args parameter of that pipeline. While people frequently post their parameter files, I have not seen a single one that specified the --2colour argument to TrimGalore in extra_trimgalore_args. However, that is required for correct quality trimming when data from NovaSeq and NextSeq instruments is processed, which is arguably the default by now. I reckon very few people sequence on a MiSeq anymore for RNA-seq - maybe in the future on Aviti again.

Aratz · 2024-12-02T07:36:17Z

I'm not sure I understand why would anyone need to define tool profiles within Nextflow profiles? As you said, this would add unnecessary complexity, while it's perfectly fine and much safer to specify this parameter within a config file and pass it to the pipeline with -c.

And if you think we should provide some default configs for some selected applications, sure, this feature doesn't prevent that either.

MatthiasZepper · 2024-12-02T11:52:13Z

I'm not sure if I understand why would anyone need to define tool profiles within Nextflow profiles?

I do not know about anyone, but I know about us. We as pipeline developers need to ship the pipeline with a few default routes that can be used by changing a simple global parameter (my suggestion) or by switching the config profiles (your suggestion, as far as I understand).

Since you suggested to harness the default Nextflow configuration capabilities, I presume you want to use the ext.when directive to control if a particular tool is run, and the ext.args to adapt the behavior of the tool?

First of all, I am not sure whether both of these are future-proof, since I believe Seqera is considering dropping support for them in future Nextflow versions. But if we use them nonetheless, we would end up with something like this to implement our Seqinspector cases/dossiers, correct?

profiles {
    extensive {
        params {
            some_contamination_screening_reference_default = 's3://ngi-igenomes/seqinspector/default'
        }
        process {
            // Groovy booleans are lowercase: true / false
            withLabel:contamination {
                ext.when = { true }
            }
            withLabel:fastqfilechecks {
                ext.when = { true }
            }
            withLabel:nanopore {
                ext.when = { false }
            }
            withName: 'SOME_TOOL' {
                ext.args = { '-b -d' }
            }
        }
    }
    basic {
        process {
            withLabel:contamination {
                ext.when = { false }
            }
            withLabel:fastqfilechecks {
                ext.when = { true }
            }
            withLabel:nanopore {
                ext.when = { false }
            }
            withName: 'SOME_TOOL' {
                ext.args = { '-a' }
            }
        }
    }
    nanopore {
        process {
            withLabel:nanopore {
                ext.when = { true }
            }
            withName: 'SOME_NANOPOREQC_TOOL' {
                ext.args = { '--ont' }
            }
        }
    }
}

And that would ultimately enable a UX like:
nextflow run nf-core/seqinspector -profile "uppmax,singularity,extensive".

That is nice. But how do we now, e.g. run the Nanopore specific tools for a Nanopore QC?

Is there a difference between
nextflow run nf-core/seqinspector -profile "basic,nanopore"
nextflow run nf-core/seqinspector -profile "nanopore,basic"

How is the withLabel / ext.when priority resolved in that case? I do not know, because afaik Nextflow's config documentation does not clarify this, and it matters, because either the tools run or not.

Aratz · 2024-12-06T13:33:29Z

I believe there are some misunderstandings both about the content and the scope of this PR:

First about the content, this PR only adds a new parameter, skip_tools, to disable tools when running the pipeline. This is similar both in the way it is used and in the way it is implemented to what is found in e.g. nf-core/demultiplex or nf-core/sarek. As with all nextflow parameters, it can be set in different ways, such as a cli argument, a params file, or a config file. Specific config files can thus be set in order to process specific kinds of data by disabling some tools and customizing some others (see the example I provided above). Most importantly: it does not offer to combine two configs, is not intended to be used with Nextflow profiles, and does not make use of ext.when in any way.

Second, about the scope, this PR implements what I believe would be the minimal requirements for a first release when it comes to tool selection. It is not supposed to provide the perfect user experience, just to provide a basic way to select tools. In other words, it is only supposed to be the first step towards tool selection and to serve as a baseline to compare it to future solutions.

Because the original idea is slightly more complicated (in particular when it comes to the possibility to combine profiles and how to achieve this), I believe it needs more thought before it is implemented, which is what I tried to motivate in the PR's description. That being said, I think being able to define and combine tool profiles is a great feature to have and that we should absolutely do it. I just think it will take some time before we implement something viable and that we should not condition v1 on having this fully implemented.

MatthiasZepper · 2024-12-08T07:23:06Z

Having worked myself into the ground with a previous profile implementation attempt due to my inability and rather diffuse Nextflow requirements, I am very sympathetic towards breaking it down to more manageable tasks.

Had you framed your PR in this way, I would have had no objections and would have happily focused on code review only. But even after rereading this whole discussion multiple times, the misunderstanding on my part cannot be remedied. Perhaps I truly overestimate the scope of this PR, but why is almost your whole PR description a plea to drop the profile/case idea entirely, then? I read redundant to have, hard to motivate why, advocate etc. and my impression persists: Your code contributions are tied to a fundamental diversion from the plan that we have agreed on in the development meeting.

I admittedly cannot follow when you first claim that we can achieve all requirements purely with Nextflow configs, but later refuse to comment on my mock-up with anything more specific than "it is not intended to work like that". Well, but how else?

The mock-up was essentially my attempt to redirect this high-level discussion, that would probably be much more suitable for a developmental meeting, back to specific code questions. But you seem to be hell-bent on incidentally ascertaining in this PR discussion what the minimal requirements for a first release are, and that nothing more than this is needed in terms of tool selection? (edited: This summary is indeed sharper than the original statement and therefore falsely exaggerated.)

I cannot recall any discussion in the developmental meeting regarding zeroing in onto a release. Personally, I perceive this pipeline still ramshackle and not anywhere near the state of a worthy nf-core pipeline. There are so many more mature pipelines than this still in dev; I would not mind a 0.1 release, but a 1.0.0? I suggest bringing this on the agenda of a developmental meeting, rather than casually deciding this in a PR discussion without involving the others?

In either way, I have come to the conclusion that I can't ask you to implement a feature that you seemingly consider dispensable.

matrulda

Great work! This looks great! @Aratz has left some minor comments for you.

This implementation is definitely simpler than what we previously discussed. I agree that skip_tools is more understandable and easier to grasp than adding another layer of configuration.

I understand the desire for the pipeline to come with default routes that are ready to use. However, as @Aratz mentioned, we could add configurations that users can utilize if they so desire.

I see the challenge in trying to combine different "modes" (avoiding the word profile here), like lean and illumina. In my opinion, we don't necessarily need modes like lean/extensive; the most important thing is that the tools are compatible with the data. Starting with the assumption that one config will suffice, such as the nanopore.config mentioned earlier, makes total sense to me.
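
For illustration, a `nanopore.config` along these lines might look like the sketch below. The tool names in `skip_tools` are placeholders, the comma-separated string format is an assumption (the thread does not spell out the accepted format), and the `ext.args` value is only an example. `withName` and `ext.args` are the standard Nextflow/nf-core mechanisms linked in the PR description.

```groovy
// nanopore.config: hypothetical example, not part of this PR
params {
    // Placeholder tool names; assumed to be a comma-separated string
    skip_tools = 'tool_a,tool_b'
}

process {
    // Extra arguments for a tool that still runs
    withName: 'SEQTK_SAMPLE' {
        ext.args = '-s100' // example only: fixed random seed for seqtk sample
    }
}
```

Such a file would be passed with Nextflow's `-c nanopore.config` option, and a `--skip_tools` value given on the command line would override the one set in the file.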

I believe this implementation meets our needs, but of course, we can discuss it further at a development meeting before merging.

Additionally, I would really appreciate it if we could all maintain a friendly tone. :)

MatthiasZepper · 2024-12-09T16:58:41Z

Thanks, @matrulda, for chiming in. I agree that a fresh perspective is helpful to resolve this conflict.

> I believe this implementation meets our needs, but of course, we can discuss it further at a development meeting before merging.

From my side, I am fine with merging it prior to further discussions. My objections are not rooted in the code itself, but in the dismissal of the whole concept of modes / routes, which I believe should happen in a developmental meeting and not in a PR discussion.

Personally, I still think we need to have them, but am the first to admit that it makes things very complicated. I wasted dozens of hours on that, and ultimately failed.

> Additionally, I would really appreciate it if we could all maintain a friendly tone. :)

Do you have the impression that this was not the case?

As far as I am concerned, I do indeed feel strongly about this topic, given my unfruitful previous work on the subject and my desire for perfectionism, but I also carefully worded my replies to emphasize assumptions and to make the subjectivity clear. I also noted that Adrien did the same, so from my side, I never felt personally attacked, just confused and estranged by our very different perceptions.

But if I have misunderstood or exaggerated some statements, I would like to apologize. It was not my intention to discuss anything outside the question of how we deal with profiles/modes/routes, or to imply ad hominem traits.

matrulda · 2024-12-10T07:49:07Z

> Thanks, @matrulda, for chiming in. I agree that a fresh perspective is helpful to resolve this conflict.

Glad to hear that :)

> From my side, I am fine with merging it prior to further discussions. My objections are not rooted in the code itself, but in the dismissal of the whole concept of modes / routes, which I believe should happen in a developmental meeting and not in a PR discussion.

I understand that, but I think @Aratz was very clear that he did not dismiss that concept, only that it could be put aside for now.
Even though we might not need a meeting for this, it would be nice to get a Yay or Nay from some other people in the group. What say you @alneberg @FranBonath @kedhammar?

> Do you have the impression that this was not the case?

Yeah, I interpreted your answers as a bit hostile. I'm happy to hear that was not your intention. We all communicate in different ways and nuances don't always get through in text.

alneberg · 2024-12-10T08:46:44Z

I agree with @matrulda that this is good enough to have something. I was wondering if we would like to have the subsampling involved in the skip_tools parameter as well? Currently, the subsampling is regulated using the --sample_size parameter. Are we happy with using only that? I'm not sure what my opinion is, but maybe it's cleaner if sample_size is used more as a standard ext.args and whether to subsample or not is controlled using the skip_tools parameter?

kedhammar · 2024-12-10T10:44:59Z

On a scale from yay to nay, I'm also leaning more towards yay. Establishing a simple and intuitive baseline functionality, as a first step. Not disregarding any possible future implementations.

If we arrive at a point where we have a dozen platform-specific tools that are mutually exclusive, it might be helpful to add some example tool configs to the assets dir, or similar. I guess we'd have to create them anyway, to be used for functional testing of the different kinds of data in our test profile.
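
One way the example configs suggested here could double as test fixtures is sketched below. The directory `assets/tool_configs/` and the profile name `test_nanopore` are hypothetical; only the use of `includeConfig` inside a profile is standard Nextflow config syntax.

```groovy
// nextflow.config (excerpt): hypothetical layout, not part of this PR
profiles {
    // Hypothetical test profile that reuses a shipped example tool config
    test_nanopore {
        includeConfig 'assets/tool_configs/nanopore.config'
    }
}
```

With a layout like this, the same file serves users as a starting point and is exercised by the functional tests, so the two cannot drift apart unnoticed.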

I would like this branch to pull from dev and address the newly added tools, prior to a final review and merge.

Aratz · 2024-12-13T11:18:14Z

@matrulda @alneberg @kedhammar Thanks for the feedback, I'll first fix the template update, then rebase this PR and address your comments

@alneberg unfortunately, the SEQTK_SAMPLE module expects the samplesize to be given as a parameter (see https://github.com/nf-core/modules/blob/08108058ea36a63f141c25c4e75f9f872a5b2296/modules/nf-core/seqtk/sample/main.nf#L11), but I agree it would be clearer if it could only be skipped using skip_tools rather than setting the samplesize to 0. I'll fix that 👍

alneberg · 2024-12-13T12:01:11Z

> @alneberg unfortunately, the SEQTK_SAMPLE module expects the samplesize to be given as a parameter (see https://github.com/nf-core/modules/blob/08108058ea36a63f141c25c4e75f9f872a5b2296/modules/nf-core/seqtk/sample/main.nf#L11), but I agree it would be clearer if it could only be skipped using skip_tools rather than setting the samplesize to 0. I'll fix that 👍

Ah, I see. Yeah then we'll keep that as a parameter, and it's not really in the scope of this PR anyway. 👍

Aratz added 3 commits November 22, 2024 13:18

Add skip tools parameter

1732a27

Add tests for skip_tools

b905d15

Update docs

91b11c2

Aratz self-assigned this Nov 22, 2024

Update CHANGELOG

4a5beac

Fixed linting

438c715

matrulda self-requested a review December 9, 2024 07:29

matrulda reviewed Dec 9, 2024

github-actions bot commented Nov 22, 2024

`nf-core pipelines lint` overall result: Passed ✅ ⚠️
