Add sans-UMI option #128

Merged
merged 64 commits into from
Aug 27, 2021

Conversation

Contributor

@dladd commented Jul 22, 2021

Addresses issue #127

PR checklist

  • Add no-UMI mode, enabled by setting the new params.umi_length = 0. The default is -1 to remind users to set this value when running the pipeline
    • Added a sans-UMI subworkflow for presto processes
    • Added alternative "post-assembly" processes for presto filterseq and maskprimers
    • Added alt. log parser
  • Other misc. additions
    • Include light chains (if present) in ParseDB and shazam
    • Do not require input read files to end in _R(1/2)
    • Add a post-assembly and QC filtering (but pre-collapse) fastqc step and MultiQC summary
    • New parameters:
      • primer_revpr
  • Add test: data from Illumina MiSeq 2x250 BCR mRNA / Greiff 2014 data from the presto docs
    • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
    • If necessary, also make a PR on the nf-core/bcellmagic branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint .).
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

ggabernet and others added 22 commits January 14, 2020 12:02
…tered reads (but before collapsing duplicates)
…ultiple files to output_dir instead of only one when multiple sources exist
Member

@ggabernet left a comment

Hi @dladd thanks for this PR, it has some great additions to the pipeline!

Seeing the number of changes required to add the sans-UMI option, I've tried to make such additions a bit easier by adding a presto_UMI sub-workflow (which should also be easier for other people to follow in the code). Check out PR #130. Let me know whether that would make it easier to incorporate your changes with fewer if/else branches, by providing a presto_sans_UMI subworkflow (or something similar), and whether it would also help other people add further pRESTO pre-processing subworkflows.

Member

@ggabernet left a comment

Hi @dladd,

it's looking great, thanks for the sans-UMI addition! I just have a couple of comments, but otherwise looks good.

So that the nf-core CI tests also pass, you can add - assets/multiqc_config.yaml to the list of files ignored by the linting test, since you added a FastQC module description. It should go below this line:

https://github.com/nf-core/bcellmagic/blob/cc29af5aeb19fb1a58ed6d627ac7aa5145d713fe/.nf-core.yml#L2

@@ -46,7 +46,7 @@ if (params.protocol == "pcr_umi"){

 // Validate UMI position
 if (params.index_file & params.umi_position == 'R2') {exit 1, "Please do not set `--umi_position` option if index file with UMIs is provided."}
-if (params.umi_length == 0) {exit 1, "Please provide the UMI barcode length in the option `--umi_length`."}
+if (params.umi_length == null) {exit 1, "Please provide the UMI barcode length in the option `--umi_length`. To run without UMIs, set umi_length to 0."}
Member

Suggested change
-if (params.umi_length == null) {exit 1, "Please provide the UMI barcode length in the option `--umi_length`. To run without UMIs, set umi_length to 0."}
+if (!params.umi_length) {exit 1, "Please provide the UMI barcode length in the option `--umi_length`. To run without UMIs, set umi_length to 0."}

Contributor Author

I've swapped this back to check for params.umi_length == null. In Groovy, !0 evaluates to true, so running a no-UMI pipeline (umi_length = 0) would otherwise always trigger the exit condition.

Contributor Author

Hmm it doesn't look like the linting likes us defining an integer param with null as the default in the schema. I've set default umi_length = -1 as a workaround but let me know if you have a different preferred solution.

Member

OK, weird. I tested it in the Nextflow console (nextflow console) and a null value was interpreted as false:

def foo = null
def msg = "hello"
if (!foo) {
  print msg
}

But yes you are right, the schema does not like an integer defined by default as null. The proposed -1 solution sounds good to me!

Contributor Author

The issue was that in no-UMI mode we set umi_length = 0 (foo = 0 in your example), which is interpreted the same as foo = null and triggers the exit. Cool, will stick with the -1 default workaround :-)
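
For readers following this thread, the pitfall can be sketched in Python, whose truthiness rules for 0 and None match the Groovy behavior discussed here. This is only an illustration: the function names, the "exit"/"run" labels, and the `umi_length < 0` test are assumptions made for the example, not the pipeline's actual validation code.

```python
from typing import Optional

UNSET = -1  # sentinel default, mirroring the -1 workaround agreed on above


def check_truthy(umi_length: Optional[int]) -> str:
    """Analogue of `if (!params.umi_length) exit 1`: rejects None AND 0."""
    if not umi_length:
        return "exit"
    return "run"


def check_sentinel(umi_length: int = UNSET) -> str:
    """Analogue of a -1 default: only an unset value aborts; 0 means no UMIs."""
    if umi_length < 0:  # hypothetical check, assumed for this sketch
        return "exit"
    return "sans_umi" if umi_length == 0 else "umi"


if __name__ == "__main__":
    for value in (None, 0, 12):
        print("truthy check:", value, "->", check_truthy(value))
    for value in (UNSET, 0, 12):
        print("sentinel check:", value, "->", check_sentinel(value))
```

With the truthiness check, a deliberate `umi_length = 0` is indistinguishable from a missing value, which is why the sentinel default keeps the two cases apart.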

'presto_parseheaders_collapse_sans_umi' {
publish_dir = false
subcommand = 'collapse'
args = '-f CONSCOUNT --act min'
Member

I'm not sure about this here. As far as I understand it, the CONSCOUNT value is the count of sequences sharing the same UMI barcode that were used to build a consensus. So without UMIs you should actually not have this field.

Contributor Author

Good catch! This was a vestigial step that snuck under my radar. Removed in d8dc9d0
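
As a toy illustration of the point raised in this thread, the sketch below counts reads per UMI barcode; each count plays the role that CONSCOUNT is described as having above. It is not pRESTO's implementation, and the function name and the example reads are made up for the illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def conscount_by_umi(reads: List[Tuple[str, str]]) -> Dict[str, int]:
    """Count reads per UMI barcode; each count stands in for CONSCOUNT."""
    return dict(Counter(umi for umi, _seq in reads))


if __name__ == "__main__":
    # Hypothetical (umi, sequence) pairs, invented for this example.
    reads = [("AACG", "TTGACC"), ("AACG", "TTGACC"), ("GGTA", "CCATGA")]
    print(conscount_by_umi(reads))
```

Without a UMI there is no barcode to group on, so no such per-group count exists, which is consistent with removing the `-f CONSCOUNT` argument from the sans-UMI step.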

Member

@ggabernet left a comment

This should really fix the lint tests now :)

Member

@ggabernet left a comment

I didn't see those two extra lines 🙄 , these should be removed as well 👍

github-actions bot commented Aug 13, 2021

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 72893e5

  • ✅ 135 tests passed
  • ❔ 1 test was ignored
  • ❗ 2 tests had warnings

❗ Test warnings:

  • files_exist - File not found: conf/igenomes.config
  • readme - README did not have a Nextflow minimum version mentioned in Quick Start section.

❔ Tests ignored:

  • files_unchanged - File ignored due to lint config: assets/multiqc_config.yaml

✅ Tests passed:

Run details

  • nf-core/tools version 2.1
  • Run at 2021-08-27 03:15:42

Member

@ggabernet left a comment

The PR looks good to me, thanks a lot for your contribution!

@ggabernet merged commit 63fecb2 into nf-core:dev on Aug 27, 2021
@ggabernet mentioned this pull request on Mar 31, 2022