Is there a way to specify a common reference genome and run the analysis without protospacer #43

Closed
npatel-ah opened this issue Jun 15, 2023 · 5 comments
Labels
enhancement Improvement for existing functionality

Comments

@npatel-ah

Description of feature

Hello,

This is a great pipeline. I have a feature request that I believe is simple. I have multiple samples with the same reference genome. Currently, providing the reference genome within samplesheet.csv creates a separate reference.fasta for each sample, with the sample name as the contig ID. When Minimap2 then maps the FASTQ files to these references, all my BAM files have different contigs, which makes them a bit difficult to compare.

Also, I see that a large portion of the pipeline can be run without a "protospacer". Would it be possible to make it optional?

Best,

@npatel-ah npatel-ah added the enhancement Improvement for existing functionality label Jun 15, 2023
@mirpedrol
Member

Hi @npatel-ah,
Thank you for using this pipeline and providing your suggestions. This is a valuable enhancement, and we will definitely work on incorporating it. :)

@mirpedrol
Member

I have added a new parameter, --reference, which accepts an input FASTA file to use as the same reference for all samples. If this parameter is given, the reference field in the sample sheet is not required.
The protospacer is still required, because we use it to orient the reference and to parse the SNPs and indels produced by the cut. However, I also added a new parameter, --protospacer, which works in the same way as --reference.
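
The fallback described above (a pipeline-wide value takes precedence, and the sample sheet field is only needed when it is absent) can be illustrated with a minimal Python sketch. This is not the pipeline's actual Nextflow code; the function name `resolve_setting`, the row layout, and the column names `sample`, `reference`, and `protospacer` are assumptions made for illustration only.

```python
from typing import Optional


def resolve_setting(name: str, global_value: Optional[str], row: dict) -> str:
    """Return the pipeline-wide value if given, else the sample sheet field.

    Hypothetical helper: mirrors the behaviour described for --reference
    and --protospacer, not the pipeline's real implementation.
    """
    if global_value:
        # A value passed on the command line applies to every sample.
        return global_value
    value = (row.get(name) or "").strip()
    if not value:
        # Neither source provided the setting, so fail early with context.
        raise ValueError(
            f"sample {row.get('sample', '?')!r}: no {name!r} in the sample "
            f"sheet and no --{name} given"
        )
    return value


# Hypothetical sample sheet rows (column names are assumed).
rows = [
    {"sample": "s1", "reference": "", "protospacer": "ACGTACGTACGTACGTACGT"},
    {"sample": "s2", "reference": "", "protospacer": ""},
]

# With a shared reference, the empty per-sample reference fields are fine.
shared = [resolve_setting("reference", "common.fasta", r) for r in rows]
```

In this sketch, sample `s2` would still raise an error for `protospacer` unless a pipeline-wide protospacer is supplied, which matches the statement that the protospacer remains required from one source or the other.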

@npatel-ah
Author

Thank you so much for accommodating the change.

@benemead

benemead commented Feb 2, 2024

Hi there, I appreciate these changes as this is my exact use case, but I'm getting errors when I specify both options in v2.1.1.

I have tried multiple combinations of specifying the protospacer and reference in the samplesheet and in params (running via Nextflow Tower on AWS Batch). The only combination that leads to successful pipeline completion is specifying both only in the samplesheet.

When I use only --protospacer I encounter an error on CRISPRSEQ_PLOTTER where the protospacer is passed as [].

When I use --reference as a path to a .fasta I encounter an error on ORIENT_REFERENCE:

Caused by:
  No such variable: id -- Check script '.nextflow/assets/nf-core/crisprseq/./workflows/../modules/local/orient_reference.nf' at line: 2

Appreciate any help you may have!

@mirpedrol
Member

Hi @benemead, thanks for reporting this! I will open a new issue so we don't lose track of it, and I will have a look at the bug.
