Feature Request: How hard would it be to have ngs_mapper accept fasta files #110

Closed
mmelendrez opened this issue Mar 12, 2015 · 2 comments

Comments

@mmelendrez
Member

So in my continuing efforts to make your life difficult - I'd like to take the contigs obtained from de novo assembly (the pathogen discovery pipeline, or really any de novo assembler) and map them (and unassembled reads of interest) to a reference.

Problem - they are in fasta format and I don't think ngs_mapper can do plain fasta mapping?

I can toss all reads into a directory for ngs_mapper but it won't accept fasta format for mapping.

I remember we chatted about this when I was attempting to map Rickettsia to the draft 88 contigs from Genbank.

I think this would be a useful feature - scientists often want to take the de novo generated contigs and map back to references to see how they are doing on genomic coverage of their organism of interest.

For a better test case than Rickettsia, which is bacterial and messy, you can use the file generated during pathogen discovery of Chikungunya - the sample was almost 100% Chik. See project management issue number 9618 (https://vdbpm.org/issues/9618). For this one, since the majority of the sample was Chik, I was able to just map the raw reads in the fastq directly to the fasta Chik genome (in the issue). The pathogen discovery pipeline also generated contigs and whatnot in fasta format (which I can't use in ngs_mapper), and you can use them if you decide to add this feature. I would suggest rerunning pathogen discovery on this sample to get the contigs though, because this was before we found the issue with iterative phylo where half the results were missing.

Files located at: /media/VD_Research/Analysis/ProjectBased_Analysis/melanie/share/Issue_9618

I did chmod 777 on the directory so you should have access to everything in there.

@necrolyte2 necrolyte2 added the bug label Mar 17, 2015
@necrolyte2 necrolyte2 added this to the v1.2.0 milestone Mar 17, 2015
@necrolyte2
Member

It may be easier at this time to just run them manually:

  1. Setup
    Set these variables up (you need to specify absolute paths to the reference and input fasta files)
inputfastapath=/path/to/contig.fasta
refpath=/path/to/reference.fasta
outputdir=outputdir
  2. Run
    Copy-paste the rest
mkdir -p ${outputdir}/input
ln -s $inputfastapath $outputdir/input/
inputfastapath=${outputdir}/input/$(basename $inputfastapath)
cp $refpath $outputdir
refpath=${outputdir}/$(basename $refpath)
bwa index $refpath
bwa mem $refpath ${inputfastapath} | samtools view -Su - | samtools sort -o - - > ${outputdir}/out.bam
samtools index ${outputdir}/out.bam
base_caller ${outputdir}/out.bam ${refpath} ${outputdir}/out.bam.vcf
graphsample ${outputdir}/out.bam -od ${outputdir}
vcf_consensus ${outputdir}/out.bam.vcf -i contig -o ${outputdir}/out.bam.consensus.fasta
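
The steps above rely on absolute paths, because a relative path would leave the symlink in input/ dangling. A small pre-flight check can catch that before bwa runs. This is only a sketch: check_fasta is a hypothetical helper, not an ngs_mapper command, and it only looks at the first non-blank line to decide whether a file is FASTA.

```shell
# Hypothetical pre-flight helper (not part of ngs_mapper).
# Succeeds only if the path is absolute, readable, and begins with a FASTA header.
check_fasta() {
    case "$1" in
        /*) ;;                                   # absolute path, carry on
        *)  echo "not an absolute path: $1" >&2; return 1 ;;
    esac
    [ -r "$1" ] || { echo "cannot read: $1" >&2; return 1; }
    # The first non-blank line of a FASTA file starts with '>'
    first=$(awk 'NF { print substr($0, 1, 1); exit }' "$1")
    [ "$first" = ">" ] || { echo "does not look like FASTA: $1" >&2; return 1; }
    # Report how many sequences the file holds
    echo "$(grep -c '^>' "$1") sequence(s) in $1"
}
```

One way to use it is to run `check_fasta "$inputfastapath" && check_fasta "$refpath"` right after setting the variables and only continue if both succeed.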

@necrolyte2 necrolyte2 removed this from the v1.2.0 milestone Mar 17, 2015
@necrolyte2 necrolyte2 modified the milestone: 1.3.1 Jun 12, 2015
@averagehat
Contributor

How this will affect the stages:

  • 1. ngs_filter works as normal (but must be changed to accept FASTA)
  • 2. Skip trim_reads
  • 3. bwa should work with FASTA -- but output BAM file will have '*' instead of base quality (BQ)
  • 4. tagreads should work as-is
  • 5. base_caller: uses MPileupColumn, which expects BQ; mark_lq may also be adjusted to look only at depth and not quality.
  • 6. graphsample: graph_qualdepth needs to change
  • 7. fqstats will need to be altered
  • 8. vcf_consensus will work as is
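
Several of the stages above hinge on the missing base qualities, so it may help to confirm how many records in a BAM actually carry '*' in the quality column. The sketch below is an assumption-laden illustration, not ngs_mapper code: count_missing_qual is a hypothetical helper that reads SAM text on stdin and checks column 11 (QUAL).

```shell
# Hypothetical helper (not part of ngs_mapper).
# Reads SAM text on stdin; prints "<records without qualities> <total records>".
count_missing_qual() {
    awk -F '\t' '
        /^@/ { next }                 # skip SAM header lines
        { total++ }                   # every other line is an alignment record
        $11 == "*" { missing++ }      # column 11 is QUAL; "*" means not stored
        END { printf "%d %d\n", missing, total }
    '
}
```

For example, `samtools view out.bam | count_missing_qual` should print two equal numbers when every record came from FASTA input.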

3 participants