It would be great to have an (optional) module in the pipeline that generates coverage data for species of interest (e.g., all human-infecting pathogens with more than X hits).
Ideally, the resulting coverage data would let you plot individual read alignments and look at coverage depth as a function of position along a reference genome.
The goal I have in mind for a module like this is to build confidence in species assignments by checking that read coverage is approximately uniform across the genome.
I'm unsure of the best implementation, but I've sketched out a rough idea below. Others (CC @mikemc, @jeffkaufman) definitely have more context and probably have better ideas!
Input
Reference genome(s) to align to
Output
A dataframe with the columns read ID, reference genome ID, and the start and end positions of the read along the genome.
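To make the idea concrete, here is a minimal sketch of how an output table like this could be turned into per-position coverage depth plus a rough uniformity summary. Everything in it is an assumption for illustration: the example rows, the `ref_lengths` lookup, the function names, and the coordinate convention (0-based start, exclusive end). It uses plain Python tuples in place of a dataframe, and the breadth and coefficient-of-variation metrics are just one possible way to check for uniform coverage, not something the pipeline provides.

```python
from statistics import mean, pstdev

# Hypothetical rows mirroring the proposed columns:
# (read_id, ref_id, start, end), assuming 0-based start and exclusive end.
alignments = [
    ("read1", "refA", 0, 5),
    ("read2", "refA", 3, 8),
    ("read3", "refA", 6, 10),
    ("read4", "refB", 2, 4),
]

# Hypothetical reference genome lengths, needed to report zero-depth positions.
ref_lengths = {"refA": 10, "refB": 6}


def coverage_depth(alignments, ref_lengths):
    """Return {ref_id: [depth at each position]} using a difference array."""
    diffs = {ref: [0] * (length + 1) for ref, length in ref_lengths.items()}
    for _read_id, ref_id, start, end in alignments:
        diffs[ref_id][start] += 1  # read begins covering at start
        diffs[ref_id][end] -= 1    # read stops covering at end (exclusive)
    depth = {}
    for ref_id, diff in diffs.items():
        running, per_pos = 0, []
        for delta in diff[:-1]:    # last slot only absorbs end == length
            running += delta
            per_pos.append(running)
        depth[ref_id] = per_pos
    return depth


def uniformity_summary(per_pos):
    """Mean depth, breadth (fraction of positions covered), and depth CV."""
    avg = mean(per_pos)
    breadth = sum(d > 0 for d in per_pos) / len(per_pos)
    cv = pstdev(per_pos) / avg if avg > 0 else float("inf")
    return {"mean_depth": avg, "breadth": breadth, "cv": cv}


depth = coverage_depth(alignments, ref_lengths)
summary = {ref: uniformity_summary(d) for ref, d in depth.items()}
```

A reference with high breadth and a low coefficient of variation would be consistent with roughly uniform coverage, while reads piling up in one small region would show low breadth. Any thresholds would need to be chosen empirically.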
I don't currently see a great way of implementing this within the constraints of the pipeline's current functionality, which makes me think it's a better fit for post-pipeline downstream analysis. But I'm open to changing that view if someone can suggest a concrete implementation that fits well within the current pipeline.
Having sat on this for a while, I'm now more confident that this won't be implemented in the pipeline as currently conceived anytime soon. However, it might be a better fit for whichever downstream pipeline ends up hosting things like duplication analysis and clade counts after we've implemented #122.