Feature request - module for generating coverage data #68

lennijusten · 2024-10-09T17:14:54Z

It would be great to have an (optional) module in the pipeline that generates coverage data for species of interest (e.g., all human-infecting pathogens with more than X hits).

Ideally, the resulting coverage data would let you plot individual read alignments and look at coverage depth as a function of position along a reference genome.

The goal I have in mind for a module like this is to build confidence in species assignments by checking that read coverage is approximately uniform across the genome.

I'm unsure of the best implementation, but I've sketched out a rough idea below. Others (CC @mikemc, @jeffkaufman) definitely have more context and probably have better ideas!

Input

Reference genome(s) to align to

Output

A dataframe with one row per aligned read and four columns: read ID, reference genome ID, start position of the read along the genome, and end position of the read along the genome.
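To illustrate how a table like this could be turned into coverage depth per position, here is a minimal sketch in plain Python. It is not part of the pipeline: the function names (`coverage_depth`, `breadth`), the tuple layout, and the coordinate convention (0-based start, exclusive end) are all assumptions made for the example, and the genome lengths are assumed to be known separately.

```python
def coverage_depth(alignments, genome_lengths):
    """Compute per-position read depth for each reference genome.

    alignments: iterable of (read_id, genome_id, start, end) tuples.
        Assumed convention: 0-based start, exclusive end.
    genome_lengths: dict mapping genome_id -> genome length in bases.
    Returns a dict mapping genome_id -> list of depths, one per position.
    """
    # One difference array per genome: +1 where a read starts, -1 where it ends.
    diffs = {g: [0] * (n + 1) for g, n in genome_lengths.items()}
    for _read_id, genome_id, start, end in alignments:
        # Clip the read to the genome bounds and skip empty intervals.
        start = max(start, 0)
        end = min(end, genome_lengths[genome_id])
        if start >= end:
            continue
        diffs[genome_id][start] += 1
        diffs[genome_id][end] -= 1

    # A running sum over each difference array gives the depth at each position.
    depth = {}
    for genome_id, diff in diffs.items():
        running = 0
        row = []
        for delta in diff[:-1]:
            running += delta
            row.append(running)
        depth[genome_id] = row
    return depth


def breadth(depth_row):
    """Fraction of positions covered by at least one read."""
    if not depth_row:
        return 0.0
    return sum(1 for d in depth_row if d > 0) / len(depth_row)


# Hypothetical example: two overlapping reads on an 8-base genome.
example_alignments = [("r1", "g1", 0, 4), ("r2", "g1", 2, 6)]
example_depth = coverage_depth(example_alignments, {"g1": 8})
```

The depth list can be plotted directly against position, and a low `breadth` despite many hits would be one hint that reads are piling up in a small region rather than covering the genome roughly uniformly.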

willbradshaw · 2024-10-21T13:40:24Z

I don't currently see a great way of implementing this within the constraints of the pipeline's current functionality, which makes me think it's a better fit for post-pipeline downstream analysis. But I'm open to changing that view if someone can suggest a concrete implementation that fits well within the current pipeline.

willbradshaw · 2024-12-17T19:47:08Z

Having sat on this for a while, I'm now more confident that this won't be implemented in the pipeline as currently conceived anytime soon. However, it might be a better fit for whichever downstream pipeline ends up housing things like duplication analysis and clade counts after we've implemented #122.

willbradshaw added the enhancement (New feature or request) and priority_3 labels on Oct 21, 2024
