Optionally mask the start/end of sequences #92

Open
matthuska opened this issue Aug 30, 2023 · 0 comments

matthuska commented Aug 30, 2023

In GitLab by @hoelzer on Jun 21, 2021, 09:51

It would be great if CovSonar could mask the start/end of each sequence (similar to what Nextstrain does, ...), e.g. with options such as:

  • --mask_start 200
  • --mask_end 200

The ends of a sequence are often fuzzy and can lead to false-positive substitution/indel calls. If such sites are included in the profiles, subsequent tasks such as clustering (breakfast, ...) might fail.
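To make the request concrete, here is a rough sketch of how masked ends could be applied at the profile level. It is not CovSonar code: the function name, the `(position, ref, alt)` tuple layout, and the 1-based coordinates are all assumptions for illustration. Only the two option names and the value 200 come from the proposal above.

```python
from typing import List, Tuple

# Hypothetical variant representation: (1-based position, ref, alt).
Variant = Tuple[int, str, str]


def drop_masked_variants(
    variants: List[Variant],
    genome_length: int,
    mask_start: int = 200,  # mirrors the proposed --mask_start
    mask_end: int = 200,    # mirrors the proposed --mask_end
) -> List[Variant]:
    """Keep only variants that lie outside the masked start/end regions."""
    first_kept = mask_start + 1             # first unmasked position
    last_kept = genome_length - mask_end    # last unmasked position
    return [v for v in variants if first_kept <= v[0] <= last_kept]


# Example with a genome length of 29903 (assumed here for illustration).
calls = [(150, "C", "T"), (241, "C", "T"), (23403, "A", "G"), (29800, "G", "T")]
kept = drop_masked_variants(calls, genome_length=29903)
print(kept)
```

With the defaults, the calls at positions 150 and 29800 fall inside the masked ends and are dropped, so they would never reach the profile used for clustering.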

EDIT: here is how Nextstrain does it:

https://github.com/nextstrain/ncov/blob/master/defaults/parameters.yaml

# Mask settings determine how the multiple sequence alignment is masked prior to phylogenetic inference.
mask:
  # Number of bases to mask from the beginning and end of the alignment. These regions of the genome
  # are difficult to sequence accurately.
  mask_from_beginning: 100
  mask_from_end: 50
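
For comparison, masking at the sequence level could look roughly like the sketch below. It is only an illustration and not Nextstrain's or CovSonar's implementation: the function name `mask_ends` and the choice of `N` as the mask character are assumptions. The parameter names and default values are borrowed from the YAML above.

```python
def mask_ends(
    seq: str,
    mask_from_beginning: int = 100,
    mask_from_end: int = 50,
    mask_char: str = "N",  # assumed mask character
) -> str:
    """Replace the first and last bases of a sequence with a mask character."""
    n = len(seq)
    # Clamp both values so short sequences are fully masked rather than corrupted.
    start = min(max(mask_from_beginning, 0), n)
    end = min(max(mask_from_end, 0), n - start)
    return mask_char * start + seq[start:n - end] + mask_char * end


# Small example with tiny mask sizes so the effect is visible.
print(mask_ends("ACGTACGTAC", mask_from_beginning=3, mask_from_end=2))
```

The sequence length is preserved, so coordinates of downstream variant calls stay valid. Sites inside the masked ends simply become ambiguous.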