An R package for working with genetic information encoded in variant string format.
More specifically, we can define two types of strings that we are interested in:
- variant strings contain position information (gene and codon) along with the corresponding amino acids. Optionally, they may also include read counts for each amino acid.
- position strings are a stripped-down version of a variant string, containing just the gene and codon position information.
In brief, this package contains functions that:
- Check for correctly formatted variant and position strings.
- Extract a position string from a variant string.
- Subset a variant string based on a position string.
- Compare two variant strings to look for a match (useful in the numerator of a prevalence calculation). Reports whether the match is exact or ambiguous.
- Compare a position string against a variant string to look for a match (useful in the denominator of a prevalence calculation).
- Convert between string format and a long-form data.frame format.
There are also a few more utility functions not listed here - see the package help for a complete list of functions.
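
To make the relationship between the two string types concrete, here is a minimal base R sketch. It does not use the package's functions and should not be read as its implementation. It assumes an illustrative layout of `gene:codons:amino_acids`, with `_` between codons (and between their amino acids) and `;` between genes. The actual format may differ, so check the package help. The helper names `sketch_position_from_variant()` and `sketch_variant_to_long()` are hypothetical.

```r
# Hypothetical helper (not part of the package): drop the amino-acid field
# from each gene block of an ASSUMED "gene:codons:amino_acids" layout.
sketch_position_from_variant <- function(variant_string) {
  genes <- strsplit(variant_string, ";", fixed = TRUE)[[1]]
  positions <- vapply(genes, function(g) {
    fields <- strsplit(g, ":", fixed = TRUE)[[1]]
    # Keep only the gene name and the codon positions
    paste(fields[1:2], collapse = ":")
  }, character(1), USE.NAMES = FALSE)
  paste(positions, collapse = ";")
}

# Hypothetical helper: expand the same assumed layout into a long-form
# data.frame with one row per gene/codon pair.
sketch_variant_to_long <- function(variant_string) {
  genes <- strsplit(variant_string, ";", fixed = TRUE)[[1]]
  rows <- lapply(genes, function(g) {
    fields <- strsplit(g, ":", fixed = TRUE)[[1]]
    codons <- as.integer(strsplit(fields[2], "_", fixed = TRUE)[[1]])
    aa <- strsplit(fields[3], "_", fixed = TRUE)[[1]]
    data.frame(gene = fields[1], codon = codons, aa = aa,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Made-up example string in the assumed layout
example_variant <- "geneA:72_73:C_V;geneB:437:G"

sketch_position_from_variant(example_variant)
# "geneA:72_73;geneB:437"

sketch_variant_to_long(example_variant)
```

In this sketch the position string is simply the variant string with its amino-acid field removed, which is why a position string can be used to subset or match against a variant string.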
The current version is 1.7.0, released 16 Jan 2025.