From cae06de441b3de4eb32d334eb20cb007751c5dde Mon Sep 17 00:00:00 2001 From: Manuel Holtgrewe Date: Tue, 7 Nov 2023 09:18:48 +0100 Subject: [PATCH] docs: document database and rule sets for ACMG prediction (#190) (#193) --- docs/acmg_seqvars_criteria.rst | 1128 +++++++++++++++++ ...g_seqvars.rst => acmg_seqvars_details.rst} | 252 ++-- docs/index.rst | 3 +- 3 files changed, 1211 insertions(+), 172 deletions(-) create mode 100644 docs/acmg_seqvars_criteria.rst rename docs/{acmg_seqvars.rst => acmg_seqvars_details.rst} (85%) diff --git a/docs/acmg_seqvars_criteria.rst b/docs/acmg_seqvars_criteria.rst new file mode 100644 index 00000000..870a1090 --- /dev/null +++ b/docs/acmg_seqvars_criteria.rst @@ -0,0 +1,1128 @@ +.. _acmg_seqvars_criteria: + +============================== +ACMG Sequence Variant Criteria +============================== + +This section describes the criteria as implemented for the automated classification of ACMG sequence variants. + +A large focus is given towards the transparency to the user. +The user is always given a report of the evidence for and against a given criterion. +Even if not further described, the reason for skipping a criterion is always reported (e.g., BA1 disables all other benign criteria, or if a criterion is only valid for missense variants). + +.. _acmg_seqvars_criteria-inheritance: + +------------------- +Mode of Inheritance +------------------- + +The mode of inheritance for a is derived from the following list of sources. +The sources are iterated in the order given below and the first one with a match is used for deriving mode of inheritance. + +1. **NHGRI CGD** + The `National Human Genome Research Institute Clinical Genomic Database `__ +2. **ClinGen Disease Validity** `Website `__ +3. **EBI gene2phenotype** `Website `__ +4. **ClinGen GenCC** `Website `__ +5. **Genomics England PanelApp** `Web-App `__ +6. **Domino** is a machine learning method for prediction of mode of inheritance and described in `PMID:28985496 `__. + We use the thresholds from `PMID:30376034 `__. + + - dominant: if score >= 0.5934 + - recessive: if score <= 0.3422 + - dominant/recessive: if score is between 0.3422 and 0.5934 + +.. _acmg_seqvars_criteria-inheritance-literature: + +Literature +========== + + +.. _acmg_seqvars_criteria-frequency: + +---------------- +Allele Frequency +---------------- + +The databases gnomAD exomes and genomes will be used to derived allele frequencies. +We trust the gnomAD quality control pipeline and filter solely above the gnomAD variant annotation. +Frequencies are considered valid if: + +- allele number is >=2000 +- gnomAD quality filter is PASS + +The criteria BA1 and BS1 will ignore the following "bottlenecked" populations: +Amish (ami), Ashkenazi Jewish (asj), European Finnish (fin), Middle Eastern (mid), and "Remaining Individuals (rmi) groups". + +.. _acmg_seqvars_criteria-calibration: + +----------- +Calibration +----------- + +We currently do not have our own calibration yet. +This is a big TODO once we have all the data. + +-------------------- +Transcript Selection +-------------------- + +We use the MANE Plus Clinical set of transcripts for the genes that have such information. +Otherwise, we use the longest transcript as the MANE transcript and consider all alternate RefSeq transcripts. + +MANE annotation is not available for GRCh37 and will not be provided by ENSEMBL/NCBI. +We thus map the MANE (and MANE Plus Clinical) transcripts from GRCh38 by using the latest version of the transcript with the same identifier. +If mapping either MANE or any of the MANE Plus Clinical transcripts fails then we fall back to the longest transcript rule. + +.. _acmg_seqvars_criteria-patho-predictions: + +------------------------- +Pathogenicity Predictions +------------------------- + +We currently use the thresholds from Pejaver et al. (2022) only. + +Once we have our own calibration, we can extend our predictions to novel tools such as AlphaMissense. + +.. _acmg_seqvars_mods: + +------------------------------ +Code Modification Nomenclature +------------------------------ + +In accordance with ClinGen Sequence Variant Interpretation Work Group (2017), modification codes are used. +That is, for a criterion ``${CRIT}``, the modification codes ``${CRIT}_Supporting``, ``${CRIT}_Moderate``, ``${CRIT}_Strong``, ``${CRIT}_VeryStrong``. + +.. _acmg_seqvars_criteria-rules: + +-------------------- +Rules / Point System +-------------------- + +We consider three rule systems: + +- The original ACMG 2015 rules +- The ACGS 2020 rules +- The 2020 Point system described by Tavtigian et al. (2020) + +ACMG 2015 Rules +=============== + +The following rules have been defined in Richards et al. (2015). + +Pathogenic +---------- + +If one of the following criteria 1-3 are fulfilled. + +1. 1 very strong (PVS1) AND one of the following + a. >=1 strong (PS1-PS4) + b. >=2 moderate (PM1-PM6) + c. >=1 moderate (PM1-PM6) AND >=1 supporting (PP1-PP5) + d. >=2 supporting (PP1-PP5) +2. >=2 strong (PS1-PS4) +3. 1 strong (PS1-PS4) AND + a. >=3 moderate (PM1-PM6) + b. 2 moderate (PM1-PM6) AND >=2 supporting (PP1-PP5) + c. 1 moderate (PM1-PM6) AND >=4 supporting (PP1-PP5) + +Likely Pathogenic +----------------- + +If one of the following criteria 1-7 are fulfilled. + +1. 1 very strong (PVS1) AND 1 moderate (PM1-PM6) +2. 1 strong (Ps1-PS4) AND 1-2 moderate (PM1-PM6) +3. 1 strong (PS1-PS4) AND >=2 supporting (PP1-PP5) +4. >=3 moderate (PM1-PM6) +5. 2 moderate (PM1-PM6) AND >=2 supporting (PP1-PP5) +6. 1 moderate (PM1-PM6) AND >=4 supporting (PP1-PP5) + +Benign +------ + +If one of the following criteria 1-2 are fulfilled. + +1. 1 standalone (BA1) +2. >=2 strong (BS1-BS4) + +Likely Benign +------------- + +If one of the following criteria 1-2 are fulfilled. + +1. 1 strong (BS1-BS4) AND 1 supporting (BP1-BP7) +2. >=2 supporting (BP1-BP7) + +Uncertain Significance +---------------------- + +If if one of the following criteria 1-2 are fulfilled. + +1. Other criteria shown above are not met +2. the criteria for benign and pathogenic are contradictory + +ACGS 2020 Rules +=============== + +The following is a refinement of the rules above set by the Ellard et al. (2020). + +Pathogenic +---------- + +1. 1 very strong (PVS) AND one of the following + a. >=1 strong + b. >=1 moderate + c. >=2 supporting +2. >=3 strong +3. 2 strong AND one of the following + a. >=1 moderate + b. >=2 supporting +4. 1 strong AND one of the following + a. >=3 moderate + b. >=2 moderate AND >=2 supporting + c. >=1 moderate AND >=4 supporting + +Likely Pathogenic +----------------- + + +1. >=2 strong +2. 1 strong AND one of teh following + a. 1-2 moderate OR + b. >=2 supporting +3. >=3 moderate OR + a. 2 moderate AND >=2 supporting + b. 1 modereate AND >=4 supporting + +Tavtigian et al. (2020) Rules +============================= + +Alternatively, Tavtigian et al. (2020) formulated the rules as an integer point system. + +Table 2 from this manuscript gives point values: + +.. list-table:: Points per proband + + * - evidence strength + - points pathogenic + - points benign + * - indeterminate + - 0 + - 0 + * - supporting + - 1 + - -1 + * - moderate + - 2 + - -2 + * - strong + - 4 + - -4 + * - very strong + - 8 + - -8 + +The point-based variant classification categories are then given in their Table 3: + +.. list-table:: Categories and point ranges + + * - category + - point ranges + * - pathogenic + - >= 10 + * - likely pathogenic + - 6 to 9 + * - uncertain significance + - 0 to 5 + * - likely benign + - -1 to -6 + * - benign + - <= -7 + +-------- +Criteria +-------- + +.. _acmg_seqvars_criteria-pvs1: + +PVS1 (null variant) +=================== + +Original Definition +------------------- + + Null variant (nonsense, frameshift, canonical +/-1 or 2 splice sites, initiation codon, single or multi-exon deletion) in a gene where loss of function (LOF) is a known mechanism of disease. + + Caveats: + + - Beware of genes where LOF is not a known disease mechanism (e.g. GFAP, MYH7) + - Use caution interpreting LOF variants at the extreme 3' end of a gene + - Use caution with splice variants that are predicted to lead to exon skipping but leave the remainder of the protein intact + - Use caution in the presence of multiple transcripts + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- Criterion establishes whether LoF is a known mechanism of disease: + - If at least 2 LoF variants are reported in ClinVar with two or more stars then this criterion is triggered. + - If the gnomAD LOF Observed/Expected is less than 0.7555 then this criterion is triggered. +- Criterion establishes whether a stop_gain variant introduced nonsense mediated decay (NMD) consistent with Abou Youn et al. (2018) and the VEP NMD plugin. + - If the variant is on chrMT then it cannot be NMD. + - If the variant is not_stop gain then then it cannot be NMD, else: + - If the variant is in the last exon of the transcript then it is predicted to escape NMD. + - If the variant falls 50bp upstream of the penuultimate (second to the last) exon then it is predicted to escape NMD. + - If the variant falls int the first 100 coding bases in teh transcript then it is predicted to escape NMD. + - If the variant is in an intronless transcript, meaning only one exon exists in the transcript, then it is predicted to escape NMD. + - Else, the variant is predicted to be NMD. +- The MANE Plus Clinical transcripts are used for "biologically relevant transcripts" in this criterion. + +Implemented Criterion +--------------------- + +While the original description is somewhat vague, the specification in Abou Tayoun et al. (2018) is more precise but complex to implement. +We plan to implement it as closely as possible. + +TODO: full specification + +Literature +---------- + +- Richards et al. (2015) describes the original criterion. +- Abou Tayoun et al. (2018) describe refined criteria for PVS1. +- McCormick et al. (2020) describe the ACMG criteria for chrMT variants. +- The following are from the VEP NMD plugin: + - Identifying Genes Whose Mutant Transcripts Cause Dominant Disease Traits by Potential Gain-of-Function Alleles (Coban-Akdemir, 2018) + - The criteria and impact of nonsense-mediated mRNA decay in human cancers (Lindeboom, 2016) + +User Report +----------- + +The following information is reported to the user: + +- The evidence for / against LoF as disease mechanism. +- Whether NMD and NMD escape is predicted for this variant and the reason. +- The use of MANE Plus Clinical or alternate transcripts for locating alternate start codons. +- Further information of interest from the Abou Tayoun et al. (2018) decision tree. + +Caveats +------- + +- We use the thresholds from `PMID:30376034 `__ but should reconsider, e.g., switching to LOEUF here with our own thresholds. +- This is currently not implementing the full criteria set from Abou Tayoun et al. (2018). + +Notes +----- + +- If this criterion is triggered then PP3 and PM4 will be disabled. + +.. _acmg_seqvars_criteria-ps1: + +PS1 (same amino acid) +===================== + +Original Definition +------------------- + + Same amino acid change as a previously established pathogenic variant regardless of nucleotide change. + + Caveat: Beware of changes that impact splicing rather than at the amino acid/protein level. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If the variant is not a missense variant then this criterion is skipped. + +Implemented Criterion +--------------------- + +- Consider all equivalent missense variants in ClinVar. +- If at least one of the variant then this criterion is triggered. + - If the variant has zero stars in ClinVar then we report PS1_Supporting only + - If the variant has only one star in ClinVar then we report PS1_Moderate only + - If the variant has two stars in ClinVar then we report PS1 + - If the variant has three stars or above in ClinVar then we report PS1_VeryStrong + +User Report +----------- + +- The selected variant in ClinVar and with assessment its star status with accession. +- All alternate variants in Clinvar with assessments and star status with accessions. + +Literature +---------- + +N/A + +Caveats +------- + +- The wording of "established pathogenic" variant is not clear so we use the steps from above. +- Note that this also depends on disease match which the user must confirm manually. + +.. _acmg_seqvars_criteria-ps2: + +PS2 (confirmed *de novo*) +========================= + +No automation has been implemented. + +Original Definition +------------------- + + De novo (both maternity and paternity confirmed) in a patient with the disease and no family history + + Note: Confirmation of paternity only is insufficient. + Egg donation, surrogate motherhood, errors in embryo transfer, etc. can contribute to non-maternity. + + -- Richards et al. (2015); Table 4 + +.. _acmg_seqvars_criteria-ps3: + +PS3 (functional studies) +======================== + +No automation has been implemented. + +Original Definition +------------------- + + Well-established in vitro or in vivo functional studies supportive of a damaging effect on the gene or gene product. + + Note: Functional studies that have been validated and shown to be reproducible and robust in a clinical diagnostic laboratory setting are considered the most well-established. + + -- Richards et al. (2015); Table 4 + +.. _acmg_seqvars_criteria-ps4: + +PS4 (prevalence) +================ + +No automation has been implemented. + +Original Definition +------------------- + + The prevalence of the variant in affected individuals is significantly increased compared to the prevalence in controls + + Note 1: Relative risk (RR) or odds ratio (OR), as obtained from case-control studies, is >5.0 and the confidence interval around the estimate of RR or OR does not include 1.0. See manuscript for detailed guidance. + + Note 2: In instances of very rare variants where case-control studies may not reach statistical significance, the prior observation of the variant in multiple unrelated patients with the same phenotype, and its absence in controls, may be used as moderate level of evidence. + + -- Richards et al. (2015); Table 4 + +.. _acmg_seqvars_criteria-pm1: + +PM1 (hotspot) +============= + +Original Definition +------------------- + + Located in a mutational hot spot and/or critical and well-established functional domain (e.g. active site of an enzyme) without benign variation. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If the variant is on chrMT then this criterion is skipped according to McCormick et al. (2020). + +Implemented Criterion +--------------------- + +- If the variant is within a hotspot (at least 4 pathogenic missense/in-frame variants within 25bp radius) then this criterion is triggered. +- If the variant is within an annotated UniProt domain and the domain contains at least 2 pathogenic variants then this criterion is triggered. + +User Report +----------- + +- The hotspot region definition and the number of pathogenic variants in the region. + +Literature +---------- + +- McCormick et al. (2020) describe the ACMG criteria for chrMT variants. + +Caveats +------- + +- We currently use the threshold from `PMID:30376034 `__ and are lacking our own calibration. + +.. _acmg_seqvars_criteria-pm3: + +PM3 (recessive in *trans*) +========================== + +No automation has been implemented. + +Original Definition +------------------- + + For recessive disorders, detected in trans with a pathogenic variant. + + Note: This requires testing of parents (or offspring) to determine phase. + + -- Richards et al. (2015); Table 4 + +.. _acmg_seqvars_criteria-pm4: + +PM4 (protein length) +==================== + +Original Definition +------------------- + + Protein length changes due to in-frame deletions/insertions in a non-repeat region or stop-loss variants. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If PVS1 was triggered then this criterion is skipped to avoid double counting. +- If the variant is not an in-frame indel and not a stop-loss variant then this criterion is skipped. + +Implemented Criterion +--------------------- + +- If the variant is an in-frame indel + - If the variant is inside a repeat masked region then it is skipped + - If the variant is inside a repeat as annotated by UniProt then it is skipped + - Otherwise, this criterion is triggered. +- If the variant is a stop-loss variant then this criterion is triggered. + +User Report +----------- + +- Any reasons for skipping in repeat regions. +- The transcript identifier. + +Literature +---------- + +N/A + +Caveats +------- + +- Richards et al. (2015) state that the size of the indel and amount of change in amino acids should influence the classification. + We currently do not have this implemented. + +.. _acmg_seqvars_criteria-pm5: + +PM5 (overlapping missense) +========================== + +Original Definition +------------------- + + Novel missense change at an amino acid residue where a different missense change determined to be pathogenic has been seen before. + + Caveat: Beware of changes that impact splicing rather than at the amino acid/protein level. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If the variant is on a nuclear chromosome + - If it is not a missense variant then this criterion is skipped. +- If the variant is on chrMT and not missense and not on a tRNA gene then this criterion is skipped. + +Implemented Criterion +--------------------- + +- If the variant is on a nuclear chromosome: + - If the variant is at the same position as a pathogenic missense variant then this criterion is triggered. +- If the variant is on chrMT: + - If the variant is a missense variant and at the same position as a pathogenic one then the criterion is triggered. + - If the variant is on a tRNA gene and at the same position as a pathogenic one then the criterion is triggered as PM5_Supporting. + +User Report +----------- + +- The overlapping variant used for criterion. +- Any alternative overlapping variants not chosen. + +Literature +---------- + +- Richards et al. (2018) describes the criterion for nuclear chromosomes. +- McCormick et al. (2020) describes the criterion for chrMT. + +Caveats +------- + +N/A + +.. _acmg_seqvars_criteria-pm6: + +PM6 (assumed *de novo*) +======================= + +No automation has been implemented. + +Original Definition +------------------- + + Assumed de novo, but without confirmation of paternity and maternity. + + -- Richards et al. (2015); Table 4 + +.. _acmg_seqvars_criteria-pm2: + +PM2_Supporting (absent from controls) +===================================== + +Original Definition +------------------- + + Absent from controls (or at extremely low frequency if recessive) in Exome Sequencing Project, 1000 Genomes or ExAC. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- Determine :ref:`acmg_seqvars_criteria-inheritance` for the gene. +- Determine :ref:`acmg_seqvars_criteria-frequency`. +- If the allele frequency is invalid then this criterion is skipped. + +Implemented Criterion +--------------------- + +- If the variant is on a nuclear chromosome: + - If the gene is marked as recessive or X-linked: + - If the variant allele count is <=4 then this criterion is triggered. + - If the gene is marked as dominant: + - If the homozygous allele count is <=1 then this criterion is triggered. + - If the allele frequency is less than 0.0001 then this criterion is triggered. +- If the variant is on chrMT: + If the variant frequency is below 0.00002=0.002%=1/50,000 then this criterion is triggered. + +User Report +----------- + +- The values and thresholds used by the criterion even if failed. + +Literature +---------- + +- Richards et al. (2015) describes the original criterion. +- ClinGen Sequence Variant Interpretation Work Group (2020): SVI Recommendation for Absence/Rarity (PM2) - Version 1.0 describes the downgrade to supporting. +- McCormick et al. (2020) describe the ACMG criteria for chrMT variants. + +Caveats +------- + +- We currently use the threshold from `PMID:30376034 `__ and are lacking our own calibration. +- This criterion has been downgraded by default to supporting from strong in accordance to ClinGen Sequence Variant Interpretation Work Group (2020): *SVI Recommendation for Absence/Rarity (PM2) - Version 1.0* + +.. _acmg_seqvars_criteria-pp1: + +PP1 (cosegregation) +=================== + +No automation has been implemented. + +.. _acmg_seqvars_criteria-pp2: + +PP2 (missense) +============== + +Original Definition +------------------- + + Missense variant in a gene that has a low rate of benign missense variation and where missense variants are a common mechanism of disease. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If the variant is on chrMT then this criterion is skipped according to McCormick et al. (2020). +- If the variant is not a missense variant then this criterion is skipped. + +Implemented Criterion +--------------------- + +- If the ratio of pathogenic missense variants over all non-VUS missense variants is greater than 0.808 then this criterion is triggered. + +User Report +----------- + +- Report the ratio of pathogenic missense variants over all non-VUS missense variants. + +Literature +---------- + +- McCormick et al. (2020) describe the ACMG criteria for chrMT variants. + +Caveats +------- + +- We currently use the threshold from `PMID:30376034 `__ and are lacking our own calibration. + +Notes +----- + +- This criterion is similar to :ref:`acmg_seqvars_criteria-bp1` + +.. _acmg_seqvars_criteria-pp3: + +PP3 (*in silico* predictions) +============================= + +Original Definition +------------------- + + Multiple lines of computational evidence support a deleterious effect on the gene or gene product (conservation, evolutionary, splicing impact, etc). + + Caveats: + + - As many in silico algorithms use the same or very similar input for their predictions, each algorithm should not be counted as an independent criterion. + - PP3 can be used only once in any evaluation of a variant. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If the criterion PVS1 was triggered then this criterion is skipped. +- If the variant is on chrMT then it is skipped, as we don't have calibration for chrMT yet. +- If the variant is not found in dbNSFP or CADD precomputed scores then it is skipped as we don't have calibration for chrMT yet. + +Implemented Criterion +--------------------- + +An initial prediction is fist done using the general purpose pathogenicity predictors. + +- If we have a score from the following, then the prediction is used (in descending order of priority): + - REVEL, MutPred2, CADD, BayesDel, VEST4, ..., PhyloP + - we will use the modifiers from Pejaver et al. (2022) +- If predictions are missing then then PhyloP of the position of the variant is used as a fallback. + +Then, for splicing the following is done. + +- If a SpliceAI prediction is performed then it is interpreted according to Walker et al. (2023). + +The highest-scoring variant is used for the final prediction. + +User Report +----------- + +- The scores and predictions from the predictors. + +Literature +---------- + +- Pejaver et al. (2022) has our thresholds for general variants +- Walker et al. (2023) has the threshold for splicing + +Caveats +------- + +- As described in :ref:`acmg_seqvars_criteria-patho-predictions`, we are currently limited to the precomputed threshold from the literature. + This hinders us in adopting AlphaMissense effectively, for example. +- We need to compute accuracy to rank the implemented methods. +- We need our own calibration for chrMT. + +Notes +----- + +- This criterion is similar to :ref:`acmg_seqvars_criteria-bp4` + +.. _acmg_seqvars_criteria-pp4: + +PP4 (monogenetic) +================= + +No automation has been implemented. + +.. _acmg_seqvars_criteria-ba1: + +BA1 (5% frequency) +================== + +Original Definition +------------------- + + Allele frequency is >5% in Exome Sequencing Project, 1000 Genomes Project, or Exome Aggregation Consortium + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- The variant is absent from the exception list from Ghosh et al. (2018). + If the variant is present on this list, then this criterion is skipped. + +Implemented Criterion +--------------------- + +- If the variant is nuclear (not on chrMT) + - If the allele frequency is above 0.05 in gnomAD global population then this criterion is triggered. +- else (the variant is on chrMT) + - If the allele frequency is above 0.01 in gnomAD-mtDNA global population then this criterion is triggered. + +User Report +----------- + +- The variant frequency. + +Literature +---------- + +- Richards et al. (2015) describes the 5% allele frequency threshold. +- Ghosh et al. (2018) introduce the exception list and ClinGen maintains it. +- McCormick et al. (2020) describe the 1% allele frequency threshold as appropriate for chrMT variants. + +Caveats +------- + +- The exception *"However, there must be no additional conflicting evidence to support pathogenicity, such as a novel occurrence in a certain haplogroup" from McCormick et al. (2020)* is not implemented yet. + +.. _acmg_seqvars_criteria-bs1: + +BS1 (expected frequency) +======================== + +Original Definition +------------------- + + Allele frequency greater than expected for disorder. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- Determine :ref:`acmg_seqvars_criteria-frequency`. +- If the allele frequency is invalid then this criterion is skipped. + +Implemented Criterion +--------------------- + +- If the variant is on a nuclear chromosome and the user provided a maximal credible population frequency: + - If the FAF from gnomAD is above the maximal credible population frequency then this criterion is triggered. +- If the variant is on chrMT: + - If the population frequency is above 0.5% then this criterion is triggered in accordance to McCormick et al. (2020). + +User Report +----------- + +- The variant frequency and again the user specified maximal credible population frequency for nuclear variants. +- The variant frequency and the 0.5% threshold for chrMT variants. + +Literature +---------- + +- Richards et al. (2015) describes the original criterion without thresholds. +- Gudmundsson et al. (2022) describe the FAF threshold provided by gnomAD. +- McCormick et al. (2020) describe the ACMG criteria for chrMT variants. + +.. _acmg_seqvars_criteria-bs2: + +BS2 (healthy adult) +=================== + +Original Definition +------------------- + + Observed in a healthy adult individual for a recessive (homozygous), dominant (heterozygous), or X-linked (hemizygous) disorder, with full penetrance expected at an early age. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If the criterion BA1 triggered then this criterion is skipped. +- Determine :ref:`acmg_seqvars_criteria-inheritance` for the gene. +- Determine :ref:`acmg_seqvars_criteria-frequency`. +- If the allele frequency is invalid then this criterion is skipped. +- If the criterion BA1 was triggered then this criterion is skipped. + +Implemented Criterion +--------------------- + +- If the gene is marked as recessive or X-linked: + - If the variant allele count is above 2 then this criterion is triggered. +- If the gene is marked as dominant: + - If the variant allele count is above 5 then this criterion is triggered. + +User Report +----------- + +- The variant frequency and the threshold used. + +Literature +---------- + +- Chen et al. (2022), Karczewski et al. (2020), etc. describe gnomAD. +- The modes of inheritance for the genes are taken from different sources as described in :ref:`acmg_seqvars_criteria-inheritance`. + +Caveats +------- + +- The conditions of "full penetrance" and "expected at an early age" need to be checked by the user. + +Notes +----- + +- Genes can be marked as both recessive and dominant. +- We use the thresholds from `PMID:30376034 `__. + +.. _acmg_seqvars_criteria-bs3: + +BS3 (functional studies) +======================== + +No automation has been implemented. + +Original Definition +------------------- + + Well-established in vitro or in vivo functional studies shows no damaging effect on protein function or splicing. + + -- Richards et al. (2015); Table 4 + +.. _acmg_seqvars_criteria-bs4: + +BS4 (lack of segregation) +========================= + +No automation has been implemented. + +Original Definition +------------------- + + Lack of segregation in affected members of a family + + Caveats: + + - The presence of phenocopies for common phenotypes (i.e. cancer, epilepsy) can mimic lack of segregation among affected individuals. + - Also, families may have more than one pathogenic variant contributing to an autosomal dominant disorder, further confounding an apparent lack of segregation. + + -- Richards et al. (2015); Table 4 + +.. _acmg_seqvars_criteria-bp1: + +BP1 (missense) +============== + +Original Definition +------------------- + + Missense variant in a gene for which primarily truncating variants are known to cause disease + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If the criterion BA1 triggered then this criterion is skipped. +- If the variant is on chrMT then this criterion is skipped according to McCormick et al. (2020). +- If the variant is not a missense variant then this criterion is skipped. + +Implemented Criterion +--------------------- + +- If the ratio of benign missense variants over all non-VUS missense variants is greater than 0.569 then this criterion is triggered. + +User Report +----------- + +- Report the ratio of benign missense variants over all non-VUS missense variants together with threshold. + +Literature +---------- + +- McCormick et al. (2020) describe the ACMG criteria for chrMT variants. + +Caveats +------- + +- We currently use the threshold from `PMID:30376034 `__ and are lacking our own calibration. + +Notes +----- + +- This criterion is similar to :ref:`acmg_seqvars_criteria-pp2` + +.. _acmg_seqvars_criteria-bp2: + +BP2 (recessive in *trans*) +========================== + +No automation has been implemented. + +Original Definition +------------------- + + Observed in trans with a pathogenic variant for a fully penetrant dominant gene/disorder; or observed in cis with a pathogenic variant in any inheritance pattern + + -- Richards et al. (2015); Table 4 + +.. _acmg_seqvars_criteria-bp3: + +BP3 (in-frame repetitive) +========================= + +.. note:: + + - We do not have proper Uniprot data yet (domain / repeat) + - Similar to repeat masker. + - Probably same for phylop100way? + +Original Definition +------------------- + + In-frame deletions/insertions in a repetitive region without a known function. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If the criterion BA1 triggered then this criterion is skipped. +- If the variant is on chrMT then this criterion is skipped. + +Implemented Criterion +--------------------- + +- If the variant is in a known functional domain according to UniProt then this criterion is skipped. +- If the variant is in a repeat region according to UniProt repeat annotation genome repeat masker then this criterion is skipped. +- If the variant is in a region of low conservation (PhyloP100Way less than 3.58, same as `PMID:30376034 `__) then this criterion is skipped. +- If all conditions above fail then this criterion is triggered. + +User Report +----------- + +- The variant position and the reason for triggering or skipping. + +Literature +---------- + +- McCormick et al. (2020) describe the ACMG criteria for chrMT variants. + +Caveats +------- + +- We currently use the conservation threshold from `PMID:30376034 `__ and are lacking our own calibration. +- Different from `PMID:30376034 `__, we do not check whether there are known pathogenic variants in the region. + +.. _acmg_seqvars_criteria-bp4: + +BP4 (*in silico* predictions) +============================= + +.. note:: + + - we have not implemented MitoTip or MitImpact yet + - we are lacking phylop scores yet + - we don't have live CADD scores yet + +Original Definition +------------------- + + Multiple lines of computational evidence suggest no impact on gene or gene product (conservation, evolutionary, splicing impact, etc). + + Caveat: As many in silico algorithms use the same or very similar input for their predictions, each algorithm cannot be counted as an independent criterion. + BP4 can be used only once in any evaluation of a variant. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If the criterion BA1 triggered then this criterion is skipped. +- If the variant is on chrMT then it is skipped, as we don't have calibration for chrMT yet. +- If the variant is not found in dbNSFP or CADD precomputed scores then it is skipped as we don't have calibration for chrMT yet. + +Implemented Criterion +--------------------- + +See :ref:`acmg_seqvars_criteria-pp3` for details. + +User Report +----------- + +See :ref:`acmg_seqvars_criteria-pp3` for details. + +Literature +---------- + +See :ref:`acmg_seqvars_criteria-pp3` for details. + +Caveats +------- + +See :ref:`acmg_seqvars_criteria-pp3` for details. + +Notes +----- + +- This criterion is similar to :ref:`acmg_seqvars_criteria-pp3` + +.. _acmg_seqvars_criteria-bp5: + +BP5 (found in solved) +===================== + +No automation has been implemented. + +Original Definition +------------------- + + Variant found in a case with an alternate molecular basis for disease. + + -- Richards et al. (2015); Table 4 + +.. _acmg_seqvars_criteria-bp7: + +BP7 (synonymous) +================ + +Original Definition +------------------- + + A synonymous (silent) variant for which splicing prediction algorithms predict no impact to the splice consensus sequence nor the creation of a new splice site AND the nucleotide is not highly conserved. + + -- Richards et al. (2015); Table 4 + +Preconditions / Precomputations +------------------------------- + +- If the variant is on chrMT then this criterion is skipped according to McCormick et al. (2020). + +Implemented Criterion +--------------------- + +- If there is a pathogenic variant +/- 2bp of the position in ClinVar then the criterion is skipped. +- If the variant is closer than 2bp to a splice site then the criterion is skipped. +- If the variant is not predicted to alter the splice site using SpliceAI then the criterion is triggered. + +User Report +----------- + +- The variant position and the reason for triggering or skipping. + +Literature +---------- + +- McCormick et al. (2020) describe the ACMG criteria for chrMT variants. + +Caveats +------- + +N/A + +Notes +----- + +- We use the thresholds from `PMID:30376034 `__. diff --git a/docs/acmg_seqvars.rst b/docs/acmg_seqvars_details.rst similarity index 85% rename from docs/acmg_seqvars.rst rename to docs/acmg_seqvars_details.rst index 2e5b8c19..d2712699 100644 --- a/docs/acmg_seqvars.rst +++ b/docs/acmg_seqvars_details.rst @@ -1,18 +1,50 @@ -.. _acmg_seqvars: +.. _acmg_seqvars_details: -=============================== -Sequence Variant Classification -=============================== +============================= +AMCG Sequence Variant Details +============================= -This section describes the ACMG classification for sequence variants. +This section describes additional details on our implementation of ACMG classification for sequence variants. -.. _acmg_seqvars-references: +.. _acmg_seqvars_details-Datas: + +---- +Data +---- + +The following datasources are used in the classification of sequence variants. + +.. list-table:: Tools and datasources used + + * - Data + - Tool / Datasource + - Original Datasource + * - Transcripts + - `Mehari `__ / `cdot `__ + - RefSeq / ENSEMBL + * - Variant Effect Predictions + - `Mehari `__ + - N/A + * - Variant Frequencies + - annonars + - gnomAD exomes, genomes, mtDNA + * - chrMT scores, annotations + - annonars + - MITOMAP, MitoTip, MitImpact, MtSNPscore + * - gene annotations + - annonars + - ClinGen, GCD, gene2phenotype, GenCC, PanelApp, DOMINO + * - protein annotations: domains, repeats, mutations + - UCSC genome browser + - UniProt + +.. _acmg_seqvars_details-references: ---------- References ---------- -It is based on the following publications, also `listed on the ClinGen Website `__. +Literature with direct ACMG / ACGS / ClinGen relationship - Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, Grody WW, Hegde M, Lyon E, Spector E, Voelkerding K, Rehm HL; ACMG Laboratory Quality Assurance Committee. *Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology.* @@ -20,6 +52,9 @@ It is based on the following publications, also `listed on the ClinGen Website < - ClinGen Sequence Variant Interpretation Work Group. *Recommendations for ACMG/AMP guideline criteria code modifications nomenclature.* November 10, 2017. +- Whiffin N, Minikel E, Walsh R, O'Donnell-Luria AH, Karczewski K, Ing AY, Barton PJR, Funke B, Cook SA, MacArthur D, Ware JS. + *Using high-resolution variant frequencies to empower clinical genome interpretation.* + Genet Med. 2017 Oct;19(10):1151-1158. doi: 10.1038/gim.2017.26. Epub 2017 May 18. PMID: 28518168; PMCID: PMC5563454. - Abou Tayoun AN, Pesaran T, DiStefano MT, Oza A, Rehm HL, Biesecker LG, Harrison SM; ClinGen Sequence Variant Interpretation Working Group (ClinGen SVI). *Recommendations for interpreting the loss of function PVS1 ACMG/AMP variant criterion.* Hum Mutat. 2018 Nov;39(11):1517-1524. doi: 10.1002/humu.23626. Epub 2018 Sep 7. PMID: 30192042; PMCID: PMC6185798. @@ -44,12 +79,18 @@ It is based on the following publications, also `listed on the ClinGen Website < - Ellard S, Baple EL, Callaway A, Berry I, Forrester N, Turnbull C, Owens M, Eccles DM, Abbs S, Scott R, Deans ZC, Lester T, Campbell J, Newman WG, Ramsden S, McMullan DJ. *ACGS Best Practice Guidelines for Variant Classification in Rare Disease 2020.* 2020. +- McCormick EM, Lott MT, Dulik MC, Shen L, Attimonelli M, Vitale O, Karaa A, Bai R, Pineda-Alvarez DE, Singh LN, Stanley CM, Wong S, Bhardwaj A, Merkurjev D, Mao R, Sondheimer N, Zhang S, Procaccio V, Wallace DC, Gai X, Falk MJ. + *Specifications of the ACMG/AMP standards and guidelines for mitochondrial DNA variant interpretation.* + Hum Mutat. 2020 Dec;41(12):2028-2057. doi: 10.1002/humu.24107. Epub 2020 Nov 10. PMID: 32906214; PMCID: PMC7717623. - Tavtigian SV, Harrison SM, Boucher KM, Biesecker LG. *Fitting a naturally scaled point system to the ACMG/AMP variant classification guidelines.* Hum Mutat. 2020 Oct;41(10):1734-1737. doi: 10.1002/humu.24088. Epub 2020 Aug 30. PMID: 32720330; PMCID: PMC8011844. - ClinGen Sequence Variant Interpretation Work Group. *SVI Recommendation for De Novo Criteria (PS2 & PM6) - Version 1.1* 2021. +- DiStefano MT, Goehringer S, Babb L, Alkuraya FS, Amberger J, Amin M, Austin-Tse C, Balzotti M, Berg JS, Birney E, Bocchini C. + *The gene curation coalition: a global effort to harmonize gene-disease evidence resources.* + Genetics in Medicine. 2022 Aug 1;24(8):1732-42. - Pejaver V, Byrne AB, Feng BJ, Pagel KA, Mooney SD, Karchin R, O'Donnell-Luria A, Harrison SM, Tavtigian SV, Greenblatt MS, Biesecker LG, Radivojac P, Brenner SE; ClinGen Sequence Variant Interpretation Working Group. *Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations for PP3/BP4 criteria.* Am J Hum Genet. 2022 Dec 1;109(12):2163-2177. doi: 10.1016/j.ajhg.2022.10.013. Epub 2022 Nov 21. PMID: 36413997; PMCID: PMC9748256. @@ -57,12 +98,6 @@ It is based on the following publications, also `listed on the ClinGen Website < *Using the ACMG/AMP framework to capture evidence related to predicted and observed impact on splicing: Recommendations from the ClinGen SVI Splicing Subgroup.* Am J Hum Genet. 2023 Jul 6;110(7):1046-1067. doi: 10.1016/j.ajhg.2023.06.002. Epub 2023 Jun 22. PMID: 37352859; PMCID: PMC10357475. -Further references: - -- Eilbeck K, Lewis SE, Mungall CJ, Yandell M, Stein L, Durbin R, Ashburner M. - The Sequence Ontology: a tool for the unification of genome annotations. - Genome Biol. 2005;6(5):R44. doi: 10.1186/gb-2005-6-5-r44. Epub 2005 Apr 29. PMID: 15892872; PMCID: PMC1175956. - We currently exclude the following resources (we plan to later incorporate them): - Variant Curation Expert Panel (VCEP) Criteria Specifications approved by the SVI VCEP Review Committee @@ -70,15 +105,36 @@ We currently exclude the following resources (we plan to later incorporate them) *Evaluating the Clinical Validity of Gene-Disease Associations: An Evidence-Based Framework Developed by the Clinical Genome Resource.* Am J Hum Genet. 2017 Jun 1;100(6):895-906. doi: 10.1016/j.ajhg.2017.04.015. Epub 2017 May 25. PMID: 28552198; PMCID: PMC5473734. -.. _acmg_seqvars-criteria: +Further Supporting Literature + +- Eilbeck K, Lewis SE, Mungall CJ, Yandell M, Stein L, Durbin R, Ashburner M. + The Sequence Ontology: a tool for the unification of genome annotations. + Genome Biol. 2005;6(5):R44. doi: 10.1186/gb-2005-6-5-r44. Epub 2005 Apr 29. PMID: 15892872; PMCID: PMC1175956. +- Quinodoz M, Royer-Bertrand B, Cisarova K, Di Gioia SA, Superti-Furga A, Rivolta C. + *DOMINO: using machine learning to predict genes associated with dominant disorders.* + The American Journal of Human Genetics. 2017 Oct 5;101(4):623-9. +- Kopanos C, Tsiolkas V, Kouris A, Chapple CE, Aguilera MA, Meyer R, Massouras A. + *VarSome: the human genomic variant search engine.* + Bioinformatics. 2019 Jun 6;35(11):1978. +- Martin AR, Williams E, Foulger RE, Leigh S, Daugherty LC, Niblock O, Leong IU, Smith KR, Gerasimenko O, Haraldsdottir E, Thomas E. + *PanelApp crowdsources expert knowledge to establish consensus diagnostic gene panels.* + Nature genetics. 2019 Nov;51(11):1560-5. +- Thormann A, Halachev M, McLaren W, Moore DJ, Svinti V, Campbell A, Kerr SM, Tischkowitz M, Hunt SE, Dunlop MG, Hurles ME. + *Flexible and scalable diagnostic filtering of genomic variants using G2P with Ensembl VEP. Nature communications.* + 2019 May 30;10(1):2373. +- Gudmundsson S, Singer-Berk M, Watts NA, Phu W, Goodrich JK, Solomonson M; Genome Aggregation Database Consortium; Rehm HL, MacArthur DG, O'Donnell-Luria A. + *Variant interpretation using population databases: Lessons from gnomAD.* + Hum Mutat. 2022 Aug;43(8):1012-1030. doi: 10.1002/humu.24309. Epub 2021 Dec 16. PMID: 34859531; PMCID: PMC9160216. + +.. _acmg_seqvars_details-criteria: -------- Criteria -------- -The text in the following section is based on the one by Richards et al. (2015) and the updates listed in :ref:`acmg_seqvars-references`. +The text in the following section is based on the one by Richards et al. (2015) and the updates listed in :ref:`acmg_seqvars_details-references`. -.. _acmg_seqvars-criteria-pvs: +.. _acmg_seqvars_details-criteria-pvs: Pathogenic Very Strong ====================== @@ -262,14 +318,14 @@ PS1 (same amino acid change) - with P comparison variant: VUA is PS1_Moderate - with LP comparison variant: VUA is PS1_Supporting -.. image:: img/Walker-2023-Table-2-1.png - :alt: Table 2 (part 1) from Walker et al. (2023) +.. image:: img/Walker-2023-Table-3-1.png + :alt: Table 3 (part 1) from Walker et al. (2023) -.. image:: img/Walker-2023-Table-2-1.png - :alt: Table 2 (part 2) from Walker et al. (2023) +.. image:: img/Walker-2023-Table-3-2.png + :alt: Table 3 (part 2) from Walker et al. (2023) -.. image:: img/Walker-2023-Table-3.png - :alt: Table 3 from Walker et al. (2023) +.. image:: img/Walker-2023-Table-2.png + :alt: Table 2 from Walker et al. (2023) **Caveats** @@ -374,7 +430,7 @@ PM4 (protein length) PM5 (overlapping missense) -------------------------- -- n ovel missense change at an amino acid residue where a different missense change determined to be pathogenic has been seen before +- novel missense change at an amino acid residue where a different missense change determined to be pathogenic has been seen before **Caveats** @@ -473,8 +529,8 @@ This category does not exist in Richards et al. (2015) but is implicitely introd Benign Strong ============= -BS1 (expecte frequency) ------------------------ +BS1 (expected frequency) +------------------------ - allele frequency is greater than expected for disorder @@ -558,149 +614,3 @@ BP7 (synonymous) - synonymous (silent) variant for which splicing prediction algorithms predict no impact to the splice consensus sequence nor the creation of a new splice site AND the nucleotide is not highly conserved - incorporate figures 4-5 from Walker et al. (2023) - ------ -Rules ------ - -ACMG 2015 Rules -=============== - -The following rules have been defined in Richards et al. (2015). - -Pathogenic ----------- - -If one of the following criteria 1-3 are fulfilled. - -1. 1 very strong (PVS1) AND one of the following - a. >=1 strong (PS1-PS4) - b. >=2 moderate (PM1-PM6) - c. >=1 moderate (PM1-PM6) AND >=1 supporting (PP1-PP5) - d. >=2 supporting (PP1-PP5) -2. >=2 strong (PS1-PS4) -3. 1 strong (PS1-PS4) AND - a. >=3 moderate (PM1-PM6) - b. 2 modreate (PM1-PM6) AND >=2 supporting (PP1-PP5) - c. 1 moderate (PM1-PM6) AND >=4 supporting (PP1-PP5) - -Likely Pathogenic ------------------ - -If one of the following criteria 1-7 are fulfilled. - -1. 1 very strong (PVS1) AND 1 moderate (PM1-PM6) -2. 1 strong (Ps1-PS4) AND 1-2 moderate (PM1-PM6) -3. 1 strong (PS1-PS4) AND >=2 supporting (PP1-PP5) -4. >=3 moderate (PM1-PM6) -5. 2 moderate (PM1-PM6) AND >=2 supporting (PP1-PP5) -6. 1 moderate (PM1-PM6) AND >=4 supporting (PP1-PP5) - -Benign ------- - -If one of the following criteria 1-2 are fulfilled. - -1. 1 standalone (BA1) -2. >=2 strong (BS1-BS4) - -Likely Benign -------------- - -If one of the following criteria 1-2 are fulfilled. - -1. 1 strong (BS1-BS4) AND 1 supporting (BP1-BP7) -2. >=2 supporting (BP1-BP7) - -Uncertain Significance ----------------------- - -If if one of the following criteria 1-2 are fulfilled. - -1. Other criteria shown above are not met -2. the criteria for benign and pathogenic are contradictory - -ACGS 2020 Rules -=============== - -The following is a refinement of the rules above set by the Ellard et al. (2020). - -Pathogenic ----------- - -1. 1 very strong (PVS) AND one of the following - a. >=1 strong - b. >=1 moderate - c. >=2 supporting -2. >=3 strong -3. 2 strong AND one of the following - a. >=1 moderate - b. >=2 supporting -4. 1 strong AND one of the following - a. >=3 moderate - b. >=2 moderate AND >=2 supporting - c. >=1 moderate AND >=4 supporting - -Likely Pathogenic ------------------ - - -1. >=2 strong -2. 1 strong AND one of teh following - a. 1-2 moderate OR - b. >=2 supporting -3. >=3 moderate OR - a. 2 moderate AND >=2 supporting - b. 1 modereate AND >=4 supporting - -Tavtigian et al. 2020 Rules -=========================== - -Alternatively, Tavtigian et al. (2020) formulated the rules as an integer point system. - -Table 2 from this manuscript gives point values - -.. list-table:: Points per proband - - * - evidence strength - - points pathogenic - - points benign - * - indeterminate - - 0 - - 0 - * - supporting - - 1 - - -1 - * - moderate - - 2 - - -2 - * - strong - - 4 - - -4 - * - very strong - - 8 - - -8 - -The point-based variant classification categories are then given in Table 3 - -.. list-table:: Categories and point ranges - - * - category - - point ranges - * - pathogenic - - >= 10 - * - likely pathogenic - - 6 to 9 - * - uncertain significance - - 0 to 5 - * - likely benign - - -1 to -6 - * - benign - - <= -7 - ------------------------------- -Code Modification Nomenclature ------------------------------- - -In accordance with ClinGen Sequence Variant Interpretation Work Group (2017), modification codes are used. -That is, for a criterion ``${CRIT}``, the modification codes ``${CRIT}_Supporting``, ``${CRIT}_Moderate``, ``${CRIT}_Strong``, ``${CRIT}_VeryStrong``. diff --git a/docs/index.rst b/docs/index.rst index dc772ebc..53c6a54a 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -9,7 +9,8 @@ It supports them by providing aggregated information on genomic variants. :maxdepth: 1 :caption: Contents: - acmg_seqvars + acmg_seqvars_criteria + acmg_seqvars_details .. toctree:: :hidden: