This repository has been archived by the owner on Apr 19, 2024. It is now read-only.

First pass at functional annotations, #177
cmungall committed Dec 11, 2020
1 parent b8eeb52 commit 6c3c5a1
Showing 2 changed files with 144 additions and 1 deletion.
142 changes: 142 additions & 0 deletions schema/annotation.yaml
@@ -0,0 +1,142 @@
id: https://microbiomedata/schema/annotation
name: NMDC-Annotation
title: Annotation Module for NMDC Schema
description: >-
  This module in the schema is for representing annotations, including functional annotations of proteins and other gene products,
  as well as controlled terms for describing things like metabolites
see_also:
  - https://github.com/microbiomedata/nmdc-metadata/issues/176

prefixes:
  biolinkml: https://w3id.org/biolink/biolinkml/
  biolink: https://w3id.org/biolink/vocab/

imports:
  - biolinkml:types
  - core
  - prov

classes:

  functional annotation term:
    aliases:
      - function
      - functional annotation
    is_a: controlled term value
    description: >-
      Abstract grouping class for any term/descriptor that can be applied to a functional unit of a genome (protein, ncRNA, complex).
    abstract: true
    todos:
      - decide if this should be used for product naming

  pathway:
    aliases:
      - biological process
      - metabolic pathway
      - signaling pathway
    is_a: functional annotation term
    description: >-
      A pathway is a sequence of steps/reactions carried out by an organism or community of organisms
    id_prefixes:
      - KEGG.PATHWAY
      - COG
    exact_mappings:
      - biolink:Pathway

  reaction:
    is_a: functional annotation term
    description: >-
      An individual biochemical transformation carried out by a functional unit of an organism, in which a collection of substrates is transformed into a collection of products.
      Can also represent transporters
    id_prefixes:
      - KEGG.REACTION
      - RHEA
      - MetaCyc
      - EC
      - GO
    exact_mappings:
      - biolink:MolecularActivity

  orthology group:
    is_a: functional annotation term
    description: >-
      A set of genes or gene products in which all members are orthologous
    id_prefixes:
      - KEGG.ORTHOLOGY
      - EGGNOG
      - PFAM
      - TIGRFAM
      - SUPFAM
      - PANTHER.FAMILY
    exact_mappings:
      - biolink:GeneFamily

  chemical entity:
    aliases:
      - metabolite
      - chemical substance
      - chemical compound
      - chemical
    is_a: controlled term value
    description: >-
      An atom or molecule that can be represented with a chemical formula. Includes lipids, glycans, natural products, drugs.
      There may be different terms for distinct acid-base forms, protonation states
    slot_usage:
      inchi:
        key: true
        multivalued: false
      inchi key:
        key: false # rare; Pletnev I, Erin A, McNaught A, Blinov K, Tchekhovskoi D, Heller S (2012) InChIKey collision resistance: an experimental testing. J Cheminform. 4:12
        multivalued: false
      smiles:
        description: >-
          A string encoding of a molecular graph, no chiral or isotopic information. There are usually a large number of valid SMILES which represent a given structure. For example, CCO, OCC and C(O)C all specify the structure of ethanol.
        multivalued: true
    see_also:
      - https://bioconductor.org/packages/devel/data/annotation/vignettes/metaboliteIDmapping/inst/doc/metaboliteIDmapping.html
    id_prefixes:
      - KEGG.COMPOUND
      - CHEBI
      - CHEMBL.COMPOUND
      - DRUGBANK
      - PUBCHEM.COMPOUND
      - CAS
      - HMDB
      - MESH
    exact_mappings:
      - biolink:ChemicalSubstance

  gene product:
    description: >-
      A molecule encoded by a gene that has an evolved function
    comments:
      - we may include a more general gene product class in future to allow for ncRNA annotation
    id_prefixes:
      - UniProtKB
      - gtpo
      - PR
    exact_mappings:
      - biolink:GeneProduct

  functional annotation:
    description: >-
      An assignment of a function term (e.g. reaction or pathway) that is executed by a gene product, or in which the gene product plays an active role.
      Functional annotations can be assigned manually by curators, or automatically in workflows. In the context of NMDC, all functional annotation is performed
      automatically, typically using HMM- or BLAST-type methods
    see_also:
      - https://img.jgi.doe.gov/docs/functional-annotation.pdf
      - https://github.com/microbiomedata/mg_annotation/blob/master/functional-annotation.wdl
    slots:
      - was generated by
    slot_usage:
      subject:
        range: gene product
      has function:
        range: functional annotation term
      was generated by:
        description: >-
          Provenance for the annotation. Note: to be consistent with the rest of the NMDC schema, we use the PROV annotation model rather than GPAD
    narrow_mappings:
      - biolink:GeneToGoTermAssociation
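
The id_prefixes lists in schema/annotation.yaml constrain which identifier namespaces each class accepts, and the functional annotation class ties a gene product (subject) to a functional annotation term (has function). A minimal Python sketch of how a consumer might check an annotation record against those prefix lists is given here. It assumes identifiers are CURIEs of the form PREFIX:local_id. The names ID_PREFIXES, prefix_allowed and FunctionalAnnotation, and the identifiers in the usage example, are hypothetical illustrations and not part of the NMDC tooling or of this commit.

```python
from dataclasses import dataclass

# Prefix lists copied from the id_prefixes entries in schema/annotation.yaml.
# The dictionary itself is an illustrative stand-in, not generated from the schema.
ID_PREFIXES = {
    "pathway": ["KEGG.PATHWAY", "COG"],
    "reaction": ["KEGG.REACTION", "RHEA", "MetaCyc", "EC", "GO"],
    "orthology group": [
        "KEGG.ORTHOLOGY", "EGGNOG", "PFAM", "TIGRFAM", "SUPFAM", "PANTHER.FAMILY",
    ],
    "gene product": ["UniProtKB", "gtpo", "PR"],
}


def prefix_allowed(class_name: str, curie: str) -> bool:
    """Return True if the CURIE's prefix is listed for the given class."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not local_id:
        return False  # not of the assumed form PREFIX:local_id
    return prefix in ID_PREFIXES.get(class_name, [])


@dataclass
class FunctionalAnnotation:
    """Hypothetical stand-in for the 'functional annotation' class."""

    subject: str           # CURIE of a gene product
    has_function: str      # CURIE of a functional annotation term
    was_generated_by: str  # identifier of the activity that produced the annotation

    def check(self, function_class: str) -> bool:
        """Check both CURIEs against the prefix lists for their classes."""
        return prefix_allowed("gene product", self.subject) and prefix_allowed(
            function_class, self.has_function
        )


# Usage with made-up identifiers.
annotation = FunctionalAnnotation(
    subject="UniProtKB:P12345",
    has_function="KEGG.ORTHOLOGY:K00001",
    was_generated_by="example:annotation_activity_1",
)
print(annotation.check("orthology group"))
```

The check only compares prefixes; it does not resolve identifiers or verify that a term exists in the named resource.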

3 changes: 2 additions & 1 deletion schema/nmdc.yaml
@@ -43,6 +43,7 @@ imports:
   - mixs
   - core
   - prov
+  - annotation

 subsets:
   workflow subset:
@@ -813,4 +814,4 @@ slots:
     range: string

   completion_date:
-    range: string
+    range: string
