“The International Committee on Taxonomy of Viruses (ICTV) authorizes and organizes the taxonomic classification of and the nomenclatures for viruses. The ICTV has developed a universal taxonomic scheme for viruses, and thus has the means to appropriately describe, name, and classify every virus that affects living organisms. The members of the International Committee on Taxonomy of Viruses are considered expert virologists. The ICTV was formed from and is governed by the Virology Division of the International Union of Microbiological Societies. Detailed work, such as delimiting the boundaries of species within a family, typically is performed by study groups of experts in the families.” Description from Wikipedia.

The ICTV Master Species List is curated by virology experts, who have established over 100 international study groups. These groups organize discussions on emerging taxonomic issues in their field, oversee the submission of proposals for new taxonomy, and prepare or revise the relevant chapter(s) in ICTV reports. ICTV accepts proposals for taxonomic changes from any individual; in practice, however, proposals are usually submitted by members of the relevant study groups.

The ICTV chooses an exemplar virus for each species and the Virus Metadata Resource provides a list of these exemplars. An exemplar virus serves as an example of a well-characterized virus isolate of that species and includes the GenBank accession number for the genomic sequence of the isolate as well as the virus name, isolate designation, suggested abbreviation, genome composition, and host source.
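
As a rough illustration of the exemplar record described above, the fields could be modeled as follows. This is a minimal sketch: the class name and field names are hypothetical and are not taken from this import's schema or from the Virus Metadata Resource column headers, and the values are placeholders rather than real ICTV data.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ExemplarVirus:
    """One exemplar isolate for a species (hypothetical field names)."""
    species: str
    virus_name: str
    isolate_designation: str
    abbreviation: str
    genome_composition: str
    host_source: str
    genbank_accession: str


# Placeholder values only; these are not real ICTV data.
example = ExemplarVirus(
    species="<species name>",
    virus_name="<virus name>",
    isolate_designation="<isolate>",
    abbreviation="<abbreviation>",
    genome_composition="<genome composition>",
    host_source="<host source>",
    genbank_accession="<accession>",
)

# A flat dict like this maps naturally onto one CSV row.
row = asdict(example)
```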

This import, including the schema additions, is documented internally [here](https://docs.google.com/document/d/1ELM4XmjyG1bitWqdSrSp6d49EQ2_ya4PpXHc_B0cPIE/edit?resourcekey=0-eefsHcX6YqQ7UqRcwVpaBg#heading=h.qtewylhpzoc9). It is also being documented on GitHub in datacommonsorg/data [PR #834](datacommonsorg/data#834).

This cleans up the Master Species List and Virus Metadata Resource datasets from ICTV and formats them as a tmcf + csv biomedical import. It also adds schema to represent the data in this import. The import passed the tests from Prashanth's json tool and the internal v3 staging tool.

PiperOrigin-RevId: 520131436
spiekos authored and copybara-github committed Mar 28, 2023
1 parent b994b06 commit b77be4e
Showing 5 changed files with 13,516 additions and 5 deletions.
10 changes: 5 additions & 5 deletions biomedical_schema/genome_annotation.mcf
@@ -51,21 +51,21 @@ name: "Gene"
typeOf: schema:Class
subClassOf: dcs:GenomeAnnotation
description: "Gene symbol of a gene, which is the basic hereditary unit of life."
- sameAs: "https://bioportal.bioontology.org/ontologies/OGG"
+ descriptionUrl: "https://bioportal.bioontology.org/ontologies/OGG"

Node: dcid:GeneticVariant
name: "GeneticVariant"
typeOf: schema:Class
subClassOf: dcs:GenomeAnnotation
description: "A single-nucleotide polymorphism, which is a substitution of a single nucleotide that occurs at a specific position in the genome, where each variation is present to some appreciable degree within a population. These are defined by dbSNP and includes small indels as well."
- sameAs: "http://rohsdb.usc.edu/GBshape/cgi-bin/hgTables?db=hg19&hgta_group=varRep&hgta_track=snp137&hgta_table=snp137&hgta_doSchema=describe+table+schema"
+ descriptionUrl: "http://rohsdb.usc.edu/GBshape/cgi-bin/hgTables?db=hg19&hgta_group=varRep&hgta_track=snp137&hgta_table=snp137&hgta_doSchema=describe+table+schema"

Node: dcid:GeneticVariantGeneAssociation
name: "GeneticVariantGeneAssociation"
typeOf: schema:Class
subClassOf: dcs:GeneticAssociation
description: "An association between a genetic variant and a gene in a given tissue. This is determined by performing a regression analysis on paired genome sequencing and RNA-sequencing across a population."
- sameAs: "https://storage.googleapis.com/gtex_analysis_v6p/single_tissue_eqtl_data/README_eQTL_v6p.txt"
+ descriptionUrl: "https://storage.googleapis.com/gtex_analysis_v6p/single_tissue_eqtl_data/README_eQTL_v6p.txt"

Node: Position
typeOf: dcs:UnitOfMeasure
@@ -123,14 +123,14 @@ Node: dcid:genBankAccession
name: "genBankAssemblyAccession"
typeOf: schema:Property
rangeIncludes: schema:Text
- domainIncludes: dcs:GenomeAssembly,dcs:GenomeAssemblyUnit,dcs:Chromosome
+ domainIncludes: dcs:BiologicalElement
description: "The accession version of the GenBank assembly or sequence element."

Node: dcid:refSeqAccession
name: "refSeqAssemblyAccession"
typeOf: schema:Property
rangeIncludes: schema:Text
- domainIncludes: dcs:GenomeAssembly,dcs:GenomeAssemblyUnit,dcs:Chromosome
+ domainIncludes: dcs:BiologicalElement
description: "The accession version of the RefSeq assembly or sequence element."

Node: dcid:ncbiBioProject
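
The hunks above replace `sameAs` with `descriptionUrl` and widen `domainIncludes` to `dcs:BiologicalElement`. A change like this can be spot-checked by scanning the MCF for nodes that still carry the old property. The following is a minimal sketch, not part of this commit: it assumes nodes are separated by blank lines and that each line is a single `key: value` pair, which matches the excerpt here but is not a full MCF parser. The helper names `parse_mcf` and `nodes_with_property` are hypothetical.

```python
def parse_mcf(text):
    """Split MCF text into nodes, one dict of property -> value per block.

    Assumes blank-line-separated blocks and one `key: value` per line.
    A property repeated within a node keeps only its last value.
    """
    nodes = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current node, if any.
            if current:
                nodes.append(current)
                current = {}
            continue
        # Split on the first colon only, so values such as
        # `dcid:Gene` or URLs keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        nodes.append(current)
    return nodes


def nodes_with_property(nodes, prop):
    """Return the `Node` ids of all nodes that define `prop`."""
    return [node.get("Node") for node in nodes if prop in node]


SAMPLE = """
Node: dcid:Gene
typeOf: schema:Class
descriptionUrl: "https://bioportal.bioontology.org/ontologies/OGG"

Node: dcid:genBankAccession
typeOf: schema:Property
domainIncludes: dcs:BiologicalElement
"""

nodes = parse_mcf(SAMPLE)
leftover = nodes_with_property(nodes, "sameAs")
```

Here `leftover` is empty because no node in the sample still uses `sameAs`; running the same check over the full schema file would list any nodes the rename missed.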