
GENE : TPS

Sagar Jadhav edited this page Aug 2, 2021 · 69 revisions

GENE DICTIONARY:

  1. Vasant and Giulia started the gene dictionary.

  2. eo_Gene in Excel

  3. The eo_Gene dictionary contains 11951 TPS genes.

    TOTAL GENES: 11951

Creation of eo_Gene dictionary:

  1. I retrieved 7572 UniProtKB AC/IDs from the following links.

    UNIPROTKB AC/IDs:Terpene synthase

    UNIPROTKB AC/IDs:Terpene synthase C

    Both lists combined:

    UNIPROTKB AC/ID retrieval for TPS

  2. I also searched UniProt for "terpene synthase AND reviewed:yes" and found 945 new TPS genes.

    uniprot terpene synthase

    After removing duplicates (between points 1 and 2) and entries annotated as "deleted", I have a total of 8409 TPS genes.

  3. Gene names, synonyms, organism names (and IDs), protein information, Enzyme Commission numbers and Gene Ontology (molecular information) terms were retrieved from UniProt (https://www.uniprot.org/uploadlists/).

    uniprot

    uniprot 8409

  4. Gene identifiers such as AT5G23960 (TPS21 in Arabidopsis) are used in the literature (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268506/table/t02_01/).

  5. I therefore mapped primary and secondary identifiers from Phytomine (https://phytozome.jgi.doe.gov/phytomine/template.do?name=Proteins%20with%20Two%20PFAM%20Domains&scope=all) and retrieved gene identifiers for 1607 genes (around 18 species, including Arabidopsis).

  6. I found 7 new plant species in (https://digitalcommons.wustl.edu/cgi/viewcontent.cgi?filename=5&article=9240&context=open_access_pubs&type=additional).

  7. Using these 7 species and the PFAM domain information, I retrieved gene identifiers for 510 new TPS genes.

    510 TPS

  8. I collected 3032 new TPS genes from the following sources:

    3032 TPS

    Phytomine https://phytozome.jgi.doe.gov/phytomine/begin.do

    http://www.nipgr.ac.in/terzyme.html

    http://radish.kazusa.or.jp/cgi-bin/keyword.cgi

    www.rosaceae.org

    www.solgenomics.net

    www.citrusgenomedb.org

    www.pulsedb.org

    https://viggs.dna.affrc.go.jp/

    www.cucurbitgenomics.org

    www.banana-genome-hub.southgreen.fr

    www.morus.swu.edu.cn

  9. Finally, the dictionary now contains 11951 TPS genes (8409 + 510 + 3032).

  10. TPS GENE CLASSIFICATION

Monoterpene synthase: 1062

Sesquiterpene synthase: 2273

Diterpene synthase: 681

Prenyl transferase: 179

Triterpenoid synthase: 94

Uncharacterized: 3237

Total above: 7526

Terpene synthase domain-containing protein, unannotated genome, species-specific terpene: 4425

Total TPS genes: 11951 (7526 + 4425)
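The counts above can be cross-checked with a few lines of Python. This is only a sanity-check sketch: the numbers are copied from this page, and the variable names (`class_counts`, `other_count`, `source_total`) are made up for illustration.

```python
# Counts copied from the classification list on this page.
class_counts = {
    "Monoterpene synthase": 1062,
    "Sesquiterpene synthase": 2273,
    "Diterpene synthase": 681,
    "Prenyl transferase": 179,
    "Triterpenoid synthase": 94,
    "Uncharacterized": 3237,
}

# Domain-containing proteins, unannotated genomes and species-specific entries.
other_count = 4425

# Sum of the six named classes ("Total above").
classified_total = sum(class_counts.values())

# Named classes plus the remaining entries should give the dictionary size.
grand_total = classified_total + other_count

# Independent check from the three collection steps (8409 + 510 + 3032).
source_total = 8409 + 510 + 3032

print(classified_total)  # 7526
print(grand_total)       # 11951
print(source_total)      # 11951
```

Both routes to the total agree, so the classification covers every entry in the dictionary.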

TPS Classification
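Returning to steps 1 and 2 of the dictionary creation (combining the UniProtKB lists, then removing duplicates and deleted entries), the merge could be sketched roughly as below. This is not the procedure actually used for the dictionary, which is not recorded on this page. The function name `merge_accessions`, the upper-casing of IDs and the accession strings are all assumptions for illustration; the IDs are placeholders, not real UniProt entries.

```python
def merge_accessions(*id_lists, deleted=()):
    """Merge accession lists, dropping duplicates and deleted entries.

    The order of first appearance is preserved.
    """
    # Assumption: accessions are compared case-insensitively after trimming.
    deleted_set = {acc.strip().upper() for acc in deleted}
    seen = set()
    merged = []
    for id_list in id_lists:
        for acc in id_list:
            key = acc.strip().upper()
            # Skip blanks, repeats and entries flagged as deleted.
            if not key or key in seen or key in deleted_set:
                continue
            seen.add(key)
            merged.append(key)
    return merged


# Toy data: placeholder accession strings, not real UniProt IDs.
domain_hits = ["ACC_A", "ACC_B", "ACC_C"]
reviewed_hits = ["ACC_B", "ACC_D", "acc_e "]
deleted_entries = ["ACC_C"]

merged = merge_accessions(domain_hits, reviewed_hits, deleted=deleted_entries)
print(merged)  # ['ACC_A', 'ACC_B', 'ACC_D', 'ACC_E']
```

Keeping the first-seen order makes it easy to compare the merged list against the original exports by eye.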

  1. TPS Gene distribution in different species

    all taxons

    monocots

    Dicots

    Dicots

    Gymnosperms and others

    gene density monocot

  2. Creating eo_Gene dictionary and minicorpus

    A) I created a txt file containing a list of all gene names.

    B) I created a txt file containing a list of all species names and terms such as TPS, TPS1, TPS2 and so on.

    amidict -v --dictionary eo_Gene --directory gene --input genee.txt create --informat list --outformats xml

    Please find the dictionary here:

    Gene1

    eogene

    getpapers -q "(terpene synthase)" -o corporaTPS -x -p -k 500 -f corporaTPS/log.txt
    downloaded 500 papers

    getpapers -q "(terpene synthase) AND (characterisation) AND (characterization)" -o corpusTPS -x -p -k 500 -f corporaTPS/log.txt
    downloaded around 35 papers

  3. Testing eo_Gene dictionary

    ami -p "corporaTPS" section

    ami -p "corporaTPS" search --dictionary eo_Gene.xml

  4. eo_Gene1

    Gene

    eo_Gene

    Gene

  5. Difficulties:

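    a) Naming is inconsistent across papers: some refer to TPS in Vitis vinifera as VvTPS and others as VviTPS, and some number genes as TPS1 while others use TPS01.

    b) Gene names are often given in tables, figures or supplementary files rather than in the running text.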
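One possible way to deal with difficulty a) is to normalise name variants before matching. The sketch below is an assumption about how this could be done, not something the dictionary currently does. The regular expression, the `SPECIES_PREFIXES` table and the function `normalize_tps_name` are hypothetical; only the Vv/Vvi prefixes for Vitis vinifera come from the note above.

```python
import re

# Optional species prefix (e.g. "Vv", "Vvi"), then "TPS", an optional
# separator, and a number that may carry leading zeros ("TPS01").
TPS_NAME = re.compile(r"^(?P<prefix>[A-Za-z]*?)TPS[-_ ]?(?P<number>\d+)$")

# Hypothetical prefix table; only the Vitis vinifera prefixes are from this page.
SPECIES_PREFIXES = {
    "vv": "Vitis vinifera",
    "vvi": "Vitis vinifera",
}


def normalize_tps_name(name, default_species=None):
    """Return (species, 'TPS<number>') for a TPS-style name, else None."""
    match = TPS_NAME.match(name.strip())
    if match is None:
        return None
    prefix = match.group("prefix").lower()
    if prefix:
        # Unknown prefixes map to None rather than guessing a species.
        species = SPECIES_PREFIXES.get(prefix)
    else:
        # No prefix in the name: fall back to the species of the paper, if known.
        species = default_species
    # int() drops leading zeros, so "01" and "1" give the same key.
    return (species, f"TPS{int(match.group('number'))}")


print(normalize_tps_name("VvTPS01"))   # ('Vitis vinifera', 'TPS1')
print(normalize_tps_name("VviTPS1"))   # ('Vitis vinifera', 'TPS1')
print(normalize_tps_name("TPS01", default_species="Vitis vinifera"))
print(normalize_tps_name("terpene synthase"))  # None
```

Names that only appear in tables, figures or supplementary files (difficulty b) are not addressed by this sketch.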
