
GENE : TPS

Sagar Jadhav edited this page Aug 2, 2021 · 69 revisions

GENE DICTIONARY:

  1. Vasant and Giulia started the gene dictionary.

  2. eo_Gene with classification in excel

  3. This eo_Gene dictionary contains 11951 TPS genes.

    TOTAL GENES: 11951


# Creation of eo_Gene dictionary

  1. I got 7572 UniProtKB AC/IDs from the following links.

    UNIPROTKB AC/IDs:Terpene synthase

    UNIPROTKB AC/IDs:Terpene synthase C

    The two lists above were then combined.

    UNIPROTKB AC/ID retrieval for TPS

  2. I also searched UniProt for "terpene synthase AND reviewed:yes" and found 945 new TPS genes.

    uniprot terpene synthase

    After removing duplicates (between points 1 and 2) and entries annotated as "deleted", I have a total of 8409 TPS genes.

  3. Gene names, synonyms, organism names (and IDs), protein information, Enzyme Commission numbers and Gene Ontology terms (molecular information) were retrieved from UniProt (https://www.uniprot.org/uploadlists/).

    uniprot

    uniprot 8409

  4. Gene identifiers such as AT5G23960 (TPS21 in Arabidopsis) are used in the literature (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268506/table/t02_01/).

  5. So, I mapped primary and secondary identifiers from Phytomine (https://phytozome.jgi.doe.gov/phytomine/template.do?name=Proteins%20with%20Two%20PFAM%20Domains&scope=all).

  6. I found 7 new plant species in (https://digitalcommons.wustl.edu/cgi/viewcontent.cgi?filename=5&article=9240&context=open_access_pubs&type=additional).

  7. From these 7 species and the PFAM domain information, I retrieved gene identifiers for 510 new TPS genes.

    510 TPS

  8. I collected 3032 new TPS genes from the following sources:

    3032 TPS

    Phytomine https://phytozome.jgi.doe.gov/phytomine/begin.do

    http://www.nipgr.ac.in/terzyme.html

    http://radish.kazusa.or.jp/cgi-bin/keyword.cgi

    www.rosaceae.org

    www.solgenomics.net

    www.citrusgenomedb.org

    www.pulsedb.org

    https://viggs.dna.affrc.go.jp/

    www.cucurbitgenomics.org

    www.banana-genome-hub.southgreen.fr

    www.morus.swu.edu.cn

  9. Finally, the dictionary now contains 11951 TPS genes (8409 + 510 + 3032).

  10. TPS GENE CLASSIFICATION

Monoterpene synthase: 1062

Sesquiterpene synthase: 2273

Diterpene synthase: 681

Prenyl transferase: 179

Triterpenoid synthase: 94

Uncharacterized: 3237

Total above: 7526

Total TPS Genes: 11951

Terpene synthase domain-containing protein, unannotated genome, species-specific terpene: 4425

TPS Classification
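The bookkeeping behind the counts above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: `merge_accessions` is a hypothetical helper that merges accession lists while dropping duplicates and "deleted" entries, and it assumes accessions can be compared case-insensitively. The toy accessions (`ACC001` and so on) are placeholders, not real UniProt entries. The numbers in the consistency check are the ones reported on this page.

```python
def merge_accessions(*id_lists, deleted=()):
    """Merge accession lists, keeping first-seen order.

    Duplicates and accessions listed in `deleted` are dropped.
    Assumption: accessions are compared case-insensitively after trimming.
    """
    deleted_keys = {acc.strip().upper() for acc in deleted}
    seen = set()
    merged = []
    for ids in id_lists:
        for acc in ids:
            key = acc.strip().upper()
            if not key or key in seen or key in deleted_keys:
                continue
            seen.add(key)
            merged.append(key)
    return merged


# Counts reported on this page.
SOURCE_COUNTS = {
    "UniProt": 8409,
    "Phytomine, 7 new species": 510,
    "other databases": 3032,
}
CLASS_COUNTS = {
    "Monoterpene synthase": 1062,
    "Sesquiterpene synthase": 2273,
    "Diterpene synthase": 681,
    "Prenyl transferase": 179,
    "Triterpenoid synthase": 94,
    "Uncharacterized": 3237,
}
OTHER_COUNT = 4425  # domain-containing / unannotated / species-specific group
TOTAL_GENES = 11951


def totals_are_consistent():
    """Check that both breakdowns add up to the reported total."""
    by_source = sum(SOURCE_COUNTS.values())
    by_class = sum(CLASS_COUNTS.values()) + OTHER_COUNT
    return by_source == TOTAL_GENES and by_class == TOTAL_GENES


if __name__ == "__main__":
    # Placeholder accessions, for illustration only.
    toy = merge_accessions(
        ["ACC001", "ACC002"],
        ["acc002", "ACC003", "ACC004"],
        deleted=["ACC004"],
    )
    print(toy)
    print(totals_are_consistent())
```

With the figures on this page, both the per-source sum (8409 + 510 + 3032) and the per-class sum (7526 + 4425) equal 11951.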

  1. TPS Gene distribution in different species

    all taxons

    monocots

    Dicots

    Dicots

    Gymnosperms and others

    gene density monocot

  2. Creating eo_Gene dictionary and minicorpus

    A) I created a txt file containing a list of all gene names.

    B) I created a txt file containing a list of all species names and terms such as TPS, TPS1, TPS2 and so on.

    amidict -v --dictionary eo_Gene --directory gene --input genee.txt create --informat list --outformats xml

    Please find the dictionary here:

    Gene1

    eogene

    getpapers -q "(terpene synthase)" -o corporaTPS -x -p -k 500 -f corporaTPS/log.txt

    This downloaded 500 papers.

    getpapers -q "(terpene synthase) AND (characterisation) AND (characterization)" -o corpusTPS -x -p -k 500 -f corporaTPS/log.txt

    This downloaded around 35 papers.

  3. Testing eo_Gene dictionary

    ami -p "corporaTPS" section

    ami -p "corporaTPS" search --dictionary eo_Gene.xml

  4. eo_Gene1

    Gene

    eo_Gene

    Gene

  5. Difficulties:

    a) Some papers mention TPS in Vitis vinifera as VvTPS and VviTPS. Some use TPS1 or TPS01.

    b) Gene names are in tables, figures or supplementary files.
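One possible way to handle difficulty (a) is to map name variants onto a single canonical form before matching them against the dictionary. The sketch below is an assumption about how that could be done, not part of the ami or amidict tooling. `normalize_tps_name`, `group_variants` and the `PREFIX_ALIASES` table are hypothetical, and the only alias encoded is the Vv/Vvi pair mentioned above.

```python
import re
from collections import defaultdict

# Optional species prefix, the literal "TPS", an optional separator and an
# optional number, e.g. "VviTPS01" or "TPS-1".
_TPS_NAME = re.compile(r"^(?P<prefix>[A-Za-z]*?)TPS[-_ ]?(?P<num>\d+)?$")

# Assumption: these prefixes refer to the same species (Vitis vinifera).
PREFIX_ALIASES = {"vvi": "Vv"}


def normalize_tps_name(name):
    """Return a canonical spelling for a TPS gene name.

    Names that do not look like TPS names are returned trimmed but
    otherwise unchanged.
    """
    cleaned = name.strip()
    match = _TPS_NAME.match(cleaned)
    if match is None:
        return cleaned
    prefix = match.group("prefix")
    prefix = PREFIX_ALIASES.get(prefix.lower(), prefix)
    num = match.group("num")
    number = str(int(num)) if num else ""  # "01" -> "1"
    return f"{prefix}TPS{number}"


def group_variants(names):
    """Group the spellings seen in papers by their canonical form."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_tps_name(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    print(group_variants(["VvTPS1", "VviTPS01", "TPS1", "TPS01", "AT5G23960"]))
```

Here `VvTPS1` and `VviTPS01` end up in one group and `TPS1` and `TPS01` in another, while an identifier such as `AT5G23960` passes through unchanged. This does not address difficulty (b), since names in tables, figures or supplementary files never reach the text search in the first place.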


Creating TPS gold standard:

  1. I ran several searches on https://europepmc.org/ and got the following results:

    Query                                            Number of hits
    terpene synthase                                 4308
    terpene synthase plant                           3447
    terpene synthase plant volatile                  1200
    terpene synthase plant TPS                       650
    terpene synthase TPS plant volatile              376
    terpene synthase TPS plant volatile compounds    355 (research articles: 312)
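Hit counts like these could also be collected programmatically. The sketch below assumes the Europe PMC RESTful search endpoint and its `hitCount` response field. The searches above were run on the website, so counts returned by the API may differ, and they change as new papers are indexed. The helper names (`build_search_url`, `hit_count_from_response`, `fetch_hit_count`) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public Europe PMC REST search endpoint.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

QUERIES = [
    "terpene synthase",
    "terpene synthase plant",
    "terpene synthase plant volatile",
    "terpene synthase plant TPS",
    "terpene synthase TPS plant volatile",
    "terpene synthase TPS plant volatile compounds",
]


def build_search_url(query):
    """Build a search URL that asks for JSON and a single-result page."""
    params = {
        "query": query,
        "format": "json",
        "resultType": "idlist",
        "pageSize": 1,  # only the hit count is needed, not the records
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def hit_count_from_response(payload):
    """Extract the total number of hits from a decoded JSON response."""
    return int(payload["hitCount"])


def fetch_hit_count(query, timeout=30):
    """Query Europe PMC and return the number of hits (needs network access)."""
    with urlopen(build_search_url(query), timeout=timeout) as response:
        return hit_count_from_response(json.load(response))
```

Calling `fetch_hit_count(q)` for each entry in `QUERIES` would give a table comparable to the one above. The "research articles" filter applied on the website is not reproduced here.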