GENE : TPS

Sagar Jadhav edited this page Aug 5, 2021 · 69 revisions

GENE DICTIONARY:

INTRODUCTION

  1. Vasant and Giulia started the gene dictionary.

  2. eo_Gene with classification in Excel

  3. This eo_Gene dictionary contains 11951 TPS genes.

    TOTAL GENES: 11951


How I created the eo_Gene dictionary

  1. I got 7572 UNIPROTKB AC/IDs from the following links.

    UNIPROTKB AC/IDs:Terpene synthase

    UNIPROTKB AC/IDs:Terpene synthase C

    Please find both of the above lists combined here:

    UNIPROTKB AC/ID retrieval for TPS

  2. I also searched UniProt for "terpene synthase AND reviewed:yes" and found 945 new TPS genes.

    uniprot terpene synthase

    After removing duplicates (between points 1 and 2) and entries with a "deleted" annotation, I have a total of 8409 TPS genes.

  3. Gene names, synonyms, organism names (and IDs), protein information, Enzyme Commission numbers and Gene Ontology terms (molecular information) were retrieved from UniProt (https://www.uniprot.org/uploadlists/).

    uniprot

    uniprot 8409

  4. Gene identifier IDs such as AT5G23960 for Arabidopsis (TPS21) are used in the literature. (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268506/table/t02_01/)

  5. So, I mapped primary and secondary identifiers from Phytomine. (https://phytozome.jgi.doe.gov/phytomine/template.do?name=Proteins%20with%20Two%20PFAM%20Domains&scope=all).

  6. Found 7 new plant species from (https://digitalcommons.wustl.edu/cgi/viewcontent.cgi?filename=5&article=9240&context=open_access_pubs&type=additional)

  7. From these 7 species and PFAM domain information, I retrieved Gene Identifier IDs for 510 new TPS genes.

    510 TPS

  8. Collected 3032 new TPS genes from the following sources:

    3032 TPS

    Phytomine https://phytozome.jgi.doe.gov/phytomine/begin.do

    http://www.nipgr.ac.in/terzyme.html

    http://radish.kazusa.or.jp/cgi-bin/keyword.cgi

    www.rosaceae.org

    www.solgenomics.net

    www.citrusgenomedb.org

    www.pulsedb.org

    https://viggs.dna.affrc.go.jp/

    www.cucurbitgenomics.org

    www.banana-genome-hub.southgreen.fr

    www.morus.swu.edu.cn

  9. Finally, the dictionary now contains 11951 TPS genes (8409+510+3032).
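The merge in steps 1 and 2 (combine the accession lists, drop duplicates, drop entries annotated as "deleted") can be sketched as follows. This is only an illustration under stated assumptions: the function name `merge_accessions` and the toy accession strings are hypothetical and are not the real UniProt IDs or the script actually used. The last line re-checks the total from step 9.

```python
def merge_accessions(*sources, deleted=frozenset()):
    """Merge accession lists in first-seen order.

    Skips blank entries, duplicates, and anything listed in `deleted`.
    """
    seen = set()
    merged = []
    for source in sources:
        for accession in source:
            accession = accession.strip()
            if not accession or accession in seen or accession in deleted:
                continue
            seen.add(accession)
            merged.append(accession)
    return merged


# Hypothetical toy inputs, NOT real entries from the dictionary.
pfam_list = ["ACC0001", "ACC0002", "ACC0001"]      # stands in for the list from point 1
reviewed_list = ["ACC0002", "ACC0003", "ACC0004"]  # stands in for the list from point 2
deleted_entries = {"ACC0004"}                      # entries annotated as "deleted"

merged = merge_accessions(pfam_list, reviewed_list, deleted=deleted_entries)

# Totals reported in the steps above: UniProt + 7 new species + other databases.
final_total = 8409 + 510 + 3032
```

With the toy inputs, `merged` keeps one copy of each accession that is not in `deleted_entries`, and `final_total` matches the 11951 genes stated in step 9.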



  1. TPS GENE CLASSIFICATION

Monoterpene synthase: 1062

Sesquiterpene synthase: 2273

Diterpene synthase: 681

Prenyl transferase: 179

Triterpenoid synthase: 94

Uncharacterized: 3237

Total above: 7526

Total TPS Genes: 11951

Terpene synthase domain containing protein, Unannotated genome, Species specific terpene: 4425

TPS Classification
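The counts above can be cross-checked with a few lines of Python. This is a small sketch, not part of the original workflow. The variable names are illustrative, and the percentage column is an added convenience computed from the same numbers.

```python
# Counts copied from the classification list above.
classified = {
    "Monoterpene synthase": 1062,
    "Sesquiterpene synthase": 2273,
    "Diterpene synthase": 681,
    "Prenyl transferase": 179,
    "Triterpenoid synthase": 94,
    "Uncharacterized": 3237,
}
# Domain-containing proteins, unannotated genomes, species-specific entries.
other_entries = 4425
total_genes = 11951

subtotal = sum(classified.values())

# The six classes give the "Total above" figure, and adding the
# remaining entries gives the dictionary total.
assert subtotal == 7526
assert subtotal + other_entries == total_genes

# Share of each class among all dictionary entries, in percent.
shares = {name: round(100 * count / total_genes, 1)
          for name, count in classified.items()}

for name, count in classified.items():
    print(f"{name}: {count} ({shares[name]}%)")
```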



  1. TPS Gene distribution in different species

    all taxons

    monocots

    Dicots

    Dicots

    Gymnosperms and others

    gene density monocot


  1. Creating eo_Gene dictionary and minicorpus:

    I used the following two approaches:

    A) I created a txt file containing a list of all gene names.

    B) I created a txt file containing a list of all species names and terms such as TPS, TPS1, TPS2 and so on.

    I used the following command to create the dictionary:

    amidict -v --dictionary eo_Gene --directory gene --input genee.txt create --informat list --outformats xml

    Please find the dictionary here:

    Gene1

    eogene

    getpapers -q "(terpene synthase)" -o corporaTPS -x -p -k 500 -f corporaTPS/log.txt

    This downloaded 500 papers.

    getpapers -q "(terpene synthase) AND (characterisation) AND (characterization)" -o corpusTPS -x -p -k 500 -f corporaTPS/log.txt

    This downloaded around 35 papers.

  2. Testing the above eo_Gene dictionaries:

    ami -p "corporaTPS" section

    ami -p "corporaTPS" search --dictionary eo_Gene.xml

  3. eo_Gene1

    Gene1

    eo_gene

    gene

  4. Difficulties:

    a) Some papers mention TPS in Vitis vinifera as VvTPS and VviTPS. Some use TPS1 or TPS01.

    b) Gene names are in tables, figures or supplementary files.
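One possible way to handle the naming variation in difficulty a) is to normalize names before comparing them with dictionary entries. The sketch below is only an illustration: the alias table `PREFIX_TO_SPECIES`, the regular expression and the function `normalize_tps` are assumptions made for this example and are not part of ami or the eo_Gene dictionary. It treats VvTPS01 and VviTPS1 as the same gene by mapping both prefixes to one species and by ignoring leading zeros.

```python
import re

# Hypothetical alias table: species prefixes seen in papers -> species name.
PREFIX_TO_SPECIES = {
    "Vv": "Vitis vinifera",
    "Vvi": "Vitis vinifera",
}

# Optional species prefix (one capital plus up to three lowercase letters),
# then "TPS", then an optional number such as 1 or 01.
TPS_PATTERN = re.compile(r"^(?P<prefix>[A-Z][a-z]{0,3})?TPS(?P<number>\d+)?$")


def normalize_tps(name):
    """Return (species, number) for a TPS-style name, or None if it does not match.

    species is None when there is no prefix or the prefix is not in the alias table.
    number is None when the name has no numeric suffix.
    """
    match = TPS_PATTERN.match(name.strip())
    if match is None:
        return None
    prefix = match.group("prefix")
    number = match.group("number")
    species = PREFIX_TO_SPECIES.get(prefix) if prefix else None
    return (species, int(number) if number is not None else None)


for mention in ["VvTPS01", "VviTPS1", "TPS1", "TPS01", "TPS"]:
    print(mention, "->", normalize_tps(mention))
```

Names that only appear in tables, figures or supplementary files (difficulty b) are not addressed by this sketch, since they never reach the text that is searched.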



Creating TPS corpus:

  1. Date 2/8/2021

    I ran the following searches on https://europepmc.org/ and got these results:

    Query                                            Number of hits
    terpene synthase                                 4308
    terpene synthase plant                           3447
    terpene synthase plant volatile                  1200
    terpene synthase plant TPS                       650
    terpene synthase TPS plant volatile              376
    terpene synthase TPS plant volatile compounds    355 (Research articles 312)
  2. I continued work on the TPS goldstandard on 3/8/2021, 4/8/2021 and 5/8/2021.

    For the 312 papers, I checked the availability of PMCID, plant, compound and TPS nomenclature.

  3. Date 5/8/2021

    Please find the TPS Goldstandard (312 papers).
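The hit counts in step 1 were read from the Europe PMC website. A scripted version might look like the sketch below. It assumes the Europe PMC REST search service at the URL in `SEARCH_URL`, with `query`, `format` and `pageSize` parameters and a `hitCount` field in the JSON response. The function names are illustrative. Counts change over time and may differ from the website figures recorded on 2/8/2021.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Europe PMC REST search service.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

# The six searches listed in step 1.
QUERIES = [
    "terpene synthase",
    "terpene synthase plant",
    "terpene synthase plant volatile",
    "terpene synthase plant TPS",
    "terpene synthase TPS plant volatile",
    "terpene synthase TPS plant volatile compounds",
]


def build_search_url(query):
    """Build a search URL that asks for JSON and only one result per page."""
    params = {"query": query, "format": "json", "pageSize": 1}
    return SEARCH_URL + "?" + urlencode(params)


def hit_count(payload):
    """Read the total number of hits from a decoded JSON response."""
    return int(payload["hitCount"])


def fetch_hit_count(query, timeout=30):
    """Run one search over the network and return its hit count."""
    with urlopen(build_search_url(query), timeout=timeout) as response:
        return hit_count(json.load(response))
```

To reproduce the table, one could loop over `QUERIES` and print `fetch_hit_count(query)` for each entry. Nothing is fetched until `fetch_hit_count` is called.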
