Progress Summary : Utkarsha Pande

Maintained by : Utkarsha Pande

System specifications :

MacBook Air (13-inch, 2017)
Processor: 1.8 GHz Dual-Core Intel Core i5
Software: macOS Big Sur version 11.0.1

Date : 08/09/2021

`pygetpapers`

pygetpapers is a Python package, developed from getpapers, that downloads full-text scientific articles from open access repositories through their APIs. The tool is used through a command-line interface (terminal).

To know more details on pygetpapers: https://github.com/petermr/pygetpapers

Installing `pygetpapers`

Make sure you have pip and python installed.
After installing Python, go to the terminal and run the command pip install pygetpapers.
Check if your installation was successful by typing pygetpapers --help, which should print the pygetpapers help message on the terminal.

Usage of pygetpapers

Enter your query on the terminal, replacing plant_name with the plant of interest : pygetpapers -q "terpene synthase volatile plant_name" -o plant_nameTPS -x -p -s
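
When several plants need to be searched, the same command can be assembled from a script. The sketch below only builds the argument list shown above; the helper name `build_pygetpapers_command` is hypothetical, and running the resulting command assumes pygetpapers is installed and on the PATH.

```python
import shlex


def build_pygetpapers_command(plant_name):
    """Return the pygetpapers argument list for one plant (hypothetical helper)."""
    query = f"terpene synthase volatile {plant_name}"
    output_dir = f"{plant_name}TPS"
    # Same flags as in the command above: -q query, -o output directory, then -x -p -s.
    return ["pygetpapers", "-q", query, "-o", output_dir, "-x", "-p", "-s"]


command = build_pygetpapers_command("Citrus")
# Print the command as it would be typed on the terminal.
print(shlex.join(command))
```

To actually run it, the list can be passed to `subprocess.run(command, check=True)`.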

`ami3`

ami3 is a toolkit to manage (scholarly) documents. It divides a document (such as a research paper) into sections like abstract, materials and methods, results, ethical statements and acknowledgements, which can then be used for text search and classification. ami3 is written in Java and is designed to be a declarative system, with commands and data modules.

To know more about ami3 and its working: https://github.com/petermr/ami3

Date - 09/09/2021

Installing `ami3`

Install Java

Install the latest version of Java on your system from https://www.java.com/en/download/
Test if the installation was successful by typing java -version on command line.

Install JDK

Install the latest JDK on your system from https://www.oracle.com/java/technologies/javase-jdk16-downloads.html
Check JDK version by typing ls /Library/Java/JavaVirtualMachines on terminal.

Install git

Install git for your system from https://git-scm.com/downloads
Check if git is successfully installed by typing git --version

Install Maven

Install maven for your system from https://maven.apache.org/download.cgi. I installed apache-maven-3.8.2-bin.zip from this link.
Unzip the apache-maven-3.8.2-bin.zip file in the downloads folder using unzip apache-maven-3.8.2-bin.zip.
Move the apache-maven-3.8.2 folder to /Applications by :

pwd
cd downloads/
mv apache-maven-3.8.2 /Applications/
cd /Applications
ls

Close the terminal and open it again and set the path for mvn installation.
For setting path :

-> Open bash profile by open -e .bash_profile (in my case I did it in the home directory); this will open a text edit file.

-> If the bash profile is not present in the directory, create it using touch .bash_profile and then open the bash profile using the above command.

-> Press enter to go to new line and edit the text file by pasting this in the file :

export M2_HOME=/Applications/apache-maven-3.8.2
export PATH=$PATH:$M2_HOME/bin

-> Save the file and close it.

-> Type source .bash_profile on the command line.

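-> No separate install command is needed for Maven (mvn -install is not a valid option); once the path is set, the mvn command is available.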

You can also follow this youtube link for maven installation : https://youtu.be/j0OnSAP-KtU

Check the maven installation by typing mvn -version on the command line; it should show apache-maven-3.8.2.
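
Before moving on to ami3, it can help to confirm that every prerequisite is reachable from the PATH in one go. This is a minimal sketch using only the Python standard library; the helper name `find_tools` is hypothetical, and the check only looks for the commands on the PATH without verifying their versions.

```python
import shutil


def find_tools(names):
    """Map each command name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


# The three prerequisites installed in the steps above.
for tool, path in find_tools(["java", "git", "mvn"]).items():
    print(f"{tool}: {path if path else 'not found on PATH'}")
```

If mvn is reported as not found, the export lines in .bash_profile are the first thing to re-check.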

Date - 10/09/2021

ami3

Close and open the terminal and git clone the ami3 repository on the terminal

git clone https://github.com/petermr/ami3.git

Drag and drop the ami3 folder from the home directory to the /Applications directory.
Set the path for ami3: open bash profile by open -e .bash_profile and add the following lines :

export A2_HOME=/Applications/ami3/target/appassembler
export PATH=$PATH:$A2_HOME/bin

Save and close the bash profile.
On the terminal change directory :

cd /Applications
cd ami3

Run the following command; the installation will take some time.

mvn install -Dmaven.test.skip=true

Check ami3 installation by typing ami --help. It should open the help message.

Usage of ami3

Go through each paper manually to scan for TPS (terpene synthase) genes related to a crop.
Collect gene names such as CsTPS, MonoTPS and so on. Put those terms into an Excel file as a list, save it as a gene.txt file, and keep it in the same directory as ami3.
Enter your query on the terminal : amidict -v --dictionary eo_Gene --directory gene --input gene.txt create --informat list --outformats xml
This generates an XML file containing the eo_Gene dictionary.
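
The gene list can also be written directly from a script instead of going through Excel. The sketch below assumes the list format is plain text with one term per line; the helper name `write_gene_list` is hypothetical, and the example writes to a temporary folder rather than the ami3 directory.

```python
import tempfile
from pathlib import Path


def write_gene_list(terms, path="gene.txt"):
    """Write unique, non-empty terms to a text file, one per line, in first-seen order."""
    cleaned = [term.strip() for term in terms]
    unique_terms = list(dict.fromkeys(term for term in cleaned if term))
    Path(path).write_text("\n".join(unique_terms) + "\n", encoding="utf-8")
    return unique_terms


# Example terms from the text above; a real list comes from reading the papers.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "gene.txt"
    terms = write_gene_list(["CsTPS", "MonoTPS", "CsTPS", " "], target)
    print(target.read_text(encoding="utf-8"))
```

Removing duplicates and blank entries before creating the dictionary keeps repeated terms out of the input file.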

Date - 13/09/2021

Ran metadata analysis on Tomato corpus and generated CSV file for 'result' and 'abstract' section.

Running Metadata analysis script developed by Shweata Hegde

To know further: Metadata analysis Wikipage

Install Visual Studio on the system and add the following extensions: python, prettiercode, excel-to-csv.

pip install pandas 
pip install yake
pip install scispacy
pip install bs4
pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_ner_bionlp13cg_md-0.4.0.tar.gz

Copy and paste the metadata script into a new Python file in Visual Studio, Metadata_analysis.py.
Uncomment the line #querying_pygetpapers_sectioning("('terpene synthase') AND ('volatile') AND ('Citrus') AND (((SRC:MED OR SRC:PMC OR SRC:AGR OR SRC:CBA) NOT (PUB_TYPE:'Review')))",'200',CPROJECT) and edit it for your crop of interest.
Run the code in the same environment in which all the requirements stated above have been pip-installed.
This generates a tps_corpus and a CSV file for the crop containing all the tps gene names and ids from a particular section.
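
Editing the query by hand for each crop is easy to get wrong because of the nested brackets. A small helper can build the same query string for any crop; the name `build_crop_query` is hypothetical and is not part of the metadata analysis script.

```python
def build_crop_query(crop):
    """Return the query string from the script with the crop name swapped in."""
    return (
        "('terpene synthase') AND ('volatile') "
        f"AND ('{crop}') "
        "AND (((SRC:MED OR SRC:PMC OR SRC:AGR OR SRC:CBA) NOT (PUB_TYPE:'Review')))"
    )


# Only the crop term changes between runs; the source and review filters stay the same.
print(build_crop_query("Tomato"))
```

The returned string would then take the place of the hard-coded Citrus query in the uncommented line.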

Date - 17/09/2021 - 25/09/2021

Vitis dictionary by Ragheshwari - eo_Gene.xml
Helped Ragheshwari with metadata analysis script for Vitis vinifera - Vitis metadata script CSV file

Problems faced :

ImportError: cannot import name Deque. Tried solving it with pip install websockets, but Deque is not available in Python versions 3.6.0 and below.
Updated the Python version to 3.6.7 on Ragheshwari's system.
Re-ran the metadata analysis script for Vitis and generated a CSV file with TPS names in the 'result' section - Vitis metadata script CSV file
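
A quick way to tell whether an interpreter is affected is to try the import directly. This sketch assumes the failing name is `Deque` from the standard `typing` module; the helper name `deque_alias_available` is hypothetical.

```python
import sys


def deque_alias_available():
    """Return True if typing.Deque can be imported on this interpreter."""
    try:
        from typing import Deque  # noqa: F401
    except ImportError:
        return False
    return True


print("Python version:", ".".join(map(str, sys.version_info[:3])))
print("typing.Deque available:", deque_alias_available())
```

If the second line prints False, upgrading Python (as was done above) is the fix rather than installing additional packages.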

Date - 12/10/2021

Validation Software

Pyami code for the dictionary development software: test_pyamidict_tdd.py

Clone pyami repo using git clone https://github.com/petermr/pyami.git
Set the path: run open -e .bash_profile, which opens the text editor, and paste the lines below. Save and close the text editor.

export P2_HOME=/Users/sagar/pyami
export PATH=$PATH:$P2_HOME/py4ami

Re-open the terminal and go to the test folder in pyami by cd utkarsha/pyami/test.
Install pytest by pip install pytest.
Run pytest test_pyamidict_tdd.py in the test folder.
If it shows ImportError: cannot import name 'AMIDict' from 'py4ami.dict_lib', then open the test_pyamidict_tdd.py code in any code editor and add .. at lines 14 and 15 before 'py4ami'. So the lines 14 and 15 should be :

from ..py4ami.xml_lib import XmlLib
from ..py4ami.dict_lib import AMIDict, AMIDictError, Synonym, Entry

Run pytest test_pyamidict_tdd.py in the test folder again. All 58 tests ran and passed.

Date - 12/10/2021

Testing Dictionaries

Pyami code for testing dictionaries pyami/test/test_cevopen_tps_dictionaries.py.

Go to test folder in pyami by cd utkarsha/pyami/test.
Install pytest by pip install pytest if not installed.
Open the pyami/test/test_cevopen_tps_dictionaries.py code in any code editor and add .. at line 1 before 'py4ami'. So the line 1 should be :

from ..py4ami.dict_lib import AMIDict

Run pytest test_cevopen_tps_dictionaries.py in the test folder. It ran 4 dictionary tests: 3 passed and 1 failed with the error below.

pyami.py4ami.dict_lib.AMIDictError: Failed to read URL: https://raw.githubusercontent.com/petermr/crops/main/Vitis%20vinifera/eo_Gene.xml; reason = Not Found

../py4ami/dict_lib.py:496: AMIDictError
FAILED test_cevopen_tps_dictionaries.py::test_vitis_is_valid - pyami.py4ami.dict_lib.AMIDictError: Failed to read URL: https://raw.githubusercontent.com/petermr/crops/main/Vitis%20vinifera/eo_Gene.xml;...
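
The "Not Found" reason means no file was served at that URL when the test ran, so the folder and file name in the crops repository are worth checking. The sketch below shows how a folder name containing a space turns into the `%20` form seen in the error; the helper name `dictionary_url` and the constant `RAW_BASE` are hypothetical, and the base URL is copied from the error message.

```python
from urllib.parse import quote

# Base of the raw-file URLs, taken from the error message above.
RAW_BASE = "https://raw.githubusercontent.com/petermr/crops/main"


def dictionary_url(crop_folder, filename="eo_Gene.xml"):
    """Build the raw URL for a crop dictionary, percent-encoding the folder name."""
    return f"{RAW_BASE}/{quote(crop_folder)}/{filename}"


print(dictionary_url("Vitis vinifera"))
```

Opening the printed URL in a browser shows whether the dictionary exists under exactly that folder name and file name.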

Project Ideas

Title : Analysis of Semantic Terpene Synthase Dictionaries

Demonstrating the working of all the intern dictionaries (including the previous interns)
Integration of all the dictionaries to enzyme name dictionary to create a knowledgebase for terpene synthases.

Adding EC numbers to the enzyme_name dictionary and crops dictionaries (Karya interns, Sagar, me)
Adding wikidata IDs to all the dictionaries (Karya interns, Sagar, me)
Calculating co-occurrences of enzyme names in each of the crop dictionaries (TBD)
Comparative analysis of tps_enzymes between each of the crops (TBD)
Entity-relationship modeling using the full datatables to relate tps to other entities like volatile compound emission (TBD)

Slides - https://docs.google.com/presentation/d/1D4EXVkYUjqQtr_56dt_ORE215fX_U5koQUpskCd45qQ/edit?usp=sharing

Work done:

Created dictionaries using wikidata ids for camelia and tomato tps
Generated XML and csv dictionaries. camelia_wikidata.csv , tomato_wikidata.csv , camelia_sparql, tomato_sparql
Converted CSV dictionaries to pandas Dataframes.
Removed duplicates to obtain a list of the tps enzymes for camelia and tomato.
Identified the tps enzymes common to both crops using camelia_tps[camelia_tps['itemLabel'].isin(tomato_tps['itemLabel'])].
Plotted a correlation matrix between enzyme names and crop tps enzymes (the attached images are generated from incomplete dictionaries).

Caption - common tps in tomato and camelia

Caption - Correlation plot for camelia and tomato tps
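
The duplicate-removal and common-enzyme steps listed under "Work done" can be illustrated without pandas. This is a plain-Python sketch of the same idea, not the code that produced the plots; the rows are made up for illustration, the `item` column is an assumption, and only the `itemLabel` column name comes from the expression above.

```python
import csv
import io


def unique_labels(csv_text, column="itemLabel"):
    """Return the distinct values of one CSV column, in first-seen order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(dict.fromkeys(row[column] for row in reader))


# Made-up rows for illustration only; they are not taken from the real dictionaries.
camelia_csv = "item,itemLabel\nQ1,enzyme A\nQ2,enzyme B\nQ1,enzyme A\n"
tomato_csv = "item,itemLabel\nQ2,enzyme B\nQ3,enzyme C\n"

camelia_tps = unique_labels(camelia_csv)
tomato_tps = unique_labels(tomato_csv)

# Keep the camelia labels that also occur in the tomato list.
tomato_set = set(tomato_tps)
common_tps = [label for label in camelia_tps if label in tomato_set]
print(common_tps)
```

With the real CSV files, the text would be read from camelia_wikidata.csv and tomato_wikidata.csv instead of the inline strings.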
