Meeting Records

Previous meeting records can be found here: https://github.com/petermr/openVirus/wiki/Records-and-Reports

Date: 17th Dec. 2020

Participants: PMR, Ayush, Shweata, Anugrah, Dheeraj

Key Points:

Schematron - A tool to validate dictionaries.
Everybody should come up with a set of rules that their dictionary should comply with.

Date: 21st Dec. 2020

Participants: PMR, GY, Aishwarya, Ambreen, Anugrah, Ayush, Dheeraj, Kareena, Shweata

Key Points:

Briefly discussed the progress and the future directions for the project.
GY: New interns are to join us from January.
Test-Driven Development
- Write tests that the dictionary editors should comply with.
Discussed the structure of projects in Python. We looked at the project structure of 'Pymatgen', a project in which PMR is involved.

Date: 4th Jan. 2021

Participants: PMR, GY, Ayush, Mukul, Dheeraj, Shweata, Radhu

Key Points

Welcome Radhu, the new NIPGR intern!
History and a bit of Context:
- 1.5 years ago, NIPGR-ContentMine internships started.
- Plant literature into a database.
- openVirus started in March with a focus on 'viral epidemics'.
- Gita ma'am's group has developed EssOilDB, which is now being converted to use the newer technology. New interns will be working on this.
- We will continue to work on openVirus dictionaries as well.
- We follow the 'open notebook' philosophy.
Review the EssOilDB and Emanuel work to make sure it's compatible.
Review:
- Plant Taxonomy
- A typical paper about a plant has
  - Place
  - Part
  - Chemicals
Project Management
- Regular Meetings
- Agenda
- Run the Meetings
- Meeting Account
Brief recap of the 'openVirus work' so far. (By Shweata, Ayush, PMR and Dheeraj)
Requirements for pygetpapers are documented here: https://github.com/petermr/openVirus/wiki/pygetpapers
pyamidict: We discussed the general workings of the code PMR developed. https://github.com/petermr/dictionary/blob/main/pythoncode/pyamidict/editor/amidict.py

IMMEDIATE TASKS

Coordinate the review for pygetpapers -> Ayush
Revise dictionaries
prototype code to search using Dictionary
Proper labelling and naming of rules for dictionary validation

Date: 7th Jan. 2021

Participants: PMR, GY, Ayush, Dheeraj, Shweata, Radhu

Key Points

Introduction to Giulia Arsuffi, PhD Plant Science, Cambridge & Emanuel Faria, Brazil and their work
Discussion of Manny's work so far
Dictionaries
- (Country, Disease, Drug, Organization)
- Plant(essential oil-producing), Extraction Methods, Phytochemicals
Brief introduction to the tigr2ess program for Radhu (https://github.com/petermr/tigr2ess)
Explanation of pygetpapers by Ayush
Test-Driven Development: Collectively drafted some tests to implement for pyamidict
Branching: As multiple people are going to be involved in the development of software, it's important to make sure that we don't change each other's code without knowing. Branching, therefore, becomes important. Radhu and Ayush are together going to write a guide to help all of us understand how branching works.

Date: 11th Jan. 2021

Participants: PMR, GY, Ayush, Dheeraj, Radhu, Kanishka, Manisha

Key Points

Welcome Kanishka and Manisha, the new NIPGR interns!
Getting to know each other's computational backgrounds.
Introduction given to each intern and a brief overview of their role in the project.
We got an overview of the people, the project, the resources, how they are used, and how to build them.
Discussion of the agenda behind openVirus (wikicite presentation) by PMR: to build a system so that anybody can understand the science behind the current pandemic.
Gita ma'am's group has developed EssOilDB, which is now being converted to use the newer technology. New interns will be working on this.
Aim to start "4-projects" for new interns with different criteria including parts of plant, chemicals, extraction method & analytic method.
Introduction to the working of "Slack", and of the existing interns to the new interns.
Introduction to the working of GitHub.
PMR introduced 3 software projects (getpapers, pygetpapers, Ami Dictionary).
Learn how to use the wiki and contribute.
Discussion on running getpapers to retrieve papers from medRxiv.
Running maven to build ami.

TASKS

be able to run getpapers and ami with existing dictionaries.
explore dictionaries in https://github.com/petermr/dictionary/tree/main/openVirus202011

Date: 14th Jan. 2021

Participants: PMR, GY, Ayush, Dheeraj, Kanishka, Prashant, Radhu, Shweata

Key Points

Welcoming our new intern, Prashant
A brief introduction of new interns
A short review of the CEVOpen project
Brainstorming and discussing potential project ideas, feasible in 6 months' time
Project Ideas generated at the end of the session:
- Phytochemistry-specific Projects (tentative):
  - Medicinal activities of plant essential oils
  - Volatile Terpenes and genes
  - Essential oils from invasive species
- Tech/Independent Projects:
  - pygetpapers
  - pyamidict
  - search
  - Location (Country)
  - Organization
  - Disease
  - Drug(?)
Interns' standup
pygetpapers: reviewing Ayush's code
- https://colab.research.google.com/drive/1PwJ7ZjqVC9DCC1pKfZxu5EdAhSDWZAwe?usp=sharing#scrollTo=_1CrrneEsSuW
- https://colab.research.google.com/drive/1UuiCX19ozC_UAvxxlYP947_CGi0akOyh?usp=sharing#scrollTo=NNdnIP6jhBno

Immediate tasks

Shweata: Email all the volunteers of the openVirus team to inform them of the new developments and project ideas.
- Ambreen, Anugrah, Rajan, Vanisha, Vaishali, Aishwarya, Mukul
- Priya, Kareena, Sana
Scoping Review: Preliminary searching and readings. Mainly to figure out the feasibility.
- Radhu: Medicinal activities of plant essential oils
- Kanishka: Volatile Terpenes and genes
- Prashant: Essential oils from invasive species

Date: 18th Jan. 2021

Participants: PMR, GY, Ayush, Ambreen, Vaishali, Radhu, Shweata

Key Points

Brief introduction by each intern and getting started.
Briefly discussed the progress of Ambreen's work.
A short review of the CEVOpen project.
Discussion of the last 3-4 weeks' work.
4 core dictionaries:
- country (Ambreen) => location => geolocation services
- disease (Dheeraj) - human disease
- drugs (Rajan)
- organizations (Shweata, Vaishali)
pygetpapers: reviewing Ayush's code.
Explanation of pygetpapers by Ayush to Ambreen and Vaishali.
Review of PMC papers to explain sectioning and the output XML files to the interns.
Review of dictionaries (plants and openVirus).
Review of Shweata's and Dheeraj's excellent developments on "instance of" and OPTIONAL work.
Discussion on SPARQL queries and downloading the .xml file.
Extended discussion on dictionaries, editing the wikipage dict schema, different terms and elements of Wikidata.
Review of the poster "WikiFactMine for Phytochemistry".

Thanks @Radhu. Good start. There are some typos (wrok, Pythochemistry). Some things are case sensitive or space sensitive ("instance of", "CEVOpen"). This matters because machines can't correct errors. Can you remind us of the core projects? Did we decide on the fourth project for the new intern? I think we had "pheromones". We'll need to agree this with Gita.

We need a little more detail (perhaps an extra line or two) where decisions were made. Where we have "extended discussion" we certainly need more. We have to assume that some people won't have been able to join and the record is important.

Date: 21st Jan. 2021

Participants: PMR, GY, Ayush, Kanishka, Dheeraj, Shweata, Radhu, Prashant

Key Points

Briefly discussed the progress and the future directions for the project.
Discussed new and exciting directions for the project.
Standup by interns (those who are present)
Extended discussion on dictionaries, editing the wikipage dict schema, different terms and elements of wikidata
Some requirements we identified
- merge entries with same WikidataID
- detect and eliminate scholarly articles, books, etc.
- add language wikipedia pages from wikidataID
- (SH) post-SPARQL filtering, or query refinement
- translate attributes into wikidata properties where possible (crossrefid => _p3153_crossrefid)
- remove unwanted terms (term value or wikidataID)
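Two of the requirements above (merging entries that share a Wikidata ID, and removing unwanted terms or IDs) can be sketched in a few lines of Python. This is only a minimal illustration, not the amidict implementation: the representation of an entry as a dict with `term`, `wikidataID` and `synonyms` keys, the function name `merge_entries`, and the IDs `Q900001` and `Q900002` are all assumptions made for the example.

```python
def merge_entries(entries, unwanted=frozenset()):
    """Merge entries sharing a wikidataID and drop unwanted terms/IDs.

    `entries` is a list of dicts with 'term', 'wikidataID' and an
    optional 'synonyms' list (a hypothetical in-memory form).
    The first term seen for an ID is kept; later terms become synonyms.
    """
    merged = {}
    for entry in entries:
        qid = entry["wikidataID"]
        term = entry["term"]
        # Skip anything listed as unwanted, by term value or by ID.
        if qid in unwanted or term in unwanted:
            continue
        target = merged.setdefault(
            qid, {"term": term, "wikidataID": qid, "synonyms": []}
        )
        for name in [term] + list(entry.get("synonyms", [])):
            if name != target["term"] and name not in target["synonyms"]:
                target["synonyms"].append(name)
    return list(merged.values())


# Placeholder IDs, for illustration only.
entries = [
    {"term": "eucalyptol", "wikidataID": "Q900001", "synonyms": ["cineole"]},
    {"term": "1,8-cineole", "wikidataID": "Q900001"},
    {"term": "a scholarly article", "wikidataID": "Q900002"},
]
result = merge_entries(entries, unwanted={"Q900002"})
print(result)
```

Keeping the first term as the main one is an arbitrary choice; a real tool would probably prefer the Wikidata label.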

Date: 25th Jan. 2021

Participants: PMR, GY, Ayush, Dheeraj, Anugrah, Shweata, Radhu, Ambreen

Key Points

Introduction about planner for new interns
Dictionaries
- Plant (Radhu)
- Compound (Kanishka)
- Gene (Prashant)
Incomplete dictionaries(?): activity, extraction method, plant parts [Ask Emanuel]. Missing Wikidata items.
Revise descriptions and extract synonyms.
Some terms have the Wikidata IDs of scholarly articles.
Do the Wikidata IDs of terms still exist? (Some items might have been either moved or deleted since the dictionary was created.)
Write Python code to go through the IDs and check if they exist.
PMR: Write software to convert SPARQL output into the dictionary.
volatile_compound
- PROBLEM: Chemicals have commas in them. AltLabel gives synonyms separated by a comma.
- PMR: Query all the chemicals automatically and look them up.
- Find out if the compounds are in ChEBI
- found in taxon property - P703
Genes dictionary - Contact Giulia
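The comma problem noted for volatile_compound can be illustrated with a short Python sketch. It shows why splitting a comma-joined synonym string breaks chemical names, and one possible workaround: asking the SPARQL endpoint to join synonyms with a different separator. The query fragment, the `|` separator, the placeholder `wd:Q_PLACEHOLDER` and the function name `split_synonyms` are assumptions for illustration and were not agreed in the meeting; `GROUP_CONCAT` with a `separator` is standard SPARQL 1.1.

```python
# Illustrative synonym strings for one compound.
COMMA_JOINED = "1,8-cineole, eucalyptol, 1,3,3-trimethyl-2-oxabicyclo[2.2.2]octane"
PIPE_JOINED = "1,8-cineole|eucalyptol|1,3,3-trimethyl-2-oxabicyclo[2.2.2]octane"

# Assumed query shape (untested sketch): replace wd:Q_PLACEHOLDER with real item IDs.
SYNONYM_QUERY = """
SELECT ?item (GROUP_CONCAT(DISTINCT ?alt; separator="|") AS ?synonyms) WHERE {
  VALUES ?item { wd:Q_PLACEHOLDER }
  OPTIONAL { ?item skos:altLabel ?alt . FILTER(LANG(?alt) = "en") }
}
GROUP BY ?item
"""


def split_synonyms(joined, separator="|"):
    """Split a joined synonym string and drop empty pieces."""
    return [part.strip() for part in joined.split(separator) if part.strip()]


# Splitting on commas cuts the chemical names into fragments.
naive = split_synonyms(COMMA_JOINED, separator=",")
# Splitting on a separator that does not occur in names keeps them whole.
safe = split_synonyms(PIPE_JOINED)
print(naive)
print(safe)
```

Any separator works as long as it cannot occur inside a chemical name; `|` is just one candidate.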

Date: 28th Jan. 2021

Participants: PMR, Shweata, Radhu, Prashant

Key Points

PMR's network dropped and he had another meeting, so there was no meeting on Thursday.

Date: 1st Feb. 2021

Participants: PMR, GY, Shweata, Radhu, Ambreen, Talha, Vasant

Key Points

Welcome Vasant and Talha, the new NIPGR interns!
Getting to know each other's backgrounds.
General instruction given to all by Gitanjali Ma'am.
We got an overview of the people, the project, the resources, how they are used, and how to build them.
Discussion of the agenda behind openVirus by PMR: to build a system so that anybody can understand the science behind the current pandemic.
Introduction to the working of "Slack", and of the existing interns to the new interns.
Introduction to the working of GitHub.
Mini Projects: Each project should have a scientific target. It must involve technology development.
- Medicinal oils (Emanuel (Manny) Faria) (Radhu Ladani)
- Genes (Giulia Arsuffi) (Talha Hasan)
- Invasive species (Gitanjali Yadav) (Kanishka Parashar)
PMR introduced 3 software projects (getpapers, pygetpapers, Ami Dictionary).

Date: 4th Feb. 2021

Participants: PMR, GY, Dheeraj, Talha, Vasant, Shweata, Radhu, Kanishka

Key Points

Briefly discussed the progress and the future directions for the project.
Explanation of a dictionary given by PMR: "Ocimum sanctum". Introduction to the XML markup language and its components such as 'elements' and 'attributes', and to the Q number for Wikidata.
Scientific strategy discussion on "Why are we doing these projects?": to build an organized system using technological tools.
Review of each intern's dictionary. Each one to create their dictionary's own wiki page.
Review of each miniproject (update your project pages with progress made, create pages for tools which you use)
Everyone to commit their miniproject data on GitHub (https://github.com/petermr/CEVOpen/wiki)
Main components of intern activity:
- Technology - (getpapers, ami, wikidata/SPARQL) - search
- Dictionaries (1 existing dictionary, 1 new dictionary) approx.
- Miniproject chemotype, genotype, activities (medicinal) phenotype - invasive species
- Integration - how these fit together - an atlas
All new project topics will be discussed in Slack, with Wiki pages created on CEVOpen, as Slack is for immediate conversations and GitHub is for structured technical conversations.

Date: 8th Feb. 2021

Participants: PMR, GY, Dheeraj, Talha, Vasant, Shweata, Radhu, Kanishka, Ambreen

Key Points

Brief project review by PMR to GY regarding the progress made and upcoming tasks.
Standup by each intern.
Introduction for interns to getpapers and ami search queries by PMR.
Review of Vasant and Talha's miniprojects.
Review of dictionaries: whether each contains name, term, Wikidata ID, Wikidata label, description, Wikipedia URL.
Update the dictionary: do all entries have synonyms? Wikipedia pages? If not, we have to add them; discussed how to retrieve these using VALUES.
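As a rough illustration of retrieving missing synonyms and Wikipedia pages using VALUES, the sketch below builds a SPARQL query for a list of Wikidata IDs. It assumes the Wikidata Query Service conventions (the `wikibase:label` service and `schema:about` / `schema:isPartOf` for Wikipedia sitelinks); the function name `build_values_query` and the IDs `Q111` and `Q222` are placeholders, and the query itself is an untested sketch rather than the one used in the meeting.

```python
import re

# Wikidata item IDs look like Q followed by digits.
QID_PATTERN = re.compile(r"Q[1-9]\d*")


def build_values_query(qids, lang="en"):
    """Build a SPARQL query fetching labels, altLabels and Wikipedia pages."""
    bad = [q for q in qids if not QID_PATTERN.fullmatch(q)]
    if bad:
        raise ValueError(f"not Wikidata item IDs: {bad}")
    values = " ".join(f"wd:{q}" for q in qids)
    return (
        "SELECT ?item ?itemLabel ?itemAltLabel ?article WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  OPTIONAL {\n"
        "    ?article schema:about ?item ;\n"
        f"             schema:isPartOf <https://{lang}.wikipedia.org/> .\n"
        "  }\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}". }}\n'
        "}"
    )


# Placeholder IDs; substitute the IDs already present in your dictionary.
query = build_values_query(["Q111", "Q222"])
print(query)
```

The printed query can be pasted into the query service; entries whose `?article` or `?itemAltLabel` come back empty are the ones still missing a Wikipedia page or synonyms.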

Date: 11th Feb. 2021

Participants: PMR, Dheeraj, Talha, Vasant, Shweata, Radhu, Kanishka

Key Points

Briefly discussed the progress and the future directions for the project.
Interns' standup
Discussion of the organization of documents on GitHub and of what is blocking the interns.
Explanation of the Shuttleworth Foundation by PMR, including its fellows and fellowships, and how it presents the ideas of people who are open, innovative, global, etc.
- Open: “A piece of data or content is open if anyone is free to use, reuse, and redistribute it - subject only, at most, to the requirement to attribute and/or share-alike.”
Discussed the outline of "FlashForward presentation"
- ISC report and overview (PMR)
- Shweata overview
- Tigr2ess -> open climate -> open virus -> CEVopenPlant (-> crops, -> plant technology)
- Ayush pygetpapers (launch)
- demo of Wikidata/SPARQL - Dheeraj? + interns?
- (miniprojects) -> matplotlib displays
- (background photos - farms, landscapes, etc.) - stress mobile?
Theme finalized for the event: "Scientific knowledge for Global Challenges"
Review of pygetpapers and explanation of the flowchart by Ayush and PMR.
Difference between pygetpapers and getpapers: https://github.com/petermr/dictionary/blob/main/pygetpapersdev/oldgetpapers.js

Date: 15th Feb. 2021

Participants: PMR, GY, Ayush, Talha, Vasant, Ambreen, Shweata, Radhu, Kanishka

Key Points

Standup by interns (those who are present)
Planning for the event (Flash Forward Workshop). It is going to be a hands-on session for people to try out our software and give us feedback. We are also going to briefly discuss our current projects, their motives, and so on.
Presentation discussion regarding openVirus, which is a team of volunteers who build software to query large amounts of the scientific literature automatically.
We will explore the technology and the issues:
- where can you find science?
- what's its value? how can it be used?
- how much is hidden by the publishers?
- how can you do this yourself?

Date: 18th Feb. 2021

Participants: PMR, Ayush, Dheeraj, Vasant, Ambreen, Shweata, Radhu, Kanishka

Key Points

Flash Forward Workshop. This will be a workshop where
- we will demo the technology
- discuss what the world would like to be able to do. This includes non-English languages.
PMR will introduce current project EO, past projects climate and epidemics -> demo is specific topic of invasive species -> theme invasive species -> introductions -> [St Edmunds Game - Gita. Present science to non-scientists. Fun! Involved. May 2021]
Presentation discussion of openVirus Overview of the project:
- Shweata (Textmining software and community - Introduction) - 5 min.
- Ambreen (Her experience, initial results and machine learning 101) - 5 min.
What interns can present (CEVOpen - Invasive species theme):
- Kanishka - Intro to Invasive species project - 2 min. recorded video
- Talha - Megapublishers and Manual search of scientific literature - 2 min.
- Radhu - Intro to Wikidata, Invasive species dictionary creation demo - 2 min.
- Vasant - Jupyter Notebook demo - data display (histograms, cooccurrence) - 2 min
- Ayush - pygetpapers demonstration - 5 min.

Date: 22nd, 23rd, 24th Feb. 2021

Participants: PMR, GY, Ayush, Talha, Vasant, Ambreen, Shweata, Radhu, Kanishka

Key Points

Discussed the time slot for the presentation
Interns gave a demo of their particular topics
PMR and other members suggested points and corrections so that the presentation would be more precise.
Got familiar with BigBlueButton

Date: 25th Feb. 2021

Participants: PMR, GY, Ayush, Talha, Vasant, Ambreen, Shweata, Radhu

Key Points

Day of Flash Forward Workshop

Date: 1st March 2021

Participants: PMR, GY, Ayush, Talha, Vasant, Dheeraj, Shweata, Radhu, Kanishka

Key Points

Review of FlashForward presentations
Review of personal commitments (interns) including their college schedules, exams and other things.
Alpha testing -> preliminary tests for software.
- Every intern has to test pygetpapers and record in detail what they did. Well-documented testing is one of the absolutely essential aspects of software.
pyamisearch works and needs alpha-testing. It will be used on sections. Currently we can extract given sections quite well. There are a number of short, important subsections and each intern can take one.
- acknowledgements
- conflict of interest
- ethics statements
- author contributions
CEVOpen at StEds. Event in Cambridge ca 2021-05-01 showing our projects and software to Cambridge students, researchers and faculty. Many are NOT scientists. This will be interactive and may have a game format. We need all interns to be completely fluent in
- installation
- tutorial materials and examples
- management (perhaps within teams)
- There may be a "dry run" about 2021-04-01 with volunteers (e.g. from Wikipedia)

Immediate tasks

Everyone must be prepared to alpha-test pygetpapers and give feedback (pygetpapers: test reports).

Date: 3rd March 2021

Participants: PMR, Ayush, Talha, Vasant, Dheeraj, Shweata, Radhu, Kanishka, Ambreen

Key Points

Review of Ayush's work: updated pygetpapers with options to make JSON and make CSV on demand. Also added time elapsed to pygetpapers.
Review of dictionary wikipages. Everyone should have 1-2 dictionaries and summarize progress and problems.
- Radhu: plant and activity
- Kanishka: Invasive species
- Talha: compound and plant material history
- Vasant: Plant parts and gene
Some skills will frequently be needed, so we have to learn them from web tutorials and community self-help, and set up Wiki pages for each:
- Programming: Regular Expressions
- Programming: Globbing
- Programming: XPath
- Programming: JSON
- Programming: XML
- Programming: grep
Each page should help newcomers learn these techniques. We don't have to write a tutorial - it's more useful to point to good (often interactive) tutorials. It's also useful to point out the things which we found difficult.
Discussed the structure of projects in Python. We looked at the project structure of pyami.ini, a project in which PMR is involved.
Kanishka: Project Manager (maintains a wikipage for the testing of pyamisearch, asks people to add their tests, and updates this document)

Immediate tasks

Every intern has to check whether their dictionaries work with ami search or not.
verify you can run /physchem/python/util.py
install pyami.ini to your "home" directory. I don't know where this is on Windows. This will need to be customised to define
- where your dictionaries are
- where your project/s are
try to use symbols where possible
run: python <your path>/util.py
keep editing till it shows your dictionaries

Suggestion to all

If anyone has an error, create an issue on GitHub https://github.com/petermr/dictionary/issues and post a link to the issue on Slack.
After the issue has been resolved, please add the problem encountered and the accepted solution to the wiki. Also, please don't delete issues even after they have been resolved, because others might encounter the same problem.

Date: 8th March 2021

Participants: PMR, GY, Ayush, Talha, Vasant, Shweata, Radhu, Kanishka

Key Points

Standup by interns
Review of the individual tasks, Interns to come up with their own dictionaries and projects.
- creation with SPARQL
- editing and problems of existing dictionaries
Potential use of dictionary of genes.
- identifying articles (IR)
- annotating genes in articles (IE)
- linking to Wikidata
- translating synonyms
Discussed the recent problem that we encountered with SPARQL queries, as reported by Talha & Radhu (we can't download large endpoint URLs, so we have to download the results in separate parts and then merge them manually)

Immediate tasks

Every intern should create their dictionary wikipages on the "CEVOpen" repository and record updates and issues related to their dictionary there

Date: 11th March 2021

Participants: PMR, Vasant, Dheeraj, Shweata, Radhu, Kanishka

Key Points

Standup by each intern.
Review of dictionaries
- whether all SPARQL results have been downloaded and have been converted with amidict
- each entry should have attributes:
  - term
  - name
  - wikidataID
  - wikidataURL
  - en-wikidataURL
  - en-description
- each entry may have children:
  - EN synonyms (optional xml:lang attribute)
  - non-EN synonyms (xml:lang mandatory)
  - non-EN description (one per language, with xml:lang)
  - related - e.g. non-EN Wikipedia pages
- entries may also have
  - p attributes for properties
  - q attributes for items
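The entry structure reviewed above can be sketched with Python's standard `xml.etree.ElementTree`. This is a minimal sketch under assumptions, not the amidict code: the attribute names are copied from the list above (the real dictionaries may spell them differently), the helper names `make_entry` and `missing_attributes` are invented for the example, and the ID `Q0000` and the URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Attribute names taken from the review list above (assumed spelling).
REQUIRED = ("term", "name", "wikidataID", "wikidataURL",
            "en-wikidataURL", "en-description")
# ElementTree serialises this namespace as the xml: prefix.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def make_entry(attrs, synonyms=()):
    """Build an <entry> element; synonyms are (text, lang-or-None) pairs."""
    entry = ET.Element("entry", dict(attrs))
    for text, lang in synonyms:
        synonym = ET.SubElement(entry, "synonym")
        if lang:
            synonym.set(XML_LANG, lang)
        synonym.text = text
    return entry


def missing_attributes(entry):
    """Return the required attributes that are absent or empty."""
    return [name for name in REQUIRED if not entry.get(name)]


# Placeholder values, for illustration only.
attrs = {
    "term": "ocimum sanctum",
    "name": "Ocimum sanctum",
    "wikidataID": "Q0000",
    "wikidataURL": "https://example.org/wikidata/Q0000",
    "en-wikidataURL": "https://example.org/en/Q0000",
    "en-description": "species of plant",
}
entry = make_entry(attrs, synonyms=[("holy basil", None), ("तुलसी", "hi")])
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
print(missing_attributes(entry))
```

A check like `missing_attributes` could be run over every entry of a dictionary before each review meeting.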

Immediate tasks

Please check out openDiagram and run the latest search_lib.py:
- cd physchem/python
- python search_lib.py
This should create graphs of occurrences of chemicals.
- click on the destroy-window button to move to the next
- edit the file search_lib to reference your dictionary and your corpus
- choose sections that are likely to contain words
- then run your search and be prepared to demonstrate to us.
We want to see all 4(6) dictionaries in action.

Date: 15th March 2021

Participants: PMR, GY, Ayush, Talha, Vasant, Ambreen, Shweata, Radhu

Key Points

Discussed the current status of each member's Dictionary
Review of each dictionary: whether each contains name, term, Wikidata ID, Wikidata label, description, Wikipedia URL
Reviewed invasive species dictionary updated by Kanishka. Synonyms, IUCN status and taxon id need to be added.
Reviewed Activity dictionary updated by Radhu. Synonyms and language equivalents need to be added.
PMR debugged people's problems with running search_lib.py, using screen sharing
A brief discussion on Regular expressions (RegExp)

Date: 18th March 2021

Participants: PMR, Ayush, Talha, Dheeraj, Vasant, Shweata, Radhu, Kanishka

Key Points

Communal review of pygetpapers including installation, and review of each intern's alpha testing:
- it is unclear what the --update option is meant to do
- downloading additional types of file: for example, --pdf could download PDF files to an existing repository
It is essential that all the options common to getpapers and pygetpapers are IDENTICAL.
- In getpapers the -f <filename> option creates a LOG file. In pygetpapers this has a completely different operation, "frompickle".
Log levels and messages:
- The current pygetpapers output is verbose, relatively uninformative, and cannot be altered. Most of it would be debug D or trace T.
Users of getpapers will expect 16 flags to be present in pygetpapers.
- if present in pygetpapers these should have the SAME syntax as getpapers. The operation should ideally be the same. If enhanced or restricted this should be noted
- if NOT present in pygetpapers these flags should be reserved for future use.
- Flags should NOT be used for different purposes.

Date: 22nd March 2021

Participants: PMR, GY, Ayush, Talha, Shweata, Radhu

Key Points

Standup by interns (those who are present)
Review of the new version of pygetpapers and its newly added flags
Review of search_lib (in openDiagram)
ami search_lib is working with facets of dictionary, corpus and section, and runs quickly on a small corpus
We tested Talha Hasan's, Radhu Ladani's and Shweata Hegde's dictionaries and they worked excellently!
We went through the concepts of Data Science that apply to our project
Explanation of the Natural Language Toolkit and Natural Language Processing by PMR

Immediate tasks

Talha Hasan: please make a Wiki page for EPMC (explore synonyms on EPMC)
Radhu Ladani: make a Wiki page for the Natural Language Toolkit (NLTK) https://github.com/petermr/openDiagram/wiki/Natural-Language-Toolkit-(NLTK)

Date: 25th March 2021

Participants: PMR, Talha, Vasant, Radhu, Kanishka

Key Points

Standup by each intern
Minicorpora review with each intern's dictionary
We downloaded 200 papers for each dictionary's topic using getpapers and then sectioned them for ami search
- Radhu: Tested the Plant & Activity dictionaries with ami search_lib. The Activity dictionary worked perfectly, but we need to update the Plant dictionary to work with ami search_lib
- Kanishka: Test Invasive_species dictionary with minicorpora oil186 & Invasive Plant species
- Vasant: Test Plant Parts
- Talha: Test Plant Compound
Explanation of database systems and the Five Laws of Library Science of S. R. Ranganathan by PMR

Date: 29th March 2021 (HOLIDAY MONDAY VOLUNTARY)

Participants: PMR, Dheeraj, Ayush, Radhu, Kanishka

Key Points

Review of Kanishka's & Radhu's work on their dictionaries
Review of pygetpapers by Ayush
Review of Search_lib by PMR

Date: 5th April 2021

Participants: PMR, GY, Ayush, Talha, Vasant, Ambreen, Kanishka, Shweata, Radhu

Key Points

Update on the St Edmunds Game & the project's software given to Gita ma'am by PMR, including search_lib and pygetpapers and how they work
Review of multilingual search_lib (English, Hindi, Urdu) with the different dictionaries Activity, Plant_Part, Plant_Compound, Plant_genus
Review of pygetpapers by Ayush including the latest version, debug log-level, supplementary files, and a CSV file that contains title & full column as well.
Every intern's standup.
Debugging search_lib for the command line so that everybody can test it on their own system
- Query: python search_lib.py --dict --sect --proj
- Example: python search_lib.py --dict country --sect introduction method --proj oil186
--help will give you the following output to understand the query structure:

C:\Users\DELL\Radhu\openDiagram\physchem\python>python search_lib.py --help
running search main
usage: search_lib.py [-h] [--dict DICT [DICT ...]] [--sect SECT [SECT ...]] [--proj PROJ [PROJ ...]] [--patt PATT [PATT ...]]

Search sections with dictionaries and patterns

optional arguments:
  -h, --help            show this help message and exit
  --dict DICT [DICT ...]
                        dictionaries to search with (lookup table from JSON (NYI); empty gives list
  --sect SECT [SECT ...]
                        sections to search; empty gives all (Not yet tested
  --proj PROJ [PROJ ...]
                        projects to search; empty will exit
  --patt PATT [PATT ...]
                        patterns to search with; regex may need quoting

The dictionaries Activity, Plant_Parts, Plant_compound and country work excellently with search_lib
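For anyone unfamiliar with how such a command line is put together, the sketch below uses Python's standard `argparse` to produce an interface shaped like the --help output above. It is a reconstruction for illustration only, not the actual search_lib.py source: the function name `build_parser` and the use of `nargs="+"` for every option are assumptions inferred from the usage line.

```python
import argparse


def build_parser():
    """Build a parser resembling the search_lib.py usage shown above."""
    parser = argparse.ArgumentParser(
        prog="search_lib.py",
        description="Search sections with dictionaries and patterns",
    )
    # nargs="+" lets each option take one or more space-separated values.
    parser.add_argument("--dict", nargs="+", help="dictionaries to search with")
    parser.add_argument("--sect", nargs="+", help="sections to search")
    parser.add_argument("--proj", nargs="+", help="projects to search")
    parser.add_argument("--patt", nargs="+",
                        help="patterns to search with; regex may need quoting")
    return parser


# Parse the example command from the minutes instead of the real sys.argv.
example = "--dict country --sect introduction method --proj oil186"
args = build_parser().parse_args(example.split())
print(args.dict, args.sect, args.proj, args.patt)
```

Each option collects its values into a list, which is why `--sect introduction method` selects two sections in a single command.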

Immediate tasks

Everyone needs to alpha-test pygetpapers and write a report on the wikipage
Everyone needs to run search_lib with their dictionary on their own system

Date: 8th April 2021

Participants: PMR, Ayush, Talha, Vasant, Dheeraj, Shweata, Radhu

Key Points

PMR discussed UTF-8, a variable-width character encoding used for electronic communication.
Every intern's standup
Feedback on pygetpapers from Ayush based on the Alpha Test Reports; he continues to work on --restart and --update to improve them
We will introduce tutorial documentation for pygetpapers so that people can understand it well! (Volunteer interns worked on this with Ayush)
Review of search_lib.py with different dictionaries and demo sets.
- The scheme is: search SECTIONS in PROJECTS with (DICTIONARIES and/or PATTERNS) with (DISPLAY and/or ANALYSIS) options
- python search_lib.py --demo offers different projects and we have to choose from 'ethics', 'luke', 'plant_parts', 'worcester', 'word'
- Alpha code for searching a document corpus: SEARCH TUTORIAL
PMR explained the file manager of search_lib and standard graphs (matplotlib, seaborn) for graphical visualization
PMR explained supervised learning concepts in our project, a powerful way of classifying sections
Dictionary review: activity, plant, plant_part, plant_compound, disease
- Everyone make sure they have a dictionary that works and has standard attributes
  - name
  - term
  - wikidataID (if known)
  - wikipediaPage (if known)
  - description (EN)
Every dictionary should have a name which is lowercase_underscore and a title which contains this value

Immediate tasks

Everyone make sure they have only ONE top level *.xml file for their dictionary.
This should have a name which is LOWERCASE (and optional UNDERSCORES) ONLY
It should be the same as the title attribute in the dictionary
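The naming rules above are easy to check automatically. The following is a hedged sketch, not part of pyamidict: it assumes the dictionary's root element carries the `title` attribute, and the function name `naming_problems`, the regular expression and the sample XML are illustrative choices.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import PurePath

# Lowercase words optionally joined by single underscores, e.g. plant_part.
NAME_PATTERN = re.compile(r"[a-z]+(?:_[a-z]+)*")


def naming_problems(filename, xml_text):
    """Return a list of naming-rule violations for one dictionary file."""
    problems = []
    stem = PurePath(filename).stem  # file name without the .xml suffix
    if not NAME_PATTERN.fullmatch(stem):
        problems.append(f"name '{stem}' is not lowercase_underscore")
    # Assumption: the title attribute sits on the root element.
    title = ET.fromstring(xml_text).get("title")
    if title != stem:
        problems.append(f"title '{title}' differs from file name '{stem}'")
    return problems


sample = '<dictionary title="plant_part"><entry term="leaf"/></dictionary>'
good = naming_problems("plant_part.xml", sample)
bad = naming_problems("Plant_Part.xml", sample)
print(good)
print(bad)
```

Here `good` is empty while `bad` reports both a malformed name and a title mismatch.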

Date: 12th April 2021

Participants: PMR, Talha, Kanishka, Radhu

Key Points

We are starting to come up with a Dictionary Naming Scheme
Every dictionary should have a name which is lowercase_underscore and a title which contains this value
Review the dictionary activity, plant, invasive_plant, plant_compound
- We make sure that dictionary works and has standard attributes
  - name
  - term
  - wikidataID
  - wikipediaPage
  - description
- and children:
  - <description xml:lang...> (optional)
  - <synonym> (optional)
  - <synonym xml:lang...>
  - <related ...> (optional)

Immediate tasks

Every dictionary should also have a minicorpus which contains content enriched in its terms. Please check that your minicorpus works with your dictionary.
Talha: create a document and maintain the current record of each dictionary including minicorpus, file name, location etc.

Date: 15th April 2021

Participants: PMR, GY, Ayush, Talha, Vasant, Dheeraj, Shweata, Kanishka, Radhu

Key Points

Standup by each interns.
pygetpapers review by Ayush and demonstration on the command line to Gita ma'am.
Ayush explained --restart and --update and created tutorial documentation for pygetpapers (Documentation)
Dictionary review (activity, plant_part, invasive_plant, plant_compound) for each intern, and converting them into the standard format
search_lib review by each intern to PMR by screen sharing.
THESIS strategy for interns (Radhu, Kanishka, Vasant, Talha)

Immediate tasks

Everyone needs to test search_lib and write a report on the wikipage (Wikipage for report)
Learn the grep tool, as it will help in this project
Everyone make sure that the wikipages of their respective dictionaries are up to date.

Date: 19th April 2021

Participants: PMR, Talha, Vasant, Dheeraj, Kanishka, Radhu

Key Points

Reports from alpha testers on pyami (search_lib) - commandline https://github.com/petermr/openDiagram/wiki/Test-Report-for-Search_lib
Each intern explained how search_lib works with their dictionary and also gave feedback
PMR asked us to analyze each and every false positive and false negative in the search_lib results and list them
Reviewed the current state of dictionaries; PMR told the respective owners what changes their dictionaries need.
PMR suggested tools such as WEKA, R programming, python pandas, Excel for statistical analysis including frequency annotations
Review of the AMI gui.py code (this is experimental but will develop into a GUI for The Game) for quick result analysis
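The request to list every false positive and false negative can be supported by a small helper. This sketch assumes that the search hits and a hand-checked list of expected terms are both available as plain collections of strings; the function name `compare_hits` and the example terms are hypothetical and do not come from search_lib.

```python
def compare_hits(found, expected):
    """Classify search hits against a manually checked list of terms."""
    found, expected = set(found), set(expected)
    true_pos = sorted(found & expected)
    false_pos = sorted(found - expected)   # reported but wrong
    false_neg = sorted(expected - found)   # present in the text but missed
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        # Guard against division by zero when a set is empty.
        "precision": len(true_pos) / len(found) if found else 0.0,
        "recall": len(true_pos) / len(expected) if expected else 0.0,
    }


# Hypothetical example: 'may' is matched as a word but is not a plant part.
report = compare_hits(found=["leaf", "root", "may"],
                      expected=["leaf", "root", "bark"])
print(report)
```

The two lists give the items to examine by hand, and precision and recall give a single number to compare between dictionary versions.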

Date: 22nd April 2021

Participants: PMR, Talha, Vasant, Dheeraj, Ayush, Kanishka, Radhu

Key Points

Review of pygetpapers by Ayush: he's focusing on europe-pmc.py, where we can run a command without giving a specific output file, and also on multiprocessing so that the code becomes more precise
Ayush also discussed points about the gui.py tool
PMR discussed the gui.py tool including different components of the code such as tkinter, button, dictionary, label
In gui.py we define the dictionaries invasive_plant, eoplant_part and country; by altering these we are going to modify the program as well as its framework
Review of each intern's dictionary: adding synonyms (what software do we need?) and updating
- Dictionary Activity and Plant: up to date, but the English language synonyms need to be checked once
- Dictionary eoplant_part: Description, Wikidata URL, Wikidata ID all should be present
- Dictionary plant_compound: Delete all the synonyms and re-add new appropriate synonyms
- Dictionary invasive_plant: Either remove the languages other than English from taxon common name, or remove the comma and use another separator
Annotations and tooltips in non-EN languages
Review of miniprojects (minicorpora): https://github.com/petermr/CEVOpen/tree/master/minicorpora

Immediate tasks

Vasant will create wiki pages for the dictionary structure, including the metadata of each dictionary
PMR suggested learning data analysis tools such as matplotlib and R, as this will help in the project
Everyone has to add a README.md page for their respective dictionary and minicorpus

Date: 26th April 2021

Participants: PMR, Talha, Dheeraj, Ayush, Kanishka, Radhu, Ambreen

Key Points

General discussion about the current world situation due to the COVID pandemic: lockdown, vaccines, etc.
We are concentrating very heavily on dictionaries, minicorpora and GUI interfaces. One goal is to support "TheGame" in May.
Dictionary review by screen sharing: every intern gave a work update and demonstrated gui.py on their system, which gave an idea of whether it works on different operating systems.
We need to add any of genus, geographic region, type of plant and common name to the plant dictionary
We reviewed plant_genus with a SPARQL query [https://w.wiki/3DpD] for taxon common name and images
PMR explained the importance of metadata; we need to add it to our dictionaries
PMR also added the file ethic.xml to the dictionary so that search_lib runs without any errors
In the gui.py module, we discussed the different options that give the desired result, including checkboxes, an additional file browser, etc.
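
For anyone writing a first query, the kind of SPARQL used to fetch taxon common names and images can be sketched as below. This is not the query behind the short link above, whose content is not recorded here. It assumes the Wikidata properties P225 (taxon name), P1843 (taxon common name) and P18 (image); the helper name `taxon_query` is hypothetical.

```python
def taxon_query(taxon_name, language="en", limit=50):
    """Return a SPARQL query string for one taxon name on Wikidata."""
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    safe_name = taxon_name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""SELECT ?item ?commonName ?image WHERE {{
  ?item wdt:P225 "{safe_name}" .
  OPTIONAL {{ ?item wdt:P1843 ?commonName . FILTER(LANG(?commonName) = "{language}") }}
  OPTIONAL {{ ?item wdt:P18 ?image . }}
}}
LIMIT {int(limit)}"""


query = taxon_query("Lantana camara")
print(query)
```

The printed text can be pasted into the Wikidata Query Service and the result downloaded as XML.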

Immediate tasks

Everyone needs to add metadata to their dictionary
Everyone needs to test gui.py and write a report

Date: 29th April 2021

Participants: PMR, Talha, Ayush, Vasant, Shweata, Kanishka, Radhu, Ambreen

Key Points

PMR discussed a Code of Conduct; everyone should agree with the document https://www.contributor-covenant.org/version/2/0/code_of_conduct/
Explanation of the gui.py interface by PMR, including all the parameters regarding dictionary and section
Ayush added html-links to pygetpapers and explained how they work
Working demonstration of pygetpapers in a Jupyter notebook by Ayush
Every intern explained their respective dictionary by screen sharing and gave current updates
PMR suggested writing SPARQL queries for additional features we want to add to a dictionary, and then merging the results with the current dictionary

Immediate tasks

Everyone to come up with a new SPARQL query for their dictionary
Alpha-test gui.py and write a report https://github.com/petermr/pygetpapers/wiki/Test-report-of-gui.py

Date: 3rd May 2021

Participants: PMR, GY, Talha, Ayush, Vasant, Shweata, Kanishka, Radhu

Key Points

PMR gave Gita ma'am an update on the project and briefly discussed the progress and the future directions for the project
GY: New interns are to join us from next week
PMR discussed the role of "volunteer" in the project https://github.com/petermr/CEVOpen/blob/master/VOLUNTEERING.md
Standup by interns (those who are present)
PMR explained search engine optimization by searching for "lantana camara" and showed how different scientific portals give different hits
Review of pygetpapers by Ayush with different flags such as --references and --synonym; Ayush also explained how to use specific date criteria to search papers
- AND (FIRST_PDATE:[2006-05-24 TO 2021-05-19])
- pygetpapers -q "(Lantana) AND (FIRST_PDATE:[2006-05-24 TO 2021-05-19])" -n
PMR explained ami-gui.py: its launch, its browser and its different categories of section
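
The date-restricted pygetpapers query above can be assembled from Python so that the dates are always well formed. This is a sketch: the helper name `dated_query` is hypothetical, and the flags are copied from the command in these notes rather than from the pygetpapers documentation.

```python
from datetime import date


def dated_query(term, start, end):
    """Build a query string restricted to a first-publication date range."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return f"({term}) AND (FIRST_PDATE:[{start.isoformat()} TO {end.isoformat()}])"


query = dated_query("Lantana", date(2006, 5, 24), date(2021, 5, 19))

# Argument list matching the command recorded above; it is only printed here,
# not executed.
command = ["pygetpapers", "-q", query, "-n"]
print(" ".join(command))
```

Passing `command` to `subprocess.run` would run the search, provided pygetpapers is installed.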

Immediate tasks

Everybody should come up with a SPARQL query for their respective dictionary

Date: 6th May 2021

Participants: PMR, GY, Talha, Vasant, Shweata, Kanishka, Radhu, Ambreen, Leeja

Key Points

Welcome Leeja, the new NIPGR intern!
Introduction given by each intern and brief overview of their role in project.
Brief introduction given to Leeja about getting started, projects and dictionaries.
Welcome Daniel Mietchen
Review of Wikidata/Scholia in relation to CEVOpen with Daniel
- https://github.com/Daniel-Mietchen/ideas/issues/499
- https://scholia.toolforge.org/topic/Q202864
- https://scholia.toolforge.org/venue/Q3359737
- https://scholia.toolforge.org/work/Q21090025
- List of newly described/ redescribed taxa http://tb.plazi.org/GgServer/static/newToday.html
- Also lexemes: https://www.wikidata.org/wiki/Wikidata:Lexicographical_coverage
- Re FAIR ethics, see also http://doi.org/10.5281/zenodo.2559998 and http://doi.org/10.5281/zenodo.4720432
- for generic questions about open science, you can use https://ask-open-science.org/
Daniel discussed his Zika virus work with Wikidata and recently published works on this topic, as well as Ethics Statements for PMC articles
Daniel also discussed **Plazi**, an association supporting and promoting the development of persistent and openly accessible digital taxonomic literature.
PMR created an initial project for Leeja (especially a dictionary: essential oil compound)
Brief recap of the 'CEVOpen work' so far for Leeja
PMR gave a demonstration of our software pygetpapers, pyami & gui.py to Leeja
Review of each intern's SPARQL query and dictionary
PMR also discussed dictionary enhancements, especially to help select terms (description, images, categories (e.g. different subtypes))

Date: 10th May 2021

Participants: PMR, Ayush, Talha, Vasant, Shweata, Kanishka, Radhu, Leeja

Key Points

Standup by interns
- what I did
- what I plan to do
- what is blocking me
Discussed Leeja's project
- Leeja's role is to help create a phytochemistry resource that integrates the dictionaries:
  - plants (Radhu)
  - their essential oils and the compounds in them (Talha)
  - geographical information (Ambreen)
  - biological and other activity (Radhu)
- This will be driven by text from the phytochemical/EO literature and Wikidata. In general the papers will report:
  - what plant/s were used
  - where they were found/harvested
  - the oils extracted from them
  - the activity reported
Review of immediate priorities
- dictionaries
- searching using ami-gui
Shweata gave a brief overview of the Ethics subproject
Ayush discussed automated documentation of pygetpapers and a few new features
Ayush introduced a logo, table of contents and architecture diagrams
Ayush added other things to the README https://github.com/petermr/pygetpapers/blob/main/README.md

Date: 13th May 2021 (VOLUNTARY session)

Participants: PMR, GY, Shweata, Radhu

Key Points

PMR developed the update_from_Sparql function for dictionaries
PMR discussed the code in search_lib.SearchDictionary.test()
- The components are:
  - id_name The field containing the wikidataURL
  - sparql_name The name of the binding in the SPARQL file
  - dict_name The name of the new child element in the dictionary
pyami updates dictionaries from SPARQL
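
One reading of the three components above (id_name, sparql_name, dict_name) is sketched below with the standard library. It is not PMR's implementation. It assumes that the SPARQL output uses the W3C SPARQL results XML format, that dictionary entries carry a `wikidataURL` attribute, and that id_name names the binding holding that URL. The sample URLs are placeholders.

```python
import xml.etree.ElementTree as ET

SPARQL_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def binding_value(result, name):
    """Return the text of the named binding in one SPARQL <result>, or None."""
    binding = result.find(f"sr:binding[@name='{name}']", SPARQL_NS)
    if binding is None or len(binding) == 0:
        return None
    return binding[0].text  # first child is <uri> or <literal>


def update_from_sparql(dictionary, sparql_root, id_name, sparql_name, dict_name):
    """Add a <dict_name> child to every entry matched by its Wikidata URL."""
    entries = {e.get("wikidataURL"): e for e in dictionary.iter("entry")}
    added = 0
    for result in sparql_root.iterfind("sr:results/sr:result", SPARQL_NS):
        entry = entries.get(binding_value(result, id_name))
        value = binding_value(result, sparql_name)
        if entry is None or value is None:
            continue  # no matching entry, or nothing to add
        ET.SubElement(entry, dict_name).text = value
        added += 1
    return added


DICTIONARY_XML = """<dictionary title="eoplant">
  <entry term="Lantana camara" wikidataURL="http://www.wikidata.org/entity/Q_PLACEHOLDER_1"/>
  <entry term="Mentha" wikidataURL="http://www.wikidata.org/entity/Q_PLACEHOLDER_2"/>
</dictionary>"""

SPARQL_XML = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="item"/><variable name="image"/></head>
  <results>
    <result>
      <binding name="item"><uri>http://www.wikidata.org/entity/Q_PLACEHOLDER_1</uri></binding>
      <binding name="image"><uri>http://example.org/placeholder.jpg</uri></binding>
    </result>
    <result>
      <binding name="item"><uri>http://www.wikidata.org/entity/Q_NOT_IN_DICT</uri></binding>
      <binding name="image"><uri>http://example.org/other.jpg</uri></binding>
    </result>
  </results>
</sparql>"""

dictionary = ET.fromstring(DICTIONARY_XML)
added = update_from_sparql(dictionary, ET.fromstring(SPARQL_XML),
                           id_name="item", sparql_name="image", dict_name="image_link")
```

Results whose URL is not in the dictionary are skipped rather than appended, so the update never creates new entries.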

Date: 17th May 2021

Participants: PMR, GY, Shweata, Radhu, Kanishka, Vasant, Sagar

Key Points

Welcome Sagar, the new NIPGR intern!
Introduction given to Sagar and brief overview of his role in the project
PMR gave a brief introduction to the project, showed the dictionaries and explained their interrelationships and the minicorpora
PMR demonstrated the software ami_gui and ami_search to Sagar
Shweata gave a brief overview of the Ethics Statement project
Review of dictionaries and their SPARQL output (interns who were present described their projects)
- Radhu discussed the eoplant and activity dictionaries and the update of the SPARQL output of the eoplant dictionary
- Kanishka discussed the invasiveplants dictionary and PMR suggested adding the GISD database
- Vasant discussed image display for the plantpart dictionary
PMR suggested the following points for the dictionary and SPARQL output
- SPARQL output is in .XML format
- The root element is Dictionary, and it must have a title. It contains a number of entry elements, and each entry element has a large number of attributes.
- Synonyms are child elements under entry
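
The dictionary layout described above (a titled root, entry elements with attributes, synonyms as children) can be sketched with the standard library as follows. The lowercase element names and the `term` and `wikidataID` attributes are assumptions for illustration, and the identifier is a placeholder.

```python
import xml.etree.ElementTree as ET


def make_dictionary(title, entries):
    """Build a dictionary element from (attributes, synonyms) pairs."""
    root = ET.Element("dictionary", {"title": title})
    for attributes, synonyms in entries:
        entry = ET.SubElement(root, "entry", attributes)
        for synonym in synonyms:
            # Synonyms are child elements of the entry, not attributes.
            ET.SubElement(entry, "synonym").text = synonym
    return root


root = make_dictionary(
    "eoplant",
    [({"term": "Lantana camara", "wikidataID": "Q_PLACEHOLDER"},
      ["lantana", "wild sage"])],
)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```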

Immediate tasks

Sagar needs to collect a list of all intern dictionaries and indicate which require updating from SPARQL
All SPARQL output names should be of the form: sparql_d(d).xml

Date: 19th May 2021 (VOLUNTARY session)

Participants: PMR, Ayush, Sagar, Shweata, Radhu, Kanishka

Key Points

Review of pygetpapers by Ayush
- He added the prototype code at https://github.com/ayush4921/funlilrepo/blob/main/test.py
- Ayush discussed an issue with supplementary files and added a check for zero-size supp files
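
A zero-size check of the kind mentioned above can be sketched as follows. This is not the code Ayush added; the function name `zero_size_files` is hypothetical.

```python
from pathlib import Path


def zero_size_files(directory):
    """Return all empty files below a directory, sorted by path."""
    return sorted(
        path for path in Path(directory).rglob("*")
        if path.is_file() and path.stat().st_size == 0
    )
```

Calling it on a download directory lists the supplementary files that arrived empty, so they can be reported or fetched again.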
Review of ami_gui.py by PMR
- PMR discussed how to extract images from PDF with the help of Selenium
Creation of (exact) multiword search, demonstrated with the dictionaries country, eoplant and organization
Creation of sparql2amidict.py including display options based on ami_gui by PMR

Date: 20th May 2021

Participants: PMR, Sagar, Talha, Shweata, Radhu, Kanishka

Key Points

Discussion about today's meeting agenda
PMR demonstrated the progress of ami_gui with the eoPlant and organization dictionaries and explained multiword term searching and getting images from the paper
Allocation of work to Sagar - Dictionary manager
Report on dictionaries: Sagar created a wiki table and presented it https://github.com/petermr/CEVOpen/wiki/Intern-Dictionaries and the tasks include:
- checking title of dictionary is the same as filename
- for each entry:
  - checking that Wikipedia links are present
  - checking Wikidata links
  - checking that the term is a useful noun or phrase
- Much of this can be done automatically
Review of the eoPlant and activity dictionaries; the changes discussed are below:
- rename the dictionary from plant to eoPlant
- add a minicorpus of 1000 papers for the activity dictionary
Shweata showed how she and Ayush were able to extract phrases from Ethics Statements using spaCy and also discussed a problem with the organization dictionary, i.e. the rendering issue
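
Several of the dictionary checks listed above (title matches filename, Wikipedia and Wikidata links present) could be automated along these lines. This is a sketch: the attribute names `wikipediaPage` and `wikidataID` are assumptions about the dictionary format, and the identifiers are placeholders.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed attribute names for the links on each entry.
REQUIRED_ATTRIBUTES = ("wikipediaPage", "wikidataID")


def check_dictionary(xml_text, filename):
    """Return a list of human-readable problems found in one dictionary."""
    problems = []
    root = ET.fromstring(xml_text)
    title = root.get("title")
    stem = Path(filename).stem
    if title != stem:
        problems.append(f"title {title!r} does not match filename stem {stem!r}")
    for entry in root.iter("entry"):
        term = entry.get("term", "<no term>")
        for attribute in REQUIRED_ATTRIBUTES:
            if not entry.get(attribute):
                problems.append(f"{term}: missing {attribute}")
    return problems


SAMPLE = """<dictionary title="eoPlant">
  <entry term="Lantana camara" wikidataID="Q_PLACEHOLDER_1"
         wikipediaPage="https://en.wikipedia.org/wiki/Lantana_camara"/>
  <entry term="Mentha" wikidataID="Q_PLACEHOLDER_2"/>
</dictionary>"""

problems = check_dictionary(SAMPLE, "eoplant.xml")
```

The comparison is case-sensitive, so the sample reports the eoPlant / eoplant mismatch as well as the missing Wikipedia link. Whether a term is a useful noun or phrase still needs a human reader.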

Date: 24th May 2021

Participants: PMR, Sagar, Talha, Vasant, Shweata, Radhu, Kanishka, Bhavini

Key Points

Welcome new intern - Bhavini
Introduction given to Bhavini and brief overview of their role in the project
Brief introduction by each intern and getting started
Shweata gave an introduction to the project, showed the dictionaries and explained their interrelationships and the minicorpora
Shweata briefed Bhavini on the code of conduct
Each intern presented their work by screen sharing and explained its status
ami-gui review by PMR: search strategy, term extraction (Rake)
Ayush explained pygetpapers to Bhavini and also discussed the suppdata and images issue with PMR
Feedback from the ethics project by Shweata

Date: 26th May 2021 (VOLUNTARY session)

Participants: PMR, Sagar, Shweata, Radhu

Key Points

Shweata and PMR discussed an issue with publishing scholarly articles
PMR demonstrated ami_gui with the eoPlant and organization dictionaries
Sagar presented the list of dictionaries that need automatic updating from SPARQL files https://github.com/petermr/CEVOpen/wiki/Intern-Dictionaries
PMR updated the SPARQL update tool; at present it is a test
- SearchDictionary.test_update_in_repo()
With the help of the SPARQL update tool, PMR updated features such as image_link and taxon in the eoPlant dictionary

Immediate tasks

Every intern should give information about their dictionaries and any issues to Sagar

Date: 27th May 2021

Participants: PMR, GY, Sagar, Talha, Vasant, Shweata, Radhu, Kanishka, Bhavini

Key Points

Gita ma'am and PMR discussed the guidance for interns' theses/reports https://github.com/petermr/CEVOpen/wiki/THESIS-strategy#update-2021-05-24
Sagar presented the current update of the interns' dictionaries and their SPARQL output to Gita ma'am
Gita ma'am suggested points such as dictionary links and table strategy to Sagar, and also suggested that NIPGR interns make a small video clip on their work
PMR demonstrated the latest version of ami_gui to Gita ma'am and showed how we can extract images for each particular plant species
Process the updating of:
- eoPlant @Radhu Ladani
- activity @Radhu Ladani
- Invasive @Kanishka
- Compounds @Talha Hasan
- Plant Parts @VASANT KUMAR
- Plant Genus @Shweata Hegde
- Organization @Shweata Hegde