Moved to Hedgedoc 28 May 2024 @mrchristian : https://demo.hedgedoc.org/QyniBhytQGGeIx3XdM8F2g
The idea is to create descriptions of what we would like the ALR to do.
The writing style will be based on the following:
The guide will be treated as a step-by-step guide according to the Diátaxis method, as the audience will be users familiar with data science tooling and paper writing (what Diátaxis describes as a workplace instruction).
- Technical writing method: Diátaxis - https://diataxis.fr/
- Technical writing style: MDN Web Docs - https://developer.mozilla.org/en-US/docs/MDN/Writing_guidelines/Writing_style_guide
- Style guide: APA - https://apastyle.apa.org/
The writing tooling has not yet been selected.
Contributors:
Assigned @NOT ASSIGNED -
Questions:
Notes:
Outline:
Assigned @neerajkumaris -
Questions:
How does the author report on their question in the literature review in terms of the use of ALR?
Notes:
When the author reports on their question in the literature review using Automated Literature Review (ALR) tools such as pygetpapers, they typically follow a systematic process that leverages the capabilities of the tool to enhance the efficiency and comprehensiveness of their review.
Outline: Reporting on the Research Question in the Literature Review Using pygetpapers
Introduction:
- Introduce the research question and its significance.
- Provide background information on the topic.
- Explain the relevance of Automated Literature Review (ALR) tools like pygetpapers.
- State the purpose of the literature review and its organization.
Methodology for Literature Search:
- Describe the methodology used to conduct the literature search.
- Explain the use of pygetpapers for automating the retrieval of academic papers.
- Specify the search strategy, including keywords, phrases, and Boolean operators.
- Detail the sources from which papers were retrieved (e.g., Europe PMC).
Selection Criteria and Filtering:
- Define the inclusion and exclusion criteria used to filter the search results.
- Discuss the process of filtering papers based on relevance and quality.
- Highlight any manual or programmatic filtering techniques employed.
Organization and Categorization:
- Explain how the retrieved literature was organized and categorized.
- Discuss the criteria used for categorization, such as themes, methodologies, or findings.
- Provide insights into the structure of the literature review based on these categories.
Synthesis of Findings:
- Summarize the main findings from the retrieved literature.
- Discuss common themes, trends, and patterns identified in the literature.
- Highlight any contradictions or inconsistencies among the findings.
Critical Analysis and Quality Assessment:
- Conduct a critical analysis of the methodologies employed in the retrieved studies.
- Assess the reliability, validity, and generalizability of the findings.
- Utilize metrics such as citation counts or impact factors to evaluate the quality of the literature.
Conclusion:
- Summarize the key findings and insights from the literature review.
- Reiterate the significance of the research question and its implications for the field.
- Reflect on the role of ALR tools like pygetpapers in facilitating the literature review process.
Assigned @neerajkumaris -
Questions:
- How and when to add CoLab to GitHub? @Simon Worthington
Notes:
Outline:
When to Add CoLab to GitHub
- To keep track of changes and revisions to your notebooks.
- To easily share notebooks with collaborators and work together on the same project.
- To keep a backup of your work in case of accidental deletions or changes.
- To integrate your notebooks into a larger project managed on GitHub, benefiting from issue tracking, pull requests, and other GitHub features.
Steps to integrate Google Colab with GitHub
1. Linking GitHub to Colab
First, you need to link your GitHub account with your Google Colab account:
a. Open a Google Colab notebook.
b. Click on "File" in the menu.
c. Select "Save a copy in GitHub".
d. If you haven't linked your GitHub account before, a prompt will appear asking you to authenticate with GitHub. Follow the instructions to grant access.
2. Saving a Colab Notebook to GitHub
Once your GitHub account is linked, you can save notebooks directly to a GitHub repository:
a. Open the notebook you want to save.
b. Click on "File" in the menu.
c. Select "Save a copy in GitHub".
d. A dialog will appear where you can specify the repository, the branch, and the commit message.
e. Click "OK" to save the notebook to the specified GitHub repository.
3. Opening a Notebook from GitHub in Colab
To open a notebook directly from a GitHub repository:
a. Go to the Google Colab homepage.
b. Click on "GitHub" in the menu.
c. Authenticate with your GitHub account if prompted.
d. Search for the repository or notebook you want to open.
e. Click on the notebook to open it in Colab.
4. Using Git Commands in Colab
For more advanced use cases, you can use Git commands directly within a Colab notebook:
Start by cloning a repository:
!git clone https://github.com/yourusername/yourrepository.git
Navigate into the repository directory:
%cd yourrepository
You can now use other Git commands, such as git pull, git add, git commit, and git push, just like you would in a local development environment.
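The same pull, add, commit, and push sequence can also be scripted from a Python cell. The following is a minimal sketch rather than a prescribed workflow: the function names `git_sync_commands` and `run_commands`, the notebook file name, the commit message, and the `main` branch are all assumptions for illustration, and pushing from Colab also needs GitHub credentials that are not shown here.

```python
import subprocess
from typing import List


def git_sync_commands(path: str, message: str, branch: str = "main") -> List[List[str]]:
    """Return the git commands that pull, stage, commit, and push one file."""
    return [
        ["git", "pull", "origin", branch],
        ["git", "add", path],
        ["git", "commit", "-m", message],
        ["git", "push", "origin", branch],
    ]


def run_commands(commands: List[List[str]], cwd: str) -> None:
    """Run each command inside the cloned repository, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)


# Hypothetical file name and commit message; this only prints the commands.
# Nothing is executed until run_commands(commands, cwd="yourrepository") is called.
commands = git_sync_commands("analysis.ipynb", "Update analysis notebook")
for command in commands:
    print(" ".join(command))
```

Keeping the command list separate from the code that runs it makes it possible to review what will be executed before anything touches the repository.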
Assigned @neerajkumaris -
Questions:
Notes:
Outline:
Assigned @Amit
Start by removing the stop words and selecting the most relevant keywords.
For example, take the question: What is the role of the silicon transporter in rice?
Step 1: Remove the stop words. The remaining words are: role, silicon, transporter, rice
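Step 1 could also be done programmatically. The sketch below is illustrative only: the stop-word list is a small hand-picked assumption rather than a standard list, and `extract_keywords` is a hypothetical helper name.

```python
import re
from typing import List

# Small illustrative stop-word list (an assumption); a real project may want a fuller one.
STOP_WORDS = {
    "a", "an", "and", "are", "do", "does", "for", "how", "in", "is",
    "of", "on", "the", "to", "what", "which", "with",
}


def extract_keywords(question: str) -> List[str]:
    """Lowercase the question, drop stop words and punctuation, keep first occurrences in order."""
    keywords: List[str] = []
    for word in re.findall(r"[a-z0-9]+", question.lower()):
        if word not in STOP_WORDS and word not in keywords:
            keywords.append(word)
    return keywords


keywords = extract_keywords("What is the role of silicon transporter in the rice ?")
print(" ".join(keywords))  # prints: role silicon transporter rice
```

The joined keywords can then be used as the query string for pygetpapers.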
Open the command terminal in the directory where you wish to store your project.
Step 2:
pygetpapers -q "your query terms" -n -o "lantana_query_config" --save_query
The -n flag reports the total number of papers available on Europe PMC (EUPMC) for the query without downloading them.
The --startdate and --enddate flags can also be used to restrict the query to a date range.
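When several queries or date ranges need to be counted, the command line can be assembled in Python. This is a sketch under assumptions: `build_count_command` is a hypothetical helper, the dates are placeholders, and the YYYY-MM-DD date format should be checked against the pygetpapers help output. It only builds and prints the command and does not run it.

```python
import shlex
from typing import List, Optional


def build_count_command(
    query: str,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> List[str]:
    """Build a pygetpapers command that counts hits (-n) without downloading papers."""
    command = ["pygetpapers", "-q", query, "-n"]
    if start_date:
        command += ["--startdate", start_date]
    if end_date:
        command += ["--enddate", end_date]
    return command


# Placeholder dates, assumed to be in YYYY-MM-DD format.
command = build_count_command(
    "role silicon transporter rice",
    start_date="2015-01-01",
    end_date="2024-05-28",
)
print(shlex.join(command))
```

The printed line can be pasted into a terminal, or the list can be passed to `subprocess.run` once pygetpapers is installed.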
Step 3:
pygetpapers -q "role silicon transporter rice" -k 10 -p -x --makecsv --makehtml -o path/to/some/output/directory/optional --loglevel debug --logfile test_log.txt
This downloads 10 papers from EUPMC in both XML and PDF formats, creates CSV and HTML files of the associated metadata, and saves everything in the given output directory.
More papers can be downloaded in the same way to build the entire CProject tree.
--loglevel debug --logfile test_log.txt writes the log to a .txt file in your HOME directory while simultaneously printing it out.
When the -o output path is not given, the downloaded corpus is saved to the current working directory (pwd) as a time-stamped directory.
Flag | What it does | In this case pygetpapers ... |
---|---|---|
-q | specifies the query | queries for 'invasive plant species' in METHODS section |
-k | number of hits (default 100) | limits hits to 10 |
-o | specifies output directory | outputs to invasive_plant_species_test |
-x | downloads fulltext xml | |
-c | saves per-paper metadata into a single csv | saves single CSV named europe_pmc.csv |
--makehtml | saves per-paper metadata into a single HTML file | saves single HTML named europe_pmc.html |
--save_query | saves the given query in a config.ini in output directory | saves query to saved_config.ini |
pygetpapers, by default, writes metadata to a JSON file within:
- individual paper directory for corresponding paper (epmc_result.json)
- working directory for all downloaded papers (epmc_results.json)
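The per-paper JSON files can be gathered for later analysis with a few lines of Python. This is a minimal sketch: `load_paper_metadata` is a hypothetical helper, the default file name is taken from the note above and should be checked against the files pygetpapers actually writes, and no assumption is made about the keys inside each file.

```python
import json
from pathlib import Path
from typing import Dict


def load_paper_metadata(output_dir: str, filename: str = "epmc_result.json") -> Dict[str, dict]:
    """Map each paper directory name to the parsed content of its metadata file."""
    metadata: Dict[str, dict] = {}
    # Look one level down: one sub-directory per paper, each holding one metadata file.
    for json_path in sorted(Path(output_dir).glob(f"*/{filename}")):
        with json_path.open(encoding="utf-8") as handle:
            metadata[json_path.parent.name] = json.load(handle)
    return metadata
```

Calling `load_paper_metadata("path/to/some/output/directory")` returns one entry per paper directory that contains the metadata file, and an empty dictionary when none is found.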
- The query terms can also be restricted to a particular section, such as abstract, methods, or results.
C:\Users\lilia>pygetpapers -q abstract:"money" -n
INFO: Total number of hits for the query are 22161
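To collect hit counts for a table of queries, the number can be extracted from the log line shown above. The sketch below assumes the log message keeps exactly this wording; `parse_hit_count` is a hypothetical helper name.

```python
import re
from typing import Optional

# Matches the log line wording shown above; adjust if the message differs.
HITS_PATTERN = re.compile(r"Total number of hits for the query are (\d+)")


def parse_hit_count(log_text: str) -> Optional[int]:
    """Return the hit count found in the log text, or None if no such line is present."""
    match = HITS_PATTERN.search(log_text)
    return int(match.group(1)) if match else None


print(parse_hit_count("INFO: Total number of hits for the query are 22161"))  # prints: 22161
```

Returning `None` rather than raising an error lets a calling script skip queries whose log output could not be read.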
Assigned @Adesh212 -
Questions:
Notes:
Outline:
Assigned @Simon Worthington -
Notes:
The author needs to use the collection, and information and data about the collection in their literature review paper.
Questions:
What do the collection (or what we would like it to collect) and the process produce that the author can use in their paper?
- A complete Git repository
- A replicable and reusable system
- A DOI-referenced data and code set on Zenodo or another academic repository, deposited as a full GitHub repository using software citation
- A saved query: https://pygetpapers.readthedocs.io/en/latest/user_documentation.html#example-query Flag --save_query as saved_config.ini
- A CProject for the whole query
- A CTree per paper
- Custom terms and dictionaries
- Data analysis results?
- Hits
- Word frequency
- BibTeX output and import into Zotero, with content and/or content links
- CSS Paged media outputs
Outline:
Ideal process
- Add reference list to the paper and store it in Zotero
- Take a GitHub release of the repository, add it to Zenodo, and link it as data to the paper
- Link to all papers in repo CTree
- Present question and methods for converting into query, show table of queries, link to data
- Provide information on how to reproduce and reuse / extend the CProject
- Present data analysis of literature review in paper
- Chart and data link to number of hits
- Chart of NGram of search term
- Table of word frequency
- Table of top ten papers
- Table of top journals, top authors
Current process
- Take a GitHub release of the repository, add it to Zenodo, and link it as data to the paper
- Link to all papers in repo CTree
- Present question and methods for converting into query, show table of queries, link to data
- Provide information on how to reproduce and reuse / extend the CProject
- Present data analysis of literature review in paper
- Table of the number of hits
- Table of word frequency
- Table of top ten papers
- Table of top journals, top authors
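A table of word frequency could be produced along the following lines. This is a sketch under stated assumptions: `word_frequency` is a hypothetical helper, the stop-word list is a small hand-picked example, and the sample texts are placeholders rather than real abstracts from the collection.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Small illustrative stop-word list (an assumption).
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "with"}


def word_frequency(texts: Iterable[str], top_n: int = 10) -> List[Tuple[str, int]]:
    """Count words across all texts, ignoring case and stop words, and return the top_n."""
    counts: Counter = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return counts.most_common(top_n)


# Placeholder texts for illustration only; replace with text read from the downloaded papers.
sample_texts = [
    "Silicon transporters in rice roots",
    "The role of silicon in rice",
]
for word, count in word_frequency(sample_texts):
    print(f"{word}\t{count}")
```

The tab-separated output can be pasted into a spreadsheet or converted into a table for the paper.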
Covers, title pages, table of contents, contributors, etc. See style guide: APA - https://apastyle.apa.org/
Zotero collection: