Update repo-level documentation #43

Merged
merged 7 commits into from
Jun 5, 2024
125 changes: 125 additions & 0 deletions .github/CONTRIBUTING.md
@@ -0,0 +1,125 @@
# Contributing to NMDC-notebooks

:+1: First of all: Thank you for taking the time to contribute!

The following is a set of guidelines for contributing to the [nmdc_notebooks repo](https://github.com/microbiomedata/nmdc_notebooks). This guide
is aimed primarily at the developers for the notebooks and this repo, although anyone is welcome
to contribute.

## Table Of Contents

- [Code of Conduct](#code-of-conduct)
- [Guidelines for Contributions and Requests](#contributions)
  * [Reporting issues](#reporting-issues)
  * [Making pull requests](#pull-requests)
- [Best practices](#best-practices)
- [Adding new notebooks](#adding-new-notebooks)
- [Dependency Management](#dependency-management)

[//]: # (This is a comment, it will not be included)
[//]: # (in the output markdown file.)

<a id="code-of-conduct"></a>

## Code of Conduct

The NMDC team strives to create a
welcoming environment for editors, users and other contributors.

Please carefully read NMDC's [Code of Conduct](https://github.com/microbiomedata/nmdc-schema/blob/main/CODE_OF_CONDUCT.md).

<a id="contributions"></a>

## Guidelines for Contributions and Requests

<a id="reporting-issues"></a>

### Reporting issues with existing notebooks

Please use the [Issue Tracker](https://github.com/microbiomedata/nmdc_notebooks/issues/) to report problems or suggest enhancements for the notebooks. Issues should be focused and actionable (a PR could close an issue). Complex issues should be broken down into simpler issues where possible.

Please review GitHub's overview article,
["Tracking Your Work with Issues"][about-issues].

### Pull Requests

See [Pull Requests](https://github.com/microbiomedata/nmdc-schema/pulls/) for all pull requests. Every pull request should be associated with an issue.

Please review GitHub's article, ["About Pull Requests"][about-pulls],
and make your changes on a [new branch][about-branches].

We also recommend reading [GitHub Pull Requests: 10 Tips to Know](https://blog.mergify.com/github-pull-requests-10-tips-to-know/).

## Best Practices

<a id="best-practices"></a>

- Read ["About Issues"][about-issues] and ["About Pull Requests"][about-pulls]
- Issues should be focused and actionable
- Bugs should be reported with a clear description of the problem and steps to reproduce. If bugs are found within a notebook, please include the link to the notebook in the issue and the specific cell that is causing the issue.
- Complex issues should be broken down into simpler issues where possible
- Pull Requests (PRs) should be atomic and aim to close a single issue
- PRs should reference issues following standard conventions (e.g. “Fixes #123”)
- Never work on the main branch, always work on an issue/feature branch
- Core developers can work on branches off origin rather than forks
- If possible create a draft or work-in-progress PR on a branch to maximize transparency of what you are doing
- PRs should be reviewed and merged in a timely fashion
- In the case of git conflicts, the contributor should try to resolve the conflict
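
As an illustration of the "PRs should reference issues" practice, here is a minimal Python sketch that checks whether a PR description contains a closing reference such as "Fixes #123". The function name `referenced_issues` is hypothetical, and the keyword list is an assumption based on GitHub's commonly documented closing keywords (close, fix, resolve and their variants). It deliberately ignores other accepted forms such as cross-repository references.

```python
import re

# Assumed closing keywords: close/closes/closed, fix/fixes/fixed,
# resolve/resolves/resolved, followed by "#<number>".
CLOSING_REF = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)\b",
    re.IGNORECASE,
)


def referenced_issues(text):
    """Return the issue numbers that the text references with a closing keyword."""
    return [int(number) for number in CLOSING_REF.findall(text)]


if __name__ == "__main__":
    description = "Update the contributing guide. Fixes #123"
    print(referenced_issues(description))
```

A check like this could run locally before opening a PR. It is not part of this repository's tooling.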


[about-branches]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-branches
[about-issues]: https://docs.github.com/en/issues/tracking-your-work-with-issues/about-issues
[about-pulls]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests

## Adding new notebooks

<a id="adding-new-notebooks"></a>

To add a new notebook to this repository:

1. Create a folder in the base directory.
   - Name the folder with a short version of the analysis or question that will be explored.
   - Use `snake_case` for the folder name.
2. Create a `README.md` in the folder outlining the analysis or question.
3. Create a sub-folder for each language that will be demonstrated.
   - e.g. one sub-folder named `R` and one sub-folder named `python`
4. Create a notebook in each language sub-folder, using one of the following:
   - Instantiate a Jupyter Notebook coded in the sub-folder's language, _or_
   - Create a .Rmd and convert it to a Jupyter Notebook. Several methods for this exist and none are perfect, but [this open source method](https://github.com/mkearney/rmd2jupyter) currently works.
5. Run the entire notebook to ensure it is working as expected and save the *rendered* notebook in the folder.
6. Update the `README.md` in the folder to include links to the rendered notebook (using [nbviewer](https://nbviewer.org/) and [Google Colab](https://colab.research.google.com/)).
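
The folder layout and the viewer links described in the steps above can be sketched in Python. This is only an illustration under stated assumptions: the helper names `scaffold_notebook_dir` and `review_links`, the folder name `example_analysis`, the notebook file name `example.ipynb`, and the branch name `main` are all hypothetical. The link format assumes the usual `github/<owner>/<repo>/blob/<branch>/<path>` pattern accepted by nbviewer and Colab.

```python
import re
import tempfile
from pathlib import Path

# snake_case: lowercase letters/digits separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")


def scaffold_notebook_dir(base, name, languages=("R", "python")):
    """Create <base>/<name> with a README.md and one sub-folder per language."""
    if not SNAKE_CASE.match(name):
        raise ValueError(f"folder name must be snake_case: {name!r}")
    folder = Path(base) / name
    for language in languages:
        (folder / language).mkdir(parents=True, exist_ok=True)
    readme = folder / "README.md"
    if not readme.exists():
        # Placeholder text only; replace with the real outline of the analysis.
        readme.write_text(f"# {name}\n\nDescribe the analysis or question here.\n")
    return folder


def review_links(owner, repo, branch, notebook_path):
    """Build nbviewer and Colab links for a notebook hosted on GitHub."""
    tail = f"github/{owner}/{repo}/blob/{branch}/{notebook_path}"
    return {
        "nbviewer": f"https://nbviewer.org/{tail}",
        "colab": f"https://colab.research.google.com/{tail}",
    }


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing is written to the repo.
    with tempfile.TemporaryDirectory() as tmp:
        folder = scaffold_notebook_dir(tmp, "example_analysis")
        print(sorted(p.name for p in folder.iterdir()))
    links = review_links(
        "microbiomedata",
        "nmdc_notebooks",
        "main",  # assumed branch name
        "example_analysis/python/example.ipynb",  # hypothetical notebook path
    )
    for label, url in links.items():
        print(label, url)
```

When reviewing a notebook on a feature branch, pass that branch name instead of `main` so the links point at the version under review.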


## Dependency Management

<a id="dependency-management"></a>

### R

This project uses `renv` for package management. After cloning the GitHub repository, open the R project and run `renv::restore()` to make sure your packages match the project's lockfile. To learn more about how renv works, [see this resource](https://rstudio.github.io/renv/articles/renv.html).

### Python

This project uses pip paired with venv to manage dependencies. Note that `requirements_dev.txt` should be used and updated for local development dependencies, and `requirements.txt` should be used for production/binder dependencies (updated manually and with discretion).

#### To install the dependencies:

1. Clone the GitHub repository
2. Create a virtual environment:
   `python -m venv venv`
3. Activate the virtual environment:
   `source venv/bin/activate`
4. Install the necessary packages:
   `pip install -r requirements_dev.txt`
   **Note:** to update your package installations:
   `pip install -U -r requirements_dev.txt`
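
Before installing packages, it can help to confirm that the virtual environment is actually active, so that `pip` does not install into the system interpreter. The following is a minimal sketch, not part of this repository. The function name `in_virtualenv` is hypothetical, and the check relies on `sys.prefix` differing from `sys.base_prefix` inside an environment created with `venv`.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    if in_virtualenv():
        print("Virtual environment active:", sys.prefix)
    else:
        print("No virtual environment detected; run `source venv/bin/activate` first.")
```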

#### To add new packages:

1. Activate the virtual environment:
`source venv/bin/activate`
2. Install any new packages:
`pip install <package>`
3. Capture the new requirements:
`pip freeze > requirements_dev.txt`
4. Push changes to GitHub
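
Because `pip freeze` rewrites the whole file, it is worth reviewing what changed in `requirements_dev.txt` before pushing. The sketch below compares two sets of pinned requirements. The helper names `parse_pins` and `diff_pins` and the sample package versions are hypothetical, and only simple `name==version` lines are handled (editable installs and URL requirements are skipped).

```python
def parse_pins(text):
    """Map package name to pinned version for each `name==version` line."""
    pins = {}
    for raw in text.splitlines():
        # Drop comments and environment markers, then surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # skip blank lines, editable installs, URL requirements
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def diff_pins(old, new):
    """Return (added, removed, changed) package names between two pin maps."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(name for name in set(old) & set(new) if old[name] != new[name])
    return added, removed, changed


if __name__ == "__main__":
    # Sample contents; in practice read the committed and freshly frozen files.
    before = parse_pins("pandas==2.1.0\nrequests==2.31.0\n")
    after = parse_pins("pandas==2.2.0\nrequests==2.31.0\nmatplotlib==3.8.0\n")
    added, removed, changed = diff_pins(before, after)
    print("added:", added)
    print("removed:", removed)
    print("changed:", changed)
```

Running `git diff requirements_dev.txt` gives the same information. The sketch is mainly useful when the comparison needs to be scripted.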
23 changes: 23 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,23 @@
---
name: Bug report
about: Create a report to help us squash bugs in notebooks
title: "[BUG]"
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what and where the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to notebook '....'
2. Run all cells until cell '....'
3. See error

**Expected behavior**
A clear and concise description of what you expected to happen, or a link to (or screenshot of) the rendered version of the notebook that you are expecting to reproduce.

**Screenshots**
If applicable, add screenshots to help explain your problem.
20 changes: 20 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest an idea for a new notebook or improvement to existing notebook
title: "[FEATURE REQUEST]"
labels: ''
assignees: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.
16 changes: 16 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,16 @@
### All Submissions:

* [ ] Have you followed the guidelines in our [Contributing document](CONTRIBUTING.md)?
* [ ] Have you checked to ensure there aren't other open [Pull Requests](../../../pulls) for the same update/change?
* [ ] Does your PR link to an issue?

<!-- Erase any parts of this template not applicable to your Pull Request. -->

### New Notebook Submissions:

* [ ] Have you included a summary of the notebook in the README.md, including updated links to the notebook?
* [ ] Does your PR include links to the new notebook **(in the branch)** for review using [nbviewer](https://nbviewer.jupyter.org/), [Colab](https://colab.research.google.com/), and [reviewnb](https://www.reviewnb.com/)? These three are the preferred ways to review changes and additions to notebooks during review.

### Bug Fix Submissions:

* [ ] Does your PR include links to the updated notebook **(in the branch)** for review using [nbviewer](https://nbviewer.jupyter.org/), [Colab](https://colab.research.google.com/), and [reviewnb](https://www.reviewnb.com/)? These three are the preferred ways to review changes and additions to notebooks during review.
1 change: 1 addition & 0 deletions .gitignore
@@ -4,6 +4,7 @@
.Ruserdata
.Rprofile
.ipynb_checkpoints
.DS_Store
venv/
.virtual_documents/
taxonomic_dist_by_soil_layer/python/contig_notebook_session.pkl
46 changes: 2 additions & 44 deletions README.md
@@ -24,51 +24,9 @@ R and Python were chosen since they are popular languages among scientists to ex

A challenging aspect that has been highlighted with this process is accessing the (meta)data in a user-friendly way via the NMDC API. Because the NMDC metadata schema is highly modular, retrieving metadata is not straightforward without extensive knowledge of the metadata schema's infrastructure, modeling language ([LinkML](https://linkml.io/)), and naming conventions. A proposed solution to this challenge is the creation of an R or Python package that would allow users to access NMDC's data in an easier and more straightforward way.

## Adding new notebooks
## Contributing

To add a new notebook to this repository:

1. Create a folder in the base directory
- Name the folder with a short version of the analysis/question that will be explored.
- Make name of folder `snake_case`
2. Create a `README.md` in the folder outlining the analysis or question.
3. Create a sub-folder for each language that will be demonstrated
- e.g. one subfolder named `R` and one subfolder named `python`
4. Instantiate a Jupyter Notebook for each folder coded in its corresponding language
_or_
4. Create a .Rmd and convert it to a Jupyter Notebook. Several methods for this exist and none are perfect, but [this open source method](https://github.com/mkearney/rmd2jupyter) currently works.

## Dependency Management

### R

This project uses `renv` for package management. After cloning the github repository, open the R project and run `renv::restore()` to make sure your packages match. To learn more about how renv works, [see this resource](https://rstudio.github.io/renv/articles/renv.html).

### Python

This project uses pip paired with venv to manage dependencies. Note that requirements_dev.txt should be used for development dependencies, and requirements.txt should be used for production/binder dependencies (added manually and with discretion).

#### To install the dependencies:

1. Clone the github repository
2. create a virtual environment:
`python -m venv venv`
3. Activate the virtual environment:
`source venv/bin/activate`
4. Install the necessary packages:
`pip install -r requirements_dev.txt`
**Note** to update your package installations:
`pip install -U -r requirements_dev.txt`

#### To add new packages:

1. Activate the virtual environment:
`source venv/bin/activate`
2. Install any new packages:
`pip install <package>`
3. Capture the new requirements:
`pip freeze > requirements_dev.txt`
4. Push changes to github
We welcome contributions to this repository. Please see the [Contributing document](CONTRIBUTING.md) for more information on how to contribute.