Update data_plan.md #76

Open: wants to merge 1 commit into base: main
49 changes: 45 additions & 4 deletions docs/howto_guides/data_plan.md
@@ -10,15 +10,56 @@ While a new concept, making your DMP machine readable, increases the likelihood

### What to include in your DMP?

While your DMP should include information about all data collected through the duration of your research, these guidelines and best practices focus on large omic data generated from microbiome samples. If you are unfamiliar with microbiome data management and metadata standards, we recommend you begin with an [Introduction to Metadata and Ontologies: Everything You Always Wanted to Know About Metadata and Ontologies (But Were Afraid to Ask)](https://microbiomedata.org/introduction-to-metadata-and-ontologies/) and the [NMDC Metadata Standards Documentation](https://w3id.org/nmdc/nmdc). These resources will introduce you to multi-omics metadata standards that leverage existing community-driven standards.
While your DMP should include information about all data collected through the duration of your research, these guidelines and best practices focus on large omics data generated from microbiome samples. If you are unfamiliar with microbiome data management and metadata standards, we recommend you begin with an [Introduction to Metadata and Ontologies: Everything You Always Wanted to Know About Metadata and Ontologies (But Were Afraid to Ask)](https://microbiomedata.org/introduction-to-metadata-and-ontologies/) and [A Practical Approach to Using the Genomic Standards Consortium MIxS Reporting Standard for Comparative Genomics and Metagenomics](https://link.springer.com/protocol/10.1007/978-1-0716-3838-5_20). These resources will introduce you to metadata standards that leverage existing community-driven standards.

***The NMDC DMPTool Template***
### The NMDC DMPTool Template

In partnership with the University of California Curation Center of the California Digital Library, the NMDC team has created a microbiome-specific DMPTool Template. DMPTool is an open-source application that assists researchers in the creation of data management plans compliant with federal funding requirements. The NMDC DMPTool template is funding-organization agnostic and was developed to support microbiome data management best practices with specifications unique to microbiome standards and data processing. Once you [create an account at DMPTool](https://dmptool.org/), this link will take you to the [NMDC Microbiome Omics Research DMP Template](https://dmptool.org/plans?plan%5Bfunder%5D%5Bid%5D=%7B+%22id%22%3A+4265%2C+%22name[…]Microbiome+Data+Collaborative%22+%7D&plan%5Btemplate_id%5D=1321) ***which provides step-by-step prompts for your DMP.***
In partnership with the [University of California Curation Center](https://cdlib.org/services/uc3/) of the [California Digital Library](https://cdlib.org/), the NMDC team has created a microbiome-specific [DMPTool Template](https://dmptool.org/plans?plan%5Bfunder%5D%5Bid%5D=%7B+%22id%22%3A+4265%2C+%22name%22%3A+%22National+Microbiome+Data+Collaborative%22+%7D&plan%5Borg%5D%5Bid%5D=%7B+%22id%22%3A+4265%2C+%22name%22%3A+%22National+Microbiome+Data+Collaborative%22+%7D&plan%5Btemplate_id%5D=1321). DMPTool is an open-source application that assists researchers in the creation of data management plans compliant with federal funding requirements. The NMDC DMPTool template is funding-organization agnostic and was developed to support microbiome data management best practices with specifications unique to microbiome standards and data processing. Once you [create an account at DMPTool](https://dmptool.org/), you will be directed to the [NMDC Microbiome Omics Research DMP Template](https://dmptool.org/plans?plan%5Bfunder%5D%5Bid%5D=%7B+%22id%22%3A+4265%2C+%22name[…]Microbiome+Data+Collaborative%22+%7D&plan%5Btemplate_id%5D=1321), ***which provides step-by-step prompts for your DMP.***

All of the sections below are laid out in the NMDC DMPTool template. This is a living document and the NMDC team welcomes community feedback on this resource.

![](../_static/images/howto_guides/data_mgt/data_mgt_list.png)
**Sample and Data Types and Sources:** This section outlines what kinds of data will be produced throughout the project.

1. Describe the data set, including basic identification information, average file size, and the estimated number and total volume of data files produced.
2. What types of data are being generated?
3. How are they being generated (tools and instruments)?
4. What analysis stages will the data go through?
5. Will you be using existing data for any of your findings?
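
For the size and volume estimate requested in the first question above, a rough calculation is usually enough. The sketch below is illustrative only: the data types, file counts, and average file sizes are hypothetical placeholders, not NMDC recommendations, and `DataType` and `estimate_total_gb` are names invented for this example. Replace the numbers with estimates from your own sequencing or instrument facility.

```python
from dataclasses import dataclass


@dataclass
class DataType:
    """One kind of data file the project expects to produce (hypothetical)."""
    name: str
    n_files: int        # estimated number of files
    avg_file_gb: float  # estimated average size of one file, in GB

    @property
    def total_gb(self) -> float:
        return self.n_files * self.avg_file_gb


def estimate_total_gb(data_types, replication_factor=1.0):
    """Sum the per-type volumes; the factor accounts for backup copies."""
    return sum(d.total_gb for d in data_types) * replication_factor


# Placeholder values for illustration only.
planned = [
    DataType("metagenome raw reads (FASTQ)", n_files=48, avg_file_gb=5.0),
    DataType("metatranscriptome raw reads (FASTQ)", n_files=24, avg_file_gb=4.0),
    DataType("assembled contigs (FASTA)", n_files=48, avg_file_gb=0.5),
]

for d in planned:
    print(f"{d.name}: {d.n_files} files x {d.avg_file_gb} GB = {d.total_gb} GB")

print("Total (single copy):", estimate_total_gb(planned), "GB")
print("Total (two copies):", estimate_total_gb(planned, replication_factor=2.0), "GB")
```

The same total can be reused later in the plan when estimating storage needs for the final data products.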

**Data Standards and Formats:** This section defines all variables of interest and communicates that you are aware of and will abide by community best practices whenever possible.

1. How is your data being processed?
2. What are the recognized community standards for your data and which will you follow?
3. How will your data adhere to [FAIR Data Principles](https://www.go-fair.org/fair-principles/)?
4. Who will ensure that the data standards and formats are maintained?
5. How will you define and categorize variables of interest that are not part of standard fields? List all variables of interest.

**Roles and Responsibilities:** This section shows how your data management plan will be executed and ensures that your team’s data management responsibilities are clearly defined.

1. Who is responsible for data storage and access, quality control, documentation, and preservation during the project?
2. Who is responsible for coordinating the various data once collection is complete?
3. What is your estimated budget for data management activities?

**Data Dissemination & Archiving:** This section describes what the final data products will be and how you will protect data, if applicable.

1. What are the anticipated data products? Include secondary products (publications, presentations, etc.).
2. When will your data be released?
3. Who is the target audience for your data set?
4. Is the security of your data important for privacy reasons? If so, how do you intend to protect your data?

**Policies for data sharing, public access, and re-use:** This section communicates that you understand your funders data sharing policies and that you have a plan to ensure public availability.
Contributor

I read this as either "data sharing policies of the funder" or "data sharing policies of the funders." In the first case, I'd suggest changing funders to funder's (singular possessive). In the second case, I'd suggest changing funders to funders' (plural possessive). The "suggested change" snippet below shows the latter.

Suggested change
**Policies for data sharing, public access, and re-use:** This section communicates that you understand your funders data sharing policies and that you have a plan to ensure public availability.
**Policies for data sharing, public access, and re-use:** This section communicates that you understand your funders' data sharing policies and that you have a plan to ensure public availability.


1. How will you comply with your funders data policies? Most federal funders require the public availability of all data produced.
Contributor

I read this as either "data policies of the funder" or "data policies of the funders." In the first case, I'd suggest changing funders to funder's (singular possessive). In the second case, I'd suggest changing funders to funders' (plural possessive). The "suggested change" snippet below shows the latter.

Suggested change
1. How will you comply with your funders data policies? Most federal funders require the public availability of all data produced.
1. How will you comply with your funders' data policies? Most federal funders require the public availability of all data produced.

2. What are your data attribution standards for other researchers who may use your data?
3. Are there any inherent restrictions on the sharing of the data?

**Data and Sample Preservation:** This section communicates the sustainability plan for your data, showing your funder that the data products will last after the completion of the project.

1. Who is responsible for maintaining the data and metadata over time?
2. How much of your budget (if any) will be dedicated to data maintenance and preservation?
3. How much storage do you anticipate needing for your final data products?
4. How will you ensure adherence to your DMP?


### For more information
You can find more information about [Data Management Best Practices](https://microbiomedata.org/data-management/) on the NMDC website.