Determine if and how climate_environment will be used in the submission schema #586

Open
1 of 4 tasks
Tracked by #587 ...
mslarae13 opened this issue Jan 6, 2023 · 14 comments
Assignees
Labels
backlog Issue not assigned to a sprint or not completed during a sprint. Needs to be reprioritized. nmdc-schema-mixs-submission

Comments

@mslarae13
Contributor

mslarae13 commented Jan 6, 2023

I am not sure how to write this example according to the structure. I would like to see some completed examples.

Completion

  • Provide Montana an export from NCBI of metadata records that included this field and how they wrote it (@turbomam )
  • Provide Montana some examples of how it SHOULD be written
  • Update NMDC schema with examples (Montana)
  • Incorporate change into submission portal (Mark & kitware)
@turbomam
Member

I do think I can check off most of those boxes within a day or two into next week

@turbomam
Member

turbomam commented May 21, 2023

GSC's MIxS Specification

from https://github.com/GenomicsStandardsConsortium/mixs/blob/main/mixs/excel/mixs_v6.xlsx

| Environmental package | agriculture | plant-associated |
|---|---|---|
| Structured comment name | climate_environment | climate_environment |
| Package item | climate environment | climate environment |
| Definition | Treatment involving an exposure to a particular climate; treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple climates | Treatment involving an exposure to a particular climate; treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple climates |
| Expected value | climate name;treatment duration;interval;experimental duration | climate name;treatment interval and duration |
| Value syntax | {text};{period};{interval};{period} | {text};{Rn/start_time/end_time/duration} |
| Example | | tropical climate;R2/2018-05-11T14:30/2018-05-11T19:30/P1H30M |
| Requirement | C | X |
| Preferred unit | | |
| Occurrence | m | m |
| MIXS ID | MIXS:0001040 | MIXS:0001040 |
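
To make the plant-associated value syntax concrete, here is a minimal Python sketch that splits a value of the form `{text};{Rn/start_time/end_time/duration}` into its parts. The regular expression and the function name `parse_climate_environment` are illustrative assumptions, not part of MIxS. Note that the spec example writes the duration as `P1H30M`, whereas strict ISO 8601 would put a `T` before the time components (`PT1H30M`), so the sketch keeps the duration as an opaque string rather than parsing it.

```python
import re
from datetime import datetime

# Illustrative pattern for "{text};{Rn/start_time/end_time/duration}".
# This regex is an assumption made for this sketch, not something MIxS defines.
CLIMATE_ENV = re.compile(
    r"^(?P<climate>[^;]+);"
    r"R(?P<repeats>\d+)/"
    r"(?P<start>[^/]+)/"
    r"(?P<end>[^/]+)/"
    r"(?P<duration>P[^/]+)$"
)


def parse_climate_environment(value):
    """Return the parts of a structured value, or None if it does not match."""
    match = CLIMATE_ENV.match(value.strip())
    if match is None:
        return None
    try:
        start = datetime.fromisoformat(match["start"])
        end = datetime.fromisoformat(match["end"])
    except ValueError:
        return None
    return {
        "climate": match["climate"].strip(),
        "repeats": int(match["repeats"]),
        "start": start,
        "end": end,
        # Kept as text: the spec example "P1H30M" is not strict ISO 8601.
        "duration": match["duration"],
    }


example = "tropical climate;R2/2018-05-11T14:30/2018-05-11T19:30/P1H30M"
parsed = parse_climate_environment(example)
unstructured = parse_climate_environment("Mediterranean, subtropical")
```

A free-text value such as `Mediterranean, subtropical` has no `;` separator, so the function returns `None` for it.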

BBOP relational version of NCBI biosample_set:

select
	value,
	count(1)
from
	all_attribs aa
where
	aa.harmonized_name = 'climate_environment'
group by
	value
having
	count(1) > 1
order by
	count(1) desc;
| value | count |
|---|---|
| Mediterranean, subtropical | 4120 |
| not applicable | 1283 |
| NA | 763 |
| not collected | 482 |
| Humid subtropical | 392 |
| Lab microcosm | 192 |
| greenhouse | 113 |
| Warm temperate (Cfb) | 95 |
| Boreal (Dfb) | 88 |
| control conditions | 40 |
| controlled conditions, branch partially covered by plastic bag | 36 |
| freeze-thaw | 35 |
| continental with Mediterranean influences | 27 |
| Humid continental climate | 25 |
| subalpine | 24 |
| riparian zone | 23 |
| controlled conditions | 18 |
| freeze | 17 |
| thaw | 17 |
| submediterranean | 14 |
| drought | 14 |
| Controlled | 12 |
| tropical wet and dry climate | 12 |
| cold | 8 |
| heat | 8 |
| missing | 7 |
| Tropical | 7 |
| 2 weeks cold storage | 6 |
| Dry | 6 |
| 3 weeks cold storage | 6 |
| 4 weeks cold storage | 6 |
| 5 weeks cold storage | 6 |
| Orchard at harvest | 6 |
| Agricultural environment | 6 |
| Dry, Hot | 5 |
| Temperate | 5 |
| Common Garden, Flooding | 3 |
| Common Garden, Control | 3 |
| Common Garden, Drougth | 3 |
| Greenhouse, Heat, CO2 | 3 |
| https://kare.ucanr.edu/Weather_Physical_-_Biological_Data/ | 3 |
| Greenhouse, Drought, Heat, CO2 | 3 |
| Greenhouse, Flooding, Heat | 3 |
| temperate climate | 3 |
| Greenhouse, Drought, Heat | 3 |
| Greenhouse, Heat | 3 |
| Greenhouse, Flooding, Heat, CO2 | 3 |
| common garden setup | 2 |
| KG biological replicates 3 | 2 |
| KG biological replicates 1 | 2 |
| ambient conditions | 2 |
| S biological replicates 2 | 2 |
| desertic | 2 |
| CK biological replicates 3 | 2 |
| KG biological replicates 2 | 2 |
| watered | 2 |
| S biological replicates 1 | 2 |
| none | 2 |
| Not applicable | 2 |
| S biological replicates 3 | 2 |
| CK biological replicates 2 | 2 |
| CK biological replicates 1 | 2 |
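
As a rough way to read the counts above, the following Python sketch sorts values into three buckets: missing-value placeholders, values that use the structured `;` separator, and other free text. The `MISSING` set and the bucket names are assumptions made for illustration, and only a subset of the rows is copied in to keep the example short.

```python
from collections import Counter

# A subset of the (value, count) rows from the query output above.
counts = {
    "Mediterranean, subtropical": 4120,
    "not applicable": 1283,
    "NA": 763,
    "not collected": 482,
    "Humid subtropical": 392,
    "Lab microcosm": 192,
    "greenhouse": 113,
    "missing": 7,
    "none": 2,
    "Not applicable": 2,
}

# Hypothetical list of missing-value placeholders, compared case-insensitively.
MISSING = {"not applicable", "na", "not collected", "missing", "none"}


def bucket(value):
    """Classify one raw value into a coarse bucket."""
    normalized = value.strip().lower()
    if normalized in MISSING:
        return "placeholder"
    if ";" in value:
        return "structured"
    return "free text"


def tally(value_counts):
    """Sum record counts per bucket."""
    totals = Counter()
    for value, n in value_counts.items():
        totals[bucket(value)] += n
    return totals


totals = tally(counts)
```

Because `Counter` returns 0 for absent keys, `totals["structured"]` can be read directly even when no value contains a `;`.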

@turbomam
Member

turbomam commented May 21, 2023

Any Biosamples with a climate_environment value in the current NMDC production MongoDB?

db.getCollection("biosample_set").find( { climate_environment : { $exists : true } } );

0

@turbomam turbomam moved this from 🔖 Ready to 🏗 In progress in SubPort Squad Issues May 21, 2023
@turbomam turbomam changed the title Submission portal- soil template- climate environment Determine if and how climate_environment will be used in the submission schema May 21, 2023
@turbomam turbomam changed the title Determine if and how climate_environment will be used in the submission schema Determine if and how climate_environment will be used in the submission schema May 21, 2023
@ssarrafan
Collaborator

Adding to current sprint per Mark. Need feedback from @mslarae13

@mslarae13
Contributor Author

@turbomam the NCBI examples are quite variable, and not what I'd expect. I am not surprised there's no time or duration for climate manipulation, but rather people just describe the comment.

I think, considering the examples we have, this field should just be a way to describe the climate_environment, not a "duration of treatment".

So from the plant example "tropical climate;R2/2018-05-11T14:30/2018-05-11T19:30/P1H30M" would just be "tropical climate"

@turbomam thoughts?

@turbomam
Member

turbomam commented Jun 1, 2023

@mslarae13 That's fine with me. Do you want to allow any string, or would you like to have a validation pattern, or some enumerated values?
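
For comparison, the three options could look roughly like this in Python. This is only a sketch: the regular expression and the enumerated values are hypothetical placeholders, not proposals for the schema.

```python
import re
from enum import Enum


def validate_any_string(value):
    """Option 1: accept any non-empty string."""
    return bool(value.strip())


# Option 2: a validation pattern. This regex is a hypothetical example that
# allows letters, spaces, commas, parentheses, and hyphens only.
CLIMATE_PATTERN = re.compile(r"^[A-Za-z][A-Za-z ,()-]*$")


def validate_pattern(value):
    return CLIMATE_PATTERN.fullmatch(value) is not None


# Option 3: enumerated values. These members are hypothetical placeholders.
class Climate(Enum):
    TROPICAL = "tropical climate"
    TEMPERATE = "temperate climate"
    HUMID_CONTINENTAL = "Humid continental climate"


ALLOWED = {member.value for member in Climate}


def validate_enum(value):
    return value in ALLOWED


def check_all(value):
    """Run one value through all three options."""
    return {
        "any_string": validate_any_string(value),
        "pattern": validate_pattern(value),
        "enum": validate_enum(value),
    }
```

With these placeholder rules, `tropical climate` passes all three checks, `Warm temperate (Cfb)` passes the pattern but not the enumeration, and the full MIxS example with the `R2/...` schedule passes only the any-string check.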

@ssarrafan
Collaborator

Discussion seems to be ongoing, moving to new sprint

@mslarae13
Contributor Author

Name is misleading and should be climate treatment... People are putting biome and other information that should be in a different column here... Ramona suggests deprecating this term. I will put an issue into GSC

@ramonawalls

Nearly all of the values in this slot are wrong and should go into one of the ENVO slots (e.g., biome). There is a legitimate need to record information about experimental environmental conditions, but there are either existing slots for that information or we should add new one(s) that are less confusing.

I recommend deprecating this term in MIxS and replacing it mostly with existing terms. If there is some environmental data that can't be captured, we can create new terms that are clearer.

@mslarae13
Contributor Author

@turbomam
Should we remove this from NMDC now? Or wait for GSC update?

Also, we pulled this term into the soil package. It can go away. (GSC only has it in agriculture & plant)

@pkalita-lbl FYI

@turbomam
Member

turbomam commented Jun 7, 2023

I'll remove in 7.6.1

@turbomam turbomam moved this from 🏗 In progress to 👀 In review/Pending Release in SubPort Squad Issues Jun 12, 2023
@ssarrafan ssarrafan added the backlog Issue not assigned to a sprint or not completed during a sprint. Needs to be reprioritized. label Jun 20, 2023
@mslarae13
Contributor Author

@turbomam did this get completed?

@mslarae13
Contributor Author

Decision: Will deprecate this term.
@sierra-moxon has created a deprecation protocol for NMDC. We'll discuss implementing this in GSC and will deprecate this term when ready.

4 participants