Water content validation and structure #148

Closed
Tracked by #587
mslarae13 opened this issue Aug 5, 2022 · 21 comments
Assignees
Labels
Prod code release needed Updates made, production not currently using this version. Code push required to see in prod

Comments

@mslarae13

Relates to #143 which was the interim fix.

Need to improve how we will validate water content and how we will parse it.

@ssarrafan

@mslarae13 @turbomam is this actively being worked on?

@pkalita-lbl
Collaborator

The task here is to add these examples from the linked issue into the checked examples in the submission schema repo as well as examples shown to the user in DH:

percent dry or wet weight % (75%, 75 %, .75)
g of water / g dry soil (5 g water / g dry soil)
cubic centimeter per cubic centimeter
Water holding capacity (0.75, 75% water, .75 g water per g soil WHC)
water filled pore space (60% WFPS)

@pkalita-lbl pkalita-lbl assigned pkalita-lbl and unassigned turbomam Mar 8, 2023
@pkalita-lbl
Collaborator

End of sprint update: this one came in late, but I intend to work on it in the next sprint.

@pkalita-lbl
Collaborator

continue to next sprint

@mslarae13
Author

mslarae13 commented Apr 4, 2023

Make water content method an enumeration?
Flexible enough to put a DOI with the enumerations? Must be URL encodable, but it's manageable
Give permissible values such as WHC, each with a meaning that points to the DOI describing the protocol.
If users didn't follow one of these methods, they should submit a ticket, issue, or feedback to request that an additional method be added to the enumeration.

Montana will work with Patrick to get the enumerated values for method of water capacity

Don't smoosh the text and the meaning into the drop down. Leave the text as "water filled pore space"

  • Long term, get DH smarter to pull permissible value descriptions and meanings into the help side bar (mid term plan)
  • Short term, put comment field with the slot information
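
A minimal sketch of the enumeration idea above, keeping the drop-down text separate from the meaning. The `PermissibleValue` class, the `doi_url` helper, and both DOIs are hypothetical placeholders for illustration; the real protocol DOIs have not been chosen in this thread.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class PermissibleValue:
    text: str     # what the user sees in the drop-down
    meaning: str  # protocol identifier, kept out of the drop-down text


# Placeholder DOIs: made up for illustration, NOT real protocol references.
WATER_CONTENT_METHODS = [
    PermissibleValue("water holding capacity", "doi:10.0000/placeholder-whc"),
    PermissibleValue("water filled pore space", "doi:10.0000/placeholder<wfps>"),
]


def doi_url(value: PermissibleValue) -> str:
    """Build a resolver URL, percent-encoding anything that is not URL safe."""
    doi = value.meaning.removeprefix("doi:")  # requires Python 3.9+
    return "https://doi.org/" + quote(doi, safe="/")


dropdown_texts = [pv.text for pv in WATER_CONTENT_METHODS]
```

The drop-down only ever shows `text`; the `meaning` can be surfaced separately, for example in a comment field or help side bar.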

Could we make the side bar in DH link to the web documentation page?

  • Concern: LinkML documentation can be hard for people to understand if they aren't familiar with it. We can't assume people understand it
  • But, small change to show it and it's there for people to get familiar with

@pkalita-lbl
Collaborator

So the path forward, based on my understanding of today's discussion in the submission portal squad meeting:

  • Update the water_cont_soil_meth ("water content method") slot to specify enumerated values. The permissible values will be well-known measurement protocols (e.g. "percent wet weight", "water holding capacity"). In the schema we will use the meaning of each permissible value to point to a DOI describing the method more precisely. The user-facing description of the water_cont_soil_meth slot also needs to make clear what each permissible value means.
  • Since the water_cont_soil_meth slot specifies information about the measurement protocol, that information does not need to be encoded or validated in the water_content ("water content") slot. The validation pattern should just be a number with an optional percent sign or a limited set of units (e.g. "g", "g/g", "cc/cc").
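
To make the second point concrete, here is a rough sketch of what such a pattern could look like. The regex and the `is_valid_water_content` helper are assumptions for illustration, not the pattern in the schema; the unit list is just the three examples named above.

```python
import re

# A number (75, 0.75, .75), then optionally a percent sign or one of a
# small, fixed set of units. The unit list is illustrative only.
WATER_CONTENT_PATTERN = re.compile(
    r"(?:\d+(?:\.\d+)?|\.\d+)\s*(?:%|g/g|cc/cc|g)?"
)


def is_valid_water_content(text: str) -> bool:
    """Return True if the whole string is a number with an allowed suffix."""
    return WATER_CONTENT_PATTERN.fullmatch(text.strip()) is not None
```

Under this sketch `75%`, `75 %`, `.75` and `0.75 g/g` pass, while `60% WFPS` fails because the method name would belong in water_cont_soil_meth instead.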

@pkalita-lbl
Collaborator

This isn't going to be finished in this sprint since we kind of needed to change direction on it last minute. At this point the next step is for @mslarae13 and me to put an initial stake in the ground for "water content method" permissible values.

@ssarrafan

I'll remove from sprint and add to the POST GSP backlog. @pkalita-lbl @mslarae13

@mslarae13 mslarae13 moved this from 🔖 Ready to 🏗 In progress in SubPort Squad Issues May 12, 2023
@ssarrafan

Adding backlog label, removing from sprint.

@mslarae13
Author

@pkalita-lbl ... @turbomam I could still use your thoughts on my comment above

@mslarae13
Author

@ssarrafan goal is to have this done this sprint

@pkalita-lbl
Collaborator

Sorry I guess I missed that comment somehow. I don't have strong feelings on it one way or the other. If you think a unit column makes the most sense I trust your judgement!

@ssarrafan

> @ssarrafan goal is to have this done this sprint

ok will add to current sprint

@mslarae13
Author

OK, let's do that then, because putting the value and units together is messy, and standard units with a protocol are looking impossible.
@pkalita-lbl Can you add this slot?

@pkalita-lbl
Collaborator

Recapping discussions with Mark and Montana:

  • Similar to having an enumerated list of methods, an enumerated list of units is also not really feasible.
  • If the unit portion is going to be an arbitrary string, it might as well be an arbitrary string in the water_content slot, as opposed to introducing a new, separate unit slot
  • We will validate water_content as {value} {unit} where the value part is numeric and the unit part is an arbitrary string.
  • Because of the heterogeneity this allows we won't be able to use these values for faceted searching and filtering. That's a limitation we're choosing to accept.
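
As a rough illustration of the `{value} {unit}` rule, the sketch below assumes the two parts are separated by whitespace and that the unit is required. The regex and the `parse_water_content` helper are hypothetical, not the pattern that will go into the submission schema.

```python
import re
from typing import Optional, Tuple

# Numeric value, whitespace, then an arbitrary non-empty unit string.
# Both the required whitespace and the required unit are assumptions.
VALUE_UNIT = re.compile(r"(?P<value>\d+(?:\.\d+)?|\.\d+)\s+(?P<unit>\S.*)")


def parse_water_content(text: str) -> Optional[Tuple[float, str]]:
    """Split '{value} {unit}' into (float, unit string); None if it doesn't fit."""
    match = VALUE_UNIT.fullmatch(text.strip())
    if match is None:
        return None
    return float(match.group("value")), match.group("unit")
```

For example, `parse_water_content("5 g water / g dry soil")` gives `(5.0, "g water / g dry soil")`. Because the unit is free text, the parsed unit strings are not comparable across submissions, which is the faceting limitation noted above.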

Next steps:

  • I will add any necessary regex patterns to the existing slots along with testing examples to the submission schema

@ssarrafan

Based on last comment I'll move this to the next sprint.

@pkalita-lbl
Collaborator

Changes are in the submission schema (microbiomedata/submission-schema#124) and have been released as part of v7.6.5. Still need to bring that version into the submission portal.

@pkalita-lbl
Collaborator

This update is now on https://data-dev.microbiomedata.org/

@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in SubPort Squad Issues Jun 15, 2023
@github-project-automation github-project-automation bot moved this from 👀 Schema to ✅ Done in Post GSP Backlog Jun 15, 2023
@mslarae13
Author

@pkalita-lbl is this in prod?

@pkalita-lbl
Collaborator

No. Last production portal release was June 9. Submission schema v7.6.5 went into the portal codebase about a week after that. Therefore, this change is still only available on dev.

@mslarae13 mslarae13 added the Prod code release needed Updates made, production not currently using this version. Code push required to see in prod label Jul 7, 2023