Relax certain required: true slots in schema #583
Comments
alt: I agree, elevation is typically used for soil. The env triad is required for data search. Depth, if we're talking about soils, should be required. It's a vital slot for data reuse (as are geographic location / lat lon, and I'll suggest we make those required). I don't know what subsurface depth is. One thing to consider: as the class Biosample currently sits, depth, while important for soil, sediment, and water, isn't relevant for plants. We will need to think about "what is required for all biosamples" vs. certain types.
Regarding the env triad, the choices in the general case are:
Option 1 adds overhead, because updates have to be made later using change sheets. Option 2 adds some complexity to the ingest, in that we essentially have to merge two curation streams. Note that in the specific case of BioScales, we need to merge two streams anyway. Here is the spreadsheet that we got from ORNL. It includes the triad. It also includes other metadata we need to load.
Apologies, some of the slots I mentioned in the above list are not enforced as
@sujaypatil96 moving to the next sprint but please let me know if you won't be actively working on it for the next few weeks.
@ssarrafan we plan to address this at the metadata call today.
After a brief discussion, @turbomam and I agree with point 2 from @cmungall: keep the schema strict and force annotation prior to ingest. This approach is only possible in this case because ORNL has provided a supplementary file.
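A minimal sketch of what "force annotation prior to ingest" could look like, assuming biosample records and the supplementary file are both available as lists of dictionaries keyed by a shared identifier. The function names, the `id` key, the sample identifiers, and the placeholder term strings are hypothetical and are not taken from the NMDC ingest code; only the three env triad slot names come from this thread.

```python
# Slots from the env triad discussed in this issue.
REQUIRED_SLOTS = ("env_broad_scale", "env_local_scale", "env_medium")


def annotate(biosamples, supplement, key="id"):
    """Fill empty slots on each biosample from a supplementary record with the same key.

    Values already present on the biosample are never overwritten.
    """
    by_key = {row[key]: row for row in supplement}
    merged = []
    for sample in biosamples:
        combined = dict(sample)
        for slot, value in by_key.get(sample[key], {}).items():
            if combined.get(slot) in (None, ""):
                combined[slot] = value
        merged.append(combined)
    return merged


def missing_required(sample, required=REQUIRED_SLOTS):
    """Return the required slots that are still empty on a sample."""
    return [slot for slot in required if sample.get(slot) in (None, "")]


# Hypothetical records; the term strings are placeholders, not real ENVO terms.
biosamples = [
    {"id": "sample-1", "env_broad_scale": "<broad scale term>"},
    {"id": "sample-2"},
]
supplement = [
    {"id": "sample-1", "env_local_scale": "<local scale term>", "env_medium": "<medium term>"},
]

merged = annotate(biosamples, supplement)
# Samples that would still fail a strict schema after annotation.
blocked = {s["id"]: missing_required(s) for s in merged if missing_required(s)}
```

Under these assumptions, only records that remain in `blocked` would need further curation before ingest; the schema itself stays unchanged.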
See #612.
Leave envo required. We should be able to populate these for soil via GOLD, in addition to what Stan provided.
We may have to revisit leaving all the envo slots as required: true. Per Reddy, envo terms don't exist for endosphere, so env_broad_scale is populated but env_local_scale and env_medium are not for the BioScales endosphere samples. @mslarae13 @cmungall @emileyfadrosh
@sujaypatil96 is this issue still being worked on? I'll move it to the next sprint due to the current activity, but let me know if it can be closed or if it needs to go to the backlog.
So far we've found workarounds for the environmental terms, so those are still required for now.
GOLD filled in missing values for the MIxS environmental triad for the BioScales project, so we've decided not to relax the schema and to leave it as is. At the moment, at least, we don't need the changes from the original request, so I think this issue can be closed.
There are a few slots on the Biosample class in the NMDC Schema, most of which are set as required: true in the schema.
This issue requests that the required: true constraint on certain slots on the Biosample class be changed to recommended: true, in order to accommodate the fetching of biosample records from upstream sources such as the GOLD database.
The following slots are the ones on which we are requesting the relaxation:
alt
depth
subsurface_depth
See microbiomedata/sample-annotator#113 for more details.
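The requested change can be sketched as follows, assuming the schema has been loaded into a plain dictionary and that the constraints live under the Biosample class's slot_usage. The dictionary below is a hypothetical stand-in, not the actual NMDC Schema, and `relax_required` is an illustrative helper rather than an existing tool.

```python
from copy import deepcopy


def relax_required(schema, class_name, slot_names):
    """Return a copy of the schema with the named slots recommended instead of required."""
    relaxed = deepcopy(schema)
    usage = relaxed["classes"][class_name].setdefault("slot_usage", {})
    for name in slot_names:
        slot = usage.setdefault(name, {})
        slot["required"] = False
        slot["recommended"] = True
    return relaxed


# Hypothetical schema fragment; the real Biosample definition may differ.
schema = {
    "classes": {
        "Biosample": {
            "slot_usage": {
                "alt": {"required": True},
                "depth": {"required": True},
                "subsurface_depth": {"required": True},
                "env_broad_scale": {"required": True},
            }
        }
    }
}

relaxed = relax_required(schema, "Biosample", ["alt", "depth", "subsurface_depth"])
```

Slots not named in the call, such as env_broad_scale here, keep their original constraint, and the input dictionary is left untouched.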