Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a temporary NCBI prefix expansion #260

Merged
merged 1 commit into from
Oct 2, 2024
Merged

Conversation

turbomam
Copy link
Member

@turbomam turbomam commented Sep 26, 2024

a prefix of NCBI was added to our data via the submission portal without also being added to our schema.

Example:

  host_taxid:
    has_raw_value: Brachypodium distachyon [NCBI:txid15368]
    term:
      id: NCBI:txid15368

We will need to talk more about what prefix and expansion to use for NCBI taxonomy. If we are going to use NCBI as a permanent prefix, them we must ensure that its expansion can be used to access all entities "owned" by NCBI

I acknowledge that the style (NCBI:txid15368) is encouraged by NCBI and MIxS, but putting a semantic part of the prefix (txid) after the colon is at the very least a bad smell

This PR adds a temporary expansion of

  NCBI: "http://example.com/ncbitaxon/" 

See also


Guidelines

Soft Schema Freeze

The nmdc-schema and berkeley-schema-fy24 schemas are under a soft freeze, which means changes should not be made that have any downstream implications. To ensure this, all PRs created during the freeze will be closely reviewed with every component of the NMDC system in mind.

Reviewers

To ensure no changes are made unexpectedly, PR creators will use this PR template to tag and notify all task coordinators. Review should be specifically requested from all Berkeley Schema Roll Out task coordinators that you expect to be affected by this PR.

We expect task coordinators to review PRs and provide feedback/approval within 1 week of when they are identified as reviewers.

PRs will NOT be merged until all task coordinators (or their delegates) have approved it; either here on GitHub (via "Review changes > Approve" or an equivalent comment) or verbally.

Expedition, questions, and discussion can happen at any meeting.

Delays in review & merging should be addressed in meetings or with NMDC leadership.

If you expect the changes to
impact this component...
...request a review
from this person
Metadata
Schema
@mslarae13
Runtime
Mongo database
Database migrations
@eecavanna,
who will pull in
@shreddd as needed
Postgres
Ingest
@naglepuff
Data Portal @aclum
Workflows: MG & MT @mbthornton-lbl
Workflows: MetaB & NOM @corilo
Workflows: LipidO @kheal
Workflows: MetaP @SamuelPurvine
ETL code @sujaypatil96
Jupyter notebooks @brynnz22

PR Information

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation
  • Schema change: Structure and content
    • created, updated, or deleted a class, slot, or enum
    • changed whether a slot is multivalued
    • changed the way a slot is assigned to a class
    • changed the permissible_values of an enum
    • etc.
  • Schema change: Cleanup and preparation
    • updated the description of a class, slot, or enum
    • updated the mappings of a class, slot, or enum to an ontology
    • added an enum for future use (it is not in the range of any slot)
    • etc.

Description

PRs should be small and concise.

Aim to create small, focused pull requests that fulfill a single purpose. Smaller pull requests are easier and faster to review and merge, leave less room to introduce bugs, and provide a clearer history of changes.

  • Replace this text with a description of what this PR branch contains. Please keep in mind that all reviewers will be reading this description. Example: "In this branch, I..."

Related Issues

All PRs should relate to or fix an issue(s). Please identify the issue(s) below.

  • Related Issue(s): #
  • Fixes: #

Did you add/update any tests?

  • Yes
  • No (Add a justification below)
  • I need help with writing tests

Could this schema change make it so any valid data becomes invalid?

This is a question about what the schema allows. It is not a question about what happens to exists in the NMDC database right now.

Example: If, in this PR branch, you renamed a slot from foo to foo_bar, the answer to this question would be "yes," even if nothing in the NMDC database currently uses the foo slot.

More examples: slot or class name changes, changes to a slot's multivalued state, changes to a slot's range (e.g. string to integer), changes to slot assignments to classes, changes to an enum's permissible_values

  • Yes (A migrator is required)
  • No
  • I need help determining this

If you answered "Yes", does this PR branch include that migrator?

  • Yes
  • No, this PR is incomplete and I need help writing the migrator

Does this PR have any downstream implications?

Examples: any change here that requires a change to workflows, workflow automation, the Mongo-to-Postgres ingest process, Jupyter notebooks, the Runtime, etc.

  • Yes (Explain below)
  • No

Copy link

PR Preview Action v1.4.8
🚀 Deployed preview to https://microbiomedata.github.io/berkeley-schema-fy24/pr-preview/pr-260/
on branch gh-pages at 2024-09-26 18:27 UTC

@eecavanna
Copy link

eecavanna commented Sep 26, 2024

Hi @turbomam, is this PR "ready for review"?

If not, I'd recommend changing its status to "Draft." Otherwise, I'd recommend filling out the "PR template" and customizing the "Description" section. GitHub notified me about it because my username is tagged in the PR's description field.

@turbomam turbomam marked this pull request as draft September 27, 2024 18:04
@turbomam turbomam changed the title temporary-ncbi-expansion-3 temporary ncbi prefix expansion Sep 27, 2024
@turbomam turbomam marked this pull request as ready for review October 2, 2024 18:05
@aclum
Copy link
Collaborator

aclum commented Oct 2, 2024

As far as I'm aware the discussion has not been resolved

@aclum
Copy link
Collaborator

aclum commented Oct 2, 2024

This resolves https://www.ncbi.nlm.nih.gov/taxonomy/?term=txid15368, id prefer a real expansion if possible.

@eecavanna
Copy link

eecavanna commented Oct 2, 2024

Per our conversation in today's metadata meeting, here's the existing issue that will still be an issue even after this gets merged in:

@eecavanna eecavanna changed the title temporary ncbi prefix expansion Add a temporary NCBI prefix expansion Oct 2, 2024
@turbomam turbomam merged commit 5829cc9 into main Oct 2, 2024
3 checks passed
@turbomam turbomam deleted the temporary-ncbi-expansion-3 branch October 2, 2024 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants