-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: split all_ontology into individual files. #93
Conversation
9334ae2
to
3916a1d
Compare
30d4734
to
1c7c43b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: needs to update filepaths in https://github.com/chanzuckerberg/cellxgene-ontology-guide/blob/main/.github/workflows/generate_all_ontology.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think of splitting these out into 1 schema per ontology? Technically this schema would allow for efo-ontology-v1.json, for example, to have CL terms in it, which should not be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds overly verbose and prescriptive of the expected format. If there was actually a difference in the shape of the json files or how we generate them, I'd say split them up.
@@ -2,7 +2,7 @@ | |||
|
|||
PACKAGE_ROOT = os.path.dirname(os.path.realpath(__file__)) | |||
DATA_ROOT = os.path.join(PACKAGE_ROOT, "data") | |||
ALL_ONTOLOGY_FILENAME = "all_ontology.json.gz" | |||
ONTOLOGY_FILENAME_SUFFIX = ".json.gz" | |||
ONTOLOGY_INFO_FILENAME = "ontology_info.json" | |||
ONTOLOGY_ASSET_RELEASE_URL = "https://github.com/chanzuckerberg/cellxgene-ontology-guide/releases/download" | |||
SCHEMA_VERSION_TO_ONTOLOGY_ASSET_TAG = {"5.0.0": "ontology-assets-v0.0.1"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can remove this mapping now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removing ONTOLOGY_ASSET_RELEASE_URL, and SCHEMA_VERSION_TO_ONTOLOGY_ASSET_TAG
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could keep this if we want to support loading files from non-packaged versions, and just say that requires a network call. But, that's not a requirement for MVP so we can get rid of it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removing for now. Less code to maintain
class CXGSchema: | ||
"""A class to represent the ontology information used by a cellxgene schema version.""" | ||
|
||
def __init__(self, version: Optional[str] = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets add a docstring explaining that version will default to latest, packaged version if not provided as input.
from entities import Ontology, OntologyFileType, OntologyVariant | ||
|
||
from cellxgene_ontology_guide.supported_versions import CXGSchema | ||
|
||
|
||
class OntologyParser: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thoughts on refactoring this class to also take an ontology_name, so we can load the ontology file to read from in init and not have to request that as an input in every method? i.e. they can initialize cl_parser = OntologyParser("5.0.0", "CL")
We can also take care of this in a follow-up PR, if you prefer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do that as a follow up
🤖 I have created a release *beep* *boop* --- ## [0.1.0](python-api-v0.0.1...python-api-v0.1.0) (2024-03-15) ### Features * add data to the python package ([#87](#87)) ([0eb6831](0eb6831)) * add is_valid_term_id method to OntologyParser ([#115](#115)) ([72c2073](72c2073)) * include license file with python package ([#85](#85)) ([2be3d81](2be3d81)) * load GH Release Assets for schema version in memory ([#72](#72)) ([58bad0a](58bad0a)) * refactor ancestry mapping to include distance from descendant node + implement functions to support curated list term mapping ([#96](#96)) ([7fc3562](7fc3562)) * refer to ontology source filenames in ontology_info and return that in get_ontology_download_url ([#106](#106)) ([ff9d826](ff9d826)) * split all_ontology into individual files. ([#93](#93)) ([ead59e5](ead59e5)) * Support getting download link for ontology from source repo ([#86](#86)) ([fd55b76](fd55b76)) ### Misc * clean-up ontology_parser single fetch and bulk fetch methods + account for acceptable non-ontology terms ([#112](#112)) ([2ef7435](2ef7435)) * **deps-dev:** bump semantic-version from 2.8.5 to 2.10.0 in /api/python ([#98](#98)) ([dfe0b39](dfe0b39)) ### BugFixes * imports for api ([4cd3386](4cd3386)) * lint errors ([f5e4583](f5e4583)) * python api releases ([bf0477e](bf0477e)) * update requirements ([#114](#114)) ([9888f3d](9888f3d)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
🤖 I have created a release *beep* *boop* --- ## [0.1.0](python-api-v0.0.1...python-api-v0.1.0) (2024-03-15) ### Features * add data to the python package ([#87](#87)) ([0eb6831](0eb6831)) * add is_valid_term_id method to OntologyParser ([#115](#115)) ([72c2073](72c2073)) * include license file with python package ([#85](#85)) ([2be3d81](2be3d81)) * load GH Release Assets for schema version in memory ([#72](#72)) ([58bad0a](58bad0a)) * refactor ancestry mapping to include distance from descendant node + implement functions to support curated list term mapping ([#96](#96)) ([7fc3562](7fc3562)) * refer to ontology source filenames in ontology_info and return that in get_ontology_download_url ([#106](#106)) ([ff9d826](ff9d826)) * split all_ontology into individual files. ([#93](#93)) ([ead59e5](ead59e5)) * Support getting download link for ontology from source repo ([#86](#86)) ([fd55b76](fd55b76)) ### Misc * clean-up ontology_parser single fetch and bulk fetch methods + account for acceptable non-ontology terms ([#112](#112)) ([2ef7435](2ef7435)) * **deps-dev:** bump semantic-version from 2.8.5 to 2.10.0 in /api/python ([#98](#98)) ([dfe0b39](dfe0b39)) ### BugFixes * imports for api ([4cd3386](4cd3386)) * lint errors ([f5e4583](f5e4583)) * python api releases ([bf0477e](bf0477e)) * update requirements ([#114](#114)) ([9888f3d](9888f3d)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
🤖 I have created a release *beep* *boop* --- ## [0.1.0](python-api-v0.0.2...python-api-v0.1.0) (2024-03-15) ### Features * add data to the python package ([#87](#87)) ([0eb6831](0eb6831)) * add is_valid_term_id method to OntologyParser ([#115](#115)) ([72c2073](72c2073)) * include license file with python package ([#85](#85)) ([2be3d81](2be3d81)) * refactor ancestry mapping to include distance from descendant node + implement functions to support curated list term mapping ([#96](#96)) ([7fc3562](7fc3562)) * refer to ontology source filenames in ontology_info and return that in get_ontology_download_url ([#106](#106)) ([ff9d826](ff9d826)) * split all_ontology into individual files. ([#93](#93)) ([ead59e5](ead59e5)) * Support getting download link for ontology from source repo ([#86](#86)) ([fd55b76](fd55b76)) ### Misc * clean-up ontology_parser single fetch and bulk fetch methods + account for acceptable non-ontology terms ([#112](#112)) ([2ef7435](2ef7435)) * **deps-dev:** bump semantic-version from 2.8.5 to 2.10.0 in /api/python ([#98](#98)) ([dfe0b39](dfe0b39)) ### BugFixes * imports for api ([4cd3386](4cd3386)) * update requirements ([#114](#114)) ([9888f3d](9888f3d)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
🤖 I have created a release *beep* *boop* --- <details><summary>python-api: 0.1.0</summary> ## [0.1.0](python-api-v0.0.2...python-api-v0.1.0) (2024-03-15) ### Features * add data to the python package ([#87](#87)) ([0eb6831](0eb6831)) * add is_valid_term_id method to OntologyParser ([#115](#115)) ([72c2073](72c2073)) * include license file with python package ([#85](#85)) ([2be3d81](2be3d81)) * refactor ancestry mapping to include distance from descendant node + implement functions to support curated list term mapping ([#96](#96)) ([7fc3562](7fc3562)) * refer to ontology source filenames in ontology_info and return that in get_ontology_download_url ([#106](#106)) ([ff9d826](ff9d826)) * split all_ontology into individual files. ([#93](#93)) ([ead59e5](ead59e5)) * Support getting download link for ontology from source repo ([#86](#86)) ([fd55b76](fd55b76)) ### Misc * automate testpypi releases ([#118](#118)) ([b5a1a66](b5a1a66)) * clean-up ontology_parser single fetch and bulk fetch methods + account for acceptable non-ontology terms ([#112](#112)) ([2ef7435](2ef7435)) * **deps-dev:** bump semantic-version from 2.8.5 to 2.10.0 in /api/python ([#98](#98)) ([dfe0b39](dfe0b39)) ### BugFixes * imports for api ([4cd3386](4cd3386)) * update requirements ([#114](#114)) ([9888f3d](9888f3d)) </details> <details><summary>ontology-assets: 0.1.0</summary> ## [0.1.0](ontology-assets-v0.0.1...ontology-assets-v0.1.0) (2024-03-15) ### Features * load GH Release Assets for schema version in memory ([#72](#72)) ([58bad0a](58bad0a)) * refactor ancestry mapping to include distance from descendant node + implement functions to support curated list term mapping ([#96](#96)) ([7fc3562](7fc3562)) * refer to ontology source filenames in ontology_info and return that in get_ontology_download_url ([#106](#106)) ([ff9d826](ff9d826)) * **release:** generate descendant mapping for tissues and cells ([#100](#100)) ([841fddf](841fddf)) * remove all-ontology.json.gz ([83fefd6](83fefd6)) * split all_ontology into individual files. ([#93](#93)) ([ead59e5](ead59e5)) ### Misc * update ontology decendant mappings ([#117](#117)) ([48451af](48451af)) ### BugFixes * lint errors ([f5e4583](f5e4583)) * Schema format and validation fixes. ([#113](#113)) ([0465ee7](0465ee7)) </details> --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
🤖 I have created a release *beep* *boop* --- <details><summary>assets: 0.4.0</summary> ## [0.4.0](assets-v0.3.0...assets-v0.4.0) (2024-04-09) ### Features * add function to fetch curated ontology term lists ([#141](#141)) ([5c7db62](5c7db62)) * fetch ontology term descriptions, if available ([#181](#181)) ([0120377](0120377)) * load GH Release Assets for schema version in memory ([#72](#72)) ([58bad0a](58bad0a)) * refactor ancestry mapping to include distance from descendant node + implement functions to support curated list term mapping ([#96](#96)) ([7fc3562](7fc3562)) * refer to ontology source filenames in ontology_info and return that in get_ontology_download_url ([#106](#106)) ([ff9d826](ff9d826)) * **release:** generate descendant mapping for tissues and cells ([#100](#100)) ([841fddf](841fddf)) * remove all-ontology.json.gz ([83fefd6](83fefd6)) * split all_ontology into individual files. ([#93](#93)) ([ead59e5](ead59e5)) * upload assets on release ([#56](#56)) ([84a1c5d](84a1c5d)) ### Misc * deprecate older version of cellxgene schema ([#172](#172)) ([186e762](186e762)) * move curated lists to ontology-assets ([#48](#48)) ([77916df](77916df)) * moving the generated artifacts ([c03c8e3](c03c8e3)) * release main ([#130](#130)) ([0b37dc8](0b37dc8)) * release main ([#146](#146)) ([4ca76f0](4ca76f0)) * release main ([#185](#185)) ([9b2fe53](9b2fe53)) * release main ([#74](#74)) ([e748fe9](e748fe9)) * release tsmith/release-assets ([63b782d](63b782d)) * release tsmith/release-assets ([#57](#57)) ([6a6b02a](6a6b02a)) * update ontology decendant mappings ([#117](#117)) ([48451af](48451af)) * update ontology decendant mappings ([#142](#142)) ([fb23618](fb23618)) * update ontology decendant mappings ([#162](#162)) ([12def74](12def74)) * update ontology descendant mappings ([#167](#167)) ([5d3d097](5d3d097)) * update ontology descendant mappings ([#180](#180)) ([65ca10f](65ca10f)) ### BugFixes * lint errors ([f5e4583](f5e4583)) * Schema format and validation fixes. ([#113](#113)) ([0465ee7](0465ee7)) </details> --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Reason for Change
Changes
ontologyCategory
to the top level in artifact-schemas/all_ontology_schema.json. This is because each ontology is save into isn't own file, so the terms do not need to be nested under their ontology name.CXGSchema
to access ontology terms.Testing steps
Notes for Reviewer