feat(release): generate descendant mapping for tissues and cells #100

Merged (49 commits) on Mar 15, 2024

@Bento007 Bento007 commented Mar 8, 2024

Reason for Change

Changes

  • add a script to generate the descendant mapping. It's based on a script in single-cell-curation
  • update the GHA to install the local API so it can run the descendant mapping generation.

Testing steps

  • Added test cases
  • Verified that a PR is auto-generated for descendant mapping updates: chore: update ontology decendant mappings #116
  • Added a GHA trigger: runs every Monday at midnight, on workflow_dispatch, or when the ontology assets are updated
  • Compared against the tissue_descendants.json and cell_type_descendant.json generated by the single-cell-curation repo. Results:
------Comparing Ontology Guide and single-cell-curation Cell Type Descendant
KEYS: In Ontology Guide not in single-cell-curation
	 {'CL:4030031'}
DESCENDANT: In Ontology Guide  not in single-cell-curation
	 CL:0000738 {'CL:2000054'}
	 CL:0002320 {'CL:0002554'}
	 CL:0002319 {'CL:4033051', 'CL:4033052'}
	 CL:0000117 {'CL:4033051', 'CL:4033052'}
	 CL:0000057 {'CL:0002554'}
	 CL:0000540 {'CL:4033051', 'CL:4033052'}
	 CL:0000219 {'CL:2000054'}
	 CL:0000542 {'CL:2000054'}
	 CL:0000988 {'CL:2000054'}

KEYS: In single-cell-curation not in Ontology Guide
	 None
DESCENDANT: In single-cell-curation not in Ontology Guide
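
A comparison like the one reported above could be produced along these lines. This is only a sketch, not the script used for this PR: the function name `compare_descendant_mappings`, the report keys, and the assumption that each mapping is a dict of term ID to list of descendant IDs are all hypothetical.

```python
from typing import Any, Dict, List


def compare_descendant_mappings(
    guide: Dict[str, List[str]], curation: Dict[str, List[str]]
) -> Dict[str, Any]:
    """Report keys and descendants present in one mapping but not the other."""
    guide_keys, curation_keys = set(guide), set(curation)
    report: Dict[str, Any] = {
        "keys_only_in_guide": guide_keys - curation_keys,
        "keys_only_in_curation": curation_keys - guide_keys,
        "descendants_only_in_guide": {},
        "descendants_only_in_curation": {},
    }
    # Only terms present in both mappings can have a descendant-level diff.
    for term in guide_keys & curation_keys:
        in_guide, in_curation = set(guide[term]), set(curation[term])
        if in_guide - in_curation:
            report["descendants_only_in_guide"][term] = in_guide - in_curation
        if in_curation - in_guide:
            report["descendants_only_in_curation"][term] = in_curation - in_guide
    return report
```

Using set differences keeps the report independent of the order in which descendants are listed in either file.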

Notes for Reviewer

@Bento007 Bento007 marked this pull request as draft March 8, 2024 23:00
from cellxgene_ontology_guide.ontology_parser import OntologyParser


def load_prod_datasets() -> Any:
Collaborator

this gets us to parity with the current system, but I'm still concerned about the fact this step means the mappings become outdated as soon as a new CL term is introduced.

Collaborator

to resolve, we'd either need to make the artifacts larger (how much larger?) to map all CL terms, or perhaps we can set up a mechanism to run this script periodically and update the mappings regularly.

Collaborator Author

Based on the needs of the frontend, we can update the descendant mappings outside of a schema update. We should run this at a regular cadence.

return entity_name


def key_organoids_by_ontology_term_id(entity_names: List[str]) -> Dict[str, str]:
Collaborator

is there an equivalent need for cell culture? if not, why do we also tag cell culture terms?

if entity_name in organoids_by_ontology_term_id:
    descendant_accept_list.append(organoids_by_ontology_term_id[entity_name])

if not descendant_accept_list:
Collaborator

i'm assuming doing this achieves parity with the current set-up, but just to confirm--we don't want to include self as a descendant nor do we want to include an empty list? wouldn't this cause certain terms to be "orphaned" and not appear in the filters? or is that not how it works?

the filter functionality.
"""
ontology_term_id = entity_name.replace(" (organoid)", "")
organoids_by_ontology_term_id[ontology_term_id] = entity_name
Collaborator

should this be a dictionary if the value can be derived from the key? thinking this should be a set and, where needed, we can append " (organoid)"
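
A rough sketch of the set-based alternative suggested here, in case it helps the discussion. The names `ORGANOID_SUFFIX`, `collect_organoid_term_ids`, and `tag_organoid` are hypothetical, and the sketch assumes organoid entities are always written as the term ID followed by " (organoid)".

```python
from typing import Iterable, Set

# Assumption: organoid entities are the ontology term ID plus this exact suffix.
ORGANOID_SUFFIX = " (organoid)"


def collect_organoid_term_ids(entity_names: Iterable[str]) -> Set[str]:
    """Return the bare ontology term IDs of all organoid-tagged entity names."""
    return {
        name[: -len(ORGANOID_SUFFIX)]
        for name in entity_names
        if name.endswith(ORGANOID_SUFFIX)
    }


def tag_organoid(ontology_term_id: str) -> str:
    """Rebuild the organoid entity name from the bare term ID when needed."""
    return ontology_term_id + ORGANOID_SUFFIX
```

With this shape, the membership checks stay the same (`term_id in organoid_term_ids`), and the tagged name is derived on demand instead of being stored as a dictionary value.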

descendant_accept_list.append(descendant)

# Add organoid descendants, if any.
if descendant in organoids_by_ontology_term_id:
Collaborator

might be a comp bio question but--in doing this, we will always mark a term as having both ontology_term_id and ontology_term_id (organoid) as descendants, if the ontology_term_id is in the accept list and is an organoid. Is that intended? It sounds like that could make sense, but I'm not certain
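
To make the behavior under discussion concrete, here is a minimal sketch of how the accept-list filtering appears to work, pieced together from the diff fragments in this review. It is not the merged implementation: `build_descendant_mapping`, its parameters, the exclusion of self, and the skipping of empty lists are assumptions for illustration.

```python
from typing import Dict, List, Set

# Assumption: organoid entities are the ontology term ID plus this exact suffix.
ORGANOID_SUFFIX = " (organoid)"


def build_descendant_mapping(
    term_ids: List[str],
    descendants_by_term: Dict[str, List[str]],
    organoid_term_ids: Set[str],
) -> Dict[str, List[str]]:
    """Map each accepted term to its accepted descendants, tagging organoids."""
    accepted = set(term_ids)
    mapping: Dict[str, List[str]] = {}
    for term_id in term_ids:
        accept_list: List[str] = []
        # A term's own organoid is treated as a descendant of the term.
        if term_id in organoid_term_ids:
            accept_list.append(term_id + ORGANOID_SUFFIX)
        for descendant in descendants_by_term.get(term_id, []):
            # Skip self and anything outside the accept list.
            if descendant == term_id or descendant not in accepted:
                continue
            accept_list.append(descendant)
            # Add organoid descendants, if any.
            if descendant in organoid_term_ids:
                accept_list.append(descendant + ORGANOID_SUFFIX)
        # Terms with no accepted descendants are omitted from the mapping.
        if not accept_list:
            continue
        mapping[term_id] = accept_list
    return mapping
```

Under these assumptions, an accepted descendant that is also an organoid always contributes two entries (the bare ID and the tagged ID), which is the case the question above is asking about.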

@Bento007 Bento007 merged commit 841fddf into main Mar 15, 2024
5 checks passed
@Bento007 Bento007 deleted the tsmith/decendent-mappings branch March 15, 2024 19:59
Bento007 pushed a commit that referenced this pull request Mar 15, 2024
🤖 I have created a release *beep* *boop*
---


<details><summary>python-api: 0.1.0</summary>

##
[0.1.0](python-api-v0.0.2...python-api-v0.1.0)
(2024-03-15)


### Features

* add data to the python package
([#87](#87))
([0eb6831](0eb6831))
* add is_valid_term_id method to OntologyParser
([#115](#115))
([72c2073](72c2073))
* include license file with python package
([#85](#85))
([2be3d81](2be3d81))
* refactor ancestry mapping to include distance from descendant node +
implement functions to support curated list term mapping
([#96](#96))
([7fc3562](7fc3562))
* refer to ontology source filenames in ontology_info and return that in
get_ontology_download_url
([#106](#106))
([ff9d826](ff9d826))
* split all_ontology into individual files.
([#93](#93))
([ead59e5](ead59e5))
* Support getting download link for ontology from source repo
([#86](#86))
([fd55b76](fd55b76))


### Misc

* automate testpypi releases
([#118](#118))
([b5a1a66](b5a1a66))
* clean-up ontology_parser single fetch and bulk fetch methods + account
for acceptable non-ontology terms
([#112](#112))
([2ef7435](2ef7435))
* **deps-dev:** bump semantic-version from 2.8.5 to 2.10.0 in
/api/python
([#98](#98))
([dfe0b39](dfe0b39))


### BugFixes

* imports for api
([4cd3386](4cd3386))
* update requirements
([#114](#114))
([9888f3d](9888f3d))
</details>

<details><summary>ontology-assets: 0.1.0</summary>

##
[0.1.0](ontology-assets-v0.0.1...ontology-assets-v0.1.0)
(2024-03-15)


### Features

* load GH Release Assets for schema version in memory
([#72](#72))
([58bad0a](58bad0a))
* refactor ancestry mapping to include distance from descendant node +
implement functions to support curated list term mapping
([#96](#96))
([7fc3562](7fc3562))
* refer to ontology source filenames in ontology_info and return that in
get_ontology_download_url
([#106](#106))
([ff9d826](ff9d826))
* **release:** generate descendant mapping for tissues and cells
([#100](#100))
([841fddf](841fddf))
* remove all-ontology.json.gz
([83fefd6](83fefd6))
* split all_ontology into individual files.
([#93](#93))
([ead59e5](ead59e5))


### Misc

* update ontology decendant mappings
([#117](#117))
([48451af](48451af))


### BugFixes

* lint errors
([f5e4583](f5e4583))
* Schema format and validation fixes.
([#113](#113))
([0465ee7](0465ee7))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Bento007 pushed a commit that referenced this pull request Apr 9, 2024
🤖 I have created a release *beep* *boop*
---


<details><summary>assets: 0.4.0</summary>

##
[0.4.0](assets-v0.3.0...assets-v0.4.0)
(2024-04-09)


### Features

* add function to fetch curated ontology term lists
([#141](#141))
([5c7db62](5c7db62))
* fetch ontology term descriptions, if available
([#181](#181))
([0120377](0120377))
* load GH Release Assets for schema version in memory
([#72](#72))
([58bad0a](58bad0a))
* refactor ancestry mapping to include distance from descendant node +
implement functions to support curated list term mapping
([#96](#96))
([7fc3562](7fc3562))
* refer to ontology source filenames in ontology_info and return that in
get_ontology_download_url
([#106](#106))
([ff9d826](ff9d826))
* **release:** generate descendant mapping for tissues and cells
([#100](#100))
([841fddf](841fddf))
* remove all-ontology.json.gz
([83fefd6](83fefd6))
* split all_ontology into individual files.
([#93](#93))
([ead59e5](ead59e5))
* upload assets on release
([#56](#56))
([84a1c5d](84a1c5d))


### Misc

* deprecate older version of cellxgene schema
([#172](#172))
([186e762](186e762))
* move curated lists to ontology-assets
([#48](#48))
([77916df](77916df))
* moving the generated artifacts
([c03c8e3](c03c8e3))
* release main
([#130](#130))
([0b37dc8](0b37dc8))
* release main
([#146](#146))
([4ca76f0](4ca76f0))
* release main
([#185](#185))
([9b2fe53](9b2fe53))
* release main
([#74](#74))
([e748fe9](e748fe9))
* release tsmith/release-assets
([63b782d](63b782d))
* release tsmith/release-assets
([#57](#57))
([6a6b02a](6a6b02a))
* update ontology decendant mappings
([#117](#117))
([48451af](48451af))
* update ontology decendant mappings
([#142](#142))
([fb23618](fb23618))
* update ontology decendant mappings
([#162](#162))
([12def74](12def74))
* update ontology descendant mappings
([#167](#167))
([5d3d097](5d3d097))
* update ontology descendant mappings
([#180](#180))
([65ca10f](65ca10f))


### BugFixes

* lint errors
([f5e4583](f5e4583))
* Schema format and validation fixes.
([#113](#113))
([0465ee7](0465ee7))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>