fix(schema 5.1.0): throw error if obsm is not type np.ndarray #859

Merged
merged 17 commits into from
May 6, 2024

@Bento007 Bento007 commented Apr 25, 2024

Reason for Change

Changes

  • Change the warning to an error when an obsm field is not of type numpy.ndarray.
  • Validate that obsm ndarrays whose keys do not match spatial or the X_ prefix have at least 1 column.
  • Split out the checks in _validate_obsm to make it clear why an error occurred.
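
For illustration, the first change can be sketched as a standalone type check. This is a minimal sketch under stated assumptions, not the code from this PR: `check_obsm_types` is a hypothetical helper, and the message text is modeled on the string quoted in the updated unit tests.

```python
import numpy as np


def check_obsm_types(obsm):
    """Return one error message per obsm entry that is not a numpy.ndarray.

    Hypothetical helper for illustration; the real check lives in _validate_obsm.
    """
    errors = []
    for key, value in obsm.items():
        if not isinstance(value, np.ndarray):
            errors.append(
                f"All embeddings have to be of 'numpy.ndarray' type, "
                f"'adata.obsm['{key}']' is {type(value)}."
            )
    return errors


# A plain list stands in for any non-ndarray value (for example a DataFrame).
example_obsm = {"X_umap": np.zeros((3, 2)), "harmony": [[0.0, 1.0]] * 3}
type_errors = check_obsm_types(example_obsm)
```

Keeping the type check separate from the shape checks means a non-ndarray value produces exactly one message that names the offending key and its type.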

Testing

  • Updated the unit tests to match the expected behavior
  • Added a test for unknown obsm keys

@Bento007 Bento007 requested a review from danieljhegeman April 25, 2024 23:10
@Bento007 Bento007 changed the title fix: throw error if obsm is not type np.ndarray fix[WIP]: throw error if obsm is not type np.ndarray Apr 25, 2024
"WARNING: All embeddings have to be of 'numpy.ndarray' type, 'adata.obsm['harmony']' is <class 'pandas.core.frame.DataFrame'>').",
]
assert validator.errors == [
"WARNING: All embeddings have to be of 'numpy.ndarray' type, 'adata.obsm['harmony']' is <class 'pandas.core.frame.DataFrame'>')."

@nayib-jose-gloria nayib-jose-gloria Apr 26, 2024

we should be consistent and prefix errors with ERROR (blocks validation) rather than WARNING (doesn't)
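
A rough sketch of the distinction being asked for, assuming validity is simply "no errors were recorded". `SketchValidator` and its method names are hypothetical and are not the validator's real API.

```python
class SketchValidator:
    """Hypothetical stand-in for the real validator; names are illustrative."""

    def __init__(self):
        self.errors = []
        self.warnings = []

    def add_error(self, message):
        # Errors carry the ERROR prefix and block validation.
        self.errors.append(f"ERROR: {message}")

    def add_warning(self, message):
        # Warnings carry the WARNING prefix and do not block validation.
        self.warnings.append(f"WARNING: {message}")

    @property
    def is_valid(self):
        # Assumption: only entries in self.errors make a dataset invalid.
        return len(self.errors) == 0


validator = SketchValidator()
validator.add_warning("embedding has fewer than two columns")
valid_with_only_warnings = validator.is_valid

validator.add_error("embedding is not a numpy.ndarray")
valid_with_error = validator.is_valid
```

With this split, a message stored in `errors` never starts with `WARNING:`, which is the inconsistency flagged in the test assertion above.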

@Bento007 Bento007 requested review from nayib-jose-gloria and MillenniumFalconMechanic and removed request for danieljhegeman April 29, 2024 22:28
@Bento007 Bento007 changed the title fix[WIP]: throw error if obsm is not type np.ndarray fix: throw error if obsm is not type np.ndarray Apr 29, 2024
@Bento007 Bento007 changed the title fix: throw error if obsm is not type np.ndarray fix(schema 5.1.0): throw error if obsm is not type np.ndarray Apr 29, 2024

@nayib-jose-gloria nayib-jose-gloria left a comment

Change looks good for ndarray check, but I think we're missing validation for the rest of this rule:

The value for each str key MUST be a numpy.ndarray of shape (n_obs, m), where n_obs is the number of rows in X and m >= 1.

We throw a warning if a non-spatial, non-X-prefix embedding value has a row count != n_obs; that should become an error as well. And I don't think we check for column count >= 1 at all; we warn on column count < 2 (and error for spatial / X-prefix embedding keys).
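
The quoted rule can be sketched as a check on the shape tuple alone. This is a minimal sketch, assuming the caller passes `value.shape` and the row count of X; `check_obsm_shape` is a hypothetical helper, and its message wording is illustrative rather than taken from the PR.

```python
def check_obsm_shape(key, shape, n_obs):
    """Return error messages if shape is not (n_obs, m) with m >= 1.

    Hypothetical helper: `shape` is the tuple from value.shape and
    `n_obs` is the number of rows in X.
    """
    errors = []
    if len(shape) < 2:
        # Without a second dimension there is no column count to inspect.
        errors.append(
            f"'adata.obsm['{key}']' must have at least two dimensions, "
            f"it has {len(shape)}."
        )
        return errors
    if shape[0] != n_obs:
        errors.append(
            f"'adata.obsm['{key}']' has {shape[0]} rows, expected {n_obs}."
        )
    if shape[1] < 1:
        errors.append(
            f"'adata.obsm['{key}']' must have at least one column, "
            f"it has {shape[1]}."
        )
    return errors


ok = check_obsm_shape("harmony", (3, 2), n_obs=3)
bad_rows_and_columns = check_obsm_shape("harmony", (4, 0), n_obs=3)
one_dimensional = check_obsm_shape("harmony", (3,), n_obs=3)
```

Each condition appends its own message, so a value that fails both the row and the column requirement reports both problems.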

f" 'adata.obsm['{key}']' has shape of '{value.shape}'."
if len(value.shape) < 2:
self.errors.append(
f"All embeddings must at least two dimensions. 'adata.obsm['{key}']' has a shape length of '{len(value.shape)}'."

@nayib-jose-gloria nayib-jose-gloria May 2, 2024

EDIT: nvm, previous statement was incorrect


if unknown_key and value.shape[1] < 1:
self.errors.append(
f"All other embeddings must have at least one column. 'adata.obsm['{key}']' has columns='{value.shape[1]}'."

nit: in this context, I don't think the curator reading this would know what "other embeddings" means. Maybe "any embeddings not specified in the schema reference"?

@nayib-jose-gloria nayib-jose-gloria left a comment

One nit about the wording of an error message, but otherwise looks good. Ready for product review after that.

@Bento007 Bento007 requested a review from brian-mott May 2, 2024 22:33
@nayib-jose-gloria

@brian-mott ready for review!

@brian-mott brian-mott left a comment

Looks good, thanks Trent and Nayib!

@Bento007 Bento007 merged commit e06940c into main May 6, 2024
6 checks passed
@Bento007 Bento007 deleted the tsmith/857-obsm branch May 6, 2024 16:22