Drop unused categories in ExperimentAxisQuery.to_anndata #204

Merged 5 commits on Jul 24, 2024
Changes from 2 commits
12 changes: 11 additions & 1 deletion python-spec/src/somacore/query/query.py
@@ -298,7 +298,7 @@ def to_anndata(

Let's add a description of what drop_levels does ...

@pablo-gar @eddelbuettel @mojaveazure are you okay with drop_levels?

(Presuming we'll want a same-ish name in Python & R both ...)

Seurat uses drop to match R, but I'm fine with drop_levels in SOMA

Sounds good to me!

         Lifecycle: maturing
         """
-        return self._read(
+        ad = self._read(
             X_name,
             column_names=column_names or AxisColumnNames(obs=None, var=None),
             X_layers=X_layers,
@@ -308,6 +308,16 @@ def to_anndata(
             varp_layers=varp_layers,
         ).to_anndata()

+        # Drop unused categories on axis dataframes
+        for name in ad.obs:
+            if pd.api.types.is_categorical_dtype(ad.obs[name]):
+                ad.obs[name] = ad.obs[name].cat.remove_unused_categories()
+        for name in ad.var:
+            if pd.api.types.is_categorical_dtype(ad.var[name]):
+                ad.var[name] = ad.var[name].cat.remove_unused_categories()
+
+        return ad
+
     # Context management

     def close(self) -> None:
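
For readers unfamiliar with the behavior this change targets, here is a minimal sketch of dropping unused categories after a subset, using plain pandas rather than a SOMA query. The helper name `drop_unused_categories` and the sample `obs` frame are hypothetical and not part of somacore. The sketch checks the dtype with `isinstance(..., pd.CategoricalDtype)` because `pd.api.types.is_categorical_dtype`, used in the diff above, is deprecated in recent pandas 2.x releases; whether the project made that switch is not shown in this diff.

```python
import pandas as pd


def drop_unused_categories(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with unused categories removed from categorical columns.

    Hypothetical helper for illustration only; not part of somacore.
    """
    out = df.copy()
    for name in out.columns:
        # isinstance check is the replacement pandas recommends for
        # the deprecated pd.api.types.is_categorical_dtype.
        if isinstance(out[name].dtype, pd.CategoricalDtype):
            out[name] = out[name].cat.remove_unused_categories()
    return out


# Hypothetical obs-like frame with one categorical and one numeric column.
obs = pd.DataFrame(
    {
        "cell_type": pd.Categorical(["B", "T", "NK", "T"]),
        "n_genes": [100, 200, 150, 120],
    }
)

# Filtering rows does not shrink the category list on its own.
subset = obs[obs["cell_type"] != "NK"]
cleaned = drop_unused_categories(subset)

print(list(subset["cell_type"].cat.categories))   # still includes "NK"
print(list(cleaned["cell_type"].cat.categories))  # "NK" has been dropped
```

Returning a copy instead of mutating in place is a choice made for this sketch; the diff above reassigns columns directly on the AnnData axis dataframes.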