leiden nmi ari #24

Merged
merged 3 commits into from
Oct 11, 2022

Conversation

adamgayoso
Member

Adds Leiden-based NMI and ARI scores to more closely match scIB (which uses Louvain). This searches over 10 Leiden clustering resolutions and picks the one with the best NMI (as in scIB, except that scIB searches 20 resolution parameters).
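
For readers unfamiliar with the approach, the resolution search could be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code in this PR: `nmi`, `best_resolution_by_nmi`, and `toy_cluster` are hypothetical names, the evenly spaced grid `2 * x / n` is an assumption about how the 10 resolutions are chosen, and `toy_cluster` is only a stand-in for a real Leiden call.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(labels_true, labels_pred):
    """NMI normalized by the arithmetic mean of the two entropies."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    mi = sum(
        (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in joint.items()
    )
    denom = (entropy(labels_true) + entropy(labels_pred)) / 2
    # If both partitions are a single cluster, treat them as identical.
    return mi / denom if denom > 0 else 1.0


def best_resolution_by_nmi(cluster_fn, labels_true, n_resolutions=10):
    """Try each resolution on an assumed grid and keep the one with the highest NMI."""
    resolutions = [2 * x / n_resolutions for x in range(1, n_resolutions + 1)]
    scored = [(nmi(labels_true, cluster_fn(r)), r) for r in resolutions]
    best_nmi, best_res = max(scored, key=lambda s: s[0])
    return best_res, best_nmi


# Toy stand-in for Leiden (hypothetical): higher resolution gives finer bins.
VALUES = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]
LABELS = [0, 0, 0, 1, 1, 1]


def toy_cluster(resolution):
    return [int(v * resolution) for v in VALUES]


best_res, best_nmi = best_resolution_by_nmi(toy_cluster, LABELS)
```

In this sketch, ties go to the smallest resolution because `max` keeps the first maximum it sees; whether the real implementation breaks ties the same way is not something this PR description states.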


codecov bot commented Oct 9, 2022

Codecov Report

Merging #24 (318467f) into main (a258e0b) will decrease coverage by 0.51%.
The diff coverage is 90.24%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #24      +/-   ##
==========================================
- Coverage   93.75%   93.23%   -0.52%     
==========================================
  Files           9        9              
  Lines         288      325      +37     
==========================================
+ Hits          270      303      +33     
- Misses         18       22       +4     
| Flag | Coverage Δ |
|---|---|
| unittests | 93.23% <90.24%> (-0.52%) ⬇️ |


| Impacted Files | Coverage Δ |
|---|---|
| src/scib_metrics/_ari_nmi.py | 92.15% <90.00%> (-7.85%) ⬇️ |
| src/scib_metrics/__init__.py | 100.00% <100.00%> (ø) |

@adamgayoso adamgayoso requested a review from justjhong October 10, 2022 17:23

def test_nmi_ari_cluster_labels_leiden_parallel():
    X, labels = dummy_x_labels(return_symmetric_positive=True)
    nmi, ari = scib_metrics.nmi_ari_cluster_labels_leiden(X, labels, optimize_resolution=True, n_jobs=2)
Contributor

Is it possible to check this against some Leiden implementation to make sure the clusters are the same? Is this reliable across different random seeds?

Contributor

Ah, never mind, I see it's Leiden vs. Louvain.

Member Author

Leiden is faster, and the igraph implementation is even faster than the one scanpy uses.

    )
except ImportError:
    logger.info("Using for loop over resolutions. pip install joblib for parallelization.")
    out = [nmi_ari_cluster_labels_leiden(X, labels, False, r) for r in resolutions]
Contributor

This recursive pattern feels like a recipe for bugs. It would be much cleaner to split the else case out into a helper and call that from both cases (optimize or not optimize).
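
A rough sketch of that shape, assuming nothing about the final code in this PR: the single-resolution work lives in one helper, and the public function calls it either once or once per resolution, so it never calls itself. All names here (`_score_at_resolution`, `score_clustering`, `toy_cluster`, `toy_score`) are hypothetical, the resolution grid is an assumption, and `toy_score` returns a plain Rand index twice as a placeholder for a real `(nmi, ari)` pair.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Scores = Tuple[float, float]  # placeholder for (nmi, ari)


def _score_at_resolution(
    cluster_fn: Callable[[float], Sequence[int]],
    score_fn: Callable[[Sequence[int], Sequence[int]], Scores],
    labels: Sequence[int],
    resolution: float,
) -> Scores:
    """Single-resolution case: cluster once, score once."""
    return score_fn(labels, cluster_fn(resolution))


def score_clustering(
    cluster_fn: Callable[[float], Sequence[int]],
    score_fn: Callable[[Sequence[int], Sequence[int]], Scores],
    labels: Sequence[int],
    optimize_resolution: bool = True,
    resolution: float = 1.0,
    n_resolutions: int = 10,
) -> Scores:
    """Both branches go through the helper; there is no recursive call."""
    if not optimize_resolution:
        return _score_at_resolution(cluster_fn, score_fn, labels, resolution)
    # Assumed grid of evenly spaced resolutions.
    resolutions = [2 * x / n_resolutions for x in range(1, n_resolutions + 1)]
    results = [_score_at_resolution(cluster_fn, score_fn, labels, r) for r in resolutions]
    # Keep the result whose first score (the NMI slot) is highest.
    return max(results, key=lambda s: s[0])


# Toy stand-ins (hypothetical) so the sketch runs without extra dependencies.
VALUES: List[float] = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]
LABELS: List[int] = [0, 0, 0, 1, 1, 1]


def toy_cluster(resolution: float) -> List[int]:
    return [int(v * resolution) for v in VALUES]


def toy_score(labels: Sequence[int], pred: Sequence[int]) -> Scores:
    """Unadjusted Rand index, returned twice in place of real (nmi, ari) values."""
    pairs = list(combinations(range(len(labels)), 2))
    agree = sum((labels[i] == labels[j]) == (pred[i] == pred[j]) for i, j in pairs)
    rand = agree / len(pairs)
    return rand, rand


optimized = score_clustering(toy_cluster, toy_score, LABELS)
fixed = score_clustering(toy_cluster, toy_score, LABELS, optimize_resolution=False, resolution=0.2)
```

The parallel path (for example with joblib) would map the same helper over `resolutions`, so the fallback loop and the parallel version share one code path.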

Member Author

Done.

Contributor

speedy

@adamgayoso adamgayoso enabled auto-merge (squash) October 11, 2022 03:20
@adamgayoso adamgayoso merged commit f88d530 into main Oct 11, 2022
@adamgayoso adamgayoso deleted the leiden branch October 14, 2022 02:02