Commit

Clarify awful min_sample condition
nickjcroucher committed Feb 23, 2024
1 parent 5454f7a commit f721ca1
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion PopPUNK/models.py
@@ -513,7 +513,8 @@ def fit(self, X, max_num_clusters, min_cluster_prop, use_gpu = False):

 # DBSCAN parameters
 cache_out = "./" + self.outPrefix + "_cache"
-min_samples = min(int(min_cluster_prop * self.subsampled_X.shape[0]), 1023)
+min_samples = max(int(min_cluster_prop * self.subsampled_X.shape[0]), 10) # do not allow clusters of < 10 points
+min_samples = min(min_samples,1023) # do not allow clusters to require more than 1023 points at the start
 min_cluster_size = max(int(0.01 * self.subsampled_X.shape[0]), 10)
 
 # Check on initialisation of GPU libraries and memory
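
The effect of the change can be checked in isolation. The removed line only capped `min_samples` at 1023, so a small subsample could truncate it to 0. The two added lines first apply a floor of 10 and then the cap. The sketch below is a minimal illustration, not code from PopPUNK: the helper name `clamp_min_samples` and its `floor`/`cap` parameters are hypothetical, and `n_points` stands in for `self.subsampled_X.shape[0]`.

```python
def clamp_min_samples(n_points, min_cluster_prop, floor=10, cap=1023):
    """Hypothetical helper mirroring the two-step clamp added in this commit."""
    # Floor first: do not allow clusters of fewer than `floor` points.
    min_samples = max(int(min_cluster_prop * n_points), floor)
    # Then cap: do not require more than `cap` points at the start.
    return min(min_samples, cap)


def old_min_samples(n_points, min_cluster_prop, cap=1023):
    """The removed expression, kept here only for comparison (cap, no floor)."""
    return min(int(min_cluster_prop * n_points), cap)


print(old_min_samples(500, 0.001))        # 0: int(0.5) truncates to zero
print(clamp_min_samples(500, 0.001))      # 10: floor applies
print(clamp_min_samples(100_000, 0.001))  # 100: within bounds, unchanged
print(clamp_min_samples(100_000, 0.05))   # 1023: cap applies
```

Applying the floor before the cap keeps the result in the range 10 to 1023 for any non-negative inputs, which is the behaviour the two inline comments in the diff describe.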
