Writing result zip takes too long #1212

Closed
brodmo opened this issue Jul 29, 2023 · 1 comment
Labels
enhancement: Issue/PR that involves features, improvements and other changes
minor: Minor issue/feature/contribution/change

Comments

@brodmo
Contributor

brodmo commented Jul 29, 2023

With larger submission counts (>500), zipping and writing the output makes up the bulk of the runtime and takes far too long overall. The main reason is probably that a file is written for every comparison pair. Writing the result should be made faster or skippable, ideally both.
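
For a sense of scale: if one result file is written per unordered pair of submissions (the guess stated above, not verified here), the number of files grows quadratically with the number of submissions. The sketch below counts the pairs and, as one possible mitigation, writes every pair result as an entry of a single in-memory zip archive rather than as separate files on disk. The names `pair_count` and `write_results`, the entry naming scheme, and the JSON layout are hypothetical and not taken from the project.

```python
import io
import json
import zipfile
from itertools import combinations


def pair_count(n: int) -> int:
    """Number of unordered pairs among n submissions: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def write_results(names: list[str]) -> bytes:
    """Write one zip entry per pair directly into a single in-memory archive.

    Hypothetical layout: the similarity value is a placeholder of 0.0.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for first, second in combinations(names, 2):
            payload = json.dumps({"first": first, "second": second, "similarity": 0.0})
            archive.writestr(f"{first}-{second}.json", payload)
    return buffer.getvalue()


if __name__ == "__main__":
    print(pair_count(500))  # 124750
    data = write_results(["a", "b", "c"])
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        print(archive.namelist())  # ['a-b.json', 'a-c.json', 'b-c.json']
```

With 500 submissions this is already 124,750 pairs, so any per-file overhead is paid that many times unless the number of written pairs is capped.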

tsaglam added the enhancement and minor labels on Aug 11, 2023
@tsaglam
Member

tsaglam commented Jun 4, 2024

With the current implementation, writing the result takes ~7 seconds on my M1 MacBook with the board game dataset (434 submissions, with a mean size of 1529 LOC) and -n 10000. The overall runtime is ~35 seconds (14 seconds parsing, 11 seconds comparison, 2 seconds clustering). Thus, I would consider this closed for now.
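
As a rough check of these figures, the snippet below converts the approximate timings quoted above into shares of the ~35 second total. The numbers come from the comment; the function name `phase_shares` is hypothetical, and the values are rounded, so treat the percentages as estimates.

```python
def phase_shares(phases: dict[str, float], total: float) -> dict[str, int]:
    """Return each phase's share of the total runtime, in whole percent."""
    return {name: round(100 * seconds / total) for name, seconds in phases.items()}


# Approximate timings in seconds, as quoted in the comment above.
PHASES = {"parsing": 14, "comparison": 11, "writing": 7, "clustering": 2}
TOTAL = 35

if __name__ == "__main__":
    for name, percent in phase_shares(PHASES, TOTAL).items():
        print(f"{name}: {percent}%")
    # Time not attributed to any listed phase.
    print(f"unaccounted: {TOTAL - sum(PHASES.values())} s")
```

On these numbers, writing accounts for about a fifth of the run, behind parsing and comparison.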

tsaglam closed this as completed on Jun 4, 2024