Optimize ROC AUC Computation #667

Merged
merged 3 commits into from
Jul 11, 2024

czaloom
Collaborator

@czaloom czaloom commented Jul 10, 2024

Feature Description

ROC AUC computation slowed down significantly in resource-constrained settings because a separate large query was issued for each unique label.

The solution was to aggregate the ROC AUC computation into a single query.
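As a rough illustration of the change described above (not the query from this PR), the sketch below contrasts issuing one query per unique label with a single grouped query. The `prediction` table, its columns, and the function names are hypothetical, and the pairwise AUC formulation is chosen for brevity; it uses SQLite from the Python standard library only so the example is self-contained.

```python
import sqlite3

# Hypothetical schema: one row per (datum, label) holding the predicted score
# and whether that label is the datum's ground truth.
ROWS = [
    # (datum_id, label, score, is_positive)
    (1, "cat", 0.9, 1),
    (2, "cat", 0.4, 0),
    (3, "cat", 0.7, 1),
    (4, "cat", 0.8, 0),
    (1, "dog", 0.1, 0),
    (2, "dog", 0.6, 1),
    (3, "dog", 0.3, 0),
    (4, "dog", 0.2, 0),
]

# ROC AUC as the fraction of (positive, negative) pairs that are ranked
# correctly, with ties counted as half. {extra} optionally restricts the label.
PAIRWISE_AUC = """
    SELECT p.label,
           AVG(CASE WHEN p.score > n.score THEN 1.0
                    WHEN p.score = n.score THEN 0.5
                    ELSE 0.0 END)
    FROM prediction AS p
    JOIN prediction AS n ON n.label = p.label
    WHERE p.is_positive = 1 AND n.is_positive = 0 {extra}
    GROUP BY p.label
"""


def make_db():
    """Create an in-memory database populated with the toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prediction "
        "(datum_id INTEGER, label TEXT, score REAL, is_positive INTEGER)"
    )
    conn.executemany("INSERT INTO prediction VALUES (?, ?, ?, ?)", ROWS)
    return conn


def auc_per_label_queries(conn):
    """Slow pattern: one round trip to the database per unique label."""
    labels = [row[0] for row in conn.execute("SELECT DISTINCT label FROM prediction")]
    result = {}
    for label in labels:
        query = PAIRWISE_AUC.format(extra="AND p.label = ?")
        row = conn.execute(query, (label,)).fetchone()
        if row is not None:  # labels lacking positives or negatives yield no row
            result[row[0]] = row[1]
    return result


def auc_single_query(conn):
    """Aggregated pattern: every label is handled by one grouped query."""
    return dict(conn.execute(PAIRWISE_AUC.format(extra="")).fetchall())
```

Both functions return the same per-label mapping; the difference is that the first issues as many queries as there are unique labels (50 in the benchmarks in this PR), while the second issues one.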

Local Performance

before (v0.29.0)

    "number_of_datums": 5000,
    "number_of_unique_labels": 50,
    "number_of_annotations": 10000,
    "ingest_runtime": "17.0 seconds",
    "base_eval_runtime": "8.0 seconds",
    "base+pr_eval_runtime": "14.3 seconds",
    "base+pr+detailed_eval_runtime": "14.2 seconds",
    "del_runtime": "0.3 seconds"

after

    "number_of_datums": 5000,
    "number_of_unique_labels": 50,
    "number_of_annotations": 10000,
    "ingest_runtime": "17.1 seconds",
    "eval_runtime": "2.1 seconds",
    "eval_pr_runtime": "8.7 seconds",
    "eval_pr_detail_runtime": "9.2 seconds",
    "del_runtime": "0.7 seconds"

Resource Constrained Performance (dev01)

before (v0.29.0)

{
    "number_of_datums": 5000,
    "number_of_unique_labels": 50,
    "number_of_annotations": 10000,
    "base_eval_runtime": "201.3 seconds",
    "base+pr_eval_runtime": "226.3 seconds",
    "base+pr+detailed_eval_runtime": "228.5 seconds"
}

after

{
    "number_of_datums": 5000,
    "number_of_unique_labels": 50,
    "number_of_annotations": 10000,
    "eval_base_runtime": "85.1 seconds",
    "eval_pr_runtime": "111.5 seconds",
    "eval_pr_detailed_runtime": "110.5 seconds"
}
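To make the before/after numbers easier to compare, the following sketch computes the speedup factor for each evaluation. The runtimes are copied from the blocks above; the dictionary layout and the pairing of the differently named "before" and "after" keys by position are assumptions made for illustration.

```python
# Runtimes in seconds as (before v0.29.0, after), copied from the blocks above.
# Assumption: "base_eval_runtime" pairs with "eval_runtime"/"eval_base_runtime",
# and likewise for the "pr" and "detailed" variants.
RUNTIMES = {
    "local": {
        "base": (8.0, 2.1),
        "base+pr": (14.3, 8.7),
        "base+pr+detailed": (14.2, 9.2),
    },
    "dev01": {
        "base": (201.3, 85.1),
        "base+pr": (226.3, 111.5),
        "base+pr+detailed": (228.5, 110.5),
    },
}


def speedups(runtimes):
    """Return before/after ratios; values above 1 mean the new code is faster."""
    return {
        env: {name: before / after for name, (before, after) in evals.items()}
        for env, evals in runtimes.items()
    }


if __name__ == "__main__":
    for env, evals in speedups(RUNTIMES).items():
        for name, factor in evals.items():
            print(f"{env:>5}  {name:<17} {factor:.1f}x")
```

Under that pairing, the base evaluation is roughly 3.8x faster locally and roughly 2.4x faster on dev01.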

@czaloom czaloom linked an issue Jul 10, 2024 that may be closed by this pull request
3 tasks
@czaloom czaloom marked this pull request as ready for review July 10, 2024 22:47
@czaloom czaloom requested review from ntlind and ekorman as code owners July 10, 2024 22:47
@czaloom czaloom self-assigned this Jul 10, 2024
        label not in label_to_count
        or label.key not in label_key_to_count
    ):
        raise RuntimeError("ROCAUC computation failed.")
Contributor

nit: could this error message be made more specific?

Collaborator Author

@czaloom czaloom Jul 10, 2024

Updated to "ROC AUC computation failed as the label '{label}' could not be found."
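A minimal sketch of how the guard might look with the updated message. Only the condition, the two dictionary names, and the message text come from this thread; the `Label` dataclass, the `require_label_counts` helper, and its return value are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Hypothetical stand-in for the project's label type."""

    key: str
    value: str


def require_label_counts(label, label_to_count, label_key_to_count):
    """Return the counts for `label`, or fail with a message naming the label."""
    if (
        label not in label_to_count
        or label.key not in label_key_to_count
    ):
        raise RuntimeError(
            f"ROC AUC computation failed as the label '{label}' could not be found."
        )
    return label_to_count[label], label_key_to_count[label.key]
```

Including the label in the message lets whoever reads the error see which label was missing from the counts without re-running the evaluation.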

@czaloom czaloom merged commit 19cd451 into main Jul 11, 2024
12 checks passed
@czaloom czaloom deleted the czaloom-optimize-roc-auc branch July 11, 2024 00:43
ntlind pushed a commit that referenced this pull request Jul 16, 2024
Development

Successfully merging this pull request may close these issues.

Optimize ROC AUC Computation
2 participants