Fix Classification Performance #637
Conversation
```diff
@@ -1368,8 +1380,8 @@ def test__compute_curves(
     },
     ("dog", 0.05, "tn"): {"all": 1, "total": 1},
     ("dog", 0.8, "fn"): {
-        "missed_detections": 1,
         "misclassifications": 1,
+        "missed_detections": 0,
```
A prediction having a score less than the threshold is still a valid prediction though
What is the point of the score threshold in that case?
The score threshold is meant to mean "only consider predictions with a score greater than x to be valid predictions".
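Read literally, that definition is just a score filter. A minimal sketch of that reading (`predictions` and `score_threshold` are illustrative names, not the valor API):

```python
# Sketch of the "only consider predictions with a score greater than x" reading.
score_threshold = 0.8
predictions = [("dog", 0.75), ("cat", 0.25)]  # (label, score) pairs for one datum

valid_predictions = [
    (label, score) for label, score in predictions if score > score_threshold
]
print(valid_predictions)  # [] -- nothing clears the 0.8 threshold
```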
The point of a score threshold is to determine whether the prediction is positive vs. negative.
Whether that prediction is correct determines its truth (True, False).
Combine these and you get TP, FP, FN and TN.
The missed_detections variation doesn't really map well to the classification task (as compared to the object detection task), since we enforce the existence of predictions for groundtruths at ingestion time (see validate_matching_label_keys).
This logic also applies to hallucinations for FP, which, if you look at that test, never get a value counted.
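A minimal sketch of how those two questions (above/below threshold, correct/incorrect) combine into the four buckets; `confusion_bucket` and its parameters are illustrative, not the valor implementation:

```python
# Illustrative sketch only -- not the valor implementation.
# Assumes one groundtruth label and one predicted label with a score per datum.

def confusion_bucket(
    groundtruth: str,
    predicted: str,
    score: float,
    label: str,
    score_threshold: float,
) -> str:
    """Bucket a single (datum, label) pair into tp / fp / fn / tn.

    The score threshold decides whether the prediction counts as positive
    for `label`; agreement with the groundtruth decides whether that call
    is true or false.
    """
    predicted_positive = predicted == label and score >= score_threshold
    actually_positive = groundtruth == label

    if predicted_positive and actually_positive:
        return "tp"
    if predicted_positive and not actually_positive:
        return "fp"
    if not predicted_positive and actually_positive:
        return "fn"
    return "tn"


# Groundtruth "dog", predicted "dog" with score 0.6, threshold 0.8:
# the groundtruth is positive but the prediction falls below the threshold.
print(confusion_bucket("dog", "dog", 0.6, "dog", 0.8))  # fn
```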
I reached out to Matt and I think this is a definition issue. Missing detection doesn't make sense for classification. The condition of FN that you are referring to fits something closer to a "no winner" condition.
Matt suggested "No prediction" and I'm wondering if "Null Prediction" would make more sense.
How does all this sound to you?
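For comparison, the FN split being debated could be sketched like this; `fn_reason` and the "no_prediction" label are hypothetical names for the "no winner" condition, not existing valor identifiers:

```python
# Hypothetical sketch of the FN sub-reason split under discussion.
# "no_prediction" / "Null Prediction" is the name being debated, not an
# existing valor identifier.

def fn_reason(
    groundtruth: str,
    scores: dict[str, float],  # label -> predicted score for one datum
    score_threshold: float,
) -> str | None:
    """Return the FN sub-reason for `groundtruth`, or None if it is not an FN."""
    if scores.get(groundtruth, 0.0) >= score_threshold:
        return None  # groundtruth label is predicted positive, so not an FN

    # Some other label cleared the threshold: the datum was misclassified.
    if any(
        label != groundtruth and score >= score_threshold
        for label, score in scores.items()
    ):
        return "misclassification"

    # No label cleared the threshold: the "no winner" / "no prediction" case.
    return "no_prediction"


print(fn_reason("dog", {"dog": 0.05, "cat": 0.95}, 0.8))  # misclassification
print(fn_reason("dog", {"dog": 0.60, "cat": 0.40}, 0.8))  # no_prediction
```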
Changes
Performance
Before (v0.27.3)
After