-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement precision
and recall
metrics for classification evaluation
#49671
Implement precision
and recall
metrics for classification evaluation
#49671
Conversation
Pinging @elastic/ml-core (:ml) |
91d83e0
to
3edeb6b
Compare
run elasticsearch-ci/1 |
2b97424
to
93fc929
Compare
93fc929
to
726c822
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving a few comments. I'll revisit when recall has been adjusted.
...ain/java/org/elasticsearch/client/ml/dataframe/evaluation/classification/AccuracyMetric.java
Outdated
Show resolved
Hide resolved
...ain/java/org/elasticsearch/client/ml/dataframe/evaluation/classification/AccuracyMetric.java
Outdated
Show resolved
Hide resolved
...in/java/org/elasticsearch/client/ml/dataframe/evaluation/classification/PrecisionMetric.java
Outdated
Show resolved
Hide resolved
5249855
to
86dcaf3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more comments as I go along the way.
return Tuple.tuple( | ||
List.of( | ||
AggregationBuilders.terms(ACTUAL_CLASSES_NAMES_AGG_NAME) | ||
.field(actualField) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need a size
parameter in all these like in the multiclass confusion matrix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just added size
parameter to Precision
and Recall
metrics.
Also, I've added other_class_count
parameter to Precision.Result
and Recall.Result
so that the user can tell if the result is complete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just added size parameter to Precision and Recall metrics.
Also, I've added other_class_count parameter to Precision.Result and Recall.Result so that the user can tell if the result is complete.
I reverted this change and implemented max cardinality enforcement as discussed. PTAL
String className = bucket.getKeyAsString(); | ||
NumericMetricsAggregation.SingleValue precisionAgg = bucket.getAggregations().get(PER_PREDICTED_CLASS_PRECISION_AGG_NAME); | ||
double precision = precisionAgg.value(); | ||
if (Double.isFinite(precision)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be checking this? If for some reason precision
is not finite, we'll end up reporting zero instead of NaN or infinity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will not report zero but rather we will not report this particular class.
My reasoning behind this condition is that if we have an actual class that is never predicted (e.g: "cat"), precision for "cats" cannot be calculated so there is no point in reporting a precision entry with NaN
value.
6c6406d
to
6f254d8
Compare
run elasticsearch-ci/default-distro |
run elasticsearch-ci/bwc |
…tionMetric, RegressionMetric)
This reverts commit 81be647d6465e62971ac763605be4a080161cbdf.
d820384
to
dd74939
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR implements
precision
andrecall
metrics for classification evaluation.Additionally, it:
EvaluationMetric
interface by allowing pipeline aggregations to be requested byaggs
methodactualIsTrueQuery
method from interface to implementation as this method is implementation-specificRelates #48759