Add stats to track knn request counts #89
} catch (Exception ex) {
    KNNCounter.GRAPH_INDEX_ERRORS.increment();
    throw ex;
These exceptions will only include parsing errors (customer errors), which we could ignore. Graphs are indexed as part of the KNN80DocValuesConsumer. We may want to keep track of graph creation failures there.
Oh okay that makes sense. Will update.
} catch (Exception ex) {
    KNNCounter.GRAPH_QUERY_ERRORS.increment();
    throw new RuntimeException("Unable to query the index: " + ex);
Just a thought: why not rely on the load_exception_count metric from the cache stats? It seems to track the number of exceptions raised while loading a graph, and that load is invoked during queries.
Using load_exception_count would only count exceptions raised while loading the graph into memory, not while actually querying the graph. Adding the metric here lets us check whether the library query of the graph fails. In your opinion, should this metric count query errors where a query is a call to the ES search API for knn, or where a query is a call to the library function?
Makes sense. It should track the number of query errors.
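For reference, the counting pattern discussed in this thread could be sketched roughly as below. This is only an illustration: `QueryStatsSketch`, `countedQuery`, and the two `AtomicLong` fields are hypothetical stand-ins, not the plugin's `KNNCounter` or `queryIndex`. The sketch also passes the caught exception as the cause rather than concatenating it into the message, which is a possible variation and not what the diff above does.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical stand-in for the query path; not the plugin's actual classes.
class QueryStatsSketch {
    static final AtomicLong QUERY_REQUESTS = new AtomicLong();
    static final AtomicLong QUERY_ERRORS = new AtomicLong();

    // Counts every attempt, and counts a failure only when the wrapped
    // library call itself throws (as opposed to a cache-load failure).
    static <T> T countedQuery(Supplier<T> libraryQuery) {
        QUERY_REQUESTS.incrementAndGet();
        try {
            return libraryQuery.get();
        } catch (Exception ex) {
            QUERY_ERRORS.incrementAndGet();
            // Keep the original exception as the cause so its stack trace survives.
            throw new RuntimeException("Unable to query the index", ex);
        }
    }
}
```

With this shape, an exception thrown while loading the graph into the cache would still show up only in load_exception_count, assuming the load happens outside the wrapped call.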
    throw new IllegalStateException("KNN plugin is disabled. To enable " +
        "update knn.plugin.enabled setting to true");
}
KNNCounter.GRAPH_INDEX_REQUESTS.increment();
This might add a little confusion for bulk requests, which could index multiple vectors but are still part of the same request. If the intention is to count the number of graph requests, we could probably count at KNN80DocValuesConsumer.
Yes, the intention of this metric is to count the total number of requests to index graphs. Will move to KNN80DocValuesConsumer.
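To make the bulk-request concern concrete, here is a toy sketch of how the two placements would diverge. Everything in it is hypothetical (`IndexCountingSketch`, `parseVector`, `writeGraph`, `indexBulk`), and it assumes for simplicity that the whole bulk request lands in a single segment, so one graph is written.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Toy model only; these methods are stand-ins, not the plugin's mapper or consumer.
class IndexCountingSketch {
    static final AtomicLong PER_FIELD_PARSE = new AtomicLong();
    static final AtomicLong PER_GRAPH_WRITE = new AtomicLong();

    // Stand-in for the field mapper's parse step: runs once per indexed vector.
    static void parseVector(float[] vector) {
        PER_FIELD_PARSE.incrementAndGet();
    }

    // Stand-in for the doc-values consumer: runs once per graph that is built.
    static void writeGraph(List<float[]> segmentVectors) {
        PER_GRAPH_WRITE.incrementAndGet();
    }

    // Assumption: one bulk request, all of its vectors flushed into one segment.
    static void indexBulk(List<float[]> vectors) {
        for (float[] vector : vectors) {
            parseVector(vector);
        }
        writeGraph(vectors);
    }
}
```

Under that assumption, a bulk request carrying three vectors would move the parse-side counter by three and the consumer-side counter by one, which is the ambiguity raised above.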
#### graph_query_requests
The number of graph queries that have been made.

#### graph_query_errors
How about we add metrics for counting KNNQueries?
Makes sense. Will add.
LGTM! Thanks for adding the metrics.
Issue #, if available:
#88
Description of changes:
This PR adds stats to track the number of requests and errors for KNN query and index operations.
For query operations, bookkeeping is added in the queryIndex function in KNNIndex. For index operations, it is added in the KNNVectorFieldMapper parse function.
Unit tests have been added to make sure that the counting functionality works properly.
Documentation has been updated as well.
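As a rough illustration of the kind of counter this description refers to, a thread-safe enum-based counter might look like the sketch below. The name `SketchCounter` and its methods other than `increment()` are assumptions made for the example. The actual `KNNCounter` in this PR may be structured differently.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of an enum-backed stats counter; not the PR's KNNCounter.
enum SketchCounter {
    GRAPH_QUERY_REQUESTS("graph_query_requests"),
    GRAPH_QUERY_ERRORS("graph_query_errors");

    private final String name;
    // AtomicLong keeps increments safe when many threads report concurrently.
    private final AtomicLong count = new AtomicLong();

    SketchCounter(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    long getCount() {
        return count.get();
    }

    void increment() {
        count.incrementAndGet();
    }

    // Reset is mainly useful in unit tests, since enum constants are process-wide.
    void reset() {
        count.set(0);
    }
}
```

A unit test for such a counter would typically reset it, call `increment()` a known number of times, and compare `getCount()` against that number.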
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.