[ML] Preserve order of inference results #100143
Pinging @elastic/ml-core (Team:ML)
Hi @davidkyle, I've created a changelog YAML for you.
Not sure how hard it'd be, but would it be worth adding a test for the ordering?
.../main/java/org/elasticsearch/xpack/ml/action/TransportInferTrainedModelDeploymentAction.java
 * the listener will never call {@code finalListener::onFailure}
 * instead failures are returned as inference results.
 */
private ActionListener<InferenceResults> orderedListener(
nit: Can we make this static?
👍 and I've added a test
if (result instanceof ErrorInferenceResults errorResult) {
    // Any failure fails all requests
    // TODO is this the correct behaviour for batched requests?
    finalListener.onFailure(errorResult.getException());
I don't know the code well enough, but maybe in the future we could make the response similar to a bulk response, where each entry in the results array is either a failure or a successful result?
That's the idea. The REST response does not have to change, but internal users (such as ingest) can make better decisions about how to handle a response that is partially successful.
…on/TransportInferTrainedModelDeploymentAction.java Co-authored-by: Jonathan Buttner <[email protected]>
When a request contains multiple inputs, the order in which those inputs are processed is not deterministic if the C++ process is using more than one allocation. This change ensures the inference results are returned in the same order as the request inputs, so that a caller knows result 1 is for input 1, and so on.
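One way to picture the ordering guarantee is a collector that gives each input its own callback bound to that input's position, then releases the full list only once every slot is filled. The sketch below is illustrative only: `OrderedCollector`, `listenerFor`, and the use of `Consumer` in place of Elasticsearch's `ActionListener` are assumptions made for the example, not the code in this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.Consumer;

// Hypothetical sketch; names are not taken from the Elasticsearch code base.
class OrderedResultsSketch {

    /** Stores each result at its input position and fires the final callback once all slots are filled. */
    static final class OrderedCollector<T> {
        private final AtomicReferenceArray<T> slots;
        private final AtomicInteger remaining;
        private final Consumer<List<T>> onAllDone;

        OrderedCollector(int size, Consumer<List<T>> onAllDone) {
            this.slots = new AtomicReferenceArray<>(size);
            this.remaining = new AtomicInteger(size);
            this.onAllDone = onAllDone;
        }

        /** Returns a callback bound to one input position; the completion order does not matter. */
        Consumer<T> listenerFor(int position) {
            return result -> {
                slots.set(position, result);
                // The callback that fills the last slot assembles the ordered list.
                if (remaining.decrementAndGet() == 0) {
                    List<T> ordered = new ArrayList<>(slots.length());
                    for (int i = 0; i < slots.length(); i++) {
                        ordered.add(slots.get(i));
                    }
                    onAllDone.accept(ordered);
                }
            };
        }
    }

    /** Completes three inputs out of order and returns what the final callback received. */
    static List<String> demo() {
        List<String> out = new ArrayList<>();
        OrderedCollector<String> collector = new OrderedCollector<>(3, out::addAll);
        collector.listenerFor(2).accept("r2");
        collector.listenerFor(0).accept("r0");
        collector.listenerFor(1).accept("r1");
        return out;
    }
}
```

Because a failure would be stored in its slot like any other result, a collector of this shape never needs a separate failure path. Note that this sketch never fires for an empty input list; a real implementation would have to handle that case.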
Another change is to return all results even if there was a failure. Failures are returned as `ErrorInferenceResults`; the caller should check for instances of `ErrorInferenceResults` and handle them appropriately.

This is labelled as a bug because the `_infer` API accepts multiple inputs and previously the returned order was not guaranteed to match the input order.
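For callers, the practical consequence is that each entry in the result list has to be inspected individually. A minimal sketch of that check follows, using stand-in types: `Result`, `Success`, and `Failure` are hypothetical and only mimic the relationship between `InferenceResults` and `ErrorInferenceResults`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these types are stand-ins, not the Elasticsearch classes.
class PartialResultsSketch {

    interface Result {}

    /** Stand-in for a successful inference result. */
    record Success(String value) implements Result {}

    /** Stand-in for an error result that carries the exception. */
    record Failure(Exception cause) implements Result {}

    /** Walks the results in input order and reports each entry as a success or a failure. */
    static List<String> describe(List<Result> results) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < results.size(); i++) {
            Result r = results.get(i);
            if (r instanceof Failure f) {
                lines.add("input " + i + " failed: " + f.cause().getMessage());
            } else if (r instanceof Success s) {
                lines.add("input " + i + " ok: " + s.value());
            }
        }
        return lines;
    }
}
```

Since results are positional, the index `i` identifies which input failed, which is what would let an internal caller retry or skip only the affected entries.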