Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Preserve order of inference results #100143

Merged
merged 6 commits into from
Oct 2, 2023

Conversation

davidkyle
Copy link
Member

When a request contains multiple inputs the order in which those inputs are processed is not deterministic if the C++ process is using more than one allocation. This change ensures the inference results are returned in the same order as the request inputs so that a caller knows result 1 is for input 1 etc.

Another change is to return all results even if there was a failure. Failures are returned as ErrorInferenceResults, the caller should check for instances of ErrorInferenceResults and handle them appropriately.

This is labelled as a bug because the _infer API accepts multiple inputs and previously the returned order was not guaranteed to match the input order.

@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Oct 2, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine
Copy link
Collaborator

Hi @davidkyle, I've created a changelog YAML for you.

docs/changelog/100143.yaml Outdated Show resolved Hide resolved
docs/changelog/100143.yaml Outdated Show resolved Hide resolved
Copy link
Contributor

@jonathan-buttner jonathan-buttner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how hard it'd be, but would it be worth adding a test for the ordering?

* the listener will never call {@code finalListener::onFailure}
* instead failures are returned as inference results.
*/
private ActionListener<InferenceResults> orderedListener(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we make this static?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 and I've added a test

if (result instanceof ErrorInferenceResults errorResult) {
// Any failure fails all requests
// TODO is this the correct behaviour for batched requests?
finalListener.onFailure(errorResult.getException());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the code well enough but maybe in the future we could make the response similar to a bulk response where an entry in the results array can either be a failure or a successful result?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That the idea. The rest response does not have to change but internal users (such as ingest) can make better decisions about how to handle a response which is partially successful

davidkyle and others added 2 commits October 2, 2023 17:42
…on/TransportInferTrainedModelDeploymentAction.java

Co-authored-by: Jonathan Buttner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :ml Machine learning Team:ML Meta label for the ML team v8.11.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants