
[ML] [PyTorch] Communication between ES and the native process #1700

Closed
davidkyle opened this issue Jan 28, 2021 · 3 comments

@davidkyle
Member

Request ID

The design has to accommodate multiple concurrent inference requests to the server, so a mechanism is required to tie a specific request to a model output. This could be inferred from the processing order, which is strictly FIFO, but adding an ID token to each request provides additional context for development and debugging. The token has no semantics and is passed through the C++ untouched. The Anomaly Detector flushID is the prior art here.
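As an illustration (not part of the design itself; all names here are hypothetical), the caller could correlate requests and responses with a map keyed by the token, sketched in Python:

import uuid

pending = {}  # request_id -> callback for requests in flight

def new_request(on_result):
    # The ID token is opaque: the C++ passes it through unchanged,
    # so any unique string will do.
    request_id = str(uuid.uuid4())
    pending[request_id] = on_result
    return request_id

def on_response(doc):
    # Processing is strictly FIFO, but matching by ID avoids relying
    # on ordering and makes debugging easier.
    pending.pop(doc["request_id"])(doc)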

Payload

The inference payload is a series of numeric tokens. An individual inference request consists of the request ID, the payload tokens and a marker to delineate each request.

Anomaly Detection uses a concise length-encoded binary protocol because of the high volume of data sent across its pipes. Compared with Anomaly Detection, the input here is small, so a more verbose input format can be used, which has the advantage of being self-describing.

Input Format

A JSON document:

{
  "request_id" : "string",
  "token_ids" : [int, int, ...],
  "attention_mask" : [int, int, ...],
  "token_type_ids" : [int, int, ...],
  "position_ids" : [int, int, ...]
}

token_ids and attention_mask are required for all uses; token_type_ids and position_ids are optional depending on the model type.
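A minimal sketch, in Python and with hypothetical names, of how a client could assemble such a document:

import json

def build_inference_request(request_id, token_ids, attention_mask,
                            token_type_ids=None, position_ids=None):
    # token_ids and attention_mask are always required.
    doc = {
        "request_id": request_id,
        "token_ids": token_ids,
        "attention_mask": attention_mask,
    }
    # The remaining fields are only sent when the model type needs them.
    if token_type_ids is not None:
        doc["token_type_ids"] = token_type_ids
    if position_ids is not None:
        doc["position_ids"] = position_ids
    return json.dumps(doc)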

Output Format

A JSON document (chosen for flexibility) containing the request ID token, the result tensor, and optionally the predicted tokens, depending on the model type:

{
  "request_id" : "string",
  "predictions" : [float, float, ...],
  "tokens" : [int, int, ...]
}
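A minimal Python sketch of decoding one result document; the one-document-per-line framing is an assumption (see the ND-JSON discussion below):

import json

def parse_result(line):
    doc = json.loads(line)
    # "tokens" is only present for model types that predict tokens.
    return doc["request_id"], doc["predictions"], doc.get("tokens")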
@droberts195
Contributor

Should the output have token_ids rather than tokens? Presumably these are IDs that are looked up against the same mapping table as the input tokens?

@droberts195
Contributor

It might be worth saying the input format is ND-JSON rather than arbitrary JSON. Then each input document or command document can be one line of a text file.

We have functionality for parsing a stream of arbitrarily formatted JSON documents separated by \0 characters, but this is not a friendly format for testing at the command line using simple text files. So ND-JSON is probably a better format for the long term.
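For illustration, a reader for the ND-JSON framing reduces to a line loop; a hedged Python sketch with hypothetical names:

import json

def read_ndjson(stream):
    # One document per line, so the stream can be exercised from the
    # command line with a plain text file, unlike \0-separated framing.
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)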

@davidkyle
Member Author

Closed by elastic/elasticsearch#70713 and #1770
