[ML] Store compressed model definitions in ByteReferences #71679

Merged: 9 commits merged into elastic:feature/pytorch-inference from binary-def on Apr 20, 2021

Conversation

davidkyle
Member

@davidkyle davidkyle commented Apr 14, 2021

Binary data is stored in Lucene base64 encoded; the same data stored in a Java String uses 2 bytes (UTF-16) to represent each base64 character, consuming twice the memory required. This change uses BytesReferences to hold the binary data without base64 encoding; encoding must take place before the bytes can be persisted, and this is performed by the XContent classes.

For BWC I've added a new field mapping, binary_definition, to .ml-inference-*, which means the index version has to be incremented.

Compatibility for HLRC and REST API users is preserved as the compressed_definition field still contains the base64-encoded representation.
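
A minimal sketch of the idea (illustrative only, not the code in this PR; the class and method names here are made up): hold the gzipped definition as a BytesReference and only produce the base64 String at the REST boundary.

import java.util.Base64;

import org.elasticsearch.common.bytes.BytesArray;
import org.elasticsearch.common.bytes.BytesReference;

class DefinitionHolder {

    // Raw gzipped bytes: one byte of heap per byte of data.
    private final BytesReference compressedDefinition;

    DefinitionHolder(byte[] gzippedBytes) {
        this.compressedDefinition = new BytesArray(gzippedBytes);
    }

    // Base64 String built only when a caller (e.g. the compressed_definition
    // field returned to HLRC/REST users) actually needs it; a Java String
    // costs two bytes (UTF-16) per base64 character.
    String asBase64() {
        return Base64.getEncoder().encodeToString(BytesReference.toBytes(compressedDefinition));
    }
}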

@davidkyle davidkyle added the :ml Machine learning label Apr 14, 2021
@elasticmachine elasticmachine added the Team:ML Meta label for the ML team label Apr 14, 2021
@elasticmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@davidkyle davidkyle force-pushed the binary-def branch 2 times, most recently from 7714b2e to 6dd2c1e on April 14, 2021 17:24
String compressedString = docs.stream()
    .map(TrainedModelDefinitionDoc::getCompressedString)
    .collect(Collectors.joining());
// BWC for when we tracked the total definition length
Member Author

total definition length is now tracked again as we need to know the size of PyTorch models up front


for (int i = 0; i < numDocs; i++) {
    // actual content of the model definition is not important here
-   modelDefs.add("model_def_" + i);
+   modelDefs.add(new BytesArray(Base64.getEncoder().encode(("model_def_" + i).getBytes(StandardCharsets.UTF_8))));
Contributor

It might be nice to have a private method BytesArray base64Encode(String) and use it throughout
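
A sketch of the helper being suggested (hypothetical, not part of the PR as written), assuming the test class already depends on BytesArray:

import java.nio.charset.StandardCharsets;
import java.util.Base64;

import org.elasticsearch.common.bytes.BytesArray;

// Hypothetical test helper: base64-encode a String and wrap it in a BytesArray.
private static BytesArray base64Encode(String s) {
    return new BytesArray(Base64.getEncoder().encode(s.getBytes(StandardCharsets.UTF_8)));
}

// The loop above would then read:
// modelDefs.add(base64Encode("model_def_" + i));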

BytesStreamOutput out = new BytesStreamOutput();
try (OutputStream compressedOutput = new GZIPOutputStream(out, BUFFER_SIZE)) {
    reference.writeTo(compressedOutput);
}
return new String(Base64.getEncoder().encode(BytesReference.toBytes(out.bytes())), StandardCharsets.UTF_8);
Member
OK, so this will write out the raw bytes of the GZIP. Is this what we want or do we want to run the Base64 encoder?

I thought the guarantees around base64 character sizes were one of the reasons we could skip transforming into a string?

Member
Binary data is stored in lucene base64 encoded,

Ah, so since the mapping is binary we get that for free.

Member Author

It's handled by the Jackson JSON generator which is used by the various XContentBuilder::value(byte[] value) methods to write bytes
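
A minimal sketch of that behaviour (illustrative, and assuming the 7.x package layout where XContentBuilder lives in org.elasticsearch.common.xcontent): passing a byte[] to the builder yields a base64 string in the generated JSON, so no explicit encoding step is needed before indexing.

import org.elasticsearch.common.Strings;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;

public class Base64OnWrite {
    public static void main(String[] args) throws Exception {
        byte[] rawBytes = new byte[] { 0x1f, (byte) 0x8b, 0x08, 0x00 }; // e.g. the start of a GZIP stream
        XContentBuilder builder = XContentFactory.jsonBuilder();
        builder.startObject();
        builder.field("binary_definition", rawBytes); // Jackson base64-encodes byte[] values on write
        builder.endObject();
        System.out.println(Strings.toString(builder)); // {"binary_definition":"H4sIAA=="}
    }
}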

-String compressedString = trainedModelConfig.getCompressedDefinition();
-if (compressedString.length() > MAX_COMPRESSED_STRING_SIZE) {
+BytesReference compressedDefinition = trainedModelConfig.getCompressedDefinition();
+if (compressedDefinition.length() > MAX_COMPRESSED_STRING_SIZE) {
     listener.onFailure(
         ExceptionsHelper.badRequestException(
             "Unable to store model as compressed definition has length [{}] the limit is [{}]",
Member
@benwtrent benwtrent Apr 20, 2021

Might be good to indicate that the length is in UTF-8 bytes.

EDIT: Well, maybe not utf-8 bytes...but bytes or something

Member Author

I added the fact that the size is in bytes to the message in 7a59661

@davidkyle davidkyle merged commit 64c04e5 into elastic:feature/pytorch-inference Apr 20, 2021
davidkyle added a commit that referenced this pull request Jun 3, 2021
The feature branch contains changes to configure PyTorch models with a 
TrainedModelConfig and defines a format to store the binary models. 
The _start and _stop deployment actions control the model lifecycle 
and the model can be directly evaluated with the _infer endpoint. 
Two types of NLP tasks are supported: Named Entity Recognition and Fill Mask.

The feature branch consists of these PRs: #73523, #72218, #71679
#71323, #71035, #71177, #70713