
[ML] Storing TorchScript models #70827

Closed · 1 task done
davidkyle opened this issue Mar 24, 2021 · 6 comments

Labels: :ml Machine learning · Team:ML Meta label for the ML team

davidkyle (Member) commented Mar 24, 2021

TorchScript models are compressed binary files whose size depends on their purpose; for neural network models, sizes larger than 1GB are not unusual. Storing such a large model as a single document in Elasticsearch would be sub-optimal, as it requires contiguous memory, and uploading a document larger than the HTTP max content length limit is infeasible.

Better performance will be found by splitting the model into chunks and streaming those chunks to the native process, where they are reassembled on use. ML already uses this pattern for Anomaly Detection job state. AD uses 16MB chunks, but it is worth benchmarking smaller chunk sizes. Because the model is binary data it must be base64 encoded; it can then be stored in a Binary field or in an index with mappings disabled.
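A minimal sketch of the chunking step, assuming 16MB chunks and plain java.util.Base64 (the class and method names here are illustrative, not the actual implementation):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

public class ModelChunker {

    // 16MB of raw bytes per chunk, matching the size used for AD state.
    private static final int CHUNK_SIZE = 16 * 1024 * 1024;

    /** Reads a TorchScript file and returns one base64 string per chunk. */
    static List<String> chunk(Path modelFile) throws IOException {
        List<String> chunks = new ArrayList<>();
        Base64.Encoder encoder = Base64.getEncoder();
        try (InputStream in = Files.newInputStream(modelFile)) {
            byte[] buffer = new byte[CHUNK_SIZE];
            int read;
            // readNBytes blocks until CHUNK_SIZE bytes are read or EOF,
            // so only the final chunk can be short.
            while ((read = in.readNBytes(buffer, 0, CHUNK_SIZE)) > 0) {
                byte[] exact = read == CHUNK_SIZE ? buffer : Arrays.copyOf(buffer, read);
                // encodeToString copies its input, so the buffer can be reused.
                chunks.add(encoder.encodeToString(exact));
            }
        }
        return chunks;
    }
}
```

A real implementation would index each chunk as it is produced rather than accumulating a 1GB model's chunks in memory.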

Once the model is split into its constituent chunks, a meta-document will track the number of chunks; the IDs of the documents containing the chunks will follow a predictable naming convention. One vital piece of information required by the native process is the size of the model in bytes, which must be sent before any chunks are streamed. That value should be stored in the meta-document, as reading all the chunks into a buffer just to calculate the decoded size would defeat the purpose of streaming.
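For illustration, the meta-document might carry fields along these lines (field and ID names are hypothetical, not the actual mapping):

```java
/**
 * Sketch of the per-model meta-document, assuming a recent JDK with records.
 */
public record ModelMetadata(
    String modelId,
    int docCount,               // chunk docs follow a predictable naming
                                // convention, e.g. <modelId>_0 .. <modelId>_{docCount-1}
    long totalDefinitionLength  // raw (decoded) size in bytes, sent to the
                                // native process before any chunk is streamed
) {}
```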

TODO

  • Java strings use UTF-16, so storing base64 encoded data in a String is needlessly wasteful: we know the high byte will never be used. Parsing and decoding the strings further inflates the memory requirements. Is it possible to store the model in a byte[]? See the sketch below.
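For what it's worth, java.util.Base64 already encodes and decodes byte[] directly, so the chunk need never be materialised as a UTF-16 String if the surrounding parsing can hand over raw bytes. A small sketch:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class Base64Bytes {
    public static void main(String[] args) {
        byte[] raw = "example model bytes".getBytes(StandardCharsets.UTF_8);

        // One byte per base64 character: half the footprint of holding
        // the same characters in a UTF-16 String.
        byte[] encoded = Base64.getEncoder().encode(raw);

        // The decoder also accepts byte[] input, avoiding the String
        // round trip (and its extra copy) entirely.
        byte[] decoded = Base64.getDecoder().decode(encoded);

        System.out.println(Arrays.equals(raw, decoded)); // true
    }
}
```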
davidkyle added the :ml Machine learning label Mar 24, 2021
elasticmachine added the Team:ML label Mar 24, 2021

elasticmachine (Collaborator) commented:

Pinging @elastic/ml-core (Team:ML)

davidkyle (Member, Author) commented:

Implementation Considerations

Code review of the first implementation in #71323 raised a number of questions, many relating to index management:

  • The maximum recommended shard size is 50GB; with enough models this value would be exceeded. Should we use size-based rollover on the index where models are stored?
  • If rollover is used, can we ensure that a model is not spread across multiple indices, as that would complicate model management?
  • The above implies some sort of hand-rolled ILM. Is it easier to support model definitions in multiple indices?

Where should the models be stored?

1. A user-designated Index?

Perhaps the most flexible solution from an end user's perspective, and it would allow direct upload to the index since none of the access restrictions that come with system indices apply. For obvious performance reasons the binary blobs that make up the model should not be analysed or indexed for search; because ML does not own the index, the burden of defining the correct mappings falls on the user, who must also be aware that the requirement exists.
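For example, mapping the chunk field as binary keeps the blob out of the inverted index entirely, since binary fields are neither indexed nor searchable by default. A rough sketch using the low-level REST client (index and field names are hypothetical):

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RestClient;

public class CreateModelIndex {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(
                new HttpHost("localhost", 9200, "http")).build()) {
            Request request = new Request("PUT", "/my-models");
            // "binary" fields store base64 content with no analysis or
            // indexing cost.
            request.setJsonEntity("""
                {
                  "mappings": {
                    "properties": {
                      "definition": { "type": "binary" },
                      "doc_num":    { "type": "integer" }
                    }
                  }
                }
                """);
            client.performRequest(request);
        }
    }
}
```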

2. Store in a hidden Index?

Offers some protection against accidental deletion or corruption, as the index will not appear in wildcard expansion, but the user still has direct access to PUT documents into the index. The index and mappings would be managed by ML. The precedent is the .ml-state-* index, which has an ILM policy for size-based rollover at 50GB.
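For reference, a size-based rollover policy of the kind attached to .ml-state-* looks roughly like this (the policy name and client usage are illustrative):

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RestClient;

public class ModelIlmPolicy {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(
                new HttpHost("localhost", 9200, "http")).build()) {
            Request policy = new Request("PUT", "/_ilm/policy/model-storage-policy");
            // Size-based rollover at the recommended 50GB maximum shard
            // size, mirroring the .ml-state-* policy.
            policy.setJsonEntity("""
                {
                  "policy": {
                    "phases": {
                      "hot": {
                        "actions": {
                          "rollover": { "max_size": "50gb" }
                        }
                      }
                    }
                  }
                }
                """);
            client.performRequest(policy);
        }
    }
}
```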

3. A System Index?

Write access is limited to ML APIs, which would preclude uploading the model state directly to the index. A new ML REST endpoint would be required for uploading models, which is actually a desirable outcome as validation can be performed.

4. .ml-inference-* ?

This system index does not currently have an ILM policy for size-based rollover and is mainly used for config. Is mixing config and large binary blobs wise?

dimitris-athanasiou (Contributor) commented:

If rollover is used can we ensure that a model is not spread across multiple indices as that would complicate model management?

How does it complicate model management?

I imagine the model index is a single-shard index with rollover when it reaches a certain size, so this already pushes docs of the same model into the same index. It is, of course, possible to have multiple models being stored concurrently, which could result in individual shards hosting docs for multiple models.

The complication seems to be deleting models. We have had trouble doing this in the past as well: the delete-by-query request becomes too expensive. The only way I know to make this fast is to keep a model in its own index (or index pattern).
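To make the cost difference concrete, compare the two deletion paths (index and field names are hypothetical):

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RestClient;

public class DeleteModel {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(
                new HttpHost("localhost", 9200, "http")).build()) {
            // Model in its own index: dropping the index is a cheap
            // cluster-state operation, independent of model size.
            client.performRequest(new Request("DELETE", "/ml-models-my-model-000001"));

            // Model in a shared index: delete-by-query has to match and
            // delete every chunk document individually.
            Request dbq = new Request("POST", "/ml-shared-models/_delete_by_query");
            dbq.setJsonEntity("""
                { "query": { "term": { "model_id": "my-model" } } }
                """);
            client.performRequest(dbq);
        }
    }
}
```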

We could consider a hybrid strategy where we have a shared index pattern for models that are small (we need to define some threshold) and we force bigger models to go into their own index pattern.

davidkyle (Member, Author) commented:

How does it complicate model management?

In retrospect, not significantly. The design relies on model IDs being unique, which would normally be enforced by using unique document IDs, but that does not work across multiple indices, so an up-front check is required before the PUT. There are various bits of housekeeping we would have to implement, but that is not a huge amount of code.
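A rough sketch of that up-front check, searching the whole index pattern for an existing meta-document before accepting the PUT (names are hypothetical):

```java
import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class ModelIdCheck {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(
                new HttpHost("localhost", 9200, "http")).build()) {
            // Unique _ids only prevent duplicates within a single index,
            // so query across the whole pattern for the meta-document.
            Request search = new Request("GET", "/ml-models-*/_search");
            search.setJsonEntity("""
                { "size": 0, "terminate_after": 1,
                  "query": { "ids": { "values": ["my-model"] } } }
                """);
            Response response = client.performRequest(search);
            // Reject the PUT if hits.total.value is greater than zero.
            System.out.println(EntityUtils.toString(response.getEntity()));
        }
    }
}
```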

The current implementation for boosted tree models does not work with multiple indices, amending that will be the bulk of the work.

We could consider a hybrid strategy where we have a shared index pattern for models that are small (we need to define some threshold) and we force bigger models to go into their own index pattern.

Interesting 🤔

joshdevins (Member) commented:

We have a strategy that we are using for now. Shall we close this, or are there still open discussions?

davidkyle (Member, Author) commented:

Closed by #71323
