[ML][Inference] adding .ml-inference* index and storage #47267

benwtrent · 2019-09-28T19:57:14Z

Adds a new .ml-inference index, a trained model configuration document object, and methods to put/get a document from the index.

I opted to version the indices, similar to what we do for Transforms. This is partially blocked by #47241, so I am opening this as WIP until that PR is merged.

elasticmachine · 2019-09-28T19:57:15Z

Pinging @elastic/ml-core

...igh-level/src/main/java/org/elasticsearch/client/ml/inference/NamedXContentObjectHelper.java

...ugin/core/src/main/java/org/elasticsearch/xpack/core/ml/utils/NamedXContentObjectHelper.java

przemekwitek · 2019-09-30T08:29:33Z

...ugin/core/src/main/java/org/elasticsearch/xpack/core/ml/utils/NamedXContentObjectHelper.java

+import java.io.IOException;
+import java.util.List;
+
+public final class NamedXContentObjectHelper {


Just to make sure: to what extent do we want to decouple server code from client code?
It feels to me that a utility class like this one could be placed in some "common" package visible to both server and client.

Potentially, but I am not convinced it belongs in the root server code. It is something only used by this plugin.

NOTE: I think that for this to be usable by both the client and xpack.core, it would need to live in the common ES server code. That does not seem justified to me.

przemekwitek · 2019-09-30T08:34:19Z

client/rest-high-level/src/main/java/org/elasticsearch/client/ml/job/util/TimeUtil.java

@@ -45,4 +46,13 @@ public static Date parseTimeField(XContentParser parser, String fieldName) throw
            "unexpected token [" + parser.currentToken() + "] for [" + fieldName + "]");
    }

+    public static Instant parseTimeFieldToInstant(XContentParser parser, String fieldName) throws IOException {


Could org.elasticsearch.client.transform.transforms.util.TimeUtil class be moved to org.elasticsearch.client.common and reused here?
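For readers unfamiliar with this kind of helper, here is a minimal sketch of what parsing a time field to an `Instant` might look like. It is not the PR's implementation. It assumes the field arrives either as an epoch-millisecond number or as an ISO-8601 string, and it takes a plain `Object` instead of an `XContentParser` so that it stays self-contained. The class name `TimeFieldSketch` and the method `toInstant` are hypothetical.

```java
import java.time.Instant;

// Hypothetical sketch, not the PR's TimeUtil: shows one way to turn a time field into an Instant.
final class TimeFieldSketch {

    // Assumption: numbers are epoch milliseconds and strings are ISO-8601 instants.
    static Instant toInstant(Object value, String fieldName) {
        if (value instanceof Number) {
            return Instant.ofEpochMilli(((Number) value).longValue());
        }
        if (value instanceof String) {
            return Instant.parse((String) value);
        }
        // Any other value type is rejected, naming the offending field.
        throw new IllegalArgumentException("unexpected value [" + value + "] for [" + fieldName + "]");
    }
}
```

A shared utility of this shape would have no ML- or transform-specific dependencies, which is what would make it a candidate for a common client package.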

.../rest-high-level/src/main/java/org/elasticsearch/client/ml/inference/TrainedModelConfig.java

.../plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/inference/TrainedModelConfig.java

...main/java/org/elasticsearch/xpack/core/ml/inference/persistence/InferenceIndexConstants.java

...l/src/main/java/org/elasticsearch/xpack/ml/inference/persistence/InferenceInternalIndex.java

przemekwitek · 2019-09-30T08:49:04Z

.../ml/src/main/java/org/elasticsearch/xpack/ml/inference/persistence/TrainedModelProvider.java

+                    r -> listener.onResponse(true),
+                    e -> {
+                        logger.error(
+                            new ParameterizedMessage("[{}] failed to store trained model for inference", trainedModelConfig.getModelId()),


We use Messages.INFERENCE_FAILED_TO_STORE_MODEL in line 80. Should it be used here as well?

It seems to me it should be moved into the if-clause and log the corresponding messages.

I consider the two messages to be different. I think potentially user-facing errors should be sentences (or close to it). Logged messages should follow the prevailing format of [<RESOURCE_ID>] <MESSAGE>.

I updated the message to include the model version. But again, for logging, it is much easier to grep and visually parse logs if the resource ID is right at the front. This is the prevailing pattern in our ML plugin and the Transform Plugin.

(nit) I should not have piggybacked on this comment. For me, it would be clearer if the logging corresponded to the listener.onFailure calls, so that we have two different log messages for the two failure cases.
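The distinction drawn in this thread between user-facing messages and log messages can be sketched as follows. This is an illustration only, not the PR's code. The class `MessageStyleSketch`, both method names, and the template text are assumptions, and the real `Messages` constant may be worded differently.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical sketch, not the PR's code: contrasts the two message styles discussed in this thread.
final class MessageStyleSketch {

    // Assumed template text for a user-facing error.
    static final String FAILED_TO_STORE_MODEL = "Failed to store trained model [{0}]";

    // User-facing message: reads as a sentence and is built from a template.
    static String userMessage(String modelId) {
        return new MessageFormat(FAILED_TO_STORE_MODEL, Locale.ROOT).format(new Object[] { modelId });
    }

    // Log message: the resource ID comes first, so searching logs for the bracketed ID finds every related line.
    static String logMessage(String modelId, String message) {
        return "[" + modelId + "] " + message;
    }
}
```

With the ID at the front, every log line about one model shares a common prefix, which is the grep-friendliness argument made above.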

przemekwitek · 2019-09-30T08:50:19Z

...k/plugin/ml/src/test/java/org/elasticsearch/xpack/ml/integration/TrainedModelProviderIT.java

+
+public class TrainedModelProviderIT extends MlSingleNodeTestCase {
+
+    TrainedModelProvider trainedModelProvider;


hendrikmuhs · 2019-09-30T11:24:27Z

...main/java/org/elasticsearch/xpack/core/ml/inference/persistence/InferenceIndexConstants.java

+ */
+public final class InferenceIndexConstants {
+
+    public static final String INDEX_VERSION = "000001";


This looks like a rollover pattern. I think the version should use a variable number of digits (a single digit in this case), and the rollover pattern should be an additional suffix if rollover is wanted.
Are you planning to use rollover for this in the future to delete old models? It seems that both configs and models are stored in this index, which would rule out rollover.

@hendrikmuhs This is for index versioning. Simply using 1 won't work.

You mean because of sorting on retrieval and its limitations? OK, 1 digit limits us to 9 versions. But almost 1 million versions seems a little bit too much.

I think a single extension for both potential future rollover and mappings version is OK. The rollover API is designed with periodic changes to index templates in mind. And since rollover uses six digits, it's best to use six digits. If we use fewer and then decide to do rollover, it forces us to have two suffixes.

For support reasons I would prefer 2 suffixes, but you have a point.

It's a good hint. I am currently using 1 digit in transforms, which limits versioning to 9 versions. I will upgrade to use more digits; 6 still sounds like too much, as I rule out rollover for this index.
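The sorting concern raised in this thread can be illustrated with a short sketch. It assumes index names are compared as plain strings and that the full name is a prefix followed by the zero-padded version. The PR itself only shows the `.ml-inference*` pattern and `INDEX_VERSION = "000001"`, so the `INDEX_PREFIX` value, the class `IndexVersionSketch`, and both methods are hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the PR: illustrates fixed-width version suffixes.
final class IndexVersionSketch {

    // Assumed prefix, chosen to match the ".ml-inference*" pattern.
    static final String INDEX_PREFIX = ".ml-inference-";

    // Zero-pads the version to six digits, the width mentioned above for rollover.
    static String indexName(int version) {
        if (version < 1 || version > 999_999) {
            throw new IllegalArgumentException("version out of range: " + version);
        }
        return INDEX_PREFIX + String.format(Locale.ROOT, "%06d", version);
    }

    // Returns the lexicographically greatest name; with fixed-width suffixes this is also the highest version.
    static String latest(List<String> indexNames) {
        return Collections.max(indexNames);
    }
}
```

With unpadded suffixes, a string comparison would rank version 9 above version 10, which is why a single digit caps the scheme at 9 versions. Any fixed width avoids that; six digits simply matches the rollover convention.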

hendrikmuhs · 2019-09-30T11:33:22Z

.../ml/src/main/java/org/elasticsearch/xpack/ml/inference/persistence/TrainedModelProvider.java

+                    r -> listener.onResponse(true),
+                    e -> {
+                        logger.error(
+                            new ParameterizedMessage("[{}] failed to store trained model for inference", trainedModelConfig.getModelId()),


It seems to me it should be moved into the if-clause and log the corresponding messages.

.../ml/src/main/java/org/elasticsearch/xpack/ml/inference/persistence/TrainedModelProvider.java

przemekwitek

LGTM

przemekwitek · 2019-09-30T13:17:32Z

...evel/src/test/java/org/elasticsearch/client/ml/inference/NamedXContentObjectHelperTests.java

+
+    static class NamedTestObject implements NamedXContentObject {
+
+        private String fieldValue;


nit: Could you move the member variable down, right before constructor?

hendrikmuhs

LGTM

benwtrent · 2019-09-30T15:47:11Z

run elasticsearch-ci/2

droberts195

LGTM

* [ML][Inference] adding .ml-inference* index and storage
* Addressing PR comments
* Allowing null definition, adding validation tests for model config
* fixing line length

#47310)
* [ML][Inference] adding .ml-inference* index and storage (#47267)
* [ML][Inference] adding .ml-inference* index and storage
* Addressing PR comments
* Allowing null definition, adding validation tests for model config
* fixing line length
* adjusting for backport

[ML][Inference] adding .ml-inference* index and storage

c6fbaa5

benwtrent added >non-issue :ml Machine learning v8.0.0 v7.5.0 labels Sep 28, 2019

przemekwitek reviewed Sep 30, 2019

hendrikmuhs reviewed Sep 30, 2019

Addressing PR comments

c8d4aab

przemekwitek approved these changes Sep 30, 2019

hendrikmuhs approved these changes Sep 30, 2019

benwtrent added 2 commits September 30, 2019 11:19

Allowing null definition, adding validation tests for model config

9fa1e9d

fixing line length

4f1f040

droberts195 approved these changes Sep 30, 2019

benwtrent merged commit f6f2c9f into elastic:master Sep 30, 2019

benwtrent deleted the feature/ml-inference-adding-inference-index branch September 30, 2019 16:26

benwtrent mentioned this pull request Sep 30, 2019

[7.x] [ML][Inference] adding .ml-inference* index and storage (#47267) #47310

Merged

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021



