Introduce data stream options and failure store configuration classes #109515

gmarouli · 2024-06-10T07:34:55Z

To facilitate enabling and disabling the failure store, as well as component template composition, we introduce new metadata classes that can support a more extensible failure store configuration.

We would like to introduce data stream options. Data stream options capture the configuration of data stream level features (both smaller and larger ones), such as the failure store and, in the future, the data stream lifecycle. They differ from settings because they are applied at the data stream level and not per backing index.

This PR only sets up the basic classes, to enable follow-up PRs that will actually use them.

Examples (these are not final; they are only meant to help visualise a potential direction):

GET _data_stream/my-*/_options
{
  "data_streams": [
    {
      "name": "my-non-opinionated-ds",
      "options": {
      }
    },
    {
      "name": "my-fs",
      "options": {
        "failure_store": {
          "enabled": true
        }
      }
    },
    {
      "name": "my-no-fs",
      "options": {
        "failure_store": {
          "enabled": false
        }
      }
    }
  ]
}

// If we decide to add lifecycle here too:
PUT _data_stream/my-fs/_options
{
   "failure_store": {
     "enabled": true
   },
   "lifecycle": {
   }
}

What we see above are 3 data streams:

- my-fs with the failure store explicitly enabled
- my-no-fs with the failure store explicitly disabled, and
- my-non-opinionated-ds which does not specify what to do with the failure store, so for now it means failure store disabled but that could change in the future.
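
The three states described above can be modelled with a nullable failure store section. The following is a minimal sketch under that assumption; `OptionsSketch` and `FailureStoreSketch` are hypothetical stand-ins for illustration, not the classes introduced by this PR.

```java
import java.util.Optional;

// Hypothetical stand-ins for illustration only; not the classes added in this PR.
record FailureStoreSketch(boolean enabled) {}

// A null failureStore models "options": {} (no explicit configuration).
record OptionsSketch(FailureStoreSketch failureStore) {

    static final OptionsSketch EMPTY = new OptionsSketch(null);

    // Empty when the data stream does not say what to do with the failure store.
    Optional<Boolean> explicitFailureStore() {
        return Optional.ofNullable(failureStore).map(FailureStoreSketch::enabled);
    }

    // Assumption taken from the description: unspecified currently means disabled.
    boolean failureStoreEnabled() {
        return explicitFailureStore().orElse(false);
    }
}

class OptionsSketchDemo {
    public static void main(String[] args) {
        OptionsSketch myFs = new OptionsSketch(new FailureStoreSketch(true));
        OptionsSketch myNoFs = new OptionsSketch(new FailureStoreSketch(false));
        OptionsSketch nonOpinionated = OptionsSketch.EMPTY;

        System.out.println("my-fs: " + myFs.failureStoreEnabled());
        System.out.println("my-no-fs: " + myNoFs.failureStoreEnabled());
        System.out.println("my-non-opinionated-ds: " + nonOpinionated.failureStoreEnabled());
    }
}
```

Keeping the "explicit" view separate from the "effective" view means the default for unspecified data streams could change later without touching the stored configuration.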

Template composition examples pending

elasticsearchmachine · 2024-06-10T07:35:30Z

Pinging @elastic/es-data-management (Team:Data Management)

nielsbauman

I left one minor typo suggestion, other than that it LGTM :).
I do think it makes sense to wait with merging this, because the DataStreamOptions implementation might differ a bit depending on the ongoing discussions we're having. What do you think?

server/src/main/java/org/elasticsearch/cluster/metadata/DataStreamFailureStore.java

dakrone

I left a couple of comments. If we decide we want to include lifecycle options (which I would like to, but I know it's up for discussion), then it'd be interesting to see that here, even if it's not used yet.

server/src/main/java/org/elasticsearch/cluster/metadata/DataStreamFailureStore.java

server/src/main/java/org/elasticsearch/cluster/metadata/DataStreamOptions.java

gmarouli · 2024-06-10T16:46:21Z

I left one minor typo suggestion, other than that it LGTM :). I do think it makes sense to wait with merging this, because the DataStreamOptions implementation might differ a bit depending on the ongoing discussions we're having. What do you think?

Yes, definitely, I am open to changing it. This is supposed to help identify red flags by having something more concrete. With this I can at least keep preparing for the API, even if a few things could change, without feeling I am going in the wrong direction.

gmarouli · 2024-06-10T16:47:45Z

I left a couple of comments. If we decide we want to include lifecycle options (which I would like to, but I know it's up for discussion), then it'd be interesting to see that here, even if it's not used yet.

Happy to include it here just for demonstration purposes. I can always revert the commit if we think it doesn't fit anymore.

gmarouli · 2024-06-11T07:01:49Z

@elasticmachine update branch

gmarouli · 2024-06-12T07:17:48Z

@elasticmachine update branch


gmarouli · 2024-06-14T10:36:13Z

@elasticmachine update branch

gmarouli · 2024-06-17T12:40:33Z

@dakrone & @jbaiera could you give this another look please. Thank you ☺️ .

gmarouli · 2024-06-18T06:38:23Z

@elasticmachine update branch

gmarouli · 2024-06-18T10:54:24Z

@elasticmachine update branch

gmarouli · 2024-06-18T12:23:50Z

@elasticmachine update branch

dakrone

LGTM, I left one comment

dakrone · 2024-06-18T21:09:36Z

server/src/main/java/org/elasticsearch/cluster/metadata/DataStreamFailureStore.java

+    public static DataStreamFailureStore read(StreamInput in) throws IOException {
+        return new DataStreamFailureStore(in.readBoolean());
+    }


Is there a reason you went with a static method rather than a constructor? We've been moving to constructors for this more and more in the code.

Oh I wasn't aware we prefer constructors. I preferred the static method because it allows more flexibility, so I thought it was better to always use a static method to facilitate all cases.

However, if this is not the direction we are following and in this case there is no blocker in using a constructor, I will change it to use a constructor.

I'm personally also more of a fan of static methods (because in some cases we need to use a static method rather than a constructor).
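
To make the two styles under discussion concrete, here is a small sketch. It uses `java.io.DataInput` and `java.io.DataOutput` as stand-ins for Elasticsearch's stream classes, and `FailureStoreWire` is a hypothetical record, so this is not the code from the PR. It also shows one case where only a static method fits: a reader that may return `null`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical record; java.io.DataInput stands in for Elasticsearch's StreamInput.
record FailureStoreWire(boolean enabled) {

    // Style 1: a deserializing constructor that delegates to the canonical one.
    FailureStoreWire(DataInput in) throws IOException {
        this(in.readBoolean());
    }

    // Style 2: a static factory method.
    static FailureStoreWire read(DataInput in) throws IOException {
        return new FailureStoreWire(in.readBoolean());
    }

    // Only a static method can express "the value may be absent".
    static FailureStoreWire readOptional(DataInput in) throws IOException {
        return in.readBoolean() ? read(in) : null;
    }

    void writeTo(DataOutput out) throws IOException {
        out.writeBoolean(enabled);
    }

    // Writes a presence flag first, then the value if there is one.
    static void writeOptional(DataOutput out, FailureStoreWire value) throws IOException {
        out.writeBoolean(value != null);
        if (value != null) {
            value.writeTo(out);
        }
    }
}

class FailureStoreWireDemo {

    // Serializes with writeTo and reads back with the constructor.
    static FailureStoreWire viaConstructor(FailureStoreWire value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            value.writeTo(new DataOutputStream(bytes));
            return new FailureStoreWire(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Serializes with writeOptional and reads back with the static readOptional.
    static FailureStoreWire viaOptional(FailureStoreWire value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            FailureStoreWire.writeOptional(new DataOutputStream(bytes), value);
            return FailureStoreWire.readOptional(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaConstructor(new FailureStoreWire(true)));
        System.out.println(viaOptional(new FailureStoreWire(false)));
        System.out.println(viaOptional(null));
    }
}
```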

gmarouli · 2024-09-04T09:34:01Z

@elasticmachine update branch

gmarouli · 2024-09-04T16:05:35Z

@jbaiera if you want to see how it looks with the lifecycle as well, check the diff in a26bb6d (the revert commit)

jbaiera · 2024-09-09T06:21:11Z

server/src/main/java/org/elasticsearch/cluster/metadata/DataStreamFailureStore.java

+ * supports the following configurations:
+ * - enabled
+ */
+public record DataStreamFailureStore(boolean enabled) implements SimpleDiffable<DataStreamFailureStore>, ToXContentObject {


One of the things we're talking about doing is adding a cluster setting that will contain index patterns which will state that failure store is enabled for any matching indices by default. The plan is to fall back to using that cluster setting in the case that failure stores haven't been explicitly enabled or disabled via templates at creation time, or in the state.

To save on headache going forward, should this boolean be an optional Boolean?
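
The fallback described in this comment could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class name, the use of a nullable `Boolean` for the explicit value, and the simple `*` wildcard syntax for the patterns in the cluster setting.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper; the cluster setting and its pattern syntax are assumptions.
final class FailureStoreFallbackSketch {

    // Translates a simple '*' wildcard pattern into a regex and matches the whole name.
    static boolean matches(String pattern, String name) {
        String regex = Arrays.stream(pattern.split("\\*", -1))
            .map(Pattern::quote)
            .collect(Collectors.joining(".*"));
        return name.matches(regex);
    }

    // An explicit value always wins; null means "not configured" and falls back to the patterns.
    static boolean resolve(Boolean explicit, List<String> defaultPatterns, String dataStreamName) {
        if (explicit != null) {
            return explicit;
        }
        return defaultPatterns.stream().anyMatch(p -> matches(p, dataStreamName));
    }

    public static void main(String[] args) {
        List<String> patterns = List.of("logs-*");
        System.out.println(resolve(null, patterns, "logs-app"));
        System.out.println(resolve(Boolean.FALSE, patterns, "logs-app"));
        System.out.println(resolve(null, patterns, "metrics-app"));
    }
}
```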

Good point! Thinking further about this, I would expect the three cases to look as follows, with the last one being the "not explicitly set" case:

Explicitly enabled:

PUT _data_stream/my-fs/_options { "failure_store": { "enabled": true } }

Explicitly disabled:

PUT _data_stream/my-fs/_options { "failure_store": { "enabled": false } }

No explicit configuration:

PUT _data_stream/my-fs/_options {}

This follows the way that the lifecycle works. In template composition, a user can set failure_store: null to remove any configuration set by previous templates.

Since this is a new configuration type we could always decide we would like to change the way we work with this, but I thought we should at least consider it.

What do you think?
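
A rough sketch of the composition rule mentioned above, under simplifying assumptions: each template's contribution is flattened to a map in which the key `failure_store` maps to the `enabled` value, a key mapped to `null` resets earlier configuration, and an absent key leaves it untouched. The class and method names are hypothetical and not part of this PR.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of template composition; not the implementation in Elasticsearch.
final class OptionsCompositionSketch {

    static final String FAILURE_STORE = "failure_store";

    // Later templates win. A key mapped to null removes earlier configuration;
    // a missing key leaves whatever earlier templates configured.
    // Returns null when the composed result has no explicit configuration.
    static Boolean composeFailureStoreEnabled(List<Map<String, Boolean>> templates) {
        Boolean result = null;
        for (Map<String, Boolean> template : templates) {
            if (template.containsKey(FAILURE_STORE)) {
                result = template.get(FAILURE_STORE);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Boolean> enable = Map.of(FAILURE_STORE, true);
        Map<String, Boolean> silent = Map.of();
        // Map.of rejects null values, so the reset uses singletonMap instead.
        Map<String, Boolean> reset = Collections.singletonMap(FAILURE_STORE, (Boolean) null);

        System.out.println(composeFailureStoreEnabled(List.of(enable, silent)));
        System.out.println(composeFailureStoreEnabled(List.of(enable, reset)));
    }
}
```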

Ah I see, since there is only one property we can just assume the entire section will be removed from options if it's not present. I'm good with that

gmarouli · 2024-09-11T06:04:46Z

@elasticmachine update branch

jbaiera

LGTM!

jbaiera · 2024-09-11T13:46:11Z


Ah I see, since there is only one property we can just assume the entire section will be removed from options if it's not present. I'm good with that

gmarouli · 2024-09-11T13:48:32Z

@elasticmachine update branch


Introduce data stream options and failure store configuration classes

da3bab3

gmarouli added >non-issue :Data Management/Data streams Data streams and their lifecycles labels Jun 10, 2024

gmarouli requested review from jbaiera and nielsbauman June 10, 2024 07:35

elasticsearchmachine added the v8.15.0 label Jun 10, 2024

elasticsearchmachine added the Team:Data Management Meta label for data/management team label Jun 10, 2024

gmarouli added 2 commits June 10, 2024 11:06

Fix format

6352d2e

Fix license

9e41720

nielsbauman reviewed Jun 10, 2024

server/src/main/java/org/elasticsearch/cluster/metadata/DataStreamFailureStore.java

gmarouli requested review from dakrone and nielsbauman June 10, 2024 12:52

Remove unnecessary getter

9f19c4a

dakrone reviewed Jun 10, 2024

Fix wrong field name

47cca13

Add lifecycle to show how they combine

4fcd3af

gmarouli requested a review from dakrone June 11, 2024 07:01

Merge branch 'main' into add-data-stream-options

89c950b

Merge branch 'main' into add-data-stream-options

9b1984e

nielsbauman reviewed Jun 14, 2024

Update server/src/main/java/org/elasticsearch/cluster/metadata/DataSt…

fbc03be

…reamOptions.java Co-authored-by: Niels Bauman <[email protected]>

elasticmachine and others added 3 commits June 14, 2024 20:36

Merge branch 'main' into add-data-stream-options

2efb558

Merge branch 'main' into add-data-stream-options

3fcf60c

Review comments

a0453dc

Merge branch 'main' into add-data-stream-options

dcc04fd

Merge branch 'main' into add-data-stream-options

28777ee

Merge branch 'main' into add-data-stream-options

8613f63

dakrone approved these changes Jun 18, 2024

gmarouli added 2 commits June 19, 2024 12:42

Merge with main

24aa320

Use constructor instead of static method.

dd769d0

elasticsearchmachine added v8.16.0 and removed v8.15.0 labels Jul 4, 2024

elasticmachine and others added 2 commits September 4, 2024 10:34

Merge branch 'main' into add-data-stream-options

9f1fd8c

Remove lifecycle from the data stream options

a26bb6d

jbaiera reviewed Sep 9, 2024

Merge branch 'main' into add-data-stream-options

f31a50a

gmarouli requested a review from jbaiera September 9, 2024 11:02

Merge branch 'main' into add-data-stream-options

dc4cb37

jbaiera approved these changes Sep 11, 2024

Merge branch 'main' into add-data-stream-options

1d3d566

gmarouli added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Sep 11, 2024

elasticsearchmachine merged commit 077b585 into elastic:main Sep 11, 2024
15 checks passed

gmarouli deleted the add-data-stream-options branch September 11, 2024 14:46

gmarouli mentioned this pull request Sep 19, 2024

Move the failure store enable flag into the data stream options #113176

Merged
