-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove index_mode property from index templates #85985
Remove index_mode property from index templates #85985
Conversation
aa63aae
to
da14b26
Compare
da14b26
to
720cf09
Compare
…rty_from_index_template
…rty_from_index_template
Pinging @elastic/es-data-management (Team:Data Management) |
Pinging @elastic/es-analytics-geo (Team:Analytics) |
Pinging @elastic/clients-team (Team:Clients) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall it looks reasonable to me, but I am a bit concern about switching from enum to boolean to indicate time series database. I think we should either rename this boolean to reflect what it actually does instead of which type of data streams it is used in or we should keep it an enum since I anticipate other types of specialized indices will follow if this one is successful.
About Generating the |
The plan is to not support the dynamic case, in this case the In case no time series dimension fields are defined as concrete field, but a dynamic mapping template is defined then we will require a |
@imotov What other specialised index modes do you expect? I'm not against adding IndexMode back to DataStream class, but if we don't add more index modes in the future, it feels like 'an extension point' in data streams that isn't real. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally looks good to me also, but I think I agree with @imotov about keeping it as an enum for future index types.
// (index_mode was behind a feature in the xcontent parser, so it could never actually used) | ||
// (this used to be an optional enum, so just need to (de-)serialize a false boolean value here) | ||
boolean value = in.readBoolean(); | ||
assert value == false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a message to this assert?
…rty_from_index_template
} | ||
|
||
var settings = MetadataIndexTemplateService.resolveSettings(indexTemplate, componentTemplates()); | ||
if (IndexMetadata.INDEX_ROUTING_PATH.exists(settings)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to auto-generate the routing path maybe we should use index_mode
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is going to change once index.routing_path
is going to be generated from the template's mapping. Maybe we can also use index_mode
index setting here, but that isn't something we said a user would need to configure in order to create data streams in tsdb mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think configuring the mode is still a good thing. Right now we support configuring fields as time_series_dimension
s and without modifying the index itself. It'd be pretty surprising if upgrading Elasticsearch caused new indices with those mappings to be time_series. I'm not sure that's a terrible idea, but I think it's worth double checking. Personally I'd only try and generate the routing_path
if the index is configured in time_series
mode. But I might be wrong here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way, that concern doesn't block this PR - it'd block the PR that generates the routing_path
.
After chatting with @dakrone I realised there will be more instances of IndexMode, so I will replace DataStream's |
weizijun commented 5 days ago
I think this is a pretty decent way to generate it, yeah. It's much better than what I was trying to do - building it dynamically from the keyword fields, updating it as things were added. Down that road lies madness. I also spent a while thinking about how to generate the field's value when initializing the index itself for the first time. Like, build it from templates and stuff and then derive the routing_path statically from the keyword dimensions. This should have worked. But everything happens in the wrong order. So I never pushed on it. I'm a little worried that this is abstracting away a complex and easy to misconfigure thing inside of templates that may not be owned by folks that know what's up. But it doesn't seem like a bad default. I've mostly listed all of the keyword dimensions in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would prefer to keep it enum everywhere, but I am ok with converting it to boolean internally. I am not sure I like the tsdb abbreviation though.
@@ -81,6 +81,9 @@ public final class DataStream implements SimpleDiffable<DataStream>, ToXContentO | |||
private final boolean replicated; | |||
private final boolean system; | |||
private final boolean allowCustomRouting; | |||
// This boolean field is for keeping track of whether a data stream is a tsdb data stream. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks like a leftover from the previous iteration.
* | ||
* @return new {@code DataStream} instance with the rollover operation applied | ||
*/ | ||
public DataStream rollover(Index writeIndex, long generation, IndexMode indexModeFromTemplate) { | ||
public DataStream rollover(Index writeIndex, long generation, boolean isTsdbTemplate) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is internal, so it is probably ok to keep it a boolean here, but I would still prefer not to use the abbreviation Tsdb. I think we are calling it TimeSeries everywhere else in the code.
Right, dynamic field mapping make things for us more difficult. But I suspect, like you, that in most cases the keyword dimension fields are statically configured in the mapping of the template. |
@imotov I've renamed the variables/parameters to use not use tsdb abbreviation and removed the stale comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, except for the enum serialization part
// (index_mode is removed and was part of code based when tsdb was behind a feature flag) | ||
// (index_mode was behind a feature in the xcontent parser, so it could never actually used) | ||
// (this used to be an optional enum, so just need to (de-)serialize a false boolean value here) | ||
boolean value = in.readBoolean(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the most important place to keep it enum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The index_mode is removed from the the composable index templates, the only reason that a boolean is read here is because index mode wasn't behind a feature flag in the binary serialisation code (it was in the xcontent serialization).
The index mode is kept in the DataSteam
class.
So I think this is good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM also
This PR removes the
index_mode
field from index templates (that was part ofdata_stream
json object).This field was used to determine whether a tsdb data stream should created or a regular data stream.
As part of this change, a tsdb data stream is now created when the template that matches with the data stream to be created has the following properties:
index.routing_path
setting has been defined in the composable index template.index.routing_path
setting refers to fields in the mapping of this template that are of typekeyword
and thetime_series_dimension
attribute is set totrue
.Prior to this change a template that creates tsdb data stream looks like:
And with this change a template that creates a tsdb data stream looks like:
After this PR the following changes will be made:
index.routing_path
setting if not defined based on the mapping. This will make specifying this setting no longer required.index.routing_path
with a wildcard patten is required and in this case configuring these fields in the template mapping isn't required. Validation will happen upon indexing when dynamic fields are mapped.