Min and max time for *any* index? #86579

Closed
nik9000 opened this issue May 9, 2022 · 1 comment
Labels
>enhancement needs:triage Requires assignment of a team area label

Comments

@nik9000
Member

nik9000 commented May 9, 2022

Description

In #85162 @martijnvg caused us to skip indices whose @timestamp range doesn't overlap the query. It's implemented by reading the index.time_series.start_time and index.time_series.end_time settings. I wonder if we can set these settings on any index. For example, could we make an ILM action that runs on read-only indices, checks the min and max time across all shards, and then sets the settings to those values? Such an action would let us skip querying those indices fairly cheaply.
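
A rough sketch of the idea, not Elasticsearch code: the function names, the tuple representation, and the inclusive millisecond bounds are all assumptions made for illustration. It combines per-shard min/max @timestamp values into one index-wide range and then checks whether a query's time range can overlap it.

```python
from typing import Iterable, Optional, Tuple

# (min_millis, max_millis); both ends inclusive in this sketch (an assumption).
Range = Tuple[int, int]


def index_time_range(shard_ranges: Iterable[Optional[Range]]) -> Optional[Range]:
    """Combine per-shard @timestamp ranges into a single index-wide range.

    A shard that reports None (for example, no documents) is ignored.
    Returns None when no shard reported a range.
    """
    present = [r for r in shard_ranges if r is not None]
    if not present:
        return None
    return (min(lo for lo, _ in present), max(hi for _, hi in present))


def can_skip(index_range: Optional[Range], query: Range) -> bool:
    """True when the index cannot contain documents inside the query range."""
    if index_range is None:
        # Unknown range: be conservative and query the index.
        return False
    lo, hi = index_range
    q_lo, q_hi = query
    return hi < q_lo or lo > q_hi


if __name__ == "__main__":
    rng = index_time_range([(10, 20), None, (5, 15)])
    print(rng, can_skip(rng, (21, 30)), can_skip(rng, (20, 30)))
```

The hypothetical ILM action would run the first step once on a read-only index and persist the result, so that only the cheap overlap check runs at search time.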

@nik9000 nik9000 added >enhancement needs:triage Requires assignment of a team area label labels May 9, 2022
@martijnvg
Member

martijnvg commented May 11, 2022

Today we already compute IndexLongFieldRange for indices that are read-only (frozen and searchable-snapshot-backed indices) and store this range in the cluster state. So I think this isn't needed?

Also, IndexLongFieldRange is more accurate, since it is computed from the min and max values of the @timestamp field across all shards of an index, whereas the index.time_series.start_time and index.time_series.end_time settings we set can be seen as outer bounds. Indexing documents with a @timestamp outside this range isn't possible (and will fail), but for various reasons the actual range of @timestamp values may be a bit smaller than the range configured in these index settings. Either way, what is defined in index.time_series.start_time and index.time_series.end_time is good enough to decide whether to skip querying the shards of a tsdb data stream's backing index when the query falls outside this range.
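
To illustrate the difference between the two kinds of bounds, here is a small sketch with made-up numbers. The function name and the inclusive millisecond tuples are assumptions for illustration, not how Elasticsearch represents these values. Because the actual range sits inside the configured bounds, skipping based on the configured bounds never drops a matching index; it can only miss a chance to skip.

```python
from typing import Tuple

# (min_millis, max_millis); both ends inclusive in this sketch (an assumption).
Range = Tuple[int, int]


def must_query(index_range: Range, query: Range) -> bool:
    """True when the index range overlaps the query range."""
    lo, hi = index_range
    q_lo, q_hi = query
    return lo <= q_hi and q_lo <= hi


# Hypothetical values: the configured bounds are wider than the data.
configured = (1000, 2000)  # stands in for start_time / end_time settings
actual = (1200, 1800)      # stands in for the computed min/max of @timestamp

# The query touches the configured bounds but not the actual data:
# the configured bounds still query the index, the exact range skips it.
edge_query = (1900, 2500)
print(must_query(configured, edge_query), must_query(actual, edge_query))

# The query is entirely outside both: either kind of bound skips the index.
far_query = (2500, 3000)
print(must_query(configured, far_query), must_query(actual, far_query))
```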

@nik9000 nik9000 closed this as completed May 11, 2022