-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add min/max range of the event.ingested field to cluster state for searchable snapshots #106252
Add min/max range of the event.ingested field to cluster state for searchable snapshots #106252
Conversation
f8c917d
to
de3b350
Compare
…each method on that class
…Map<String, DateFieldMapper.DateFieldType>>
ShardStateAction.execute updates eventIngestedRange and StartedShardEntry now has ShardLongFieldRange eventIngestedRange but it is not yet populated (set to UNKNOWN) - addressed in next commit
…k-on changes related. No testing against eventIngestedRange yet. Also see now that I need to add eventIngested support to RecoveryListener.onRecoveryDone intf
Next need to implement getEventIngestedRange on concrete impls of IndicesClusterStateService.Shard interface
…ateService.Shard interface
de3b350
to
7aa4b20
Compare
IndexLongFieldRange timestampRange = indexMetadata.getTimestampRange(); | ||
if (timestampRange.containsAllShardRanges() == false) { | ||
timestampRange = indexMetadata.getTimeSeriesTimestampRange(dateFieldType); | ||
if (timestampRange.containsAllShardRanges() == false) { /// MP TODO: is this logic needed for event.ingested? Not clear what this is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martijnvg - Could you help me understand what timestampRange.containsAllShardRanges()
is doing and why it being false means we need to check getTimeSeriesTimestampRange()
?
And does that check have any bearing on the event.ingested timestamp range I'm adding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has been a while since I looked at this. But the background is that for the @timestamp
we do additional filtering of shards.
Each tsdb backing index has an index.time_series.start_time
and index.time_series.end_time
index settings. Each document has @timestamp
field. During indexing documents are routed to the backing index that matches with the start end and time range.
For searches with a range query on @timestamp
field, this gives us the ability to filter out shards that have backing indices that do not match with the range.
This logic here is a fallback to use the time serie range (based on start and end time index settings in cluster state) in case the range matches with shards that are not read only (I guess not yet frozen or backed by searchable snapshot). Obviously this only applies to tsdb and the @timestamp
field. So I think this doesn't apply to the event.ingested
field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indices that are not readonly use IndexLongFieldRange.UNKNOWN
, but that's not quite everything, it's also for indices with multiple primaries: we only track the timestamp range for the whole index, being the union of all the per-shard ranges, and until we've started up each shard copy at least once we don't know exactly what the overall range will be so IndexLongFieldRange#isComplete
returns false
.
…ewrite-uses-event.ingested
…ewrite-uses-event.ingested
...sticsearch/xpack/searchablesnapshots/SearchableSnapshotsCanMatchOnCoordinatorIntegTests.java
Outdated
Show resolved
Hide resolved
…ewrite-uses-event.ingested
…ewrite-uses-event.ingested
…ewrite-uses-event.ingested
…nge; to IndexMetadata.Builder Removed null checks in IndexMetadata Added eventIngestedRangeInfo to AbstractBuilderTestCase
Added reading/writing of event.ingested in serialization code of static IndexMetadata.readFrom method and writeTo method
…ewrite-uses-event.ingested
…ED_FIELD_NAME Updated javadoc in CoordinatorRewriteContext
…egTests. One test seed that consistently fails with this code: ./gradlew -p x-pack/plugin/frozen-indices internalClusterTest --tests "org.elasticsearch.index.engine.frozen.FrozenIndexIT.testTimestampFieldTypeExposedByAllIndicesServices" -Dtests.seed=93869665C2B3B240
…ewrite-uses-event.ingested
if (in.getTransportVersion().onOrAfter(TransportVersions.EVENT_INGESTED_RANGE_IN_CLUSTER_STATE)) { | ||
this.eventIngestedRange = ShardLongFieldRange.readFrom(in); | ||
} else { | ||
this.eventIngestedRange = ShardLongFieldRange.UNKNOWN; // MP TODO: is this the right choice? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, UNKNOWN
makes sense for bwc.
server/src/main/java/org/elasticsearch/cluster/action/shard/ShardStateAction.java
Outdated
Show resolved
Hide resolved
…IndexIT and FrozenIndexTests
@thecoop - that commit has been reverted. After discussing with Henning we are not going to add any TransportVersion logic to the XContent config or parsers. |
…ewrite-uses-event.ingested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few more comments.
server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/metadata/IndexMetadataTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/metadata/IndexMetadataTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/serialization/ClusterSerializationTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/serialization/ClusterSerializationTests.java
Outdated
Show resolved
Hide resolved
…ewrite-uses-event.ingested
Added back the assert in public Builder.eventIngestedRange method - assert minClusterTransportVersion != null || eventIngestedRange == IndexLongFieldRange.UNKNOWN now passes tests. Changed IndexMetadataTests to use package private version of eventIngestedRange builder method that takes no TransportVersion
…onstructor and adjusted a few related tests
…ewrite-uses-event.ingested
…sert around eventIngestedRange Changes to: ClusterStateDiffIT and MetadataIndexTemplateService, which was causing several bwc tests to fail.
…test failures in TransportVersionUtilsTests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
.put(IndexMetadata.builder(temporaryIndexName).settings(finalResolvedSettings)) | ||
.put( | ||
IndexMetadata.builder(temporaryIndexName) | ||
.eventIngestedRange(IndexLongFieldRange.UNKNOWN) // necessary to pass asserts in ClusterState constructor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to avoid the package private call here (while technically in the same package it is very much client code of the IndexMetadata
class). Can we instead do:
.eventIngestedRange(IndexLongFieldRange.UNKNOWN) // necessary to pass asserts in ClusterState constructor | |
.eventIngestedRange(IndexLongFieldRange.NO_SHARDS, state.getMinTransportVersion()) // necessary to pass asserts in ClusterState constructor |
??
public Builder eventIngestedRange(IndexLongFieldRange eventIngestedRange, TransportVersion minClusterTransportVersion) { | ||
assert eventIngestedRange != null : "eventIngestedRange cannot be null"; | ||
assert minClusterTransportVersion != null || eventIngestedRange == IndexLongFieldRange.UNKNOWN | ||
: "eventIngestedRange should only be UNKNOWN if minClusterTransportVersion is null, but minClusterTransportVersion: " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be:
: "eventIngestedRange should only be UNKNOWN if minClusterTransportVersion is null, but minClusterTransportVersion: " | |
: "eventIngestedRange must be UNKNOWN when minClusterTransportVersion is null, but minClusterTransportVersion: " |
…ewrite-uses-event.ingested
…ewrite-uses-event.ingested
…ewrite-uses-event.ingested
…SerializationTests
#110352) Min/max range for the event.ingested timestamp field (part of Elastic Common Schema) was added to IndexMetadata in cluster state for searchable snapshots in #106252. This commit modifies the search coordinator to rewrite searches to MatchNone if the query searches a range of event.ingested that, from the min/max range in cluster state, is known to not overlap. This is the same behavior we currently have for the @timestamp field.
elastic#110352) Min/max range for the event.ingested timestamp field (part of Elastic Common Schema) was added to IndexMetadata in cluster state for searchable snapshots in elastic#106252. This commit modifies the search coordinator to rewrite searches to MatchNone if the query searches a range of event.ingested that, from the min/max range in cluster state, is known to not overlap. This is the same behavior we currently have for the @timestamp field.
#110352) (#110782) Min/max range for the event.ingested timestamp field (part of Elastic Common Schema) was added to IndexMetadata in cluster state for searchable snapshots in #106252. This commit modifies the search coordinator to rewrite searches to MatchNone if the query searches a range of event.ingested that, from the min/max range in cluster state, is known to not overlap. This is the same behavior we currently have for the @timestamp field.
Min/max range for the event.ingested timestamp field (part of Elastic Common Schema) was added to IndexMetadata in cluster state for searchable snapshots in elastic#106252. This commit modifies the search coordinator to rewrite searches to MatchNone if the query searches a range of event.ingested that, from the min/max range in cluster state, is known to not overlap. This is the same behavior we currently have for the @timestamp field.
Min/max range for the event.ingested timestamp field (part of Elastic Common Schema) was added to IndexMetadata in cluster state for searchable snapshots in elastic#106252. This commit modifies the search coordinator to rewrite searches to MatchNone if the query searches a range of event.ingested that, from the min/max range in cluster state, is known to not overlap. This is the same behavior we currently have for the @timestamp field.
Min/max range for the event.ingested timestamp field (part of Elastic Common Schema) was added to IndexMetadata in cluster state for searchable snapshots in elastic#106252. This commit modifies the search coordinator to rewrite searches to MatchNone if the query searches a range of event.ingested that, from the min/max range in cluster state, is known to not overlap. This is the same behavior we currently have for the @timestamp field.
#111523) * Search coordinator uses event.ingested in cluster state to do rewrites Min/max range for the event.ingested timestamp field (part of Elastic Common Schema) was added to IndexMetadata in cluster state for searchable snapshots in #106252. This commit modifies the search coordinator to rewrite searches to MatchNone if the query searches a range of event.ingested that, from the min/max range in cluster state, is known to not overlap. This is the same behavior we currently have for the @timestamp field.
elastic#111523) * Search coordinator uses event.ingested in cluster state to do rewrites Min/max range for the event.ingested timestamp field (part of Elastic Common Schema) was added to IndexMetadata in cluster state for searchable snapshots in elastic#106252. This commit modifies the search coordinator to rewrite searches to MatchNone if the query searches a range of event.ingested that, from the min/max range in cluster state, is known to not overlap. This is the same behavior we currently have for the @timestamp field.
Add
event.ingested
min/max range to cluster state for searchable snapshots in orderto support shard skipping on the search coordinator node for event.ingested
like we do for the
@timestamp
field.The key motivation is that the Elastic Security solution uses two timestamp fields:
@timestamp
which is user populated andevent.ingested
which is populated byan ingest processor which tells the time that the event was ingested.
In some cases, queries are filtered by @timestamp, while in some others they are filtered
by event.ingested. When data is stored in the cold or frozen tier, we have a shard skipping
mechanism on the coordinating node, which allows the coordinating node to skip shards
based on min and max value stored in the cluster state for the @timestamp field. This is
done to prevent returning irrelevant errors in case shards in frozen/cold are unavailable
(as they have no replicas) and queries that don’t target them (as the datetime filter won’t
match them) are executed.
This works well for the @timestamp field and the Security solution could benefit from having
the same skipping mechanism applied to the event.ingested field.