[Segment Replication] Support realtime TermVector requests with Segment Replication #9585
… replication. Signed-off-by: Rishikesh1159 <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #9585 +/- ##
============================================
- Coverage 71.16% 71.10% -0.07%
- Complexity 58115 58125 +10
============================================
Files 4831 4831
Lines 273999 274062 +63
Branches 39920 39925 +5
============================================
- Hits 195005 194880 -125
- Misses 62604 62890 +286
+ Partials 16390 16292 -98
... and 468 files with indirect coverage changes
Signed-off-by: Rishikesh Pasham <[email protected]>
Compatibility status: checks if related components are compatible with change 6f429b6.
server/src/main/java/org/opensearch/action/termvectors/MultiTermVectorsRequest.java
server/src/main/java/org/opensearch/action/termvectors/TransportMultiTermVectorsAction.java
server/src/main/java/org/opensearch/action/termvectors/TransportTermVectorsAction.java
Compatibility status: checks if related components are compatible with change 2648fd4.
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]
...er/src/internalClusterTest/java/org/opensearch/indices/replication/SegmentReplicationIT.java
Signed-off-by: Rishikesh1159 <[email protected]>
…159/OpenSearch into fix-vectorTerm-segrep
Compatibility status: checks if related components are compatible with change 4e43c57.
Compatible components: [https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]
…nt Replication (#9585) * support realtime TermVector and MultiTermVector requests with segment replication. Signed-off-by: Rishikesh1159 <[email protected]> * Fix TermVector requests with segrep. Signed-off-by: Rishikesh1159 <[email protected]> * Refacotring. Signed-off-by: Rishikesh1159 <[email protected]> * Address comments on PR. Signed-off-by: Rishikesh1159 <[email protected]> --------- Signed-off-by: Rishikesh1159 <[email protected]> Signed-off-by: Rishikesh Pasham <[email protected]> (cherry picked from commit caf4c80) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…nt Replication (#9585) (#9936) * support realtime TermVector and MultiTermVector requests with segment replication. * Fix TermVector requests with segrep. * Refacotring. * Address comments on PR. --------- (cherry picked from commit caf4c80) Signed-off-by: Rishikesh1159 <[email protected]> Signed-off-by: Rishikesh Pasham <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…nt Replication (#9585) (#9935) * support realtime TermVector and MultiTermVector requests with segment replication. * Fix TermVector requests with segrep. * Refacotring. * Address comments on PR. --------- (cherry picked from commit caf4c80) Signed-off-by: Rishikesh1159 <[email protected]> Signed-off-by: Rishikesh Pasham <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
  if (request.request().doc() != null && request.request().routing() == null) {
      // artificial document without routing specified, ignore its "id" and use either random shard or according to preference
      GroupShardsIterator<ShardIterator> groupShardsIter = clusterService.operationRouting()
-         .searchShards(state, new String[] { request.concreteIndex() }, null, request.request().preference());
+         .searchShards(state, new String[] { request.concreteIndex() }, null, preference);
@Rishikesh1159 I don't think the primary override here is necessary. In this case, an "artificial" document has been supplied and the user is asking to compute term vectors as if this was a real document. I don't think the "real time" parameter is applicable here since it doesn't need to look up any index data.
@msfroh Does that sound right to you?
That does sound right to me.
I think the override condition should be:
    if (request.request().realtime()
        && preference == null
        && state.getMetadata().isSegmentReplicationEnabled(request.concreteIndex())
        && request.request().doc() == null) {
Or you could revert the edit to this line and continue to use request.request().preference() for the artificial doc case.
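The condition proposed above can be sketched in isolation. This is only an illustration, not the code in the PR: the class and method names are hypothetical, the boolean parameters stand in for the real request and cluster-state lookups, and `_primary` is assumed to be the preference value that routes a request to the primary shard.

```java
/** Hypothetical, standalone sketch of the routing decision; not the actual OpenSearch code. */
class TermVectorsRoutingSketch {

    // Assumption: the preference string that targets the primary shard.
    static final String PRIMARY_PREFERENCE = "_primary";

    /**
     * Returns the preference to route with. The primary override applies only when the
     * request is realtime, the caller gave no preference, the index uses segment
     * replication, and no artificial document was supplied.
     */
    static String effectivePreference(boolean realtime, String preference, boolean segRepEnabled, boolean hasArtificialDoc) {
        if (realtime && preference == null && segRepEnabled && !hasArtificialDoc) {
            return PRIMARY_PREFERENCE;
        }
        // In every other case the caller's preference (possibly null) is left untouched.
        return preference;
    }

    public static void main(String[] args) {
        // Realtime lookup of a stored document on a segment-replication index: override applies.
        System.out.println(effectivePreference(true, null, true, false));
        // Artificial document: no index data is needed, so no override.
        System.out.println(effectivePreference(true, null, true, true));
        // Explicit preference from the caller is respected.
        System.out.println(effectivePreference(true, "_local", true, false));
    }
}
```

Keeping the artificial-document check inside the condition means the existing artificial-doc branch can keep passing the caller's own preference through unchanged.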
Interestingly the code seems to unconditionally attempt to fetch the document even if an artificial document is supplied. Not sure if that complicates things here.
I think that's probably worth fixing -- from the looks of it, that GetResult is ignored, which seems a little wasteful.
That said, I don't think it hurts anything. In particular, if the doc parameter is specified, then it GETs a document with ID based on an autoincrementing (not really random) integer:
OpenSearch/server/src/main/java/org/opensearch/action/termvectors/TermVectorsRequest.java
Lines 315 to 323 in aca2e9d
    public TermVectorsRequest doc(BytesReference doc, boolean generateRandomId, MediaType mediaType) {
        // assign a random id to this artificial document, for routing
        if (generateRandomId) {
            this.id(String.valueOf(randomInt.getAndAdd(1)));
        }
        this.doc = doc;
        this.mediaType = mediaType;
        return this;
    }
(generateRandomId is always true on the REST parsing code path.)
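To make the "autoincrementing, not really random" point concrete, here is a minimal sketch of the ID assignment on its own. It assumes the counter is a shared `AtomicInteger` starting at 0, which the quoted snippet does not show; the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of how artificial-document IDs are produced by a shared counter. */
class ArtificialDocIdSketch {

    // Assumption: stands in for the `randomInt` field referenced in the quoted method.
    private static final AtomicInteger randomInt = new AtomicInteger(0);

    /** Mirrors `String.valueOf(randomInt.getAndAdd(1))`: returns the current value, then increments. */
    static String nextArtificialId() {
        return String.valueOf(randomInt.getAndAdd(1));
    }

    public static void main(String[] args) {
        // Successive calls yield consecutive integers rendered as strings.
        System.out.println(nextArtificialId());
        System.out.println(nextArtificialId());
        System.out.println(nextArtificialId());
    }
}
```

Because the IDs are consecutive, routing by ID spreads artificial documents across shards in a deterministic rather than random way, and a GET for such an ID would be looking up a document unrelated to the supplied one.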
Still, there is no value in routing to a primary shard if an artificial doc is specified.
Description
This PR adds support for realtime reads on TermVectors requests when segment replication is enabled.
If an incoming TermVectors request does not specify a preference, it is routed to the primary shard.
Related Issues
Resolves #9563
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.