Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add XContent chunking to SearchResponse #94736

Merged
merged 13 commits into from
May 12, 2023

Conversation

romseygeek
Copy link
Contributor

This commit adds xcontent chunking to SearchResponse and MultiSearchResponse
by making SearchHits implement ChunkedToXContent.

Relates to #89838

@romseygeek romseygeek added :Search/Search Search-related issues that do not fall into other categories >refactoring v8.8.0 labels Mar 24, 2023
@romseygeek romseygeek self-assigned this Mar 24, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Mar 24, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

return Iterators.concat(
ChunkedToXContentHelper.startObject(),
Iterators.single((b, p) -> b.field("took", tookInMillis).startArray(Fields.RESPONSES)),
Iterators.flatMap(Arrays.stream(items).iterator(), item -> item.toXContentChunked(params)),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a nicer way to include an array of objects which themselves implement ChunkedToXContent but I couldn't find one

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine IMO but there's no need for the intermediate stream:

Suggested change
Iterators.flatMap(Arrays.stream(items).iterator(), item -> item.toXContentChunked(params)),
Iterators.flatMap(Iterators.forArray(items), item -> item.toXContentChunked(params)),

return Iterators.concat(
Iterators.single((ToXContent) SearchResponse.this::headerToXContent),
Iterators.single(clusters),
Iterators.flatMap(Iterators.single(internalResponse), r -> r.toXContentChunked(params))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly I feel there ought to be a nicer way of doing this but I couldn't find one...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're looking for this:

Suggested change
Iterators.flatMap(Iterators.single(internalResponse), r -> r.toXContentChunked(params))
internalResponse.toXContentChunked(params)

@romseygeek
Copy link
Contributor Author

This is an interesting test failure. There's a composite aggs test that checks that we get a specific exception if keys are being formatted in a lossy fashion, and this exception is thrown during xcontent parsing. But with chunking we don't seem to catch exceptions thrown by parsing code, and so we get a 'failure encoding chunk' message instead. Is this something we've already encountered elsewhere or do I need to add some extra error handling here?

@romseygeek
Copy link
Contributor Author

For a standard RestToXContentListener errors are handled by the onFailure() method, but we never get there when using the chunked handler.

@DaveCTurner
Copy link
Contributor

DaveCTurner commented Mar 24, 2023

This is a drawback of the chunked encoding: we have to be sure that the serialization will succeed before we start. Once we start sending chunks to the client we can't change our minds about the response code and return an error.

@romseygeek
Copy link
Contributor Author

It looks as though we have quite a few places (mainly in aggs) that have this late checking. I can work through the tests and try and fix them, but it blocks making any changes here for the moment, unfortunately.

@romseygeek
Copy link
Contributor Author

I've opened #94760 to discuss moving key format checks earlier in the aggregation process.

return Iterators.concat(
Iterators.flatMap(Iterators.single(hits), r -> r.toXContentChunked(params)),
Iterators.single((ToXContent) (b, p) -> {
if (aggregations != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why moving these checks inside the Iterator helps enumerating everything in one concat call, but this way we create iterators the we already know will be a noop. Don't know if its worth pulling these out. maybe readability suffers then, leave it up to you to decide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I went back and forth on this a bit, but I think this is the most readable way of doing it, and creating no-op iterators is pretty low cost.

@@ -191,7 +192,7 @@ public Object getProperty(List<String> path) {

@Override
public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
searchHits.toXContent(builder, params);
ChunkedToXContent.wrapAsToXContent(searchHits).toXContent(builder, params);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question out of curiosity: I thought the final goal would be to convert everything in search to a "chunked" xcontent variant. This looks a bit like for aggregations we aren't? Is this reading correct, if so is this future work or does that mean aggregations remain larger "chunks"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the plan is to do this bit-by-bit. So we start off by chunking search into search hits and aggregations, and we can then look at breaking these chunks into smaller ones in the future.

@@ -194,7 +195,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws

if (searchResponse != null) {
builder.field("response");
searchResponse.toXContent(builder, params);
ChunkedToXContent.wrapAsToXContent(searchResponse).toXContent(builder, params);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to verify if AsyncSearchResponse will be changed over to ChunkedToXContentObject as well at some point and if we already have plans for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created #95661 to keep track of these.

@cbuescher cbuescher self-requested a review March 27, 2023 09:59
@cbuescher
Copy link
Member

@romseygeek thanks, change looks great in general to me, I left a couple of questions

@cbuescher
Copy link
Member

@romseygeek fyi I'm out for a couple of days, the questions I left aren't strong asks for changes so shouldn't block you from continuing this and for other to approve and get this PR merged.

@romseygeek
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/part-1

1 similar comment
@romseygeek
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/part-1

@romseygeek
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/part-2

@gmarouli gmarouli added v8.9.0 and removed v8.8.0 labels Apr 26, 2023
@romseygeek
Copy link
Contributor Author

@elasticmachine update branch

@romseygeek
Copy link
Contributor Author

This is ready for another round of reviews

@romseygeek
Copy link
Contributor Author

@elasticmachine update branch

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update, I left one question regarding how much extra work we expect for the xContent validation happening for the search hits in the FetchPhase.

@@ -72,8 +74,12 @@ public void execute(SearchContext context) {

Profiler profiler = context.getProfilers() == null ? Profiler.NOOP : Profilers.startProfilingFetchPhase();
SearchHits hits = null;
try {
hits = buildSearchHits(context, profiler);
try (XContentBuilder xContentValidator = new XContentBuilder(XContentType.JSON.xContent(), OutputStream.nullOutputStream())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question for better understanding: this validator builder is used to completely serialize the search hits once on the node before sending them across the wire and then thrown away? Is there an understanding of the extra work this involves? I see why validation is necessary for chunking but I want to make sure I understand the tradeoffs here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think #95673 means that I can remove this bit, good catch!

@romseygeek romseygeek requested a review from cbuescher May 12, 2023 10:46
Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM

@romseygeek romseygeek merged commit a3edf6b into elastic:main May 12, 2023
@romseygeek romseygeek deleted the rest-chunking/search-response branch May 12, 2023 11:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>refactoring :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v8.9.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants