-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REST high-level client: add synced flush API #29189
Conversation
WIP: Need clarifications
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
1 similar comment
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
@javanna This turned out to be a bit more complex than I initially thought it to be. It seems a bit tricky to reconstruct the SyncedFlushResponse object back from its JSON/XContent. Could you please have a look at |
Pinging @elastic/es-core-infra |
— Fixed issues with parser for null values — Cleaned up the response parsing for SyncedFlush — TODO: Run complete test suite and check style cleanup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh boy!
I wonder if in this case we shouldn't parse the normal response object and make a new one that is more like what is returned over the REST API. This is just so different and you have to go through so many hoops to make so many thing parseable I wonder if it is worth breaking from tradition here.
@javanna, I know you were reviewing this too, what do you think?
Map<String, ShardCounts> shardsCountsPerIndex = new HashMap<>(); | ||
Map<String, List<ShardsSyncedFlushResult>> shardsResultPerIndex = new HashMap<>(); | ||
// If it is an object we try to parse it for Fields._SHARD or for an index entry | ||
for (Token curToken = parser.currentToken(); curToken != Token.END_OBJECT; curToken = parser.nextToken()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we need to parse the top level object by hand because of the unexpected names. ObjectParser
doesn't support that sort of thing.
Integer successfulShards = null; | ||
Integer failedShards = null; | ||
Map<ShardId, List<FailureContainer>> failures = new HashMap<>(); | ||
if (curToken == Token.START_OBJECT) { // Start parsing for _shard or for index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compared to our other parsing code this is a little weird because it doesn't know what field it is parsing up front. I get why you do this, but it is weird.
Also it is weird because we don't serialize all that much information. You get almost nothing if there isn't an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is indeed true. Which is why I agree with your initial idea. It would be much better to have a separate response object altogether as this whole effort felt more like a workaround to me. Sorry, I misunderstood initially. We could probably have something like
{
"_shards" : ...
"indexes" : {
"index1" : {
...
},
"index2" : {
...
},
...
}
}
And then parse the index information like a Map. But I am not sure about the repercussions that changing the JSON structure might have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't change the format of the response at this time. We can at some point, but for now we just have to parse what we have, and figure out what we should do to make things better for the future, meaning potentially breaking changes etc.
.field("relocating_node", relocatingNodeId()) | ||
.field("shard", id()) | ||
.field("index", getIndexName()); | ||
.field(Fields.STATE, state()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've stopped making these Fields
objects in a past year or so and just started to use string constants or even quoted strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some initial comments, thanks a lot for your efforts @sohaibiftikhar I definitely didn't realize how complicated this API was going to be, I changed its difficulty in the meta issue to "medium". I definitely didn't see that it was printing out ShardRouting
hence a lot of other stuff that needs to be parsed back. On alternative proposals, I am open to ideas, but if this is what we return through REST today and what we print out, I don't follow how using a different object would make things simpler. Maybe changing the format of the REST response would, but that is a different matter. @nik9000 could you clarify?
* See <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-synced-flush.html"> | ||
* Synced flush API on elastic.co</a> | ||
*/ | ||
public void syncedFlushAsync(SyncedFlushRequest syncedFlushRequest, ActionListener<SyncedFlushResponse> listener, Header... headers) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as odd as this sounds, could you rename the methods to flushSynced as that's how this API is referred to in our SPEC ? request and response can and should stay the same.
@@ -233,6 +234,15 @@ static Request flush(FlushRequest flushRequest) { | |||
return new Request(HttpPost.METHOD_NAME, endpoint, parameters.getParams(), null); | |||
} | |||
|
|||
static Request syncedFlush(SyncedFlushRequest syncedFlushRequest) { | |||
String[] indices = syncedFlushRequest.indices() == null ? Strings.EMPTY_ARRAY : syncedFlushRequest.indices(); | |||
String endpoint = endpoint(indices, "_flush", "synced"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_flush/synced
can now be provided as a single argument, it also won't be encoded this way which is fine as we know that it doesn't need to be encoded.
} | ||
Map<String, String> expectedParams = new HashMap<>(); | ||
setRandomIndicesOptions(syncedFlushRequest::indicesOptions, syncedFlushRequest::indicesOptions, expectedParams); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you remove one of these empty lines please?
endpoint.add(String.join(",", indices)); | ||
} | ||
endpoint.add("_flush"); | ||
endpoint.add("synced"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
endpoint.add("_flush/synced") ?
String[] indices = syncedFlushRequest.indices() == null ? Strings.EMPTY_ARRAY : syncedFlushRequest.indices(); | ||
String endpoint = endpoint(indices, "_flush", "synced"); | ||
Params syncedFlushparameters = Params.builder(); | ||
// This request takes no other parameters other than the indices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this comment
if (curToken == Token.FIELD_NAME) { | ||
String level2FieldName = parser.currentName(); | ||
curToken = parser.nextToken(); | ||
switch (level2FieldName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we usually have a switch based on the token found. For instance here we would have one for start_array, and we would do something if the current field name is failures
otherwise ignore. And another if token.isValue()
for total, successful and failed. You can find something like this for instance in SearchResponse
. This makes it easier to reason about what we are parsing I think.
if (failureReason != null && | ||
shardId != null && | ||
totalCopies != null && | ||
successfulCopies != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that given our responses, this is always true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Same reason as above. I have a general tendency of always expecting illegal JSONs I think. This would mean that if someone changed the toXContent
(removing some fields) method tests would start to fail. I will remove it if you think it is not required.
new FailureContainer(shardId, failureReason, totalCopies, successfulCopies, routing) | ||
); | ||
} else { | ||
throw new ParsingException(startLocation, "Unable to construct ShardsSyncedFlushResult"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we try not to throw exception when parsing responses as they may become a problem when it comes to forward compatibility. The client should be able to speak to future versions that have added fields, arrays or objects, by just reading what it knows and ignoring the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding fields will not throw this exception. Removing will. IMHO it should because otherwise the response object would get constructed with nulls.
} else { | ||
parser.skipChildren(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe for readability we could split this into a couple of methods for each section / object?
} | ||
if (totalShards != null && | ||
successfulShards != null && | ||
failedShards != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when can it happen that some of these is null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reason as above. Just precaution against removing keys from the XContent string in the future. Will remove it if you want.
I am not sure about simpler. But what it would do is make it logically correct I think. For example consider the method |
— Split the `toXContent` method into multiple methods
349f7b3
to
723300b
Compare
85bdde4
to
0543779
Compare
— fixed document:integTest
0543779
to
0bf9dcf
Compare
OK! So, I vanished for a while because I was at the beach. And then a weekend. And then we had to get a few people together to talk about an approach. So, there are a bunch of APIs like this: the Response objects just don't match the http response that we make. We use the Response objects elsewhere so we can't change them. The simplest way to actually integrate this into the high level rest client is to make a response object that parses the response from the server. Could you do that? Make a new response object that parses the response from the server and put it in the high level rest client. I'm really sorry to throw away so much work that you've already done. I suppose it isn't a waste because your work was super important in making it clear that we can't share all of the responses even though we'd really like to. |
@nik9000 Thanks for the feedback. Since I always get confused with naming could you suggest a name for this new response object? From my understanding, we would still need some part of what we are doing here ( |
@sohaibiftikhar the idea would be to have something that resembles more the json response, and get away from the current response and the heavy object that it holds. SyncedFlushResponse is still good I think, it is going to be in an another package anyway, and renaming it is always possible later if we wish to do so. |
I think I'd make an object to hold whatever we return instead of reusing |
@nik9000 I get your point. But we pretty much serialize everything in |
I talked to @sohaibiftikhar over another channel and convinced him that we shouldn't reuse |
Since the required changes are more or less entirely different is it okay if I move this to a separate PR? |
Either way is fine with me! |
Moved to #30650 |
Relates to #27205
Moved to #30650