Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring ext_proc clear_route_cache API. #27852

Conversation

yanjunxiang-google
Copy link
Contributor

@yanjunxiang-google yanjunxiang-google commented Jun 7, 2023

This is a follow up PR for #27657.

In that PR, there is a comment to refactor the clear_route_cache API to make it cleaner.

Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @mattklein123
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #27852 was opened by yanjunxiang-google.

see: more, trace.

@yanjunxiang-google
Copy link
Contributor Author

yanjunxiang-google commented Jun 7, 2023

Since ext_proc API does not provide the framework to differentiate INBOUND or OUTBOUND processing,
move clear_route_cache to an INBOUND request only API is not possible.

The best we can do seem to be make the naming explicit.

The initial change is just for discussing the API part. Other changes will be added later.

BTW, this is a breaking API change.

@yanjunxiang-google
Copy link
Contributor Author

@mattklein123
Copy link
Member

I don't think we can change this API at this point? When was this originally added and is there an implementation?

/wait

@yanjunxiang-google
Copy link
Contributor Author

yanjunxiang-google commented Jun 8, 2023

I don't think we can change this API at this point? When was this originally added and is there an implementation?

/wait

Thanks for the comments!
The background is that we found an crashing issue during ext_proc filter fuzzer test as below:
#27657
There is a comments to make an API change as here: #27657 (comment).

@mattklein123
Copy link
Member

I get the impression that a lot of people are using this thing so it seems unlikely that we can change the API. I don't have enough context to understand exactly what should happen here. cc @yanavlasov @adisuissa

@tyxia
Copy link
Member

tyxia commented Jun 8, 2023

I get the impression that a lot of people are using this thing so it seems unlikely that we can change the API. I don't have enough context to understand exactly what should happen here. cc @yanavlasov @adisuissa

Yea we are proposing the API change(include breaking ones) since its state is alpha and we assumed that this API is not actively being used by others. cc @htuch who is also involved the API discussion internally.

// Specify the response data specially for downstream client request.
// If the external processing server set this field when responding
// to the upstream response request, it will be ignored by Envoy.
DownstreamRequestResponse downstream_response = 5;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the goal is to make the naming more explicit, can we just change clear_route_cache to downstream_clear_route_cache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the API change based on an offline discussion with @yanavlasov .

If the direction of the API change is good, we will add in the related code change soon.

@htuch
Copy link
Member

htuch commented Jun 8, 2023

@gbrail can you weight in on the original motivation for the placement? Thanks.

Signed-off-by: Yanjun Xiang <[email protected]>
@gbrail
Copy link
Contributor

gbrail commented Jun 9, 2023

In the design of the original API, I think that we were trying to keep things from becoming even more complicated, which is why there is a clear_route_cache field on a structure that only makes sense when processing HTTP request headers. I'm not sure that the benefit of duplicating data structures outweighs the benefit of removing a flag that can only have an effect when called after request headers are processed, versus making sure that the implementation ignores the flag if it's set when it can't have an effect.

Also, the ext_proc API has been designed since the start to use terms like "request headers," "response body," etc, because I felt that those were intuitive to people who work with HTTP, versus using "upstream" and "downstream." If we want to make this change, then using a combination of "upstream/downstream" and "request/response" in this API will be confusing -- I'd rather be consistent within this API. This PR adds the "downstream" terminology to an API that didn't previously have it.

Finally, I do believe that quite a few people are using ext_proc even though the API is marked "work in progress," and that it's time we figure out what we need to do so that this extension is considered ready for production. At the same time I no longer have bandwidth to devote lots of time to taking this project to completion so someone else is going to have to do that work.

So in short, I would rather that we ensure that ext_proc will ignore this flag when inappropriate and document that fact rather than change the API in an incompatible way.

@gbrail
Copy link
Contributor

gbrail commented Jun 9, 2023

Thank you for that change. I am pretty sure that because of the design of protobuf, these changes will require people to change their code, but will still be compatible at the wire level, right? If so then I can see how these changes won't have much of an impact.

@tyxia
Copy link
Member

tyxia commented Jun 9, 2023

Thank you for that change. I am pretty sure that because of the design of protobuf, these changes will require people to change their code, but will still be compatible at the wire level, right? If so then I can see how these changes won't have much of an impact.

In order to delete proto field, besides it requires clients to change their code, I think we also need to reserve the deleted field number.
Otherwise, future developers can reuse the field number when making their own updates to the type. This can cause severe issues (https://protobuf.dev/programming-guides/proto3/#consequences)

// Clear the route cache for the current request.
// This is necessary if the remote server
// modified headers that are used to calculate the route.
bool clear_route_cache = 5;
Copy link
Member

@tyxia tyxia Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we need to reserve field number, if we decide to go with this API change to deprecate this field

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change, like the ext_proc server has to change code to be able to work with Envoy. With that, do we still need to keep this deprecated field?
I think the bigger question is that does it worth to make this breaking change if it does not cause functional issue with the fix proposed in #27657?

Copy link
Member

@tyxia tyxia Jun 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 1st question, I think so. Besides the link I shared above, here also explains why .

For 2nd question, my comment here only apply to this specific proto change. Regarding #27657, IMHO I am ok with proceeding with it first (as i already approved it :)) while API change is being discussed here , especially if API change will cause wide breakage. But let's see what others think.

@mattklein123
Copy link
Member

Same question/comment about API changes as the other PR.

/wait

@yanavlasov
Copy link
Contributor

I think the PR can be closed, I think it will not allow some use cases I was unaware of.

@yanjunxiang-google
Copy link
Contributor Author

I think the PR can be closed, I think it will not allow some use cases I was unaware of.

Done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants