Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python] fix typing for API responses #16802

Merged
merged 44 commits into from
Oct 30, 2023

Conversation

fa0311
Copy link
Contributor

@fa0311 fa0311 commented Oct 12, 2023

PR checklist

  • Read the contribution guidelines.
  • Pull Request title clearly describes the work in the pull request and Pull Request description provides details about how to validate the work. Missing information here may result in delayed response from the community.
  • Run the following to build the project and update samples:
    ./mvnw clean package 
    ./bin/generate-samples.sh ./bin/configs/*.yaml
    ./bin/utils/export_docs_generators.sh
    
    Commit all changed files.
    This is important, as CI jobs will verify all generator outputs of your HEAD commit as it would merge with master.
    These must match the expectations made by your contribution.
    You may regenerate an individual generator by passing the relevant config(s) as an argument to the script, for example ./bin/generate-samples.sh bin/configs/java*.
    For Windows users, please run the script in Git BASH.
  • File the PR against the correct branch: master (upcoming 7.1.0 minor release - breaking changes with fallbacks), 8.0.x (breaking changes without fallbacks)
  • If your PR is targeting a particular programming language, @mention the technical committee members, so they are more likely to review the pull request.

image
image
image

@fa0311
Copy link
Contributor Author

fa0311 commented Oct 12, 2023

This PullRequest removes the following arguments

  • async_req
  • _return_http_data_only
  • _preload_content

@fa0311
Copy link
Contributor Author

fa0311 commented Oct 12, 2023

This PullRequest is the first step in achieving #16788.

@multani
Copy link
Contributor

multani commented Oct 13, 2023

@fa0311 There's some overlap with #16789, maybe you can also take a look.

@fa0311
Copy link
Contributor Author

fa0311 commented Oct 13, 2023

@fa0311 There's some overlap with #16789, maybe you can also take a look.

I have seen #16789.
Some of the changes to __call_api and deserialize are duplicates of #16789.
The difference is that this PullRequest removes _preload_content and call_api always returns ApiResponse[Any].

My idea is to split call_api into 3 functions
The current call_api returns a complex Union but I think I need to create a function that always returns an ApiResponse[Any].

It is essential to remove _preload_content and _return_http_data_only.

We would like to achieve the following in the end.

@validate_call
def get(self) -> GetResponse:
    ...
    # param_serialize returns a Param class that call_api can easily process
    param: Param = self.param_serialize(...)
    # call_api is a low-level function that takes Param and always returns RESTResponse
    res: RESTResponse = util.call_api(param)
    # response_deserialize is a function that receives RESTResponse and deserializes it, always returning ApiResponse[Any].
    return util.response_deserialize(res, ...).data


@validate_call
def get_with_http_info(self) -> ApiResponse[GetResponse]:
    ...
    param: Param = self.param_serialize(...)
    res: RESTResponse = util.call_api(param)
    return util.response_deserialize(res, ...)


@validate_call
def get_with_preload_content(self) -> bytes:
    """
    Equivalent to current preload_content.
    This is to avoid unnecessary deserialization.
    """
    ...
    param: Param = self.param_serialize(...)
    res: RESTResponse = util.call_api(param)
    return res.data


@validate_call
def get_with_async(self) -> ApplyResult:
    """
    Equivalent to current async_req.
    """
    ...
    param: Param = self.param_serialize(...)
    res: ApplyResult = util.call_api_async(param)
    return res
@validate_call
async def get(self) -> ApiResponse[GetResponse]:
    ...
    param: Param = self.param_serialize(...)
    res: RESTResponse = await util.call_api_aiohttp(param)
    return util.response_deserialize(res, ...)

@fa0311
Copy link
Contributor Author

fa0311 commented Oct 13, 2023

This pull request involves many disruptive changes and I would like your input.

@fa0311 fa0311 marked this pull request as ready for review October 13, 2023 18:44
@robertschweizer
Copy link
Contributor

I think it's a good idea to remove these flags that modify the return value type of the API call methods. Also using ApiResponse only for fully decoded+deserialized responses is good for typing and making the Python API understandable.

I see these use-cases for _preload_content=False:

  • work around client-side validation errors if the server is known/expected to return invalid responses
  • stream responses (this worked before switching to pydantic I think, I'm trying to allow this again in [python] feat: Allow streaming large response bodies #16789)
  • save time if response decoding/model deserialization is unnecessary

I think the new method get_with_preload_content (or rather get_without_preload_content) should return the urllib3/aiohttp response object, because in contrast to just bytes, that allows to:

  • access response headers and status code
  • decode the bytes content to str using the encoding from the response header
  • stream large response bodies

@wing328
Copy link
Member

wing328 commented Oct 16, 2023

I think it's a good idea to remove these flags that modify the return value type of the API call methods

👍

should return the urllib3/aiohttp response object

good idea to include urllib3/aiohttp response object in the ApiResponse object

@wing328
Copy link
Member

wing328 commented Oct 16, 2023

@validate_call
def get_with_http_info(self) -> ApiResponse[GetResponse]:
    ...
    param: Param = self.param_serialize(...)
    res: RESTResponse = util.call_api(param)
    return util.response_deserialize(res, ...)


@validate_call
def get_with_preload_content(self) -> bytes:
    """
    Equivalent to current preload_content.
    This is to avoid unnecessary deserialization.
    """
    ...
    param: Param = self.param_serialize(...)
    res: RESTResponse = util.call_api(param)
    return res.data

Not against get_with_preload_content but another way I would consider is to handle preload content in

def get_with_http_info(self) -> ApiResponse[GetResponse]:

In other words, the _with_http_info method and ApiResponse can together handle all (or most) edge cases.

(some other client generators use similar design to handle edge cases)

@wing328
Copy link
Member

wing328 commented Oct 16, 2023

It is essential to remove _preload_content and _return_http_data_only.

Good idea 👍

@wing328
Copy link
Member

wing328 commented Oct 16, 2023

def get_with_preload_content(self) -> bytes:
    """
    Equivalent to current preload_content.
    This is to avoid unnecessary deserialization.
    """

What about moving preload_content flag to configuration.py instead (which already contains many different options: https://github.com/OpenAPITools/openapi-generator/blob/master/modules/openapi-generator/src/main/resources/python/configuration.mustache#L24) ?

@fa0311
Copy link
Contributor Author

fa0311 commented Oct 16, 2023

As pointed out, get_with_preload_content was a mistake for get_without_preload_content.
I will rethink the idea of without_preload_content, as I want to avoid the Union type with None as much as possible.

@fa0311 fa0311 marked this pull request as draft October 16, 2023 16:29
@fa0311 fa0311 force-pushed the remove-preload_content branch from 036ed28 to 178d493 Compare October 16, 2023 16:31
@fa0311 fa0311 marked this pull request as ready for review October 20, 2023 10:18
Copy link
Contributor

@robertschweizer robertschweizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for these changes, I'm excited about the clean typing!

**kwargs,

@validate_call
{{#asyncio}}async {{/asyncio}}def {{operationId}}_without_preload_content{{>partial_api_args}} -> RESTResponseType:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of _without_preload_content, this could also be called _raw to be shorter.

Just a suggestion. If you go with it, best make sure name clashes are handled, because I find it sort of likely that there are REST APIs that have both a get and a get_raw endpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a way to handle name conflicts, any ideas?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could export the suffix to a parameter upon generation? That way it would work for most use cases but if someone happens to have conflicts they can just choose a different one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of _without_preload_content, this could also be called _raw to be shorter.

Just a suggestion. If you go with it, best make sure name clashes are handled, because I find it sort of likely that there are REST APIs that have both a get and a get_raw endpoint.

If the OpenAPI method is called get_raw, the generated method would be called get_raw_raw, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For an OpenAPI method get, the raw endpoint would be called get_raw and clash with the OpenAPI method get_raw.

I would like it best if the generator automatically added a trailing underscore to make sure the raw function name is unique. Maybe we should discuss this in a separate issue though, it could be solved after this PR is merged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. It's already the case with the previous functions though, right? (One could have defined q `get_with_http_info| OpenAPI method, and it would have clashed too)

Why don't we keep it simple here, with the "long" name?
The preload content feature was hidden before behind a very special flag, I would assume it wasn't really used that much by external callers and mostly used internally, in which case:

  1. The long name is fine: internal calls don't care, external users are probably going to use it seldomly anyway.
  2. I'm not even sure if this would be actually used at all.

So I would say: keep the current long name for now.

{{/allParams}}
**kwargs,
@validate_call
{{#asyncio}}async {{/asyncio}}def {{operationId}}_without_preload_content{{>partial_api_args}} -> RESTResponseType:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every generated client will only ever support one backend library at a time. That's why I'm not sure the RESTResponseType alias is useful in the final generated client.

For me, not using async or tornado, seeing urllib3.HTTPResponse here would be most informative when working with the generated client.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @robertschweizer : I don't think there's a need to abstract the underlying response from the actual client.

The client itself is already a bit abstracted because it's doing internal manipulation of the parameters but the response type doesn't do anything special, and at best it would be a downgraded version of the underlying response.

Since it's being used only in specific cases (when the caller only wants to do the HTTP call without the model API), and that it's not possible anyway to use multiple HTTP clients at the same, I would just remove the abstraction and return the HTTP réponse object from the underlying library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separating the generated client(urllib3/aiohttp) from the api(api.mustache) will eliminate the need to modify api.mustache when adding a new RESTClientObject.

My idea is to be able to add a new RESTClientObject with minimal changes.
(In the future I may add httpx as a client.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see. Re-reading the change, we can maybe try to remove RESTReponse instead, but keep the type alias RESTReponseType for the use case you mentioned.

This can be done in a subsequent PR though.

@multani
Copy link
Contributor

multani commented Oct 22, 2023

Thanks for the big work! I don't have too much time to review in details until beginning of November, but it looks really good from what I've seen.

I would suggest to keep it focused though: it's already addressing a lot of different topics (typing, code cleanup, cosmetics, etc.), and although the final result looks much much better than before, I think it's also fine to not have it perfect and had new changes on top of it afterwards.
It will make the review process a bit easier too!

@wing328
Copy link
Member

wing328 commented Oct 30, 2023

Awesome work 👍

Let's go with what you've so far and we can file separate PRs to further improve the python client generator if needed.

@wing328 wing328 merged commit 8827da8 into OpenAPITools:master Oct 30, 2023
31 checks passed
@wing328 wing328 added this to the 7.1.0 milestone Nov 11, 2023
robertschweizer added a commit to robertschweizer/openapi-generator that referenced this pull request Dec 5, 2023
This was removed in OpenAPITools#16802, but using a higher value than 1,
or at least making this configurable makes complete sense.

Without this, we get a lot of these log messages:

[ WARNING] Connection pool is full, discarding connection:
wing328 pushed a commit that referenced this pull request Dec 6, 2023
This was removed in #16802, but using a higher value than 1,
or at least making this configurable makes complete sense.

Without this, we get a lot of these log messages:

[ WARNING] Connection pool is full, discarding connection:
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants