Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REQ] Python: improve the typing of API clients #16788

Closed
multani opened this issue Oct 11, 2023 · 9 comments
Closed

[REQ] Python: improve the typing of API clients #16788

multani opened this issue Oct 11, 2023 · 9 comments

Comments

@multani
Copy link
Contributor

multani commented Oct 11, 2023

Is your feature request related to a problem? Please describe.

In #16694, I tried to add typing information on the generated API clients, but the issues are a bit larger than expected.

Going from bottom (REST client) to up (the model API client):

REST API client

By "REST API client", I mean the RESTClientObject object.

The REST API client request() method can return very different type of responses:

  1. A basic ApiResponse, a generic HTTP response
  2. Any kind of higher-level object via response_types_map: the client can extract the payload of the response and deserialize it into any kind of object if the payload matches the format of that object (so, an int, datetime, or any model; but actually, almost any type)

This has several issues:

  • This high variability in the kind of responses the client can return makes it very difficult to type correctly.
  • The different response types change depending on parameters passed to the REST client. This makes the typing even more difficult to implement and pushes all the type verification to the client of the object.
  • Finally, it adds the higher-level deserialization logic of the response into the REST Client, which could be seen more as a low-level client.

Model API client

By "Model API client", I mean the API client created for each model.

I'm taking a hypothetical Foo model with basic CRUD methods as an example, that would produce a FooApi class.

We can find the following kind of methods:

  1. A get_foo_by_id method returning:
    1. Foo (usual case)
    2. None in some cases, sometimes for errors too
  2. A get_foo_by_id_with_http_info, which could return either:
  • An ApiResponse: a generic HTTP response
  • An instance of the associated Foo model, if _return_http_data_only is False
  1. There are other methods that can return more complex objects, like List[Foo], etc. but the mechanism is the same as explained above.

In addition, the urllib3 client can also produce an AsyncResult with one of the 2 behaviors explained just above.
This behavior doesn't exist for the aiohttp-based client.

This also has several issues, similar to those of the underlying REST Client explained above:

  • This high variability in the kind of responses the client can return makes it very difficult to type correctly.
  • The behavior can change drastically, depending on the parameters passed
  • The deserialization logic is done in the REST Client, although it's only (mostly) used in the high-level APIs get_foo_by_id and similar.
  • The AsyncResult responses create additional complexity in the typing of the responses, since it also depends on the parameter async_req to be set.

Describe the solution you'd like

After some good suggestions from @fa0311, @robertschweizer and @wing328 , I'd like to make the propose the following changes:

Remove all the parameters that change the response type and move the associated behaviors somewhere else:

Remove _return_http_data_only

Remove the deserialization behavior away from the RESTClientObject class and move it into the xxx() methods of the model API client:
1. The xxx_with_http_info() methods would not be able to deserialize the response to high-level objects anymore
2. The xxx_with_http_info() methods would only return ApiResponses object instead
3. The deserialization logic would be moved outside of the REST Client:
1. It actually doesn't have any state and could be free functions instead.
2. This can all be moved to a dedicated deserialization module.

Remove async_req

Extract the AsyncResult into dedicated *_async() methods.

  1. This means that, for each xxx() + xxx_with_http_info() methods, this would add 2 new methods:
    1. xxx_async()
    2. xxx_with_http_info_async()
  2. The behavior of the 2 new methods would be the same as the original methods, except the processing of the response would be done only when the actual result is fetched from the AsyncResult object.
  3. This would require to refactor a bit the code to make these behaviors reusable beteen xxx()/xxx_async() and xxx_with_http_info()/xxx_with_http_info_async():
    1. The deserilization logic for a xxx() method would be moved into a _xxx_deserialization() method
    2. xxx() would call this method directly before returning its result
    3. xxx_async() would call this method when the result is fetched.

Create a new ApiRequest object

Finally, to makes it easier to reuse the code between the sync/async methods, I'd like to introduce a new ApiRequest object:

  1. The behavior from the *_with_http_info() would be moved to a dedicated "private" method
    1. This method would take all the arguments and build a new ApiRequest object containing the right parameters to execute the request
    2. This method could be called both by the xxx_with_http_info() and the xxx_with_http_info_async() methods
  2. The REST Client, instead of taking many parameters, now only takes the ApiRequest object and uses whatever it needs from that.

Additional context

These changes are not really backward compatible. All the parameters removed will not be supported anymore.

Each changes could be done separately, although the typing benefits would only be seen once all the changes have been done
Ideally, the best order would be:

  • introducing the new ApiRequest object and use it both from the *_with_http_info() methods and the REST Client. The REST Client can be kept backward compatible for the time being.
  • async_req can be extracted then be extracted. This simply duplicates the methods, removes the parameter and reuses the ApiRequest created above.
  • The deserialization logic can finally be moved and the typing simplified
@multani
Copy link
Contributor Author

multani commented Oct 11, 2023

Another option would be to drop the async_req feature, that would simplify the proposal quite a lot:

  • no need for the ApiRequest object
  • no need to duplicate all the methods into _async variants

@robertschweizer
Copy link
Contributor

Thanks a lot for looking into this at such detail!

With "The REST API client request method", do you actually mean ApiClient.call_api? There's also a RESTClientObject class, best clarify which class you mean throughout the proposal.

I think the generator should make a decision to only support either async_req (stdlib multiprocessing) or aiohttp. Supporting both feels like overkill. @wing328

I fully support removing _return_http_data_only and moving the deserialization logic into ApiResponse, as stated in #16694. I believe this can also be done independently from the decision about async_req, and without breaking a realistic use-case (so already on the next minor release). However, type annotations won't be clean until there is a decision about async_req.

Have you considered using typing.overload to correct the type annotations of the current implementation? This would bloat the API classes, but at least provide correct type information to users. With the improvements you suggested above, there will be less and less @overloads over time.

@fa0311
Copy link
Contributor

fa0311 commented Oct 12, 2023

I suggest to split call_api.
call_api is too reusable and therefore too complex.

@validate_call
def get(self) -> Response:
    ...
    param = self.param_serialize(...)
    res = self.api_client.call_api(param)
    return self.api_client.response_deserialize(res, ...).data

@validate_call
def get_with_http_info(self) -> ApiResponse[Response]:
    ...
    param = self.param_serialize(...)
    res = self.api_client.call_api(param)
    return self.api_client.response_deserialize(res, ...)

@validate_call
def get_with_async(self) -> ApiResponse[Response]:
    ...
    param = self.param_serialize(...)
    res = self.api_client.call_api_async(param)
    return self.api_client.response_deserialize(res, ...)

@multani
Copy link
Contributor Author

multani commented Oct 13, 2023

@robertschweizer Thanks for the feedback!

With "The REST API client request method", do you actually mean ApiClient.call_api? There's also a RESTClientObject class, best clarify which class you mean throughout the proposal.

I edited the message, sorry for the confusion. By "REST API client", I mean the RESTClientObject, I edited the message with added links.

I fully support removing _return_http_data_only and moving the deserialization logic into ApiResponse, as stated in #16694. I believe this can also be done independently from the decision about async_req, and without breaking a realistic use-case (so already on the next minor release). However, type annotations won't be clean until there is a decision about async_req.

As long as we don't really care about typing, we can do these changes in whichever order. It's when we would like to enforce typing correctly that this will matter.

Have you considered using typing.overload to correct the type annotations of the current implementation? This would bloat the API classes, but at least provide correct type information to users. With the improvements you suggested above, there will be less and less @overloads over time.

No, I didn't really consider using typing.overload. I think we are fairly close to not needing it, and the time needed to implement it would probably be better spent on simply making the typing correct IMO.

I think the generator should make a decision to only support either async_req (stdlib multiprocessing) or aiohttp. Supporting both feels like overkill. @wing328

The asynchronous requests feature from the synchronous client is very basic: it simply embeds the thread pool to defer requests in a dedicated thread. It should be fairly simple to remove and be reimplemented externally by whoever needs it IMO.
That maybe too much of a breaking change though?

@multani
Copy link
Contributor Author

multani commented Oct 13, 2023

I suggest to split call_api. call_api is too reusable and therefore too complex.

@validate_call
def get(self) -> Response:
    ...
    param = self.param_serialize(...)
    res = self.api_client.call_api(param)
    return self.api_client.response_deserialize(res, ...).data

@validate_call
def get_with_http_info(self) -> ApiResponse[Response]:
    ...
    param = self.param_serialize(...)
    res = self.api_client.call_api(param)
    return self.api_client.response_deserialize(res, ...)

@validate_call
def get_with_async(self) -> ApiResponse[Response]:
    ...
    param = self.param_serialize(...)
    res = self.api_client.call_api_async(param)
    return self.api_client.response_deserialize(res, ...)

Would it be too much of a lost if get_with_http_info() simply returns ApiResponse?

I feel the set of parameters that had to be passed to this method to actually return a Response object was unpractical enough that they were not really used and thus, ApiResponse was simply returned.

That would remove the calls to deserialize in these methods, and make the typing simpler.

@multani
Copy link
Contributor Author

multani commented Oct 13, 2023

I suggest to split call_api. call_api is too reusable and therefore too complex.

@validate_call
def get(self) -> Response:
    ...
    param = self.param_serialize(...)

I believe this is similar to the "ApiRequest" I explained in my initial message 👍
I think it wold be great, but it's a rather large change.

    res = self.api_client.call_api(param)
    return self.api_client.response_deserialize(res, ...).data

I think this call/function can completely be removed from the API client: deserialization doesn't use any state from the class. It would make the code simpler and easier to test.

@UltimateLobster
Copy link

As python has added lots of features to its typing in recent years, I would absolutely love to see better typings. I'd love to help whatever way I can, is there any thing I can do?

@multani
Copy link
Contributor Author

multani commented Oct 22, 2023

@UltimateLobster I think the best would be to keep an eye on #16802, help out if needed and reassess the status once it has been merged.
This pull request is large enough, it's likely to solve most of the typing issues but also may conflict with other changes done in parallel, so best wait for it to be merged first.

@multani
Copy link
Contributor Author

multani commented Oct 22, 2023

@fa0311 I'm closing this issue, I think most of the discussion moved to #16802 already, I don't think we need both at this stage.

@multani multani closed this as not planned Won't fix, can't repro, duplicate, stale Oct 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants