Fix: Paginate on latest Response #35560

Merged
merged 5 commits into from
Nov 11, 2023

Joffreybvn
Contributor

@Joffreybvn Joffreybvn commented Nov 9, 2023

The new pagination functionality of the HttpOperator does not work as expected when non-deferred:
It should pass the latest response to the pagination_function, instead of always passing the very first response.

This PR clarifies and fixes this behavior.
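
To illustrate why the latest response matters, here is a minimal sketch of a cursor-style pagination function. `FakeResponse`, the `next_cursor` field, and the shape of the returned dict are assumptions made for illustration; they are not taken from this PR or from the provider code. The point is only that the function's output depends on whichever response it is handed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an HTTP response object (illustration only)."""

    payload: dict

    def json(self) -> dict:
        return self.payload


def pagination_function(response: FakeResponse) -> Optional[dict]:
    """Return parameters for the next call, or None when no page follows."""
    cursor = response.json().get("next_cursor")  # assumed field name
    if cursor is None:
        return None
    return {"data": {"cursor": cursor}}


# The first page points at a next page; the last page does not.
first = FakeResponse({"items": [1, 2], "next_cursor": "abc"})
last = FakeResponse({"items": [3], "next_cursor": None})

next_params_from_first = pagination_function(first)
next_params_from_last = pagination_function(last)
```

If such a function only ever sees `first`, it keeps returning the same cursor and never learns that the last page has been reached.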



Apologies for this. Among all the changes in the initial PR, I overlooked this issue, and my initial test was not strong enough to catch it. I will definitely take more care!


Member

@hussein-awala hussein-awala left a comment

If I'm not mistaken, your change just renames the method paginate_sync first_response param to response, how does this fix the bug?

     while True:
-        next_page_params = self.pagination_function(first_response)
+        next_page_params = self.pagination_function(response)
         if not next_page_params:
             break
         response = self.hook.run(**self._merge_next_page_parameters(next_page_params))

Contributor Author

> If I'm not mistaken, your change just renames the method paginate_sync first_response param to response, how does this fix the bug?

Lines 177/178 assign the result of each call to the response variable.

With this PR, the response is correctly reassigned on each iteration and reaches the pagination_function. Previously, the pagination function was called repeatedly with first_response, which never changes.
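
The difference can be reproduced outside Airflow with a small sketch. Everything here is a hypothetical stand-in: `PAGES` is a made-up three-page API, `run` replaces `self.hook.run(...)`, and the `max_calls` guard and the `all_responses` list exist only so the old behavior can be observed without looping forever. Only the loop shape follows the snippet under review.

```python
from typing import Dict, List, Optional

# Hypothetical three-page API: each page names the page that follows it.
PAGES: Dict[int, dict] = {
    1: {"items": ["a"], "next_page": 2},
    2: {"items": ["b"], "next_page": 3},
    3: {"items": ["c"], "next_page": None},
}


def run(page: int) -> dict:
    """Stand-in for self.hook.run(...); returns the fake page body."""
    return PAGES[page]


def pagination_function(response: dict) -> Optional[dict]:
    """Return the next call's parameters, or None on the last page."""
    nxt = response["next_page"]
    return {"page": nxt} if nxt is not None else None


def paginate_old(first_response: dict, max_calls: int = 5) -> List[dict]:
    """Old behavior: pagination_function always sees the first response."""
    all_responses = [first_response]
    for _ in range(max_calls):  # guard added for this sketch only
        next_page_params = pagination_function(first_response)
        if not next_page_params:
            break
        response = run(**next_page_params)  # assigned but never consulted
        all_responses.append(response)
    return all_responses


def paginate_fixed(response: dict) -> List[dict]:
    """Fixed behavior: pagination_function sees the latest response."""
    all_responses = [response]
    while True:
        next_page_params = pagination_function(response)
        if not next_page_params:
            break
        response = run(**next_page_params)
        all_responses.append(response)
    return all_responses


old_items = [r["items"][0] for r in paginate_old(PAGES[1])]
fixed_items = [r["items"][0] for r in paginate_fixed(PAGES[1])]
```

In this sketch the old loop requests page 2 on every iteration until the guard stops it, while the fixed loop walks pages 1, 2, 3 and then stops on its own.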

Member

Ah yes, the response value is updated after each iteration

@Joffreybvn Joffreybvn force-pushed the fix/paginate-on-last-response branch from b05a84d to a670e3c Compare November 10, 2023 06:15
Member

@hussein-awala hussein-awala left a comment

Looks good

@eladkal eladkal merged commit 1f76986 into apache:main Nov 11, 2023
46 checks passed
@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Nov 20, 2023
@ephraimbuddy ephraimbuddy added this to the Airflow 2.8.0 milestone Nov 20, 2023
Labels
area:providers changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) provider:http