Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add pagination for artifacts retrieval #453

Merged
merged 2 commits into from
Aug 8, 2024
Merged

add pagination for artifacts retrieval #453

merged 2 commits into from
Aug 8, 2024

Conversation

fabiogallotti
Copy link
Contributor

@fabiogallotti fabiogallotti commented Aug 4, 2024

Hello @ewjoachim,

following the discussion we had some weeks ago with @AlessandroMiola here, I'm proposing you a small fix in order to handle the pagination for artifacts.

I'm looping page by page and, if the result of page n+1 is empty, I'm breaking out of the loop.
The rationale behind is that If we ask a page without artifact we get an empty response (see for example https://api.github.com/repos/pydantic/pydantic/actions/runs/9815232181/artifacts?page=3)

To be honest, I'm not very happy with this solution, probably the best one would be to migrate from the https://github.com/michaelliao/githubpy client (that I noticed it's not maintained anymore) to something like https://github.com/PyGithub/PyGithub, but the changes would be substantial I guess.

I also took the occasion to introduce a Makefile to ease local development, let me know what you think about it ;)

Closes #438

Copy link

github-actions bot commented Aug 4, 2024

End-to-end public repo

Admin commands cheatsheet:

  • /e2e (in approved PR review body): Trigger end-to-end tests on external contributions
  • /invite (in comment): Invite the author & admins to the end-to-end private repo

Copy link

github-actions bot commented Aug 4, 2024

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  coverage_comment
  github.py
Project Total  

This report was generated by python-coverage-comment-action

Copy link
Member

@ewjoachim ewjoachim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for tackling this ! We're off to a good start, though I believe we could improve some parts.

I'm also discussing the Makefile decision in the comments below.

Thank you again. If at any point, you'd rather I take it from here, let me know :)

.gitignore Outdated
Comment on lines 6 to 8
.python-version
__pycache__
.vscode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree __pycache__ is missing

For the 2 others, it seem they would belong to your personal global gitignore. It's not that we couldn't do it here, but whenever people suggest we add those (which happens once a year) I'd rather show the proper way of dealing with those and avoid similar suggestions on other repos. That said, let me emphasize you did nothing wrong by suggesting those.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, thanks for letting me know this

Makefile Outdated

.PHONY: test
test: ## run all tests
poetry run pytest tests -vv --cov-report term-missing --cov=coverage_comment
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe all those arguments are already in pyproject.toml, no need to repeat them here.

Makefile Outdated
Comment on lines 16 to 17
poetry run ruff check --fix coverage_comment tests
poetry run ruff format coverage_comment tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-commit takes care of linting. A more appropriate (and more complete) lint command would be

pre-commit run --all-files

Or even just

pre-commit

Makefile Outdated
.PHONY: install
install: ## install dependencies
poetry install --with dev
poetry run pre-commit install
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-commit shouldn't be installed inside poetry, so that's rather pre-commit install without poetry run

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of Makefiles, mainly because they tend to look like a mess (not commenting your Makefile in particular, but it's an inherent property of the format). Also, they lost me many years ago when I discovered you have to mix tabs and space on some lines.

My philosophy is more to try and use standard tools and configure them so they're ready to use.

In the case of your Makefile: if you think the project is missing some directions on how to launch some actions, you're right of course, but I'd like to direct you to changing the CONTRIBUTING.md doc first. That said, I'm totally ok to keep the Makefile as a documentation TL;DR, but let's make it so (remove the help target, fix the few comments I left which should show that targets are synonym to a 1 (or few) words command.

If you're curious, my go-to approach these days when I need to make complex commands accessible in a project is a scripts folder, which offers the advantage of letting you write some scripts with Python when they become too complex to be readable in bash. But I don't think we need this here. A Makefile will be very OK.

Comment on lines 59 to 66
while True:
result = repo_path.actions.runs(run_id).artifacts.get(page=str(page))
if not result:
break
artifacts.extend(result.artifacts)
page += 1
try:
artifact = next(
Copy link
Member

@ewjoachim ewjoachim Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about incorporating reading the pages and paginating into one generator (yield), so that if you find the artifact on the first page, you don't need to fetch the second page ? Also, this will limit the number of tests you have to change while still letting you do dedicated tests for pagination.

Also, if the number of elements github returns on a page is fewer than the max, you know in advance there won't be a next page.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right, I'll work on improving it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid unnecessarily fetching the next page, I leveraged on the total_count attribute of the response, instead of using the github default max page size of 30, it seems safer to me

@fabiogallotti fabiogallotti requested a review from ewjoachim August 5, 2024 10:28
Copy link
Member

@ewjoachim ewjoachim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome ! Thanks for your first contribution to this project, and for your patience !

@ewjoachim ewjoachim merged commit e6ee17f into py-cov-action:main Aug 8, 2024
4 checks passed
@fabiogallotti fabiogallotti deleted the add-pagination-for-artifacts branch August 8, 2024 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants