feat: improve /sites endpoint performance #90

Merged
kwajiehao merged 2 commits into staging from fix/sites-performance
Dec 8, 2020

Conversation

kwajiehao

@kwajiehao kwajiehao commented Dec 8, 2020

Overview

This PR addresses issue #83

Previously, site retrieval was slow because the GitHub API for retrieving org repos is paginated and we retrieved the pages sequentially, one at a time. As a result, the endpoint often took 7 or even 8 seconds to respond each time it was accessed (each page took around 3 seconds, perhaps because of the large amount of data being sent).

Since we know how many repos our isomerpages GitHub organization has, we can speed up the endpoint by requesting all pages concurrently instead of stepping through the API pagination. This PR makes the API calls that retrieve repo info concurrently, so the endpoint now takes only around 3 seconds to respond, more than halving the time taken previously.
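For reviewers, here is a minimal sketch of the difference between the two approaches. It is not the code in this PR: `Repo`, `FetchPage`, `fetchSequentially`, and `fetchConcurrently` are hypothetical names, and the page fetcher is injected so the sketch makes no assumption about how the GitHub API is actually called.

```typescript
// Hypothetical shape of a repo entry; the real GitHub response has many more fields.
type Repo = { name: string };

// Hypothetical page fetcher: takes a 1-based page number and resolves to that page's repos.
type FetchPage = (page: number) => Promise<Repo[]>;

// Sequential approach: each request starts only after the previous one finishes,
// so the total time is roughly the sum of the per-page times.
async function fetchSequentially(fetchPage: FetchPage, pageCount: number): Promise<Repo[]> {
  const repos: Repo[] = [];
  for (let page = 1; page <= pageCount; page += 1) {
    const pageRepos = await fetchPage(page);
    repos.push(...pageRepos);
  }
  return repos;
}

// Concurrent approach: all page requests start at once, so the total time is
// roughly that of the slowest page. Promise.all preserves the input order.
async function fetchConcurrently(fetchPage: FetchPage, pageCount: number): Promise<Repo[]> {
  const pageNumbers = Array.from({ length: pageCount }, (_, i) => i + 1);
  const pages = await Promise.all(pageNumbers.map((page) => fetchPage(page)));
  return ([] as Repo[]).concat(...pages);
}
```

The trade-off is that the concurrent version needs the page count up front, which is why it relies on knowing roughly how many repos the organization has.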

This commit also introduces an optional env var, ISOMERPAGES_REPO_PAGE_COUNT, which determines how many pages of the GitHub API to comb simultaneously.
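As an illustration of how an optional env var like this might be read, the sketch below parses `ISOMERPAGES_REPO_PAGE_COUNT` and falls back to a default when it is unset or invalid. The function name `resolvePageCount` and the fallback value of 3 are assumptions made for this example, not values taken from the PR.

```typescript
// Hypothetical helper: reads ISOMERPAGES_REPO_PAGE_COUNT from an env-like object.
// The fallback of 3 is an assumed default for illustration only.
function resolvePageCount(
  env: Record<string, string | undefined>,
  fallback: number = 3
): number {
  const raw = env.ISOMERPAGES_REPO_PAGE_COUNT;
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  // Accept only positive integers; anything else uses the fallback.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}
```

In a Node service this would typically be called as `resolvePageCount(process.env)`, and the result passed as the number of pages to request concurrently.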

@kwajiehao kwajiehao linked an issue Dec 8, 2020 that may be closed by this pull request
alexanderleegs
alexanderleegs previously approved these changes Dec 8, 2020
@alexanderleegs alexanderleegs left a comment

lgtm

Jie Hao Kwa added 2 commits December 8, 2020 17:43
Previously, site retrieval was slow because the GitHub API for retrieving
org repos was paginated, and we retrieved the data sequentially,
one page at a time. This meant that it often took up to 7 or even 8 seconds
each time this endpoint was accessed (each page took around 3 seconds,
perhaps due to the large amount of data being sent).

This commit improves performance by making these API calls
concurrently, so that it now only takes around 3 seconds for the endpoint
to respond.

This commit also introduces an optional env var, ISOMERPAGES_REPO_PAGE_COUNT,
which determines how many pages of the GitHub API to comb simultaneously.
Since we know the number of repos our GitHub org has, we can use this
info to speed up our endpoint by making concurrent calls instead of stepping
through the API pagination.
@alexanderleegs alexanderleegs left a comment

lgtm

@kwajiehao kwajiehao merged commit dff32f4 into staging Dec 8, 2020
@kwajiehao kwajiehao deleted the fix/sites-performance branch December 8, 2020 09:45
@kwajiehao kwajiehao mentioned this pull request Dec 9, 2020
harishv7 pushed a commit that referenced this pull request Feb 17, 2023
* feat: speed site retrieval up by making concurrent api calls

* refactor: remove unnecessary filter

Co-authored-by: Jie Hao Kwa <[email protected]>
Development

Successfully merging this pull request may close these issues.

Paginate/optimize site retrieval code
2 participants