Resolve API Rate limitation #117

Open
2 tasks
JafarIronclad opened this issue Nov 1, 2020 · 18 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request milestone: missing p-feature: Organizations Everything relating to Organizations page p-feature: search Everything relating to search page Priority: High High Priority issues are critical and have to be fixed with immediate effect role: product management size: missing

Comments

@JafarIronclad
Contributor

Overview

Increase the rate limits of the GitHub API via dialogue with GitHub. The current unauthenticated limits of 10 searches per minute and 60 general API calls per hour are quickly reached in normal CTI use cases, which badly restricts how often details can be updated for users.

Action Items

  • Create a writeup showing what we're trying to accomplish, why we need the API rate restriction removed, what we've already done to attempt to solve the issue, and what we've run into that negates those solutions.
  • Peer review writeup before presentation to GitHub.

Resources/Instructions

[Pending]

@JafarIronclad
Contributor Author

@JafarIronclad Self assign

@jsachsman jsachsman added documentation Improvements or additions to documentation enhancement New feature or request labels Nov 6, 2020
@shinjonathan
Member

GitHub's API has a rate limit of 10 searches per minute and 60 general API calls per hour for unauthenticated requests.

We are exploring ways to get the list of issues and the project languages for each search result, but that requires going through the general API calls. We are hoping to find a good way to increase the rate limit so we can pull the most up-to-date details on issues and project languages.

Our project typically returns about 30 results per search, which means the rate limit would be burned through on someone's first 2 searches.

@JafarIronclad
Contributor Author

JafarIronclad commented Nov 8, 2020

Draft 11/8/2020 12:25 PM:

DRAFT 11/8/2020 12:25 PM – Brent Hengeveld - Not a release candidate

Dear GitHub,

My name is Brent Hengeveld, and I'm contacting you as a member of the open source civic coding community and the Civic Tech Index project (https://github.com/civictechindex/CTI-website-frontend/). We would appreciate your assistance in resolving a technical hurdle we've encountered in our project.

We are a team of volunteers building a user-populated online global database of civic tech organizations and projects, with the goal of offering a centralized, unified hub that any civic software or tech developer can refer to in order to find a desired initiative, as well as tags and issues.

We are working through an issue where GitHub's API limits on searches (10) and general calls (60) present a bandwidth hurdle for the end user. You can reference the following files in our repository that perform these API calls: [LIST FILES]

We would like the user to be able to view a list of issues and project languages when reviewing a specific project page in our front end, which requires our database to make API calls in order to populate this data for a given project. In our testing, this limit is generally reached within 2 to 3 searches (given an average of 30 results returned per refined search query). At full operational scale, we believe this would cause search returns on project details to be unacceptably out of date for the end user.

The most straightforward solution we have would be to request an increase of the search and general API limits for our project. We understand this is a very uncommon and preferential privilege for a GitHub project, given the key function of API limits (ensuring that excessive server load from one repository does not negatively impact other GitHub repositories), so we would continually monitor usage and communicate our needs so as to justify the rate increases on a continuous basis.

Should this not be permissible, we would appreciate assistance in reaching an alternative, workable solution that allows us to meet our design goals for this key function in our project: searchable, up to date issues and project language tracking detail for a database entry. Here are the other solutions we have attempted thus far, without success:

[INSERT PRIOR SOLUTIONS ATTEMPTED WITH OUTCOMES]

We appreciate your time and consideration, and are thrilled to employ GitHub's platform to realize our project vision. Please let us know if you have any further questions or require further information.

Brent Hengeveld
Civic Tech Index team – Affiliated with Hack for LA

@JafarIronclad
Contributor Author

Of interest for when we're preparing our communication: Michael Hagger from GH's team responding on a different project, March 8th 2016 on this subject: CocoaPods/CocoaPods#4989

@giosce

giosce commented Nov 15, 2020

My understanding is that for an authenticated user, the limits are 1,800 per hour (30 per minute) for the search API and 5,000 per hour for the repos API.
api.github.com/search/repositories?q=cloud+language:java
returns (among many other fields)
"stargazers_count": 4175,
"watchers_count": 4175,
"language": "Java",
"open_issues_count": 119,
"topics": [
"cloud-native",
"java",
"microservices",
"ribbon",
"spring",
"spring-boot",
"spring-cloud",
"spring-cloud-core"
],
"forks": 2128,
"open_issues": 119,
"watchers": 4175,

I see that in order to get the multiple languages we need to issue another API call https://api.github.com/repos/dyc87112/SpringCloud-Learning/languages
which returns
{
"Java": 899442,
"JavaScript": 371656,
"HTML": 242878,
"CSS": 92366
}
And it seems that to get the number of "good first issues" we'll need to call
api.github.com/repos/civictechindex/CTI-website-frontend/issues?labels=good first issue

Not sure how to find "new issues"; maybe with the "since" param. We should probably also filter on state=open.
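
The per-repository calls described above can be sketched as URL builders. This is a minimal sketch, not the project's code: the function names are hypothetical, and it only builds the URLs (no request is sent). It assumes the issues endpoint's `labels`, `state`, and `since` query parameters; note that `since` filters on last-updated time, so it may not correspond exactly to "new" issues.

```python
from typing import List, Optional
from urllib.parse import quote, urlencode

API_ROOT = "https://api.github.com"


def languages_url(owner: str, repo: str) -> str:
    """URL of the per-repository language breakdown."""
    return f"{API_ROOT}/repos/{quote(owner)}/{quote(repo)}/languages"


def issues_url(
    owner: str,
    repo: str,
    labels: Optional[List[str]] = None,
    state: str = "open",
    since: Optional[str] = None,
) -> str:
    """URL of the issues list, with spaces in label names safely encoded."""
    params = {"state": state}
    if labels:
        # Multiple labels are passed as one comma-separated value.
        params["labels"] = ",".join(labels)
    if since:
        # ISO 8601 timestamp; filters on last update, not creation.
        params["since"] = since
    return f"{API_ROOT}/repos/{quote(owner)}/{quote(repo)}/issues?{urlencode(params)}"


if __name__ == "__main__":
    print(languages_url("dyc87112", "SpringCloud-Learning"))
    print(issues_url("civictechindex", "CTI-website-frontend", labels=["good first issue"]))
```

Encoding the label matters because "good first issue" contains spaces, which are not valid in a raw URL.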

@ExperimentsInHonesty
Member

todo is to validate what we learned so far and find the documentation that validates things like rate limits.

@Olivia-Chiong Olivia-Chiong mentioned this issue Dec 21, 2020
12 tasks
@giosce

giosce commented Jan 3, 2021

  1. Let's confirm, this issue is for showing the list of projects, right?
  2. From my research on 11/14, I think that only to get multiple languages we need an additional API call per repo. Can we confirm?
  3. I have heard from CTI team that authenticating is not viable. Why? Simple user authentication via token seems doable (where the user token would be used by CTI).
  4. Agreed that the unauthenticated rate limit isn't enough.
  5. Authenticated users can make 30 searches per minute https://docs.github.com/en/free-pro-team@latest/rest/reference/search#rate-limit
    and 5,000 other API calls per hour https://docs.github.com/en/free-pro-team@latest/rest/overview/resources-in-the-rest-api#rate-limiting

https://docs.github.com/en/free-pro-team@latest/rest/overview/resources-in-the-rest-api#authentication
https://docs.github.com/en/free-pro-team@latest/rest/guides/basics-of-authentication
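
To illustrate point 3, token authentication amounts to adding one header to each request. The sketch below is an assumption about how that could look, not how CTI does it: `build_request` is a hypothetical helper, the token value is a placeholder, and the request is only constructed, never sent.

```python
import urllib.request
from typing import Optional


def build_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GitHub API request; attach the token only when one is supplied."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # With a valid token, calls count against the authenticated quota.
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    # "PLACEHOLDER_TOKEN" is illustrative; a real token should come from
    # configuration or the environment, never from source code.
    req = build_request(
        "https://api.github.com/search/repositories?q=topic:civictechindex",
        token="PLACEHOLDER_TOKEN",
    )
    print(req.full_url)
    print(req.has_header("Authorization"))
```

One design caveat: a token embedded in front-end code is visible to every visitor, so a setup like this would likely need the calls to go through a server-side component.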

@giosce

giosce commented Jan 4, 2021

This call api.github.com/search/repositories?q=topic:civictechindex
Returns the following json

"total_count": 38,
"incomplete_results": false,
"items": [
    {
        "id": 7845579,
        "node_id": "MDEwOlJlcG9zaXRvcnk3ODQ1NTc5",
        "name": "openbudgetoakland",
        "full_name": "openoakland/openbudgetoakland",
        "private": false,
        "owner": {
            "login": "openoakland",
            "id": 2238933,
            "node_id": "MDEyOk9yZ2FuaXphdGlvbjIyMzg5MzM=",
            "avatar_url": "https://avatars2.githubusercontent.com/u/2238933?v=4",
            "gravatar_id": "",
            "url": "https://api.github.com/users/openoakland",
            "html_url": "https://github.com/openoakland",
            "followers_url": "https://api.github.com/users/openoakland/followers",
            "following_url": "https://api.github.com/users/openoakland/following{/other_user}",
            "gists_url": "https://api.github.com/users/openoakland/gists{/gist_id}",
            "starred_url": "https://api.github.com/users/openoakland/starred{/owner}{/repo}",
            "subscriptions_url": "https://api.github.com/users/openoakland/subscriptions",
            "organizations_url": "https://api.github.com/users/openoakland/orgs",
            "repos_url": "https://api.github.com/users/openoakland/repos",
            "events_url": "https://api.github.com/users/openoakland/events{/privacy}",
            "received_events_url": "https://api.github.com/users/openoakland/received_events",
            "type": "Organization",
            "site_admin": false
        },
        "html_url": "https://github.com/openoakland/openbudgetoakland",
        "description": "Visualizations of Oakland's budget data, and explanations about the budget process.",
        "fork": false,
        "url": "https://api.github.com/repos/openoakland/openbudgetoakland",
        "forks_url": "https://api.github.com/repos/openoakland/openbudgetoakland/forks",
        "keys_url": "https://api.github.com/repos/openoakland/openbudgetoakland/keys{/key_id}",
        "collaborators_url": "https://api.github.com/repos/openoakland/openbudgetoakland/collaborators{/collaborator}",
        "teams_url": "https://api.github.com/repos/openoakland/openbudgetoakland/teams",
        "hooks_url": "https://api.github.com/repos/openoakland/openbudgetoakland/hooks",
        "issue_events_url": "https://api.github.com/repos/openoakland/openbudgetoakland/issues/events{/number}",
        "events_url": "https://api.github.com/repos/openoakland/openbudgetoakland/events",
        "assignees_url": "https://api.github.com/repos/openoakland/openbudgetoakland/assignees{/user}",
        "branches_url": "https://api.github.com/repos/openoakland/openbudgetoakland/branches{/branch}",
        "tags_url": "https://api.github.com/repos/openoakland/openbudgetoakland/tags",
        "blobs_url": "https://api.github.com/repos/openoakland/openbudgetoakland/git/blobs{/sha}",
        "git_tags_url": "https://api.github.com/repos/openoakland/openbudgetoakland/git/tags{/sha}",
        "git_refs_url": "https://api.github.com/repos/openoakland/openbudgetoakland/git/refs{/sha}",
        "trees_url": "https://api.github.com/repos/openoakland/openbudgetoakland/git/trees{/sha}",
        "statuses_url": "https://api.github.com/repos/openoakland/openbudgetoakland/statuses/{sha}",
        "languages_url": "https://api.github.com/repos/openoakland/openbudgetoakland/languages",
        "stargazers_url": "https://api.github.com/repos/openoakland/openbudgetoakland/stargazers",
        "contributors_url": "https://api.github.com/repos/openoakland/openbudgetoakland/contributors",
        "subscribers_url": "https://api.github.com/repos/openoakland/openbudgetoakland/subscribers",
        "subscription_url": "https://api.github.com/repos/openoakland/openbudgetoakland/subscription",
        "commits_url": "https://api.github.com/repos/openoakland/openbudgetoakland/commits{/sha}",
        "git_commits_url": "https://api.github.com/repos/openoakland/openbudgetoakland/git/commits{/sha}",
        "comments_url": "https://api.github.com/repos/openoakland/openbudgetoakland/comments{/number}",
        "issue_comment_url": "https://api.github.com/repos/openoakland/openbudgetoakland/issues/comments{/number}",
        "contents_url": "https://api.github.com/repos/openoakland/openbudgetoakland/contents/{+path}",
        "compare_url": "https://api.github.com/repos/openoakland/openbudgetoakland/compare/{base}...{head}",
        "merges_url": "https://api.github.com/repos/openoakland/openbudgetoakland/merges",
        "archive_url": "https://api.github.com/repos/openoakland/openbudgetoakland/{archive_format}{/ref}",
        "downloads_url": "https://api.github.com/repos/openoakland/openbudgetoakland/downloads",
        "issues_url": "https://api.github.com/repos/openoakland/openbudgetoakland/issues{/number}",
        "pulls_url": "https://api.github.com/repos/openoakland/openbudgetoakland/pulls{/number}",
        "milestones_url": "https://api.github.com/repos/openoakland/openbudgetoakland/milestones{/number}",
        "notifications_url": "https://api.github.com/repos/openoakland/openbudgetoakland/notifications{?since,all,participating}",
        "labels_url": "https://api.github.com/repos/openoakland/openbudgetoakland/labels{/name}",
        "releases_url": "https://api.github.com/repos/openoakland/openbudgetoakland/releases{/id}",
        "deployments_url": "https://api.github.com/repos/openoakland/openbudgetoakland/deployments",
        "created_at": "2013-01-26T23:52:36Z",
        "updated_at": "2020-11-07T13:03:25Z",
        "pushed_at": "2020-12-03T20:09:58Z",
        "git_url": "git://github.com/openoakland/openbudgetoakland.git",
        "ssh_url": "[email protected]:openoakland/openbudgetoakland.git",
        "clone_url": "https://github.com/openoakland/openbudgetoakland.git",
        "svn_url": "https://github.com/openoakland/openbudgetoakland",
        "homepage": "http://openbudgetoakland.org",
        "size": 18370,
        "stargazers_count": 84,
        "watchers_count": 84,
        "language": "JavaScript",
        "has_issues": true,
        "has_projects": true,
        "has_downloads": true,
        "has_wiki": true,
        "has_pages": true,
        "forks_count": 129,
        "mirror_url": null,
        "archived": false,
        "disabled": false,
        "open_issues_count": 28,
        "license": {
            "key": "mit",
            "name": "MIT License",
            "spdx_id": "MIT",
            "url": "https://api.github.com/licenses/mit",
            "node_id": "MDc6TGljZW5zZTEz"
        },
        "topics": [
            "budget",
            "civic-hacking",
            "civictech",
            "civictechindex",
            "code-for-america",
            "data-visualization",
            "openoakland"
        ],
        "forks": 129,
        "open_issues": 28,
        "watchers": 84,
        "default_branch": "master",
        "permissions": {
            "admin": false,
            "push": false,
            "pull": true
        },
        "score": 1
    },
    {
        "id": 190321758,
        "node_id": "MDEwOlJlcG9zaXRvcnkxOTAzMjE3NTg=",
        "name": "311-data",
        "full_name": "hackforla/311-data",
        "private": false,
        "owner": {
            "login": "hackforla",
            "id": 11635254,
            "node_id": "MDEyOk9yZ2FuaXphdGlvbjExNjM1MjU0",
            "avatar_url": "https://avatars3.githubusercontent.com/u/11635254?v=4",
            "gravatar_id": "",
            "url": "https://api.github.com/users/hackforla",
            "html_url": "https://github.com/hackforla",
            "followers_url": "https://api.github.com/users/hackforla/followers",
            "following_url": "https://api.github.com/users/hackforla/following{/other_user}",
            "gists_url": "https://api.github.com/users/hackforla/gists{/gist_id}",
            "starred_url": "https://api.github.com/users/hackforla/starred{/owner}{/repo}",
            "subscriptions_url": "https://api.github.com/users/hackforla/subscriptions",
            "organizations_url": "https://api.github.com/users/hackforla/orgs",
            "repos_url": "https://api.github.com/users/hackforla/repos",
            "events_url": "https://api.github.com/users/hackforla/events{/privacy}",
            "received_events_url": "https://api.github.com/users/hackforla/received_events",
            "type": "Organization",
            "site_admin": false
        },
        "html_url": "https://github.com/hackforla/311-data",
        "description": "Empowering Neighborhood Associations to improve the analysis of their initiatives using 311 data",
        "fork": false,
        "url": "https://api.github.com/repos/hackforla/311-data",
        "forks_url": "https://api.github.com/repos/hackforla/311-data/forks",
        "keys_url": "https://api.github.com/repos/hackforla/311-data/keys{/key_id}",
        "collaborators_url": "https://api.github.com/repos/hackforla/311-data/collaborators{/collaborator}",
        "teams_url": "https://api.github.com/repos/hackforla/311-data/teams",
        "hooks_url": "https://api.github.com/repos/hackforla/311-data/hooks",
        "issue_events_url": "https://api.github.com/repos/hackforla/311-data/issues/events{/number}",
        "events_url": "https://api.github.com/repos/hackforla/311-data/events",
        "assignees_url": "https://api.github.com/repos/hackforla/311-data/assignees{/user}",
        "branches_url": "https://api.github.com/repos/hackforla/311-data/branches{/branch}",
        "tags_url": "https://api.github.com/repos/hackforla/311-data/tags",
        "blobs_url": "https://api.github.com/repos/hackforla/311-data/git/blobs{/sha}",
        "git_tags_url": "https://api.github.com/repos/hackforla/311-data/git/tags{/sha}",
        "git_refs_url": "https://api.github.com/repos/hackforla/311-data/git/refs{/sha}",
        "trees_url": "https://api.github.com/repos/hackforla/311-data/git/trees{/sha}",
        "statuses_url": "https://api.github.com/repos/hackforla/311-data/statuses/{sha}",
        "languages_url": "https://api.github.com/repos/hackforla/311-data/languages",
        "stargazers_url": "https://api.github.com/repos/hackforla/311-data/stargazers",
        "contributors_url": "https://api.github.com/repos/hackforla/311-data/contributors",
        "subscribers_url": "https://api.github.com/repos/hackforla/311-data/subscribers",
        "subscription_url": "https://api.github.com/repos/hackforla/311-data/subscription",
        "commits_url": "https://api.github.com/repos/hackforla/311-data/commits{/sha}",
        "git_commits_url": "https://api.github.com/repos/hackforla/311-data/git/commits{/sha}",
        "comments_url": "https://api.github.com/repos/hackforla/311-data/comments{/number}",
        "issue_comment_url": "https://api.github.com/repos/hackforla/311-data/issues/comments{/number}",
        "contents_url": "https://api.github.com/repos/hackforla/311-data/contents/{+path}",
        "compare_url": "https://api.github.com/repos/hackforla/311-data/compare/{base}...{head}",
        "merges_url": "https://api.github.com/repos/hackforla/311-data/merges",
        "archive_url": "https://api.github.com/repos/hackforla/311-data/{archive_format}{/ref}",
        "downloads_url": "https://api.github.com/repos/hackforla/311-data/downloads",
        "issues_url": "https://api.github.com/repos/hackforla/311-data/issues{/number}",
        "pulls_url": "https://api.github.com/repos/hackforla/311-data/pulls{/number}",
        "milestones_url": "https://api.github.com/repos/hackforla/311-data/milestones{/number}",
        "notifications_url": "https://api.github.com/repos/hackforla/311-data/notifications{?since,all,participating}",
        "labels_url": "https://api.github.com/repos/hackforla/311-data/labels{/name}",
        "releases_url": "https://api.github.com/repos/hackforla/311-data/releases{/id}",
        "deployments_url": "https://api.github.com/repos/hackforla/311-data/deployments",
        "created_at": "2019-06-05T03:46:06Z",
        "updated_at": "2020-12-21T22:52:31Z",
        "pushed_at": "2020-12-21T22:52:29Z",
        "git_url": "git://github.com/hackforla/311-data.git",
        "ssh_url": "[email protected]:hackforla/311-data.git",
        "clone_url": "https://github.com/hackforla/311-data.git",
        "svn_url": "https://github.com/hackforla/311-data",
        "homepage": "",
        "size": 78520,
        "stargazers_count": 27,
        "watchers_count": 27,
        "language": "JavaScript",
        "has_issues": true,
        "has_projects": true,
        "has_downloads": true,
        "has_wiki": true,
        "has_pages": true,
        "forks_count": 27,
        "mirror_url": null,
        "archived": false,
        "disabled": false,
        "open_issues_count": 118,
        "license": {
            "key": "gpl-3.0",
            "name": "GNU General Public License v3.0",
            "spdx_id": "GPL-3.0",
            "url": "https://api.github.com/licenses/gpl-3.0",
            "node_id": "MDc6TGljZW5zZTk="
        },
        "topics": [
            "311-data",
            "civictechindex",
            "code-for-all",
            "code-for-america",
            "hack-for-la",
            "neighborhood-councils"
        ],
        "forks": 27,
        "open_issues": 118,
        "watchers": 27,
        "default_branch": "dev",
        "permissions": {
            "admin": false,
            "push": false,
            "pull": true
        },
        "score": 1
    },
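
Most of the fields a results list needs are already in the search response above, so only the full language breakdown requires a follow-up call. A minimal sketch of pulling those fields out, assuming the response has been parsed into a dict; `summarize_items` is a hypothetical name and the sample is trimmed from the JSON above.

```python
from typing import Any, Dict, List


def summarize_items(search_response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep only the per-repository fields a results list would display."""
    summaries = []
    for item in search_response.get("items", []):
        summaries.append(
            {
                "full_name": item["full_name"],
                "language": item.get("language"),  # primary language only
                "open_issues_count": item.get("open_issues_count", 0),
                "topics": item.get("topics", []),
                # Needed only if the full language breakdown is wanted.
                "languages_url": item.get("languages_url"),
            }
        )
    return summaries


# Trimmed copy of the two items shown in the response above.
sample = {
    "total_count": 38,
    "incomplete_results": False,
    "items": [
        {
            "full_name": "openoakland/openbudgetoakland",
            "language": "JavaScript",
            "open_issues_count": 28,
            "topics": ["budget", "civictechindex"],
            "languages_url": "https://api.github.com/repos/openoakland/openbudgetoakland/languages",
        },
        {
            "full_name": "hackforla/311-data",
            "language": "JavaScript",
            "open_issues_count": 118,
            "topics": ["311-data", "civictechindex"],
            "languages_url": "https://api.github.com/repos/hackforla/311-data/languages",
        },
    ],
}

if __name__ == "__main__":
    for row in summarize_items(sample):
        print(row["full_name"], row["language"], row["open_issues_count"])
```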

@giosce

giosce commented Jan 4, 2021

Check with larger result set (more popular tags).

@giosce giosce self-assigned this Jan 4, 2021
@giosce

giosce commented Jan 4, 2021

This API call api.github.com/search/repositories?q=topic:covid-19 returns
"total_count": 6472,
"incomplete_results": false,

Strangely, the same search term on github website https://github.com/search?q=covid-19&ref=opensearch
returns 86K+ results.

In any case, it seems that up to 6,472 (probably more) results are returned in one shot.
But, when analyzing the json response, there are only about 600 projects.

We need to check the "Link" header, which in this case says that we need to invoke the API 34 times to get all the results (which I guess also counts as 34 against the limit).
Link: <https://api.github.com/search/repositories?q=topic%3Acovid19&page=2>; rel="next", <https://api.github.com/search/repositories?q=topic%3Acovid19&page=34>; rel="last"

Still strange: it says 34 in both cases, searching covid19 (which reports about 2k results) and covid-19. If I ask for page=35 it says that it only returns 100 results.
Searching for civictechindex returns only 2 pages.
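
A possible explanation for the constant 34, assuming GitHub's documented cap of 1,000 results per search query and the default page size of 30: 1,000 / 30 rounds up to 34 pages, whatever the total count is. The sketch below reads the last page number from a "Link" header, assuming the standard `<url>; rel="name"` form; the function names are hypothetical.

```python
import math
import re
from typing import Dict, Optional
from urllib.parse import parse_qs, urlparse

# Matches one `<url>; rel="name"` entry of a Link header.
_LINK_ENTRY = re.compile(r'<([^>]+)>;\s*rel="([^"]+)"')


def parse_link_header(value: str) -> Dict[str, str]:
    """Map each rel name (next, last, ...) to its URL."""
    return {rel: url for url, rel in _LINK_ENTRY.findall(value)}


def last_page(value: str) -> Optional[int]:
    """Page number of the rel="last" link, or None if there is none."""
    links = parse_link_header(value)
    if "last" not in links:
        return None
    query = parse_qs(urlparse(links["last"]).query)
    return int(query["page"][0])


if __name__ == "__main__":
    header = (
        '<https://api.github.com/search/repositories?q=topic%3Acovid19&page=2>; rel="next", '
        '<https://api.github.com/search/repositories?q=topic%3Acovid19&page=34>; rel="last"'
    )
    print(last_page(header))
    # Assumed cap of 1,000 results at 30 per page.
    print(math.ceil(1000 / 30))
```

The same arithmetic fits the civictechindex search: 38 results at 30 per page is 2 pages.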

@Olivia-Chiong
Member

Product needs to have a discussion on search function and what the next steps are.

@giosce

giosce commented Jan 19, 2021

Added a test. This URL https://hackforla.github.io/github-api-test/ (search for covid) returns the following headers among which the "Link" which indicates that there are many more pages.

image

@Olivia-Chiong
Member

@giosce I'm of the opinion we should write to GitHub to request an increased API rate limit, or at least a methodology to achieve it.

@ExperimentsInHonesty
Member

ExperimentsInHonesty commented Jan 25, 2021

@giosce will look to see if the languages API and the issue labels API per repo have the same limits. See the .js file from the POC: https://github.com/hackforla/github-api-test/blob/master/index.js

issues api would be useful for
good+first+issue
help+wanted

Use the POC at https://github.com/civictechindex/github-api-test (make changes there to show everything we need for the real CTI website).

@giosce

giosce commented Jan 26, 2021

I believe that non-search API calls (like languages and issues) have these limits (I'll finish double-checking).
For API requests using Basic Authentication or OAuth, you can make up to 5,000 requests per hour
For users that belong to a GitHub Enterprise Cloud account, requests made using an OAuth token to resources owned by the same GitHub Enterprise Cloud account have an increased limit of 15,000 requests per hour.

https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting
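
One way to double-check which limit applies to a given call is to read the rate-limit headers GitHub returns with each response (`X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`). A minimal sketch, assuming the headers are available as a plain dict; the function names and the sample values are illustrative only.

```python
from typing import Dict, Mapping


def read_rate_limit(headers: Mapping[str, str]) -> Dict[str, int]:
    """Extract the quota, remaining calls, and reset time (epoch seconds)."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "limit": int(lowered["x-ratelimit-limit"]),
        "remaining": int(lowered["x-ratelimit-remaining"]),
        "reset_epoch": int(lowered["x-ratelimit-reset"]),
    }


def seconds_until_reset(headers: Mapping[str, str], now_epoch: int) -> int:
    """How long to wait before the quota window resets (never negative)."""
    return max(0, read_rate_limit(headers)["reset_epoch"] - now_epoch)


if __name__ == "__main__":
    # Made-up values in the shape of an unauthenticated response.
    sample_headers = {
        "X-RateLimit-Limit": "60",
        "X-RateLimit-Remaining": "0",
        "X-RateLimit-Reset": "1611700000",
    }
    print(read_rate_limit(sample_headers))
    print(seconds_until_reset(sample_headers, now_epoch=1611699400))
```

A limit of 60 would indicate an unauthenticated call, 5,000 an authenticated one, which makes this a quick check for each endpoint in question.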

@ExperimentsInHonesty ExperimentsInHonesty added p-feature: search Everything relating to search page p-feature: Organizations Everything relating to Organizations page labels Jan 26, 2021
@giosce

giosce commented Jan 29, 2021

The following 3 API calls each have a limit of 5,000 calls per hour:
api.github.com/repos/civictechindex/CTI-website-frontend/issues?labels=good first issue
https://api.github.com/repos/hackforla/food-oasis/issues
https://api.github.com/repos/code-for-chapel-hill/NC-COVID-Support/languages

Keep in mind that these calls too can return multiple pages for one response.

So, let's assume that a search returns 25 repositories.
For each we make one "issues" call and one "languages" call, for a total of 50 calls.
We can make 30 search calls per minute; if each returns 25 repositories, that is 1,500 follow-up calls. At that rate we could run 30 searches only about three times in an hour before exhausting the 5,000-call limit.
Put another way, we can't show repository details more than 2,500 times an hour, about 41 per minute.

We also need to double check, I assume that every time a user looks at a contributor we load (or search) all its projects. If it is a search (for example by the organization github_tag) it is in the 30xmin limit, otherwise probably in the 5,000.
api.github.com/orgs/hackforla/repos has 5,000 limit
https://api.github.com/search/repositories?q=topic:hack-for-la has a 30 x min limit

Finally we'll need to evaluate the typical usage pattern and estimate the number of concurrent users before we can say we are hitting the rate limit.
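
The budget reasoning above can be written out as plain arithmetic. This is a sketch under stated assumptions: two follow-up calls per repository (one "issues", one "languages"), 25 repositories per search, and the 5,000 calls per hour authenticated limit. The function names are hypothetical.

```python
def follow_up_calls(repos_per_search: int, calls_per_repo: int) -> int:
    """Non-search API calls triggered by one search."""
    return repos_per_search * calls_per_repo


def searches_per_hour(core_limit_per_hour: int, repos_per_search: int, calls_per_repo: int) -> int:
    """Searches whose follow-up calls fit inside the hourly non-search quota."""
    return core_limit_per_hour // follow_up_calls(repos_per_search, calls_per_repo)


def repo_views_per_hour(core_limit_per_hour: int, calls_per_repo: int) -> int:
    """Repository detail views that fit inside the hourly non-search quota."""
    return core_limit_per_hour // calls_per_repo


if __name__ == "__main__":
    # Authenticated case: 25 repositories, 2 calls each.
    print(follow_up_calls(25, 2))          # calls per search
    print(searches_per_hour(5000, 25, 2))  # searches per hour
    print(repo_views_per_hour(5000, 2))    # repository views per hour
    # Unauthenticated case from earlier in the thread: 30 results, 1 call each.
    print(searches_per_hour(60, 30, 1))
```

Under these assumptions the hourly budget is shared by all users, so the figures would need to be divided by the expected number of concurrent users to estimate a per-user allowance.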

@Olivia-Chiong
Member

Pending implementation with GitHub API to find out if we need to write letter for exception.

@smsada smsada added the Priority: High High Priority issues are critical and have to be fixed with immediate effect label Mar 19, 2022
@smsada
Member

smsada commented Mar 19, 2022

Additional reason we need to resolve Rate limitations:

  • As per the interview with Gio in Expert Interview: Gio #1111 an error message gets thrown on the search page when clearing project search filters
  • This is especially relevant once traffic increases
Screenshot of Error

image

Projects
Status: Prioritized Backlog
Development

No branches or pull requests

7 participants