API Rate Limit Exceeding #258

Closed
sbhaktha opened this issue Mar 10, 2016 · 4 comments

@sbhaktha

Hi @kohsuke,

I would appreciate your help on this.

I am using the OkHttpConnector and have specified a cache directory on my server. I am using OAuth, and my quota appears to be 5000 requests, based on this sample cache file:

https://api.github.com/repos/allenai/aristo-tables/contents/tables/weather_terms?ref=master
GET
2
Authorization: token e381d0427927aef5e2858ac06b6cb01a34b0a603
Accept-Encoding: gzip
HTTP/1.1 200 OK
30
Server: GitHub.com
Date: Thu, 10 Mar 2016 20:57:33 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Status: 200 OK
**X-RateLimit-Limit: 5000**
X-RateLimit-Remaining: 4689
X-RateLimit-Reset: 1457646852
Cache-Control: private, max-age=60, s-maxage=60
Vary: Accept, Authorization, Cookie, X-GitHub-OTP
ETag: W/"c2dc693298f7806038e984d1ac857ffb"
Last-Modified: Tue, 08 Mar 2016 23:54:00 GMT
X-OAuth-Scopes: read:repo_hook, repo
X-Accepted-OAuth-Scopes:
X-OAuth-Client-Id: 47355241bdf02ac9122d
X-GitHub-Media-Type: github.v3; format=json
Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
Access-Control-Allow-Origin: *
Content-Security-Policy: default-src 'none'
Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
X-Content-Type-Options: nosniff
X-Frame-Options: deny
X-XSS-Protection: 1; mode=block
Vary: Accept-Encoding
X-Served-By: 01d096e6cfe28f8aea352e988c332cd3
Content-Encoding: gzip
X-GitHub-Request-Id: 36D5C9C0:101B5:A516A26:56E1DFBD
OkHttp-Selected-Protocol: http/1.1
OkHttp-Sent-Millis: 1457643453839
OkHttp-Received-Millis: 1457643453959
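
For readers following along, here is a minimal sketch of how the rate-limit headers in the cache file above can be read. It uses only the Scala and Java standard libraries; `RateLimitHeaders`, `parseHeaders`, and `rateLimit` are hypothetical names made up for this illustration and are not part of the github-api library.

```scala
import java.time.Instant

// Hypothetical helper for illustration only; not part of github-api or OkHttp.
object RateLimitHeaders {
  final case class RateLimit(limit: Int, remaining: Int, resetEpochSeconds: Long) {
    // Requests already spent in the current rate-limit window.
    def used: Int = limit - remaining
    // X-RateLimit-Reset is a Unix timestamp in seconds (UTC).
    def resetsAt: Instant = Instant.ofEpochSecond(resetEpochSeconds)
  }

  // Parse "Name: value" lines into a map keyed by lower-cased header name.
  def parseHeaders(lines: Seq[String]): Map[String, String] =
    lines.flatMap { line =>
      line.split(":", 2) match {
        case Array(name, value) => Some(name.trim.toLowerCase -> value.trim)
        case _                  => None // not a header line, e.g. "GET"
      }
    }.toMap

  // Build a RateLimit only if all three headers are present.
  def rateLimit(headers: Map[String, String]): Option[RateLimit] =
    for {
      limit     <- headers.get("x-ratelimit-limit")
      remaining <- headers.get("x-ratelimit-remaining")
      reset     <- headers.get("x-ratelimit-reset")
    } yield RateLimit(limit.toInt, remaining.toInt, reset.toLong)

  // Values copied from the sample cache file above.
  val sample: Seq[String] = Seq(
    "X-RateLimit-Limit: 5000",
    "X-RateLimit-Remaining: 4689",
    "X-RateLimit-Reset: 1457646852"
  )
}
```

With the sample values, 311 of the 5000 requests had already been used, and the window resets at 2016-03-10T21:54:12Z, roughly 57 minutes after the `Date` header of the response.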

My client refreshes periodically to stay in sync with the repo. However, even though there has been no change in the repo, I run out of API rate limit every now and then. I thought it should just be reading from the cache.

The following call gets executed on every refresh:

  private def getTableDirs(
    oauthAccessToken: String,
    repo: GitRepoInfo,
    tableNamesFilter: Option[Seq[String]]
  ): Seq[GHContent] = {
    blocking {
      // Create a GitHubBuilder to be able to build a GitHub object with required
      // RateLimitHandler strategy and OAuth parameters. Instead of waiting, this will
      // throw an exception immediately if the request limit is exceeded.
      val gitHubBuilder =
        new GitHubBuilder()
          .withRateLimitHandler(RateLimitHandler.FAIL)
          .withOAuthToken(oauthAccessToken)
          .withConnector(
            new OkHttpConnector(
              new OkUrlFactory(
                new OkHttpClient().setCache(cache))))
      val github = gitHubBuilder.build()

      // Get the requested repo.
      val repoName = repo.fork + "/" + repo.repo
      val repository = github.getRepository(repoName)
      // Get all directories (expected to be Table directories) from the top level of the repo.
      val allTableDirs =
        repository.getDirectoryContent("tables", repo.branch).asScala.filter(_.isDirectory)
      // If there is a filter, restrict returned table directories to that set, if not return all.
      tableNamesFilter match {
        case Some(filter) =>
          val tableSet = filter.map(_.toLowerCase).toSet
          allTableDirs.filter(d => tableSet.contains(d.getName.toLowerCase))
        case None =>
          allTableDirs
      }
    }
  }

Further, there are other calls like ghContent.read. Is each of these a separate request to GitHub? Even so, I would expect them to be looked up from the cache rather than sent every time.

Any ideas?

@kohsuke
Collaborator

kohsuke commented Mar 12, 2016

When you step through the code in a debugger, what do you see?

@Shredder121
Contributor

I know what the issue is here.
OkHttp only caches a response if it is allowed to.
CacheStrategy.isCacheable(Response response, Request request) checks whether the cache headers indicate that the response is safe to cache.
RFC 2616, section 13.4, lays out the rules.

Hint: the cache headers are sub-optimal.

@kohsuke Would you consider updating the OkHttp dependency?
It seems that the newest CacheStrategy.isCacheable(Response response, Request request) has been updated to reflect the fact that the cache is a private cache, which should result in better performance overall.
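
To make the shared-versus-private distinction concrete, here is a rough sketch of the RFC 2616 rules applied to the `Cache-Control` value from the sample response. It is a simplified reading of the RFC and not OkHttp's actual `CacheStrategy` code; `CacheSketch`, `mayStore`, `freshnessSeconds`, and `isFresh` are hypothetical names introduced only for this example.

```scala
// Simplified model of RFC 2616 caching rules; NOT OkHttp's implementation.
object CacheSketch {
  // Split a Cache-Control value into lower-cased directives.
  // Directives without "=" (e.g. "private") map to an empty string.
  def parseCacheControl(value: String): Map[String, String] =
    value.split(",").map(_.trim).filter(_.nonEmpty).map { d =>
      val parts = d.split("=", 2)
      parts(0).trim.toLowerCase -> (if (parts.length > 1) parts(1).trim else "")
    }.toMap

  // Status codes that RFC 2616, section 13.4, allows a cache to store by default.
  val storableStatus: Set[Int] = Set(200, 203, 206, 300, 301, 410)

  // sharedCache = true models a proxy-style cache,
  // sharedCache = false models a per-user (private) cache.
  def mayStore(
    status: Int,
    cc: Map[String, String],
    requestHasAuthorization: Boolean,
    sharedCache: Boolean
  ): Boolean = {
    // Section 14.8: a shared cache may reuse responses to authorized requests
    // only with public, must-revalidate, or s-maxage.
    val authOk = !sharedCache || !requestHasAuthorization ||
      cc.contains("public") || cc.contains("must-revalidate") || cc.contains("s-maxage")
    // Section 14.9.1: "private" forbids storage by a shared cache.
    val privateOk = !(sharedCache && cc.contains("private"))
    storableStatus(status) && !cc.contains("no-store") && authOk && privateOk
  }

  // Freshness lifetime in seconds: a shared cache prefers s-maxage over max-age.
  def freshnessSeconds(cc: Map[String, String], sharedCache: Boolean): Option[Long] = {
    val keys = if (sharedCache) Seq("s-maxage", "max-age") else Seq("max-age")
    keys.flatMap(k => cc.get(k)).headOption.map(_.toLong)
  }

  // A stored response is fresh while its age is below the freshness lifetime.
  def isFresh(ageSeconds: Long, cc: Map[String, String], sharedCache: Boolean): Boolean =
    freshnessSeconds(cc, sharedCache).exists(ageSeconds < _)
}
```

For `Cache-Control: private, max-age=60, s-maxage=60` on a request that carries an `Authorization` header, this model refuses to store the response when it behaves like a shared cache and stores it when it behaves like a private cache. Even then, the response is considered fresh for only 60 seconds, after which the cache has to go back to the server.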

@kohsuke
Collaborator

kohsuke commented Jun 4, 2016

Thanks for the detective work.

@kohsuke kohsuke closed this as completed Jun 4, 2016
@Shredder121
Contributor

You're welcome 👍
