[Github] Failure to Read > 29 Branches #327

Closed
ZoriaRPG opened this issue Nov 3, 2019 · 2 comments

Comments

ZoriaRPG commented Nov 3, 2019

Related to #82

Reading from GitHub in a new environment, source-integration fails to import all branches/revisions. From what we observe, the likely culprit is the GitHub API's pagination:

e.g. the first 'page' of our repo:
https://api.github.com/repos/armageddongames/zeldaclassic/branches

The plugin does not attempt to parse beyond that first page--perhaps at one time, they were one unified page?

See:
https://api.github.com/repos/armageddongames/zeldaclassic/branches?page=2
https://api.github.com/repos/armageddongames/zeldaclassic/branches?page=3
&c.

Beyond this, it seems to always stop at the 29th branch, out of 30. I cannot explain this one unless the software has a hardcoded limit, but we can reproduce it with 100% reliability.

Fetching all branches with '*', we always stop at the 29th branch on the first page, and the plugin believes that there remains naught more to see, or to fetch beyond that, unless we manually direct it at each missing branch.

We tested this from the CLI, to ensure that it was not a server / php timeout. Happens every time, without variation.

Member

dregad commented Dec 23, 2019

Looks like your observation is correct. According to the GitHub API documentation:

> Requests that return multiple items will be paginated to 30 items by default

> perhaps at one time, they were one unified page

More likely, I don't think anyone ever faced (or reported) this problem before.

> it seems to always stop at the 29th branch, out of 30

The JSON returned by the API does contain 30 elements, but please consider that the array is 0-based, so the last branch has index 29. Can you please confirm that you are actually getting only 29 branches? If so, then something else is possibly broken.
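To illustrate the off-by-one point, here is a minimal sketch. The branch names are placeholders and do not come from the real repository; the only thing it shows is that a 30-element, 0-based array ends at index 29.

```php
<?php
// Hypothetical stand-in for one decoded API page: 30 placeholder branch names.
$t_branches = [];
for( $i = 1; $i <= 30; $i++ ) {
    $t_branches[] = "branch-$i";
}

$t_count = count( $t_branches );                // number of elements
$t_last_index = array_key_last( $t_branches );  // highest 0-based index (PHP >= 7.3)

echo "count = $t_count, last index = $t_last_index\n";
// prints: count = 30, last index = 29
```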

In any case, there is indeed a bug, because the logic that retrieves branches does not do any pagination.

Proper pagination handling per GitHub's documentation requires reading the response's Link header. That will not be easy to fix, because the MantisBT low-level APIs currently used to retrieve the JSON from GitHub return only the response's payload, not its headers. This would require some refactoring to rely on another method to consume the API (e.g. GuzzleHttp).
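For reference, a rough sketch of what reading the Link header involves once the header value is available. `next_link()` is a hypothetical helper, not something that exists in MantisBT or the plugin, and the example.test URLs are placeholders. It assumes each entry has the simple `<URL>; rel="name"` shape, with `rel` as the first parameter.

```php
<?php
/**
 * Return the URL tagged rel="next" in a Link header value, or null if absent.
 * Hypothetical helper for illustration only.
 */
function next_link( string $p_header ): ?string {
    // Capture every <URL>; rel="name" pair in the header value
    preg_match_all(
        '/<([^>]+)>\s*;\s*rel="([^"]+)"/',
        $p_header,
        $t_matches,
        PREG_SET_ORDER
    );
    foreach( $t_matches as $t_match ) {
        if( $t_match[2] === 'next' ) {
            return $t_match[1];
        }
    }
    // No "next" relation: this was the last page
    return null;
}

// Placeholder header value in the same shape as a paginated API response
$t_header = '<https://api.example.test/branches?page=2>; rel="next", '
          . '<https://api.example.test/branches?page=19>; rel="last"';
echo next_link( $t_header ), "\n";
```

On the last page there is no `next` relation, so the helper returns null and the caller can stop.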

Maybe as a workaround we could try (against GitHub's recommendation) to construct the URL ourselves, increasing the page number until the payload is empty.
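A minimal sketch of that workaround, under a few assumptions: `fetch_all_pages()` and the `$p_fetch` callable are hypothetical names, and the callable stands in for whatever low-level function returns the decoded JSON for a URL. The stub below serves fake data so the loop can be tried offline.

```php
<?php
/**
 * Collect items from every page by incrementing ?page=N until a page is empty.
 * $p_fetch is a hypothetical callable: it takes a URL and returns the decoded
 * JSON array for that page. In real use it would wrap the HTTP call.
 */
function fetch_all_pages( callable $p_fetch, string $p_base_url, int $p_per_page = 100 ): array {
    $t_items = [];
    for( $t_page = 1; ; $t_page++ ) {
        $t_url = "$p_base_url?per_page=$p_per_page&page=$t_page";
        $t_batch = $p_fetch( $t_url );
        if( empty( $t_batch ) ) {
            break; // empty payload: no more pages
        }
        $t_items = array_merge( $t_items, $t_batch );
    }
    return $t_items;
}

// Offline stand-in for the API: 250 fake branches, sliced per the query string.
$t_fake = array_map( function( $i ) { return "branch-$i"; }, range( 1, 250 ) );
$t_stub = function( string $p_url ) use ( $t_fake ): array {
    parse_str( parse_url( $p_url, PHP_URL_QUERY ), $t_query );
    $t_size = (int)$t_query['per_page'];
    $t_offset = ( (int)$t_query['page'] - 1 ) * $t_size;
    return array_slice( $t_fake, $t_offset, $t_size );
};

$t_all = fetch_all_pages( $t_stub, 'https://api.example.test/branches' );
echo count( $t_all ), " branches found\n";
// prints: 250 branches found
```

This costs one extra request compared to following the Link header, since the loop only stops after it has seen an empty page.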

Member

dregad commented Dec 23, 2019

Just a quick proof-of-concept

// Assumes Guzzle was installed via Composer
require 'vendor/autoload.php';

// Set to max value allowed by GitHub API, to minimize number of requests
$t_per_page = 100;
$t_url = "https://api.github.com/repos/armageddongames/zeldaclassic/branches?per_page=$t_per_page";
$t_page = 1;
$t_count_branches = 0;

$t_options = [
    // Whatever is needed here, e.g. proxy, etc.
    // Reference http://docs.guzzlephp.org/en/stable/request-options.html
];
$t_client = new GuzzleHttp\Client( $t_options );

do {
    echo "Processing page ", $t_page++, " - GET $t_url\n";
    $res = $t_client->get( $t_url );

    // Process payload (cast the PSR-7 body stream to a string before decoding)
    $t_json = json_decode( (string)$res->getBody() );
    $t_count_branches += count( $t_json );

    // Get the next page
    $links = GuzzleHttp\Psr7\parse_header( $res->getHeader( 'Link' ) );
    foreach( $links as $link ) {
        if( $link['rel'] == 'next' ) {
            $t_url = trim( $link[0], '<>' );
            continue 2;
        }
    }
    // There is no "next" link - all pages have been processed
    break;
} while (true);

echo "Total $t_count_branches branches found.\n";

This returns 553 branches, matching the number shown at https://github.com/armageddongames/zeldaclassic/.

dregad closed this as completed in 66cec45 on Feb 13, 2020