
Scale down not getting all runners for org #1151

Closed
mcaulifn opened this issue Aug 31, 2021 · 9 comments

Comments

@mcaulifn
Contributor

mcaulifn commented Aug 31, 2021

The pagination here is not working correctly when fetching all runners for an org. If there are more than 30 runners, the full list is not fetched. I confirmed that pagination itself works with this code:

import { Octokit } from '@octokit/rest';

async function getRunners(owner: string): Promise<any> {
  const octokit = new Octokit({
    auth: 'xxx', // redacted token
    baseUrl: 'https://github.local/api/v3', // GitHub Enterprise Server API root
  });
  // octokit.paginate follows the Link headers and resolves with the
  // combined results from every page, not just the first 30.
  return await octokit.paginate(octokit.actions.listSelfHostedRunnersForOrg, {
    org: owner,
  });
}

async function run(): Promise<void> {
  console.log(await getRunners('org-name'));
}

run();
@bjorntheart

30 results is standard behaviour according to the pagination docs.
Aren't you missing a .then() block?

async function run(): Promise<void> {  
  // console.log(await getRunners('org-name'));
  await getRunners('org-name').then((runners) => console.log(runners))
}
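For what it's worth, `await` and `.then()` are interchangeable here, so the missing `.then()` should not matter. A minimal sketch comparing the two call styles, using a stand-in async function (`getRunnersStub` is an illustrative name, not part of the module) so no GitHub API access is needed:

```typescript
// Stand-in for getRunners(): returns a canned list so the two call
// styles below can be compared without any GitHub API access.
async function getRunnersStub(): Promise<string[]> {
  return ['runner-1', 'runner-2'];
}

async function viaAwait(): Promise<string[]> {
  return await getRunnersStub(); // resolves to the full list
}

function viaThen(): Promise<string[]> {
  return getRunnersStub().then((runners) => runners); // same result
}
```

Both functions resolve to the same array; the difference is purely syntactic.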

@bjorntheart

wait... I just saw your open PR #1164. Is your question here still relevant? :-)

@mcaulifn
Contributor Author

The 30 results per page is expected if just calling the endpoint without octokit.paginate. Looking at the code, I read it as the entire set of results should be gathered before the promise is resolved.

I would have expected this to also be properly awaited but it didn't appear to be in my local testing.

The PR is for a different issue where subsequent Lambda runs can retain data in memory. Because of that, the class still has data from previous runs.
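For reference, what `octokit.paginate` does can be approximated with a manual page loop: keep requesting pages until a short page comes back. A minimal sketch with the page fetcher injected so it can be exercised without a live GitHub instance (`fetchAllRunners`, `Runner`, and `fetchPage` are illustrative names, not part of the scale-down code):

```typescript
// Minimal runner shape for the sketch; the real API returns more fields.
type Runner = { id: number; name: string };

// Fetch every page by calling fetchPage(page, perPage) until a page
// shorter than perPage is returned, then resolve with the full list.
async function fetchAllRunners(
  fetchPage: (page: number, perPage: number) => Promise<Runner[]>,
  perPage = 100,
): Promise<Runner[]> {
  const all: Runner[] = [];
  for (let page = 1; ; page++) {
    const batch = await fetchPage(page, perPage);
    all.push(...batch);
    if (batch.length < perPage) break; // last (short) page reached
  }
  return all;
}
```

With `per_page` left at its default of 30, a single un-paginated call returns at most 30 runners, which matches the truncation described above.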

@npalm
Member

npalm commented Sep 30, 2021

@mcaulifn So scale down is not working with 30+ runners? We haven't seen an issue over the last month with a larger set of runners.

@mcaulifn
Contributor Author

I am not seeing this consistently. I'll close this for now and re-open if we see it again.

@maschwenk
Contributor

maschwenk commented Oct 31, 2023

We're seeing a similar issue. We routinely run 500+ runners, so if pagination were broken outright the resulting chaos would be obvious; for that reason I'm fairly confident the pagination code works fine. However, we're still seeing situations where runners are missing from the data GitHub provides. Since it was inconsistent for @mcaulifn as well, I wonder if there's a chance this is actually an API bug on GitHub's side?

We have cases where we can see a job clearly mid-execution on a runner, yet the scale-down logs show a hard termination (i.e. terminating the instance without de-registering the runner first). In that unhappy path of termination without de-registering, should we re-query GitHub to double-check the runner really isn't there?
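The suggested double-check could look something like the sketch below: before hard-terminating an instance considered orphaned, re-query the runner list once more and skip termination if the runner reappeared. `terminateIfStillOrphaned`, `listRunners`, and `terminate` are hypothetical names, injected here so the logic can be exercised without AWS or GitHub access; this is not the module's actual API:

```typescript
// Guard against stale data: re-query the org's runners immediately
// before a hard termination, and bail out if the instance shows up.
async function terminateIfStillOrphaned(
  instanceId: string,
  listRunners: () => Promise<string[]>, // fresh query, not a cached list
  terminate: (id: string) => Promise<void>,
): Promise<boolean> {
  const runners = await listRunners();
  if (runners.includes(instanceId)) {
    return false; // runner reappeared; do not terminate
  }
  await terminate(instanceId);
  return true; // confirmed orphaned and terminated
}
```

The trade-off is one extra API call per suspected orphan, which seems cheap relative to killing a runner that is mid-job.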

@andrewdibiasio6

Should this be re-opened? We hit this issue often, and we frequently have hundreds of runners.

{
    "level": "INFO",
    "message": "Runner 'i-0010112d6c39eddc2' is orphaned and will be removed.",
    "service": "runners-scale-down",
    "timestamp": "2024-06-13T18:30:39.363Z",
    "module": "scale-down",
    ...
}

@mrmeyers99

mrmeyers99 commented Oct 2, 2024

We're also seeing this issue fairly frequently.

@mrmeyers99

Can we get this issue re-opened? We are seeing it very consistently.
