Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use os.sched_getaffinity to get info about available CPUs #6131

Closed
wants to merge 2 commits into from

Conversation

pawelchcki
Copy link

when running inside containers with CPU limits python's
multiprocessing.cpu_count() will return total number of CPU's
available on the machine. However in reality the process and its
children will not be able to be scheduled on all CPUs.
This issue is most pronounced with taskset/cpuset is set for the process.
Causing scheduling contention over limited number of CPU's.

In case of CFS cpu limits, its similar, but due to bursty nature, at least
in a short timeframe the processes will be able to be scheduled on all CPUs.
The overall efficiency however will be reduced still.

There is no straightforward way to read the CPU shares assigned via CFS.
So just handling simple case when affinity is explicitly set should be a good start

@lgtm-com
Copy link

lgtm-com bot commented May 27, 2021

This pull request introduces 1 alert when merging 30a1837bb5e09b361d2956fb218a71ec3300c101 into 34caa75 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

except AttributeError:
pass

if os.name == "nt" and cpu_count > 61:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "nt" generic across all modern Windows versions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hoffmang9 This exact check is in any case being done when trying to compile the chia blockchain, at the point where the code tries to import this external file and this line.
I ran into a problem installing the whole thing, as described in #5158.

Copy link
Author

@pawelchcki pawelchcki May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hoffmang9 TBF I'm not that familiar with python and especially python on windows. I followed some quick online docs, checked existing Chia sources.

And the same check is already used in Chia
(+ the code that Nikolaj mentioned)

According to the docs it should return "nt" for all Windows versions.
Besides WSL or Cygwin envs running on Windows.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries - GH now requires approval for first time contributors to start full CI - hence you got a comment from some serious pain that caused that comment from me in the first place regarding the Python windows issues. My quick review does look like "nt" is safe indeed.

pass

if os.name == "nt" and cpu_count > 61:
cpu_count = 61 # Windows Server 2016 has an issue https://bugs.python.org/issue26903
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks you can spare some magic and alternatively do

def get_available_cpus() -> int:
    try:
        cpu_count = os.sched_getaffinity(0)
    except AttributeError:
        cpu_count = multiprocessing.cpu_count()

    # Note: Windows Server 2016 has an issue https://bugs.python.org/issue26903
    if os.name == "nt":
        MAX = 61
        cpu_count = min(cpu_count, MAX)

    return cpu_count

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, I've included it in the change.

@pawelchcki
Copy link
Author

Thank you both for quick review!

when running inside containers with CPU limits python's
multiprocessing.cpu_count() will return total number of CPU's
available on the machine. However in reality the process and its
children will not be able to be scheduled on all CPUs.
This issue is most pronounced with taskset/cpuset is set for the process.
Causing scheduling contention over limited number of CPU's.

In case of CFS cpu limits, its similar, but due to bursty nature, at least
in a short timeframe the processes will be able to be scheduled on all CPUs.

There is no straightforward way to read the CPU shares assigned via CFS.
So just handling simple case when affinity is explicitly set should be a good start
@mariano54
Copy link
Contributor

Thanks for your contribution! do you mind fixing this merge conflict?

@pawelchcki
Copy link
Author

@mariano54 merge conflict resolved :)

@lgtm-com
Copy link

lgtm-com bot commented Jul 14, 2021

This pull request introduces 1 alert when merging 3abd3f6 into 35dfdbc - view on LGTM.com

new alerts:

  • 1 for Variable defined multiple times

@github-actions
Copy link
Contributor

This PR has been flagged as stale due to no activity for over 60 days. It will not be automatically closed, but it has been given a stale-pr label and should be manually reviewed.

@github-actions github-actions bot added the stale-pr Flagged as stale and in need of manual review label Sep 12, 2021
@hoffmang9 hoffmang9 removed the stale-pr Flagged as stale and in need of manual review label Sep 12, 2021
@github-actions
Copy link
Contributor

This PR has been flagged as stale due to no activity for over 60 days. It will not be automatically closed, but it has been given a stale-pr label and should be manually reviewed by the relevant parties.

@github-actions github-actions bot added the stale-pr Flagged as stale and in need of manual review label Oct 28, 2021
@github-actions github-actions bot added the merge_conflict Branch has conflicts that prevent merge to main label Apr 30, 2022
@github-actions
Copy link
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

1 similar comment
@github-actions
Copy link
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@github-actions github-actions bot removed the stale-pr Flagged as stale and in need of manual review label Oct 13, 2023
Copy link
Contributor

This PR has been flagged as stale due to no activity for over 60 days. It will not be automatically closed, but it has been given a stale-pr label and should be manually reviewed by the relevant parties.

@github-actions github-actions bot added the stale-pr Flagged as stale and in need of manual review label Nov 27, 2023
@hoffmang9
Copy link
Member

Closing in favor of #17409

@hoffmang9 hoffmang9 closed this Mar 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge_conflict Branch has conflicts that prevent merge to main stale-pr Flagged as stale and in need of manual review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants