resource: emitting "shadowed" resources in an exclusive allocation #689

Closed
SteVwonder opened this issue Jul 8, 2020 · 2 comments

@SteVwonder
Member

Currently, if a user constructs a jobspec that requests an exclusive allocation on a "non-leaf" resource, the resource writer does not emit the child resources. For example, the following jobspec "implicitly" allocates cores exclusively to the job, but the rv1* writers fail to produce an R_lite because the traverser never "explicitly" matched any cores and therefore never emitted them:

version: 1
resources:
  - type: node
    count: 1
    with:
      - type: slot
        count: 2
        label: default
        with:
          - type: socket
            count: 1

# a comment
attributes:
  system:
    duration: 3600
tasks:
  - command: [ "app" ]
    slot: default
    count:
      per_slot: 1

Reproducer 1 with resource-query and the default simple writer (note the absence of cores):

(1) sherbein ~/.../flux-framework/flux-sched/resource/utilities
❯ ./resource-query -L ../../t/data/hwloc-data/004N/exclusive/04-brokers/0.xml -f hwloc
INFO: Loading a matcher: CA
resource-query> match allocate ../../t/data/resource/jobspecs/basics/test009.yaml
      ---------socket0[1:x]
      ---------socket1[1:x]
      ------cab1234[1:s]
      ---cluster0[1:s]
INFO: =============================
INFO: JOBID=1
INFO: RESOURCES=ALLOCATED
INFO: SCHEDULED AT=Now
INFO: =============================

Reproducer 2 with resource-query and the rv1 writer (note the absence of any resource output):

(130) 2m 10s sherbein ~/.../flux-framework/flux-sched/resource/utilities (hwloc2 !?S)
❯ ./resource-query -L ../../t/data/hwloc-data/004N/exclusive/04-brokers/0.xml -f hwloc -F rv1
INFO: Loading a matcher: CA
resource-query> match allocate ../../t/data/resource/jobspecs/basics/test009.yaml
INFO: =============================
INFO: JOBID=1
INFO: RESOURCES=ALLOCATED
INFO: SCHEDULED AT=Now
INFO: =============================
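
To make the gap concrete, here is a toy Python model of the behavior the reproducers show. It is only a sketch: `Vertex`, `match_explicit`, and the tree layout are hypothetical names invented for illustration and do not correspond to the actual flux-sched traverser or writer code. It assumes a walker that records a vertex only when its type appears in the jobspec, so the cores under an exclusively matched socket never reach the writer.

```python
# Toy model only: every name here is hypothetical, not flux-sched code.
from dataclasses import dataclass, field


@dataclass
class Vertex:
    type: str
    name: str
    children: list = field(default_factory=list)


def match_explicit(vertex, requested_types, matched):
    """Pre-order walk that records only vertices whose type the jobspec names."""
    if vertex.type in requested_types:
        matched.append(vertex.name)
    for child in vertex.children:
        match_explicit(child, requested_types, matched)
    return matched


# One node with one socket that owns two cores.
socket0 = Vertex("socket", "socket0",
                 [Vertex("core", "core0"), Vertex("core", "core1")])
node = Vertex("node", "cab1234", [socket0])

# The jobspec names node and socket, but never core.
explicit = match_explicit(node, {"node", "socket"}, [])
print(explicit)  # the cores are absent even though the job owns them
```

In this model a writer that needs cores to build something like R_lite has nothing to work with, which is the symptom seen with `-F rv1` above.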
@dongahn
Member

dongahn commented Jul 12, 2020

I believe this can be done with a bit of redesign of our walker. In fact, during the redesign, we may want to think about fuller support for partial specification, which can omit not only the prefix levels of the resource graph but also the intermediate levels.

@SteVwonder
Member Author

SteVwonder commented Aug 19, 2020

From the coffee time today:

While this issue is still open, if a user asks for a node-exclusive allocation, or the system is configured to give node-exclusive access automatically, only the explicitly requested resources will be emitted in R. Node-exclusive discussion context:

Additionally, we don't want to emit every shadowed resource. Node-local storage is one example; we want to leave the storage available for staging in data for future jobs.

Finally, we also want to retain the ability to emit "pedantic" R (that does not include shadowed resources) for user-level schedulers.
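
The three requirements above (emit shadowed resources, skip some types such as node-local storage, and keep a "pedantic" mode) could be sketched as a single emission policy. This is a minimal Python illustration under stated assumptions, not the flux-sched design: `Vertex`, `emit`, `skip_types`, and `pedantic` are hypothetical names, and the tree is made up for the example.

```python
# Sketch of one possible emission policy; all names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Vertex:
    type: str
    name: str
    exclusive: bool = False  # True if the jobspec matched this vertex exclusively
    children: list = field(default_factory=list)


def emit(vertex, explicit_types, skip_types=frozenset(),
         pedantic=False, shadowed=False):
    """Return the names to emit, in pre-order.

    A vertex is "shadowed" when it sits below an explicitly matched,
    exclusive ancestor without being named in the jobspec itself.
    """
    is_explicit = vertex.type in explicit_types
    # Prune shadowed subtrees of excluded types (e.g. node-local storage).
    if shadowed and not is_explicit and vertex.type in skip_types:
        return []
    out = []
    # Explicit matches are always emitted; shadowed ones unless pedantic.
    if is_explicit or (shadowed and not pedantic):
        out.append(vertex.name)
    below_exclusive = shadowed or (is_explicit and vertex.exclusive)
    for child in vertex.children:
        out.extend(emit(child, explicit_types, skip_types,
                        pedantic, below_exclusive))
    return out


socket0 = Vertex("socket", "socket0", exclusive=True, children=[
    Vertex("core", "core0"),
    Vertex("core", "core1"),
    Vertex("storage", "ssd0"),
])
node = Vertex("node", "cab1234", children=[socket0])

full = emit(node, {"node", "socket"}, skip_types={"storage"})
pedantic_r = emit(node, {"node", "socket"}, skip_types={"storage"}, pedantic=True)
print(full)        # ['cab1234', 'socket0', 'core0', 'core1']
print(pedantic_r)  # ['cab1234', 'socket0']
```

Keeping the exclusion list and the pedantic switch as parameters of the walk, rather than of the writer, is just one option; it lets the same traversal serve both system-level and user-level schedulers.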
