Generalize resource management #2

w568w · 2024-09-30T06:14:55Z

Currently, node resources are understood simply as NVIDIA GPU resources, and all scheduling revolves around one or more GPUs. Every resource is abstracted as a Device struct that holds information specific to NVIDIA GPUs. It is not possible to specify other kinds of resources, or to submit a task that requires no GPU at all.

Device needs to be generalized into something like Resource, which should at least cover the memory resources of each node.

P.S. As for CPU, I believe it is easier to implement than memory. Moreover, once memory resource management is implemented, its logic can easily be reused for CPU, so we can set CPU aside for now.
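
To make the proposal concrete, here is a minimal sketch of what a generalized Resource could look like. It is written in Python purely for illustration and is not the project's actual code: the names Resource, Node, fits_in, try_allocate and release are all hypothetical, and GPUs are reduced to a plain count, whereas the real Device struct carries NVIDIA-specific information. CPU is deliberately left out, as suggested above.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class Resource:
    """Hypothetical bundle of resources that a node offers or a task requests."""
    memory_bytes: int = 0
    gpu_count: int = 0  # simplification: GPUs are only counted, not identified

    def fits_in(self, available: "Resource") -> bool:
        # A request fits only if every dimension fits.
        return (self.memory_bytes <= available.memory_bytes
                and self.gpu_count <= available.gpu_count)

    def __add__(self, other: "Resource") -> "Resource":
        return Resource(self.memory_bytes + other.memory_bytes,
                        self.gpu_count + other.gpu_count)

    def __sub__(self, other: "Resource") -> "Resource":
        return Resource(self.memory_bytes - other.memory_bytes,
                        self.gpu_count - other.gpu_count)


@dataclass
class Node:
    """Hypothetical node that tracks its currently free resources."""
    name: str
    free: Resource

    def try_allocate(self, request: Resource) -> bool:
        # Reserve the request if it fits; otherwise leave the node untouched.
        if not request.fits_in(self.free):
            return False
        self.free = self.free - request
        return True

    def release(self, request: Resource) -> None:
        # Return previously reserved resources to the node.
        self.free = self.free + request


node = Node("node-0", Resource(memory_bytes=64 * GIB, gpu_count=2))
memory_only_task = Resource(memory_bytes=8 * GIB)  # requests zero GPUs
accepted = node.try_allocate(memory_only_task)
```

With this shape, a task that needs no GPU is just a request whose gpu_count is 0, and adding CPU later would mean adding one more field plus one more comparison in fits_in.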

w568w added the labels "enhancement" (New feature or request) and "complexity: high" (Requires fundamental changes or thorough insight on the whole project.) on Sep 30, 2024
