Provide a manual override for Katana CPU topology and thread pinning #71

arthurp · 2021-02-09T16:38:49Z

We should provide a way to manually provide CPU topology, overriding the parsing of /proc and maybe even libnuma.
Previous issues (e.g., #64) have come up around incorrectly reading CPU information. A manual override would allow us to work around future issues of this kind quickly, and it would enable experimentation with non-standard thread pinnings and that kind of thing (e.g., I could tell Katana that my local machine has 4 NUMA nodes with 4 cores each for testing and debugging NUMA-specific crashes).

As a strawman, I propose the following environment variable:

KATANA_OVERRIDE_CORES="0,1,2,3:4,5,6,7"

Meaning "treat cores 0, 1, 2, 3 as one NUMA node and 4, 5, 6, 7 as another, ignore all other cores". In this mode Katana would ignore all other information about hyperthreading or memory and just do as it's told.

More specifically, the environment variable KATANA_OVERRIDE_CORES is a list of "fake NUMA nodes" separated by : (colon). Each fake NUMA node is a list of core numbers separated by , (comma). Or generally: a1,...,an:...:b1,...,bm. All core numbers are the numbers used by sched_setaffinity, with no remapping, to avoid bugs in the mapping itself. This provides enough information for the Katana thread pool to set things up correctly. There is a little more information available in HWTopoInfo, but it's not used and we should be able to generate reasonable fake information even if it was.

Issues with this simple approach:

It doesn't provide any way to specify hyperthreads.
It is very opaque and wouldn't be useful for end users. Though honestly, end users shouldn't be overriding these things anyway.

The text was updated successfully, but these errors were encountered:

ddn0 · 2021-02-09T17:28:22Z

Seems reasonable to me if only to provide a path for faking configs for testing.

Small suggestion for enhancement: make the format of "katana topo override" value an implementation detail subject to future change.

libgalois used to respond to an environment variable along the lines of "DEBUG_PRINT_TOPO", which would print the mappings it had inferred. I thought it was a little weird to have this functionality in the library itself and stuck into every program using libgalois, so this capability is now just available in the topo unit test.

It might make sense to promote this to an actual tool that (1) prints out the current inferred information (human, json) and (2) accepts some json data and spits out the internal override value suitable for an environment variable.

Pros:

Slightly more self-documenting UX
Puts the reading of topo information next to the configuring of topo information for a user
Better error handling when a tool can spit out error messages

Cons:

The tool version may be different than the library version (probably not a major concern for a developer tool)

Separately, it might make sense to use hwloc directly:
https://www.open-mpi.org/projects/hwloc/

arthurp · 2021-02-09T17:33:45Z

It might make sense to promote this to an actual tool that (1) prints out the current inferred information (human, json) and (2) accepts some json data and spits out the internal override value suitable for an environment variable.

If we have a tool like this, why not just allow the user to provide a json filename in the environment variable and parse it in libgalois? No reason to have the indirection since libgalois already depends on a json parser via libtsuba.

ddn0 · 2021-02-09T17:40:03Z

That's fine too. Though, I'd start with something that takes the contents of the json config (or base64 encoding thereof) in an environment variable first. In a lot of cases, it is easier to hack in an environment variable than it is a file (e.g., multiple hosts, inside a container).

arthurp · 2021-02-09T22:38:11Z

That's a good point. And I would much prefer to have a simple format that is expert readable than a totally opaque base64 blob. So I think your initial idea is good.We could definitely just put a little json data into an environment variable. We would get a format like [[0,1,2,3],[4,5,6,7]] which is reasonably simple and wouldn't require tricks to quote properly on a shell. We could support more complex things too if needed.

arthurp added enhancement New feature or request good first issue Good for newcomers labels Feb 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Provide a manual override for Katana CPU topology and thread pinning #71

Provide a manual override for Katana CPU topology and thread pinning #71

arthurp commented Feb 9, 2021

ddn0 commented Feb 9, 2021

arthurp commented Feb 9, 2021

ddn0 commented Feb 9, 2021

arthurp commented Feb 9, 2021

Provide a manual override for Katana CPU topology and thread pinning #71

Provide a manual override for Katana CPU topology and thread pinning #71

Comments

arthurp commented Feb 9, 2021

ddn0 commented Feb 9, 2021

arthurp commented Feb 9, 2021

ddn0 commented Feb 9, 2021

arthurp commented Feb 9, 2021