1659: Improve communication statistics in VT #1993

cz4rs · 2022-10-12T17:05:41Z

github-actions · 2022-10-12T17:14:39Z

Pipelines results

PR tests (gcc-12, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (clang-3.9, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (gcc-5, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

FAILED: src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx.o 
/usr/bin/ccache /usr/lib/ccache/g++ -DJSON_USE_IMPLICIT_CONVERSIONS=1 -DVT_NO_COLOR_ENABLED -I/vt/lib/CLI -Irelease -I/vt/src -I/vt/lib/json/include -I/vt/lib/brotli/c/include -I/vt/lib/libfort/lib -isystem /vt/lib/fmt/include -isystem /vt/lib/EngFormat-Cpp/include -isystem /build/checkpoint/install/include -isystem /build/detector/install/include -O3 -DNDEBUG -Wall -pedantic -Wshadow -Wno-unknown-pragmas -Wsign-compare -ftemplate-backtrace-limit=100 -Werror -Wno-unused-variable -fPIC -fopenmp -std=c++14 -MD -MT src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx.o -MF src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx.o.d -o src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx.o -c src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx
In file included from /usr/include/c++/5/bits/hashtable.h:35:0,
                 from /usr/include/c++/5/unordered_map:47,
                 from /vt/src/vt/handler/handler.h:48,
                 from /vt/src/vt/handler/handler.cc:44,
                 from src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx:3:
/usr/include/c++/5/bits/hashtable_policy.h: In instantiation of 'struct std::__detail::__is_noexcept_hash<vt::vrt::collection::balance::LBType, std::hash<vt::vrt::collection::balance::LBType> >':
/usr/include/c++/5/type_traits:137:12:   required from 'struct std::__and_<std::__is_fast_hash<std::hash<vt::vrt::collection::balance::LBType> >, std::__detail::__is_noexcept_hash<vt::vrt::collection::balance::LBType, std::hash<vt::vrt::collection::balance::LBType> > >'
/usr/include/c++/5/type_traits:148:38:   required from 'struct std::__not_<std::__and_<std::__is_fast_hash<std::hash<vt::vrt::collection::balance::LBType> >, std::__detail::__is_noexcept_hash<vt::vrt::collection::balance::LBType, std::hash<vt::vrt::collection::balance::LBType> > > >'
/usr/include/c++/5/bits/unordered_map.h:100:66:   required from 'class std::unordered_map<vt::vrt::collection::balance::LBType, std::__cxx11::basic_string<char> >'
/vt/src/vt/vrt/collection/balance/read_lb.h:123:36:   required from here
/usr/include/c++/5/bits/hashtable_policy.h:85:34: error: no match for call to '(const std::hash<vt::vrt::collection::balance::LBType>) (const vt::vrt::collection::balance::LBType&)'
  noexcept(declval<const _Hash&>()(declval<const _Key&>()))>
                                  ^
In file included from /usr/include/c++/5/bits/move.h:57:0,
                 from /usr/include/c++/5/bits/stl_pair.h:59,
                 from /usr/include/c++/5/bits/stl_algobase.h:64,
                 from /usr/include/c++/5/vector:60,
                 from /vt/src/vt/handler/handler.h:47,
                 from /vt/src/vt/handler/handler.cc:44,
                 from src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx:3:
/usr/include/c++/5/type_traits: In instantiation of 'struct std::__not_<std::__and_<std::__is_fast_hash<st

==> And there is more. Read log. <==

Build log
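
The gcc-5 error above is triggered by instantiating `std::unordered_map<LBType, std::string>` in read_lb.h. As far as I know, the libstdc++ shipped with GCC 5 does not provide `std::hash` for enumeration types (that support arrived in GCC 6), which would explain why only this job fails. One common workaround is to pass an explicit hash functor. The sketch below assumes `LBType` is an enumeration; `DemoLBType`, `EnumHash`, and `makeDemoNames` are hypothetical names for illustration and are not taken from the VT sources.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>
#include <unordered_map>

// Hypothetical stand-in for vt::vrt::collection::balance::LBType.
enum struct DemoLBType : signed char { NoLB = 0, GreedyLB = 1, TemperedLB = 2 };

// Hash functor that forwards to std::hash of the underlying integer type,
// so it does not depend on std::hash<Enum> being available.
struct EnumHash {
  template <typename E>
  std::size_t operator()(E e) const noexcept {
    static_assert(std::is_enum<E>::value, "EnumHash expects an enum type");
    using U = typename std::underlying_type<E>::type;
    return std::hash<U>{}(static_cast<U>(e));
  }
};

// Builds a name table keyed by the enum, using the explicit hasher.
inline std::unordered_map<DemoLBType, std::string, EnumHash> makeDemoNames() {
  return {
    {DemoLBType::NoLB, "NoLB"},
    {DemoLBType::GreedyLB, "GreedyLB"},
    {DemoLBType::TemperedLB, "TemperedLB"},
  };
}
```

The third template argument is the only change needed at the map's declaration; lookups such as `names.at(DemoLBType::GreedyLB)` stay the same.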

PR tests (gcc-10, ubuntu, openmpi, no LB)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (gcc-7, ubuntu, mpich, trace runtime, LB)

Build for 317f055 (2022-11-09 20:30:46 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (clang-5.0, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (gcc-9, ubuntu, mpich, zoltan)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (clang-9, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (gcc-6, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (clang-13, alpine, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (clang-11, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (nvidia cuda 11.0, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (intel icpx, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (gcc-8, ubuntu, mpich, address sanitizer)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (clang-12, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (nvidia cuda 10.1, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (clang-13, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (clang-14, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (gcc-11, ubuntu, mpich, json schema test)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (clang-10, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

PR tests (intel icpc, ubuntu, mpich)

Build for 634004e (2022-11-14 20:53:30 UTC)

Compilation - successful

Testing - passed

Build log

src/vt/vrt/collection/balance/lb_invoke/lb_manager.h

cz4rs · 2022-11-09T17:27:31Z

Sample output generated with:

mpiexec -n 2 ./build/tests/collection_extended --gtest_filter=LoadBalancerExplodeOther/TestLoadBalancerOther.test_load_balancer_other_1/4


{
    "type": "LBStatsfile",
    "phases": [
        {
            "id": 0,
            "pre-LB": {
                "Object_comm": {
                    "avg": 466.0,
                    "car": 4.0,
                    "imb": 0.7682403433476395,
                    "kur": -2.4373563791593496,
                    "max": 824.0,
                    "min": 112.0,
                    "npr": 4.0,
                    "skw": 0.00012438128147181126,
                    "std": 354.01129925469894,
                    "sum": 1864.0,
                    "var": 125324.0
                },
                "Object_load_modeled": {
                    "avg": 4.64694857068285e-05,
                    "car": 70.0,
                    "imb": 8.919304963295684,
                    "kur": 28.781199587775358,
                    "max": 0.0004609450002135418,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 4.959831982971411,
                    "std": 6.113038713175178e-05,
                    "sum": 0.003252863999477995,
                    "var": 3.736924230877844e-09
                },
                "Object_load_raw": {
                    "avg": 4.64694857068285e-05,
                    "car": 70.0,
                    "imb": 8.919304963295684,
                    "kur": 28.781199587775358,
                    "max": 0.0004609450002135418,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 4.959831982971411,
                    "std": 6.113038713175178e-05,
                    "sum": 0.003252863999477995,
                    "var": 3.736924230877844e-09
                },
                "Rank_comm": {
                    "avg": 932.0,
                    "car": 2.0,
                    "imb": 0.0042918454935623185,
                    "kur": -2.75,
                    "max": 936.0,
                    "min": 928.0,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 4.0,
                    "sum": 1864.0,
                    "var": 16.0
                },
                "Rank_load_modeled": {
                    "avg": 0.0016264319997389975,
                    "car": 2.0,
                    "imb": 0.30121517548248056,
                    "kur": -2.75,
                    "max": 0.0021163379999507015,
                    "min": 0.0011365259995272936,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 0.0004899060002117039,
                    "sum": 0.003252863999477995,
                    "var": 2.400078890434301e-07
                },
                "Rank_load_raw": {
                    "avg": 0.0016264319997389975,
                    "car": 2.0,
                    "imb": 0.30121517548248056,
                    "kur": -2.75,
                    "max": 0.0021163379999507015,
                    "min": 0.0011365259995272936,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 0.0004899060002117039,
                    "sum": 0.003252863999477995,
                    "var": 2.400078890434301e-07
                }
            }
        },
        {
            "id": 1,
            "migration count": 8,
            "post-LB": {
                "Object_comm": {
                    "avg": 840.0,
                    "car": 4.0,
                    "imb": 0.06666666666666665,
                    "kur": -2.4375,
                    "max": 896.0,
                    "min": 784.0,
                    "npr": 4.0,
                    "skw": 0.0,
                    "std": 56.0,
                    "sum": 3360.0,
                    "var": 3136.0
                },
                "Object_load_modeled": {
                    "avg": 5.6572871423148694e-05,
                    "car": 70.0,
                    "imb": 4.605796416967453,
                    "kur": 17.54840783783443,
                    "max": 0.00031713599992144736,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 3.3960689438183804,
                    "std": 4.28437352241483e-05,
                    "sum": 0.003960100999620408,
                    "var": 1.835585647956926e-09
                },
                "Object_load_raw": {
                    "avg": 5.6572871423148694e-05,
                    "car": 70.0,
                    "imb": 4.605796416967453,
                    "kur": 17.54840783783443,
                    "max": 0.00031713599992144736,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 3.3960689438183804,
                    "std": 4.28437352241483e-05,
                    "sum": 0.003960100999620408,
                    "var": 1.835585647956926e-09
                },
                "Object_work_modeled": {
                    "avg": 5.6572871423148694e-05,
                    "car": 70.0,
                    "imb": 4.605796416967453,
                    "kur": 17.54840783783442,
                    "max": 0.00031713599992144736,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 3.396068943818379,
                    "std": 4.284373522414831e-05,
                    "sum": 0.003960100999620408,
                    "var": 1.8355856479569265e-09
                },
                "Rank_comm": {
                    "avg": 1680.0,
                    "car": 2.0,
                    "imb": 0.0,
                    "kur": 0.0,
                    "max": 1680.0,
                    "min": 1680.0,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 0.0,
                    "sum": 3360.0,
                    "var": 0.0
                },
                "Rank_load_modeled": {
                    "avg": 0.001980050499810204,
                    "car": 2.0,
                    "imb": 0.0024769568420375254,
                    "kur": -2.75,
                    "max": 0.001984954999443289,
                    "min": 0.0019751460001771193,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 4.9044996330849244e-06,
                    "sum": 0.003960100999620408,
                    "var": 2.4054116650930158e-11
                },
                "Rank_load_raw": {
                    "avg": 0.001980050499810204,
                    "car": 2.0,
                    "imb": 0.0024769568420375254,
                    "kur": -2.75,
                    "max": 0.001984954999443289,
                    "min": 0.0019751460001771193,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 4.9044996330849244e-06,
                    "sum": 0.003960100999620408,
                    "var": 2.4054116650930158e-11
                },
                "Rank_work_modeled": {
                    "avg": 0.001980050499810204,
                    "car": 2.0,
                    "imb": 0.26022391825216284,
                    "kur": -2.75,
                    "max": 0.002495306999207969,
                    "min": 0.0014647940004124393,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 0.0005152564993977649,
                    "sum": 0.003960100999620408,
                    "var": 2.6548926017163883e-07
                }
            },
            "pre-LB": {
                "Object_comm": {
                    "avg": 840.0,
                    "car": 4.0,
                    "imb": 0.06666666666666665,
                    "kur": -2.4375,
                    "max": 896.0,
                    "min": 784.0,
                    "npr": 4.0,
                    "skw": 0.0,
                    "std": 56.0,
                    "sum": 3360.0,
                    "var": 3136.0
                },
                "Object_load_modeled": {
                    "avg": 5.6572871423148694e-05,
                    "car": 70.0,
                    "imb": 4.605796416967453,
                    "kur": 17.54840783783442,
                    "max": 0.00031713599992144736,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 3.396068943818379,
                    "std": 4.284373522414831e-05,
                    "sum": 0.003960100999620408,
                    "var": 1.8355856479569265e-09
                },
                "Object_load_raw": {
                    "avg": 5.6572871423148694e-05,
                    "car": 70.0,
                    "imb": 4.605796416967453,
                    "kur": 17.54840783783442,
                    "max": 0.00031713599992144736,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 3.396068943818379,
                    "std": 4.284373522414831e-05,
                    "sum": 0.003960100999620408,
                    "var": 1.8355856479569265e-09
                },
                "Rank_comm": {
                    "avg": 1680.0,
                    "car": 2.0,
                    "imb": 0.0,
                    "kur": 0.0,
                    "max": 1680.0,
                    "min": 1680.0,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 0.0,
                    "sum": 3360.0,
                    "var": 0.0
                },
                "Rank_load_modeled": {
                    "avg": 0.001980050499810204,
                    "car": 2.0,
                    "imb": 0.26022391825216284,
                    "kur": -2.75,
                    "max": 0.002495306999207969,
                    "min": 0.0014647940004124393,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 0.0005152564993977649,
                    "sum": 0.003960100999620408,
                    "var": 2.6548926017163883e-07
                },
                "Rank_load_raw": {
                    "avg": 0.001980050499810204,
                    "car": 2.0,
                    "imb": 0.26022391825216284,
                    "kur": -2.75,
                    "max": 0.002495306999207969,
                    "min": 0.0014647940004124393,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 0.0005152564993977649,
                    "sum": 0.003960100999620408,
                    "var": 2.6548926017163883e-07
                }
            }
        },
(...)
        {
            "id": 9,
            "migration count": 0,
            "post-LB": {
                "Object_comm": {
                    "avg": 4632.0,
                    "car": 4.0,
                    "imb": 0.07081174438687388,
                    "kur": -2.4375,
                    "max": 4960.0,
                    "min": 4304.0,
                    "npr": 4.0,
                    "skw": 0.0,
                    "std": 328.0,
                    "sum": 18528.0,
                    "var": 107584.0
                },
                "Object_load_modeled": {
                    "avg": 8.89744285359484e-05,
                    "car": 70.0,
                    "imb": 17.76672912338164,
                    "kur": 34.416259845604124,
                    "max": 0.0016697589992418216,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 5.845439578625132,
                    "std": 0.00022748406793370163,
                    "sum": 0.006228209997516387,
                    "var": 5.1749001163664976e-08
                },
                "Object_load_raw": {
                    "avg": 8.89744285359484e-05,
                    "car": 70.0,
                    "imb": 17.76672912338164,
                    "kur": 34.416259845604124,
                    "max": 0.0016697589992418216,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 5.845439578625132,
                    "std": 0.00022748406793370163,
                    "sum": 0.006228209997516387,
                    "var": 5.1749001163664976e-08
                },
                "Object_work_modeled": {
                    "avg": 8.89744285359484e-05,
                    "car": 70.0,
                    "imb": 17.76672912338164,
                    "kur": 34.416259845604124,
                    "max": 0.0016697589992418216,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 5.845439578625132,
                    "std": 0.00022748406793370163,
                    "sum": 0.006228209997516387,
                    "var": 5.1749001163664976e-08
                },
                "Rank_comm": {
                    "avg": 9264.0,
                    "car": 2.0,
                    "imb": 0.0,
                    "kur": 0.0,
                    "max": 9264.0,
                    "min": 9264.0,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 0.0,
                    "sum": 18528.0,
                    "var": 0.0
                },
                "Rank_load_modeled": {
                    "avg": 0.0031141049987581937,
                    "car": 2.0,
                    "imb": 0.002597214844879403,
                    "kur": -2.75,
                    "max": 0.0031221929984894814,
                    "min": 0.003106016999026906,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 8.08799973128771e-06,
                    "sum": 0.006228209997516387,
                    "var": 6.541573965331006e-11
                },
                "Rank_load_raw": {
                    "avg": 0.0031141049987581937,
                    "car": 2.0,
                    "imb": 0.002597214844879403,
                    "kur": -2.75,
                    "max": 0.0031221929984894814,
                    "min": 0.003106016999026906,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 8.08799973128771e-06,
                    "sum": 0.006228209997516387,
                    "var": 6.541573965331006e-11
                },
                "Rank_work_modeled": {
                    "avg": 0.0031141049987581937,
                    "car": 2.0,
                    "imb": 0.002597214844879403,
                    "kur": -2.75,
                    "max": 0.0031221929984894814,
                    "min": 0.003106016999026906,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 8.08799973128771e-06,
                    "sum": 0.006228209997516387,
                    "var": 6.541573965331006e-11
                }
            },
            "pre-LB": {
                "Object_comm": {
                    "avg": 4632.0,
                    "car": 4.0,
                    "imb": 0.07081174438687388,
                    "kur": -2.4375,
                    "max": 4960.0,
                    "min": 4304.0,
                    "npr": 4.0,
                    "skw": 0.0,
                    "std": 328.0,
                    "sum": 18528.0,
                    "var": 107584.0
                },
                "Object_load_modeled": {
                    "avg": 8.89744285359484e-05,
                    "car": 70.0,
                    "imb": 17.76672912338164,
                    "kur": 34.416259845604124,
                    "max": 0.0016697589992418216,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 5.845439578625132,
                    "std": 0.00022748406793370163,
                    "sum": 0.006228209997516387,
                    "var": 5.1749001163664976e-08
                },
                "Object_load_raw": {
                    "avg": 8.89744285359484e-05,
                    "car": 70.0,
                    "imb": 17.76672912338164,
                    "kur": 34.416259845604124,
                    "max": 0.0016697589992418216,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 5.845439578625132,
                    "std": 0.00022748406793370163,
                    "sum": 0.006228209997516387,
                    "var": 5.1749001163664976e-08
                },
                "Object_work_modeled": {
                    "avg": 8.758099999275665e-05,
                    "car": 70.0,
                    "imb": 17.66915196946732,
                    "kur": 35.73043645952325,
                    "max": 0.0016350629985026899,
                    "min": 0.0,
                    "npr": 66.0,
                    "skw": 5.931927884228807,
                    "std": 0.00021919888106694175,
                    "sum": 0.006130669999492966,
                    "var": 4.804814946099927e-08
                },
                "Rank_comm": {
                    "avg": 9264.0,
                    "car": 2.0,
                    "imb": 0.0,
                    "kur": 0.0,
                    "max": 9264.0,
                    "min": 9264.0,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 0.0,
                    "sum": 18528.0,
                    "var": 0.0
                },
                "Rank_load_modeled": {
                    "avg": 0.0031141049987581937,
                    "car": 2.0,
                    "imb": 0.002597214844879403,
                    "kur": -2.75,
                    "max": 0.0031221929984894814,
                    "min": 0.003106016999026906,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 8.08799973128771e-06,
                    "sum": 0.006228209997516387,
                    "var": 6.541573965331006e-11
                },
                "Rank_load_raw": {
                    "avg": 0.0031141049987581937,
                    "car": 2.0,
                    "imb": 0.002597214844879403,
                    "kur": -2.75,
                    "max": 0.0031221929984894814,
                    "min": 0.003106016999026906,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 8.08799973128771e-06,
                    "sum": 0.006228209997516387,
                    "var": 6.541573965331006e-11
                },
                "Rank_work_modeled": {
                    "avg": 0.003065334999746483,
                    "car": 2.0,
                    "imb": 0.07422451348080084,
                    "kur": -2.75,
                    "max": 0.003292857998758336,
                    "min": 0.0028378120007346297,
                    "npr": 2.0,
                    "skw": 0.0,
                    "std": 0.00022752299901185324,
                    "sum": 0.006130669999492966,
                    "var": 5.176671507934777e-08
                }
            }
        }
    ]
}
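
The `imb` values in the sample are consistent with `max / avg - 1`, and `var` with a population variance. For example, `Rank_comm` in phase 0 has min 928 and max 936 over two ranks, giving avg 932, var 16, std 4, and imb 936/932 - 1 ≈ 0.00429. The following is a minimal sketch that reproduces those numbers; `DemoStats` and `computeDemoStats` are hypothetical names, and this is not claimed to be how VT computes its statistics internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical container for a subset of the statistics in the sample output.
struct DemoStats {
  double sum = 0.0;
  double avg = 0.0;
  double min = 0.0;
  double max = 0.0;
  double var = 0.0;    // population variance (divides by n)
  double stdev = 0.0;  // reported as "std" in the sample
  double imb = 0.0;    // max / avg - 1
};

inline DemoStats computeDemoStats(std::vector<double> const& values) {
  DemoStats s;
  if (values.empty()) {
    return s;
  }
  auto const n = static_cast<double>(values.size());
  s.sum = std::accumulate(values.begin(), values.end(), 0.0);
  s.avg = s.sum / n;
  auto const mm = std::minmax_element(values.begin(), values.end());
  s.min = *mm.first;
  s.max = *mm.second;
  double sq = 0.0;
  for (double v : values) {
    sq += (v - s.avg) * (v - s.avg);
  }
  s.var = sq / n;
  s.stdev = std::sqrt(s.var);
  // Assumption: report zero imbalance when the average is zero.
  s.imb = (s.avg != 0.0) ? s.max / s.avg - 1.0 : 0.0;
  return s;
}
```

Calling `computeDemoStats({928.0, 936.0})` yields the `sum`, `avg`, `var`, `std`, and `imb` values listed for `Rank_comm` in phase 0.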

codecov · 2022-11-09T21:24:16Z

Codecov Report

Merging #1993 (fe6c671) into develop (3ac0077) will increase coverage by 0.01%.
The diff coverage is 100.00%.

❗ Current head fe6c671 differs from pull request most recent head 52d4808. Consider uploading reports for the commit 52d4808 to get more accurate results

@@             Coverage Diff             @@
##           develop    #1993      +/-   ##
===========================================
+ Coverage    84.45%   84.47%   +0.01%     
===========================================
  Files          732      728       -4     
  Lines        25843    25850       +7     
===========================================
+ Hits         21826    21837      +11     
+ Misses        4017     4013       -4

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/vt/elm/elm_comm.h | `89.74% <ø> (ø)` | |
| src/vt/vrt/collection/balance/baselb/baselb.h | `100.00% <ø> (ø)` | |
| src/vt/vrt/collection/balance/greedylb/greedylb.h | `100.00% <ø> (ø)` | |
| src/vt/vrt/collection/balance/lb_common.h | `57.89% <ø> (-2.11%)` | ⬇️ |
| ...c/vt/vrt/collection/balance/lb_invoke/lb_manager.h | `100.00% <ø> (ø)` | |
| ...vrt/collection/balance/temperedwmin/temperedwmin.h | `100.00% <ø> (ø)` | |
| tests/unit/lb/test_lbargs_enum_conv.nompi.cc | `100.00% <ø> (ø)` | |
| src/vt/vrt/collection/balance/baselb/baselb.cc | `95.14% <100.00%> (+0.14%)` | ⬆️ |
| src/vt/vrt/collection/balance/lb_common.cc | `78.72% <100.00%> (ø)` | |
| .../vt/vrt/collection/balance/lb_invoke/lb_manager.cc | `80.00% <100.00%> (+0.38%)` | ⬆️ |
| ... and 13 more | | |

src/vt/vrt/collection/balance/lb_type.h

PhilMiller · 2022-11-11T19:00:23Z

I'm not a huge fan of this architecture, where strategy-specific stuff bleeds into the manager, but I understand the motivation well enough.

Two points about not presenting this as something that application developers should see or their code should call:

Instead of 'custom model', could we call it 'strategy-specific model'?

It would be nice if setting this model were a private method, called only from a friended method of BaseLB that the derived strategy could call. Essentially, this would make BaseLB a small instance of the attorney pattern, with LBManager as the client.
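
The arrangement suggested above could be sketched roughly as follows. All class and method names here (`DemoLBManager`, `DemoBaseLB`, `DemoTemperedLB`, `setStrategySpecificModel`, `DemoModel`) are hypothetical and do not reflect the actual VT interfaces. As a simplification, the sketch befriends the whole base class rather than a single member function; friendship is not inherited, so derived strategies can reach the manager's private setter only through the protected forwarding method.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

class DemoBaseLB;  // forward declaration so the manager can name it as a friend

// Hypothetical model type; stands in for a load model.
struct DemoModel {
  std::string name;
};

// Client: the setter is private, so application code cannot call it.
class DemoLBManager {
public:
  std::string currentModelName() const {
    return model_ ? model_->name : std::string("none");
  }

private:
  friend class DemoBaseLB;  // the "attorney"
  void setStrategySpecificModel(std::shared_ptr<DemoModel> m) {
    model_ = std::move(m);
  }
  std::shared_ptr<DemoModel> model_;
};

// Attorney: exposes only the one operation, and only to derived strategies.
class DemoBaseLB {
public:
  explicit DemoBaseLB(DemoLBManager& mgr) : mgr_(mgr) {}
  virtual ~DemoBaseLB() = default;
  virtual void run() = 0;

protected:
  void setStrategySpecificModel(std::shared_ptr<DemoModel> m) {
    mgr_.setStrategySpecificModel(std::move(m));
  }

private:
  DemoLBManager& mgr_;
};

// A derived strategy installs its model through the base class only.
class DemoTemperedLB : public DemoBaseLB {
public:
  using DemoBaseLB::DemoBaseLB;
  void run() override {
    setStrategySpecificModel(
      std::make_shared<DemoModel>(DemoModel{"work_model"}));
  }
};
```

With this layout, `mgr.setStrategySpecificModel(...)` from application code fails to compile, while a strategy's `run()` can still install its model.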

src/vt/vrt/collection/balance/lb_common.h

tests/unit/lb/test_lbargs_enum_conv.nompi.cc

src/vt/vrt/collection/balance/lb_invoke/lb_manager.h

src/vt/vrt/collection/balance/baselb/baselb.h

nlslatt

Looks great in general. I do have one concern but it's definitely open to discussion.

src/vt/vrt/collection/balance/lb_invoke/lb_manager.cc

cz4rs · 2022-12-06T15:35:22Z

The JSON schema validator fails (correctly) with a "Wrong keys 'Object_work_modeled', 'Rank_work_modeled' (...)" error:

Running schema validator on: ./tests/vt_lb_statistics.2022-12-06-01-29-50.json.br
Validating file: /build/vt/tests/vt_lb_statistics.2022-12-06-01-29-50.json.br
Invalid JSON schema in /build/vt/tests/vt_lb_statistics.2022-12-06-01-29-50.json.br
[JSON_data_files_validator] SchemaError Key 'phases' error:
Or({'id': <class 'int'>, Optional('migration count'): <class 'int'>, Optional('post-LB'): {'Object_comm': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}, 'Object_load_modeled': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}, 'Object_load_raw': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}, 'Rank_comm': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}, 'Rank_load_modeled': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}, 'Rank_load_raw': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}}, 'pre-LB': {'Object_comm': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': 
<class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}, 'Object_load_modeled': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}, 'Object_load_raw': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}, 'Rank_comm': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}, 'Rank_load_modeled': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}, 'Rank_load_raw': {'avg': <class 'float'>, 'car': <class 'float'>, 'imb': <class 'float'>, 'kur': <class 'float'>, 'max': <class 'float'>, 'min': <class 'float'>, 'npr': <class 'float'>, 'skw': <class 'float'>, 'std': <class 'float'>, 'sum': <class 'float'>, 'var': <class 'float'>}}}) did not validate {'id': 1, 'migration count': 10, 'post-LB': {'Object_comm': {'avg': 840.0, 'car': 4.0, 'imb': 0.06666666666666665, 'kur': -2.4375, 'max': 896.0, 'min': 784.0, 'npr': 4.0, 'skw': 0.0, 'std': 56.0, 'sum': 3360.0, 'var': 3136.0}, 'Object_load_modeled': {'avg': 1.213230003876171e-05, 'car': 70.0, 'imb': 3.080347517465974, 'kur': 4.413427772269169, 'max': 4.950400034431368e-05, 'min': 0.0, 'npr': 66.0, 'skw': 1.3208459347226218, 'std': 8.071501322276251e-06, 'sum': 
0.0008492610027133196, 'var': 6.514913359550728e-11}, 'Object_load_raw': {'avg': 1.213230003876171e-05, 'car': 70.0, 'imb': 3.080347517465974, 'kur': 4.413427772269169, 'max': 4.950400034431368e-05, 'min': 0.0, 'npr': 66.0, 'skw': 1.3208459347226218, 'std': 8.071501322276251e-06, 'sum': 0.0008492610027133196, 'var': 6.514913359550728e-11}, 'Object_work_modeled': {'avg': 1.213230003876171e-05, 'car': 70.0, 'imb': 3.080347517465974, 'kur': 4.413427772269172, 'max': 4.950400034431368e-05, 'min': 0.0, 'npr': 66.0, 'skw': 1.3208459347226216, 'std': 8.071501322276253e-06, 'sum': 0.0008492610027133196, 'var': 6.514913359550729e-11}, 'Rank_comm': {'avg': 1680.0, 'car': 2.0, 'imb': 0.0, 'kur': 0.0, 'max': 1680.0, 'min': 1680.0, 'npr': 2.0, 'skw': 0.0, 'std': 0.0, 'sum': 3360.0, 'var': 0.0}, 'Rank_load_modeled': {'avg': 0.0004246305013566598, 'car': 2.0, 'imb': 0.0011763182995736532, 'kur': -2.75, 'max': 0.00042513000198596274, 'min': 0.0004241310007273569, 'npr': 2.0, 'skw': 0.0, 'std': 4.995006293029292e-07, 'sum': 0.0008492610027133196, 'var': 2.4950087867402225e-13}, 'Rank_load_raw': {'avg': 0.0004246305013566598, 'car': 2.0, 'imb': 0.0011763182995736532, 'kur': -2.75, 'max': 0.00042513000198596274, 'min': 0.0004241310007273569, 'npr': 2.0, 'skw': 0.0, 'std': 4.995006293029292e-07, 'sum': 0.0008492610027133196, 'var': 2.4950087867402225e-13}, 'Rank_work_modeled': {'avg': 0.0004246305013566598, 'car': 2.0, 'imb': 0.358218497683666, 'kur': -2.75, 'max': 0.0005767410016233043, 'min': 0.0002725200010900153, 'npr': 2.0, 'skw': 0.0, 'std': 0.00015211050026664452, 'sum': 0.0008492610027133196, 'var': 2.3137604291368863e-08}}, 'pre-LB': {'Object_comm': {'avg': 840.0, 'car': 4.0, 'imb': 0.06666666666666665, 'kur': -2.4375, 'max': 896.0, 'min': 784.0, 'npr': 4.0, 'skw': 0.0, 'std': 56.0, 'sum': 3360.0, 'var': 3136.0}, 'Object_load_modeled': {'avg': 1.213230003876171e-05, 'car': 70.0, 'imb': 3.080347517465974, 'kur': 4.413427772269172, 'max': 4.950400034431368e-05, 'min': 0.0, 
'npr': 66.0, 'skw': 1.3208459347226216, 'std': 8.071501322276253e-06, 'sum': 0.0008492610027133196, 'var': 6.514913359550729e-11}, 'Object_load_raw': {'avg': 1.213230003876171e-05, 'car': 70.0, 'imb': 3.080347517465974, 'kur': 4.413427772269172, 'max': 4.950400034431368e-05, 'min': 0.0, 'npr': 66.0, 'skw': 1.3208459347226216, 'std': 8.071501322276253e-06, 'sum': 0.0008492610027133196, 'var': 6.514913359550729e-11}, 'Rank_comm': {'avg': 1680.0, 'car': 2.0, 'imb': 0.0, 'kur': 0.0, 'max': 1680.0, 'min': 1680.0, 'npr': 2.0, 'skw': 0.0, 'std': 0.0, 'sum': 3360.0, 'var': 0.0}, 'Rank_load_modeled': {'avg': 0.0004246305013566598, 'car': 2.0, 'imb': 0.358218497683666, 'kur': -2.75, 'max': 0.0005767410016233043, 'min': 0.0002725200010900153, 'npr': 2.0, 'skw': 0.0, 'std': 0.00015211050026664452, 'sum': 0.0008492610027133196, 'var': 2.3137604291368863e-08}, 'Rank_load_raw': {'avg': 0.0004246305013566598, 'car': 2.0, 'imb': 0.358218497683666, 'kur': -2.75, 'max': 0.0005767410016233043, 'min': 0.0002725200010900153, 'npr': 2.0, 'skw': 0.0, 'std': 0.00015211050026664452, 'sum': 0.0008492610027133196, 'var': 2.3137604291368863e-08}}}
Key 'post-LB' error:
Wrong keys 'Object_work_modeled', 'Rank_work_modeled' in {'Object_comm': {'avg': 840.0, 'car': 4.0, 'imb': 0.06666666666666665, 'kur': -2.4375, 'max': 896.0, 'min': 784.0, 'npr': 4.0, 'skw': 0.0, 'std': 56.0, 'sum': 3360.0, 'var': 3136.0}, 'Object_load_modeled': {'avg': 1.213230003876171e-05, 'car': 70.0, 'imb': 3.080347517465974, 'kur': 4.413427772269169, 'max': 4.950400034431368e-05, 'min': 0.0, 'npr': 66.0, 'skw': 1.3208459347226218, 'std': 8.071501322276251e-06, 'sum': 0.0008492610027133196, 'var': 6.514913359550728e-11}, 'Object_load_raw': {'avg': 1.213230003876171e-05, 'car': 70.0, 'imb': 3.080347517465974, 'kur': 4.413427772269169, 'max': 4.950400034431368e-05, 'min': 0.0, 'npr': 66.0, 'skw': 1.3208459347226218, 'std': 8.071501322276251e-06, 'sum': 0.0008492610027133196, 'var': 6.514913359550728e-11}, 'Object_work_modeled': {'avg': 1.213230003876171e-05, 'car': 70.0, 'imb': 3.080347517465974, 'kur': 4.413427772269172, 'max': 4.950400034431368e-05, 'min': 0.0, 'npr': 66.0, 'skw': 1.3208459347226216, 'std': 8.071501322276253e-06, 'sum': 0.0008492610027133196, 'var': 6.514913359550729e-11}, 'Rank_comm': {'avg': 1680.0, 'car': 2.0, 'imb': 0.0, 'kur': 0.0, 'max': 1680.0, 'min': 1680.0, 'npr': 2.0, 'skw': 0.0, 'std': 0.0, 'sum': 3360.0, 'var': 0.0}, 'Rank_load_modeled': {'avg': 0.0004246305013566598, 'car': 2.0, 'imb': 0.0011763182995736532, 'kur': -2.75, 'max': 0.00042513000198596274, 'min': 0.0004241310007273569, 'npr': 2.0, 'skw': 0.0, 'std': 4.995006293029292e-07, 'sum': 0.0008492610027133196, 'var': 2.4950087867402225e-13}, 'Rank_load_raw': {'avg': 0.0004246305013566598, 'car': 2.0, 'imb': 0.0011763182995736532, 'kur': -2.75, 'max': 0.00042513000198596274, 'min': 0.0004241310007273569, 'npr': 2.0, 'skw': 0.0, 'std': 4.995006293029292e-07, 'sum': 0.0008492610027133196, 'var': 2.4950087867402225e-13}, 'Rank_work_modeled': {'avg': 0.0004246305013566598, 'car': 2.0, 'imb': 0.358218497683666, 'kur': -2.75, 'max': 0.0005767410016233043, 'min': 
0.0002725200010900153, 'npr': 2.0, 'skw': 0.0, 'std': 0.00015211050026664452, 'sum': 0.0008492610027133196, 'var': 2.3137604291368863e-08}}
+ echo 'Invalid schema in ./tests/vt_lb_statistics.2022-12-06-01-29-50.json.br.. exiting'
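
The failure above comes down to the `post-LB` section containing two statistic names that the schema does not list. A minimal sketch of that check in plain Python follows. The function name, the `ALLOWED_STATS` set, and the sample entry are assumptions made for illustration; the CI job's actual validation script is not shown in this log.

```python
# Hypothetical sketch: the helper name and the allowed-key set are assumptions
# inferred from the log output, not code from the VT repository.
ALLOWED_STATS = {
    "Object_comm", "Object_load_modeled", "Object_load_raw",
    "Rank_comm", "Rank_load_modeled", "Rank_load_raw",
}

def unexpected_keys(phase_entry, sections=("pre-LB", "post-LB"), allowed=ALLOWED_STATS):
    """Map each section name to the sorted statistic keys missing from `allowed`."""
    problems = {}
    for section in sections:
        extra = sorted(set(phase_entry.get(section, {})) - allowed)
        if extra:
            problems[section] = extra
    return problems

# Sample entry shaped like the one in the log: post-LB carries two extra keys.
phase = {
    "id": 1,
    "migration count": 10,
    "pre-LB": {name: {} for name in ALLOWED_STATS},
    "post-LB": {
        name: {}
        for name in ALLOWED_STATS | {"Object_work_modeled", "Rank_work_modeled"}
    },
}

result = unexpected_keys(phase)
print(result)  # {'post-LB': ['Object_work_modeled', 'Rank_work_modeled']}
```

The later commit titled "lb: update JSON schema" (484a4c4) suggests the schema was extended to accept the new statistics, though the log alone does not confirm that.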

src/vt/vrt/collection/balance/lb_invoke/lb_manager.cc

nlslatt

Looks good

cz4rs force-pushed the 1659-improve-lb-statistics branch 4 times, most recently from ee0a4b1 to 7ef11b7 Compare October 17, 2022 10:28

cz4rs force-pushed the 1659-improve-lb-statistics branch from bd63f4f to 4d2b25d Compare November 8, 2022 16:11

cz4rs commented Nov 8, 2022

src/vt/vrt/collection/balance/lb_invoke/lb_manager.h (outdated, resolved)

cz4rs marked this pull request as ready for review November 9, 2022 20:30

cz4rs force-pushed the 1659-improve-lb-statistics branch from 041dc16 to 317f055 Compare November 9, 2022 20:30

cz4rs requested review from lifflander, nlslatt, jstrzebonski, JacobDomagala, nmm0, PhilMiller, stmcgovern and thearusable November 9, 2022 20:35

PhilMiller reviewed Nov 11, 2022

src/vt/vrt/collection/balance/lb_type.h (outdated, resolved)

cz4rs marked this pull request as draft November 14, 2022 17:18

cz4rs force-pushed the 1659-improve-lb-statistics branch from 317f055 to 46b8f88 Compare November 14, 2022 17:22

PhilMiller reviewed Nov 14, 2022

src/vt/vrt/collection/balance/lb_common.h (outdated, resolved)

PhilMiller reviewed Nov 14, 2022

tests/unit/lb/test_lbargs_enum_conv.nompi.cc (outdated, resolved)

cz4rs marked this pull request as ready for review November 14, 2022 20:53

cz4rs commented Nov 14, 2022

src/vt/vrt/collection/balance/lb_invoke/lb_manager.h (outdated, resolved)

cz4rs requested a review from PhilMiller November 14, 2022 20:59

PhilMiller reviewed Nov 15, 2022

src/vt/vrt/collection/balance/baselb/baselb.h (outdated, resolved)

cz4rs force-pushed the 1659-improve-lb-statistics branch from 634004e to 525c35e Compare November 28, 2022 09:51

cz4rs requested review from PhilMiller and removed request for jstrzebonski November 28, 2022 12:05

cz4rs force-pushed the 1659-improve-lb-statistics branch from 525c35e to af22bed Compare November 29, 2022 09:33

PhilMiller approved these changes Nov 29, 2022

nmm0 mentioned this pull request Nov 29, 2022

Meeting Agenda [do not close] #925

Open

cz4rs force-pushed the 1659-improve-lb-statistics branch from af22bed to 04e2408 Compare November 30, 2022 17:58

nlslatt reviewed Dec 2, 2022

src/vt/vrt/collection/balance/lb_invoke/lb_manager.cc (outdated, resolved)

cz4rs force-pushed the 1659-improve-lb-statistics branch from 04e2408 to f7c1611 Compare December 6, 2022 00:30

cz4rs force-pushed the 1659-improve-lb-statistics branch from e7ad6e6 to fe6c671 Compare December 6, 2022 18:43

PhilMiller reviewed Dec 6, 2022

src/vt/vrt/collection/balance/lb_invoke/lb_manager.cc (outdated, resolved)

cz4rs added 13 commits December 13, 2022 17:24

#1659: lb: remove redundant argument

8024e1d

#1659: clean-up LBType

f874ccd

#1659: lb: add custom model handling to LBManager

07624f2

#1659: lb: avoid repetition

ffcb97b

#1659: lb: clear custom model when TemperedWMin is destroyed

0570bd0

#1659: lb: use correct statistic when calculating work

65eeba6

#1659: lb: remove redundant std::hash specialization (LBType)

2b9b238

#1659: remove redundant std::hash specializations

7cdaa43

#1659: remove obsolete code

ba70458

#1659: lb: control strategy specific model through BaseLB

f44cd5c

#1659: lb: make BaseLB a friend of LBManager

c5219da

#1659: lb: use more general language

5feffc2

#1659: lb: update JSON schema

484a4c4

cz4rs force-pushed the 1659-improve-lb-statistics branch from fe6c671 to 484a4c4 Compare December 13, 2022 16:26

#1659: lb: improve naming

52d4808

cz4rs requested review from PhilMiller and nlslatt December 14, 2022 11:01

nlslatt approved these changes Dec 14, 2022

nlslatt merged commit 5d748ec into develop Dec 14, 2022

cz4rs commented Oct 12, 2022

github-actions bot commented Oct 12, 2022

cz4rs commented Nov 9, 2022

codecov bot commented Nov 9, 2022

PhilMiller commented Nov 11, 2022

nlslatt left a comment

cz4rs commented Dec 6, 2022

nlslatt left a comment
