test: TC for Metric P0 nv_load_time per model #7697

Open
wants to merge 9 commits into base: main

Conversation

@indrajit96 (Contributor) commented Oct 14, 2024

What does the PR do?

Adds test cases for the per-model load time metric (nv_load_time).

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated the GitHub labels field.
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added a succinct git squash message before merging.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type box here and add the label to the GitHub PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

Core: triton-inference-server/core#397

Where should the reviewer start?

qa/L0_metrics/general_metrics_test.py

Test plan:

Added tests for:

  1. Normal Mode Model Load
  2. Explicit Model Load
  3. Explicit Model Unload

Background

Improve metrics in Triton

indrajit96 changed the title from "TC for Metric P0 nv_load_time per model" to "test: TC for Metric P0 nv_load_time per model" on Oct 14, 2024
qa/L0_metrics/test.sh: outdated review comments (resolved)
Comment on lines +147 to +161
set +e
CLIENT_PY="./general_metrics_test.py"
CLIENT_LOG="general_metrics_test_client.log"
SERVER_LOG="general_metrics_test_server.log"
SERVER_ARGS="$BASE_SERVER_ARGS --model-control-mode=explicit --log-verbose=1"
run_and_check_server
MODEL_NAME='libtorch_float32_float32_float32'
code=`curl -s -w %{http_code} -X POST ${TRITONSERVER_IPADDR}:8000/v2/repository/models/${MODEL_NAME}/load`
# Test 2 for explicit mode LOAD
python3 -m pytest --junitxml="general_metrics_test.test_metrics_load_time_explicit_load.report.xml" $CLIENT_PY::TestGeneralMetrics::test_metrics_load_time_explicit_load >> $CLIENT_LOG 2>&1

code=`curl -s -w %{http_code} -X POST ${TRITONSERVER_IPADDR}:8000/v2/repository/models/${MODEL_NAME}/unload`
# Test 3 for explicit mode UNLOAD
python3 -m pytest --junitxml="general_metrics_test.test_metrics_load_time_explicit_unload.report.xml" $CLIENT_PY::TestGeneralMetrics::test_metrics_load_time_explicit_unload >> $CLIENT_LOG 2>&1
kill_server
Contributor

For tests 2 and 3, I think it would be more helpful to test the following model load/unload sequence against the metrics (see the sketch after the list):

  1. Start the server without loading any model.
  2. Check that the metrics are empty.
  3. Load a few models (and have one model with two versions that load at different speeds).
  4. Check that the metrics are correct.
  5. Call the load API again (without changing the model repository).
  6. Check that the metrics are unchanged.
  7. Load a new model.
  8. Check that the metrics are updated correctly.
  9. Unload the models.
  10. Check that the metrics are unchanged.
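
A rough sketch of how that sequence could be automated. The model names, ports, and the assumption that get_model_load_times() returns a {model_name: seconds} dict are illustrative only, not the PR's actual implementation:

import requests

# Assumes a helper like the one excerpted below; the {model: seconds} return
# shape is an assumption.
from general_metrics_test import get_model_load_times

TRITON_HTTP = "http://localhost:8000"  # default HTTP endpoint; adjust to TRITONSERVER_IPADDR as needed
REPO = f"{TRITON_HTTP}/v2/repository/models"

def load(model):
    requests.post(f"{REPO}/{model}/load").raise_for_status()

def unload(model):
    requests.post(f"{REPO}/{model}/unload").raise_for_status()

def test_load_time_metric_sequence():
    # Steps 1-2: server started with --model-control-mode=explicit and no models
    # loaded, so there should be no per-model load-time entries yet.
    assert get_model_load_times() == {}

    # Steps 3-4: load a few models (one of them could have two versions that
    # load at different speeds) and expect one entry per loaded model.
    for model in ("model_a", "model_b"):  # placeholder model names
        load(model)
    first = get_model_load_times()
    assert set(first) >= {"model_a", "model_b"}

    # Steps 5-6: calling the load API again without changing the model
    # repository should leave the recorded load times unchanged.
    load("model_a")
    assert get_model_load_times() == first

    # Steps 7-8: loading a new model adds an entry without disturbing the others.
    load("model_c")
    second = get_model_load_times()
    assert "model_c" in second and all(second[m] == first[m] for m in first)

    # Steps 9-10: unloading should not change the recorded load-time metrics.
    unload("model_a")
    assert get_model_load_times() == second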

def get_model_load_times():
    r = requests.get(f"http://{_tritonserver_ipaddr}:8002/metrics")
    r.raise_for_status()
    pattern = re.compile(rf'{MODEL_LOAD_TIME}"(.*?)".*?\ (\d+\.\d+)')

Check notice (Code scanning / CodeQL, note severity): Unused local variable. Variable pattern is not used.
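
One way to clear the alert would be to actually apply the compiled pattern to the metrics payload and return the per-model values. The sketch below is illustrative only: the MODEL_LOAD_TIME prefix, the server address constant, and the {model_name: seconds} return shape are assumptions, not necessarily what the PR implements.

import re
import requests

_tritonserver_ipaddr = "localhost"  # stand-in for the test module's constant
MODEL_LOAD_TIME = 'nv_load_time{model='  # hypothetical metric/label prefix, not confirmed by the PR

def get_model_load_times():
    r = requests.get(f"http://{_tritonserver_ipaddr}:8002/metrics")
    r.raise_for_status()
    # Use the compiled pattern (flagged as unused by CodeQL) to map each model
    # name to its reported load time in the Prometheus-format metrics text.
    pattern = re.compile(rf'{MODEL_LOAD_TIME}"(.*?)".*?\ (\d+\.\d+)')
    return {name: float(value) for name, value in pattern.findall(r.text)}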
Labels: None yet
3 participants