[test] Ensure that the first token generation is not included into TPOT #1414

Merged
3 commits merged into openvinotoolkit:master on Dec 23, 2024

Conversation

pavel-esir
Contributor

CVS-155098

@pavel-esir pavel-esir added this to the 2025.0 milestone Dec 19, 2024
@pavel-esir pavel-esir requested a review from Wovchena December 19, 2024 15:52
@pavel-esir pavel-esir changed the title from "Ensure that the first token generation is not included into TPOT" to "[test] Ensure that the first token generation is not included into TPOT" on Dec 19, 2024
@github-actions github-actions bot added the "category: LLM" (LLM pipeline (stateful, static)) and "no-match-files" labels on Dec 19, 2024
@@ -101,7 +101,7 @@ void PerfMetrics::evaluate_statistics(std::optional<TimePoint> start_time) {

     auto ttft = tok_times[0] - start_time_val;
     raw_metrics.m_times_to_first_token = std::vector<MicroSeconds>();
-    raw_metrics.m_times_to_first_token.emplace_back(ttft / batch_sizes[0]);
+    raw_metrics.m_times_to_first_token.emplace_back(ttft);
Contributor Author

@pavel-esir pavel-esir Dec 19, 2024

If we have a batch of 10 and it takes 1 second until the first token is generated, the time to the first token is still 1 second, not 100 ms! Therefore I removed the division by batch_sizes[0].
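
For context, here is a minimal standalone sketch of the intended behaviour. This is not the actual PerfMetrics code: the helper compute_latencies, the LatencySketch struct, and the per-step normalization of later tokens are illustrative assumptions. The point it shows is that TTFT stays a raw wall-clock interval regardless of batch size, and per-token durations start from the second generation step, so the first token never contributes to TPOT.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

using TimePoint = std::chrono::steady_clock::time_point;
using MicroSeconds = std::chrono::duration<float, std::micro>;

// Illustrative sketch only: split TTFT and per-token durations so that the
// first-token latency is reported as-is and excluded from TPOT.
struct LatencySketch {
    MicroSeconds ttft;                    // wall-clock time to the first token
    std::vector<MicroSeconds> durations;  // per-token durations from the second step on
};

LatencySketch compute_latencies(TimePoint start_time,
                                const std::vector<TimePoint>& tok_times,
                                const std::vector<std::size_t>& batch_sizes) {
    LatencySketch out;
    // A batch of 10 that needs 1 s for its first step still has a TTFT of 1 s,
    // so there is no division by batch_sizes[0] here.
    out.ttft = tok_times[0] - start_time;
    // Later steps start from index 1, so the first token never enters TPOT.
    // Normalizing each step by its batch size is an assumption for illustration.
    for (std::size_t i = 1; i < tok_times.size(); ++i) {
        out.durations.emplace_back((tok_times[i] - tok_times[i - 1]) / batch_sizes[i]);
    }
    return out;
}
```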

Collaborator

I agree, but should we notify llm_bench about the change?

Contributor Author

I asked @eaidova what she thinks. Let's wait for her answer.

Collaborator

@peterchen-intel @wgzintel do you have any objections? What is the reason to divide the first-token latency by the batch size?

Contributor Author

@pavel-esir pavel-esir Dec 20, 2024

As I can see from git blame, this division was added by @ialbrecht. Do you have any objection to removing it?

@ilya-lavrenov ilya-lavrenov added the "bug" (Something isn't working) and "port to LTS" (PR needs to be ported to LTS) labels on Dec 20, 2024
@pavel-esir pavel-esir added this pull request to the merge queue Dec 23, 2024
Merged via the queue into openvinotoolkit:master with commit c09207c Dec 23, 2024
59 checks passed