[test] Ensure that the first token generation is not included into TPOT #1414

Merged
3 commits merged into openvinotoolkit:master on Dec 23, 2024

Conversation

pavel-esir
Contributor

CVS-155098

@pavel-esir pavel-esir added this to the 2025.0 milestone Dec 19, 2024
@pavel-esir pavel-esir requested a review from Wovchena December 19, 2024 15:52
@pavel-esir pavel-esir changed the title from "Ensure that the first token generation is not included into TPOT" to "[test] Ensure that the first token generation is not included into TPOT" on Dec 19, 2024
@github-actions github-actions bot added the "category: LLM" (LLM pipeline (stateful, static)) and "no-match-files" labels on Dec 19, 2024
@@ -101,7 +101,7 @@ void PerfMetrics::evaluate_statistics(std::optional<TimePoint> start_time) {

     auto ttft = tok_times[0] - start_time_val;
     raw_metrics.m_times_to_first_token = std::vector<MicroSeconds>();
-    raw_metrics.m_times_to_first_token.emplace_back(ttft / batch_sizes[0]);
+    raw_metrics.m_times_to_first_token.emplace_back(ttft);
Contributor Author

@pavel-esir pavel-esir Dec 19, 2024

If we have a batch of 10 and it takes 1 second until the first token is generated, the time to the first token is still 1 second, not 100 ms! Therefore I removed the division by batch_sizes[0].
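
For context, here is a minimal standalone sketch of the intended behaviour. This is not the actual PerfMetrics code: the helper compute_latencies, the LatencySketch struct, and the per-step normalization of later tokens are illustrative assumptions. The point it shows is that TTFT stays a raw wall-clock interval regardless of batch size, and per-token durations start from the second generation step, so the first token never contributes to TPOT.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

using TimePoint = std::chrono::steady_clock::time_point;
using MicroSeconds = std::chrono::duration<float, std::micro>;

// Illustrative sketch only: split TTFT and per-token durations so that the
// first-token latency is reported as-is and excluded from TPOT.
struct LatencySketch {
    MicroSeconds ttft;                    // wall-clock time to the first token
    std::vector<MicroSeconds> durations;  // per-token durations from the second step on
};

LatencySketch compute_latencies(TimePoint start_time,
                                const std::vector<TimePoint>& tok_times,
                                const std::vector<std::size_t>& batch_sizes) {
    LatencySketch out;
    // A batch of 10 that needs 1 s for its first step still has a TTFT of 1 s,
    // so there is no division by batch_sizes[0] here.
    out.ttft = tok_times[0] - start_time;
    // Later steps start from index 1, so the first token never enters TPOT.
    // Normalizing each step by its batch size is an assumption for illustration.
    for (std::size_t i = 1; i < tok_times.size(); ++i) {
        out.durations.emplace_back((tok_times[i] - tok_times[i - 1]) / batch_sizes[i]);
    }
    return out;
}
```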

Collaborator

I agree, but should we notify llm_bench about the change?

Contributor Author

I asked @eaidova what she thinks. Let's wait for her answer.

Collaborator

@peterchen-intel @wgzintel do you have any objections? What is the reason to divide the first-token latency by the batch size?

Contributor Author

@pavel-esir pavel-esir Dec 20, 2024

As I can see from git blame, this division was added by @ialbrecht. Do you have any objection to removing it?

@ilya-lavrenov ilya-lavrenov added the "bug" (Something isn't working) and "port to LTS" (PR needs to be ported to LTS) labels on Dec 20, 2024
@pavel-esir pavel-esir added this pull request to the merge queue Dec 23, 2024
Merged via the queue into openvinotoolkit:master with commit c09207c Dec 23, 2024
59 checks passed