Add speculative decoding params to lm_bench #1221

Merged (3 commits) on Nov 20, 2024

Conversation

sbalandi (Contributor) commented Nov 18, 2024

Task: CVS-155520

github-actions bot added the category: llm_bench (Label for tool/llm_bench folder) and category: sampling (Sampling / Decoding algorithms) labels Nov 18, 2024
sbalandi requested a review from eaidova November 18, 2024 10:21
eaidova (Collaborator) commented Nov 18, 2024

@sbalandi, could you please add a test to GHA for the speculative decoding case?

ilya-lavrenov requested a review from iefode November 18, 2024 11:30
ilya-lavrenov added this to the 2025.0 milestone Nov 18, 2024
ilya-lavrenov added the port to LTS label (PR needs to be ported to LTS) Nov 18, 2024
github-actions bot added the category: GHA label (CI based on Github actions) Nov 18, 2024
eaidova (Collaborator) commented Nov 19, 2024

@sbalandi, it looks like the models selected for the test are too large. Maybe we can use something less compute-expensive, e.g. tinyllama with fp16 and int4/int8 precision as the draft? You can also use pre-converted models from https://huggingface.co/collections/OpenVINO/llm-6687aaa2abca3bbcec71a9bd by replacing the optimum-cli command with a huggingface-cli download command.
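
A minimal sketch of the suggestion above, assuming the CI step fetches pre-converted models with `huggingface-cli download` instead of converting them with `optimum-cli`. The repository IDs and local directory names are hypothetical placeholders chosen for illustration; the comment only says "tinyllama" with fp16 and int4/int8 precision, so check the linked collection for the exact names. The helper only builds the command lines and downloads nothing.

```python
import shlex

# Hypothetical repo IDs: placeholders for illustration, not taken from the PR.
MAIN_REPO = "OpenVINO/TinyLlama-1.1B-Chat-v1.0-fp16-ov"
DRAFT_REPO = "OpenVINO/TinyLlama-1.1B-Chat-v1.0-int8-ov"


def download_command(repo_id, local_dir):
    """Return the argv list for fetching a pre-converted model from the Hub."""
    return ["huggingface-cli", "download", repo_id, "--local-dir", local_dir]


if __name__ == "__main__":
    # Print the two commands a CI step could run (directory names are assumptions).
    for repo_id, local_dir in [(MAIN_REPO, "models/main"), (DRAFT_REPO, "models/draft")]:
        print(shlex.join(download_command(repo_id, local_dir)))
```

Building the argument list separately from running it keeps the sketch testable without network access; a workflow could pass the list to `subprocess.run` or paste the printed lines into a shell step.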

sbalandi force-pushed the llm_bench_sd branch 2 times, most recently from 8caa747 to 795563d, November 19, 2024 13:27
sbalandi force-pushed the llm_bench_sd branch 3 times, most recently from 35050d3 to e1444e2, November 19, 2024 15:56
sbalandi force-pushed the llm_bench_sd branch 3 times, most recently from 87eb848 to e4155c3, November 19, 2024 19:17
eaidova enabled auto-merge November 20, 2024 05:51
github-actions bot removed the category: sampling label (Sampling / Decoding algorithms) Nov 20, 2024
eaidova added this pull request to the merge queue Nov 20, 2024
Merged via the queue into openvinotoolkit:master with commit a2e1ae9 Nov 20, 2024
53 of 54 checks passed
ilya-lavrenov pushed a commit to ilya-lavrenov/openvino.genai that referenced this pull request Nov 20, 2024
github-merge-queue bot pushed a commit that referenced this pull request Nov 21, 2024
ilya-lavrenov removed the port to LTS label (PR needs to be ported to LTS) Nov 21, 2024
Labels: category: GHA (CI based on Github actions), category: llm_bench (Label for tool/llm_bench folder)
4 participants